The Water Hub Hackathon: We won!

Well well well, we’ve only gone and won The Water Hub hackathon! Well, joint winners, but the main word is WINNER. First of all, we want to say thank you to all the guys at the Water Hub and the Sunderland Software Centre for organising the event and inviting us. There was some tough competition, and we are thrilled to have been adjudged joint top! Here’s how we won. The first day started off with presentations from Antonia Scarr and Matt Starr from the Environment Agency (to whom we apologise profusely for constantly harassing them about their current system and data), Martin Colling from the Wear Rivers Trust and Louise Bracken of The Water Hub.

eRum Competition Winners

Contents: The Main Competition; The Secondary Competition; What next?
The results of the eRum competition are in! Before we announce the winners, we would like to thank everyone who entered. It has been a pleasure to look at all of the ideas on show. The Main Competition: The winner of the main competition is Lukasz Janiszewski. Lukasz provided a fantastic visualisation of the locations of each R user/ladies group and all R conferences.

Regular Expressions Every R programmer Should Know

Contents: Regex: The backslash, \; Regex: The hat, ^, and dollar, $; Regex: Round parentheses, (), and the pipe, |; Regex: Square parentheses, [], and the asterisk, *
Regular expressions. How they can be cruel! Well, we’re here to make them a tad easier. To do so, we’re going to make use of the stringr package: install.packages("stringr"); library("stringr"). We’re going to use the str_detect() and str_subset() functions, in particular the latter.
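
As a rough illustration of the two functions named above, here is a minimal sketch. The example strings and variable names are made up for this sketch and are not taken from the original post; it assumes stringr is already installed.

```r
library("stringr")

# Hypothetical example strings, chosen only for illustration
fruits <- c("apple", "banana", "pear", "pineapple", "a.b")

# The hat, ^, anchors the pattern to the start of the string.
# str_detect() returns one TRUE/FALSE per element.
starts_with_a <- str_detect(fruits, "^a")

# The dollar, $, anchors the pattern to the end of the string.
# str_subset() returns only the elements that match.
ends_with_e <- str_subset(fruits, "e$")

# Round parentheses group alternatives separated by the pipe, |
starts_b_or_p <- str_subset(fruits, "^(b|p)")

# Square parentheses match any one of the listed characters
starts_a_or_b <- str_subset(fruits, "^[ab]")

# The backslash escapes a special character; inside an R string the
# backslash itself must be doubled, so "\\." matches a literal dot
has_dot <- str_subset(fruits, "\\.")

print(starts_with_a)
print(ends_with_e)
print(starts_b_or_p)
print(starts_a_or_b)
print(has_dot)
```

str_subset() is often the more convenient of the two when exploring patterns, because it shows the matching strings themselves rather than a logical vector.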

ReCoding the Wall: Mixing art and code

At Jumping Rivers we often collaborate with the local community. This includes attending regional events such as those run by Creative FUSE, a partnership between the North East’s five universities. I recently attended an event at the National Glass Centre called ReCoding the Wall. The artwork, Colour Field, is a large interactive LED wall currently on display at the National Glass Centre in Sunderland. This was an opportunity to investigate, experiment with, and modify a new artwork by Cate Watkinson and Colin Rennie.

Which world leaders are twitter bots?

Contents: Set-up; Getting the tweets; Are world leaders actually bots?
Set-up: Given that I do quite like twitter, I thought it would be a good idea to write about R’s interface to the twitter API, rtweet. We can grab the package in the usual way. We’re also going to need the tidyverse for the analysis, rvest for some initial webscraping of twitter names, lubridate for some date manipulation and stringr for some minor text mining.

Edinbr: Text Mining with R

During a very quick tour of Edinburgh (and in particular some distilleries), Dave Robinson (of tidytext fame) was able to drop by the Edinburgh R meet-up group to give a very neat talk on tidy text. The first part of the talk set the scene: What does text mean? Why make text tidy? What sort of problems can you solve? This was a very neat overview of the topic and gave persuasive arguments for using a data frame to manipulate text.

R & Python Machine Learning Courses

Contents: Leeds (Predictive Analytics in R); London (TensorFlow); Birmingham (Python & Machine Learning)
Hi there! We’re running some courses on R, Python and TensorFlow around the UK that you might be interested in! All courses are built around lectures by one of our first-class trainers. The lectures are interspersed with practicals and coffee breaks. Attendees get a set of in-depth notes to accompany the lectures. More details and information on prerequisite knowledge are available on our course description page.

Free ticket to eRum

Contents: The Main Competition; The Secondary Competition
So… big news. Jumping Rivers is sponsoring eRum 2018 and, in light of this news, we are giving away a free place at the conference! (Not to mention that our very own lead consultant, Colin Gillespie, is one of the invited speakers.) The Main Competition: Here at Jumping Rivers, we maintain the site meetingsR. This comprises three comprehensive lists: All upcoming (and past) R conferences.

Our Logo In R

Hi all, so given our logo here at Jumping Rivers is a set of lines designed to look like a Gaussian Process, we thought it would be a neat idea to recreate this image in R. To do so, we’re going to need a couple of packages. We do the usual install.packages() dance (remember this step can be performed in parallel): install.packages(c("ggplot2", "ggalt", "readr")). We’re also going to need the data containing the points for the lines and which set of points belongs to which line.

Styling Base R Graphics

Contents: Publication quality base R graphics; Fixing the problem; Why not use ggplot2 (or something else)?
Publication quality base R graphics: Base R graphics get a bad press (although to be fair, they could have chosen their default values better). In general, they are viewed as a throwback to the dawn of the R era. I think that most people would agree that, in general, there are better graphics techniques in R (e.

StanCon 2018 Highlights

Contents: Highlights from StanCon 2018
This year we had the privilege of sponsoring StanCon. Unfortunately, we weren’t able to actually attend the conference. Rather than let our ticket go to waste, we ran a small competition, which Ignacio Martinez won with his very cool (but in alpha stage) R package - see gif above. Highlights from StanCon 2018: During my econ PhD I learned a lot about frequentist statistics.

SatRday in South Africa

Contents: What is SatRday?; SatRday in Cape Town; Be in it to win it
Jumping Rivers is proud to be sponsoring the upcoming SatRday conference in Cape Town, South Africa on 17th March 2018. What is SatRday? SatRdays are a collection of free/cheap, accessible R conferences organised by members of the R community at various locations across the globe. Each SatRday looks to provide talks and/or workshops by R programmers covering the language and its applications, and is run as a not-for-profit event.

The Trouble with Tibbles

Contents: What are tibbles?; Precursors; Tribblemaking; Tibbles vs Data Frames; Disadvantages; To summarise..
Let’s get something straight: there isn’t really any trouble with tibbles. I’m hoping you’ve noticed this is a play on the 1967 Star Trek episode, “The Trouble with Tribbles”. I’ve recently got myself a job as a Data Scientist here at Jumping Rivers. Having never come across tibbles until this point, I now find myself using them in nearly every R script I compose.

Conference Cost

In last week’s post we tantalised you with upcoming R & data science conferences, but from a cost point of view, not all R conferences are the same. Using the R conference site, it’s fairly easy to compare the cost of previous R conferences. I selected the main conferences over the last few years and obtained the cost for the full ticket (including any tutorial days, but ignoring any discounts). Next, I converted all prices to dollars and calculated the cost per day.

Upcoming R conferences (2018)

It’s that time of year when we need to start thinking about which R conferences we would like to (and can!) attend. To help plan your (ahem) work trips, we thought it would be useful to list the upcoming main attractions. We maintain a list of upcoming rstats conferences. To keep up to date, just follow our twitter bot. rstudio::conf (San Diego, USA): rstudio::conf is about all things R and RStudio

Hosting RStudio Server on Azure

Contents: Can’t be bothered reading, tell me now; Getting started; Setting up R; Opening ports ready for RStudio; Installing RStudio; Nicer URLs; Adding SSL
Can’t be bothered reading, tell me now: Host RStudio Server on an Azure instance. Configure the instance to access RStudio with a nice URL. Getting started: Azure is a cloud computing platform provided by Microsoft, the same idea as AWS by Amazon. In this post, we’ll describe how to use Azure to run RStudio Server in the cloud.

Competition: StanCon 2018 ticket

Contents: The prize; How do I enter?; FAQ
Today we are happy to announce our Stan contest. Something we feel very strongly about at Jumping Rivers is giving back to the community. We have benefited immensely from hard work by numerous people, so when possible, we try to give something back. This year we’re sponsoring StanCon 2018. If you don’t know, Stan is freedom-respecting, open-source software for facilitating statistical inference at the frontiers of applied statistics.

Comparing plotly & ggplotly plot generation times

Contents: Prerequisites; Analysis; Summary
The plotly package. A godsend for interactive documents, dashboards and presentations. For such documents there is no doubt that anyone would prefer a plot created in plotly rather than ggplot2. Why? Using plotly gives you neat and, crucially, interactive options at the top, whereas ggplot2 objects are static. In an app we have been developing here at Jumping Rivers, we found ourselves asking the question: would it be quicker to use plot_ly() or to wrap a ggplot2 object in ggplotly()?

Official StanCon Sponsor

Stan is freedom-respecting, open-source software for facilitating statistical inference at the frontiers of applied statistics. Or to put it another way, it makes Bayesian inference fast and (a bit) easier. StanCon is the premier conference for all things Stan related, and this year it will take place at the Asilomar Conference Grounds, a National Historic Landmark on the Monterey Peninsula right on the beach. RStan and other interfaces: One of the great features of Stan is that you can use Stan via R (or Python or …).

Timing in R

Contents: Nested timings: 1) Sys.time(), 2) The tictoc package; Comparing functions: 1) system.time(), 2) The microbenchmark package; Conclusion
As time goes on, your R scripts are probably getting longer and more complicated, right? Timing parts of your script could save you precious time when re-running code over and over again. Today I’m going to go through the 4 main functions for doing so. Nested timings 1) Sys.
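
To give a flavour of two of the approaches listed, here is a minimal sketch using only base R. The function slow_sum() is a made-up example for illustration, not code from the original post, and the tictoc and microbenchmark packages are left out because they need to be installed separately.

```r
# A hypothetical, deliberately slow function used only for illustration
slow_sum <- function(n) {
  total <- 0
  for (i in seq_len(n)) {
    total <- total + sqrt(i)
  }
  total
}

# Sys.time(): record the clock before and after the code of interest
start <- Sys.time()
result <- slow_sum(1e6)
elapsed <- Sys.time() - start  # a difftime object
print(elapsed)

# system.time(): time a single expression in one call
timing <- system.time(slow_sum(1e6))
print(timing["elapsed"])
```

The Sys.time() pattern is handy for timing a stretch of a script, while system.time() is more natural for timing one expression. Actual timings will vary from machine to machine and from run to run.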

Speeding up package installation

Contents: Can’t be bothered reading, tell me now; The wonder of CRAN; Parallel package installation: Ncpus; Does it work?; A permanent change: .Rprofile; References
Can’t be bothered reading, tell me now: A simple one-line tweak can significantly speed up package installation and updates. The wonder of CRAN: One of the best features of R is CRAN. When a package is submitted to CRAN, not only is it checked under three versions of R
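
Going by the section titles above, the tweak in question is the Ncpus option. The following is a minimal sketch of how that option can be set, not the post's own code. The variable name n_cores is an assumption made for this sketch, and the speed-up applies when more than one source package is being installed.

```r
# Ncpus sets how many processes install.packages() may use when
# installing more than one source package in parallel.
# The parallel package ships with R, so nothing extra is needed.
n_cores <- parallel::detectCores()
if (is.na(n_cores)) {
  n_cores <- 1L  # detectCores() can return NA on some platforms
}

# Set the option for the current session
options(Ncpus = n_cores)
print(getOption("Ncpus"))

# Per-call alternative (commented out so that nothing is installed here):
# install.packages(c("ggplot2", "readr"), Ncpus = n_cores)

# For a permanent change, the same options() call can be placed in ~/.Rprofile
```

Using every available core is only one possible choice. On a shared machine a smaller value may be more considerate.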