Speeding up your Continuous Integration Builds
Continuous integration is an amazing tool when developing R packages. We push a change to the server, and a process is spawned that checks we haven’t done something silly. It protects us from ourselves! However, this process can become slow, as the CI process typically starts with a blank virtual machine (VM).
If you are using R, then Travis CI is currently the most popular CI pipeline, but there’s also Jenkins, GitHub Actions, GitLab CI, Circle CI and a few others. They all follow the same idea: start a VM, install your R package, then run a bunch of checks. One obvious bottleneck is the “install your R package” step, as any R package may have a large number of dependencies.
In a recent post, we showed the different ways of speeding up package installation (worth checking this out if you find package installation/updating slow). In this post, we’ll discuss leveraging some of those techniques for our CI pipeline.
RStudio Package Manager (RSPM)
The RStudio Package Manager is perhaps the easiest way of speeding up your CI process. RSPM provides precompiled binaries for CRAN packages, which should ensure a faster install. To test this, I made a simple package with no functions but a dependency on the {tidyverse}, i.e. Imports: tidyverse in the DESCRIPTION file. Then I started two Travis CI jobs. The first had the following .travis.yml file:
language: r
cache: packages
The total time for this Travis job was around twelve minutes.
The second job had the same two lines, plus an additional before_install section:
before_install:
  - echo "options(repos = c(CRAN = 'https://packagemanager.rstudio.com/all/__linux__/xenial/latest'))" >> ~/.Rprofile.site
  - echo "options(HTTPUserAgent = paste0('R/', getRversion(), ' R (',
      paste(getRversion(), R.version['platform'], R.version['arch'], R.version['os']),
      ')'))" >> ~/.Rprofile.site
While this looks complicated, it is actually fairly simple. The first line adds the RStudio binary package repository to the R profile (~/.Rprofile.site). The second adds an HTTPUserAgent option to the same file, so that packages installed via Rscript also use the binary package versions. These few lines cut the Travis build time from around 12 minutes to under 4 minutes.
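To see what those two echo lines actually set, the same options can be run in an interactive R session. This is only a sketch of the R code that ends up in the profile; the variable name user_agent is introduced here purely for readability.

```r
# Point CRAN at the RSPM binary repository for Ubuntu 16.04 (xenial)
options(repos = c(CRAN = "https://packagemanager.rstudio.com/all/__linux__/xenial/latest"))

# Build the user agent string from the running R version and platform details
user_agent <- paste0(
  "R/", getRversion(), " R (",
  paste(getRversion(), R.version["platform"], R.version["arch"], R.version["os"]),
  ")"
)
options(HTTPUserAgent = user_agent)

# Inspect the resulting settings
getOption("repos")
getOption("HTTPUserAgent")
```

Printing the two options is a quick way to confirm that a CI job has picked up the profile before any packages are installed.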
The above is an incredibly easy way to speed up your CI steps and works with other CI systems. If you use GitHub Actions, then this has already been implemented.
A couple of things to note
- The above code is for Ubuntu 16.04 Xenial. If you are using 18.04 bionic, then change the code name in the repository URL in the obvious way
- There are a few different OSs available for RSPM
- If you are interested in using the RSPM in your own organisation, give us a shout - we’re RStudio Partners.
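As a rough illustration of the first note, the repository URL can be built from the Ubuntu code name. The helper rspm_repo() below is hypothetical (it is not part of any package), and it assumes other code names follow the same URL layout as the xenial example.

```r
# Hypothetical helper: build the RSPM binary repository URL for an Ubuntu code name.
# The URL pattern is copied from the xenial example; other code names are assumed
# to follow the same layout.
rspm_repo <- function(codename = "xenial") {
  paste0("https://packagemanager.rstudio.com/all/__linux__/", codename, "/latest")
}

# Ubuntu 18.04 (bionic)
options(repos = c(CRAN = rspm_repo("bionic")))
getOption("repos")
```

In a .travis.yml file, the equivalent change is simply replacing xenial with bionic in the first echo line.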
Other methods
There are three other possibilities for reducing your CI time.
- The first is similar to the RStudio Package Manager and uses binary builds, but this time using the Ubuntu versions provided by Michael Rutter. The general idea is to add a new Ubuntu package repository, then install packages via apt install r-cran-*. Details are available at CRAN. Also see Dirk Eddelbuettel’s recent blog post and YouTube video for even more details.
- Alternatively, we could use the ccache trick, where we store compiled files to be used for the next build. This requires a little more work, but this has already been done by Patrick Schratz.
- Parallel builds using the Ncpus argument with install.packages() typically don’t help on most CI systems, as the (free) VM will only have a single core.
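Whether the Ncpus argument is worth trying on a particular CI system depends on how many cores the VM reports, which is easy to check from R. A minimal sketch: the install.packages() call is left commented out because it needs network access, and the package name is only an example.

```r
# Number of cores the machine reports; detectCores() can return NA on some platforms
n_cores <- parallel::detectCores()
if (is.na(n_cores)) n_cores <- 1L
message("Cores available: ", n_cores)

# Ncpus only speeds things up when more than one core is available
# install.packages("tidyverse", Ncpus = n_cores)
```

If the job log shows a single core, the other approaches in this post are likely to be a better use of time.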