Detecting Security Vulnerabilities in R Packages
One of our main roles at Jumping Rivers is to set up and provide ongoing maintenance for R, Python and RStudio infrastructure. This typically involves keeping software up to date and making sure everything is running smoothly.
The OSS Index, developed by Sonatype, is a free catalogue of open source components and scanning tools to help developers identify vulnerabilities, understand risk, and keep their software safe.
The {oysteR} package is an R interface to the OSS Index that allows users to scan their installed R packages. A few months ago, I stumbled across a fledgeling version of this package and decided to make a few contributions to help move the package from GitHub to CRAN. A few PRs later, I’m now a co-author and the package is on CRAN.
Installing the {oysteR} package is straightforward, just the usual install.packages() dance:
install.packages("oysteR")
After loading the package
library("oysteR")
we can audit the installed R packages for security vulnerabilities via the command
audit = audit_deps()
which produces the output
# ℹ Calling installed.packages(), this may take time
#
# ── Calling sonatype API: https://www.sonatype.com/ ──
#
# → Using Sonatype tokens
# ℹ Calling API: batch 1 of 2
# ℹ Calling API: batch 2 of 2
#
# ── Vulnerability overview ──
#
# ℹ 218 packages were scanned
# ℹ 190 packages were in the Sonatype database
# ℹ 1 package contains known vulnerability
# ℹ A total of 1 known vulnerability was identified
As the output suggests, this function performs a few steps:
- Calls installed.packages() to determine the installed packages on your machine. Although there are warnings about this taking a little while, I’ve never had any issues.
- Splits the packages into batches of 128 and queries the Sonatype API. Note that I’ve registered with Sonatype. This isn’t strictly necessary, but registering increases the number of API calls you can make; see the GitHub README.
- Summarises the results. In the example above, a total of 218 packages were scanned, 190 were found in the Sonatype database, and a single vulnerability was identified.
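To make the batching step concrete, here is a minimal base R sketch of splitting a vector of package names into batches of 128. This is not the code {oysteR} uses internally; the package names are made-up placeholders, and on a real machine you could use rownames(installed.packages()) instead.

```r
# Hypothetical stand-in for the installed package names (218, as in the
# output above). On a real machine: rownames(installed.packages()).
pkgs = paste0("pkg", seq_len(218))

# Batch size quoted in the text
batch_size = 128

# Label each package with its batch number: 1, 1, ..., 2, 2, ...
batch_id = ceiling(seq_along(pkgs) / batch_size)

# A list with one character vector of package names per batch
batches = split(pkgs, batch_id)

length(batches)   # number of API calls needed
lengths(batches)  # number of packages sent in each call
```

With 218 packages this gives two batches, which matches the "batch 1 of 2" and "batch 2 of 2" messages in the output.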
Packages might not be on Sonatype for a variety of reasons. For example, you may have personal packages. In my case, I have a large number of Jumping Rivers teaching-related packages.
The million-pound question is, what is the vulnerability? To obtain a few more details, we can use
get_vulnerabilities(audit)
which returns a tibble giving further details. In our case, the vulnerable package is {widgetframe}, version 0.3.1 (the latest version at the time of writing).
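If you just want the names and versions of the affected packages, ordinary data frame subsetting does the job. The sketch below uses a toy data frame rather than a real audit result. The column names and the rows other than {widgetframe} are assumptions made up for illustration, so check names(audit) on your own result before adapting it.

```r
# Toy stand-in for an audit result. The column names are assumptions,
# and every row except widgetframe 0.3.1 is invented for illustration.
audit_toy = data.frame(
  package = c("pkgA", "widgetframe", "pkgB"),
  version = c("1.0.0", "0.3.1", "2.1.0"),
  no_of_vulnerabilities = c(0, 1, 0)
)

# Keep only the rows with at least one known vulnerability
is_vulnerable = audit_toy$no_of_vulnerabilities > 0
vulnerable = audit_toy[is_vulnerable, c("package", "version")]

vulnerable
```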
After consulting the link provided by get_vulnerabilities(), we see that the issue
- was originally highlighted by Bob Rudis in 2018;
- concerns “Improper Neutralization of Input During Web Page Generation (‘Cross-site Scripting’)”, which basically means not sanitising URLs;
- has been fixed in the underlying JavaScript package, thanks to a PR from Bob;
- is still present in the R package, which uses the old version of the JavaScript package.
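To give a flavour of what "sanitising URLs" means, here is a generic sketch of an allow-list check that only accepts http and https URLs. It is not the fix applied in the JavaScript library, and the helper name is hypothetical; it simply illustrates the idea of rejecting input such as javascript: URLs before it ends up in a web page.

```r
# Hypothetical helper: accept only URLs that start with http:// or https://
is_allowed_url = function(url) {
  grepl("^https?://", url, ignore.case = TRUE)
}

# An ordinary web address passes the check
is_allowed_url("https://www.sonatype.com/")

# A javascript: URL, a common cross-site scripting vector, is rejected
is_allowed_url("javascript:alert('hello')")
```

A real sanitiser would need to do considerably more than this, for example handling relative URLs and encoded characters.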
Vulnerabilities in R
Right now, the few vulnerabilities that have been detected within R packages typically involve a JavaScript library that has been included. However, it would be a bit hopeful to assume these are the only vulnerabilities around. To paraphrase Donald Trump, “if we don’t look, then we won’t find,” so it is likely that security issues exist in other packages that contain other code, e.g. C++. As R gets more popular, I suspect that it will receive more and more attention from people with nefarious intentions, particularly as we push dashboards and documents to the web.
Summary
Including JavaScript in an R package is amazingly easy; see this recent blog post from Maëlle Salmon and Garrick Aden-Buie for an excellent discussion. However, when we bundle external code within our package, we need to ensure that we update the package at regular intervals. This brings a few challenges:
- For package authors: we need to ensure that our packages are updated regularly. If we decide to stop updating, that’s OK, but we need to let the users know.
- For CRAN: currently, there is no mechanism to remove potentially dangerous packages, but this is the trade-off we have for not breaking builds.
Ultimately, responsibility for the packages in use lies with the users (or their organisations).