R Package Quality: Maintainer Criteria

This is the final part of a five-part series of related posts on validating R packages.
At last we come to the final post! Over the previous four posts, we considered all aspects of how we validate packages. As we’ve constantly repeated, most individual scores aren’t that important. Instead, it’s the cumulative effect that matters; it gives us a hint of where to spend our energy.
This final post considers the package’s maintenance aspects, including update frequency and bug management. The general idea behind this component is to understand whether bugs are addressed in a clear, quick and transparent manner. Some of the scores are subjective, for example the thresholds used to score the bug closure rate. However, as each score is combined with several others, tinkering with any particular score has limited effect.
Score 1: Bug Closure Rate
A score based on the median time taken to close a bug report. If the median is longer than 12 months, the score is 0; between 6 and 12 months, 0.2; between 4 and 6 months, 0.5; between 2 and 4 months, 0.8; and shorter than 2 months, 1.
An analysis of CRAN suggests 70% of packages have a median bug closure time of less than two months.
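The banding above can be sketched as a small lookup. This is only an illustration: `score_closure_rate()` is a hypothetical name, the input is assumed to be a median closure time already expressed in months, and the treatment of values that sit exactly on a boundary is our assumption, as the rule above doesn’t specify it.

```r
# Hypothetical helper: map a median bug closure time (in months) to a score.
score_closure_rate <- function(median_months) {
  stopifnot(is.numeric(median_months), all(median_months >= 0, na.rm = TRUE))
  # Lower edge of each band, in months
  breaks <- c(0, 2, 4, 6, 12)
  scores <- c(1, 0.8, 0.5, 0.2, 0)
  # findInterval() returns i when breaks[i] <= x < breaks[i + 1].
  # Assumption: a value exactly on a boundary falls into the slower band,
  # so exactly 2 months scores 0.8 and exactly 12 months scores 0.
  scores[findInterval(median_months, breaks)]
}

score_closure_rate(c(1, 3, 5, 9, 18))
#> [1] 1.0 0.8 0.5 0.2 0.0
```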
Score 2: Maintainer
A binary score of whether a package has at least one maintainer. All packages on CRAN must have a maintainer.
Score 3: Source Control
A binary score of whether the package has an associated version-controlled repository. This isn’t just GitHub; it also includes R-Forge, GitLab, and the various other flavours of source control out there.
Score 4: Bug Reports URL
A binary score of whether a package links to a location where it is possible to file bug reports. If possible, we try to infer this URL. For example, if the website is a GitHub repo, then it’s almost certain to have an issues page.
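As a rough sketch of that inference, the function below guesses an issues page from a GitHub repository URL. The name `infer_bug_url()` and the regular expression are our own illustration rather than the code behind the score, and only GitHub-style URLs are handled.

```r
# Hypothetical helper: guess an issue tracker URL from a repository URL.
infer_bug_url <- function(repo_url) {
  # Matches http(s)://github.com/<owner>/<repo>, allowing an optional
  # "www.", a trailing ".git", and a trailing slash.
  pattern <- "^https?://(www\\.)?github\\.com/([^/]+)/([^/#?]+?)(\\.git)?/?$"
  is_github <- grepl(pattern, repo_url, perl = TRUE)
  # Anything that doesn't look like a GitHub repo is left as NA
  ifelse(
    is_github,
    sub(pattern, "https://github.com/\\2/\\3/issues", repo_url, perl = TRUE),
    NA_character_
  )
}

infer_bug_url("https://github.com/tidyverse/tibble")
#> [1] "https://github.com/tidyverse/tibble/issues"
```

A package would then score 1 if it either declares a bug reports location itself or a URL can be inferred in this way, and 0 otherwise.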
Score 5: Bugs Status
The proportion of bug reports that are closed. If no issues have ever been opened, a value of 1 is returned.
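In code, this score might look something like the sketch below. `score_bug_status()` and its two arguments are hypothetical names; we assume the open and closed issue counts have already been retrieved from the issue tracker.

```r
# Hypothetical helper: proportion of bug reports that are closed.
score_bug_status <- function(n_closed, n_open) {
  total <- n_closed + n_open
  # No issues ever opened: return 1 rather than dividing by zero
  ifelse(total == 0, 1, n_closed / total)
}

score_bug_status(n_closed = 3, n_open = 1)
#> [1] 0.75
score_bug_status(n_closed = 0, n_open = 0)
#> [1] 1
```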
Score 6: The Number of Contributors
A score based on the number of contributors to the package. Returns 0 for a single contributor, 0.5 for two contributors, and 1 for three or more. Around 60% of CRAN packages have at least two contributors. Only 30% of CRAN packages have more than two contributors.
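A minimal sketch of this step function, assuming the number of contributors has already been counted; `score_contributors()` is a hypothetical name.

```r
# Hypothetical helper: score a package by its number of contributors.
score_contributors <- function(n_contributors) {
  ifelse(n_contributors >= 3, 1,
         ifelse(n_contributors == 2, 0.5, 0))
}

score_contributors(c(1, 2, 3, 10))
#> [1] 0.0 0.5 1.0 1.0
```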
Score 7: Maintainer’s Other Packages
A score based on how many packages the maintainer has on CRAN. A score of 1 indicates three or more CRAN packages, 0.5 two packages, and 0 one or fewer. Around 60% have two packages on CRAN, and 40% have three or more packages.
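One way to picture this score is to count rows per maintainer in a table of packages. The sketch below uses a made-up data frame; the `Package` and `Maintainer` column names mirror those returned by `tools::CRAN_package_db()`, but the function name and the data are invented for illustration.

```r
# Toy data: package and maintainer names are invented.
pkgs <- data.frame(
  Package    = c("pkgA", "pkgB", "pkgC", "pkgD", "pkgE", "pkgF"),
  Maintainer = c("Ann", "Ann", "Ann", "Bob", "Bob", "Cat")
)

# Hypothetical helper: score a maintainer by how many packages they maintain.
score_maintainer_packages <- function(maintainer, pkg_db) {
  n_pkgs <- sum(pkg_db$Maintainer == maintainer)
  if (n_pkgs >= 3) 1 else if (n_pkgs == 2) 0.5 else 0
}

score_maintainer_packages("Ann", pkgs)
#> [1] 1
score_maintainer_packages("Bob", pkgs)
#> [1] 0.5
score_maintainer_packages("Cat", pkgs)
#> [1] 0
```

In practice, matching on a raw maintainer string is fragile, as the same person may appear with different email addresses, so some cleaning would be needed first.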
Examples
Package | No. of Contributors | Bug Status | Closure Rate |
---|---|---|---|
{drat} | 1.00 | 0.75 | 0.80 |
{microbenchmark} | 1.00 | 0.78 | 0.00 |
{shinyjs} | 0.00 | 0.78 | 0.00 |
{tibble} | 1.00 | 0.68 | 0.00 |
{tsibble} | 1.00 | 0.81 | 1.00 |
For clarity, scores where every package scores 1 have been omitted from the table.
All packages have GitHub pages and are authored by experienced R developers.
{shinyjs} scores 0 for the number of contributors, as there is only a single contributor. In the context of Shiny application validation, a sole author is something to be aware of.
The (surprising?) bug closure rate is 0 for {tibble}, {shinyjs}, and {microbenchmark}. Looking at the GitHub Issues for {tibble}, there do seem to be a lot of long-term issues and feature requests. Interestingly, we’ve found that many of the popular packages have a low closure rate score. This is usually because issues are also being used to track future features.
Again, individual scores aren’t the important issue. It’s the overall story!
