API as a package: Structure

Published: September 15, 2022

tags: r, python

This is part one of our three part series

Part 1: API as a package: Structure (this post)
Part 2: API as a package: Logging
Part 3: API as a package: Testing

Introduction

At Jumping Rivers we were recently tasked with taking a prototype application built in {shiny} to a public facing production environment for a public sector organisation. During the scoping exercise it was determined that a more appropriate solution to fit the requirements was to build the application with a {plumber} API providing the interface to the Bayesian network model and other application tools written in R.

When building applications in {shiny} we have for some time been using the “app as a package” approach which has been popularised by tools like {golem} and {leprechaun}, in large part due to the convenience that comes with leveraging the testing and dependency structure that our R developers are comfortable with in authoring packages, and the ease with which one can install and run an application in a new environment as a result. For this project we looked to take some of these ideas to a {plumber} application. This blog post discusses some of the thoughts and resultant structure that came as a result of that process.

As I began to flesh out this blog post I realised that it was becoming very long, and there were a number of different aspects that I wanted to discuss: structure, logging and testing to name a few. To try to keep this a bit more palatable I will instead do a mini-series of blog posts around the API as a package idea and focus predominantly on the structure elements here.

API as a package

There are a few things I really like about the {shiny} app as a package approach that I wanted to reflect in the design and build of a {plumber} application as package.

It encourages a regular structure and organisation for an application. All modules have a consistent naming pattern and structure.
It encourages leveraging the {testthat} package and including some common tests across a series of applications, see golem::use_reccommended_tests() for example.
An instance of the app can be created via a single function call which does all the necessary set up, say my_package::run_app()

Primarily I wanted these features, which could be reused across {plumber} applications that we create both internally and for our clients. As far as I know there isn’t a similar package that provides an opinionated way of laying out a {plumber} application as a package, and it is my intention to create one as a follow up to this work.

Regular structure

When developing the solution for this particular project I did have in the back of my mind that I wanted to create as much reusable structure for any future projects of this sort as possible. I really wanted to have an easy way to, from a package structure, be able to build out an API with nested routes, using code that could easily transfer to another package.

I opted for a structure that utilised the inst/extdata/api/routes directory of a package as a basis with the idea that the following file structure

| inst/extdata/api/routes/
|
| - model.R
| - reports/
  -  |
     | - pdf.R

with example route definitions inside

# model.R
#* @post /prediction
exported_function_from_my_package

# pdf.R
#* @post /weekly
exported_function_from_my_package

would translate to an API with the following endpoints

/model/prediction
/reports/pdf/weekly

A few simple function definitions would allow us to do this for any given package that uses this file structure.

The first function here just grabs the directory from the current package where I will define the endpoints that make up my API.

get_internal_routes = function(path = ".") {
  system.file("extdata", "api", "routes", path,
              package = utils::packageName(),
              mustWork = TRUE)
}

create_routes will recursively list out all of the .R files within the chosen directory and name them according to the name of the file, this will make it easy to build out a a number of “nested” routers that will all be mounted into the same API, achieving the compartmentalisation that we desire. For example the two files at <my_package>/inst/extdata/api/routes/model.R and <my_package>/inst/extdata/api/routes/reports/pdf.R will take on the names "model" and "reports/pdf" respectively.

add_default_route_names = function(routes, dir) {
  names = stringr::str_remove(routes, pattern = dir)
  names = stringr::str_remove(names, pattern = "\\.R$")
  names(routes) = names
  routes
}

create_routes = function(dir) {
  routes = list.files(
    dir, recursive = TRUE,
    full.names = TRUE, pattern = "*\\.R$"
  )
  add_default_route_names(routes, dir)
}

The final few pieces to the puzzle ensure that we have / at the beginning of a string (ensure_slash()), for the purpose of mounting components to my router. add_plumber_definition() just calls the necessary functions from {plumber} to process a new route file, i.e from the decorated functions in the file create the routes, and then mount them at a given path to an existing router object. For example given a file “test.R” that has a #* @get /identity decorator against a function definition and endpoint = "test" we would add /test/identity to the existing router. generate_api() takes a full named vector/list of file paths, ensures they all have an appropriate name and mounts them all to a new Plumber router object.

ensure_slash = function(string) {
  has_slash = grepl("^/", string)
  if (has_slash) string else paste0("/", string)
}

add_plumber_definition = function(pr, endpoint, file, ...) {
  router = plumber::pr(file = file, ...)
  plumber::pr_mount(pr = pr,
                    path = endpoint,
                    router = router
  )
}

generate_api = function(routes, ...) {
  endpoints = purrr::map_chr(names(routes), ensure_slash)
  purrr::reduce2(
    .x = endpoints, .y = routes,
    .f = add_plumber_definition, ...,
    .init =  plumber::pr(NULL)
  )
}

With these defined I can then, as I develop my package, add new routes by defining functions and adding {plumber} tag annotations to files in /inst/ and rebuild the new API with

get_internal_routes() %>%
  create_routes() %>%
  generate_api()

and nothing about this code is specific to my current package so is transferable. As a concrete, but very much simplified example, I might have the following collection of files/annotations under <my_package>/inst/extdata/api/routes

## File: /example.R
# Taken from plumber quickstart documentation
# https://www.rplumber.io/articles/quickstart.html
#* @get /echo
function(msg="") {
  list(msg = paste0("The message is: '", msg, "'"))
}


## File: /test.R
#* @get /is_alive
function() {
  list(alive = TRUE)
}


## File: /nested/example.R
# Taken from plumber quickstart documentation
# https://www.rplumber.io/articles/quickstart.html
#* @get /echo
function(msg="") {
  list(msg = paste0("The message is: '", msg, "'"))
}

which would give me

get_internal_routes() %>%
  create_routes() %>%
  generate_api()

# # Plumber router with 0 endpoints, 4 filters, and 3 sub-routers.
# # Use `pr_run()` on this object to start the API.
# ├──[queryString]
# ├──[body]
# ├──[cookieParser]
# ├──[sharedSecret]
# ├──/example
# │  │ # Plumber router with 1 endpoint, 4 filters, and 0 sub-routers.
# │  ├──[queryString]
# │  ├──[body]
# │  ├──[cookieParser]
# │  ├──[sharedSecret]
# │  └──/echo (GET)
# ├──/nested
# │  ├──/example
# │  │  │ # Plumber router with 1 endpoint, 4 filters, and 0 sub-routers.
# │  │  ├──[queryString]
# │  │  ├──[body]
# │  │  ├──[cookieParser]
# │  │  ├──[sharedSecret]
# │  │  └──/echo (GET)
# ├──/test
# │  │ # Plumber router with 1 endpoint, 4 filters, and 0 sub-routers.
# │  ├──[queryString]
# │  ├──[body]
# │  ├──[cookieParser]
# │  ├──[sharedSecret]
# │  └──/is_alive (GET)

This {cookieCutter} example is available to view at our Github blog repo.

Basic testing

In my real project I refrained from having any actual function definitions being made in inst/. Instead each function that was part of the exposed API was a proper exported function from my package (additionally filenames for said functions followed a regular structure too of api_<topic>.R). This allows for leveraging {testthat} against the logic of each of the functions as well as using other tools like {lintr} and ensuring that dependencies, documentation etc are all dealt with appropriately. Testing individual functions that will be exposed as routes can be a little different to other R functions in that the objects passed as arguments come from a request. As alluded to in the introduction I will prepare another blog post detailing some elements of testing for API as a package but a short snippet that I found particularly helpful for testing that a running API is functioning as I expect is included here.

The following code could be used to set up (and subsequently tear down) a running API that is expecting requests for a package cookieCutter

# tests/testthat/setup.R

## run before any tests
# pick a random available port to serve your app locally
port = httpuv::randomPort()

# start a background R process that launches an instance of the API
# serving on that random port
running_api = callr::r_bg(
  function(port) {
    dir = cookieCutter::get_internal_routes()
    routes = cookieCutter::create_routes(dir)
    api = cookieCutter::generate_api(routes)
    api$run(port = port, host = "0.0.0.0")
  }, list(port = port)
)

# Small wait for the background process to ensure it
# starts properly
Sys.sleep(1)

## run after all tests
withr::defer(running_api$kill(), testthat::teardown_env())

A simple test to ensure that our is_alive endpoint works then might look like

test_that("is alive", {
  res = httr::GET(glue::glue("http://0.0.0.0:{port}/test/is_alive"))
  expect_equal(res$status_code, 200)
})

Logging

{shiny} has some useful packages for adding logging, in particular {shinylogger} is very helpful at giving you plenty of logging for little effort on my part as the user. As far as I could find nothing similar exists for {plumber} so I set up a bunch of hooks, using the {logger} package to write information to both file and terminal. Since that could form it’s own blogpost I will save that discussion for the future.