End-to-end testing with shinytest2: Part 1
This is the first of a series of three blog posts about using the {shinytest2} package to develop automated tests for shiny applications. In the posts we will cover:
- the purpose of browser-driven end-to-end tests for a shiny developer, and tools (like {shinytest2}) that help implement them (this post);
- how to write tests using {shinytest2};
- how best to design your test code so that it supports your future work.
Automated testing is an essential part of any production-quality software project. Much of the focus in the R world is on testing the individual components of a project (the functions, classes etc.), but for those working with {shiny} applications there are great tools that can test your application as if a user were interacting with it. In this blog series, we focus on {shinytest2}, with which we can write tests from a user’s perspective.
But … test code is code: In an evolving project, it requires the same care and maintenance as the rest of your source code. So try to ensure that your test code is descriptive, and has sensible abstractions that reduce future maintenance.
Also … test code defines tests: If your tests pass when they shouldn’t, or they fail for reasons outside of their remit or in unpredictable ways, they aren’t doing their job properly. Reducing false positives and false negatives (suitably defined) may be of as much value to the code that presents your data analysis results as it is to the machine learning model that generated them.
Since these posts cover {shinytest2}, they assume some familiarity with {shiny} and also with R package development. If you want to know a bit more about these topics, there are some wonderful online resources:
- the RStudio (now Posit) {shiny} articles,
- the “Mastering Shiny” book,
- the “R Packages” book.
UI-based End-To-End Testing
{shiny} is a great tool for building interactive data-driven web applications. Many of the apps built with shiny are quite simple, maybe only a few hundred lines of code. But some apps are much, much bigger. As a web application grows in complexity, the developer’s ability to reason about all the different parts of that application evaporates. This makes it harder to:
- add new features (will the new code break existing functionality?)
- fix bugs (it’s hard to find the broken code in a complicated code base)
- and onboard new developers.
As your app evolves over time, you should strive to keep the source code as simple as possible. Good design, documentation and team communication can help, but one of the simplest ways to restrain the complexity of your source code is to invest time in automated testing.
There are several levels at which you can test a shiny application. You might:
- write unit tests to ensure your R (and possibly JavaScript) functions work as expected;
- write reactivity tests for your back-end logic using testServer() and the relatively new moduleServer() syntax;
- check that the app “works” before checking in your code, by opening it in the browser and “clicking about a bit …”;
- or you might formalise the latter by writing some manual test descriptions that define what should happen when the user interacts with your app in their browser.
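To make the second level concrete, here is a minimal sketch of a reactivity test written with testServer(). The module double_server, its input n and its output result are hypothetical names invented for this illustration; only testServer(), moduleServer() and the {testthat} expectations come from the packages themselves.

```r
library(shiny)
library(testthat)

# Hypothetical module (not from any real app): doubles a numeric input.
double_server <- function(id) {
  moduleServer(id, function(input, output, session) {
    doubled <- reactive({
      req(input$n)
      input$n * 2
    })
    output$result <- renderText(doubled())
  })
}

test_that("the module doubles its input", {
  testServer(double_server, {
    # Simulate the user setting the input; no browser is involved.
    session$setInputs(n = 21)
    expect_equal(doubled(), 42)
    expect_equal(output$result, "42")
  })
})
```

A test like this exercises the server-side reactive logic only. It says nothing about whether the UI renders, or whether the input widget is wired up correctly in the browser, which is the gap that browser-driven tests fill.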
Between them, these steps will help prevent you from breaking your app, identify issues that need to be fixed, and help demonstrate to clients or colleagues how your app works or that newly requested features / bug fixes have been implemented. This affords your developers more confidence when they want to restructure and simplify the source code of the app. So everyone wins.
Manual testing (especially the non-exploratory kind) can get pretty tedious though. It’s repetitive, it’s repetitive and it’s repetitive. That’s why you should invest time automating those user-interface-based (browser-side) end-to-end (app-focussed, not component-focussed) tests.
There are many tools for writing tests at this level. Typically the tools have two software components:
- a webdriver that interacts with the app in the browser (or a headless version of a browser), thus mimicking how a user might interact with the app;
- and a test / assertion library to compare the actual state of the app to your expectations.
Example tools in this space include:
- cypress, puppeteer and playwright (in JavaScript)
- selenium (several languages, including R)
- shinytest2 and shinytest (in R)
{shinytest2}
{shinytest2} builds upon the {shinytest} package and was written by Barret Schloerke and his colleagues at RStudio. Like puppeteer, {shinytest2} uses the Chrome DevTools Protocol to interact with the browser, which is a pretty stable basis for building a browser automation tool. (The predecessor, {shinytest}, was built on a now-unsupported browser library called PhantomJS, so we strongly recommend migrating to {shinytest2} if you are still using {shinytest}.) Test scripts are written in R and so should be accessible to R developers who are comfortable with {testthat}. There is an automated tool (described in the next post) for creating these test scripts. Also, {shinytest2} understands the architecture of shiny apps, so it is simple to access the input and output variables that a shiny app stores at any given time, and the inputs can be modified easily as well. Accessing these variables with the more general UI-based end-to-end testing tools is much more difficult.
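As a rough illustration of that input / output access, the sketch below drives a tiny app through a headless browser. The app itself (greeting_app, with a name input and a greeting output) is hypothetical and exists only for this example; it also assumes a Chrome or Chromium installation that {shinytest2} can find.

```r
library(shiny)
library(shinytest2)
library(testthat)

# Hypothetical app, defined inline so the example is self-contained.
greeting_app <- shinyApp(
  ui = fluidPage(
    textInput("name", "Name"),
    textOutput("greeting")
  ),
  server = function(input, output, session) {
    output$greeting <- renderText(paste0("Hello, ", input$name, "!"))
  }
)

test_that("the greeting responds to the name input", {
  # Start the app in a background process and connect a headless browser.
  app <- AppDriver$new(greeting_app, name = "greeting")

  # Modify an input as a user would, then read shiny's view of the state.
  app$set_inputs(name = "Ada")
  expect_equal(app$get_value(input = "name"), "Ada")
  expect_equal(app$get_value(output = "greeting"), "Hello, Ada!")

  app$stop()
})
```

Note that the test refers to inputs and outputs by their shiny IDs rather than by CSS selectors or page coordinates, which is what a general-purpose browser automation tool would require.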
Last, and by no means least, the documentation for {shinytest2} is great and there are several videos online that might help you get up to speed.
A warning
Although end-to-end tests written with {shinytest2} (or the other tools above) can provide good guarantees about the behaviour of a whole application, there are some caveats associated with this kind of test. Compared to component-focussed tests, they tend to be much slower, harder to write, more fragile (to changes in the code) and more flaky (due to external dependencies, the network and so on). So use these tests sparingly. If you can write a unit test that covers the same behaviour, this may be a much better use of both your and your computer’s time.
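One way to act on this advice, sketched under the assumption that some of your app's logic can be pulled out of the server function into a plain R function, is to test that function directly with {testthat}. The helper format_greeting() below is hypothetical and stands in for whatever logic your app computes.

```r
library(testthat)

# Hypothetical helper extracted from a server function.
format_greeting <- function(name) {
  if (!nzchar(name)) {
    return("Hello, stranger!")
  }
  paste0("Hello, ", name, "!")
}

test_that("format_greeting handles named and empty input", {
  expect_equal(format_greeting("Ada"), "Hello, Ada!")
  expect_equal(format_greeting(""), "Hello, stranger!")
})
```

A test like this needs neither a running app nor a browser, so it is cheap to run on every change. The end-to-end tests can then be reserved for checking that the pieces are wired together correctly.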
Summary
Here we have introduced browser-driven end-to-end tests as a way to check that your shiny app behaves as expected. {shinytest2} is a new tool allowing R developers to write this kind of test in R. These tests do have some drawbacks - they can be slow and unpredictable. In subsequent posts we will describe how to write tests using {shinytest2} and introduce some software design approaches that make these tests a bit more future-proof.