I’m Hong Ooi, data scientist with Microsoft Azure Global, and maintainer of the checkpoint package. The checkpoint package makes it easy for you to freeze R packages in time, drawing from the daily snapshots of the CRAN repository that have been archived at MRAN since 2014.
Checkpoint has been around for nearly 6 years now, helping R users solve the reproducible research puzzle. In that time, it’s seen many changes, new features, and, inevitably, bug reports. Some of these bugs have been fixed, while others remain outstanding in the too-hard basket.
Many of these issues spring from the fact that it uses only base R functions, in particular install.packages, to do its work. The problem is that install.packages is meant for interactive use, and as an API, is very limited. For starters, it doesn’t return a result to the caller; instead, checkpoint has to capture and parse the printed output to determine whether the installation succeeded. This causes a host of problems, since the printout will vary based on how R is configured. Similarly, install.packages refuses to install a package if it’s in use, which means checkpoint must unload it first, an imperfect and error-prone process at best.
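To make the missing return value concrete: install.packages returns NULL invisibly whether or not the installation worked, so a caller has to find out some other way. The sketch below is one alternative to parsing printed output, checking the library afterwards. The helper name installed_ok is hypothetical and is not part of checkpoint.

```r
# Hypothetical helper (not part of checkpoint): report which of the requested
# packages are present in a given library after an install attempt.
installed_ok <- function(pkgs, lib = .libPaths()[1]) {
  present <- rownames(installed.packages(lib.loc = lib))
  stats::setNames(pkgs %in% present, pkgs)
}

# install.packages() gives the caller nothing to inspect:
# res <- install.packages("jsonlite")  # res is NULL, success or failure
# installed_ok("jsonlite")             # so check the library instead
```

This only confirms that a package directory exists in the library; it does not tell you why an installation failed.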
In addition to these, checkpoint’s age means that it has accumulated a significant amount of technical debt over the years. For example, there is still code to handle ancient versions of R that couldn’t use HTTPS, even though the MRAN site (in line with security best practice) now accepts HTTPS connections only.
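With HTTPS as the only accepted scheme, building a snapshot repository URL no longer needs any HTTP fallback logic. The following is a minimal sketch of that idea. The helper name snapshot_url is hypothetical, and the URL pattern is an assumption that is not stated in this post.

```r
# Hypothetical helper: build the HTTPS URL for a dated snapshot.
# The "https://mran.microsoft.com/snapshot/<date>" pattern is an assumption.
snapshot_url <- function(snapshot_date) {
  d <- as.Date(snapshot_date, format = "%Y-%m-%d")
  if (is.na(d)) stop("snapshot_date must be in YYYY-MM-DD form")
  paste0("https://mran.microsoft.com/snapshot/", format(d, "%Y-%m-%d"))
}

# Example: point the session at a snapshot repository.
# options(repos = c(CRAN = snapshot_url("2020-01-01")))
```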
I’m happy to announce that checkpoint 1.0 is now in beta. This is a major refactoring/rewrite, aimed at solving these problems. The biggest change is to switch to pkgdepends for the backend, replacing the custom-written code using install.packages. This brings the following benefits:
In addition, checkpoint 1.0 features experimental support for a checkpoint.yml manifest file, to specify packages to include or exclude from the checkpoint. You can include packages from sources other than MRAN, such as Bioconductor or GitHub, or from the local machine; similarly, you can exclude packages which are not publicly distributed (although you’ll still have to ensure that such packages are visible to your checkpointed session).
The overall interface is still much the same. To create a checkpoint, or use an existing one, call the checkpoint() function:
library(checkpoint)
checkpoint("2020-01-01")
This calls out to two other functions, create_checkpoint and use_checkpoint, reflecting the two main objectives of the package. You can also call these functions directly. To revert your session to the way it was before, call uncheckpoint().
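As a rough sketch of calling the two functions directly, the wrapper below creates a checkpoint, switches to it, runs some code, and then reverts the session. The name with_snapshot is hypothetical, and the sketch assumes that create_checkpoint and use_checkpoint take the snapshot date as their first argument, as checkpoint() does; check the help files for the actual signatures.

```r
# Hypothetical wrapper (not part of checkpoint).
# Assumes create_checkpoint() and use_checkpoint() accept the snapshot date
# as their first argument, like checkpoint() does.
with_snapshot <- function(snapshot_date, fun) {
  checkpoint::create_checkpoint(snapshot_date)  # install packages for that date
  checkpoint::use_checkpoint(snapshot_date)     # switch the session to them
  # Revert the session when done, even if fun() fails.
  on.exit(checkpoint::uncheckpoint(), add = TRUE)
  fun()
}

# Example call (needs the checkpoint package and network access):
# with_snapshot("2020-01-01", function() .libPaths())
```

Registering uncheckpoint() with on.exit means the session is reverted even when the wrapped code throws an error.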
One difference to be aware of is that function names and arguments now consistently use snake_case, reflecting the general style seen in the tidyverse and related frameworks. The names of ancillary functions have also been changed, to better reflect their purpose, and the package size has been significantly reduced. See the help files for more information.
There are two main downsides to the change, both due to known issues in the current pkgdepends/pkgcache chain:
… r_version argument to create_checkpoint to install binaries intended for a different R version.
… file:// URL). You must either use the standard MRAN site, or have an actual webserver hosting a mirror of MRAN.
It’s anticipated that these will both be fixed before pkgdepends is released to CRAN.
You can get the checkpoint 1.0 beta from GitHub:
remotes::install_github("RevolutionAnalytics/checkpoint")
Any comments or feedback will be much appreciated. You can email me directly, or open an issue at the repo.