dso-r: R companion package for dso
dso is a command line helper for building reproducible data analysis projects on top of dvc. To learn more about dso, please refer to the dso documentation. {dso-r}
is the R companion package for dso. The purpose of this package is to provide access to files and configuration organized in a dso project.
Installation
For now, it is just possible to install the development version from GitHub:
remotes::install_github("Boehringer-Ingelheim/dso-r")
Typical usage
The DSO R-Package provides convenient access to stage parameters from R scripts or notebooks. Using read_params
the params.yaml
file of the specified stage is compiled and loaded into a dictionary. The path must be specified relative to the project root – this ensures that the correct stage is found irrespective of the current working directory, as long as it the project root or any subdirectory thereof. Only parameters that are declared as params
, dep
, or output
in dvc.yaml are loaded to ensure that one does not forget to keep the dvc.yaml
updated.
library(dso)
params <- read_params("subfolder/my_stage")
# Access parameters
params$thresholds
params$samplesheet
By default, DSO compiles paths in configuration files to paths relative to each stage (see configuration). From R, you can use stage_here
to resolve paths relative to the current stage independent of your current working directory. This works, because read_params
has stored the path of the current stage in a configuration object that persists in the current R session. stage_here
can use this information to resolve relative paths.
samplesheet <- readr::read_csv(stage_here(params$samplesheet))
When modifying the dvc.yaml
, params.in.yaml
, or params.yaml
files during development, use the reload(params)
function to ensure proper application of the changes by rebuilding and reloading the configuration.
reload(params)
Creating a stage within the R environment can be performed using create_stage
and supplying it with the relative path of the stage from project root and a description.
create_stage(name = "subfolder/my_stage", description = "This stage does something")
API documentation
Please refer to the documentation website