DSO: data science operations


DSO is a command-line helper for building reproducible data analysis projects by connecting our favorite tools. It builds on git and dvc for code and data versioning and provides project templates, dependency management via uv, linting checks, and hierarchical overlay of configuration files. It also integrates with quarto and jupyter notebooks.
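To illustrate what "hierarchical overlay of configuration files" means in general, here is a minimal sketch in plain Python. It is not DSO's implementation or API: the `overlay` function, the parameter names, and the values are all hypothetical, and the only assumption is that a more specific configuration level overrides a more general one key by key.

```python
from copy import deepcopy


def overlay(base: dict, override: dict) -> dict:
    """Return a new dict in which `override` is layered on top of `base`.

    Nested dicts are merged recursively; any other value in `override`
    replaces the value in `base`. Neither input is modified.
    """
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = overlay(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical project-wide settings (the more general level)
project_params = {"thresholds": {"fc": 2.0, "p_value": 0.05}, "species": "human"}
# Hypothetical settings for one analysis step (the more specific level)
stage_params = {"thresholds": {"p_value": 0.01}}

effective = overlay(project_params, stage_params)
print(effective)
```

In this sketch the step-level file only needs to state what differs from the project-level defaults; everything else is inherited.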

At Boehringer Ingelheim, we introduced DSO to meet the high quality standards required for biomarker analysis in clinical trials. DSO is under active development and we value community feedback.

[Image: DSO Kraken]

[Image: tools used by DSO]

Getting started

Please refer to the documentation, in particular the getting started section.

Installation

See installation.

Contact

Please use the issue tracker.

Release notes

See the changelog.

License

This program is free software: you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License (LGPL) as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.

Additionally, the template files used internally by dso init and dso create are distributed under the Creative Commons Zero v1.0 Universal license. See also the separate LICENSE file in the templates directory.

Credits

dso was initially developed by

DSO depends on many great open source projects, most notably dvc, hiyapyco and jinja2.