Main Functions

Primary user-facing functions for creating data objects

cvo()
Read in a CombinedVariantOutput.tsv file and store as an object
cnv()
Read in a *CopyNumberVariants.vcf file and store as an object
qualitymetrics()
Read in a MetricsOutput.tsv file and store as an object
tmb()
Read in a TMB_trace.tsv file and store as an object

Reading Data Files

Import TSO500 data files

read_qmo_data()
Read in a batch of MetricsOutput.tsv files into a list
read_cvo_data()
Read in a batch of CombinedVariantOutput.tsv files into a list
read_small_variants()
Read in a list of combined.variant.output objects and return a data frame of small variants per sample
read_gene_amplifications()
Read in a list of combined.variant.output objects and return a data frame of gene amplifications per sample
read_fusions()
Read in a list of combined.variant.output objects and return a data frame of fusions per sample
read_splice_variants()
Read in a list of combined.variant.output objects and return a data frame of splice variants per sample
read_tmb_trace_data()
Read in a batch of TMB_trace.tsv files into a list
read_tmb_details_data()
Read in a batch of *tmb.json files into a list
read_tmb_details_data_csv()
Read in a batch of *tmb.metrics.csv files into a list
read_cnv_data()
Read in a batch of *CopyNumberVariants.vcf files into a list of CNV objects
read_annotation_data()
Read Excel spreadsheet into a list of data frames, per sheet
read_analysis_status()
Read in a list of combined.quality.metrics.output objects and return a data frame of the analysis status per timepoint
read_run_qc_metrics()
Read in a list of combined.quality.metrics.output objects and return a data frame of run QC metrics per timepoint

Quality Metrics

Extract and access QC metrics

get_analysis_details_df()
Extracts all analysis details from list of combined.variant.output objects, returning a data frame of analysis details per sample
get_analysis_status()
Extract the analysis status from combined.quality.metrics.output object and return in data frame format
get_analysis_status(<combined.quality.metrics.output>)
Get analysis status from combined.quality.metrics.output object
get_count_df()
Extracts the counts of the different variant types from list of combined.variant.output objects, returning a data frame of counts per sample
get_dna_expanded_metrics()
Extract expanded DNA QC metrics from combined.quality.metrics.output object and return in data frame format
get_dna_expanded_metrics(<combined.quality.metrics.output>)
Get expanded DNA QC metrics from combined.quality.metrics.output object
get_dna_qc_metrics()
Extract DNA QC metrics from combined.quality.metrics.output object and return in data frame format
get_dna_qc_metrics(<combined.quality.metrics.output>)
Get DNA QC metrics from combined.quality.metrics.output object
get_dna_qc_metrics_cnv()
Extract DNA QC metrics for CNV from combined.quality.metrics.output object and return in data frame format
get_dna_qc_metrics_cnv(<combined.quality.metrics.output>)
Get DNA QC metrics for CNV from combined.quality.metrics.output object
get_dna_qc_metrics_msi()
Extract DNA QC metrics for MSI from combined.quality.metrics.output object and return in data frame format
get_dna_qc_metrics_msi(<combined.quality.metrics.output>)
Get DNA QC metrics for MSI from combined.quality.metrics.output object
get_dna_qc_metrics_snvtmb()
Extract DNA QC metrics for small variants and TMB from combined.quality.metrics.output object and return in data frame format
get_dna_qc_metrics_snvtmb(<combined.quality.metrics.output>)
Get DNA QC metrics for small variants and TMB from combined.quality.metrics.output object
get_fusions()
Extract fusions from combined.variant.output object and return in data frame format
get_fusions(<combined.variant.output>)
Get fusions from combined.variant.output object
get_gene_amplifications()
Extract gene amplifications from combined.variant.output object and return in data frame format
get_gene_amplifications(<combined.variant.output>)
Get gene amplifications from combined.variant.output object
get_metrics_df()
Extracts all TMB/MSI metrics from list of combined.variant.output objects, returning a data frame of TMB/MSI per sample
get_rna_expanded_metrics()
Extract expanded RNA QC metrics from combined.quality.metrics.output object and return in data frame format
get_rna_expanded_metrics(<combined.quality.metrics.output>)
Get expanded RNA QC metrics from combined.quality.metrics.output object
get_rna_qc_metrics()
Extract RNA QC metrics from combined.quality.metrics.output object and return in data frame format
get_rna_qc_metrics(<combined.quality.metrics.output>)
Get RNA QC metrics from combined.quality.metrics.output object
get_run_qc_metrics()
Extract run QC metrics from combined.quality.metrics.output object and return in data frame format
get_run_qc_metrics(<combined.quality.metrics.output>)
Get run QC metrics from combined.quality.metrics.output object
get_sequencing_run_details_df()
Extracts all sequencing run details from list of combined.variant.output objects, returning a data frame of sequencing run details per sample
get_small_variants()
Extract small variants from combined.variant.output object and return in data frame format
get_small_variants(<combined.variant.output>)
Get small variants from combined.variant.output object
get_splice_variants()
Extract splice variants from combined.variant.output object and return in data frame format
get_splice_variants(<combined.variant.output>)
Get splice variants from combined.variant.output object
get_summarised_statistics_df()
Helper function to extract summarized counts based on sample_id
get_tmb_data()
Extract TMB data from tmb.variant.output object
get_tmb_data(<tmb.variant.output>)
Extract TMB data from tmb.variant.output object
read_dna_expanded_metrics()
Read in a list of combined.quality.metrics.output objects and return a data frame of expanded DNA QC metrics per timepoint
read_dna_qc_metrics()
Read in a list of combined.quality.metrics.output objects and return a data frame of DNA QC metrics per timepoint
read_dna_qc_metrics_cnv()
Read in a list of combined.quality.metrics.output objects and return a data frame of DNA QC metrics (CNV) per timepoint
read_dna_qc_metrics_msi()
Read in a list of combined.quality.metrics.output objects and return a data frame of DNA QC metrics (MSI) per timepoint
read_dna_qc_metrics_snvtmb()
Read in a list of combined.quality.metrics.output objects and return a data frame of DNA QC metrics (small variants/TMB) per timepoint
read_rna_expanded_metrics()
Read in a list of combined.quality.metrics.output objects and return a data frame of expanded RNA QC metrics per timepoint
read_rna_qc_metrics()
Read in a list of combined.quality.metrics.output objects and return a data frame of RNA QC metrics per timepoint
extract_metrics()
Helper function to extract TMB/MSI metrics from list of combined.variant.output objects

Filtering and Processing

Filter and process variant data

process_and_filter_small_variant_data()
Process and filter a small variant data frame according to the specified requirements
filter_consequences()
Filters variant data by variant consequence. Removes any variant whose consequence matches the list of submitted consequences or whose consequence field is empty.
filter_depth()
Helper function to filter small variant data according to specified depth
filter_for_cosmic_id()
Helper function to filter small variant data and keep only variants with annotated COSMIC ID(s)
filter_for_included_in_tmb()
Helper function to filter small variant data and keep only variants that are included in TMB numerator
filter_germline_db()
Helper function to filter small variant data according to the GermlineFilterDatabase filter
filter_germline_proxi()
Helper function to filter small variant data according to the GermlineFilterProxi filter
keep_consequences()
Filters variant data by variant consequence. Keeps any variant whose consequence matches the list of submitted consequences.
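
To illustrate the difference between the two consequence filters listed above, here is a minimal base R sketch. It does not call the package functions, whose signatures are not shown on this page. The helpers drop_consequences() and retain_consequences() and the toy data are hypothetical, and the consequence_s column name is borrowed from the add_amplification_data() entry.

```r
# Toy small variant table (made-up values, for illustration only).
variants <- data.frame(
  gene = c("TP53", "KRAS", "EGFR", "BRAF"),
  consequence_s = c("missense_variant", "synonymous_variant", "", "stop_gained"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: drop rows whose consequence is listed or empty,
# mirroring the behaviour described for filter_consequences().
drop_consequences <- function(df, consequences) {
  keep_row <- !(df$consequence_s %in% consequences) & df$consequence_s != ""
  df[keep_row, , drop = FALSE]
}

# Hypothetical helper: keep only rows whose consequence is listed,
# mirroring the behaviour described for keep_consequences().
retain_consequences <- function(df, consequences) {
  df[df$consequence_s %in% consequences, , drop = FALSE]
}

dropped <- drop_consequences(variants, c("synonymous_variant"))
kept <- retain_consequences(variants, c("missense_variant", "stop_gained"))
```

In this toy table both calls leave the TP53 and BRAF rows: the first removes the synonymous KRAS variant and the EGFR row with an empty consequence, the second keeps only the two listed consequences.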

Data Integration

Combine and enrich data

add_tmb_variant_data()
Adds information from TMB trace table to small variant data
add_amplification_data()
Adds amplifications to small variant data; the variant type column (consequence_s) is renamed to variant_type
add_annotation_data()
Adds GEL annotation data to small variant data
get_metrics_df()
Extracts all TMB/MSI metrics from list of combined.variant.output objects, returning a data frame of TMB/MSI per sample
get_count_df()
Extracts the counts of the different variant types from list of combined.variant.output objects, returning a data frame of counts per sample
get_analysis_details_df()
Extracts all analysis details from list of combined.variant.output objects, returning a data frame of analysis details per sample
get_sequencing_run_details_df()
Extracts all sequencing run details from list of combined.variant.output objects, returning a data frame of sequencing run details per sample
summarize_cnv_data()
Read in a batch of *CopyNumberVariants.vcf files into one dataframe
update_annotation_join_columns()
Helper function to make small variant DF joinable to annotation data

Visualization

Create plots and tables

plot_af_density()
Plot allele-frequency kernel density estimate (KDE) for small variants
plot_af_histogram()
Plot allele-frequency histogram for small variants
plot_af_per_variant_consequence()
Plot allele-frequency for small variants per variant consequence
plot_onco_print()
Plot OncoPrint plot (heatmap) for variants
prepare_dataframe_for_oncoprint()
Transforms data frame holding variant information to a matrix that can be used as OncoPrint input
make_qc_table()
Visualize TSO500 QC results as a gt table
add_common_theme_elements()
Add theme elements for visual customisation

Export and Writing

Export data and generate files

generate_dragen_samplesheet()
Generate sample sheet for DRAGEN TSO500 Analysis Pipeline input based on TSO500 sample sheet.
write_workbook()
Save data frames to Microsoft Excel workbook
write_worksheet()
Write worksheet to Excel workbook.
write_multiqc_data()
Write files needed for reporting in MultiQC.
write_rdata_file()
Save data frames to RData object.

Validation

Validate TSO500 data

validate_tso500()
Validator function for combined.variant.output constructor. Not to be called directly. NOT IMPLEMENTED.
validate_tso500_qc()
Validator function for quality.metrics.output constructor. Not to be called directly. NOT IMPLEMENTED.

Parsing Functions

Parse file formats and data structures

parse_illumina_samplesheet()
Parses DRAGEN sample sheet and returns a dataframe
parse_p_dot_notation()
Parses P-Dot notation column, splitting it into distinct NP ID and amino acid variant columns
parse_vcf_to_df()
Parse VCF files for a provided path and construct data frame.
parse_qmo_record()
Helper function to parse key-value lines in MetricsOutput.tsv
parse_qmo_table()
Helper function to parse tabular data in MetricsOutput.tsv
parse_cvo_record()
Helper function to parse key-value lines in CombinedVariantOutput.tsv
parse_cvo_table()
Helper function to parse tabular data in CombinedVariantOutput.tsv
handle_empty_qmo_table_values()
Helper function to handle empty rows in MetricsOutput.tsv tabular data
handle_empty_cvo_table_values()
Helper function to handle empty rows in CombinedVariantOutput.tsv tabular data
trim_qmo_header_and_footer()
Helper function to remove header and footer from MetricsOutput.tsv
trim_cvo_header_and_footer()
Helper function to remove header and footer from CombinedVariantOutput.tsv
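
As a rough picture of what splitting a P-Dot notation column involves (see parse_p_dot_notation() above), the base R sketch below separates an NP ID from the amino acid change. It assumes the two parts are joined by a colon, as in HGVS-style strings; the output column names np_id and aa_variant and the helper pick() are hypothetical, the example values are illustrative, and the package function may work differently.

```r
# Illustrative P-Dot strings; the empty string stands for a variant without one.
p_dot <- c("NP_000537.3:p.(Arg175His)", "NP_004976.2:p.(Gly12Asp)", "")

# Split each string at the first-level colon separator.
parts <- strsplit(p_dot, ":", fixed = TRUE)

# Hypothetical helper: return the i-th piece, or NA when it is missing.
pick <- function(x, i) if (length(x) >= i) x[[i]] else NA_character_

split_p_dot <- data.frame(
  np_id = vapply(parts, pick, character(1), i = 1),
  aa_variant = vapply(parts, pick, character(1), i = 2),
  stringsAsFactors = FALSE
)
```

Empty input strings yield NA in both columns, so rows without a P-Dot annotation stay in the table rather than being dropped.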

Constructors

Internal constructor functions for data objects

new_combined_quality_metrics_output()
Constructor function for quality.metrics.output objects. Not to be called directly.
new_combined_variant_output()
Constructor function for combined.variant.output objects. Not to be called directly.
new_tmb_variant_output()
Constructor function for tmb.variant.output objects. Not to be called directly.
new_cnv_output()
Constructor function for combined.cnv.output objects. Not to be called directly.