get_gambl_metadata.Rd
Return metadata for a selection of samples.
get_gambl_metadata(
seq_type_filter = "genome",
tissue_status_filter = "tumour",
case_set,
remove_benchmarking = TRUE,
with_outcomes = TRUE,
from_flatfile = TRUE,
sample_flatfile,
biopsy_flatfile,
only_available = TRUE,
seq_type_priority = "genome"
)
seq_type_filter: Filtering criteria (default: all genomes).
tissue_status_filter: Filtering criteria (default: only tumour genomes, can be "mrna" or "any" for the superset of cases).
case_set: Optional short name for a pre-defined set of cases avoiding any embargoed cases (current options: 'BLGSP-study', 'FL-study', 'DLBCL-study', 'FL-DLBCL-study', 'FL-DLBCL-all', 'DLBCL-unembargoed', 'BL-DLBCL-manuscript', 'MCL', 'MCL-CLL').
remove_benchmarking: By default, the FFPE benchmarking duplicate samples will be dropped.
with_outcomes: Optionally join to GAMBL outcome data.
from_flatfile: The default is to use the metadata in the flat files from your clone of the repo. Can be overridden to use the database.
sample_flatfile: Optionally provide the full path to a sample table to use instead of the default.
biopsy_flatfile: Optionally provide the full path to a biopsy table to use instead of the default.
only_available: If TRUE, will remove samples with FALSE or NA in the bam_available column (default: TRUE).
seq_type_priority: For a duplicated sample_id with more than one seq_type available, the metadata will prioritize this seq_type and drop the others.
A data frame with metadata for each biopsy in GAMBL.
This function returns metadata for GAMBL samples. Options for subsetting and filtering the returned data are available. For more information on how to use this function with different filtering criteria, refer to the parameter descriptions, examples and vignettes. To restrict the output to a pre-defined set of cases that avoids embargoed cases, use case_set (current options: 'BLGSP-study', 'FL-study', 'DLBCL-study', 'FL-DLBCL-study', 'FL-DLBCL-all', 'DLBCL-unembargoed', 'BL-DLBCL-manuscript', 'MCL', 'MCL-CLL').
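The filtering behaviour described for seq_type_filter, tissue_status_filter, only_available and seq_type_priority can be illustrated on a toy table. The following is a minimal sketch in base R, not the function's actual implementation: the helper sketch_filter_metadata and the toy rows are hypothetical, and only the column names sample_id, seq_type and bam_available are taken from the descriptions above (tissue_status is an assumed column name).

```r
# Toy stand-in for a metadata table; real GAMBL columns and values may differ.
toy_metadata <- data.frame(
  sample_id     = c("S1", "S1", "S2", "S3", "S4"),
  seq_type      = c("genome", "capture", "capture", "genome", "genome"),
  tissue_status = c("tumour", "tumour", "tumour", "normal", "tumour"),  # assumed column name
  bam_available = c(TRUE, TRUE, TRUE, TRUE, NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of GAMBLR) that mimics the documented filters.
sketch_filter_metadata <- function(meta,
                                   seq_type_filter = "genome",
                                   tissue_status_filter = "tumour",
                                   only_available = TRUE,
                                   seq_type_priority = "genome") {
  # Keep rows matching the requested seq_type and tissue status values.
  keep <- meta$seq_type %in% seq_type_filter &
    meta$tissue_status %in% tissue_status_filter
  if (only_available) {
    # Drop rows where bam_available is FALSE or NA.
    keep <- keep & !is.na(meta$bam_available) & meta$bam_available
  }
  meta <- meta[keep, , drop = FALSE]
  # Put rows with the prioritized seq_type first, then keep the first row
  # seen for each sample_id so the other seq_type rows are dropped.
  meta <- meta[order(meta$seq_type != seq_type_priority), , drop = FALSE]
  meta <- meta[!duplicated(meta$sample_id), , drop = FALSE]
  meta[order(meta$sample_id), , drop = FALSE]
}

result <- sketch_filter_metadata(toy_metadata,
                                 seq_type_filter = c("genome", "capture"))
print(result)
```

With this toy input, S1 keeps only its genome row, S2 keeps its capture row because no genome row exists for it, S3 is dropped as a normal, and S4 is dropped because bam_available is NA.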
#basic usage
my_metadata = get_gambl_metadata()
#use pre-defined custom sample sets
only_blgsp_metadata = get_gambl_metadata(case_set = "BLGSP-study")
#override default filters and request metadata for samples other than tumour genomes,
#e.g. also get the normals
tumour_and_normal_metadata = get_gambl_metadata(tissue_status_filter = c('tumour','normal'))
#get genome and capture samples, keeping the genome entry when a sample_id has both
non_duplicated_genome_and_capture = get_gambl_metadata(seq_type_filter = c('genome', 'capture'),
                                                       seq_type_priority = "genome")