Get Lymphgen.
Get a specific flavour of LymphGen from the main GAMBL outputs.
Usage
get_lymphgen(
  these_samples_metadata,
  flavour,
  lymphgen_file,
  keep_all_rows = FALSE,
  keep_original_columns = FALSE,
  streamlined = FALSE,
  verbose = FALSE
)
Arguments
- these_samples_metadata
A metadata table to auto-subset the data to samples in that table before returning.
- flavour
LymphGen flavour to retrieve, for example "no_cnvs.no_sv.with_A53".
- lymphgen_file
Path to the LymphGen file.
- keep_all_rows
Boolean parameter, default is FALSE.
- keep_original_columns
Boolean parameter, default is FALSE.
- streamlined
Boolean. Set to TRUE to return just a data frame with one column for sample_id and one for the LymphGen class. Default is FALSE.
- verbose
Boolean. Set to TRUE to print informational messages, which is useful for debugging. Default is FALSE.
Value
A list of data frames with the following names:
- lymphgen
A data frame containing the tidy LymphGen output.
- features
A binary matrix indicating which patients had each feature.
- feature_annotation
A data frame with one row per LymphGen feature, reduced to gene (or to arm, for arm-level events), with summary statistics for the feature across the cohort.
- features_long
A data frame with one row per LymphGen feature/patient event.
- sample_annotation
A data frame with one row per sample and columns indicating the number of features for each LymphGen class in that sample.
If run with streamlined = TRUE, a data frame with one column for sample_id and one for the LymphGen class is returned instead.
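The relationship between a long, one-row-per-event table (like features_long) and a binary sample-by-feature matrix (like features) can be illustrated with toy data. The following is a minimal base R sketch, not the package's implementation: the sample IDs are made up, and the column name `feature` is an assumption rather than the actual column name in the returned data frames.

```r
# Toy long-format table: one row per feature/patient event.
# Sample IDs and the `feature` column name are hypothetical.
features_long <- data.frame(
  sample_id = c("S1", "S1", "S2", "S3"),
  feature   = c("BCL2", "EZH2", "KMT2D", "BCL2"),
  stringsAsFactors = FALSE
)

# Cross-tabulate samples against features, then reduce counts to 0/1
# so that repeated events still yield a binary indicator.
counts <- table(features_long$sample_id, features_long$feature)
feature_matrix <- (unclass(counts) > 0) * 1

# Number of distinct features observed in each sample.
features_per_sample <- rowSums(feature_matrix)

print(feature_matrix)
print(features_per_sample)
```

The same row sums, restricted to the features belonging to one LymphGen class, would give a per-class feature count for each sample, which is the kind of summary described for sample_annotation.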
Details
Get a specific flavour of LymphGen from the main GAMBL outputs and tidy the composites. Optionally, return a matrix of features instead.
Examples
my_meta <- get_gambl_metadata()
#> 3273 capture samples are missing a value for protocol. Assuming Exome.
#> 138 biopsies are missing from the biopsy metadata. This should be fixed!
#> affected cohorts: DLBCL_LSARP_Trios,Ennishi_tapestri,SMZL_Strefford,cHL_Maura,MCL_Barcelona
#> 110 biopsies with discrepancies in the pathology field. This should be fixed!
#> 10 biopsies with discrepancies in the time_point field. This should be fixed!
lymphgen_all <- get_lymphgen(
  flavour = "no_cnvs.no_sv.with_A53",
  these_samples_metadata = my_meta,
  keep_original_columns = TRUE
)
#> NO A53
#> Joining with `by = join_by(Sample.Name)`
#> Joining with `by = join_by(Sample.Name, BTG2)`
#> Joining with `by = join_by(Sample.Name)`
#> Joining with `by = join_by(Sample.Name, SOCS1)`
head(lymphgen_all$features[, c(1:14)])
#> # A tibble: 6 × 14
#> sample_id BCL2 EZH2 SOCS1 KMT2D EP300 CREBBP TNFRSF14 MEF2B IRF8 PIM2
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 00-14595_tumo… 1 1 1 0 0 0 0 0 0 1
#> 2 00-14595_tumo… 1 1 1 0 0 0 0 0 0 0
#> 3 00-14595_tumo… 1 1 1 0 0 0 0 0 0 0
#> 4 00-14595_tumo… 1 1 0 0 0 0 0 0 0 1
#> 5 00-15201_tumo… 0 0 0 1 0 0 0 0 0 0
#> 6 00-15201_tumo… 0 0 0 0 1 0 0 0 0 1
#> # ℹ 3 more variables: BTG2 <dbl>, TBL1XR1 <dbl>, KLHL14 <dbl>
head(lymphgen_all$lymphgen[, c(1:14)])
#> # A tibble: 6 × 14
#> sample_id Copy.Number BCL2.Translocation BCL6.Translocation Model
#> <chr> <chr> <chr> <chr> <chr>
#> 1 00-14595_tumorA Not Available Not Available Not Available NoFusCGH
#> 2 00-14595_tumorB Not Available Not Available Not Available NoFusCGH
#> 3 00-14595_tumorC Not Available Not Available Not Available NoFusCGH
#> 4 00-14595_tumorD Not Available Not Available Not Available NoFusCGH
#> 5 00-15201_tumorA Not Available Not Available Not Available NoFusCGH
#> 6 00-15201_tumorB Not Available Not Available Not Available NoFusCGH
#> # ℹ 9 more variables: Confidence.BN2 <dbl>, Confidence.EZB <dbl>,
#> # Confidence.MCD <dbl>, Confidence.N1 <dbl>, Confidence.ST2 <dbl>,
#> # BN2.Feature.Count <dbl>, EZB.Feature.Count <dbl>, MCD.Feature.Count <dbl>,
#> # N1.Feature.Count <dbl>