Get a specific flavour of LymphGen from the main GAMBL outputs.

Usage

get_lymphgen(
  these_samples_metadata,
  flavour,
  lymphgen_file,
  keep_all_rows = FALSE,
  keep_original_columns = FALSE,
  streamlined = FALSE,
  verbose = FALSE
)

Arguments

these_samples_metadata

A metadata table. The returned data are subset to the samples in this table.

flavour

LymphGen flavour to retrieve, for example "no_cnvs.no_sv.with_A53".

lymphgen_file

Path to the LymphGen file.

keep_all_rows

Boolean parameter, default is FALSE.

keep_original_columns

Boolean parameter, default is FALSE.

streamlined

Boolean, set to TRUE to get just a data frame with one column for sample_id and one for LymphGen class. Default is FALSE.

verbose

Boolean, set to TRUE to print informational messages. Useful for debugging. Default is FALSE.
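The streamlined argument is described as returning one column for sample_id and one for the LymphGen class, which makes the result easy to attach to a metadata table. The following is a minimal sketch with toy data rather than real GAMBL output: the class column name `LymphGen`, the `pathology` column, and the sample IDs are assumptions made for illustration.

```r
# Toy metadata table (hypothetical sample IDs and columns).
metadata <- data.frame(
  sample_id = c("S1", "S2", "S3"),
  pathology = c("DLBCL", "DLBCL", "FL"),
  stringsAsFactors = FALSE
)

# Toy stand-in for a streamlined result: sample_id plus one class column.
# The column name `LymphGen` is an assumption, not confirmed by this page.
streamlined <- data.frame(
  sample_id = c("S1", "S3"),
  LymphGen  = c("EZB", "Other"),
  stringsAsFactors = FALSE
)

# Left join: keep every metadata row, with NA where no class is available.
annotated <- merge(metadata, streamlined, by = "sample_id", all.x = TRUE)
print(annotated)
```

Using `all.x = TRUE` keeps samples that have no LymphGen call, so missing classifications stay visible as NA instead of silently dropping rows.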

Value

A list of data frames with the following names: lymphgen (a data frame containing the tidy LymphGen output), features (a binary matrix indicating which patients had each feature), feature_annotation (a data frame with one row per LymphGen feature, reduced to gene or, for arm-level events, to arm, with summary statistics for the feature across the cohort), features_long (a data frame with one row per LymphGen feature/patient event), and sample_annotation (a data frame with one row per sample and columns indicating the number of features for each LymphGen class in that sample). If run with streamlined = TRUE, a data frame with one column for sample_id and one for LymphGen class is returned instead.
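To make the relationship between the long and binary representations concrete, here is a minimal base R sketch that turns a long table (one row per feature/patient event) into a binary sample-by-feature table. It uses toy data, the column names `sample_id` and `feature` are assumptions, and it is not necessarily how get_lymphgen builds its own outputs.

```r
# Toy stand-in for a long table: one row per feature/patient event.
# Column names `sample_id` and `feature` are assumptions for illustration.
features_long <- data.frame(
  sample_id = c("S1", "S1", "S2", "S3", "S3", "S3"),
  feature   = c("BCL2", "EZH2", "KMT2D", "BCL2", "SOCS1", "EZH2"),
  stringsAsFactors = FALSE
)

# Cross-tabulate samples against features, then convert counts to 0/1
# so each cell records presence (1) or absence (0).
counts <- table(features_long$sample_id, features_long$feature)
binary <- (unclass(counts) > 0) * 1

# Reshape into a data frame with a leading sample_id column.
features <- data.frame(
  sample_id = rownames(binary),
  as.data.frame.matrix(binary),
  row.names = NULL,
  stringsAsFactors = FALSE
)
print(features)
```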

Details

Get a specific flavour of LymphGen from the main GAMBL outputs and tidy the composites. Optionally return a matrix of features instead.

Examples

my_meta <- get_gambl_metadata()
#> 3273 capture samples are missing a value for protocol. Assuming Exome.
#> 138 biopsies are missing from the biopsy metadata. This should be fixed!
#> affected cohorts:  DLBCL_LSARP_Trios,Ennishi_tapestri,SMZL_Strefford,cHL_Maura,MCL_Barcelona
#> 110 biopsies with discrepancies in the pathology field. This should be fixed!
#> 10 biopsies with discrepancies in the time_point field. This should be fixed!
lymphgen_all <- get_lymphgen(
  flavour = "no_cnvs.no_sv.with_A53",
  these_samples_metadata = my_meta,
  keep_original_columns = TRUE
)
#> NO A53
#> Joining with `by = join_by(Sample.Name)`
#> Joining with `by = join_by(Sample.Name, BTG2)`
#> Joining with `by = join_by(Sample.Name)`
#> Joining with `by = join_by(Sample.Name, SOCS1)`
head(lymphgen_all$features[, 1:14])
#> # A tibble: 6 × 14
#>   sample_id       BCL2  EZH2 SOCS1 KMT2D EP300 CREBBP TNFRSF14 MEF2B  IRF8  PIM2
#>   <chr>          <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>    <dbl> <dbl> <dbl> <dbl>
#> 1 00-14595_tumo…     1     1     1     0     0      0        0     0     0     1
#> 2 00-14595_tumo…     1     1     1     0     0      0        0     0     0     0
#> 3 00-14595_tumo…     1     1     1     0     0      0        0     0     0     0
#> 4 00-14595_tumo…     1     1     0     0     0      0        0     0     0     1
#> 5 00-15201_tumo…     0     0     0     1     0      0        0     0     0     0
#> 6 00-15201_tumo…     0     0     0     0     1      0        0     0     0     1
#> # ℹ 3 more variables: BTG2 <dbl>, TBL1XR1 <dbl>, KLHL14 <dbl>

head(lymphgen_all$lymphgen[, 1:14])
#> # A tibble: 6 × 14
#>   sample_id       Copy.Number   BCL2.Translocation BCL6.Translocation Model   
#>   <chr>           <chr>         <chr>              <chr>              <chr>   
#> 1 00-14595_tumorA Not Available Not Available      Not Available      NoFusCGH
#> 2 00-14595_tumorB Not Available Not Available      Not Available      NoFusCGH
#> 3 00-14595_tumorC Not Available Not Available      Not Available      NoFusCGH
#> 4 00-14595_tumorD Not Available Not Available      Not Available      NoFusCGH
#> 5 00-15201_tumorA Not Available Not Available      Not Available      NoFusCGH
#> 6 00-15201_tumorB Not Available Not Available      Not Available      NoFusCGH
#> # ℹ 9 more variables: Confidence.BN2 <dbl>, Confidence.EZB <dbl>,
#> #   Confidence.MCD <dbl>, Confidence.N1 <dbl>, Confidence.ST2 <dbl>,
#> #   BN2.Feature.Count <dbl>, EZB.Feature.Count <dbl>, MCD.Feature.Count <dbl>,
#> #   N1.Feature.Count <dbl>
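
A binary feature table shaped like the features element printed above (a sample_id column followed by 0/1 feature columns) can be summarised across a cohort with base R. The sketch below uses toy values and hypothetical sample IDs, not real GAMBL data, and the output column names `n_samples` and `fraction` are chosen only for illustration.

```r
# Toy binary feature table: sample_id plus one 0/1 column per feature.
features <- data.frame(
  sample_id = c("S1", "S2", "S3", "S4"),
  BCL2  = c(1, 1, 0, 0),
  EZH2  = c(1, 0, 0, 0),
  KMT2D = c(0, 1, 1, 1),
  stringsAsFactors = FALSE
)

# Every column except the identifier is treated as a feature.
feature_cols <- setdiff(names(features), "sample_id")

# Count the samples carrying each feature and express it as a fraction.
n_with_feature <- colSums(features[, feature_cols])
summary_df <- data.frame(
  feature   = names(n_with_feature),
  n_samples = as.integer(n_with_feature),
  fraction  = as.numeric(n_with_feature) / nrow(features),
  row.names = NULL,
  stringsAsFactors = FALSE
)

# Most frequent features first.
summary_df <- summary_df[order(-summary_df$n_samples), ]
print(summary_df)
```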