Consolidate Lymphgen. — consolidate_lymphgen

Replace the lymphgen column in the incoming metadata with one that also includes classifications for additional samples.

consolidate_lymphgen(sample_table, derived_data_path = "", verbose = TRUE)

Arguments

sample_table: Input data frame with metadata.
derived_data_path: Optional path to a folder containing files whose names match the pattern *lymphgen.txt. If left empty (the default), a default path is used.
verbose: Whether to print status messages, such as the data path used and the lymphgen files found. Default is TRUE.
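To preview which files in a folder would match the *lymphgen.txt pattern before passing the folder as derived_data_path, a check along the following lines may help. This is a minimal base R sketch, not the function's own lookup code: the folder demo_dir and the file names are hypothetical, and the glob is translated to a regular expression with glob2rx().

```r
# Hypothetical demo folder holding one matching and one non-matching file.
demo_dir <- file.path(tempdir(), "lymphgen_demo")
dir.create(demo_dir, showWarnings = FALSE)
invisible(file.create(file.path(demo_dir, c("cohortA_lymphgen.txt", "notes.txt"))))

# Translate the glob *lymphgen.txt into a regular expression and list matches.
matches <- list.files(demo_dir, pattern = glob2rx("*lymphgen.txt"))
print(matches)
```

If no file names are returned, the folder holds nothing that follows the expected naming pattern.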

Value

A data frame with a supplemented lymphgen column.

Details

Supplement the "lymphgen" column of the metadata with classifications for additional samples. The input data frame must contain at least the columns "patient_id", which is used as the key to join on, and "lymphgen", which is the column to be supplemented.
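The idea of supplementing a column by patient_id can be illustrated on toy data. The sketch below is written in base R and is not the package's implementation. It assumes that existing lymphgen values are kept, that only missing values are filled, and that no rows are added for patients absent from the input. The tables sample_table and extra and all values in them are made up for illustration.

```r
# Toy metadata: one patient already classified, two not.
sample_table <- data.frame(
  patient_id = c("P1", "P2", "P3"),
  sample_id  = c("P1_tumorA", "P2_tumorA", "P3_tumorA"),
  lymphgen   = c("EZB", NA, NA),
  stringsAsFactors = FALSE
)

# Toy external classifications, standing in for a *lymphgen.txt file.
extra <- data.frame(
  patient_id = c("P2", "P4"),
  lymphgen   = c("MCD", "BN2"),
  stringsAsFactors = FALSE
)

# The two columns the function expects must be present.
stopifnot(all(c("patient_id", "lymphgen") %in% names(sample_table)))

# Look up each sample's patient in the external table (NA when absent).
idx <- match(sample_table$patient_id, extra$patient_id)

# Assumption: keep existing calls and fill only the missing ones.
supplemented <- sample_table
supplemented$lymphgen <- ifelse(
  is.na(sample_table$lymphgen),
  extra$lymphgen[idx],
  sample_table$lymphgen
)

print(supplemented)
```

In this sketch P2 gains a classification from the external table, P1 keeps its existing value, P3 stays unclassified, and P4 is ignored because it has no row in the input.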

Examples

metadata = get_gambl_metadata()
consolidate_lymphgen(sample_table = metadata)
#> No external data path was provided, using default path /projects/nhl_meta_analysis_scratch/gambl/results_local/icgc_dart/derived_and_curated_metadata/
#> Found these file(s) with lymphgen information: METADATA_3_cohorts_lymphgen.txt
#> # A tibble: 1,671 × 110
#>    compression bam_available patient_id sample_id       seq_type capture_space
#>    <chr>       <lgl>         <chr>      <chr>           <chr>    <chr>        
#>  1 bam         TRUE          00-14595   00-14595_tumorA genome   none         
#>  2 cram        TRUE          00-14595   00-14595_tumorB genome   none         
#>  3 cram        TRUE          00-14595   00-14595_tumorC genome   none         
#>  4 bam         TRUE          00-14595   00-14595_tumorD genome   none         
#>  5 cram        TRUE          00-15201   00-15201_tumorA genome   none         
#>  6 cram        TRUE          00-15201   00-15201_tumorB genome   none         
#>  7 bam         TRUE          00-16220   00-16220_tumorB genome   none         
#>  8 cram        TRUE          00-20702   00-20702T       genome   none         
#>  9 cram        TRUE          00-23442   00-23442_tumorA genome   none         
#> 10 cram        TRUE          00-23442   00-23442_tumorB genome   none         
#> # ℹ 1,661 more rows
#> # ℹ 104 more variables: genome_build <chr>, tissue_status <chr>, cohort <chr>,
#> #   library_id <chr>, pathology <chr>, time_point <chr>, protocol <chr>,
#> #   ffpe_or_frozen <chr>, read_length <dbl>, strandedness <chr>,
#> #   seq_source_type <chr>, EBV_status_inf <chr>, link_name <chr>,
#> #   data_path <chr>, unix_group <chr>, biopsy_id <chr>, fastq_link_name <chr>,
#> #   fastq_data_path <chr>, COO_consensus <chr>, DHITsig_consensus <chr>, …