Assign CN state to SSMs. — assign_cn_to

Annotate mutations with their copy number information.

Usage

assign_cn_to_ssm(
  these_samples_metadata,
  maf_data,
  seg_data,
  projection,
  coding_only = FALSE,
  assume_diploid = FALSE,
  include_silent = FALSE,
  ...
)

Arguments

these_samples_metadata: Metadata table with one or more rows to specify the samples to process.
maf_data: A data frame of mutations in MAF format or maf_data object (e.g. from get_coding_ssm or get_ssm_by_sample).
seg_data: A data frame of segmented copy number data or a seg_data object.
projection: The genome projection that the returned data is relative to. This is only required when it cannot be inferred from maf_data or seg_data (or when they are not provided).
coding_only: Optional. Set to TRUE to restrict to variants in coding space. Default is FALSE (work with genome-wide variants).
assume_diploid: Optional. Set to TRUE to annotate every mutation as copy neutral. Default is FALSE.
include_silent: Logical parameter indicating whether to include silent mutations in coding space. Default is FALSE. This parameter only makes sense if coding_only is set to TRUE.
...: Any additional parameters.
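
The interaction between coding_only and include_silent can be illustrated with a small base R sketch. It is not the package's implementation: filter_maf_sketch is a hypothetical helper, the toy table is invented, and the set of classes treated as coding is an assumption based on common MAF Variant_Classification values.

```r
# Toy MAF-like table (invented data, for illustration only).
maf <- data.frame(
  Hugo_Symbol = c("MYC", "MYC", "BCL2", "EZH2"),
  Variant_Classification = c("Missense_Mutation", "Silent",
                             "Intron", "Nonsense_Mutation"),
  stringsAsFactors = FALSE
)

# ASSUMPTION: classes treated as non-silent coding variants.
# The package may use a different set.
coding_classes <- c(
  "Frame_Shift_Del", "Frame_Shift_Ins", "In_Frame_Del", "In_Frame_Ins",
  "Missense_Mutation", "Nonsense_Mutation", "Nonstop_Mutation",
  "Splice_Site", "Translation_Start_Site"
)

# Hypothetical helper mirroring the documented argument semantics.
filter_maf_sketch <- function(maf, coding_only = FALSE, include_silent = FALSE) {
  # Genome-wide mode: include_silent has no effect here.
  if (!coding_only) return(maf)
  keep <- coding_classes
  if (include_silent) keep <- c(keep, "Silent")
  maf[maf$Variant_Classification %in% keep, , drop = FALSE]
}

n_all    <- nrow(filter_maf_sketch(maf))
n_coding <- nrow(filter_maf_sketch(maf, coding_only = TRUE))
n_silent <- nrow(filter_maf_sketch(maf, coding_only = TRUE, include_silent = TRUE))
```

In this sketch include_silent changes the result only when coding_only = TRUE, which matches the note in the argument description.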

Value

A list containing a data frame (MAF-like format) with three extra columns:
- log.ratio: the log ratio from the seg file (NA when no overlap).
- LOH
- CN: the rounded absolute copy number estimate of the region based on log.ratio (NA when no overlap was found).
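
As a rough illustration of how a log ratio can map to an absolute copy number, the sketch below uses the common conversion ploidy * 2^log.ratio followed by rounding. This formula is an assumption and is not confirmed by this page; log_ratio_to_cn is a hypothetical helper, not a package function.

```r
# Hypothetical helper: convert a log2 ratio to a rounded absolute copy number.
# ASSUMPTION: CN = round(ploidy * 2^log.ratio) with a diploid baseline.
log_ratio_to_cn <- function(log_ratio, ploidy = 2) {
  round(ploidy * 2^log_ratio)
}

# Toy log ratios; NA stands for a mutation with no overlapping segment.
log_ratio <- c(-1, 0, 0.585, 1, NA)
cn <- log_ratio_to_cn(log_ratio)
cn
```

Under this assumption a log ratio of 0 corresponds to the copy-neutral state (CN of 2), and NA inputs remain NA.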

Details

This function takes a metadata table and returns all mutations for the samples in that metadata, each annotated with the local copy number state at the mutated site. To restrict the output to coding mutations, set coding_only = TRUE. When maf_data or seg_data is not provided, this function relies on get_ssm_by_samples and get_cn_segments to obtain the required data.
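
The annotation step described above can be pictured as a per-sample overlap lookup between mutation positions and copy number segments. The following base R sketch shows the idea on toy data. It is not the package's implementation: annotate_sketch is a hypothetical helper, the seg column names (ID, chrom, start, end) are assumptions, and the log ratio to CN conversion is the same assumed diploid formula, round(2 * 2^log.ratio).

```r
# Toy mutations (invented data).
maf <- data.frame(
  Tumor_Sample_Barcode = c("S1", "S1", "S2"),
  Chromosome = c("1", "2", "1"),
  Start_Position = c(150, 500, 150),
  stringsAsFactors = FALSE
)

# Toy segments; column names are ASSUMED for illustration.
seg <- data.frame(
  ID = c("S1", "S1", "S2"),
  chrom = c("1", "1", "1"),
  start = c(1, 201, 1),
  end = c(200, 400, 1000),
  log.ratio = c(1, 0, -1),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for each mutation, find the first segment from the
# same sample and chromosome that contains the mutated position.
annotate_sketch <- function(maf, seg) {
  maf$log.ratio <- vapply(seq_len(nrow(maf)), function(i) {
    hit <- seg$ID == maf$Tumor_Sample_Barcode[i] &
      seg$chrom == maf$Chromosome[i] &
      seg$start <= maf$Start_Position[i] &
      seg$end >= maf$Start_Position[i]
    # NA when no segment overlaps the mutation.
    if (any(hit)) seg$log.ratio[which(hit)[1]] else NA_real_
  }, numeric(1))
  # ASSUMPTION: diploid baseline conversion to absolute copy number.
  maf$CN <- round(2 * 2^maf$log.ratio)
  maf
}

annotated <- annotate_sketch(maf, seg)
annotated
```

The row-by-row loop is chosen for readability; a real implementation working on genome-wide calls would more likely use an interval join.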

Examples


if (FALSE) { # \dontrun{
 # long-handed way (mostly for illustration)
 # 1. get some metadata for a collection of samples
 some_meta = suppressMessages(get_gambl_metadata()) %>%
        dplyr::filter(cohort=="DLBCL_ICGC")

 # 2. Get the SSMs for these samples
 ssm_genomes_grch37 = get_coding_ssm(projection = "grch37",
                                   these_samples_metadata = some_meta)
 # peek at the results
 ssm_genomes_grch37 %>% dplyr::select(1:8)

 # 3. Lazily let this function obtain the corresponding seg_data
 #  for the right genome_build
 cn_list = assign_cn_to_ssm(some_meta,ssm_genomes_grch37)

 cn_list$maf %>% dplyr::select(1:8,log.ratio,CN)
 # or using the other genome build:
 ssm_genomes_hg38 = get_coding_ssm(projection = "hg38",
                                  these_samples_metadata = some_meta)
 cn_list = assign_cn_to_ssm(some_meta,ssm_genomes_hg38)
 cn_list$maf %>% dplyr::select(1:8,log.ratio,CN)
} # }

# Easiest/laziest way: Let the function obtain
# the seg_data and maf_data for you

# 1. get some metadata for a collection of samples
some_meta = suppressMessages(get_gambl_metadata()) %>%
  dplyr::filter(cohort=="DLBCL_ICGC") %>% head(3)

cn_list = assign_cn_to_ssm(these_samples_metadata = some_meta,
                           projection = "grch37")
#> dummy segments are not annotated in the inputs
#> fill_missing_with parameter will be ignored
#> Warning: One or more parsing issues, call `problems()` on your data frame for details,
#> e.g.:
#>   dat <- vroom(...)
#>   problems(dat)
#> Warning: One or more parsing issues, call `problems()` on your data frame for details,
#> e.g.:
#>   dat <- vroom(...)
#>   problems(dat)
#> Running in default mode of any...

cn_list$maf %>% dplyr::group_by(Tumor_Sample_Barcode,CN) %>%
  dplyr::count()
#> genomic_data Object
#> Genome Build: grch37 
#> Showing first 10 rows:
#>    Tumor_Sample_Barcode       CN     n
#> 1              SP124957 1.893249    17
#> 2              SP124957 2.000000 10263
#> 3              SP124957 2.112682    96
#> 4              SP124957 2.126767    59
#> 5              SP124957 2.144877   479
#> 6              SP124957 2.156021   149
#> 7              SP124957 3.000000    52
#> 8              SP124957 3.521206   344
#> 9              SP124957 3.551033   712
#> 10             SP124957 3.563755   145