Annotate mutations with their copy number information.

assign_cn_to_ssm(
  this_sample_id,
  coding_only = FALSE,
  from_flatfile = TRUE,
  use_augmented_maf = TRUE,
  tool_name = "battenberg",
  maf_file,
  maf_df,
  seg_file,
  seg_file_source = "battenberg",
  assume_diploid = FALSE,
  genes,
  include_silent = FALSE,
  this_seq_type = "genome",
  projection = "grch37"
)

Arguments

this_sample_id

Sample ID of the sample you want to annotate.

coding_only

Optional. Set to TRUE to restrict to coding variants only.

from_flatfile

Optional. Instead of the database, load the data from a local MAF and seg file.

use_augmented_maf

Boolean indicating whether to use the augmented MAF. Default is TRUE.

tool_name

Name of the tool to be used. Default is "battenberg".

maf_file

Path to maf file.

maf_df

Optional. Use a maf dataframe instead of a path.

seg_file

Path to seg file.

seg_file_source

Specify which copy number calling program produced the input seg file, since ichorCNA output is handled differently from that of WisecondorX, Battenberg, etc. Default is "battenberg".

assume_diploid

Optional. If no local seg file is provided, annotate every mutation as copy neutral instead of defaulting to a GAMBL sample.

genes

Genes of interest.

include_silent

Logical parameter indicating whether to include silent mutations among the coding mutations. Default is FALSE.

this_seq_type

Seq type of the returned data. Default is "genome".

projection

Genome projection that the returned data is in reference to. Default is "grch37".
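The effect described for assume_diploid above can be pictured with a small base R sketch. It is not the function's own code: the helper annotate_diploid is hypothetical, and it assumes that "copy neutral" means a log ratio of 0 and an absolute copy number of 2.

```r
# Toy MAF-like data frame; column names follow the usual MAF convention.
toy_maf = data.frame(
  Hugo_Symbol = c("MYC", "BCL2"),
  Chromosome = c("8", "18"),
  Start_Position = c(128750000, 60790000),
  stringsAsFactors = FALSE
)

# Hypothetical helper: label every mutation as copy neutral.
# Assumption: copy neutral corresponds to log.ratio = 0 and CN = 2.
annotate_diploid = function(maf_df) {
  maf_df$log.ratio = 0
  maf_df$CN = 2
  maf_df
}

diploid_maf = annotate_diploid(toy_maf)
print(diploid_maf)
```

No seg file is consulted here, so every row receives the same annotation.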

Value

A list containing a data frame (MAF-like format) with two extra columns, as well as the segmented copy number data with the same copy number information. The two extra columns are:

log.ratio is the log ratio from the seg file (NA when no overlap was found).

CN is the rounded absolute copy number estimate of the region based on log.ratio (NA when no overlap was found).
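To make the two extra columns concrete, here is a minimal base R sketch on toy data. It is an illustration only, not the function's implementation: the seg column names (chrom, start, end) are hypothetical, and the conversion CN = round(2 * 2^log.ratio) is an assumed diploid-baseline formula rather than something stated on this page.

```r
# Toy MAF-like mutations; column names follow the usual MAF convention.
maf = data.frame(
  Hugo_Symbol = c("MYC", "BCL2", "EZH2"),
  Chromosome = c("8", "18", "7"),
  Start_Position = c(128750000, 60790000, 148500000),
  stringsAsFactors = FALSE
)

# Toy segments; these column names are illustrative only.
seg = data.frame(
  chrom = c("8", "18"),
  start = c(128000000, 60000000),
  end = c(129000000, 61000000),
  log.ratio = c(1, -1),
  stringsAsFactors = FALSE
)

# For each mutation, take the log ratio of the first segment that contains it,
# or NA when no segment overlaps the mutation.
maf$log.ratio = vapply(seq_len(nrow(maf)), function(i) {
  hit = which(seg$chrom == maf$Chromosome[i] &
                seg$start <= maf$Start_Position[i] &
                seg$end >= maf$Start_Position[i])
  if (length(hit) == 0) NA_real_ else seg$log.ratio[hit[1]]
}, numeric(1))

# Assumed conversion from log ratio to absolute copy number (diploid baseline).
maf$CN = round(2 * 2^maf$log.ratio)
print(maf)
```

In this toy case the EZH2 mutation falls outside every segment, so both log.ratio and CN are NA for that row.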

Details

This function takes a sample ID via the this_sample_id parameter and annotates that sample's mutations with copy number information. Several parameters allow the workflow to be customized. For example, set coding_only = TRUE to restrict the output to coding mutations. It is also possible to give the function an already loaded MAF data frame, or paths to local MAF and seg files; see the maf_df, maf_file and seg_file parameters for details. The function can also take a vector of genes of interest (genes) to which the returned data frame will be restricted. Is this function not what you are looking for? Try one of the following, similar, functions: get_cn_segments, get_cn_states, get_sample_cn_segments.
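A call that combines local files with a gene restriction might look like the sketch below. The file paths and the gene list are placeholders invented for illustration, and the call only runs when assign_cn_to_ssm is already loaded in the session and both files exist. Only arguments listed in the usage section are used.

```r
# Hypothetical paths; replace them with your own MAF and seg files.
maf_path = "path/to/sample.maf"
seg_path = "path/to/sample.seg"

# Hypothetical genes of interest.
my_genes = c("MYC", "BCL2")

# Guard so the sketch is a no-op when the function or the files are missing.
can_run = exists("assign_cn_to_ssm", mode = "function") &&
  file.exists(maf_path) && file.exists(seg_path)

if (can_run) {
  cn_list = assign_cn_to_ssm(
    this_sample_id = "HTMCP-01-06-00422-01A-01D",
    maf_file = maf_path,
    seg_file = seg_path,
    seg_file_source = "battenberg",
    genes = my_genes,
    coding_only = TRUE
  )
  # Inspect the top-level structure of the returned list.
  str(cn_list, max.level = 1)
}
```

Setting seg_file_source explicitly is a precaution here, because the page notes that seg files from different callers are handled differently.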

Examples

cn_list = assign_cn_to_ssm(this_sample_id = "HTMCP-01-06-00422-01A-01D",
                           coding_only = TRUE)
#> trying to find output from: battenberg
#> looking for flatfile: /projects/nhl_meta_analysis_scratch/gambl/results_local/gambl/battenberg_current/99-outputs/seg/genome--projection/HTMCP-01-06-00422-01A-01D--HTMCP-01-06-00422-10A-01D--matched.battenberg.grch37.seg