Assign CN to SSM. — assign_cn_to_ssm

Annotate mutations with their copy number information.

assign_cn_to_ssm(
  this_sample_id,
  coding_only = FALSE,
  from_flatfile = TRUE,
  use_augmented_maf = TRUE,
  tool_name = "battenberg",
  maf_file,
  maf_df,
  seg_file,
  seg_file_source = "battenberg",
  assume_diploid = FALSE,
  genes,
  include_silent = FALSE,
  this_seq_type = "genome",
  projection = "grch37"
)

Arguments

this_sample_id: Sample ID of the sample you want to annotate.
coding_only: Optional. Set to TRUE to restrict to coding variants only. Default is FALSE.
from_flatfile: Optional. Instead of the database, load the data from a local MAF and seg file.
use_augmented_maf: Boolean indicating whether to use the augmented MAF. Default is TRUE.
tool_name: Name of the tool to be used. Default is "battenberg".
maf_file: Path to maf file.
maf_df: Optional. Use a maf dataframe instead of a path.
seg_file: Path to seg file.
seg_file_source: Specify which copy number calling program the input seg file is from, as the function handles ichorCNA differently than WisecondorX, Battenberg, etc.
assume_diploid: Optional. If no local seg file is provided, instead of defaulting to a GAMBL sample, this parameter annotates every mutation as copy neutral.
genes: Genes of interest.
include_silent: Logical parameter indicating whether to include silent mutations among the coding mutations. Default is FALSE.
this_seq_type: Seq type of the returned data. Default is "genome".
projection: Genome projection that the returned data is in reference to. Default is "grch37".

Value

A list containing a data frame (MAF-like format) with two extra columns, as well as the segmented copy number data with the same copy number information. The two extra columns are:

log.ratio: the log ratio from the seg file (NA when no overlap was found).
CN: the rounded absolute copy number estimate of the region based on log.ratio (NA when no overlap was found).
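
To illustrate how the two extra columns relate, here is a minimal sketch that converts log ratios into rounded absolute copy number. It assumes a log2 ratio and the conventional diploid-baseline conversion CN = 2 * 2^log.ratio. The helper name log_ratio_to_cn is hypothetical and not part of the package, and the function's own conversion may differ (for example, for ichorCNA input).

```r
# Hypothetical helper (not part of the package): convert a log2 ratio to a
# rounded absolute copy number, assuming a diploid baseline, i.e. a
# log ratio of 0 corresponds to CN = 2.
log_ratio_to_cn <- function(log_ratio) {
  round(2 * 2^log_ratio)
}

# Made-up log ratios; NA stands for a mutation with no overlapping segment.
log_ratios <- c(-1, 0, 0.585, 1, NA)
cn <- log_ratio_to_cn(log_ratios)
print(cn)
```

Under this assumption, a log ratio of 0 maps to a copy-neutral state (CN = 2), and NA is propagated for mutations without an overlapping segment.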

Details

This function takes a sample ID with the this_sample_id parameter and annotates mutations with copy number information. A variety of parameters are available for a customized workflow. For example, to restrict the output to coding mutations, set coding_only = TRUE. It is also possible to point the function to an already loaded MAF data frame, or to paths to local MAF and seg files. See the parameters maf_df, maf_file and seg_file for more information on how to use them. This function can also take a vector of genes of interest (genes) that the returned data frame will be restricted to. Is this function not what you are looking for? Try one of the following similar functions: get_cn_segments, get_cn_states, get_sample_cn_segments.
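
The annotation step described above can be sketched in base R as follows. This is only an illustration on toy data, not the package's implementation: the helper lookup_log_ratio, the seg column names (chrom, start, end) and all values are assumptions made for the example, and the diploid-baseline conversion to CN is likewise assumed.

```r
# Toy MAF-like data frame (values are made up for illustration).
maf <- data.frame(
  Hugo_Symbol = c("MYC", "BCL2", "TP53"),
  Chromosome = c("8", "18", "17"),
  Start_Position = c(128750000, 60800000, 7578000),
  stringsAsFactors = FALSE
)

# Toy segments; the column names here are assumed, not taken from the package.
seg <- data.frame(
  chrom = c("8", "18"),
  start = c(100000000, 60000000),
  end = c(140000000, 61000000),
  log.ratio = c(1, -1),
  stringsAsFactors = FALSE
)

# Hypothetical helper: log ratio of the first segment containing the
# position, or NA when no segment overlaps.
lookup_log_ratio <- function(chrom, pos, seg) {
  hit <- which(seg$chrom == chrom & seg$start <= pos & seg$end >= pos)
  if (length(hit) == 0) NA_real_ else seg$log.ratio[hit[1]]
}

# Annotate every mutation with the log ratio of its overlapping segment.
maf$log.ratio <- mapply(
  lookup_log_ratio, maf$Chromosome, maf$Start_Position,
  MoreArgs = list(seg = seg), USE.NAMES = FALSE
)

# Assumed diploid-baseline conversion to rounded absolute copy number.
maf$CN <- round(2 * 2^maf$log.ratio)

# Restrict to genes of interest, mirroring what the genes argument does.
genes <- c("MYC", "TP53")
maf_subset <- maf[maf$Hugo_Symbol %in% genes, ]
print(maf_subset)
```

In this toy example the TP53 mutation has no overlapping segment, so both log.ratio and CN are NA for that row, matching the behaviour described under Value.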

Examples

cn_list = assign_cn_to_ssm(this_sample_id = "HTMCP-01-06-00422-01A-01D",
                           coding_only = TRUE)
#> trying to find output from: battenberg
#> looking for flatfile: /projects/nhl_meta_analysis_scratch/gambl/results_local/gambl/battenberg_current/99-outputs/seg/genome--projection/HTMCP-01-06-00422-01A-01D--HTMCP-01-06-00422-10A-01D--matched.battenberg.grch37.seg