Retrieve all SSMs from the GAMBL database within a single genomic coordinate range.

get_ssm_by_region(
  chromosome,
  qstart,
  qend,
  region = "",
  basic_columns = TRUE,
  streamlined = FALSE,
  maf_data,
  seq_type = "genome",
  projection = "grch37",
  from_indexed_flatfile = TRUE,
  augmented = TRUE,
  min_read_support = 3,
  mode = "slms-3",
  verbose = FALSE
)

Arguments

chromosome

The chromosome you are restricting to (with or without a chr prefix).

qstart

Query start coordinate of the range you are restricting to.

qend

Query end coordinate of the range you are restricting to.

region

Region formatted like chrX:1234-5678. Use this instead of specifying chromosome, qstart and qend separately.

basic_columns

Set to FALSE to return the MAF with all 116 columns. Default is TRUE, which returns the first 45 columns. Note that if streamlined is set to TRUE, only two columns will be returned, regardless of what is specified in this parameter.

streamlined

Return Start_Position and Tumor_Sample_Barcode as the only two MAF columns. Default is FALSE. Setting to TRUE overrides anything specified with basic_columns.

maf_data

An already loaded MAF-like object to subset to regions of interest.

seq_type

The seq_type you want back. Default is genome.

projection

Obtain variants projected to this reference (one of grch37 or hg38).

from_indexed_flatfile

Set to TRUE to avoid using the database and instead rely on flatfiles.

augmented

Default is TRUE. Set to FALSE to return the original MAF from each sample of a multi-sample patient instead of the augmented MAF.

min_read_support

Only returns variants with at least this many reads in t_alt_count (for cleaning up augmented MAFs).

mode

Only works with indexed flatfiles. Accepts either "slms-3" or "strelka2" to indicate which variant caller to use. Default is "slms-3".

verbose

Boolean parameter. Default is FALSE.
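
The way the region, min_read_support and streamlined arguments narrow down a MAF can be illustrated on a toy data frame. This is only a sketch under stated assumptions, not the GAMBLR implementation: subset_maf_sketch is a hypothetical helper, the toy values are made up, and matching on Start_Position alone (rather than on any overlap with the range) is an assumption.

```r
# Toy MAF-like data frame; all values are made up for illustration only.
toy_maf <- data.frame(
  Hugo_Symbol = c("MYC", "MYC", "BCL2"),
  Chromosome = c("8", "8", "18"),
  Start_Position = c(128748500, 128750000, 60985000),
  End_Position = c(128748500, 128750000, 60985000),
  Tumor_Sample_Barcode = c("sample_A", "sample_B", "sample_A"),
  t_alt_count = c(12, 2, 7),
  stringsAsFactors = FALSE
)

# Hypothetical helper (NOT part of GAMBLR): keep rows inside the range
# that also have enough supporting reads in t_alt_count.
subset_maf_sketch <- function(maf, chromosome, qstart, qend,
                              min_read_support = 3, streamlined = FALSE) {
  # Compare chromosomes with any "chr" prefix removed on both sides
  chromosome <- sub("^chr", "", chromosome)
  in_region <- sub("^chr", "", maf$Chromosome) == chromosome &
    maf$Start_Position >= qstart &
    maf$Start_Position <= qend
  supported <- maf$t_alt_count >= min_read_support
  out <- maf[in_region & supported, , drop = FALSE]
  # Mimic the documented streamlined behaviour: two columns only
  if (streamlined) {
    out <- out[, c("Start_Position", "Tumor_Sample_Barcode"), drop = FALSE]
  }
  out
}

myc_hits <- subset_maf_sketch(toy_maf, "chr8", 128723128, 128774067)
myc_streamlined <- subset_maf_sketch(toy_maf, "8", 128723128, 128774067,
                                     streamlined = TRUE)
```

With the default min_read_support of 3, the sample_B row is dropped because its t_alt_count is 2; lowering the threshold to 1 would keep it.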

Value

A data frame containing all the MAF data columns (one row per mutation).

Details

This function lets the user specify a region of interest and returns the SSM calls within that region. There are multiple ways a region can be specified. For example, the user can provide the full region in a "region" format (chr:start-end) to the region parameter. Alternatively, the user can provide the chromosome, start and end coordinates individually with the chromosome, qstart, and qend parameters. For more usage examples, refer to the parameter descriptions and examples in the vignettes.

Is this function not what you are looking for? Try one of the following similar functions: get_coding_ssm, get_coding_ssm_status, get_ssm_by_patients, get_ssm_by_sample, get_ssm_by_samples, get_ssm_by_regions.
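
To make the relationship between the two ways of specifying a region concrete, the following sketch splits a region string into the values that would otherwise be passed as chromosome, qstart and qend. parse_region_sketch is a hypothetical helper written for illustration, not a GAMBLR function; stripping thousands separators and the chr prefix are assumptions about convenient behaviour, not a description of what get_ssm_by_region does internally.

```r
# Hypothetical helper (NOT part of GAMBLR): split "chr:start-end" into
# the pieces that the chromosome, qstart and qend arguments expect.
parse_region_sketch <- function(region, keep_chr_prefix = FALSE) {
  # Drop thousands separators such as "128,723,128"
  region <- gsub(",", "", region, fixed = TRUE)
  # Expect <chromosome>:<start>-<end>; parts[1] is the full match
  parts <- regmatches(region,
                      regexec("^([^:]+):([0-9]+)-([0-9]+)$", region))[[1]]
  if (length(parts) != 4) {
    stop("region must look like chr:start-end, got: ", region)
  }
  chromosome <- parts[2]
  if (!keep_chr_prefix) {
    chromosome <- sub("^chr", "", chromosome)
  }
  list(
    chromosome = chromosome,
    qstart = as.numeric(parts[3]),
    qend = as.numeric(parts[4])
  )
}

coords <- parse_region_sketch("chr8:128,723,128-128,774,067")
str(coords)
```

The resulting list holds the same chromosome, qstart and qend values used in the second example of the Examples section.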

Examples

#basic usage
my_mutations = get_ssm_by_region(region = "chr8:128,723,128-128,774,067")

#specifying chromosome, start and end individually
my_mutations = get_ssm_by_region(chromosome = "8",
                                 qstart = 128723128,
                                 qend = 128774067)

#keep all 116 columns in the read MAF
bcl2_all_details = get_ssm_by_region(region = "chr18:60796500-60988073",
                                     basic_columns = FALSE)
#> Warning: One or more parsing issues, call `problems()` on your data frame for details,
#> e.g.:
#>   dat <- vroom(...)
#>   problems(dat)