Count the variants in a region with a variety of filtering options.

count_ssm_by_region(
  region,
  chromosome,
  start,
  end,
  all_mutations_in_these_regions,
  these_samples_metadata,
  count_by,
  seq_type = "genome"
)

Arguments

region

A region string formatted like chrX:1234-5678. Use this instead of specifying chromosome, start, and end separately.

chromosome

The chromosome you are restricting to (with or without a chr prefix).

start

Query start coordinate of the range you are restricting to.

end

Query end coordinate of the range you are restricting to.

all_mutations_in_these_regions

The output of get_ssm_by_region for the entire region of interest. If you are calling this function many times (e.g. on bins spanning a larger region), you are strongly encouraged to retrieve the mutations once and pass them in here, which saves a lot of time.

these_samples_metadata

A metadata table subset to the sample IDs of interest. If not provided, the function will call get_gambl_metadata and variants will be counted for all samples in the metadata.

count_by

Defaults to counting all variants. Specify 'sample_id' to collapse variants and count each sample only once.

seq_type

The seq_type you want results for. Default is "genome".
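
To make the relationship between region and the separate chromosome, start, and end arguments concrete, here is a minimal base R sketch that splits a region string into its parts. parse_region is a hypothetical helper written for illustration only. It is not a GAMBLR function, and the package may parse region strings differently.

```r
# Hypothetical helper (not part of GAMBLR): split "chrX:1234-5678" into parts.
parse_region <- function(region) {
  # Separate the chromosome from the coordinate range at the colon.
  parts <- strsplit(region, ":", fixed = TRUE)[[1]]
  # Separate the start and end coordinates at the hyphen.
  coords <- strsplit(parts[2], "-", fixed = TRUE)[[1]]
  list(
    chromosome = parts[1],
    start = as.integer(coords[1]),
    end = as.integer(coords[2])
  )
}

parsed <- parse_region("chrX:1234-5678")
str(parsed)
```

Under this reading, region = "chrX:1234-5678" describes the same interval as chromosome = "chrX", start = 1234, end = 5678.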

Details

This function internally calls get_ssm_by_region, so the parameters available here are passed on to that internal call. For more details on how these parameters can be used, refer to get_ssm_by_region.
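
The difference between the default count and count_by = 'sample_id' can be illustrated on a toy table. The sketch below is not the GAMBLR implementation: count_variants is a hypothetical stand-in, the data are made up, the MAF-style column names (Tumor_Sample_Barcode, Start_Position) are an assumed layout, and treating both interval boundaries as inclusive is also an assumption.

```r
# Toy variant table with MAF-style column names (made-up data, assumed layout).
toy_muts <- data.frame(
  Tumor_Sample_Barcode = c("S1", "S1", "S2", "S3", "S3", "S3"),
  Start_Position = c(100L, 250L, 120L, 300L, 310L, 900L),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for the counting step; not the GAMBLR implementation.
count_variants <- function(muts, start, end, sample_ids, count_by = NULL) {
  # Keep variants inside the interval (inclusive bounds assumed here).
  in_region <- muts$Start_Position >= start & muts$Start_Position <= end
  # Keep variants from the samples of interest only.
  in_cohort <- muts$Tumor_Sample_Barcode %in% sample_ids
  kept <- muts[in_region & in_cohort, , drop = FALSE]
  if (is.null(count_by)) {
    # Default: every variant counts, even several from the same sample.
    return(nrow(kept))
  }
  if (identical(count_by, "sample_id")) {
    # Collapse to one count per mutated sample.
    return(length(unique(kept$Tumor_Sample_Barcode)))
  }
  stop("Unsupported count_by value: ", count_by)
}

n_variants <- count_variants(toy_muts, 100, 400, c("S1", "S3"))
n_samples <- count_variants(toy_muts, 100, 400, c("S1", "S3"),
                            count_by = "sample_id")
```

In this toy data, samples S1 and S3 each carry two variants inside the interval, so the per-variant count is larger than the per-sample count.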

Examples

#define a region.
my_region = gene_to_region(gene_symbol = "MYC",
                           return_as = "region")
#> 1 region(s) returned for 1 gene(s)

#get metadata and subset to FL samples
my_metadata = get_gambl_metadata()
fl_metadata = dplyr::filter(my_metadata, pathology == "FL")

#count SSMs for the selected sample subset and defined region.
fl_ssm_counts_myc = count_ssm_by_region(region = my_region,
                                       these_samples_metadata = fl_metadata)
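
The advice to fetch mutations once and reuse them across many bins can be sketched without database access. The base R example below uses a made-up table in place of a single get_ssm_by_region result and counts variants per bin directly from it. The column names, bin size, and inclusive bin boundaries are assumptions chosen for illustration. In real use the pre-fetched table would be passed as all_mutations_in_these_regions for each bin.

```r
# Made-up variants standing in for one get_ssm_by_region result
# covering the whole region of interest (assumed MAF-style columns).
all_muts <- data.frame(
  Tumor_Sample_Barcode = c("S1", "S2", "S2", "S3", "S4"),
  Start_Position = c(1050L, 1400L, 1450L, 2100L, 2900L),
  stringsAsFactors = FALSE
)

# Split the interval 1000-2999 into four 500 bp bins.
bin_starts <- seq(1000L, 2500L, by = 500L)
bin_ends <- bin_starts + 499L

# Count variants per bin from the pre-fetched table, so the
# variants are retrieved once rather than once per bin.
bin_counts <- vapply(seq_along(bin_starts), function(i) {
  sum(all_muts$Start_Position >= bin_starts[i] &
        all_muts$Start_Position <= bin_ends[i])
}, integer(1))

binned <- data.frame(start = bin_starts, end = bin_ends,
                     n_variants = bin_counts)
print(binned)
```

The design point is that the expensive retrieval step happens once, while each bin only needs a cheap in-memory filter.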