count_ssm_by_region.Rd
Count the variants in a region with a variety of filtering options.
count_ssm_by_region(
region,
chromosome,
start,
end,
all_mutations_in_these_regions,
these_samples_metadata,
count_by,
seq_type = "genome"
)
Region formatted like chrX:1234-5678 instead of specifying chromosome, start and end separately.
The chromosome you are restricting to (with or without a chr prefix).
Query start coordinate of the range you are restricting to.
Query end coordinate of the range you are restricting to.
If you are calling this function many times (e.g. for bins spanning a larger region), you are strongly encouraged to save time by running get_ssm_by_region once
on the entire region of interest and passing its output to this function.
A metadata table subset to the sample IDs of interest. If not provided, the function will call get_gambl_metadata
and variants will be counted for all samples in the metadata.
Defaults to counting all variants. Specify 'sample_id' if you want to collapse and count only one variant per sample.
The seq_type you want back; the default is "genome".
This function internally calls get_ssm_by_region, so the parameters available here are passed on to that internal call. For more details on how these parameters can be used, refer to get_ssm_by_region.
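The difference between the default count and count_by = 'sample_id' can be illustrated on a toy table in base R. This is only a sketch of the counting semantics described above, not the package's implementation: the helper parse_region, the toy_muts table and its column names are all hypothetical.

```r
# Hypothetical helper: split a region string such as "chr8:100-260"
# into chromosome, start and end.
parse_region <- function(region) {
  parts <- strsplit(region, "[:-]")[[1]]
  list(
    chromosome = parts[1],
    start = as.numeric(parts[2]),
    end = as.numeric(parts[3])
  )
}

# Toy mutation table (made-up data; column names are assumptions).
toy_muts <- data.frame(
  sample_id = c("S1", "S1", "S2", "S3", "S3"),
  pos = c(105, 110, 250, 260, 399),
  stringsAsFactors = FALSE
)

r <- parse_region("chr8:100-260")

# Keep only the variants that fall inside the region.
in_region <- toy_muts[toy_muts$pos >= r$start & toy_muts$pos <= r$end, ]

# Default behaviour as described: every variant is counted.
n_all <- nrow(in_region)

# count_by = 'sample_id' as described: each sample contributes at most once.
n_samples <- length(unique(in_region$sample_id))
```

Here sample S1 carries two variants in the region, so the per-variant count is larger than the per-sample count.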
#define a region.
my_region = gene_to_region(gene_symbol = "MYC",
                           return_as = "region")
#> 1 region(s) returned for 1 gene(s)
#get metadata and subset
my_metadata = get_gambl_metadata()
fl_metadata = dplyr::filter(my_metadata, pathology == "FL")
#count SSMs for the selected sample subset and defined region.
fl_ssm_counts_myc = count_ssm_by_region(region = my_region,
                                        these_samples_metadata = fl_metadata)
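The advice about calling this function many times (e.g. for bins spanning a larger region) amounts to fetching the mutations once and reusing that table for every bin. The sketch below shows the pattern on made-up data in base R. The table all_muts stands in for a pre-fetched mutation table, and its MAF-style column names as well as the helper count_in_bin are assumptions for illustration, not part of the package.

```r
# Toy stand-in for a mutation table fetched once for the whole region
# (made-up data; MAF-style column names are an assumption).
all_muts <- data.frame(
  Chromosome = "chr8",
  Start_Position = c(105, 110, 250, 260, 399),
  Tumor_Sample_Barcode = c("S1", "S1", "S2", "S3", "S3"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: count rows of the pre-fetched table inside one bin.
count_in_bin <- function(muts, bin_start, bin_end) {
  sum(muts$Start_Position >= bin_start & muts$Start_Position <= bin_end)
}

# Split chr8:100-399 into three bins of 100 bp each.
bin_starts <- seq(100, 300, by = 100)
bin_ends <- bin_starts + 99

# Reuse the same table for every bin instead of querying again per bin.
bin_counts <- data.frame(
  start = bin_starts,
  end = bin_ends,
  n = mapply(function(s, e) count_in_bin(all_muts, s, e), bin_starts, bin_ends)
)
```

With the real functions, the analogous pattern would be to run get_ssm_by_region once on the full region and hand its output to each count_ssm_by_region call through all_mutations_in_these_regions.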