calc_mutation_frequency_sliding_windows.Rd
Count the number of mutations in a sliding window across a region for all samples.
calc_mutation_frequency_sliding_windows(
  this_region,
  chromosome,
  start_pos,
  end_pos,
  metadata,
  seq_type,
  slide_by = 100,
  window_size = 1000,
  plot_type = "none",
  sortByColumns = "pathology",
  return_format = "long-simple",
  min_count_per_bin = 3,
  return_count = FALSE,
  drop_unmutated = FALSE,
  classification_column = "lymphgen",
  from_indexed_flatfile = FALSE,
  mode = "slms-3"
)
Genomic region in bed format.
Chromosome name in region.
Start coordinate of region.
End coordinate of region.
Data frame containing sample IDs and a column with annotated data for the 2 groups of interest. All other columns are ignored. Currently, the function exits if asked to compare more than 2 groups.
The seq_type you want back, default is genome.
Slide size for sliding window, default is 100.
Size of sliding window, default is 1000.
Set to TRUE for a plot of your bins. By default (plot_type = "none"), no plots are made.
Which metadata column(s) to sort on for the heatmap. Default is "pathology".
Return format of mutations. Accepted inputs are "long" and "long-simple". Default is "long-simple".
Minimum counts per bin, default is 3.
Boolean statement to return count. Default is FALSE.
This may not currently work properly. Default is FALSE.
Only used for plotting. Default is "lymphgen".
Set to TRUE to avoid using the database and instead rely on flat-files (only works for streamlined data, not full MAF details). Default is FALSE.
Only works with indexed flat-files. Accepts two options, "slms-3" and "strelka2", to indicate which variant caller to use. Default is "slms-3".
Count matrix.
This function is called to return the mutation frequency for a given region, for all GAMBL samples. Regions are specified with the this_region parameter.
Alternatively, the region of interest can be specified by calling the function with the chromosome, start_pos, and end_pos parameters.
It is also possible to return a plot of the created bins. This is done by setting plot_type = TRUE.
A collection of parameters is available for further customizing the return. For more information, refer to the parameter descriptions and examples.
This function is unlikely to be used directly in most cases. See get_mutation_frequency_bin_matrix instead.
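To illustrate how slide_by and window_size interact, here is a minimal base R sketch of sliding-window counting. It assumes windows start every slide_by bases from start_pos and each spans window_size bases; the package's actual binning may differ. The helpers make_windows and count_per_window and the toy positions are hypothetical and not part of the package.

```r
# Hypothetical helper (not part of the package): build sliding windows.
# Assumption: one window starts every `slide_by` bases between start_pos
# and end_pos, and each window covers `window_size` bases.
make_windows <- function(start_pos, end_pos, slide_by = 100, window_size = 1000) {
  starts <- seq(start_pos, end_pos, by = slide_by)
  data.frame(window_start = starts, window_end = starts + window_size - 1)
}

# Hypothetical helper: count how many positions fall inside each window.
count_per_window <- function(windows, positions) {
  windows$mutation_count <- vapply(
    seq_len(nrow(windows)),
    function(i) {
      sum(positions >= windows$window_start[i] &
            positions <= windows$window_end[i])
    },
    integer(1)
  )
  windows
}

# Toy data: three made-up mutation positions in a small region.
windows <- make_windows(1000, 2000, slide_by = 500, window_size = 1000)
toy_positions <- c(1200, 1600, 2400)
counts <- count_per_window(windows, toy_positions)
print(counts)
```

Because window_size is larger than slide_by here, neighbouring windows overlap, so a single mutation can contribute to more than one window's count.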
chr11_mut_freq = calc_mutation_frequency_sliding_windows(this_region = "chr11:69455000-69459900",
                                                         slide_by = 10,
                                                         window_size = 10000)
#> processing bins of size 10000 across 4900 bp region
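The 4900 bp figure in the message above is consistent with the end coordinate minus the start coordinate of the region string. The following base R sketch shows one way such a string could be split into the chromosome, start_pos, and end_pos values. The helper parse_region is hypothetical and is not the package's own parser; it assumes the chromosome name contains no ":" or "-" characters.

```r
# Hypothetical helper (not part of the package): split "chr:start-end"
# into its three components.
# Assumption: the chromosome name itself contains no ":" or "-".
parse_region <- function(region) {
  parts <- strsplit(region, "[:-]")[[1]]
  list(
    chromosome = parts[1],
    start_pos = as.numeric(parts[2]),
    end_pos = as.numeric(parts[3])
  )
}

region <- parse_region("chr11:69455000-69459900")

# Region size as end minus start.
region_size <- region$end_pos - region$start_pos
print(region_size)
```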