Annotate SSM with Blacklists
Annotate a MAF data frame with existing blacklists and automatically drop blacklisted variants.
Usage
annotate_ssm_blacklist(
mutations_df,
this_seq_type,
tool_name = "slms_3",
tool_version = "1.0",
annotator_name = "vcf2maf",
annotator_version = "1.2",
genome_build = "grch37",
project_base,
blacklist_file_template,
drop_threshold = 4,
return_blacklist = FALSE,
use_curated_blacklist = FALSE,
verbose = FALSE,
invert = FALSE
)
Arguments
- mutations_df
A data frame with mutation data.
- this_seq_type
The seq_type of your mutations if you prefer to apply only the corresponding blacklist. More than one seq_type can be specified as a vector if desired. This parameter is required.
- tool_name
The tool or pipeline that generated the files (should be the same for all).
- tool_version
The version of the tool specified under tool_name.
- annotator_name
Name of annotator, default is "vcf2maf".
- annotator_version
Version of the annotator specified under annotator_name.
- genome_build
The genome build projection for the variants you are working with (default is grch37).
- project_base
Optional: A full path to the directory that your blacklist_file_template is relative to.
- blacklist_file_template
Optional: A string containing the path to your blacklist file relative to project_base (i.e. starting from results), with any wildcards surrounded by curly braces.
- drop_threshold
The minimum count from one of the blacklists to drop a variant.
- return_blacklist
Boolean parameter for returning the blacklist. Default is FALSE.
- use_curated_blacklist
Boolean parameter for using a curated blacklist, default is FALSE.
- verbose
For debugging, print out a bunch of possibly useful information.
- invert
USE WITH CAUTION! This returns only the variants that would be dropped in the process (opposite of what you want, probably).
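The way project_base and blacklist_file_template combine can be sketched in base R. This is only an illustration of curly-brace wildcards being filled in, not the function's actual implementation: the helper fill_template, the directory /path/to/project, the template string, and the seq_type values are all hypothetical.

```r
# Hypothetical helper: replace each {name} placeholder with its value.
fill_template <- function(template, values) {
  for (name in names(values)) {
    placeholder <- paste0("{", name, "}")
    template <- gsub(placeholder, values[[name]], template, fixed = TRUE)
  }
  template
}

# Assumed values, for illustration only.
project_base <- "/path/to/project"
blacklist_file_template <-
  "results/{tool_name}-{tool_version}/{seq_type}--{genome_build}_blacklist.txt"

# One relative path per seq_type, since this_seq_type may be a vector.
relative_paths <- vapply(
  c("genome", "capture"),
  function(seq_type) {
    fill_template(
      blacklist_file_template,
      list(
        tool_name = "slms_3",
        tool_version = "1.0",
        seq_type = seq_type,
        genome_build = "grch37"
      )
    )
  },
  character(1)
)

# Prepend the base directory to get the full blacklist paths.
blacklist_paths <- file.path(project_base, relative_paths)
print(blacklist_paths)
```

Keeping the template relative to project_base means the same template can be reused when the project directory is mounted somewhere else.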
Value
A MAF format data frame with two new columns indicating the number of occurrences of each variant in the two blacklists.
Details
Annotate and auto-drop a MAF data frame with existing blacklists to remove variants that would be dropped during the merge process.
This function returns a MAF format data frame with two new columns, indicating the number of occurrences of each variant in the two blacklists.
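As a rough picture of what the two new columns contain, the sketch below joins per-variant counts from two toy blacklists onto a toy MAF using base R. The blacklist layout (Chromosome, Start_Position, count), the column names genome_count and capture_count, the helper annotate_counts, and all coordinates and counts are assumptions made up for this example; the real function may name and build these columns differently.

```r
# Toy MAF-like data frame (illustrative coordinates).
maf <- data.frame(
  Hugo_Symbol = c("MYC", "EZH2", "TP53"),
  Chromosome = c("8", "7", "17"),
  Start_Position = c(128750000, 148508727, 7577120),
  stringsAsFactors = FALSE
)

# Two hypothetical blacklists: how often each position was seen.
genome_blacklist <- data.frame(
  Chromosome = "7", Start_Position = 148508727, count = 6,
  stringsAsFactors = FALSE
)
capture_blacklist <- data.frame(
  Chromosome = "17", Start_Position = 7577120, count = 2,
  stringsAsFactors = FALSE
)

# Hypothetical helper: left-join one blacklist and store its counts
# in a new column, using 0 for variants absent from the blacklist.
annotate_counts <- function(maf, blacklist, count_column) {
  names(blacklist)[names(blacklist) == "count"] <- count_column
  annotated <- merge(
    maf, blacklist,
    by = c("Chromosome", "Start_Position"),
    all.x = TRUE
  )
  annotated[[count_column]][is.na(annotated[[count_column]])] <- 0
  annotated
}

annotated <- annotate_counts(maf, genome_blacklist, "genome_count")
annotated <- annotate_counts(annotated, capture_blacklist, "capture_count")
print(annotated)
```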
Note that this function has a collection of parameters that improve its flexibility for many applications, such as return_blacklist (returns the blacklist that was used, either to the object the function call is assigned to or printed to the terminal if the call is not assigned).
To return the variants that would be dropped, specify invert = TRUE. Please use this with caution, as it is most likely the opposite of what you want from this function.
Lastly, the minimum count from one of the blacklists to drop a variant is specified with drop_threshold (4 by default).
This function also conveniently lets you know how many variants were dropped in the annotation process.
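The interaction of drop_threshold and invert can be illustrated with a small base R sketch. It assumes that a variant is dropped when its count in either blacklist is at least drop_threshold, which the wording "minimum count" suggests but does not spell out. The helper filter_by_blacklist, the column names genome_count and capture_count, and the counts themselves are hypothetical.

```r
# Toy annotated data with one count column per blacklist (made-up values).
annotated <- data.frame(
  Hugo_Symbol = c("MYC", "EZH2", "TP53", "KMT2D"),
  genome_count = c(0, 6, 1, 3),
  capture_count = c(0, 0, 4, 2),
  stringsAsFactors = FALSE
)

# Hypothetical helper mirroring the documented behaviour.
filter_by_blacklist <- function(annotated, drop_threshold = 4, invert = FALSE) {
  # Assumption: a count equal to the threshold is enough to drop a variant.
  blacklisted <- annotated$genome_count >= drop_threshold |
    annotated$capture_count >= drop_threshold
  # Report how many variants exceed the threshold.
  message(sum(blacklisted), " variants were dropped")
  if (invert) {
    annotated[blacklisted, , drop = FALSE]   # only the variants that would be dropped
  } else {
    annotated[!blacklisted, , drop = FALSE]  # the usual, filtered result
  }
}

kept <- filter_by_blacklist(annotated)
dropped <- filter_by_blacklist(annotated, invert = TRUE)
print(kept$Hugo_Symbol)
print(dropped$Hugo_Symbol)
```

Calling the helper with invert = TRUE is a convenient way to inspect what a given threshold removes before committing to it.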