Estimate Purity.
estimate_purity.Rd
Annotate a MAF with segmented absolute copy number data and add three additional columns (VAF, Ploidy and Final_purity).
Usage
estimate_purity(
these_samples_metadata,
maf_data,
seg_data,
show_plots = FALSE,
assume_diploid = FALSE,
coding_only = FALSE,
projection,
verbose = FALSE
)
Arguments
- these_samples_metadata
Metadata for one sample
- maf_data
Optional. A data frame of MAF data that is already loaded, used instead of reading a maf file from a path.
- seg_data
Data frame or seg_data object for the sample of interest. Can contain data from other samples, which will be ignored.
- show_plots
Optional. Show two faceted plots that display the VAF and purity distributions for each copy number state in the sample. Default is FALSE.
- assume_diploid
Optional. If no local seg file is provided, instead of defaulting to a GAMBL sample, this parameter annotates every mutation as copy neutral. Default is FALSE.
- coding_only
Optional. Set to TRUE to restrict to coding variants only. Default is FALSE.
- in_maf
Path to a local maf file.
- in_seg
Path to a local corresponding seg file for the same sample ID as the input maf.
- this_seq_type
Seq type for returned CN segments. One of "genome" (default) or "capture".
- verbose
Set to TRUE for more feedback.
Value
A list containing a data frame (MAF-like format) with the segmented absolute copy number data and three extra columns:
- VAF is the variant allele frequency calculated from the t_ref_count and t_alt_count.
- Ploidy is the number of copies of an allele in the tumour cell.
- Final_purity is the finalized purity estimation per mutation after considering different copy number states and LOH events.
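As a rough illustration of how the VAF column relates to the read counts, the sketch below computes VAF on a toy data frame in base R. The gene names and counts are invented, and the `purity_if_diploid` column is a hypothetical helper, not a column returned by estimate_purity. It uses the standard relationship for a clonal, heterozygous mutation in a copy-neutral region (expected VAF = purity / 2); this is an assumption for illustration and not necessarily how the function derives Final_purity.

```r
# Toy MAF-like data frame; gene names and counts are invented for illustration.
maf <- data.frame(
  Hugo_Symbol = c("GENE_A", "GENE_B", "GENE_C"),
  t_ref_count = c(60, 30, 75),
  t_alt_count = c(40, 30, 25)
)

# VAF: fraction of tumour reads that support the alternate allele.
maf$VAF <- maf$t_alt_count / (maf$t_ref_count + maf$t_alt_count)

# Assumption: total copy number 2, one mutated copy, clonal mutation.
# Under that assumption expected VAF = purity / 2, so purity is roughly
# 2 * VAF, capped at 1. This is a hypothetical column, not package output.
maf$purity_if_diploid <- pmin(1, 2 * maf$VAF)

print(maf)
```

Mutations in regions with gains, losses or LOH do not follow this simple relationship, which is why per-mutation estimates need the copy number state.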
Details
This function takes a single row of metadata that defines the sample whose purity you wish to estimate. The user can also supply an already loaded maf with maf_data. In addition, paths to the maf/seg files of interest can be passed to this function with in_maf and in_seg. To visualize VAF and purity distributions, set show_plots to TRUE (default is FALSE).
For more information on how to run this function with the parameters at hand, refer to the parameter descriptions and function examples.
Examples
# get metadata for one sample
my_meta = suppressMessages(get_gambl_metadata()) %>%
  dplyr::filter(sample_id == "HTMCP-01-06-00422-01A-01D",
                seq_type == "genome")
# estimate purity, allowing the data to be retrieved for you
outputs = estimate_purity(these_samples_metadata = my_meta,
                          show_plots = TRUE,
                          projection = "grch37")
#> dummy segments are not annotated in the inputs
#> fill_missing_with parameter will be ignored
#> Running in default mode of any...
outputs$sample_purity_estimation
#> [1] 1