Subset a maf to only the features that would be available in whole exome sequencing (WEX) data.

genome_to_exome(
  maf,
  custom_bed,
  genome_build = "grch37",
  padding = 100,
  chr_prefixed = FALSE
)

Arguments

maf

Incoming maf object. Can be a maf-like data frame or a maftools maf object. Required parameter. At minimum, the columns Chromosome, Start_Position, and End_Position must be present.

custom_bed

Optional argument specifying the path to a custom bed file of covered regions. The file must be bed-like and contain chrom, start, and end position information in the first 3 columns. Any other columns are ignored.

genome_build

String indicating the genome build of the maf file. Default is grch37, but variants of both grch37- and hg38-based builds are accepted.

padding

Numeric value, in base pairs, used to pad each probe in the WEX data at both ends. Default is 100. After padding, overlapping features are merged.

chr_prefixed

Logical value indicating whether the data is chr-prefixed. Default is FALSE.
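
As an illustration of what the padding argument describes, here is a minimal base R sketch that pads a set of probe intervals at both ends and then merges any that overlap. The helper pad_and_merge and the toy probes table are hypothetical and are not part of this package; the function's actual implementation may differ, for example in how touching intervals or bed coordinate conventions are handled.

```r
# Hypothetical helper for illustration only; not the package's implementation.
# Expects a data frame with columns chrom, start, end (bed-like).
pad_and_merge <- function(bed, padding = 100) {
  if (nrow(bed) == 0) {
    return(bed)
  }
  # Pad both ends; clamp at 0 so padded starts never become negative.
  bed$start <- pmax(bed$start - padding, 0)
  bed$end <- bed$end + padding
  # Sort so that overlapping intervals end up next to each other.
  bed <- bed[order(bed$chrom, bed$start), ]
  merged <- list()
  for (i in seq_len(nrow(bed))) {
    current <- bed[i, ]
    last <- length(merged)
    # Assumption: intervals that touch or overlap are merged into one.
    if (last > 0 &&
        merged[[last]]$chrom == current$chrom &&
        current$start <= merged[[last]]$end) {
      merged[[last]]$end <- max(merged[[last]]$end, current$end)
    } else {
      merged[[last + 1]] <- current
    }
  }
  out <- do.call(rbind, merged)
  rownames(out) <- NULL
  out
}

# Toy probes: the first two overlap once padded by 100 bp, the third does not.
probes <- data.frame(
  chrom = c("8", "8", "8"),
  start = c(1000, 1150, 5000),
  end = c(1100, 1250, 5100),
  stringsAsFactors = FALSE
)
merged <- pad_and_merge(probes, padding = 100)
print(merged)
```

With padding = 0 the same three toy probes would stay separate, since none of them overlap before padding.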

Value

A maf-like data frame with the same columns as the input, keeping only the rows for features that would be present if the sample had been sequenced with WEX.

Details

This function subsets an incoming genome-wide MAF to only the features that would be available in WEX data. The only required parameter is maf, which takes the incoming MAF. The optional parameters custom_bed, genome_build, padding, and chr_prefixed give finer control over how the function operates; refer to the parameter descriptions for details.
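
The subsetting idea can be sketched in base R as an overlap filter: a row is kept when its position overlaps at least one covered region on the same chromosome. The helper keep_covered and the toy tables below are hypothetical and only illustrate the concept; they are not this function's implementation, and the sketch ignores the difference between bed and maf coordinate conventions.

```r
# Hypothetical helper for illustration only; not the package's implementation.
# maf: data frame with Chromosome, Start_Position, End_Position.
# regions: data frame with chrom, start, end (already padded and merged).
keep_covered <- function(maf, regions) {
  keep <- vapply(seq_len(nrow(maf)), function(i) {
    # A feature is covered if it overlaps any region on the same chromosome.
    any(
      regions$chrom == maf$Chromosome[i] &
        regions$start <= maf$End_Position[i] &
        regions$end >= maf$Start_Position[i]
    )
  }, logical(1))
  # Keep all input columns; only the rows are filtered.
  maf[keep, , drop = FALSE]
}

# Toy data: the second mutation falls outside both covered regions.
toy_maf <- data.frame(
  Hugo_Symbol = c("GENE_A", "GENE_A", "GENE_A"),
  Chromosome = c("8", "8", "8"),
  Start_Position = c(1000, 3000, 5150),
  End_Position = c(1000, 3000, 5150),
  stringsAsFactors = FALSE
)
toy_regions <- data.frame(
  chrom = c("8", "8"),
  start = c(900, 4900),
  end = c(1350, 5200),
  stringsAsFactors = FALSE
)
covered <- keep_covered(toy_maf, toy_regions)
print(covered)
```

The row-by-row loop is chosen for readability; for genome-scale inputs an interval-join approach would be the more practical choice.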

Examples

#get all ssm in the MYC aSHM region
myc_ashm_maf = get_ssm_by_region(region = "8:128748352-128749427")

#get mutations with 100 bp padding (default)
maf = genome_to_exome(maf = myc_ashm_maf)

#get mutations covered in WEX with no padding
maf = genome_to_exome(maf = myc_ashm_maf,
                      padding = 0)