Annotate mutations at a target motif
annotate_ssm_motif_context.Rd
Checks for the presence of mutations at a given motif
Arguments
- maf
MAF data frame (required columns: Reference_Allele, Chromosome, Start_Position, End_Position)
- motif
The motif sequence (default is WRCY)
- index
Position of the mutated allele in the motif
- genome_build
The genome build for the variants you are working with (default is grch37)
- fastaPath
Optional path to a reference genome FASTA file
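To make the motif and index arguments concrete, the sketch below expands a motif written in IUPAC nucleotide codes (W = A/T, R = A/G, Y = C/T) into a regular expression. The helper name motif_to_regex is hypothetical and not part of this package, and the package's own matching may be implemented differently. The value index = 3 is an illustrative assumption, as is the 1-based counting.

```r
# IUPAC nucleotide codes mapped to regex character classes.
iupac <- c(A = "A", C = "C", G = "G", T = "T",
           R = "[AG]", Y = "[CT]", S = "[CG]", W = "[AT]",
           K = "[GT]", M = "[AC]", B = "[CGT]", D = "[AGT]",
           H = "[ACT]", V = "[ACG]", N = "[ACGT]")

# Hypothetical helper (not a package function): turn a motif such as
# "WRCY" into a regular expression.
motif_to_regex <- function(motif) {
  codes <- strsplit(toupper(motif), "")[[1]]
  stopifnot(all(codes %in% names(iupac)))
  paste0(iupac[codes], collapse = "")
}

motif_to_regex("WRCY")
#> [1] "[AT][AG]C[CT]"

# Assuming a 1-based index, index = 3 points at the "C" of WRCY.
substr("WRCY", 3, 3)
#> [1] "C"
```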
Details
For each position where the reference allele has been mutated, the function captures (motif length - 1) bases before and (motif length - 1) bases after the mutated position; for the 4-base motif WRCY and a single-base substitution this gives the 7-base windows shown in the seq column of the example output. It then looks for the motif in the captured sequence. If the motif is present and the mutation falls at the indexed position, it returns SITE. If the motif is present but the mutation is not at the indexed position, it returns MOTIF. In all other cases it returns FALSE. NA is returned if the mutation is an indel.
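The classification described above can be sketched in base R as follows. This is a minimal illustration and not the package's implementation. The function classify_window and its arguments are hypothetical. It assumes the mutated base sits at position motif_len of the captured window (that is, after motif_len - 1 flanking bases), that index is 1-based, that only the forward strand is searched, and that the motif has already been written as a regular expression (here WRCY, hard-coded as "[AT][AG]C[CT]").

```r
# Hypothetical helper (not a package function): classify one captured window.
# All defaults are assumptions chosen for the WRCY example.
classify_window <- function(seq, motif_regex = "[AT][AG]C[CT]",
                            motif_len = 4, index = 3, is_indel = FALSE) {
  # Indels are reported as NA.
  if (is_indel) return(NA_character_)

  # Assumed position of the mutated base inside the captured window.
  centre <- motif_len

  # Slide a motif-length window along the captured sequence.
  n_windows <- max(0, nchar(seq) - motif_len + 1)
  starts <- seq_len(n_windows)
  windows <- substring(seq, starts, starts + motif_len - 1)

  # Start positions at which the motif occurs.
  hits <- starts[grepl(paste0("^", motif_regex, "$"), windows)]
  if (length(hits) == 0) return("FALSE")

  # Position of the mutated base within each matched motif.
  pos_in_motif <- centre - hits + 1
  if (any(pos_in_motif == index)) "SITE" else "MOTIF"
}

classify_window("CTTCACT")   # no WRCY in the window
#> [1] "FALSE"
classify_window("GCAAGCT")   # AGCT matches, mutated base is the W
#> [1] "MOTIF"
classify_window("GAGCTGG")   # AGCT matches, mutated base is the C
#> [1] "SITE"
```

The first two calls reuse windows from the seq column of the example output; the third sequence is made up to show a SITE call.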
Examples
my_maf <- get_coding_ssm() %>%
  dplyr::filter(Hugo_Symbol == "BCL2") %>%
  dplyr::arrange(Chromosome, Start_Position, Tumor_Sample_Barcode) %>%
  head()
#> Warning: One or more parsing issues, call `problems()` on your data frame for details,
#> e.g.:
#> dat <- vroom(...)
#> problems(dat)
annotated <- annotate_ssm_motif_context(maf = my_maf,
                                        motif = "WRCY")
dplyr::select(annotated,1,5,6,11,13,16,seq,WRCY)
#> genomic_data Object
#> Genome Build: grch37
#> Showing first 10 rows:
#>   Hugo_Symbol Chromosome Start_Position Reference_Allele Tumor_Seq_Allele2
#> 1        BCL2         18       60795859                C                 T
#> 2        BCL2         18       60795894                A                 T
#> 3        BCL2         18       60795911                A                 G
#> 4        BCL2         18       60795911                A                 G
#> 5        BCL2         18       60795947                C                 G
#> 6        BCL2         18       60985305              GTG               AAT
#>   Tumor_Sample_Barcode       seq  WRCY
#> 1      CAR_126_PreCART   CTTCACT FALSE
#> 2            08-13706T   GCAAGCT MOTIF
#> 3    00-15336_CLC01491   CCAAACT MOTIF
#> 4    00-15336_CLC02290   CCAAACT MOTIF
#> 5             FL1010T2   AATCAAA FALSE
#> 6             SP192988 CAAGTGCAC FALSE