
Checks for the presence of mutations at a given motif

Usage

annotate_ssm_motif_context(
  maf,
  motif = "WRCY",
  index = 3,
  genome_build,
  fastaPath
)

Arguments

maf

MAF data frame (required columns: Reference_Allele, Chromosome, Start_Position, End_Position)

motif

The motif sequence (default is WRCY)

index

Position of the mutated allele within the motif (default is 3, which is the C in WRCY)

genome_build

The genome build for the variants you are working with (default is grch37)

fastaPath

Can be a path to a FASTA file
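
The default motif, WRCY, is written with IUPAC nucleotide codes (W = A or T, R = A or G, Y = C or T), and index = 3 points at its C. As a rough illustration of how such a motif could be matched against a DNA string, the base R sketch below expands a motif into a regular expression. The helper `motif_to_regex` and the `iupac_classes` lookup table are hypothetical names introduced here; they are not part of the package and may not reflect how it matches motifs internally.

```r
# Hypothetical helper, not part of the package: expand IUPAC codes into a regex.
iupac_classes <- c(
  A = "A", C = "C", G = "G", T = "T",
  W = "[AT]", S = "[CG]", R = "[AG]", Y = "[CT]",
  K = "[GT]", M = "[AC]", N = "[ACGT]"
)

motif_to_regex <- function(motif) {
  # Split the motif into single letters and look each one up in the table
  codes <- strsplit(toupper(motif), "")[[1]]
  stopifnot(all(codes %in% names(iupac_classes)))
  paste0(iupac_classes[codes], collapse = "")
}

pattern <- motif_to_regex("WRCY")          # "[AT][AG]C[CT]"
grepl(pattern, c("AGCT", "TACC", "GGCT"))  # TRUE TRUE FALSE
substr("WRCY", 3, 3)                       # "C", the base that index = 3 refers to
```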

Value

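A data frame with two extra columns: seq, which holds the captured sequence, and a column named after the motif (for example WRCY), which holds the annotation.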

Details

For each position where the reference allele has been mutated, the function captures (motif length - 1) bases before and (motif length - 1) bases after the mutated position, so a single-base substitution with the default 4-base motif yields the 7-base seq values shown in the example output. It then searches the captured sequence for the motif. If the motif is present and the mutation falls at the indexed position, it returns SITE. If the motif is present but the mutation is not at the indexed position, it returns MOTIF. In all other cases it returns FALSE. NA is returned if the mutation is an indel.
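
The classification described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: it handles single-base substitutions only, assumes the captured sequence holds (motif length - 1) bases on each side of the mutated base (as the 7-base seq values in the example output suggest), and searches the forward strand only. The function name `classify_context` and its arguments are hypothetical.

```r
# Sketch only: classify a captured sequence for a single-base substitution.
# Assumes (motif length - 1) flanking bases on each side and a forward-strand
# search; `classify_context` is a hypothetical name, not a package function.
classify_context <- function(context, pattern = "[AT][AG]C[CT]",
                             motif_length = 4, index = 3) {
  # With (motif_length - 1) bases of left flank, the mutated base sits here
  mutated_pos <- motif_length
  # Enumerate every window of motif length and test it against the pattern
  starts <- seq_len(nchar(context) - motif_length + 1)
  windows <- substring(context, starts, starts + motif_length - 1)
  hits <- starts[grepl(paste0("^", pattern, "$"), windows)]
  if (length(hits) == 0) return("FALSE")
  # SITE when the indexed base of any match coincides with the mutated base
  if (any(hits + index - 1 == mutated_pos)) return("SITE")
  "MOTIF"
}

classify_context("GAGCTAA")  # "SITE": AGCT starts at 2, so its C is the central base
classify_context("GCAAGCT")  # "MOTIF": AGCT is present, but the central A is its first base
classify_context("CTTCACT")  # "FALSE": no WRCY match in any window
```

Under these assumptions the sketch gives the same labels as the 7-base sequences in the example output, though the package may differ in details such as strand handling.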

Examples

my_maf <- get_coding_ssm() %>% 
  dplyr::filter(Hugo_Symbol=="BCL2") %>%
  dplyr::arrange(Chromosome,Start_Position,Tumor_Sample_Barcode) %>%
  head()
#> Warning: One or more parsing issues, call `problems()` on your data frame for details,
#> e.g.:
#>   dat <- vroom(...)
#>   problems(dat)

annotated = annotate_ssm_motif_context(maf = my_maf,
                                       motif = "WRCY")

dplyr::select(annotated,1,5,6,11,13,16,seq,WRCY)
#> genomic_data Object
#> Genome Build: grch37 
#> Showing first 10 rows:
#>   Hugo_Symbol Chromosome Start_Position Reference_Allele Tumor_Seq_Allele2
#> 1        BCL2         18       60795859                C                 T
#> 2        BCL2         18       60795894                A                 T
#> 3        BCL2         18       60795911                A                 G
#> 4        BCL2         18       60795911                A                 G
#> 5        BCL2         18       60795947                C                 G
#> 6        BCL2         18       60985305              GTG               AAT
#>   Tumor_Sample_Barcode       seq  WRCY
#> 1      CAR_126_PreCART   CTTCACT FALSE
#> 2            08-13706T   GCAAGCT MOTIF
#> 3    00-15336_CLC01491   CCAAACT MOTIF
#> 4    00-15336_CLC02290   CCAAACT MOTIF
#> 5             FL1010T2   AATCAAA FALSE
#> 6             SP192988 CAAGTGCAC FALSE