grande_maf.Rd
A MAF (data frame) drawn from the Grande et al. dataset.
grande_maf
grande_maf
A MAF in data frame format. 12251 rows and 125 columns.
HUGO symbol for the gene (HUGO symbols are always in all caps). "Unknown" is used for regions that do not correspond to a gene
Entrez gene ID (an integer). "0" is used for regions that do not correspond to a gene region or Ensembl ID
One or more genome sequencing centers reporting the variant
The reference genome used for the alignment
The affected chromosome
Lowest numeric position of the reported variant on the genomic reference sequence. Mutation start coordinate
Highest numeric genomic position of the reported variant on the genomic reference sequence. Mutation end coordinate
Genomic strand of the reported allele. Currently, all variants will report the positive strand: '+'
Translational effect of variant allele
Type of mutation: one of SNP, DNP, TNP, ONP, INS, DEL, or Consolidated. TNP (tri-nucleotide polymorphism) is analogous to DNP (di-nucleotide polymorphism) but for three consecutive nucleotides. ONP (oligo-nucleotide polymorphism) is analogous to TNP but for consecutive runs of four or more nucleotides
The plus strand reference allele at this position. Includes the deleted sequence for a deletion or "-" for an insertion
Primary data genotype for tumor sequencing (discovery) allele 1. A "-" symbol for a deletion represents a variant. A "-" symbol for an insertion represents wild-type allele. Novel inserted sequence for insertion does not include flanking reference bases
Tumor sequencing (discovery) allele 2
The rs-IDs from the dbSNP database, "novel" if the variant is not found in any database used, or null if there is no dbSNP record but the variant is found in other databases
The dbSNP validation status is reported as a semicolon-separated list of statuses. The union of all rs-IDs is taken when there are multiple
Aliquot barcode for the tumor sample
Aliquot barcode for the matched normal sample
Primary data genotype. Matched normal sequencing allele 1. A "-" symbol for a deletion represents a variant. A "-" symbol for an insertion represents wild-type allele. Novel inserted sequence for insertion does not include flanking reference bases (cleared in somatic MAF)
Matched normal sequencing allele 2
Secondary data from orthogonal technology. Tumor genotyping (validation) for allele 1. A "-" symbol for a deletion represents a variant. A "-" symbol for an insertion represents wild-type allele. Novel inserted sequence for insertion does not include flanking reference bases
Secondary data from orthogonal technology. Tumor genotyping (validation) for allele 2
Second pass results from independent attempt using same methods as primary data source. Generally reserved for 3730 Sanger Sequencing
Second pass results from orthogonal technology
An assessment of the mutation as somatic, germline, LOH, post-transcriptional modification, unknown, or none. The values allowed in this field are constrained by the value in the Validation_Status field
TCGA sequencing phase (if applicable). Phase should change under any circumstance that the targets under consideration change
Molecular assay type used to produce the analytes used for sequencing. Allowed values are a subset of the SRA 1.5 library_strategy field values. This subset matches those used at CGHub
The assay platforms used for the validation call
Boolean variable
Boolean column stating whether a BAM file exists
Instrument used to produce primary sequence data
GDC aliquot UUID for tumor sample
GDC aliquot UUID for matched normal sample
The coding sequence of the variant in HGVS recommended format
The protein sequence of the variant in HGVS recommended format. "p.=" signifies no change in the protein
Same as the HGVSp column, but using 1-letter amino-acid codes
Ensembl ID of the transcript affected by the variant
The exon number (out of total number)
Read depth across this locus in tumor BAM
Read depth supporting the reference allele in tumor BAM
Read depth supporting the variant allele in tumor BAM
Read depth across this locus in normal BAM
Read depth supporting the reference allele in normal BAM (cleared in somatic MAF)
Read depth supporting the variant allele in normal BAM (cleared in somatic MAF)
A semicolon-delimited list of all possible variant effects, sorted by priority
The variant allele used to calculate the consequence
Stable Ensembl ID of affected gene
Stable Ensembl ID of feature (transcript, regulatory, motif)
Type of feature. Currently one of Transcript, RegulatoryFeature, MotifFeature (or blank)
Consequence type of this variant; sequence ontology terms
Relative position of base pair in the cDNA sequence as a fraction. A "-" symbol is displayed as the numerator if the variant does not appear in cDNA
Relative position of base pair in coding sequence. A "-" symbol is displayed as the numerator if the variant does not appear in coding sequence
Relative position of affected amino acid in protein. A "-" symbol is displayed as the numerator if the variant does not appear in coding sequence
The amino acid change caused by the variant. Only given if the variation affects the protein-coding sequence
The alternative codons with the variant base in upper case
Known identifier of existing variation
Allele number from input; 0 is the reference, 1 is the first alternate, and so on
Shortest distance from the variant to transcript
The gene symbol
The source of the gene symbol
Gene identifier from the HUGO Gene Nomenclature Committee if applicable
Biotype of transcript
A flag (YES) indicating that the VEP-based canonical transcript, the longest translation, was used for this gene. If not, the value is null
The CCDS identifier for this transcript, where applicable
The Ensembl protein identifier of the affected transcript
UniProtKB/Swiss-Prot accession
UniProtKB/TrEMBL identifier of protein product
UniParc identifier of protein product
RefSeq identifier for this transcript
The SIFT prediction and/or score, with both given as prediction (score)
The PolyPhen prediction and/or score
The exon number (out of total number)
The intron number (out of total number)
The source and identifier of any overlapping protein domains
Non-reference allele and frequency of existing variant in 1000 Genomes
Non-reference allele and frequency of existing variant in 1000 Genomes
Non-reference allele and frequency of existing variant in 1000 Genomes
Non-reference allele and frequency of existing variant in 1000 Genomes combined African population
Non-reference allele and frequency of existing variant in 1000 Genomes combined American population
Non-reference allele and frequency of existing variant in 1000 Genomes combined Asian population
Non-reference allele and frequency of existing variant in 1000 Genomes combined East Asian population
Non-reference allele and frequency of existing variant in 1000 Genomes combined European population
Non-reference allele and frequency of existing variant in 1000 Genomes combined South Asian population
Non-reference allele and frequency of existing variant in NHLBI-ESP African American population
Non-reference allele and frequency of existing variant in NHLBI-ESP European American population
Clinical significance of variant from dbSNP as annotated in ClinVar
Somatic status of each ID reported under Existing_variation (0, 1, or null)
PubMed ID(s) of publications that cite the existing variant
The source and identifier of a transcription factor binding profile aligned at this position
The relative position of the variation in the aligned TFBP
A flag indicating if the variant falls in a high information position of a transcription factor binding profile (TFBP) (Y, N, or null)
The difference in motif score of the reference and variant sequences for the TFBP
The impact modifier for the consequence type
Indicates if this block of consequence data was picked by VEP's pick feature (1 or null)
Sequence Ontology variant class
Transcript support level, which is based on independent RNA analyses
Indicates by how many bases the HGVS notations for this variant have been shifted
Indicates if existing variant is associated with a phenotype, disease or trait (0, 1, or null)
Alleles in this variant have been converted to minimal representation before consequence calculation (1 or null)
Global Allele Frequency from ExAC
African/African American Allele Frequency from ExAC
American Allele Frequency from ExAC
East Asian Allele Frequency from ExAC
Finnish Allele Frequency from ExAC
Non-Finnish European Allele Frequency from ExAC
Other Allele Frequency from ExAC
South Asian Allele Frequency from ExAC
Indicates if gene that the variant maps to is associated with a phenotype, disease or trait (0, 1, or null)
Copied from input VCF. This includes filters implemented directly by the variant caller and other external software used in the DNA-Seq pipeline
The flanking base pairs
Variant ID
Variant quality
Adjusted Global Allele Frequency from ExAC
Adjusted Global Allele Frequency from ExAC
Global Allele Frequency from ExAC
African/African American Allele Frequency from ExAC
American Allele Frequency from ExAC
East Asian Allele Frequency from ExAC
Finnish Allele Frequency from ExAC
Non-Finnish European Allele Frequency from ExAC
Other Allele Frequency from ExAC
South Asian Allele Frequency from ExAC
Filter