Maf To Custom Track.
maf_to_custom_track.Rd
Convert mutations into a UCSC custom track file
Usage
maf_to_custom_track(
  maf_data,
  these_samples_metadata = NULL,
  this_seq_type = "genome",
  output_file,
  as_bigbed = FALSE,
  colour_column = "lymphgen",
  as_biglolly = FALSE,
  track_name = "GAMBL mutations",
  track_description = "mutations from GAMBL",
  verbose = FALSE,
  padding_size = 0,
  projection = "grch37",
  bedToBigBed_path = "config",
  these_sample_ids = NULL
)
Arguments
- maf_data
MAF data obtained from one of the get_ssm family of functions.
- these_samples_metadata
A metadata table used to subset the samples of interest from the input maf_data. If NULL (the default), all samples in maf_data are kept.
- this_seq_type
The seq type you want back, default is "genome".
- output_file
Name for your new bed file that can be uploaded as a custom track to UCSC.
- as_bigbed
Boolean parameter controlling the format of the returned file. If TRUE, a bigBed file is produced instead of a BED file. Default is FALSE.
- colour_column
The metadata column used to colour the features in the returned bed file. By default, colours are assigned based on "lymphgen".
- as_biglolly
Boolean parameter controlling the format of the returned file. Default is FALSE (i.e., a BED file will be returned).
- track_name
Track name. Default is "GAMBL mutations".
- track_description
Track description. Default is "mutations from GAMBL".
- verbose
Default is FALSE.
- padding_size
Optional parameter specifying the padding size in the returned file, default is 0.
- projection
Specify which genome build to use. Possible values are "grch37" (default) or "hg38". This parameter has an effect only when as_bigbed or as_biglolly is TRUE.
- bedToBigBed_path
Path to your local bedToBigBed UCSC tool, or the string "config" (default). If set to "config", GAMBLR.helpers::check_config_value is called internally and the bedToBigBed path is obtained from the config.yml file saved in the current working directory. This parameter is ignored if both as_bigbed and as_biglolly are set to FALSE.
- these_sample_ids
DEPRECATED
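To make the colour_column and padding_size arguments more concrete, here is a minimal base R sketch of the two ideas involved: turning a colour per group into the "R,G,B" string that a BED itemRgb column expects, and widening an interval by a fixed amount. The palette values and both helper functions (hex_to_item_rgb, pad_interval) are hypothetical, and symmetric padding is an assumption, since the argument description does not say how the padding is applied. This is not the code maf_to_custom_track runs internally.

```r
# Hypothetical palette keyed by group label; these are NOT the colours
# GAMBLR actually assigns to lymphgen classes.
toy_colours <- c(EZB = "#721F0F", MCD = "#3B5FAC", Other = "#ACADAF")

# Convert hex colours to the "R,G,B" strings used in a BED itemRgb column.
hex_to_item_rgb <- function(hex) {
  rgb_matrix <- grDevices::col2rgb(hex)  # 3 x n integer matrix, values 0-255
  apply(rgb_matrix, 2, paste, collapse = ",")
}

item_rgb <- hex_to_item_rgb(toy_colours)

# Assumption: padding widens each interval by padding_size on both sides,
# never going below coordinate 0.
pad_interval <- function(start, end, padding_size = 0) {
  data.frame(
    start = pmax(start - padding_size, 0),
    end = end + padding_size
  )
}

padded <- pad_interval(start = 128750000, end = 128750001, padding_size = 10)
```

With padding_size = 0 the interval is returned unchanged, which matches the documented default.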
Details
This function takes a set of mutations as maf_data and converts it to a UCSC Genome Browser-ready BED (or bigBed/bigLolly) file, complete with the required header. Upload the resulting file to the UCSC Genome Browser to view your data as a custom track. Optional parameters are available for further customization of the returned file. For more information, refer to the parameter descriptions and function examples.
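The conversion described above can be illustrated with a small base R sketch that builds a nine-column BED table from a toy MAF-like data frame and writes it with a UCSC track line. The helpers maf_to_bed_sketch and write_track_sketch, the toy samples, and the choice of a BED9 layout are all assumptions made for illustration; the actual columns written by maf_to_custom_track may differ. The sketch relies on two general facts: MAF coordinates are 1-based and inclusive while BED is 0-based and half-open, and UCSC expects "chr"-prefixed chromosome names.

```r
# Toy MAF-like table using standard MAF column names (hypothetical data).
toy_maf <- data.frame(
  Chromosome = c("8", "8"),
  Start_Position = c(128750000L, 128751234L),
  End_Position = c(128750000L, 128751236L),
  Tumor_Sample_Barcode = c("sample_A", "sample_B"),
  stringsAsFactors = FALSE
)

# Build a BED9-style table; coordinates are formatted as plain integers
# so that large values are never written in scientific notation.
maf_to_bed_sketch <- function(maf, rgb = "0,0,0") {
  has_prefix <- grepl("^chr", maf$Chromosome)
  chrom <- ifelse(has_prefix, maf$Chromosome, paste0("chr", maf$Chromosome))
  bed_start <- sprintf("%d", maf$Start_Position - 1L)  # 1-based to 0-based
  bed_end <- sprintf("%d", maf$End_Position)           # end stays as-is
  data.frame(
    chrom = chrom,
    chromStart = bed_start,
    chromEnd = bed_end,
    name = maf$Tumor_Sample_Barcode,
    score = 0L,
    strand = "+",
    thickStart = bed_start,
    thickEnd = bed_end,
    itemRgb = rgb,
    stringsAsFactors = FALSE
  )
}

# Write the track line first, then append the tab-separated BED rows.
write_track_sketch <- function(bed, file, track_name, track_description) {
  header <- sprintf(
    'track name="%s" description="%s" itemRgb="On"',
    track_name, track_description
  )
  writeLines(header, file)
  write.table(bed, file, append = TRUE, sep = "\t", quote = FALSE,
              row.names = FALSE, col.names = FALSE)
}

out_file <- tempfile(fileext = ".bed")
write_track_sketch(maf_to_bed_sketch(toy_maf), out_file,
                   "GAMBL mutations", "mutations from GAMBL")
track_lines <- readLines(out_file)
```

The first element of track_lines is the track header and each following element is one mutation.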
Examples
# using grch37 coordinates
myc_grch37 <- GAMBLR.utils::create_bed_data(
  GAMBLR.data::grch37_lymphoma_genes_bed
) %>%
  dplyr::filter(name == "MYC")
print(myc_grch37)
#> genomic_data Object
#> Genome Build: grch37
#> Showing first 10 rows:
#> chrom start end name
#> 1 8 128747680 128753674 MYC
# desired projection will be automatically set to the
# genome_build of your region object
genome_maf <- get_ssm_by_regions(regions_bed = myc_grch37,
                                 these_samples_metadata = get_gambl_metadata(),
                                 this_seq_type = "genome",
                                 streamlined = FALSE)
#> 3273 capture samples are missing a value for protocol. Assuming Exome.
#> 138 biopsies are missing from the biopsy metadata. This should be fixed!
#> affected cohorts: DLBCL_LSARP_Trios,Ennishi_tapestri,SMZL_Strefford,cHL_Maura,MCL_Barcelona
#> 110 biopsies with discrepancies in the pathology field. This should be fixed!
#> 10 biopsies with discrepancies in the time_point field. This should be fixed!
#> Warning: One or more parsing issues, call `problems()` on your data frame for details,
#> e.g.:
#> dat <- vroom(...)
#> problems(dat)
# myc_genome_hg19.bed will be created in your working directory
maf_to_custom_track(maf_data = genome_maf,
                    output_file = "myc_genome_hg19.bed")
#> 3273 capture samples are missing a value for protocol. Assuming Exome.
#> 138 biopsies are missing from the biopsy metadata. This should be fixed!
#> affected cohorts: DLBCL_LSARP_Trios,Ennishi_tapestri,SMZL_Strefford,cHL_Maura,MCL_Barcelona
#> 110 biopsies with discrepancies in the pathology field. This should be fixed!
#> 10 biopsies with discrepancies in the time_point field. This should be fixed!
#> Joining with `by = join_by(group)`
# lazy/concise way:
my_region = "8:128747680-128753674"
capture_maf <- get_ssm_by_regions(regions_list = my_region,
                                  these_samples_metadata = get_gambl_metadata(),
                                  this_seq_type = "genome",
                                  projection = "grch37",
                                  streamlined = FALSE)
#> 3273 capture samples are missing a value for protocol. Assuming Exome.
#> 138 biopsies are missing from the biopsy metadata. This should be fixed!
#> affected cohorts: DLBCL_LSARP_Trios,Ennishi_tapestri,SMZL_Strefford,cHL_Maura,MCL_Barcelona
#> 110 biopsies with discrepancies in the pathology field. This should be fixed!
#> 10 biopsies with discrepancies in the time_point field. This should be fixed!
#> Warning: One or more parsing issues, call `problems()` on your data frame for details,
#> e.g.:
#> dat <- vroom(...)
#> problems(dat)
maf_to_custom_track(maf_data = capture_maf,
                    output_file = "myc_capture_hg19.bed")
#> 3273 capture samples are missing a value for protocol. Assuming Exome.
#> 138 biopsies are missing from the biopsy metadata. This should be fixed!
#> affected cohorts: DLBCL_LSARP_Trios,Ennishi_tapestri,SMZL_Strefford,cHL_Maura,MCL_Barcelona
#> 110 biopsies with discrepancies in the pathology field. This should be fixed!
#> 10 biopsies with discrepancies in the time_point field. This should be fixed!
#> Joining with `by = join_by(group)`
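The examples above produce plain BED files. For the as_bigbed route, the UCSC bedToBigBed tool generally expects its input to be sorted by chromosome and start position (in byte order), to contain no track or browser lines, and to be accompanied by a chrom.sizes file. The sketch below shows that preparation in base R on a toy table. The file names and the bigbed_args helper are hypothetical, and this is not necessarily how maf_to_custom_track invokes the tool.

```r
# Toy BED rows in unsorted order (hypothetical data).
toy_bed <- data.frame(
  chrom = c("chr8", "chr2", "chr8"),
  chromStart = c(128751233L, 500L, 128749999L),
  chromEnd = c(128751236L, 501L, 128750000L),
  stringsAsFactors = FALSE
)

# method = "radix" sorts character keys in C-locale (byte) order,
# comparable to `sort -k1,1 -k2,2n` run under LC_ALL=C.
sorted_bed <- toy_bed[order(toy_bed$chrom, toy_bed$chromStart,
                            method = "radix"), ]

# Positional arguments in the order bedToBigBed expects:
# input BED, chrom.sizes, output bigBed.
bigbed_args <- function(bed_file, chrom_sizes_file, out_file) {
  c(bed_file, chrom_sizes_file, out_file)
}

# Hypothetical file names, for illustration only.
tool_args <- bigbed_args("myc_genome_hg19.sorted.bed",
                         "hg19.chrom.sizes",
                         "myc_genome_hg19.bb")

# Only attempt the conversion when the tool and both inputs exist.
tool_path <- Sys.which("bedToBigBed")
if (nzchar(tool_path) && all(file.exists(tool_args[1:2]))) {
  status <- system2(tool_path, tool_args)
}
```

Guarding the system2 call keeps the sketch safe to run on a machine where bedToBigBed is not installed.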