Retrieve Manta SVs and filter.

get_manta_sv(
  these_sample_ids,
  these_samples_metadata,
  projection = "grch37",
  chromosome,
  qstart,
  qend,
  region,
  min_vaf = 0.1,
  min_score = 40,
  pass = TRUE,
  pairing_status,
  from_flatfile = TRUE,
  verbose = TRUE
)

Arguments

these_sample_ids

A vector of sample IDs (or a single sample ID as a string) that you want results for.

these_samples_metadata

A metadata table to auto-subset the data to samples in that table before returning.

projection

The projection genome build. Default is "grch37".

chromosome

Optional, the chromosome you are restricting to.

qstart

Optional, query start coordinate of the range you are restricting to.

qend

Optional, query end coordinate of the range you are restricting to.

region

Optional, region formatted like chrX:1234-5678 instead of specifying chromosome, start and end separately.

min_vaf

The minimum tumour VAF for a SV to be returned. Default is 0.1.

min_score

The lowest Manta somatic score for a SV to be returned. Default is 40.

pass

If set to TRUE, only return SVs that are annotated with PASS in the FILTER column. Set to FALSE to keep all variants, regardless of whether they pass the filters. Default is TRUE.

pairing_status

Use to restrict results (if desired) to matched or unmatched results (default is to return all).

from_flatfile

Set to TRUE by default. FALSE (reading from the database) is no longer supported.

verbose

Set to FALSE to prevent the path of the requested bedpe file from being printed.
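
As an illustration of the region format accepted above, the following is a minimal sketch of how a chr:start-end string could be split into the equivalent chromosome, qstart and qend values. The helper parse_region is hypothetical and is not part of this package; how get_manta_sv handles the string internally may differ.

```r
# Hypothetical helper (not a package function): split a region string such as
# "8:128723128-128774067" into chromosome, start and end.
parse_region = function(region) {
  # Remove thousands separators that are sometimes copied from a genome browser
  region = gsub(",", "", region)
  # regexec() captures the three groups; regmatches() returns the full match
  # followed by each group, or character(0) when the pattern does not match
  parts = regmatches(region, regexec("^([^:]+):([0-9]+)-([0-9]+)$", region))[[1]]
  if (length(parts) != 4) {
    stop("region must look like chr:start-end, got: ", region)
  }
  list(
    chromosome = parts[2],
    qstart = as.integer(parts[3]),
    qend = as.integer(parts[4])
  )
}

myc = parse_region("8:128723128-128774067")
str(myc)
```

Under this sketch, passing region = "8:128723128-128774067" describes the same interval as passing chromosome = "8", qstart = 128723128 and qend = 128774067 separately.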

Value

A data frame in a bedpe-like format with additional columns that allow filtering of high-confidence SVs.
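
To make the role of the filtering columns concrete, here is a small sketch that applies the default thresholds to a toy bedpe-like table. The column names (SOMATIC_SCORE, FILTER, VAF_tumour), the toy values, and the use of inclusive comparisons (>=) are assumptions made for illustration; check the names of the returned data frame before relying on them.

```r
# Toy bedpe-like table. Column names and values are assumptions for this
# illustration and are not taken from real data.
toy_sv = data.frame(
  CHROM_A = c("8", "8", "14", "3"),
  START_A = c(128750000, 128760000, 106320000, 187460000),
  CHROM_B = c("14", "8", "18", "3"),
  START_B = c(106330000, 128770000, 60790000, 187470000),
  SOMATIC_SCORE = c(85, 35, 60, 90),
  FILTER = c("PASS", "PASS", "MinSomaticScore", "PASS"),
  VAF_tumour = c(0.35, 0.20, 0.15, 0.05),
  stringsAsFactors = FALSE
)

# Same defaults as in the usage section
min_vaf = 0.1
min_score = 40
pass = TRUE

# Keep rows meeting both thresholds (inclusive comparison is an assumption)
keep = toy_sv$VAF_tumour >= min_vaf & toy_sv$SOMATIC_SCORE >= min_score
if (pass) {
  # Optionally require the VCF FILTER column to be PASS
  keep = keep & toy_sv$FILTER == "PASS"
}
high_conf_sv = toy_sv[keep, ]
print(high_conf_sv)
```

In this toy table only the first row survives: the second fails the score threshold, the third is not annotated with PASS, and the fourth falls below the minimum VAF.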

Details

Return Manta SVs with additional VCF information to allow for filtering of high-confidence variants. To return SV calls for multiple samples, give these_sample_ids a vector of sample IDs. If only one sample is desired, give this parameter a single sample ID as a string. You can also use the these_samples_metadata parameter to pass an already subset metadata table, in which case the returned calls are restricted to the sample_ids within that data frame.

This function relies on a set of other functions to return SV calls for any available sample in GAMBL. First, it calls get_combined_sv and performs an anti_join with the full metadata to identify which samples are currently missing from the return of get_combined_sv. It then calls get_manta_sv_by_samples (a wrapper function for get_manta_sv_by_sample) on the subset of missing samples. The merged calls are subject to any filtering that is specified within this function.

This function can also restrict the returned calls to a genomic region specified with chromosome, qstart and qend, or with the complete region given under region (in chr:start-end format). Useful filtering parameters are also available: use min_vaf to set the minimum tumour VAF for a SV to be returned, min_score to set the lowest Manta somatic score for a SV to be returned, and pass to only return variants that are annotated with PASS in the FILTER column (VCF). pairing_status can be used to restrict the results to matched or unmatched calls.

Is this function not what you are looking for? Try one of the following similar functions: get_combined_sv, get_manta_sv_by_sample, get_manta_sv_by_samples.
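
The anti-join step described in the details can be sketched with toy data as follows. The two data frames, the sample IDs and the tumour_sample_id column name are stand-ins invented for this illustration. The sketch uses base R so that it runs without extra packages, and it is not the package's own implementation.

```r
# Toy stand-ins: in practice these would come from the metadata table and
# from get_combined_sv. All values here are made up for illustration.
metadata = data.frame(
  sample_id = c("S1", "S2", "S3", "S4"),
  stringsAsFactors = FALSE
)
combined_sv = data.frame(
  tumour_sample_id = c("S1", "S1", "S3"),  # assumed column name
  SOMATIC_SCORE = c(80, 55, 62),
  stringsAsFactors = FALSE
)

# Base R equivalent of
#   dplyr::anti_join(metadata, combined_sv,
#                    by = c("sample_id" = "tumour_sample_id"))
# i.e. keep metadata rows whose sample has no calls in combined_sv
missing_samples = metadata[
  !(metadata$sample_id %in% combined_sv$tumour_sample_id), ,
  drop = FALSE
]
missing_ids = missing_samples$sample_id
print(missing_ids)
```

The resulting IDs are the samples whose calls would then be fetched individually (by get_manta_sv_by_samples, according to the description above) and merged with the calls already present before any filtering is applied.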

Examples

#lazily get every SV in the table with default quality filters
all_sv = get_manta_sv(verbose = FALSE)
#> WARNING! No SV calls found in flat-file for: 171116-PL02
#> WARNING! No SV calls found in flat-file for: 171447-PL01
#> WARNING! No SV calls found in flat-file for: 171451-PL01

#get all SVs for a single sample
some_sv = get_manta_sv(these_sample_ids = "94-15772_tumorA")

#get the SVs in a region around MYC
myc_locus_sv = get_manta_sv(region = "8:128723128-128774067", verbose = FALSE)
#> WARNING! No SV calls found in flat-file for: 171116-PL02
#> WARNING! No SV calls found in flat-file for: 171447-PL01
#> WARNING! No SV calls found in flat-file for: 171451-PL01

#get SVs for multiple samples, using these_sample_ids
my_metadata = get_gambl_metadata()
these_samples = dplyr::select(my_metadata, sample_id)
my_samples_df = head(these_samples, 10)
my_samples = dplyr::pull(my_samples_df, sample_id)

my_svs_2 = get_manta_sv(these_sample_ids = my_samples,
                        projection = "hg38",
                        verbose = FALSE)

#get SVs for multiple samples using a metadata table and with no VAF/score filtering
my_metadata = get_gambl_metadata()
this_metadata = head(my_metadata, 10)

my_svs = get_manta_sv(these_samples_metadata = this_metadata,
                      verbose = FALSE,
                      min_vaf = 0,
                      min_score = 0)