adjust_ploidy.Rd
adjust_ploidy
Returns a seg file with log ratios adjusted to the overall sample ploidy.
adjust_ploidy(
this_seg,
seg_path,
projection = "grch37",
pga,
pga_cutoff = 0.05,
exclude_sex = TRUE,
return_seg = TRUE
)
this_seg: Input seg file as a data frame.
seg_path: Optional path to a local seg file.
projection: Projection of the seg file, which determines the chromosome prefix and the genome size. Default is grch37; hg38 is also accepted.
pga: If PGA was calculated from another source, a data frame with the columns sample_id and PGA can be provided here.
pga_cutoff: Minimum PGA a sample must have for its ploidy to be adjusted. Default is 0.05 (5%).
exclude_sex: Boolean specifying whether to exclude sex chromosomes from the calculation. Default is TRUE.
return_seg: Boolean specifying whether to return a data frame in seg-consistent format (TRUE) or a raw data frame with all step-by-step transformations (FALSE). Default is TRUE.
A data frame in seg-consistent format with ploidy-adjusted log ratios.
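To make "ploidy-adjusted log ratios" concrete, here is a minimal sketch of one common way to recentre log ratios on a sample's ploidy. It is an illustration under stated assumptions, not the function's actual implementation: it assumes log ratios are log2 values relative to a diploid reference (so copy number is 2 * 2^log.ratio), takes ploidy as the length-weighted mean copy number rounded to an integer, and uses invented toy data.

```r
# Toy single-sample seg table; all values are invented for illustration.
toy <- data.frame(
  sample    = "S1",
  chrom     = c("1", "1", "2"),
  start     = c(1, 801, 1),
  end       = c(800, 1000, 1000),
  log.ratio = c(1, 0, 1),
  stringsAsFactors = FALSE
)

# Segment length in bases.
toy$length <- toy$end - toy$start + 1

# Assumption: log ratios are log2 relative to a diploid (2-copy) reference.
toy$CN <- 2 * 2^toy$log.ratio

# Assumption: ploidy is the length-weighted mean copy number, rounded.
ploidy <- round(sum(toy$CN * toy$length) / sum(toy$length))

# Recentre the log ratios on the estimated ploidy instead of on 2 copies.
toy$adjusted_log.ratio <- log2(toy$CN / ploidy)

print(toy[, c("chrom", "start", "end", "log.ratio", "adjusted_log.ratio")])
```

In this toy sample most of the genome sits at four copies, so after recentring those segments read as neutral (0) and the two-copy segment reads as a relative loss (-1).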
This function adjusts the ploidy of each sample using its percent of genome altered (PGA). PGA is calculated internally by default, but it can also be supplied as a data frame if it was calculated from another source. Only samples whose PGA is above the cutoff given in pga_cutoff have their ploidy adjusted. The function works with either a single-sample or a multi-sample seg file. Telomeres are always excluded from the calculation, and sex chromosomes are included or excluded according to exclude_sex. The supported projections are grch37 and hg38. The chromosome prefix is handled internally for each projection, so the chromosome names in the input do not need to follow a consistent prefix convention.
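The PGA-based selection described above can be sketched in base R. This is only an approximation of the idea, not the package's code: sketch_pga() is a hypothetical helper, the toy data are invented, the denominator is the total segmented length per sample rather than a projection-specific genome size, any non-zero log ratio counts as altered, and telomere exclusion is not modelled.

```r
# Toy multi-sample seg table; all values are invented for illustration.
# Sample A uses the "chr" prefix and sample B does not.
toy_seg <- data.frame(
  sample    = c("A", "A", "A", "A", "B", "B", "B"),
  chrom     = c("chr1", "chr1", "chr2", "chrX", "1", "2", "X"),
  start     = c(1, 501, 1, 1, 1, 1, 1),
  end       = c(500, 1000, 1000, 1000, 1000, 1000, 1000),
  log.ratio = c(0, 0.58, 0, -1, 0, 0, 0.58),
  stringsAsFactors = FALSE
)

# Hypothetical helper, not part of the package.
sketch_pga <- function(seg, exclude_sex = TRUE) {
  # Harmonise chromosome names so "chr1" and "1" are treated alike.
  seg$chrom <- sub("^chr", "", seg$chrom)
  if (exclude_sex) {
    seg <- seg[!seg$chrom %in% c("X", "Y"), , drop = FALSE]
  }
  seg$length <- seg$end - seg$start + 1
  # Assumption: a segment is "altered" when its log ratio is not zero.
  seg$altered <- seg$log.ratio != 0
  # Assumption: the denominator is the total segmented length per sample.
  total   <- tapply(seg$length, seg$sample, sum)
  changed <- tapply(seg$length * seg$altered, seg$sample, sum)
  data.frame(
    sample_id = names(total),
    PGA = as.numeric(changed / total),
    stringsAsFactors = FALSE
  )
}

toy_pga <- sketch_pga(toy_seg)
pga_cutoff <- 0.05

# Only samples above the cutoff would go on to have their ploidy adjusted.
samples_to_adjust <- toy_pga$sample_id[toy_pga$PGA > pga_cutoff]
print(toy_pga)
print(samples_to_adjust)
```

The result has the same two columns (sample_id and PGA) that the pga argument is documented to accept, with PGA expressed as a fraction so that it is comparable to the default cutoff of 0.05.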
# Single-sample seg file: rename the ID column to "sample" before adjusting
sample_seg = get_sample_cn_segments(this_sample_id = "14-36022T")
sample_seg = dplyr::rename(sample_seg, "sample" = "ID")
adjust_ploidy(this_seg = sample_seg)
#> Calculating PGA ...
#> Returning the seg file with ploidy-adjusted CN ...
#> # A tibble: 187 × 6
#> sample chrom start end LOH_flag log.ratio
#> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 14-36022T 1 10001 762600 0 0
#> 2 14-36022T 1 762601 121500000 0 0
#> 3 14-36022T 1 142600000 248277662 0 0
#> 4 14-36022T 1 248277663 248278622 0 0
#> 5 14-36022T 1 248278623 249226346 0 -1
#> 6 14-36022T 1 249226347 249250620 0 0
#> 7 14-36022T 2 10001 11319 0 0
#> 8 14-36022T 2 11320 90500000 0 0
#> 9 14-36022T 2 96800000 186704965 0 0
#> 10 14-36022T 2 186704966 186712276 0 0
#> # ℹ 177 more rows
# Multi-sample seg file: combine two samples and adjust them in one call
one_sample = get_sample_cn_segments(this_sample_id = "14-36022T")
one_sample = dplyr::rename(one_sample, "sample" = "ID")
another_sample = get_sample_cn_segments(this_sample_id = "BLGSP-71-21-00243-01A-11E")
another_sample = dplyr::rename(another_sample, "sample" = "ID")
multi_sample_seg = rbind(one_sample, another_sample)
adjust_ploidy(this_seg = multi_sample_seg)
#> Calculating PGA ...
#> Returning the seg file with ploidy-adjusted CN ...
#> # A tibble: 278 × 6
#> sample chrom start end LOH_flag log.ratio
#> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 14-36022T 1 10001 762600 0 0
#> 2 14-36022T 1 762601 121500000 0 0
#> 3 14-36022T 1 142600000 248277662 0 0
#> 4 14-36022T 1 248277663 248278622 0 0
#> 5 14-36022T 1 248278623 249226346 0 -1
#> 6 14-36022T 1 249226347 249250620 0 0
#> 7 14-36022T 2 10001 11319 0 0
#> 8 14-36022T 2 11320 90500000 0 0
#> 9 14-36022T 2 96800000 186704965 0 0
#> 10 14-36022T 2 186704966 186712276 0 0
#> # ℹ 268 more rows
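As a further illustration, the pga argument can be given PGA values calculated elsewhere. The sketch below builds a data frame with the documented columns sample_id and PGA; the PGA values are invented, the cutoff of 0.1 is an arbitrary choice, and it is an assumption that sample_id should carry the same identifiers as the sample column of the seg data. The call itself only runs when adjust_ploidy() and multi_sample_seg are available in the session.

```r
# Hypothetical PGA values computed by another tool; the numbers are invented.
my_pga <- data.frame(
  sample_id = c("14-36022T", "BLGSP-71-21-00243-01A-11E"),
  PGA = c(0.12, 0.03),
  stringsAsFactors = FALSE
)

# Check the shape that the pga argument is documented to expect.
stopifnot(all(c("sample_id", "PGA") %in% colnames(my_pga)))

# Run the call only when the function and a seg data frame are available.
if (exists("adjust_ploidy") && exists("multi_sample_seg")) {
  adjusted <- adjust_ploidy(
    this_seg = multi_sample_seg,
    pga = my_pga,
    pga_cutoff = 0.1
  )
}
```

With these made-up values only the first sample is above the cutoff, so only that sample would be expected to have its ploidy adjusted.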