
Assemble the binary feature matrix and use the random forest prediction model to classify Burkitt lymphoma (BL) tumors into genetic subgroups. See PMID 36201743 for details on the genetic subgroups of BL.

Usage

classify_bl(
  these_samples_metadata,
  maf_data,
  projection = "grch37",
  output = "both",
  ashm_cutoff = 3
)

Arguments

these_samples_metadata

The metadata data frame. It must contain a sample_id column with the IDs of the samples to be classified. Required input.

maf_data

The MAF data frame used to assemble the feature matrix. Additional MAF columns may be present, but the following are required: "Hugo_Symbol", "NCBI_Build", "Chromosome", "Start_Position", "End_Position", "Variant_Classification", "HGVSp_Short", and "Tumor_Sample_Barcode". Required input.

projection

The genome build (projection) of the samples. Defaults to "grch37".

output

The output to return once prediction is done. One of "predictions", "matrix", or "both". Defaults to "both".

ashm_cutoff

Numeric cutoff on the number of mutations, used to binarize the aSHM (aberrant somatic hypermutation) features. Using the default value (3) is recommended.
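
The column requirements listed above for these_samples_metadata and maf_data can be checked before calling classify_bl(). The following is a minimal sketch in base R. The helper check_classify_bl_inputs() and the toy data frames are hypothetical and written only for illustration; they are not part of the package, and classify_bl() may perform its own checks.

```r
# Hypothetical helper (not part of the package): reports which of the
# documented columns are missing from the two inputs of classify_bl().
required_maf_cols <- c(
  "Hugo_Symbol", "NCBI_Build", "Chromosome", "Start_Position",
  "End_Position", "Variant_Classification", "HGVSp_Short",
  "Tumor_Sample_Barcode"
)

check_classify_bl_inputs <- function(these_samples_metadata, maf_data) {
  problems <- character(0)

  # The metadata must carry the sample identifiers
  if (!"sample_id" %in% colnames(these_samples_metadata)) {
    problems <- c(problems, "metadata is missing the sample_id column")
  }

  # Extra MAF columns are allowed; only the required ones are checked
  missing_cols <- setdiff(required_maf_cols, colnames(maf_data))
  if (length(missing_cols) > 0) {
    problems <- c(
      problems,
      paste("maf_data is missing:", paste(missing_cols, collapse = ", "))
    )
  }

  problems
}

# Toy inputs, made up for illustration only
toy_meta <- data.frame(sample_id = c("S1", "S2"))
toy_maf <- data.frame(
  Hugo_Symbol = "MYC",
  Chromosome = "8",
  Tumor_Sample_Barcode = "S1"
)

# Returns a character vector describing each problem found
check_classify_bl_inputs(toy_meta, toy_maf)
```

An empty character vector means both inputs contain the documented columns. The sketch checks column names only, not column types or whether the samples in the MAF match those in the metadata.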

Value

A data frame with the classifications, the binary matrix used for classification, or both, depending on output.

Examples

if (FALSE) { # \dontrun{
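library(dplyr) # provides %>% and filter()

# Metadata restricted to BL samples
test_meta <- get_gambl_metadata() %>%
  filter(pathology == "BL")

# Mutations (MAF) for the selected samples
maf <- get_ssm_by_samples(
  these_samples_metadata = test_meta
)

# Default output = "both": predictions and the binary feature matrix
predictions <- classify_bl(
  these_samples_metadata = test_meta,
  maf_data = maf
)

# Return only the predictions
predictions <- classify_bl(
  these_samples_metadata = test_meta,
  maf_data = maf,
  output = "predictions"
)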
} # }