Post-process KNN results across a range of K values to score classification consistency, optionally refine per-class classified/Other cutoffs, and optionally assign composite classes

Usage

DLBCLone_ensemble_postprocess(
  optimized_model,
  assign_composites = FALSE,
  other_min = 2,
  any_split = TRUE,
  min_purity = 0.75,
  min_gap = 2,
  optimize_per_class = TRUE
)

Arguments

optimized_model

Output of DLBCLone_optimize_params

assign_composites

Logical; if TRUE, samples with split votes across multiple in-group classes will be assigned a composite class (e.g. "EZB/MCD") instead of "Other".

other_min

Integer; when comparing across a range of K values, the minimum number of K values at which a sample must be classified as Other for it to be reassigned as Other. Set this to a high value if you don't want samples to be reassigned at all.

any_split

Logical; if TRUE, any split among in-group votes across the Ks tested will trigger reassignment (or a composite assignment).

min_purity

Numeric between 0 and 1; the minimum vote share of the top in-group class required to keep that class instead of assigning a composite class.

min_gap

Integer; the vote count of the top in-group class must exceed that of the second-ranked class by at least this gap to keep the top class.

optimize_per_class

Logical; if TRUE, a range of thresholds is tested per class to optimize the classification/Other cutoff. This more complex approach may yield better results but is not yet fully validated.

Value

A list with:

  • predictions: updated predictions_df (DLBCLone_wo or DLBCLone_wc / DLBCLone_wc_simplified)

  • consistency_report: per-sample counts and decision flags
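
Examples

A minimal sketch of a typical call. The object `my_model` is a placeholder for the output of DLBCLone_optimize_params(); it is assumed here, not defined on this page.

```r
# my_model is assumed to be the output of DLBCLone_optimize_params()
result <- DLBCLone_ensemble_postprocess(
  my_model,
  assign_composites = TRUE,  # allow composite calls such as "EZB/MCD"
  other_min = 2,             # reassign to Other if called Other at >= 2 K values
  min_purity = 0.75,         # keep top class only if it holds >= 75% of in-group votes
  min_gap = 2                # ...and leads the runner-up by at least 2 votes
)

# Updated per-sample predictions
head(result$predictions)

# Per-sample vote counts and decision flags across K
head(result$consistency_report)
```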