Optimize the threshold for classifying samples as "Other"
Performs a post-hoc evaluation of whether each sample should keep its main-class label or be assigned the outgroup/unclassified label "Other". The classifier's performance is evaluated over a range of thresholds against the ground truth provided in the true_labels vector, and the threshold that is optimal for the specified metric (balanced accuracy or accuracy) is returned.
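The threshold search described above can be illustrated with a minimal sketch. This is not the package's implementation: the helper name `sweep_other_threshold`, the threshold grid, and the rule that a sample is relabeled "Other" when its score exceeds the threshold are all assumptions made for illustration, and only plain accuracy is used as the metric.

```r
# Hypothetical helper, not part of the package: try each threshold on
# other_score and keep the one that gives the highest plain accuracy.
sweep_other_threshold <- function(predicted_labels, true_labels, other_score,
                                  thresholds = seq(0, 1, by = 0.05),
                                  other_class = "Other") {
  predicted_labels <- as.character(predicted_labels)
  true_labels <- as.character(true_labels)
  acc <- vapply(thresholds, function(th) {
    # Assumption: a score above the threshold relabels the sample as "Other".
    relabeled <- ifelse(other_score > th, other_class, predicted_labels)
    mean(relabeled == true_labels)
  }, numeric(1))
  best <- which.max(acc)  # the first maximum wins on ties
  list(threshold = thresholds[best], accuracy = acc[best])
}

# Toy data, invented for this example
predicted <- c("MCD", "EZB", "BN2", "MCD", "EZB")
truth     <- c("MCD", "EZB", "Other", "Other", "EZB")
score     <- c(0.12, 0.22, 0.91, 0.73, 0.33)

res <- sweep_other_threshold(predicted, truth, score)
print(res)
```

In this toy data the two true "Other" samples carry the highest scores, so any threshold between the third-highest and second-highest score separates them cleanly.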
Usage
optimize_outgroup(
  predicted_labels,
  true_labels,
  other_score,
  all_classes = c("MCD", "EZB", "BN2", "N1", "ST2", "Other"),
  maximize = "balanced_accuracy",
  exclude_other_for_accuracy = FALSE,
  cap_classification_rate = 1,
  verbose = FALSE,
  other_class = "Other"
)
Arguments
- predicted_labels
Vector of predicted labels for the samples
- true_labels
Vector of true labels for the samples
- other_score
Vector of scores for the "Other" class for each sample
- all_classes
Vector of classes to use for training and testing. Default: c("MCD","EZB","BN2","N1","ST2","Other")
- maximize
Metric to use for optimization. Either "accuracy" (actual accuracy across all samples) or "balanced_accuracy" (the mean of the balanced accuracy values across all classes). Default: "balanced_accuracy"
- exclude_other_for_accuracy
Set to TRUE to exclude the "Other" class from the 'lymphgen' column when calculating accuracy metrics (passed to DLBCLone_optimize_params). Default: FALSE
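To make the difference between the two `maximize` options concrete, the sketch below computes both metrics on a toy example. It assumes that per-class balanced accuracy means the one-vs-rest average of sensitivity and specificity, which is the common definition but is not confirmed by this page; the helper `mean_balanced_accuracy` is hypothetical. The last line shows one possible reading of `exclude_other_for_accuracy`, dropping samples whose true label is "Other" before computing accuracy, which is likewise an assumption.

```r
# Hypothetical helper: mean of one-vs-rest balanced accuracy across classes.
mean_balanced_accuracy <- function(predicted, truth, classes) {
  per_class <- vapply(classes, function(cl) {
    is_true <- truth == cl
    is_pred <- predicted == cl
    sensitivity <- sum(is_true & is_pred) / sum(is_true)
    specificity <- sum(!is_true & !is_pred) / sum(!is_true)
    (sensitivity + specificity) / 2
  }, numeric(1))
  # Classes absent from truth give NaN (0/0) and are skipped here.
  mean(per_class, na.rm = TRUE)
}

# Toy data, invented for this example
predicted <- c("MCD", "MCD", "EZB", "Other", "Other", "MCD")
truth     <- c("MCD", "MCD", "EZB", "Other", "MCD", "Other")
classes   <- c("MCD", "EZB", "Other")

accuracy <- mean(predicted == truth)
balanced <- mean_balanced_accuracy(predicted, truth, classes)

# Assumed reading of exclude_other_for_accuracy = TRUE:
keep <- truth != "Other"
accuracy_no_other <- mean(predicted[keep] == truth[keep])
```

Balanced accuracy weights every class equally, so a small class such as "Other" influences the result as much as a large one, whereas plain accuracy is dominated by the most frequent classes.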