Plot Quality Control Metrics.
fancy_qc_plot.Rd
Plot for visualizing QC metrics and allowing for grouping by different metadata columns.
Usage
fancy_qc_plot(
these_sample_ids,
keep_cohort,
keep_pathology,
seq_type = "genome",
metadata,
these_samples_metadata,
plot_data,
fill_by = "pathology",
comparison_samples,
plot_title = "",
y_axis_lab = "",
return_plotdata = FALSE
)
Arguments
- these_sample_ids
Data frame with sample IDs (to be plotted) in the first column (has to be named sample_id).
- keep_cohort
Optional parameter to be used when these_sample_ids is NULL. Retrieves metadata and filters it on the cohort supplied in this parameter.
- keep_pathology
Optional parameter to be used when these_sample_ids is NULL. Retrieves metadata and filters it on the pathology supplied in this parameter.
- seq_type
Seq type of the incoming QC metrics. Default is "genome".
- metadata
Optional, user can provide a metadata df to subset sample IDs from.
- these_samples_metadata
GAMBL metadata subset to the cases you want to process.
- plot_data
Plotting parameter, defines the metric to be plotted. Call the function with return_plotdata = TRUE to list the accepted values.
- fill_by
Parameter for specifying the fill variable for the grouped bar plot. Can be any factor from the incoming metadata, e.g. pathology, cohort, etc. Default is "pathology".
- comparison_samples
Optional parameter, give the function a vector of sample IDs to be compared against the main plotting group. Pathology is default.
- plot_title
Plotting parameter, plot title.
- y_axis_lab
Plotting parameter, label of y-axis.
- return_plotdata
Optional parameter, if set to TRUE a vector of acceptable data types for plotting will be returned, and nothing else.
Details
This function visualizes a variety of quality control metrics. To see which metrics are available, call the function with `return_plotdata = TRUE`. It then returns a character vector of the metrics available for this plot, and nothing else. After deciding which metric to plot, pass it to the `plot_data` parameter.
The samples to include can be selected in several ways. The user can provide a data frame with the sample IDs to be plotted to `these_sample_ids`, or an already filtered metadata table with the sample IDs of interest to `these_samples_metadata`. If neither parameter is supplied, the plot can be restricted to any cohort and/or pathology with `keep_cohort` and `keep_pathology`. In that case, the function retrieves metadata for all available GAMBL sample IDs and then subsets to the specified cohort and/or pathology.
The `fill_by` parameter controls which factor the plot is filled by. To see how a subset of samples compares to another group, give a vector of additional sample IDs to the `comparison_samples` parameter. Lastly, the plot can be configured with a custom plot title and y-axis label (`plot_title` and `y_axis_lab`). For more information, see examples and parameter descriptions.
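The sample-selection behaviour described above can be illustrated with a small, self-contained sketch on made-up data. This is not the GAMBLR implementation: `select_samples()` and `toy_metadata` are hypothetical names. The sketch also makes three assumptions that the text does not confirm: explicit sample IDs take priority over a pre-filtered metadata table, the IDs are passed as a plain vector, and `keep_cohort` and `keep_pathology` combine as a logical AND.

```r
# Toy metadata standing in for a GAMBL metadata table (all values are made up).
toy_metadata <- data.frame(
  sample_id = c("S1", "S2", "S3", "S4", "S5"),
  cohort    = c("cohortA", "cohortA", "cohortB", "cohortB", "cohortB"),
  pathology = c("BL", "DLBCL", "BL", "FL", "BL"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of GAMBLR): resolve which samples to plot.
select_samples <- function(metadata,
                           these_sample_ids = NULL,
                           these_samples_metadata = NULL,
                           keep_cohort = NULL,
                           keep_pathology = NULL) {
  if (!is.null(these_sample_ids)) {
    # Assumption: explicit sample IDs take priority over everything else.
    return(metadata[metadata$sample_id %in% these_sample_ids, , drop = FALSE])
  }
  if (!is.null(these_samples_metadata)) {
    # A pre-filtered metadata table is used as is.
    return(these_samples_metadata)
  }
  # Otherwise start from all samples and apply the optional filters.
  selected <- metadata
  if (!is.null(keep_cohort)) {
    selected <- selected[selected$cohort %in% keep_cohort, , drop = FALSE]
  }
  if (!is.null(keep_pathology)) {
    selected <- selected[selected$pathology %in% keep_pathology, , drop = FALSE]
  }
  selected
}

# Restrict to one pathology, to one cohort and pathology, or to explicit IDs.
bl_only     <- select_samples(toy_metadata, keep_pathology = "BL")
bl_cohort_b <- select_samples(toy_metadata, keep_cohort = "cohortB",
                              keep_pathology = "BL")
explicit    <- select_samples(toy_metadata, these_sample_ids = c("S2", "S4"),
                              keep_pathology = "BL")
```

In this sketch `keep_pathology` is ignored once explicit sample IDs are given, which mirrors the description of `keep_cohort` and `keep_pathology` as fallbacks for when no sample IDs are supplied.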
Examples
if (FALSE) { # \dontrun{
# load packages
library(dplyr)
library(GAMBLR.open)

# get sample IDs for available genome samples
genome_collated = collate_results(seq_type_filter = "genome") %>%
  pull(sample_id)

# subset the collated samples on BL samples
my_samples = get_gambl_metadata() %>%
  dplyr::filter(sample_id %in% genome_collated) %>%
  dplyr::filter(pathology == "BL") %>%
  pull(sample_id)

fancy_qc_plot(these_sample_ids = my_samples, plot_data = "AverageBaseQuality")
} # }