Helper function for checking integrity of study files.

study_check(
  data_clinical_samples_path = "data_clinical_samples.txt",
  data_fusions_path = "data_fusions.txt",
  cases_fusions_path = "case_lists/cases_fusion.txt",
  cases_all_path = "case_lists/cases_all.txt",
  cases_sequenced_path = "case_lists/cases_sequenced.txt",
  project_name = "gambl_genome",
  out_dir
)

Arguments

data_clinical_samples_path

Path to clinical file.

data_fusions_path

Path to data_fusion file from setup_fusions.

cases_fusions_path

Path to cases_fusion from setup_fusions.

cases_all_path

Path to cases_all from setup_study.

cases_sequenced_path

Path to cases_sequenced from setup_study.

project_name

Project name. Should match what was specified for setup_study/setup_fusions.

out_dir

Directory containing all study-related files. This is the only argument that needs to be specified, provided the paths to the generated study files have not been changed from their defaults.

Value

Nothing.

Details

This function was designed to ensure that all the sample IDs described in the maf are actually present in the clinical file. If this is not the case, the function notifies the user which samples are found in the case list but are not described in the clinical file. The function then subsets the case list to only include samples from the clinical file. Note that project_name has to match what was specified for the previously run functions (i.e., setup_study, setup_fusions and finalize_study).
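
The comparison described above can be illustrated with a minimal base R sketch. This is not the function's actual implementation. The sample IDs and the variable names (clinical_ids, case_list_ids, missing_ids, case_list_checked) are hypothetical and exist only for illustration. In practice the IDs would come from the clinical file and the case lists in out_dir.

```r
# Hypothetical sample IDs (assumption: stand-ins for IDs read from the study files).
clinical_ids <- c("S01", "S02", "S03")
case_list_ids <- c("S01", "S02", "S04")

# IDs that appear in the case list but are not described in the clinical file.
missing_ids <- setdiff(case_list_ids, clinical_ids)

# Notify the user if any such IDs were found.
if (length(missing_ids) > 0) {
  message(
    "Samples not described in the clinical file: ",
    paste(missing_ids, collapse = ", ")
  )
}

# Subset the case list to the IDs that the clinical file describes.
case_list_checked <- case_list_ids[case_list_ids %in% clinical_ids]
```

Here setdiff finds the unmatched IDs and %in% keeps the original order of the case list when subsetting.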

Examples

if (FALSE) {
samples_not_in_clinical = study_check(out_dir = "GAMBLR/cBioPortal/instance01/")
}