Generate an md5 hash for a set of samples to help ensure reproducibility

get_samples_md5_hash(
  these_samples_metadata,
  these_samples,
  sample_set_name,
  sample_sets_df
)

Arguments

these_samples_metadata

Optionally provide a metadata table or any data frame with a column named sample_id that has been subset to the samples you're working with.

these_samples

Optionally provide a vector of the sample_id values you are working with.

sample_set_name

Optionally provide the name of a sample set in GAMBL and the function will load the samples from that set and provide the hash.

sample_sets_df

Optionally provide a data frame of the sample sets instead of relying on/loading the local file from the GAMBL repo.

Value

The md5 hash of the ordered set of sample_id.
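The idea of hashing an ordered set can be sketched in base R. This is an illustration only, not the function's actual implementation: md5_of_ids() is a hypothetical helper, the sample IDs are made up, and the choice of a comma separator and a temporary file are assumptions made to keep the sketch free of extra packages.

```r
# Base-R sketch: an MD5 hash that does not depend on the order of sample IDs.
# md5_of_ids() is a hypothetical helper, not the package's implementation.
md5_of_ids <- function(ids) {
  # Sort with the radix method (C-locale ordering) so the order is the same
  # on every machine, then join the IDs into a single string.
  collapsed <- paste(sort(as.character(ids), method = "radix"), collapse = ",")
  tmp <- tempfile()
  on.exit(unlink(tmp))
  # Write the exact bytes of the string (no trailing newline) and hash the file.
  writeBin(charToRaw(collapsed), tmp)
  unname(tools::md5sum(tmp))
}

ids_a <- c("S03", "S01", "S02")  # made-up sample IDs
ids_b <- c("S01", "S02", "S03")  # same set, different order

hash_a <- md5_of_ids(ids_a)
hash_b <- md5_of_ids(ids_b)
hash_c <- md5_of_ids(c("S01", "S02"))  # one sample dropped

identical(hash_a, hash_b)  # TRUE: same set of IDs
identical(hash_a, hash_c)  # FALSE: the set changed
```

Because the IDs are sorted before hashing, two analyses report the same hash only when they used exactly the same set of samples, which is what makes the value useful as a reproducibility check.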

Details

This function accepts sample IDs in several forms and returns an md5 hash for them. If you are working with a metadata table that has already been subset to the sample IDs of interest, pass that table with these_samples_metadata. Alternatively, provide the sample IDs as a character vector with these_samples. Another option is to use a sample set defined in GAMBL by giving its name with sample_set_name. Finally, you can supply a data frame of sample sets with sample_sets_df instead of loading them from the GAMBL repo.
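The first two input options might be used as in the sketch below. The sample IDs and the cohort column are made up for illustration, and the calls are guarded so they run only when get_samples_md5_hash() is available in the session (that is, when the package providing it has been loaded).

```r
# Made-up sample IDs, used only for illustration.
my_samples <- c("S03", "S01", "S02")

# A data frame with a sample_id column, standing in for a subset metadata table.
my_metadata <- data.frame(
  sample_id = c("S01", "S02", "S03"),
  cohort = c("A", "A", "B"),
  stringsAsFactors = FALSE
)

# Run the calls only if the function is available in this session.
if (exists("get_samples_md5_hash", mode = "function")) {
  hash_from_table <- get_samples_md5_hash(these_samples_metadata = my_metadata)
  hash_from_vector <- get_samples_md5_hash(these_samples = my_samples)
  print(hash_from_table)
  print(hash_from_vector)
}
```

Since both inputs describe the same set of sample IDs, the two hashes would be expected to match, given that the function hashes the ordered set of sample_id.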