Enabling Genomic-Phenomic Association Discovery without Sacrificing Anonymity

doi:10.1371/journal.pone.0053875

Figure 14

Datasets used for comparison of anonymization strategies.

The DATA SELECT process extracts a subset of records from the Synthetic Derivative (SD) into a smaller, specific dataset, such as BioVU or a demonstration cohort. The ANONYMIZE process is the anonymization algorithm described in this manuscript. The DEMO EXTRACT process selects the remaining records associated with the Demonstration cohort from a larger, anonymized dataset. The resultant datasets are as follows: the anonymized version of the Synthetic Derivative (SD-Anon); the anonymized version of BioVU (BioVU-Anon); the demonstration group extracted from SD-Anon; the demonstration group extracted from BioVU-Anon; and the anonymized version of the demonstration cohort. The last three datasets each represent a different anonymization of the Demonstration group.
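The three routes to an anonymized Demonstration group can be sketched in a few lines of Python. This is an illustrative sketch only: `toy_anonymize` is a hypothetical stand-in (it suppresses any code carried by fewer than `k` records) and is not the anonymization algorithm described in the manuscript. The record identifiers, codes, and function names are likewise assumptions made for the example. It shows why the order of DATA SELECT, ANONYMIZE, and DEMO EXTRACT can matter when the anonymization depends on the population being anonymized.

```python
from collections import Counter

def data_select(dataset, ids):
    """DATA SELECT: keep only the records whose identifier is in ids."""
    return {rid: codes for rid, codes in dataset.items() if rid in ids}

def toy_anonymize(dataset, k=2):
    """Hypothetical stand-in for ANONYMIZE (not the manuscript's algorithm):
    suppress every code that appears in fewer than k records of this dataset."""
    counts = Counter(code for codes in dataset.values() for code in codes)
    return {rid: {c for c in codes if counts[c] >= k}
            for rid, codes in dataset.items()}

def demo_extract(anonymized, demo_ids):
    """DEMO EXTRACT: pull the demonstration cohort out of an anonymized dataset."""
    return data_select(anonymized, demo_ids)

# Toy Synthetic Derivative: record id -> set of clinical codes (made-up data).
sd = {1: {"A", "B"}, 2: {"A", "C"}, 3: {"C"}, 4: {"B"}}
biovu_ids = {1, 2, 3}   # BioVU is a subset of the SD
demo_ids = {1, 2}       # the demonstration cohort is a subset of BioVU

sd_anon = toy_anonymize(sd)
biovu_anon = toy_anonymize(data_select(sd, biovu_ids))

# Three different anonymizations of the same demonstration group.
demo_from_sd = demo_extract(sd_anon, demo_ids)
demo_from_biovu = demo_extract(biovu_anon, demo_ids)
demo_anon = toy_anonymize(data_select(sd, demo_ids))

for name, ds in [("from SD-Anon", demo_from_sd),
                 ("from BioVU-Anon", demo_from_biovu),
                 ("anonymized alone", demo_anon)]:
    print(name, {rid: sorted(codes) for rid, codes in sorted(ds.items())})
```

In this toy setting, the smaller the population being anonymized, the more codes fall below the threshold, so the cohort anonymized on its own retains the least information.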

doi: https://doi.org/10.1371/journal.pone.0053875.g014