MedicalResearch.com Interview with:
Ernest Turro, PhD
Associate Professor, Genetics and Genomic Sciences
The Turro group runs a research program on statistical genomics,
with a dual focus on rare diseases and blood-related traits
Icahn School of Medicine at Mount Sinai
Mount Sinai Health System
MedicalResearch.com: What is the background for this study? Would you describe the Rareservoir database?
The main motivation for our work is that only half of the approximately 10,000 catalogued rare diseases have a resolved genetic cause (or aetiology). Patients with one of the unresolved diseases cannot obtain a genetic diagnosis, which could otherwise inform prognosis and treatment for themselves and their affected relatives.
One route towards resolving the remaining aetiologies is to enroll large numbers of rare disease patients into research studies, so that statistical analyses can relate the genetic characteristics of the study participants to their clinical characteristics. One major endeavour, the 100,000 Genomes Project (100KGP), sequenced the genomes and collected clinical phenotype data for 34,523 UK patients and 43,016 unaffected relatives across 29,741 families.
The scale of this study is unprecedented, partly thanks to the ever-decreasing cost of DNA sequencing (25 years ago, it cost $1bn to sequence a whole genome, while now it costs only a few hundred dollars). Working with such large datasets is notoriously cumbersome. To overcome this, we developed a computational approach (the Rareservoir) that distills the most important information into a relatively small database, allowing us to apply our statistical methods nimbly.
We noted that the "genetic variants" that cause rare diseases are typically kept rare in the human population by natural selection because affected individuals tend to have few children, if any. This meant that we could discard the genetic information corresponding to variants that are common in the human population without throwing away the key disease-causing variants. By focussing on these "rare variants", we were able to perform our analyses using a small database (a "Rareservoir"), only 5.5 GB in size, hastening our progress significantly.
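To make the filtering idea concrete, here is a minimal sketch of discarding common variants by population allele frequency. It is only an illustration and not the Rareservoir implementation: the `Variant` record, the `keep_rare` function, the toy data, and the cut-off value are all hypothetical, and the interview does not state which threshold was actually used.

```python
from dataclasses import dataclass

# Hypothetical cut-off for illustration only; the interview does not give one.
MAX_ALLELE_FREQUENCY = 0.001


@dataclass(frozen=True)
class Variant:
    """A toy variant record (hypothetical structure, not the Rareservoir schema)."""
    chrom: str
    pos: int
    ref: str
    alt: str
    allele_frequency: float  # frequency of the alternate allele in the population


def keep_rare(variants, max_af=MAX_ALLELE_FREQUENCY):
    """Return only the variants whose population allele frequency is below max_af."""
    return [v for v in variants if v.allele_frequency < max_af]


# Made-up example data: one common variant and two rare ones.
example = [
    Variant("1", 1000, "A", "G", 0.25),
    Variant("2", 2000, "C", "T", 0.00002),
    Variant("X", 3000, "G", "A", 0.0),
]

rare = keep_rare(example)
print(f"kept {len(rare)} of {len(example)} variants")
```

Because most variants observed in any one genome are common in the population, a filter of this kind removes the bulk of the data while retaining the variants most likely to cause rare disease.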