Automated Churchill Program Sequences Entire Human Genome In 90 Minutes

MedicalResearch.com Interview with:
Peter White, Ph.D.
Principal Investigator, Center for Microbial Pathogenesis
Director, Biomedical Genomics Core
Director of Molecular Bioinformatics, The Research Institute at Nationwide Children’s Hospital
Assistant Professor of Pediatrics, The Ohio State University

Medical Research: What is the background for this study? What are the main findings?

Dr. White: Next-generation sequencing has revolutionized genomics research and has opened the door to a new era of genomic medicine. It’s now possible to sequence a patient’s entire genome in about two days, but the output from the sequencer must go through multiple computationally challenging steps before it can be processed for clinically relevant information. The challenge we found is that this data analysis process took days to perform, had to be carried out by highly qualified bioinformaticians, and required enormous computational resources.

To overcome the challenges of analyzing that large amount of genomic sequence data, we developed a computational pipeline called “Churchill”, which we published in the latest issue of Genome Biology (http://genomebiology.com/2015/16/1/6/abstract). Churchill fully automates the analytical process required to take raw sequence data through a series of complex and computationally intensive processes, ultimately producing a list of genetic variants ready for clinical interpretation and tertiary analysis. The major impact of our work was the development of a novel balanced parallelization strategy that allows efficient analysis of a whole genome sequencing sample in as little as 90 minutes.
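As a rough illustration of what a balanced parallelization strategy can look like, the sketch below splits a genome into equal-sized chunks that are allowed to cross chromosome boundaries, so that no single worker is stuck with the longest chromosome. This is not Churchill's actual implementation: the function name `balanced_regions`, the toy chromosome names and lengths, and the worker count are all assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

# (chromosome, start, end), 0-based and half-open. Hypothetical representation.
Region = Tuple[str, int, int]


def balanced_regions(chrom_lengths: Dict[str, int], n_workers: int) -> List[List[Region]]:
    """Split a genome into n_workers chunks of near-equal total length.

    A chunk may span a chromosome boundary, so each chunk is a list of regions.
    """
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    total = sum(chrom_lengths.values())
    chroms = list(chrom_lengths.items())
    chunks: List[List[Region]] = []
    idx, pos = 0, 0   # current chromosome index and offset within it
    consumed = 0      # bases assigned so far
    for w in range(n_workers):
        # Cumulative targets keep chunk sizes within one base of each other.
        need = total * (w + 1) // n_workers - consumed
        chunk: List[Region] = []
        while need > 0:
            name, length = chroms[idx]
            take = min(need, length - pos)
            if take > 0:
                chunk.append((name, pos, pos + take))
                pos += take
                need -= take
                consumed += take
            if pos == length:  # chromosome exhausted, move to the next one
                idx += 1
                pos = 0
        chunks.append(chunk)
    return chunks


if __name__ == "__main__":
    # Toy lengths, not real chromosome sizes.
    toy_genome = {"chrA": 100, "chrB": 60, "chrC": 40}
    for i, chunk in enumerate(balanced_regions(toy_genome, 4)):
        size = sum(end - start for _, start, end in chunk)
        print(i, size, chunk)
```

With the toy genome above, splitting by chromosome would leave one task holding half of all the work, whereas the balanced split gives each of the four workers the same number of bases. A real pipeline would also have to handle reads that straddle region boundaries, which this sketch ignores.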

Medical Research: What should clinicians and patients take away from your report?

Dr. White: At Nationwide Children’s we have a strategic goal to introduce genomic medicine into multiple domains of pediatric research and healthcare. Rapid diagnosis of monogenic disease can be critical in newborns, so our initial focus was to create an analysis pipeline that was extremely fast, but didn’t sacrifice clinical diagnostic standards of reproducibility and accuracy. Each step in the process was optimized to significantly reduce analysis time, without sacrificing data integrity, resulting in an analysis method that is 100 percent reproducible. The output of Churchill was validated using National Institute of Standards and Technology (NIST) benchmarks. In comparison with other computational pipelines, Churchill was shown to have the highest overall diagnostic effectiveness and was the only pipeline to be able to function deterministically.

Medical Research: What recommendations do you have for future research as a result of this study?

Dr. White: Genomics is having a tremendous impact on the discovery and diagnosis of rare genetic diseases in both research and clinical settings, with its importance being highlighted by President Obama’s recent Precision Medicine Initiative. However, to truly see the potential of genomic medicine we will need to start sequencing entire patient populations. Such studies are already beginning; for example, “Genomics England” will sequence 100,000 genomes from patients with rare inherited diseases, cancer and infection. Using cloud-computing resources from Amazon Web Services, Churchill was able to complete analysis of 1,088 whole genome samples from the 1000 Genomes Project in seven days and identified millions of new genetic variants. We believe that Churchill may be an optimal approach to tackle the data analysis challenge of population-scale genomic studies, and through the use of cloud computing we hope to create an environment where genomic data can be easily shared and analyzed by scientists around the globe.
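To put the population-scale figure in perspective, the short calculation below converts the quoted numbers (1,088 genomes, seven days) into an average wall-clock time per genome. It assumes seven full 24-hour days, which the interview does not state explicitly, and it measures aggregate throughput only: samples analyzed concurrently in the cloud can each take much longer than this average.

```python
genomes = 1088   # whole genome samples, as quoted in the interview
days = 7         # assumed to mean seven full 24-hour days

total_minutes = days * 24 * 60               # 10,080 minutes
minutes_per_genome = total_minutes / genomes

print(f"{minutes_per_genome:.1f} minutes of wall-clock time per genome on average")
```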

Citation:

Churchill: an ultra-fast, deterministic, highly scalable and balanced parallelization strategy for the discovery of human genetic variation in clinical and population-scale genomics

Kelly BJ, Fitch JR, Hu Y, Corsmeier DJ, Zhong H, Wetzel AN, Nordquist RD, Newsom DL, White P.

Genome Biol. 2015 Jan 20;16(1):6.

 

Peter White, Ph.D., Principal Investigator, Center for Microbial Pathogenesis, & Director, Biomedical Genomics Core (2015). Automated Churchill Program Sequences Entire Human Genome In 90 Minutes. MedicalResearch.com