Researchers at Rady Children’s Institute for Genomic Medicine (RCIGM) have built an automated pipeline designed to deliver a potential diagnosis in record time for hospitalized children suspected of having rare genetic diseases.
The pipeline required minimal user intervention, increased usability, and shortened the time to diagnosis, delivering a provisional finding in a median time of less than 24 hours, Rady said. The pipeline includes rapid whole genome sequencing from a blood sample, machine learning for genetic interpretation, and the use of clinical natural language processing to analyze data from electronic health records.
The work is published in Science Translational Medicine in a paper titled, “Diagnosis of genetic diseases in seriously ill children by rapid whole-genome sequencing and automated phenotyping and interpretation.”
“This is truly pioneering work by the RCIGM team–saving the lives of very sick newborn babies by using AI to rapidly and accurately analyze their whole genome sequence,” said Eric Topol, MD, Professor of Molecular Medicine at Scripps Research. Topol is also Executive VP of Scripps Research and Director and Founder of the Scripps Research Translational Institute.
Said Michelle Clark, PhD, statistical scientist at RCIGM and the first author of the study: “Using machine-learning platforms doesn’t replace human experts. Instead, it augments their capabilities.”
“By informing timely targeted treatments, rapid genome sequencing can improve the outcomes of seriously ill children with genetic diseases,” Clark added.
The technologies involved include a rapid whole genome sequencing (rWGS) process that screens a child’s entire genetic makeup for thousands of genetic anomalies from a blood sample. Key components in the rWGS pipeline come from Illumina, including Nextera DNA Flex library preparation and whole genome sequencing on the NovaSeq 6000 with the S1 flow cell format.
Other pipeline elements include Clinithink’s clinical natural language processing platform, CliX ENRICH, which quickly combs through a patient’s electronic medical record to automatically extract crucial phenotype information.
Another core element of the machine learning system is MOON by Diploid. The platform uses AI to automate genome interpretation by filtering and ranking likely pathogenic variants. Deep phenotype integration, based on natural language processing of the medical literature, is one of the key features driving this automated interpretation. MOON takes five minutes to suggest the causal mutation out of the 4.5 million variants in a whole genome. In addition, Alexion’s rare disease and data science expertise enabled the translation of clinical information into a computable format for guided variant interpretation.
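The general idea of filtering variants and then ranking them against a patient's phenotype can be illustrated with a toy sketch. This is not MOON's algorithm, which the article does not describe in detail. The gene names, phenotype terms, frequency cutoff, and the use of Jaccard overlap as a score are all assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    gene: str
    population_frequency: float  # fraction of the population carrying it
    predicted_damaging: bool     # output of some upstream predictor


# Hypothetical gene-to-phenotype associations (illustrative only).
GENE_PHENOTYPES = {
    "GENE_A": {"seizures", "hypotonia"},
    "GENE_B": {"cardiomyopathy"},
    "GENE_C": {"seizures"},
}


def phenotype_overlap(gene, patient_terms):
    """Jaccard overlap between a gene's known phenotypes and the patient's."""
    known = GENE_PHENOTYPES.get(gene, set())
    if not known:
        return 0.0
    return len(known & patient_terms) / len(known | patient_terms)


def rank_variants(variants, patient_terms, max_frequency=0.01):
    """Keep rare, predicted-damaging variants, best phenotype match first."""
    candidates = [
        v for v in variants
        if v.population_frequency <= max_frequency and v.predicted_damaging
    ]
    return sorted(
        candidates,
        key=lambda v: phenotype_overlap(v.gene, patient_terms),
        reverse=True,
    )


variants = [
    Variant("GENE_A", 0.0001, True),
    Variant("GENE_B", 0.0002, True),
    Variant("GENE_C", 0.2, True),      # too common, filtered out
    Variant("GENE_C", 0.0005, False),  # not predicted damaging, filtered out
]
patient_terms = {"seizures", "hypotonia"}
ranked = rank_variants(variants, patient_terms)
print([v.gene for v in ranked])
```

In this toy case the variant in the hypothetical GENE_A ranks first because its associated phenotypes match the patient's terms exactly. A production system would draw on curated databases and far richer evidence than a single overlap score.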
Comparison and Verification
The genetic sequencing data was fed into automated computational platforms under the supervision of researchers. For comparison and verification, clinical medical geneticists on the team used Fabric Genomics’ AI-based algorithms VAAST and Phevor, which are integrated into the clinical decision support software OPAL (now called Fabric Enterprise), to confirm the output of the automated pipeline.
Fabric Enterprise, in turn, is integrated into RCIGM’s standard variant analysis and data interpretation workflow, Fabric Genomics said, combining sequencing data with phenotypic data from patient records to rank genes based on pathogenicity and identify the genetic variants responsible for genetic disease.
RCIGM began performing genomic sequencing in July 2017 to guide medical intervention for neonatal and pediatric intensive care (NICU/PICU) patients. In February 2018, RCIGM researchers broke the Guinness World Record for the fastest diagnosis through whole genome sequencing, with an average of 19 hours.
As of March of this year, the team had completed testing and interpretation of more than 750 children’s genomes. One-third of the children tested have received a genetic diagnosis, with 25% of those benefiting from an immediate change in clinical care based upon their diagnosis.
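As a rough back-of-the-envelope reading of those proportions, the short calculation below treats 750 as a lower bound (the article says "more than 750") and takes "one-third" and "25%" at face value. The resulting counts are approximations, not figures reported by RCIGM.

```python
genomes_tested = 750                      # lower bound stated in the article
diagnosed = genomes_tested / 3            # one-third received a diagnosis
care_changed = diagnosed * 0.25           # 25% of those diagnosed
share_of_all = care_changed / genomes_tested

print(f"diagnosed: about {diagnosed:.0f}")
print(f"immediate change in care: about {int(care_changed)}")
print(f"share of all children tested: about {share_of_all:.1%}")
```

On these assumptions, at least about 250 children received a diagnosis and at least about 62 saw an immediate change in care, or roughly one in twelve of all children tested.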
“Some people call this artificial intelligence. We call it augmented intelligence,” said Stephen Kingsmore, MD, DSc, President and CEO of RCIGM. “Patient care will always begin and end with the doctor. By harnessing the power of technology, we can quickly and accurately determine the root cause of genetic diseases. We rapidly provide this critical information to intensive care physicians so they can focus on personalizing care for babies who are struggling to survive.”
An estimated 4% of newborns in North America are affected by genetic diseases, which are the leading cause of death in infants. Rare genetic diseases also account for approximately 15% of admissions to children’s hospitals.
Increased automation of the process removes a barrier to scaling up clinical use of WGS by reducing the need for time-consuming manual analysis and interpretation of the data by certified clinical medical geneticists, who are in short supply. Although this pipeline would need to be adapted for use at different hospital systems, such an automated tool could help clinicians expedite an accurate genetic disease diagnosis, potentially hastening lifesaving changes to patient care. This new method opens the door to increased use of genome sequencing as a first-line diagnostic test for babies with cryptic conditions.
“We’ve been partners with [RCIGM] for years and feel privileged to help them make such a positive impact on children with otherwise undiagnosed conditions,” added Martin Reese, PhD, Founder and CEO of Fabric Genomics.