Analyzing and compiling large datasets is one of the fundamental principles behind the development of computational systems that evolved into modern computers. But what happens when the datasets become so large or complex that traditional processing applications are inadequate? Such are the questions facing big data scientists who are trying to make sense of the seemingly insurmountable glut of digital information.
Nowhere has the big data push had a bigger impact than on the field of genomics. Understanding the key interactions between genes contained within the human genome and the subsequent proteins for which they code is critical to unlocking the functional roles of specific biomolecules in disease states.
To address many of these genomic issues, researchers from the Icahn School of Medicine at Mount Sinai and Princeton University have designed a new online tool that predicts the role of key proteins and genes in diseases of the human immune system.
“The resulting comprehensive web-accessible resource (ImmuNet) facilitates researchers’ use of the global data output to generate testable hypotheses for specific immunological research areas,” noted the scientists. “We demonstrate the value and ease of use of ImmuNet to complement genetic studies for identifying disease-associated genes.”
The online tool uses information compiled from 38,088 public experiments to predict new immune pathway interactions, mechanisms, and disease-associated genes. Due to advances in computing power and storage, combined with sharp declines in technology prices, big data researchers are able to combine more powerful algorithms and models into tools such as ImmuNet, which can pull previously unidentified disease patterns from databases.
"This new tool unlocks the insight contained in big data, the world's biomedical research output, to help understand immunological mechanisms and diseases," explained co-senior author Stuart Sealfon, M.D., professor and chairman of the department of neurology at Mount Sinai Health System. "The goal of 'ImmuNet' is to accelerate the understanding of immune pathways and genes, ultimately leading to the development of improved treatment for diseases with an immunological component."
The findings from this study were published online late today in Immunity through an article entitled “Interactive Big Data Resource to Elucidate Human Immune Pathways and Diseases.”
ImmuNet enables immunology researchers without special computational training to use Bayesian data integration, a probabilistic approach that updates the likelihood of a hypothesis as evidence from each dataset is combined, together with machine learning algorithms to "interrogate" this huge archive of public data. Analyses such as these are able to detect relevant information among the sea of often-conflicting data obtained from diverse experiments. Additionally, the Bayesian analysis gives weight only to those datasets that provide new insight about a pathway of interest while discounting datasets that are not relevant to the targeted pathway.
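The idea of weighting informative datasets and discounting irrelevant ones can be illustrated with a minimal naive Bayes sketch in Python. This is not ImmuNet's actual implementation. It assumes each dataset contributes a single yes/no observation per gene pair and that a small set of gene pairs with known relationships is available for training. The function names and the toy data are hypothetical.

```python
from math import exp, log


def learn_likelihoods(observations, labels):
    """Estimate P(evidence=1 | related) and P(evidence=1 | unrelated)
    for one dataset from labeled gene pairs, with add-one smoothing."""
    related = [o for o, y in zip(observations, labels) if y]
    unrelated = [o for o, y in zip(observations, labels) if not y]
    p_related = (sum(related) + 1) / (len(related) + 2)
    p_unrelated = (sum(unrelated) + 1) / (len(unrelated) + 2)
    return p_related, p_unrelated


def posterior(evidence, likelihoods, prior):
    """Combine evidence from several datasets into the posterior
    probability that a gene pair is functionally related, treating
    the datasets as conditionally independent (naive Bayes)."""
    log_odds = log(prior / (1 - prior))
    for seen, (p_related, p_unrelated) in zip(evidence, likelihoods):
        if seen:
            log_odds += log(p_related / p_unrelated)
        else:
            log_odds += log((1 - p_related) / (1 - p_unrelated))
    return 1 / (1 + exp(-log_odds))


# Hypothetical training data: eight gene pairs, the first four known
# to be related within the pathway of interest.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
informative = [1, 1, 1, 0, 0, 0, 0, 1]    # mostly agrees with the labels
uninformative = [1, 0, 1, 0, 1, 0, 1, 0]  # unrelated to the labels

likelihoods = [
    learn_likelihoods(informative, labels),
    learn_likelihoods(uninformative, labels),
]

# A new gene pair observed in the informative dataset; the second
# dataset's observation makes no difference to the result.
p_seen_in_both = posterior([1, 1], likelihoods, prior=0.1)
p_seen_in_first_only = posterior([1, 0], likelihoods, prior=0.1)
```

In this toy setup the uninformative dataset has the same estimated probability under both hypotheses, so its likelihood ratio is 1 and it leaves the posterior unchanged, while the informative dataset raises the probability above the prior. A real integration would use many more datasets and richer evidence than a single binary flag.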
The researchers stressed that one of the main goals of ImmuNet is to advance the understanding of the immune system, the network of cells and organs that protects the body against infections and cancer.
"We expect the applicability of ImmuNet to wide-ranging areas of immunology will grow with the incorporation of continually increasing public big data," stated co-senior author Olga Troyanskaya, Ph.D., professor in the department of computer science and Lewis-Sigler Institute of Integrative Genomics at Princeton University. "By enabling immune researchers from diverse backgrounds to leverage these valuable and heterogeneous data collections, ImmuNet has the potential to accelerate discovery in immunology."