A recently awarded federal grant will help University of Minnesota researchers combine separate data and improve research across different fields of biological science.
School of Public Health researchers have begun developing new data analysis methods that will help them identify biological processes that are currently studied using different techniques. The team received a $1 million federal grant from the National Institute of General Medical Sciences, which began funding their work in March.
“The overarching goal of our work is to develop methods that allow for the comprehensive analysis of … data that are both multi-source and also multi-way,” said Eric Lock, lead researcher and assistant professor in the University’s Division of Biostatistics.
Currently, data are split into separate groups that each capture only one aspect of the functions happening in the body, straining researchers’ ability to see how different processes relate. Lock said it is challenging to take multiple sources of data measured with different technologies and incorporate them into a joint statistical framework that organizes the data in a form analysis software can use.
Many studies collect both multi-source and multi-way data. These data come from various technologies with different study samples or from the same technology measured over different time frames. For example, an analysis to identify lung gene expression may include data measuring gene expression in skin, blood and lung tissue. That study may also measure gene expression over time.
Lock said it will take four or five years before the method is widely usable by researchers.
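To make the gene expression example concrete, the short Python sketch below shows one way such data could be laid out. It is only an illustration, not the researchers’ method: the tissue names come from the example above, while the sizes, the random values, the helper names and the assumption that every tissue covers the same subjects are all hypothetical.

```python
# Hypothetical illustration only: sizes, values, and helper names are made up.
import random

random.seed(0)

N_SUBJECTS, N_GENES, N_TIMEPOINTS = 4, 3, 2
TISSUES = ["skin", "blood", "lung"]  # three data sources


def make_multiway_block(n_subjects, n_genes, n_timepoints):
    """Return a subjects x genes x time points nested list of fake expression values."""
    return [
        [[random.random() for _ in range(n_timepoints)] for _ in range(n_genes)]
        for _ in range(n_subjects)
    ]


def shape(block):
    """Return (subjects, genes, time points) for a nested-list block."""
    return (len(block), len(block[0]), len(block[0][0]))


# Multi-source: one block per tissue.
# Multi-way: each block has three dimensions (subject, gene, time point).
# Assumption for simplicity: every tissue covers the same subjects.
data = {tissue: make_multiway_block(N_SUBJECTS, N_GENES, N_TIMEPOINTS) for tissue in TISSUES}

for tissue, block in data.items():
    print(tissue, shape(block))
```

Each tissue is a separate source, and each source is a three-dimensional block rather than a flat table, which is why the two properties together are hard to fit into a single conventional analysis.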
Pierre-Gilles Henry, assistant professor in the University’s Center for Magnetic Resonance Research, works with his team on genetic ataxias, a group of diseases that affect coordination and balance, typically resulting from brain deterioration.
Using MRI and magnetic resonance spectroscopy, his team identifies structural and biochemical changes in the brain and spinal cord.
“Dr. Lock’s new methods [will] allow us to combine these different sources of data together to track disease progression even more precisely than by using only one dataset at a time,” Henry said. “This would be a tremendous development. We are excited by this collaboration.”
Lynn Eberly, co-investigator and professor in the Division of Biostatistics, said dealing with large volumes of data is becoming a more common difficulty, especially in fields like genetics. In that field, a study of even a small group of people can involve hundreds of thousands of genes.
Eberly works with MRI data and said that even though all of the imaging is magnetic resonance, the images represent very different processes.
“They’re all signals from the MR magnet, and are represented by numbers on a scale, and the scales can be very different, and their interpretation can be very different,” she said.
Raghavendra Rao, neonatologist and associate professor of pediatrics, works with a research team focused on discovering markers in blood plasma for impending brain dysfunction due to iron deficiency in infants. Rao said a sophisticated strategy is necessary for combining large datasets from different platforms to effectively complete this work.