Over the last fifteen years, we have seen a fundamental shift in analytical research. For decades, the limiting factor in quantitative research was collecting relevant data to test a particular hypothesis. Investigators would painstakingly design a research study, enroll participants, collect data, and finally perform statistical analyses to test that hypothesis. While this approach is still invaluable for asking focused research questions, modern researchers swim in a deluge of data generated by high-throughput experimental studies (such as genome-wide association studies), by clinical data systems (such as electronic health records), or by the general activities of modern daily life, such as using a smartphone.
The new challenge for data analysis is developing computational approaches and frameworks that allow investigators to ask insightful questions of these large collections of data and, perhaps more importantly, to interpret analysis results with an understanding of the quirks and biases of data collection. We have developed a variety of approaches for analyzing genomic data, including PARIS, a tool for examining pathway-based hypotheses in genetic data, and Hi-MC, a tool for defining haplogroups from mitochondrial genetic data.