Using Electronic Health Records to Generate Phenotypes for Research

Sarah A. Pendergrass and Dana C. Crawford

Electronic health records contain patient-level data collected during and for clinical care. Data within the electronic health record include diagnostic billing codes, procedure codes, vital signs, laboratory test results, clinical imaging, and physician notes. With repeated clinic visits, these data are longitudinal, providing important information on disease development, progression, and response to treatment or intervention strategies. The near-universal adoption of electronic health records nationally has the potential to provide population-scale real-world clinical data accessible for biomedical research, including genetic association studies. For this research potential to be realized, high-quality research-grade variables must be extracted from these clinical data warehouses. We describe here common and emerging electronic phenotyping approaches applied to electronic health records, as well as the current limitations of these approaches and the biases in clinically collected data that affect their use in research.


Posted in Dana Crawford, Publications.
Dana Crawford


Professor of Population and Quantitative Health Sciences and Associate Director of the Cleveland Institute for Computational Biology, with interest in pharmacogenomics, electronic health records, and diverse populations. Also, an avid foodie!