SES Text Mining Algorithms for Electronic Health Records

From the Crawford Lab: We have developed a set of text-mining algorithms to extract education and occupation, both important variables that describe socioeconomic status (SES), from electronic health records. The development and evaluation of these algorithms are described in PMC5147499, and the exclusion, jobs, and prefix lists developed for them can be found here. Detailed […]

Reducing Clinical Noise for Body Mass Index Measures Due to Unit and Transcription Errors in the Electronic Health Record.

Body mass index (BMI) is an important outcome and covariate adjustment for many clinical association studies. Accurate assessment of BMI, therefore, is a critical part of many study designs. Electronic health records (EHRs) are a growing source of clinical data for research purposes, and have proven useful for identifying and replicating genetic associations. EHR-based data […]

Hi-MC: High-throughput Mitochondrial Haplogroup Classification

From the Crawford Lab: The Hi-MC package provides high-level mitochondrial haplogroups given standard PLINK .map and .ped files. Hi-MC is a cost-effective approach to characterize major haplogroups in large sample sizes similar to those described in PMC4113317. Detailed usage of the package is described on GitHub, in the preprint, and in the final publication in PeerJ.

COCOS: Codon Consequence Scanner

From the Bush Lab and Haines Lab: Mariusz Butkiewicz has developed COCOS, a plugin for the Ensembl Variant Effect Predictor (VEP) that annotates reading frame changes. The plugin captures amino acid sequence alterations stemming from variants that produce an altered reading frame, e.g. stop-lost variants and small genetic insertions and deletions (InDels). The GitHub repository for COCOS can be found here.

Interaction eQTL Analysis

From the Bush Lab: This archive contains scripts and data for an analysis that searches for cis-interacting variants that influence gene expression. Our publication can be found here: http://www.cell.com/ajhg/fulltext/S0002-9297(16)30323-8 Available downloads: analysis workflow, tarball archive of analysis scripts, and tarball archive of data files.

Early-Onset Alzheimer Disease and Candidate Risk Genes Involved in Endolysosomal Transport.

Mutations in APP, PSEN1, and PSEN2 lead to early-onset Alzheimer disease (EOAD) but account for only approximately 11% of EOAD overall, leaving most of the genetic risk for the most severe form of Alzheimer disease unexplained. This extreme phenotype likely harbors highly penetrant risk variants, making it primed for discovery of novel risk genes and pathways for […]

A population-specific reference panel empowers genetic studies of Anabaptist populations.

Genotype imputation is a powerful strategy for achieving the large sample sizes required for identification of variants underlying complex phenotypes, but imputation of rare variants remains problematic. Genetically isolated populations offer one solution; however, population-specific reference panels are needed to assure optimal imputation accuracy and allele frequency estimation. Here we report the Anabaptist Genome Reference […]

Integrative analysis of novel hypomethylation and gene expression signatures in glioblastomas.

Molecular and clinical heterogeneity critically hinders better treatment outcomes for glioblastomas (GBMs); integrative analysis of genomic and epigenomic data may provide useful information for improving personalized medicine. By applying a training-validation approach, we identified a novel hypomethylation signature comprising three CpGs at non-CpG island (CGI) open sea regions for GBMs. The hypomethylation signature consistently predicted […]

Do race and age vary in non-malignant central nervous system tumor incidences in the United States?

Epidemiological analyses of many cancers have demonstrated differences in incidence and outcome for patients from different racial backgrounds. The aim of this study was to determine the incidence of non-malignant CNS tumors by race and age to identify incidence variance. Data from the Central Brain Tumor Registry of the United States (CBTRUS) from 2009 to […]

Germline Genetic Variants and Lung Cancer Survival in African Americans.

Background: African Americans have the highest lung cancer mortality in the United States. Genome-wide association studies (GWASs) of germline variants influencing lung cancer survival have not yet been conducted with African Americans. We examined five previously reported GWAS catalog variants and explored additional genome-wide associations among African American lung cancer cases. Methods: Incident non-small cell lung […]

Fine-mapping of lipid regions in global populations discovers ethnic-specific signals and refines previously identified lipid loci.

Genome-wide association studies have identified over 150 loci associated with lipid traits; however, no large-scale studies exist for Hispanics and other minority populations. Additionally, the genetic architecture of lipid-influencing loci remains largely unknown. We performed one of the most racially/ethnically diverse fine-mapping genetic studies of HDL-C, LDL-C, and triglycerides to date using SNPs on the MetaboChip […]