Automated Chart Review for Identifying Factors Associated with Childhood Asthma by Utilizing Electronic Medical Records 
Sunday, March 4, 2018: 4:30 PM
South Hall A2 (Convention Center)
Chung-Il Wi, MD, So-Hyun Lee, Sungrim Moon, PhD, Hee Yun Seol, MD, Sunghwan Sohn, PhD, Euijung Ryu, PhD, Hongfang Liu, PhD, Young J. Juhn, MD MPH
RATIONALE: Our recently validated natural language processing (NLP) algorithms, applied to electronic medical records (EMRs), significantly reduced the time and effort required for chart review to ascertain asthma status. We extended this prior work by using NLP algorithms to extract individual risk factors for childhood asthma from EMRs.

METHODS: We used a convenience sample (n=397) of the 1997-2007 Olmsted County Birth Cohort. A training cohort (n=177) was used to train the NLP systems to extract, from both the children’s and their parents’ EMRs, key terms and sentences related to risk factors at 1) the patient (child) level (ie, history of breastfeeding and other atopic conditions [allergic rhinitis, eczema, and food allergy]) and 2) the family level (ie, family history of asthma and other atopic conditions). We assessed the performance of the NLP algorithms against manual chart review as the gold standard (criterion validity) in an independent test cohort (n=220).

RESULTS: The median age of the test cohort was 13 years (50% female, 81% White, 63% with asthma). Of the children, 90% had a history of breastfeeding and 6% had a history of food allergy; the prevalence of the other histories ranged from 15% to 52%. Positive predictive values of the NLP algorithms for identifying each asthma-related variable were 87-100%, and negative predictive values were 86-99%. Collecting the asthma risk factors in this study took an average of 7 hours by manual chart review and 50 minutes by the NLP algorithms.
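
For readers less familiar with the validity measures reported here, the following is a minimal sketch of how positive and negative predictive values can be computed for one risk factor when manual chart review is treated as the gold standard. The function name `predictive_values` and the toy labels are illustrative assumptions; they are not the study's code or data.

```python
from typing import Sequence, Tuple


def predictive_values(
    nlp_labels: Sequence[bool], chart_labels: Sequence[bool]
) -> Tuple[float, float]:
    """Return (PPV, NPV) of NLP labels against gold-standard chart review.

    Hypothetical helper for illustration only; not taken from the study.
    """
    if len(nlp_labels) != len(chart_labels):
        raise ValueError("label sequences must have the same length")

    pairs = list(zip(nlp_labels, chart_labels))
    tp = sum(1 for nlp, gold in pairs if nlp and gold)          # true positives
    fp = sum(1 for nlp, gold in pairs if nlp and not gold)      # false positives
    tn = sum(1 for nlp, gold in pairs if not nlp and not gold)  # true negatives
    fn = sum(1 for nlp, gold in pairs if not nlp and gold)      # false negatives

    if tp + fp == 0 or tn + fn == 0:
        raise ValueError("PPV or NPV is undefined: no positive or no negative NLP calls")

    ppv = tp / (tp + fp)  # share of NLP-positive children confirmed by chart review
    npv = tn / (tn + fn)  # share of NLP-negative children confirmed by chart review
    return ppv, npv


# Toy example (invented labels): does each child have a given risk factor?
nlp = [True, True, True, False, False, False, True, False]
chart = [True, True, False, False, False, True, True, False]
ppv, npv = predictive_values(nlp, chart)
print(f"PPV={ppv:.2f}, NPV={npv:.2f}")
```

Unlike sensitivity and specificity, both values depend on how common the risk factor is in the test cohort, so they are best read alongside the prevalence figures reported above.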

CONCLUSIONS: Because the NLP algorithms for identifying individual risk factors for asthma from EMRs are cost-effective and suitable for this task, they will be useful tools for large-scale clinical studies.