METHODS: Analysis cohorts were defined using cloud-based, de-identified, longitudinal data from over 1.6 million patients diagnosed with asthma (IBM® Explorys Platform). Gradient boosting trees were used to model CAE risk and to assess the predictive power of reduced feature sets drawn from thousands of demographic and clinical features. Baseline and history-limited models were developed to predict the risk of future CAE from retrospective records covering all available data (no time limit) and the last 365 days, respectively. A short-term model was used to estimate CAE risk in the following 30 days from records of the previous 180 days.
RESULTS: The final cross-validated models, which incorporated 269, 166 and 116 CAE predictors for the baseline, history-limited and short-term models, respectively, exceeded benchmarks reported in the literature from studies using comparable settings. Performance, measured by area under the curve, was 0.83, 0.83 and 0.74 for the baseline, history-limited and short-term models, respectively. Many of the risk factors included in our models were consistent with previously published findings, further supporting the validity of our results.
CONCLUSIONS: Combining predictive analytics and real-world data may be effective in developing clinically useful models to identify risk of future asthma exacerbations, allowing healthcare systems to design preventive interventions for at-risk patients.
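The three history windows described under METHODS (all available records, the last 365 days, and the last 180 days with a 30-day prediction horizon) can be illustrated with a minimal sketch. It is not the study's pipeline: the record layout, the function names `features_in_window` and `label_in_horizon`, the use of simple occurrence counts as features, and the choice to count an event on the index date as part of the horizon are all assumptions made for illustration. In the study, features of this kind would then be passed to a gradient boosting model, a step not shown here.

```python
from datetime import date, timedelta

# Hypothetical record layout: (patient_id, event_date, feature_code).

def features_in_window(records, index_date, lookback_days=None):
    """Count feature codes observed strictly before index_date.

    lookback_days=None keeps all history (baseline-style setting);
    365 or 180 mimic the history-limited and short-term settings.
    """
    counts = {}
    for _patient_id, event_date, code in records:
        if event_date >= index_date:
            continue  # never use information from the index date onward
        age_days = (index_date - event_date).days
        if lookback_days is not None and age_days > lookback_days:
            continue  # record is older than the allowed history window
        counts[code] = counts.get(code, 0) + 1
    return counts


def label_in_horizon(outcome_dates, index_date, horizon_days=30):
    """Return 1 if any outcome falls in [index_date, index_date + horizon_days)."""
    end = index_date + timedelta(days=horizon_days)
    return int(any(index_date <= d < end for d in outcome_dates))


# Illustrative, made-up records for a single patient.
index_date = date(2020, 1, 1)
records = [
    ("p1", date(2017, 5, 1), "A"),
    ("p1", date(2019, 3, 1), "B"),
    ("p1", date(2019, 10, 1), "A"),
    ("p1", date(2020, 1, 5), "C"),  # after the index date, so always ignored
]

baseline = features_in_window(records, index_date)            # all history
history_limited = features_in_window(records, index_date, 365)
short_term = features_in_window(records, index_date, 180)
label = label_in_horizon([date(2020, 1, 20)], index_date, 30)
```

Separating the lookback filter from the outcome label keeps the two time windows independent, so the same records can feed each of the three settings by changing only `lookback_days` and `horizon_days`.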