Automated Identification of Heart Failure with Reduced Ejection Fraction using Deep Learning-based Natural Language Processing.

Link: https://doi.org/10.1101/2023.09.10.23295315
Authors: Nargesi, Arash A; Adejumo, Philip; Dhingra, Lovedeep; Rosand, Benjamin; Hengartner, Astrid; Coppi, Andreas; Benigeri, Simon; Sen, Sounok; Ahmad, Tariq; Nadkarni, Girish N; Lin, Zhenqiu; Ahmad, Faraz S; Krumholz, Harlan M; Khera, Rohan

Abstract: The lack of automated tools for measuring care quality has limited the implementation of a national program to assess and improve guideline-directed care in heart failure with reduced ejection fraction (HFrEF). A key challenge for constructing such a tool has been an accurate, accessible approach for identifying patients with HFrEF at hospital discharge, an opportunity to evaluate and improve the quality of care. We developed a novel deep learning-based language model for identifying patients with HFrEF from discharge summaries using a semi-supervised learning framework. For this purpose, hospitalizations with heart failure at Yale New Haven Hospital (YNHH) between 2015 to 2019 were labeled as HFrEF if the left ventricular ejection fraction was under 40% on antecedent echocardiography. The model was internally validated with model-based net reclassification improvement (NRI) assessed against chart-based diagnosis codes. We externally validated the model on discharge summaries from hospitalizations with heart failure at Northwestern Medicine, community hospitals of Yale New Haven Health in Connecticut and Rhode Island, and the publicly accessible MIMIC-III database, confirmed with chart abstraction. A total of 13,251 notes from 5,392 unique individuals (mean age 73 ± 14 years, 48% female), including 2,487 patients with HFrEF (46.1%), were used for model development (train/held-out test: 70/30%). The deep learning model achieved an area under receiving operating characteristic (AUROC) of 0.97 and an area under precision-recall curve (AUPRC) of 0.97 in detecting HFrEF on the held-out set. In external validation, the model had high performance in identifying HFrEF from discharge summaries with AUROC 0.94 and AUPRC 0.91 on 19,242 notes from Northwestern Medicine, AUROC 0.95 and AUPRC 0.96 on 139 manually abstracted notes from Yale community hospitals, and AUROC 0.91 and AUPRC 0.92 on 146 manually reviewed notes at MIMIC-III. Model-based prediction of HFrEF corresponded to an overall NRI of 60.2 ± 1.9% compared with the chart diagnosis codes (p-value < 0.001) and an increase in AUROC from 0.61 [95% CI: 060-0.63] to 0.91 [95% CI 0.90-0.92]. We developed and externally validated a deep learning language model that automatically identifies HFrEF from clinical notes with high precision and accuracy, representing a key element in automating quality assessment and improvement for individuals with HFrEF.

Leave a Comment