• Media type: E-Article
  • Title: Validation of an internationally derived patient severity phenotype to support COVID-19 analytics from electronic health record data
  • Contributor: Klann, Jeffrey G; Estiri, Hossein; Weber, Griffin M; Moal, Bertrand; Avillach, Paul; Hong, Chuan; Tan, Amelia L M; Beaulieu-Jones, Brett K; Castro, Victor; Maulhardt, Thomas; Geva, Alon; Malovini, Alberto; South, Andrew M; Visweswaran, Shyam; Morris, Michele; Samayamuthu, Malarkodi J; Omenn, Gilbert S; Ngiam, Kee Yuan; Mandl, Kenneth D; Boeker, Martin; Olson, Karen L; Mowery, Danielle L; Follett, Robert W; Hanauer, David A; [...]
  • Published: Oxford University Press (OUP), 2021
  • Published in: Journal of the American Medical Informatics Association, 28 (2021) 7, pp. 1411-1420
  • Language: English
  • DOI: 10.1093/jamia/ocab018
  • ISSN: 1527-974X
  • Description: Abstract. Objective: The Consortium for Clinical Characterization of COVID-19 by EHR (4CE) is an international collaboration addressing coronavirus disease 2019 (COVID-19) with federated analyses of electronic health record (EHR) data. We sought to develop and validate a computable phenotype for COVID-19 severity. Materials and Methods: Twelve 4CE sites participated. First, we developed an EHR-based severity phenotype consisting of 6 code classes, and we validated it on patient hospitalization data from the 12 4CE clinical sites against the outcomes of intensive care unit (ICU) admission and/or death. We also piloted an alternative machine learning approach and compared selected predictors of severity with the 4CE phenotype at 1 site. Results: The full 4CE severity phenotype had a pooled sensitivity of 0.73 and a specificity of 0.83 for the combined outcome of ICU admission and/or death. The sensitivity of individual code categories for acuity varied widely, by up to 0.65 across sites. At one pilot site, the expert-derived phenotype had a mean area under the curve of 0.903 (95% confidence interval, 0.886-0.921), compared with an area under the curve of 0.956 (95% confidence interval, 0.952-0.959) for the machine learning approach. Billing codes were poor proxies of ICU admission, with precision and recall as low as 49% compared with chart review. Discussion: We developed a severity phenotype using 6 code classes that proved resilient to coding variability across international institutions. In contrast, machine learning approaches may overfit hospital-specific orders. Manual chart review revealed discrepancies even in the gold-standard outcomes, possibly owing to heterogeneous pandemic conditions. Conclusions: We developed an EHR-based severity phenotype for COVID-19 in hospitalized patients and validated it at 12 international sites.
  • Access State: Open Access
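
The abstract reports the phenotype's sensitivity and specificity against the combined outcome of ICU admission and/or death. As a rough illustration of how such figures are computed for a single site, the sketch below compares a per-patient severity flag with the observed outcome. The function name, variable names, and toy data are all hypothetical and are not taken from the article; the pooled estimates in the abstract were derived across 12 sites by a method not described in this record.

```python
from typing import Iterable, Tuple


def sensitivity_specificity(
    predicted: Iterable[bool], actual: Iterable[bool]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) of a binary flag against an outcome."""
    tp = fn = tn = fp = 0
    for flag, outcome in zip(predicted, actual):
        if outcome and flag:
            tp += 1  # severe outcome, flagged by the phenotype
        elif outcome and not flag:
            fn += 1  # severe outcome, missed by the phenotype
        elif flag:
            fp += 1  # no severe outcome, but flagged
        else:
            tn += 1  # no severe outcome, not flagged
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative outcome")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data (assumption, not from the article):
# phenotype_flag = patient matched at least one severity code class
# outcome        = patient was admitted to the ICU and/or died
phenotype_flag = [True, True, False, True, False, False, False, True]
outcome = [True, True, True, False, False, False, False, False]

sens, spec = sensitivity_specificity(phenotype_flag, outcome)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

With real data, the two lists would come from each site's hospitalization records, one entry per patient.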