Meeting abstract

https://doi.org/10.15836/ccar2024.527

Is machine learning the optimal tool for assessing outcomes in healthcare data? Insights from a pulmonary embolism cohort

Marin Pavlov (ORCID: orcid.org/0000-0003-3962-2774); Dubrava University Hospital, Zagreb, Croatia
Andrej Novak (ORCID: orcid.org/0000-0002-7828-4870); Dubrava University Hospital, Zagreb, Croatia
Šime Manola (ORCID: orcid.org/0000-0001-6444-2674); Dubrava University Hospital, Zagreb, Croatia
Ivana Jurin (ORCID: orcid.org/0000-0002-2637-9691); Dubrava University Hospital, Zagreb, Croatia


page 527-527


Abstract

Keywords

machine learning; pulmonary embolism; outcomes

Hrčak ID:

328363

URI

https://hrcak.srce.hr/328363

Publication date:

13.12.2024.

Goal: To rank outcome predictors in a population of pulmonary embolism (PE) patients with follow-up longer than one year, using contemporary machine learning models.

Patients and Methods: Machine learning models (LightGBM variant of XGBoost) were used to analyse the outcome data of a PE cohort. Patients were recruited from November 2013 until November 2018 in two academic hospitals in a metropolitan area and followed up by telephone interview or hospital visit. The primary outcome was all-cause mortality. In all patients, the PE diagnosis was established by computed tomography. Two models were generated in both XGBoost and frequentist analysis: 1) a model with 19 variables and 2) a model with 8 variables. Both models were recreated from previously published results (1,2).

Results: The study population comprised 761 patients (predominantly female (57.4%), aged 73 (61-81)) and has been described previously (1,2). Median follow-up was 675 days (114-1331). Death within follow-up occurred in 335 cases (44.0%). In the XGBoost algorithm, the Pulmonary Embolism Severity Index (PESI) score and body mass index (BMI) were the two strongest predictors of the primary outcome. Overall, the models were accurate, with areas under the curve of 0.840 and 0.864. For BMI, this is contrary to the results of frequentist statistical inference, in which BMI failed to enter the Cox proportional hazards model.

Conclusion: In the XGBoost analysis, a machine learning framework better suited to handling non-linear data, outcome analysis yielded different results compared with frequentist statistical inference. Since such non-normally distributed data prevail in healthcare databases, machine learning models may provide deeper insight into the impact of variables on outcome.
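The area under the curve quoted in the Results can be read as the probability that a randomly chosen patient who died received a higher predicted risk than a randomly chosen survivor. The sketch below computes it in plain Python from that pairwise definition. It is illustrative only: the function name `roc_auc` and the toy labels and scores are assumptions for this example, not the authors' code or data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve from its pairwise (Mann-Whitney) definition.

    labels: 0/1 outcomes (1 = event, e.g. death during follow-up)
    scores: predicted risk, where a higher value means a more likely event
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")

    # Count event/non-event pairs in which the event case is ranked higher;
    # tied scores contribute half a pair.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the study cohort.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.35, 0.4, 0.3, 0.2, 0.1, 0.7]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. The quadratic pairwise loop is chosen for readability; for a cohort-sized dataset a rank-based implementation from an established library would be the more practical choice.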

LITERATURE

1. Jurin I, Pavlov M, Manola S, Letilovic T, Hadzibegovic I. Long-term outcome in pulmonary embolism: Is it healthy to be lean? Eur J Intern Med. 2023 Jul;113:126–8. https://doi.org/10.1016/j.ejim.2023.04.017 PubMed: http://www.ncbi.nlm.nih.gov/pubmed/37095015

2. Jurin I, Pavlov M, Manola S, Radonic V, Hadzibegovic I. The lean paradox in pulmonary embolism: Beyond the estimated plasma volume? Eur J Intern Med. 2023 Aug;114:127–8. https://doi.org/10.1016/j.ejim.2023.05.029 PubMed: http://www.ncbi.nlm.nih.gov/pubmed/37258382

