Severity and mortality prediction models to triage Indian COVID-19 patients

Samarth Bhatia, Yukti Makhija, Sneha Jayaswal, Shalendra Singh, Prabhat Singh Malik, Sri Krishna Venigalla, Pallavi Gupta, Shreyas N. Samaga, Rabi Narayan Hota, Ishaan Gupta

As the second wave in India subsides, COVID-19 has infected about 29 million patients countrywide and caused more than 350 thousand deaths. As infections surged, the strain on the country's medical infrastructure became apparent. While the country vaccinates its population, opening up the economy may lead to an increase in infection rates. In this scenario, it is essential to use the limited hospital resources effectively through an informed patient triaging system based on clinical parameters. Here, we present two interpretable machine learning models that predict the clinical outcomes of patients, namely severity and mortality, from routine non-invasive surveillance of blood parameters measured on the day of admission in one of the largest cohorts of Indian patients.

Translational science is a rapidly growing field with immediate potential in direct clinical applications. This includes the development of computational models that analyze Electronic Health Records (EHRs) and help interpret the complex biological associations between clinical measurements and patient outcomes. Machine learning is a powerful tool that translational scientists use to recognize patterns and identify features in medical data that correlate with clinical outcomes. The predictions obtained from such machine learning models may assist in clinical decision-making. This can enable us to automate certain stages of diagnosis, especially during a scarcity of medical resources such as trained medical professionals or intensive care units (ICUs).

Li Yan et al. proposed one of the first mortality prediction models for COVID-19 [6]. This model was trained and tested on 375 infected patients in the region of Wuhan, China. Of the 375 patients, 201 had recovered, and 174 had died. This paper also proposed a clinically operable decision tree that predicted the outcome based on lactic dehydrogenase (LDH), lymphocytes, and high-sensitivity C-reactive protein (hs-CRP) values. They achieved 100% accuracy in predicting COVID-19 severity and 81% accuracy in predicting patient mortality in their dataset using this decision tree.
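To make the idea of a clinically operable decision tree concrete, the following is a minimal sketch of a three-marker rule of this kind. It is not the tree from the cited paper: the function name `triage`, the `Thresholds` container, the order of the splits, the units, and the numeric cut-offs in the usage example are all assumptions introduced for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Hypothetical cut-offs; real values must come from a fitted, validated model."""
    ldh: float             # lactic dehydrogenase (units assumed: U/L)
    hs_crp: float          # high-sensitivity C-reactive protein (units assumed: mg/L)
    lymphocyte_pct: float  # lymphocytes (units assumed: % of white cells)


def triage(ldh: float, hs_crp: float, lymphocyte_pct: float, th: Thresholds) -> str:
    """Walk a three-level decision tree and return a coarse risk label.

    The split order (LDH, then hs-CRP, then lymphocytes) is an assumption.
    """
    if ldh >= th.ldh:
        return "high risk"   # elevated LDH alone flags the patient
    if hs_crp < th.hs_crp:
        return "low risk"    # normal LDH and low inflammation marker
    if lymphocyte_pct > th.lymphocyte_pct:
        return "low risk"    # inflammation present but lymphocytes preserved
    return "high risk"       # inflammation present and lymphocytes depleted


if __name__ == "__main__":
    # Placeholder thresholds, NOT values taken from the cited paper.
    placeholder = Thresholds(ldh=300.0, hs_crp=40.0, lymphocyte_pct=15.0)
    print(triage(ldh=250.0, hs_crp=55.0, lymphocyte_pct=10.0, th=placeholder))
```

A rule of this shape can be read directly at the bedside, which is what makes a single shallow tree attractive as a decision aid even when a larger model sits behind it.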

Discussion and Conclusion
We built two machine learning models to triage COVID-19 patients in India into prediction categories: alive versus deceased, and severe versus non-severe. Both are XGBoost (12) models, which apply gradient boosting to decision trees (using XGBoost's gbtree booster). These models can be summarized through sample decision trees (Fig 5) to inform a clinical decision support system. Note that XGBoost is an ensemble method: it builds many weak classifiers that together form a strong one, so any single tree gives only a partial view of the full model.
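The way an ensemble of weak classifiers becomes a strong one can be illustrated with a small, dependency-free sketch. This is not XGBoost and not the models from this study: it is plain gradient boosting of one-split decision stumps under logistic loss, run on synthetic two-feature data. All names (`make_toy_data`, `fit_boosted_stumps`, `predict_proba`, `accuracy`) and all hyperparameter values are assumptions chosen for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def make_toy_data(n, seed=0):
    """Synthetic data: label is 1 when the two features sum to more than 1."""
    rng = random.Random(seed)
    X = [[rng.random(), rng.random()] for _ in range(n)]
    y = [1 if a + b > 1.0 else 0 for a, b in X]
    return X, y


def fit_stump(X, residuals):
    """Find the one-feature threshold split that best fits the residuals (least squares)."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or err < best[0]:
                best = (err, j, t, lm, rm)
    if best is None:
        raise ValueError("no feature has two distinct values to split on")
    return best[1:]  # (feature index, threshold, left value, right value)


def stump_value(stump, row):
    j, t, left_value, right_value = stump
    return left_value if row[j] <= t else right_value


def fit_boosted_stumps(X, y, n_rounds=40, learning_rate=0.5):
    """Each round fits a stump to the negative gradient of the logistic loss."""
    scores = [0.0] * len(X)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        scores = [s + learning_rate * stump_value(stump, row)
                  for s, row in zip(scores, X)]
    return learning_rate, stumps


def predict_proba(model, row):
    learning_rate, stumps = model
    return sigmoid(sum(learning_rate * stump_value(s, row) for s in stumps))


def accuracy(model, X, y):
    hits = sum(int(predict_proba(model, row) >= 0.5) == yi for row, yi in zip(X, y))
    return hits / len(y)


if __name__ == "__main__":
    X, y = make_toy_data(150)
    for rounds in (1, 40):
        model = fit_boosted_stumps(X, y, n_rounds=rounds)
        print(rounds, "stump(s): training accuracy", round(accuracy(model, X, y), 3))
```

Because the class boundary in the toy data is diagonal, no single axis-aligned stump can separate the classes well, whereas the sum of many stumps can approximate it. XGBoost's gbtree booster follows the same additive principle with deeper trees, second-order gradient information, and regularization.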

Citation: Bhatia S, Makhija Y, Jayaswal S, Singh S, Malik PS, Venigalla SK, et al. (2022) Severity and mortality prediction models to triage Indian COVID-19 patients. PLOS Digit Health 1(3): e0000020.

Editor: Martin G. Frasch, University of Washington, UNITED STATES

Received: June 24, 2021; Accepted: January 31, 2022; Published: March 9, 2022

Copyright: © 2022 Bhatia et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: We have uploaded the de-identified data to a GitHub repository:

Funding: This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Competing interests: The authors have declared that no competing interests exist.
