Explainable Deep Learning for Disease Activity Prediction in Chronic Inflammatory Joint Diseases
Cécile Trottet, Ahmed Allam, Aron N. Horvath, Axel Finckh, Thomas Hügle, Sabine Adler, Diego Kyburz, Raphael Micheroli, Michael Krauthammer, Caroline Ospelt
Abstract
Analysing complex diseases such as chronic inflammatory joint diseases (CIJDs), where many factors influence the disease evolution over time, is a challenging task. CIJDs are rheumatic diseases that cause the immune system to attack healthy organs, mainly the joints. Different environmental, genetic and demographic factors affect disease development and progression. The Swiss Clinical Quality Management in Rheumatic Diseases (SCQM) Foundation maintains a national database of CIJDs documenting the disease management over time for 19’267 patients.
We propose the Disease Activity Score Network (DAS-Net), an explainable multi-task learning model trained on data from patients with different arthritis subtypes, transforming longitudinal patient journeys into comparable representations and predicting multiple disease activity scores. First, we built a modular model composed of feed-forward neural networks, long short-term memory networks and attention layers to process the heterogeneous patient histories and predict future disease activity.
Introduction
Chronic inflammatory joint diseases (CIJDs) cause the immune system to attack healthy organs, particularly the joints [1]. In addition to causing pain, the inflammation can lead to synovitis, bone erosions, muscle and ligament damage. To this day, there exists no cure and the treatments primarily help attenuate the patients’ symptoms and improve their quality of life. Finding ways to minimise the disease activity is crucial to alleviate the disease burden on patients’ everyday life.
Digitalising patient healthcare data has led to a massive increase in available electronic health records (EHRs), opening up the opportunity to mine these records and employ machine learning (ML) approaches to discover novel evidence about real-world treatment efficacy and patient outcomes.
Materials & Methods
The SCQM Foundation has maintained a national database of inflammatory rheumatic diseases since 1997. The database documents the disease management over time for 19’267 patients through clinical measurements during the visits, demographics, prescribed medications and patient-reported outcome measures (database snapshot from 01.04.2022). Patients are diagnosed with either rheumatoid arthritis (RA), axial spondyloarthritis (axSpA), psoriatic arthritis (PsA) or undifferentiated arthritis (UA). Appendix S1 Fig shows the distribution of the number of medical visits per patient in the database.
Results & Discussion
We compared the performance of our model to two non-temporal machine learning models, a vanilla neural network (MLP) and a tree-based gradient boosting model (XGBoost), and to one temporal LSTM model. We also included a static naive baseline, which uses the last available DAS28 (resp. ASDAS) score for the given patient as its prediction. This strategy amounts to using the last disease state of a patient as a predictor of their future disease state. The MLP and XGBoost baselines take as input the same features as our model but only their last available values. Restricting the number of values per feature is necessary since these models cannot handle varying input sizes. We trained one MLP and one XGBoost model per prediction task.
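The static naive baseline and the "last available value" input used by the non-temporal baselines can be illustrated with a minimal sketch. This is not the code from the study: the record layout (visits as dictionaries with an ISO-formatted `date` key), the function names and the feature names are assumptions made only for this example.

```python
from typing import Dict, List, Optional

# Hypothetical visit record: {"date": "YYYY-MM-DD", "<feature>": value or None, ...}
# ISO-formatted date strings sort chronologically, so no date parsing is needed.
Visit = Dict[str, object]


def naive_last_value_prediction(visits: List[Visit], score: str) -> Optional[float]:
    """Static naive baseline: predict the most recent non-missing score."""
    for visit in sorted(visits, key=lambda v: v["date"], reverse=True):
        value = visit.get(score)
        if value is not None:
            return value
    return None  # the patient has no recorded score at all


def last_available_features(visits: List[Visit], names: List[str]) -> List[Optional[float]]:
    """Collapse a variable-length history into a fixed-size vector by
    keeping, for each feature, its last observed (non-missing) value."""
    latest: Dict[str, Optional[float]] = {name: None for name in names}
    for visit in sorted(visits, key=lambda v: v["date"]):
        for name in names:
            value = visit.get(name)
            if value is not None:
                latest[name] = value  # later visits overwrite earlier ones
    return [latest[name] for name in names]


# Illustrative patient history (values are made up for the example).
history = [
    {"date": "2020-01-10", "das28": 3.2, "crp": 5.0},
    {"date": "2021-03-05", "das28": 2.6},
    {"date": "2021-09-01", "das28": None},
]
prediction = naive_last_value_prediction(history, "das28")
features = last_available_features(history, ["das28", "crp"])
```

Under these assumptions, every patient maps to a vector of the same length regardless of how many visits they have, which is what allows a fixed-input model such as an MLP or a gradient boosting model to be trained on it. Everything in the history other than the latest value of each feature is discarded.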
Conclusion
In this work, we propose DAS-Net, a multi-task neural network-based model for transforming heterogeneous rheumatic disease registry data into comparable patient representations and predicting future disease activity. When predicting future DAS, DAS-Net outperformed all non-temporal baseline models that discarded or simplified most of the patient history. Furthermore, it also outperformed a temporal LSTM model, suggesting that DAS-Net is better suited to handle heterogeneous temporal patient records.
Our model design included attention layers that aided in explaining the importance of the different visits and parts of the patient’s history in outcome prediction. The attention weights showed that our model uses recent information but still attributes significant weight to older events, and that it attributes the majority of the weight to the clinical measures.
Citation: Trottet C, Allam A, Horvath AN, Finckh A, Hügle T, Adler S, et al. (2024) Explainable deep learning for disease activity prediction in chronic inflammatory joint diseases. PLOS Digit Health 3(6): e0000422. https://doi.org/10.1371/journal.pdig.0000422
Editor: Wisit Cheungpasitporn, Mayo Clinic Rochester: Mayo Clinic Minnesota, UNITED STATES
Received: December 5, 2023; Accepted: May 27, 2024; Published: June 27, 2024
Copyright: © 2024 Trottet et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: Data are owned by a third party, the Swiss Clinical Quality Management in Rheumatic Diseases (SCQM) foundation and may be obtained after approval and permission from SCQM. The code developed for the analysis is available on the following GitHub repository https://github.com/uzh-dqbm-cmi/scqm.
Funding: MK was awarded a grant from the Swiss National Science Foundation (project 201184) for this work (https://www.snf.ch/en). The funders had no role in study, analysis, decision to publish, or preparation of the manuscript. The SCQM Foundation is supported by pharmaceutical industries and donors. A list of financial supporters can be found on www.scqm.ch/en/partners/. SCQM supporting partners had no role in the study design or in the analysis and interpretation of the data, the writing of the manuscript or the decision to submit the manuscript for publication.
Competing interests: The authors have declared that no competing interests exist.