Data-driven survival modeling for breast cancer prognostics: A comparative study with machine learning and traditional survival modeling methods

Theophilus Gyedu Baidoo, Hansapani Rodrigo

Abstract

This study examines data-driven survival modeling approaches for prognostic assessment of breast cancer survival. The primary objective is to evaluate and compare how well machine learning (ML) models and conventional survival analysis techniques identify consistent key predictors of breast cancer survival outcomes.

Introduction

Breast cancer is one of the leading causes of death among women worldwide and remains a critical area of research due to its high incidence and mortality rates. In 2020, it became the most commonly diagnosed cancer globally, accounting for approximately 2.3 million new cases, or 11.7% of all cancer diagnoses [1].

Materials and Methods

Records of patients diagnosed with infiltrating ductal and lobular carcinoma of the breast between 2006 and 2010 were accessed in April 2023 for research purposes. To enhance the reliability of the analysis, patients with missing data on tumor size, regional lymph nodes, or regional positive lymph nodes were excluded.
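The exclusion of records with missing values can be illustrated with a minimal sketch. It assumes the data sit in a pandas DataFrame; the column names (`tumor_size`, `regional_nodes`, `regional_nodes_positive`), the helper `drop_incomplete_patients`, and the toy records are hypothetical and are not taken from the study.

```python
import pandas as pd

# Hypothetical column names; the actual dataset schema is not given in the text.
REQUIRED_COLUMNS = ["tumor_size", "regional_nodes", "regional_nodes_positive"]


def drop_incomplete_patients(df: pd.DataFrame) -> pd.DataFrame:
    """Keep only rows with no missing value in any required column."""
    return df.dropna(subset=REQUIRED_COLUMNS).reset_index(drop=True)


# Toy records for illustration only (not study data).
patients = pd.DataFrame(
    {
        "patient_id": [1, 2, 3, 4],
        "tumor_size": [22.0, None, 35.0, 18.0],
        "regional_nodes": [12, 9, None, 20],
        "regional_nodes_positive": [1, 3, 2, None],
    }
)

complete = drop_incomplete_patients(patients)
print(f"kept {len(complete)} of {len(patients)} records")
```

Restricting `dropna` to a named subset of columns leaves records with gaps in other, non-required fields untouched, which matches a complete-case rule defined on specific variables.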

Results

Table 1 presents baseline characteristics for 4,024 breast cancer patients, stratified by survival status (censored vs. deceased). Age distribution was similar across groups, with the largest proportion in the 30-50 year range (37%).
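A baseline table stratified by survival status can be assembled along the following lines. This is only a sketch under stated assumptions: the column names (`age`, `status`), the age bands, and the toy records are hypothetical and do not reproduce Table 1 or the study data.

```python
import pandas as pd

# Toy records for illustration only (not study data); column names are hypothetical.
patients = pd.DataFrame(
    {
        "age": [34, 47, 52, 61, 45, 58, 66, 39],
        "status": [
            "censored", "censored", "deceased", "censored",
            "deceased", "censored", "deceased", "censored",
        ],
    }
)

# Hypothetical age bands: [30, 50), [50, 60), [60, 70).
patients["age_group"] = pd.cut(
    patients["age"],
    bins=[30, 50, 60, 70],
    right=False,
    labels=["30-49", "50-59", "60-69"],
).astype(str)

# Counts per age band within each survival-status group.
counts = pd.crosstab(patients["age_group"], patients["status"])

# Column percentages: share of each age band within a status group.
percent = pd.crosstab(
    patients["age_group"], patients["status"], normalize="columns"
) * 100

print(counts)
print(percent.round(1))
```

Normalizing by column gives the within-group distribution, which is the quantity to compare when asking whether age is distributed similarly among censored and deceased patients.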

Acknowledgments

We thank Professor Tamer Oraby, Professor George Yanev, and Professor Zhuanzhuan Ma for their guidance and feedback, which greatly shaped this work. We also thank the reviewers for their valuable comments and suggestions.

Citation: Baidoo TG, Rodrigo H (2025) Data-driven survival modeling for breast cancer prognostics: A comparative study with machine learning and traditional survival modeling methods. PLoS ONE 20(4): e0318167. https://doi.org/10.1371/journal.pone.0318167

Editor: Guanghui Liu, State University of New York at Oswego, UNITED STATES OF AMERICA

Received: October 2, 2024; Accepted: January 12, 2025; Published: April 22, 2025

Copyright: © 2025 Baidoo and Rodrigo. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: Data relevant to this study are available from GitHub at https://github.com/Theophilus-Baidoo/Data–Driven-Survival-Modeling-for-Breast-Cancer.

Funding: The author(s) received no specific funding for this work.

Competing interests: The authors have declared that no competing interests exist.