Gregory Kiar, Yohan Chatelain, Pablo de Oliveira Castro, Eric Petit, Ariel Rokem, Gaël Varoquaux, Bratislav Misic, Alan C. Evans, Tristan Glatard
The analysis of brain-imaging data requires complex processing pipelines to support findings on brain function or pathologies. Recent work has shown that variability in analytical decisions, small amounts of noise, or computational environments can lead to substantial differences in the results, endangering the trust in conclusions. We explored the instability of results by instrumenting a structural connectome estimation pipeline with Monte Carlo Arithmetic to introduce random noise throughout. We evaluated the reliability of the connectomes, the robustness of their features, and the eventual impact on analysis.
The modelling of brain networks, called connectomics, has shaped our understanding of the structure and function of the brain across a variety of organisms and scales over the last decade [1–6]. In humans, these wiring diagrams are obtained in vivo through Magnetic Resonance Imaging (MRI), and show promise towards identifying biomarkers of disease. This can not only improve understanding of so-called “connectopathies”, such as Alzheimer’s Disease and Schizophrenia, but potentially pave the way for therapeutics [7–11].
Prior to exploring the analytic impact of instabilities, a direct understanding of the induced variability was required. A subset of the Nathan Kline Institute Rockland Sample (NKIRS) dataset was randomly selected to contain 25 individuals with two sessions of imaging data, each of which was subsampled into two components, resulting in four samples per individual and 100 samples total (25 × 2 × 2 samples). Structural connectomes were generated with canonical deterministic and probabilistic pipelines [28, 29] which were instrumented with MCA, replicating computational noise either sparsely or densely throughout the pipelines [19, 26]. In the sparse case, a small subset of the libraries was instrumented with MCA, allowing for the evaluation of the cascading effects of numerical instabilities that may arise.
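To make the idea of MCA-style perturbation concrete, the following is a minimal Python sketch, not the instrumentation used in this work (which was done with Verificarlo on compiled libraries). It assumes a simplified "random rounding" model in which the result of each floating-point operation receives uniform noise at a chosen bit of its significand. The function names `perturb` and `noisy_sum`, the virtual precision `t=52`, and the toy summation are all illustrative assumptions.

```python
import math
import random

def perturb(x, t=52, rng=random):
    """Hypothetical helper: add uniform noise at bit t of x's significand.

    Simplified model of the MCA 'inexact' function,
    x + 2**(e - t) * xi, with xi drawn uniformly from (-0.5, 0.5).
    """
    if x == 0.0 or not math.isfinite(x):
        return x
    _, e = math.frexp(x)          # x = m * 2**e with 0.5 <= |m| < 1
    xi = rng.uniform(-0.5, 0.5)
    return x + math.ldexp(xi, e - t)

def noisy_sum(values, t=52, rng=random):
    """Sum values, perturbing the result of every addition."""
    total = 0.0
    for v in values:
        total = perturb(total + v, t, rng)
    return total

# Repeat the same computation several times; only the injected noise differs.
rng = random.Random(0)            # fixed seed so the sketch is repeatable
values = [0.1] * 1000
runs = [noisy_sum(values, rng=rng) for _ in range(20)]
spread = max(runs) - min(runs)
print(f"mean = {sum(runs) / len(runs):.15f}, spread = {spread:.3e}")
```

In this toy case the spread across runs stays near machine precision because a summation is well conditioned. The concern examined in the paper is that in long pipelines with many dependent operations, perturbations of this size can cascade into much larger differences in the output.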
The perturbation of structural connectome estimation pipelines with small amounts of noise, on the order of machine error, led to considerable variability in derived brain graphs. Across all analyses the stability of results ranged from nearly perfectly trustworthy (i.e. no variation) to completely unreliable (i.e. containing no trustworthy information). Given that the magnitude of introduced numerical noise is to be expected in computational workflows, this finding has potentially significant implications for inferences in brain imaging as it is currently performed. In particular, this bounds the success of studying individual differences, a central objective in brain imaging, given that the quality of relationships between phenotypic data and brain networks will be limited by the stability of the connectomes themselves.
Citation: Kiar G, Chatelain Y, de Oliveira Castro P, Petit E, Rokem A, Varoquaux G, et al. (2021) Numerical uncertainty in analytical pipelines lead to impactful variability in brain networks. PLoS ONE 16(11): e0250755. https://doi.org/10.1371/journal.pone.0250755
Editor: Stavros I. Dimitriadis, Cardiff University, UNITED KINGDOM
Received: April 19, 2021; Accepted: August 25, 2021; Published: November 1, 2021
Copyright: © 2021 Kiar et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: The unprocessed dataset is available through The Consortium of Reliability and Reproducibility (http://fcon_1000.projects.nitrc.org/indi/enhanced/), including both the imaging data and phenotypic data which may be obtained upon submission and compliance with a Data Usage Agreement. The connectomes generated through simulations have been bundled and stored permanently (https://doi.org/10.5281/zenodo.4041549), and are made available through The Canadian Open Neuroscience Platform (https://portal.conp.ca/search, search term “Kiar”). All software developed for processing or evaluation is publicly available on GitHub at https://github.com/gkpapers/2020ImpactOfInstability. Experiments were launched using Boutiques and Clowdr in Compute Canada’s HPC cluster environment. MCA instrumentation was achieved through Verificarlo, available on GitHub at https://github.com/verificarlo/verificarlo. A set of MCA-instrumented software containers is available on GitHub at https://github.com/gkiar/fuzzy.
Funding: GK was fully supported for this work by the Natural Sciences and Engineering Research Council of Canada <https://www.nserc-crsng.gc.ca/index_eng.asp> Award number: CGSD3 - 519497 - 2018. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.