1
Puniya BL. Artificial-intelligence-driven innovations in mechanistic computational modeling and digital twins for biomedical applications. J Mol Biol 2025:169181. [PMID: 40316010 DOI: 10.1016/j.jmb.2025.169181] [Received: 01/06/2025] [Revised: 04/09/2025] [Accepted: 04/27/2025] [Indexed: 05/04/2025]
Abstract
Understanding complex biological systems remains a significant challenge due to their high dimensionality, nonlinearity, and context-specific behavior. Artificial intelligence (AI) and mechanistic modeling are becoming essential tools for studying such complex systems. Mechanistic modeling can facilitate the construction of simulatable models that are interpretable but often struggle with scalability and parameter estimation. AI can integrate multi-omics data to create predictive models, but it lacks interpretability. The gap between these two modeling methods limits our ability to develop comprehensive and predictive models for biomedical applications. This article reviews the most recent advancements in the integration of AI and mechanistic modeling to fill this gap. Recently, with omics availability, AI has led to new discoveries in mechanistic computational modeling. The mechanistic models can also help in gaining insight into the mechanisms behind predictions made by AI models. This integration is helpful in modeling complex systems, estimating the parameters that are hard to capture in experiments, and creating surrogate models to reduce computational costs because of expensive mechanistic model simulations. This article focuses on advancements in mechanistic computational models and AI models and their integration for scientific discoveries in biology, pharmacology, drug discovery and diseases. The mechanistic models with AI integration can facilitate biological discoveries to advance our understanding of disease mechanisms, drug development, and personalized medicine. The article also highlights the role of AI and mechanistic model integration in the development of more advanced models in the biomedical domain, such as medical digital twins and virtual patients for pharmacological discoveries.
Affiliation(s)
- Bhanwar Lal Puniya
- Department of Biochemistry, University of Nebraska-Lincoln, Lincoln, NE 68588, United States
2
Arnay Del Arco R, Castilla Rodríguez I, Cabrera Hernández MA. Improving clinical decision making by creating surrogate models from health technology assessment models: A case study on Type 1 Diabetes Melitus. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2025; 262:108646. [PMID: 39954653 DOI: 10.1016/j.cmpb.2025.108646] [Received: 09/30/2024] [Revised: 01/23/2025] [Accepted: 02/02/2025] [Indexed: 02/17/2025]
Abstract
BACKGROUND AND OBJECTIVE Computerized clinical decision support systems (CDSS) that incorporate the latest scientific evidence are essential for enhancing patient care quality. Such systems typically rely on some kind of model to accurately represent the knowledge required to assess the clinicians. Although the use of complex and computationally demanding simulation models is common in this field, such models limit the potential applications of CDSSs, both in real-time applications and in simulation-in-the-loop optimization tools. This paper presents a case study on Type 1 Diabetes Mellitus (T1DM) to demonstrate the development of surrogate models from health technology assessment models, with the aim of enhancing the potential of CDSSs. METHODS The paper details the process of developing machine learning (ML) based surrogate models, including the generation of a dataset for training and testing, and the comparison of different ML techniques. A number of distinct groupings of comorbidities were utilized in the creation of models, which were trained to predict confidence intervals for the time to develop each complication. RESULTS The results of the intersection over union (IoU) analysis between the simulation model output and the surrogate models output for the comorbidities under study were greater than 0.9. CONCLUSION The study concludes that ML-based surrogate models are a viable solution for real-time clinical decision-making, offering a substantial speedup in execution time compared to traditional simulation models.
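The IoU comparison reported in this abstract can be illustrated with a minimal sketch. It assumes each model output is a closed interval (lower, upper) for the time to a complication; the function name `interval_iou` and the example numbers are hypothetical and are not taken from the paper.

```python
def interval_iou(a, b):
    """Intersection over union of two closed intervals given as (low, high)."""
    a_low, a_high = a
    b_low, b_high = b
    if a_low > a_high or b_low > b_high:
        raise ValueError("each interval must satisfy low <= high")
    # Length of the overlap, clipped at zero when the intervals are disjoint
    intersection = max(0.0, min(a_high, b_high) - max(a_low, b_low))
    union = (a_high - a_low) + (b_high - b_low) - intersection
    # Two zero-width intervals: agreement only if they are the same point
    if union == 0.0:
        return 1.0 if a_low == b_low else 0.0
    return intersection / union


# Hypothetical intervals (in years) from a simulation model and a surrogate
simulated = (10.0, 20.0)
surrogate = (11.0, 20.0)
print(interval_iou(simulated, surrogate))
```

An IoU of 1 means the two intervals coincide and 0 means they do not overlap, so values above 0.9 indicate that the surrogate interval covers nearly the same range as the simulated one.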
Affiliation(s)
- Rafael Arnay Del Arco
- Departamento de Ingeniería Informática y de Sistemas, Universidad de La Laguna, Camino San Francisco de Paula, nº 19, San Cristobal de La Laguna, 38200, Spain
- Iván Castilla Rodríguez
- Departamento de Ingeniería Informática y de Sistemas, Universidad de La Laguna, Camino San Francisco de Paula, nº 19, San Cristobal de La Laguna, 38200, Spain
- Marco A Cabrera Hernández
- Departamento de Ingeniería Informática y de Sistemas, Universidad de La Laguna, Camino San Francisco de Paula, nº 19, San Cristobal de La Laguna, 38200, Spain
3
Lehtonen E, Teuho J, Vatandoust M, Knuuti J, Knol RJJ, van der Zant FM, Juárez‐Orozco LE, Klén R. Expanding interpretability through complexity reduction in machine learning-based modelling of cardiovascular disease: A myocardial perfusion imaging PET/CT prognostic study. Eur J Clin Invest 2025; 55 Suppl 1:e14391. [PMID: 40191939 PMCID: PMC11973839 DOI: 10.1111/eci.14391] [Received: 11/07/2024] [Accepted: 01/09/2025] [Indexed: 04/09/2025]
Abstract
BACKGROUND Machine learning-based analysis can be used in myocardial perfusion imaging data to improve risk stratification and the prediction of major adverse cardiovascular events for patients with suspected or established coronary artery disease. We present a new machine learning approach for the identification of patients who develop major adverse cardiovascular events. The new method is robust against the deleterious effect of outliers in the training set stratification and training process. METHODS The proposed sum-of-sigmoids model is obtained by averaging the contributions of various input variables in an ensemble of XGBoost models. To illustrate its performance, we have applied it to predict major adverse cardiovascular events from advanced imaging data extracted from rest and adenosine stress 13N-ammonia positron emission tomography myocardial perfusion imaging polar maps. There were 1185 individual studies performed, and the event occurrence was tracked over a follow-up period of 2 years. RESULTS The sum-of-sigmoids model achieved a prediction accuracy of .83 on the test set, matching the performance of significantly more complex and less interpretable models (whose accuracies were .83-.84). CONCLUSION The sum-of-sigmoids model is interpretable and simple, while achieving similar prediction accuracy to significantly more complex machine learning models in the considered prediction task. It should be suitable for applications such as automated clinical risk stratification, where clear and explicit justification of the classification procedure is highly pertinent.
Affiliation(s)
- Eero Lehtonen
- Turku PET Centre, Turku University Hospital and University of Turku, Turku, Finland
- Jarmo Teuho
- Turku PET Centre, Turku University Hospital and University of Turku, Turku, Finland
- Monire Vatandoust
- Turku PET Centre, Turku University Hospital and University of Turku, Turku, Finland
- Juhani Knuuti
- Turku PET Centre, Turku University Hospital and University of Turku, Turku, Finland
- Remco J. J. Knol
- Cardiac Imaging Division Alkmaar, Department of Nuclear Medicine, Northwest Clinics, Alkmaar, The Netherlands
- Friso M. van der Zant
- Cardiac Imaging Division Alkmaar, Department of Nuclear Medicine, Northwest Clinics, Alkmaar, The Netherlands
- Luis Eduardo Juárez‐Orozco
- Turku PET Centre, Turku University Hospital and University of Turku, Turku, Finland
- Department of Cardiology, Division Heart & Lungs, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Riku Klén
- Turku PET Centre, Turku University Hospital and University of Turku, Turku, Finland
4
Robertson C, Safta C, Collier N, Ozik J, Ray J. Bayesian Calibration of Stochastic Agent Based Model via Random Forest. Stat Med 2025; 44:e70029. [PMID: 40083065 DOI: 10.1002/sim.70029] [Received: 06/21/2024] [Revised: 01/30/2025] [Accepted: 02/03/2025] [Indexed: 03/16/2025]
Abstract
Agent-based models (ABM) provide an excellent framework for modeling outbreaks and interventions in epidemiology by explicitly accounting for diverse individual interactions and environments. However, these models are usually stochastic and highly parametrized, requiring precise calibration for predictive performance. When considering realistic numbers of agents and properly accounting for stochasticity, this high-dimensional calibration can be computationally prohibitive. This paper presents a random forest-based surrogate modeling technique to accelerate the evaluation of ABMs and demonstrates its use to calibrate an epidemiological ABM named CityCOVID via Markov chain Monte Carlo (MCMC). The technique is first outlined in the context of CityCOVID's quantities of interest, namely hospitalizations and deaths, by exploring dimensionality reduction via temporal decomposition with principal component analysis (PCA) and via sensitivity analysis. The calibration problem is then presented, and samples are generated to best match COVID-19 hospitalization and death numbers in Chicago from March to June in 2020. These results are compared with previous approximate Bayesian calibration (IMABC) results, and their predictive performance is analyzed, showing improved performance with a reduction in computation.
Affiliation(s)
- Cosmin Safta
- Sandia National Laboratories, Livermore, CA, USA
- Nicholson Collier
- Argonne National Laboratory, Lemont, IL, USA
- University of Chicago, Chicago, IL, USA
- Jonathan Ozik
- Argonne National Laboratory, Lemont, IL, USA
- University of Chicago, Chicago, IL, USA
- Jaideep Ray
- Sandia National Laboratories, Livermore, CA, USA
5
Nilsson A, Meimetis N, Lauffenburger DA. Towards an interpretable deep learning model of cancer. NPJ Precis Oncol 2025; 9:46. [PMID: 39948231 PMCID: PMC11825879 DOI: 10.1038/s41698-025-00822-y] [Received: 09/26/2024] [Accepted: 01/27/2025] [Indexed: 02/16/2025]
Abstract
Cancer is a manifestation of dysfunctional cell states. It emerges from an interplay of intrinsic and extrinsic factors that disrupt cellular dynamics, including genetic and epigenetic alterations, as well as the tumor microenvironment. This complexity can make it challenging to infer molecular causes for treating the disease. This may be addressed by system-wide computer models of cells, as they allow rapid generation and testing of hypotheses that would be too slow or impossible to perform in the laboratory and clinic. However, so far, such models have been impeded by both experimental and computational limitations. In this perspective, we argue that they can now be achieved using deep learning algorithms to integrate omics data and prior knowledge of molecular networks. Such models would have many applications in precision oncology, e.g., for identifying drug targets and biomarkers, predicting resistance mechanisms and toxicity effects of drugs, or simulating cell-cell interactions in the microenvironment.
Affiliation(s)
- Avlant Nilsson
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
- Department of Biology and Biological Engineering, Chalmers University of Technology, Göteborg, Sweden
- Department of Cell and Molecular Biology, SciLifeLab, Karolinska Institutet, Stockholm, Sweden
- Nikolaos Meimetis
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
- Douglas A Lauffenburger
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
6
Kou Y, Tian Y, Ha Y, Wang S, Sun X, Lv S, Luo B, Yang Y, Qin L. Comprehensive Sepsis Risk Prediction in Leukemia Using a Random Forest Model and Restricted Cubic Spline Analysis. J Inflamm Res 2025; 18:1013-1032. [PMID: 39867945 PMCID: PMC11766288 DOI: 10.2147/jir.s505813] [Received: 11/12/2024] [Accepted: 01/12/2025] [Indexed: 01/28/2025]
Abstract
Background Sepsis is a severe complication in leukemia patients, contributing to high mortality rates. Identifying early predictors of sepsis is crucial for timely intervention. This study aimed to develop and validate a predictive model for sepsis risk in leukemia patients using machine learning techniques. Methods This retrospective study included 4310 leukemia patients admitted to the Affiliated Hospital of Guangdong Medical University from 2005 to 2024, using 70% for training and 30% for validation. Feature selection was performed using univariate logistic regression, LASSO, and the Boruta algorithm, followed by multivariate logistic regression analysis. Seven machine learning models were constructed and evaluated using receiver operating characteristic (ROC) curves and decision curve analysis (DCA). Shapley additive explanations (SHAP) were applied to interpret the results, and restricted cubic spline (RCS) regression explored the nonlinear relationships between variables and sepsis risk. Furthermore, we examined the interactions among predictors to better understand their potential interrelationships. Results The random forest (RF) model outperformed all others, achieving an AUC of 0.765 in the training cohort and 0.700 in the validation cohort. Key predictors of sepsis identified by SHAP analysis included C-reactive protein (CRP), procalcitonin (PCT), neutrophil count (Neut), lymphocyte count (Lymph), thrombin time (TT), red blood cell count (RBC), total bile acid (TBA), and systolic blood pressure (SBP). RCS analysis revealed significant non-linear associations of CRP, PCT, Neut, Lymph, TT, RBC, and SBP with sepsis risk. Pairwise correlation analysis further revealed interactions among these variables. Conclusion The RF model exhibited robust predictive power for sepsis in leukemia patients, providing clinicians with a valuable tool for early risk assessment and the optimization of treatment strategies.
Affiliation(s)
- Yanqi Kou
- Department of Hematology, The First Affiliated Hospital, and College of Clinical Medicine of Henan University of Science and Technology, Luoyang, Henan Province, People’s Republic of China
- Department of Gastroenterology, Affiliated Hospital of Guangdong Medical University, Guangdong Medical University, Zhanjiang City, Guangdong Province, People’s Republic of China
- Yuan Tian
- Department of Gastroenterology, Affiliated Hospital of Guangdong Medical University, Guangdong Medical University, Zhanjiang City, Guangdong Province, People’s Republic of China
- Department of Pathology, Guangdong Medical University, Zhanjiang City, Guangdong Province, People’s Republic of China
- Yanping Ha
- Department of Gastroenterology, Affiliated Hospital of Guangdong Medical University, Guangdong Medical University, Zhanjiang City, Guangdong Province, People’s Republic of China
- Department of Pathology, Guangdong Medical University, Zhanjiang City, Guangdong Province, People’s Republic of China
- Shijie Wang
- Department of Hematology, The First Affiliated Hospital, and College of Clinical Medicine of Henan University of Science and Technology, Luoyang, Henan Province, People’s Republic of China
- Xiaobai Sun
- Department of Hematology, The First Affiliated Hospital, and College of Clinical Medicine of Henan University of Science and Technology, Luoyang, Henan Province, People’s Republic of China
- Shuxin Lv
- Department of Hematology, The First Affiliated Hospital, and College of Clinical Medicine of Henan University of Science and Technology, Luoyang, Henan Province, People’s Republic of China
- Botao Luo
- Department of Gastroenterology, Affiliated Hospital of Guangdong Medical University, Guangdong Medical University, Zhanjiang City, Guangdong Province, People’s Republic of China
- Department of Pathology, Guangdong Medical University, Zhanjiang City, Guangdong Province, People’s Republic of China
- Yuping Yang
- Department of Gastroenterology, Affiliated Hospital of Guangdong Medical University, Guangdong Medical University, Zhanjiang City, Guangdong Province, People’s Republic of China
- Ling Qin
- Department of Hematology, The First Affiliated Hospital, and College of Clinical Medicine of Henan University of Science and Technology, Luoyang, Henan Province, People’s Republic of China
7
Cogno N, Axenie C, Bauer R, Vavourakis V. Agent-based modeling in cancer biomedicine: applications and tools for calibration and validation. Cancer Biol Ther 2024; 25:2344600. [PMID: 38678381 PMCID: PMC11057625 DOI: 10.1080/15384047.2024.2344600] [Received: 10/30/2023] [Accepted: 04/15/2024] [Indexed: 04/29/2024]
Abstract
Computational models are not just appealing because they can simulate and predict the development of biological phenomena across multiple spatial and temporal scales, but also because they can integrate information from well-established in vitro and in vivo models and test new hypotheses in cancer biomedicine. Agent-based models and simulations are especially interesting candidates among computational modeling procedures in cancer research due to the capability to, for instance, recapitulate the dynamics of neoplasia and tumor - host interactions. Yet, the absence of methods to validate the consistency of the results across scales can hinder adoption by turning fine-tuned models into black boxes. This review compiles relevant literature that explores strategies to leverage high-fidelity simulations of multi-scale, or multi-level, cancer models with a focus on verification approached as simulation calibration. We consolidate our review with an outline of modern approaches for agent-based models' validation and provide an ambitious outlook toward rigorous and reliable calibration.
Affiliation(s)
- Nicolò Cogno
- Department of Radiation Oncology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Institute for Condensed Matter Physics, Technische Universität Darmstadt, Darmstadt, Germany
- Cristian Axenie
- Computer Science Department and Center for Artificial Intelligence, Technische Hochschule Nürnberg Georg Simon Ohm, Nuremberg, Germany
- Roman Bauer
- Nature Inspired Computing and Engineering Research Group, Computer Science Research Centre, University of Surrey, Guildford, UK
- Vasileios Vavourakis
- Department of Medical Physics and Biomedical Engineering, University College London, London, UK
- Department of Mechanical and Manufacturing Engineering, University of Cyprus, Nicosia, Cyprus
8
Nardini JT. Forecasting and Predicting Stochastic Agent-Based Model Data with Biologically-Informed Neural Networks. Bull Math Biol 2024; 86:130. [PMID: 39307859 DOI: 10.1007/s11538-024-01357-2] [Received: 04/11/2024] [Accepted: 09/02/2024] [Indexed: 10/18/2024]
Abstract
Collective migration is an important component of many biological processes, including wound healing, tumorigenesis, and embryo development. Spatial agent-based models (ABMs) are often used to model collective migration, but it is challenging to thoroughly predict these models' behavior throughout parameter space due to their random and computationally intensive nature. Modelers often coarse-grain ABM rules into mean-field differential equation (DE) models. While these DE models are fast to simulate, they suffer from poor (or even ill-posed) ABM predictions in some regions of parameter space. In this work, we describe how biologically-informed neural networks (BINNs) can be trained to learn interpretable BINN-guided DE models capable of accurately predicting ABM behavior. In particular, we show that BINN-guided partial DE (PDE) simulations can (1) forecast future spatial ABM data not seen during model training, and (2) predict ABM data at previously-unexplored parameter values. This latter task is achieved by combining BINN-guided PDE simulations with multivariate interpolation. We demonstrate our approach using three case study ABMs of collective migration that imitate cell biology experiments and find that BINN-guided PDEs accurately forecast and predict ABM data with a one-compartment PDE when the mean-field PDE is ill-posed or requires two compartments. This work suggests that BINN-guided PDEs allow modelers to efficiently explore parameter space, which may enable data-driven tasks for ABMs, such as estimating parameters from experimental data. All code and data from our study is available at https://github.com/johnnardini/Forecasting_predicting_ABMs .
Affiliation(s)
- John T Nardini
- Department of Mathematics and Statistics, The College of New Jersey, Ewing, NJ, 08628, USA
9
Khan MS, Moallemi EA, Thiruvady D, Nazari A, Bryan BA. Machine learning-based surrogate modelling of a robust, sustainable development goal (SDG)-compliant land-use future for Australia at high spatial resolution. JOURNAL OF ENVIRONMENTAL MANAGEMENT 2024; 363:121296. [PMID: 38843732 DOI: 10.1016/j.jenvman.2024.121296] [Received: 10/31/2023] [Revised: 04/13/2024] [Accepted: 05/29/2024] [Indexed: 06/18/2024]
Abstract
We developed a high-resolution machine learning based surrogate model to identify a robust land-use future for Australia which meets multiple UN Sustainable Development Goals. We compared machine learning models with different architectures to pick the best performing model considering the data type, accuracy metrics, ability to handle uncertainty and computational overhead requirement. The surrogate model, called ML-LUTO Spatial, was trained on the Land-Use Trade-Offs (version 1.0) model of Australian agricultural land system sustainability. Using the surrogate model, we generated projections of land-use futures at 1.1 km resolution with 95% classification accuracy, and which far surpassed the computational benchmarks of the original model. This efficiency enabled the generation of numerous SDG-compliant (SDGs 2, 6, 7, 13, 15) future land-use maps on a standard laptop, a task previously dependent upon high-performance computing clusters. Combining these projections, we derived a single, robust land-use future and quantified the uncertainty. Our findings indicate that while agricultural land-use remains dominant in all Australian regions, extensive carbon plantings were identified in Queensland and environmental plantings played a role across the study area, reflecting a growing urgency for offsetting greenhouse gas emissions and the restoration of ecosystems to support biodiversity across Australia to meet the 2050 Sustainable Development Goals.
Affiliation(s)
- Md Shakil Khan
- School of Life & Environmental Sciences, Deakin University, Melbourne Burwood Campus, VIC, 3125, Australia
- Enayat A Moallemi
- The Commonwealth Scientific and Industrial Research Organisation (CSIRO), Australia
- Dhananjay Thiruvady
- School of Information Technology, Deakin University, Melbourne Burwood Campus, VIC, 3125, Australia
- Asef Nazari
- School of Information Technology, Deakin University, Melbourne Burwood Campus, VIC, 3125, Australia
- Brett A Bryan
- School of Life & Environmental Sciences, Deakin University, Melbourne Burwood Campus, VIC, 3125, Australia
10
Puniya BL, Verma M, Damiani C, Bakr S, Dräger A. Perspectives on computational modeling of biological systems and the significance of the SysMod community. BIOINFORMATICS ADVANCES 2024; 4:vbae090. [PMID: 38948011 PMCID: PMC11213628 DOI: 10.1093/bioadv/vbae090] [Received: 01/31/2024] [Revised: 05/12/2024] [Accepted: 06/14/2024] [Indexed: 07/02/2024]
Abstract
Motivation In recent years, applying computational modeling to systems biology has caused a substantial surge in both discovery and practical applications and a significant shift in our understanding of the complexity inherent in biological systems. Results In this perspective article, we briefly overview computational modeling in biology, highlighting recent advancements such as multi-scale modeling due to the omics revolution, single-cell technology, and integration of artificial intelligence and machine learning approaches. We also discuss the primary challenges faced: integration, standardization, model complexity, scalability, and interdisciplinary collaboration. Lastly, we highlight the contribution made by the Computational Modeling of Biological Systems (SysMod) Community of Special Interest (COSI) associated with the International Society of Computational Biology (ISCB) in driving progress within this rapidly evolving field through community engagement (via both in person and virtual meetings, social media interactions), webinars, and conferences. Availability and implementation Additional information about SysMod is available at https://sysmod.info.
Affiliation(s)
- Bhanwar Lal Puniya
- Department of Biochemistry, University of Nebraska-Lincoln, Lincoln, NE 68588, United States
- Meghna Verma
- Systems Medicine, Clinical Pharmacology and Quantitative Pharmacology, R&D BioPharmaceuticals, AstraZeneca, Gaithersburg, MD 20878, United States
- Chiara Damiani
- Department of Biotechnology and Biosciences, University of Milano-Bicocca, Milan 20126, Italy
- Shaimaa Bakr
- Department of Medicine, Stanford Center for Biomedical Informatics Research (BMIR), Stanford University, Stanford, CA 94305-5479, United States
- Andreas Dräger
- Computational Systems Biology of Infections and Antimicrobial-Resistant Pathogens, Cluster of Excellence ‘Controlling Microbes to Fight Infections’, Institute for Bioinformatics and Medical Informatics (IBMI), Eberhard Karl University of Tübingen, Tübingen 72076, Germany
- German Center for Infection Research (DZIF), partner site Tübingen, Tübingen 72076, Germany
- Quantitative Biology Center (QBiC), Eberhard Karl University of Tübingen, Tübingen 72076, Germany
- Data Analytics and Bioinformatics, Institute of Computer Science, Martin Luther University Halle-Wittenberg, Halle (Saale) 06120, Germany
11
Kim YH, Kim HJ, Cho DH, Yoon Y. Evolutionary Approach to Optimal Oil Skimmer Assignment for Oil Spill Response: A Case Study. Biomimetics (Basel) 2024; 9:330. [PMID: 38921209 PMCID: PMC11202193 DOI: 10.3390/biomimetics9060330] [Received: 04/26/2024] [Revised: 05/27/2024] [Accepted: 05/28/2024] [Indexed: 06/27/2024]
Abstract
We propose a genetic algorithm for optimizing oil skimmer assignments, introducing a tailored repair operation for constrained assignments. Methods essentially involve simulation-based evaluation to ensure adherence to South Korea's regulations. Results show that the optimized assignments, compared to current ones, reduced work time on average and led to a significant reduction in total skimmer capacity. Additionally, we present a deep neural network-based surrogate model, greatly enhancing efficiency compared to simulation-based optimization. Addressing inefficiencies in mobilizing locations that store oil skimmers, further optimization aimed to minimize mobilized locations and was validated through scenario-based simulations resembling actual situations. Based on major oil spills in South Korea, this strategy significantly reduced work time and required locations. These findings demonstrate the effectiveness of the proposed genetic algorithm and mobilized location minimization strategy in enhancing oil spill response operations.
Affiliation(s)
- Yong-Hyuk Kim
- School of Software, Kwangwoon University, 20 Kwangwoon-ro, Nowon-gu, Seoul 01897, Republic of Korea
- Hye-Jin Kim
- TmaxBI, 29 Hwangsaeul-ro, 258beon-gil, Bundang-gu, Seongnam-si 13595, Gyeonggi-do, Republic of Korea
- Dong-Hee Cho
- Munhwa Broadcasting Corporation, 267 Seongam-ro, Mapo-gu, Seoul 03925, Republic of Korea
- Yourim Yoon
- Department of Computer Engineering, Gachon University, 1342 Seongnamdaero, Sujeong-gu, Seongnam-si 13120, Gyeonggi-do, Republic of Korea
12
Cain JY, Evarts JI, Yu JS, Bagheri N. Incorporating temporal information during feature engineering bolsters emulation of spatio-temporal emergence. Bioinformatics 2024; 40:btae131. [PMID: 38444088 PMCID: PMC10957516 DOI: 10.1093/bioinformatics/btae131] [Received: 07/26/2023] [Revised: 02/08/2024] [Accepted: 03/01/2024] [Indexed: 03/07/2024]
Abstract
MOTIVATION Emergent biological dynamics derive from the evolution of lower-level spatial and temporal processes. A long-standing challenge for scientists and engineers is identifying simple low-level rules that give rise to complex higher-level dynamics. High-resolution biological data acquisition enables this identification and has evolved at a rapid pace for both experimental and computational approaches. Simultaneously harnessing the resolution and managing the expense of emerging technologies (e.g. live cell imaging, scRNAseq, agent-based models) requires a deeper understanding of how spatial and temporal axes impact biological systems. Effective emulation is a promising solution to manage the expense of increasingly complex high-resolution computational models. In this research, we focus on the emulation of a tumor microenvironment agent-based model to examine the relationship between spatial and temporal environment features, and emergent tumor properties. RESULTS Despite significant feature engineering, we find limited predictive capacity of tumor properties from initial system representations. However, incorporating temporal information derived from intermediate simulation states dramatically improves the predictive performance of machine learning models. We train a deep-learning emulator on intermediate simulation states and observe promising enhancements over emulators trained solely on initial conditions. Our results underscore the importance of incorporating temporal information in the evaluation of spatio-temporal emergent behavior. Nevertheless, the emulators exhibit inconsistent performance, suggesting that the underlying model characterizes unique cell population dynamics that are not easily replaced. AVAILABILITY AND IMPLEMENTATION All source codes for the agent-based model, emulation, and analyses are publicly available at the corresponding DOIs: 10.5281/zenodo.10622155, 10.5281/zenodo.10611675, 10.5281/zenodo.10621244, respectively.
Affiliation(s)
- Jason Y Cain: Department of Chemical Engineering, University of Washington, Seattle, WA 98195, United States
- Jacob I Evarts: Department of Biology, University of Washington, Seattle, WA 98195, United States
- Jessica S Yu: Department of Biology, University of Washington, Seattle, WA 98195, United States
- Neda Bagheri: Department of Chemical Engineering, University of Washington, Seattle, WA 98195, United States; Department of Biology, University of Washington, Seattle, WA 98195, United States
|
13
|
Lu J, Fang Y, Han W. A novel adaptive-weight ensemble surrogate model base on distance and mixture error. PLoS One 2023; 18:e0293318. [PMID: 37906548 PMCID: PMC10617703 DOI: 10.1371/journal.pone.0293318] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2023] [Accepted: 10/09/2023] [Indexed: 11/02/2023] [Open Access]
Abstract
Surrogate models are commonly used as a substitute for computation-intensive simulations in design optimization. However, building a high-accuracy surrogate model with limited samples remains a challenging task. In this paper, a novel adaptive-weight ensemble surrogate modeling method is proposed to address this challenge. Instead of using a single error metric, the proposed method takes into account the position of the prediction sample, the mixture error metric, and the learning characteristics of the component surrogate models. The effectiveness of the proposed ensemble models is tested on five highly nonlinear benchmark functions and a finite element model for the analysis of the frequency response of an automotive exhaust pipe. Comparative results demonstrate the effectiveness and promising potential of the proposed method in achieving higher accuracy.
Affiliation(s)
- Jun Lu: National Center for Applied Mathematics in Chongqing, Chongqing Normal University, Chongqing, China
- Yudong Fang: School of Mechanical and Vehicle Engineering, Chongqing University, Chongqing, China
- Weijian Han: Key Laboratory for Lightweight Materials, Nanjing Tech University, Nanjing, China
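The idea in the Lu et al. abstract above, that ensemble weights can depend on where the prediction sample sits, can be illustrated with a minimal sketch. This is not the authors' algorithm: it is a generic pointwise-weighted ensemble in which each component surrogate is weighted by the inverse of its distance-weighted squared leave-one-out error near the query point. The polynomial surrogates, the function names (`fit_poly`, `loo_errors`, `local_weights`), and the toy sine target are all assumptions made for illustration.

```python
import numpy as np

def fit_poly(x, y, degree):
    """Least-squares polynomial surrogate; returns a callable predictor."""
    coeffs = np.polyfit(x, y, degree)
    return lambda q: np.polyval(coeffs, q)

def loo_errors(x, y, degree):
    """Leave-one-out prediction error at every training point."""
    errs = np.empty(len(x))
    for i in range(len(x)):
        mask = np.arange(len(x)) != i
        model = fit_poly(x[mask], y[mask], degree)
        errs[i] = model(x[i]) - y[i]
    return errs

def local_weights(x_train, loo, query, eps=1e-12):
    """Pointwise weights: inverse of the distance-weighted squared LOO error."""
    proximity = 1.0 / ((x_train - query) ** 2 + eps)  # nearby samples dominate
    local_err = (proximity[:, None] * loo ** 2).sum(axis=0) / proximity.sum()
    inv = 1.0 / (local_err + eps)
    return inv / inv.sum()

# Toy problem (assumed): approximate one period of a sine from 15 samples.
x = np.linspace(0.0, 1.0, 15)
y = np.sin(2.0 * np.pi * x)
degrees = [1, 3, 5]  # three component surrogates of increasing flexibility
models = [fit_poly(x, y, d) for d in degrees]
loo = np.column_stack([loo_errors(x, y, d) for d in degrees])  # shape (15, 3)

query = 0.37
w = local_weights(x, loo, query)
prediction = float(sum(wj * m(query) for wj, m in zip(w, models)))
```

Because the weights are recomputed for each query point, a component that is accurate in one region and poor in another contributes only where its local leave-one-out error is small.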
|
14
|
Misra S, Bland LC, Cardwell SG, Incorvia JAC, James CD, Kent AD, Schuman CD, Smith JD, Aimone JB. Probabilistic Neural Computing with Stochastic Devices. Advanced Materials (Deerfield Beach, Fla.) 2023; 35:e2204569. [PMID: 36395387 DOI: 10.1002/adma.202204569] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 05/20/2022] [Revised: 08/03/2022] [Indexed: 06/16/2023]
Abstract
The brain has effectively proven a powerful inspiration for the development of computing architectures in which processing is tightly integrated with memory, communication is event-driven, and analog computation can be performed at scale. These neuromorphic systems increasingly show an ability to improve the efficiency and speed of scientific computing and artificial intelligence applications. Herein, it is proposed that the brain's ubiquitous stochasticity represents an additional source of inspiration for expanding the reach of neuromorphic computing to probabilistic applications. To date, many efforts exploring probabilistic computing have focused primarily on one scale of the microelectronics stack, such as implementing probabilistic algorithms on deterministic hardware or developing probabilistic devices and circuits with the expectation that they will be leveraged by eventual probabilistic architectures. A co-design vision is described by which large numbers of devices, such as magnetic tunnel junctions and tunnel diodes, can be operated in a stochastic regime and incorporated into a scalable neuromorphic architecture that can impact a number of probabilistic computing applications, such as Monte Carlo simulations and Bayesian neural networks. Finally, a framework is presented to categorize increasingly advanced hardware-based probabilistic computing technologies.
Affiliation(s)
- Shashank Misra: Microsystems Engineering, Science and Applications, Sandia National Laboratories, Albuquerque, NM, 87123, USA
- Leslie C Bland: Department of Physics, Temple University, Philadelphia, PA, 19122-1801, USA
- Suma G Cardwell: Neural Exploration and Research Laboratory, Sandia National Laboratories, Albuquerque, NM, 87123, USA
- Jean Anne C Incorvia: Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, TX, 78712, USA
- Conrad D James: Microsystems Engineering, Science and Applications, Sandia National Laboratories, Albuquerque, NM, 87123, USA
- Andrew D Kent: Department of Physics, New York University, New York, NY, 10003, USA
- Catherine D Schuman: Department of Electrical Engineering and Computer Science, University of Tennessee, Knoxville, TN, 37996, USA
- J Darby Smith: Neural Exploration and Research Laboratory, Sandia National Laboratories, Albuquerque, NM, 87123, USA
- James B Aimone: Neural Exploration and Research Laboratory, Sandia National Laboratories, Albuquerque, NM, 87123, USA
|
15
|
Ozelim LCDSM, Ribeiro DB, Schiavon JA, Domingues VR, de Queiroz PIB. HPOSS: A hierarchical portfolio optimization stacking strategy to reduce the generalization error of ensembles of models. PLoS One 2023; 18:e0290331. [PMID: 37651433 PMCID: PMC10470931 DOI: 10.1371/journal.pone.0290331] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2023] [Accepted: 08/04/2023] [Indexed: 09/02/2023] [Open Access]
Abstract
Surrogate models are frequently used to replace costly engineering simulations. A single surrogate is frequently chosen based on previous experience or by fitting multiple surrogates and selecting one based on mean cross-validation errors. A novel stacking strategy is presented in this paper. This new strategy results from reinterpreting the model selection process based on the generalization error. For the first time, this problem is proposed to be translated into a well-studied financial problem: portfolio management and optimization. In short, it is demonstrated that the individual residues calculated by leave-one-out procedures are samples from a given random variable ϵ_i, whose second non-central moment is the i-th model's generalization error. Thus, a stacking methodology based solely on evaluating the behavior of the linear combination of the random variables ϵ_i is proposed. At first, several surrogate models are calibrated. The Directed Bubble Hierarchical Tree (DBHT) clustering algorithm is then used to determine which models are worth stacking. The stacking weights can be calculated using any financial approach to the portfolio optimization problem. This alternative understanding of the problem enables practitioners to use established financial methodologies to calculate the models' weights, significantly improving the ensemble of models' out-of-sample performance. A case study is carried out to demonstrate the applicability of the new methodology. Overall, a total of 124 models were trained using a specific dataset: 40 Machine Learning models and 84 Polynomial Chaos Expansion models (which considered 3 types of base random variables and 7 least-squares algorithms for fitting the coefficients of expansions up to fourth order). Among those, 99 models could be fitted without convergence or other numerical issues. The DBHT algorithm with Pearson correlation distance and generalization error similarity was able to select a subgroup of 23 models from the 99 fitted ones, implying a reduction of about 77% in the total number of models, representing a good filtering scheme which still preserves diversity. Finally, it has been demonstrated that the weights obtained by building a Hierarchical Risk Parity (HRP) portfolio perform better for various input random variables, indicating better out-of-sample performance. In this way, an economic stacking strategy has demonstrated its worth in improving the out-of-sample capabilities of stacked models, which illustrates how the new understanding of model stacking methodologies may be useful.
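The portfolio reading of stacking in the abstract above can be made concrete with a short sketch. If the leave-one-out residuals of model i are samples of ϵ_i, a stacked model with weights w has residual Σ w_i ϵ_i and estimated generalization error wᵀMw, where M is the second-moment matrix of the residuals. The sketch below uses the classical closed-form minimum-variance portfolio (weights proportional to M⁻¹1) as a stand-in. It is not the DBHT filtering or the Hierarchical Risk Parity weighting used in the paper, and the synthetic residuals and function names are assumptions for illustration only.

```python
import numpy as np

def second_moment_matrix(loo_residuals):
    """M[i, j] = mean of e_i * e_j; the diagonal holds each model's LOO error."""
    E = np.asarray(loo_residuals, dtype=float)
    return E.T @ E / E.shape[0]

def min_error_weights(M, ridge=1e-10):
    """Minimize w^T M w subject to sum(w) = 1 (minimum-variance portfolio)."""
    k = M.shape[0]
    x = np.linalg.solve(M + ridge * np.eye(k), np.ones(k))
    return x / x.sum()

# Synthetic LOO residuals (assumed): 200 samples, 4 models with a shared
# error component plus model-specific noise of increasing size.
rng = np.random.default_rng(0)
n, k = 200, 4
shared = rng.normal(size=(n, 1))
noise_scales = np.array([0.5, 1.0, 1.5, 2.0])
E = 0.3 * shared + rng.normal(size=(n, k)) * noise_scales

M = second_moment_matrix(E)
w = min_error_weights(M)
individual_errors = np.diag(M)       # per-model generalization error estimates
stacked_error = float(w @ M @ w)     # estimated error of the weighted stack
```

Since putting all weight on a single model is one feasible choice, the optimized stack cannot have a larger estimated error than the best individual model. The closed form allows negative weights (short positions in portfolio terms); a long-only variant would need a constrained solver.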
|
16
|
Accelerating robust plausible virtual patient cohort generation by substituting ODE simulations with parameter space mapping. J Pharmacokinet Pharmacodyn 2022; 49:625-644. [DOI: 10.1007/s10928-022-09826-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2022] [Accepted: 09/26/2022] [Indexed: 11/11/2022]
|
17
|
Jørgensen ACS, Ghosh A, Sturrock M, Shahrezaei V. Efficient Bayesian inference for stochastic agent-based models. PLoS Comput Biol 2022; 18:e1009508. [PMID: 36197919 PMCID: PMC9576090 DOI: 10.1371/journal.pcbi.1009508] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 10/01/2021] [Revised: 10/17/2022] [Accepted: 09/21/2022] [Indexed: 11/14/2022] [Open Access]
Abstract
The modelling of many real-world problems relies on computationally heavy simulations of randomly interacting individuals or agents. However, the values of the parameters that underlie the interactions between agents are typically poorly known, and hence they need to be inferred from macroscopic observations of the system. Since statistical inference rests on repeated simulations to sample the parameter space, the high computational expense of these simulations can become a stumbling block. In this paper, we compare two ways to mitigate this issue in a Bayesian setting through the use of machine learning methods: one approach is to construct lightweight surrogate models to substitute the simulations used in inference. Alternatively, one might altogether circumvent the need for Bayesian sampling schemes and directly estimate the posterior distribution. We focus on stochastic simulations that track autonomous agents and present two case studies: tumour growth and the spread of infectious diseases. We demonstrate that good accuracy in inference can be achieved with a relatively small number of simulations, making our machine learning approaches orders of magnitude faster than classical simulation-based methods that rely on sampling the parameter space. However, we find that while some methods generally produce more robust results than others, no algorithm offers a one-size-fits-all solution when attempting to infer model parameters from observations. Instead, one must choose the inference technique with the specific real-world application in mind. The stochastic nature of the considered real-world phenomena poses an additional challenge that can become insurmountable for some approaches. Overall, we find machine learning approaches that create direct inference machines to be promising for real-world applications. We present our findings as general guidelines for modelling practitioners.
Affiliation(s)
- Marc Sturrock: Department of Physiology and Medical Physics, Royal College of Surgeons in Ireland, Dublin, Ireland
- Vahid Shahrezaei: Department of Mathematics, Faculty of Natural Sciences, Imperial College London, London, United Kingdom
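The first strategy described in the Jørgensen et al. abstract above, substituting a lightweight surrogate for the simulator during inference, can be sketched as follows. This is a toy under stated assumptions, not the paper's pipeline: the "expensive" simulator is a small stochastic division model invented for the example, the surrogate is a quadratic polynomial fitted to 30 simulator runs, and inference uses plain approximate Bayesian computation (ABC) rejection sampling with a uniform prior. The surrogate here predicts only the mean summary statistic and ignores simulator noise.

```python
import numpy as np

rng = np.random.default_rng(1)

def expensive_simulator(rate, rng, n_agents=200, steps=20):
    """Toy stochastic agent model (assumed): each agent divides with
    probability `rate` per step; returns the log of the final population."""
    count = n_agents
    for _ in range(steps):
        count += rng.binomial(count, rate)
    return np.log(count)

# 1. Run the simulator on a small design of parameter values.
design = np.linspace(0.01, 0.2, 30)
summaries = np.array([expensive_simulator(r, rng) for r in design])

# 2. Fit a cheap surrogate mapping parameter -> summary statistic.
coeffs = np.polyfit(design, summaries, 2)

def surrogate(r):
    return np.polyval(coeffs, r)

# 3. "Observed" data generated at a known rate so the result can be checked.
true_rate = 0.1
observed = expensive_simulator(true_rate, rng)

# 4. ABC rejection: keep prior draws whose surrogate output is near the data.
prior_draws = rng.uniform(0.01, 0.2, size=20000)
tolerance = 0.05
accepted = prior_draws[np.abs(surrogate(prior_draws) - observed) < tolerance]
posterior_mean = float(accepted.mean())
```

The 20,000 prior draws cost only polynomial evaluations, while the simulator itself is called 31 times. The trade-off is that any bias in the surrogate carries over directly into the approximate posterior.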
|