1
Ilan Y. Using the Constrained Disorder Principle to Navigate Uncertainties in Biology and Medicine: Refining Fuzzy Algorithms. Biology 2024; 13:830. [PMID: 39452139 PMCID: PMC11505099 DOI: 10.3390/biology13100830] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2024] [Revised: 09/17/2024] [Accepted: 10/15/2024] [Indexed: 10/26/2024]
Abstract
Uncertainty in biology refers to situations in which information is imperfect or unknown. Variability, on the other hand, is measured by the frequency distribution of observed data. Biological variability adds to the uncertainty. The Constrained Disorder Principle (CDP) defines all systems in the universe by their inherent variability. According to the CDP, systems exhibit a degree of variability necessary for their proper function, allowing them to adapt to changes in their environments. Per the CDP, while variability differs from uncertainty, it can be viewed as a regulated mechanism for efficient functionality rather than uncertainty. This paper explores the various aspects of uncertainties in biology. It focuses on using CDP-based platforms for refining fuzzy algorithms to address some of the challenges associated with biological and medical uncertainties. Developing a fuzzy decision tree that considers the natural variability of systems can help minimize uncertainty. This method can reveal previously unidentified classes, reduce the number of unknowns, improve the accuracy of modeling results, and generate algorithm outputs that are more biologically and clinically relevant.
Affiliation(s)
- Yaron Ilan
- Department of Medicine, Hadassah Medical Center, Faculty of Medicine, Hebrew University, Jerusalem 9112001, Israel
2
Ponciano JM, Gómez JP, Ravel J, Forney LJ. Inferring stability and persistence in the vaginal microbiome: A stochastic model of ecological dynamics. bioRxiv 2024:2024.03.02.581600. [PMID: 38464272 PMCID: PMC10925280 DOI: 10.1101/2024.03.02.581600] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/12/2024]
Abstract
The interplay of stochastic and ecological processes that govern the establishment and persistence of host-associated microbial communities is not well understood. Here we illustrate the conceptual and practical advantages of fitting stochastic population dynamics models to multi-species bacterial time series data. We show how the stability properties, fluctuation regimes and persistence probabilities of human vaginal microbial communities can be better understood by explicitly accommodating three sources of variability in ecological stochastic models of multi-species abundances: 1) stochastic biotic and abiotic forces, 2) ecological feedback and 3) sampling error. Rooting our modeling tool in stochastic population dynamics modeling theory was key to applying standardized measures of a community's reaction to environmental variation that ultimately depends on the nature and intensity of the intra-specific and inter-specific interaction strengths. Using estimates of model parameters, we developed a Risk Prediction Monitoring (RPM) tool that estimates temporal changes in persistence probabilities for any bacterial group of interest. This method mirrors approaches that are often used in conservation biology in which a measure of extinction risks is periodically updated with any change in a population or community. Additionally, we show how to use estimates of interaction strengths and persistence probabilities to formulate hypotheses regarding the molecular mechanisms and genetic composition that underpin different types of interactions. Instead of seeking a definition of "dysbiosis", we propose to translate concepts of theoretical ecology and conservation biology methods into practical approaches for the management of human-associated bacterial communities.
Affiliation(s)
- Juan P. Gómez
- Departamento de Química y Biología, Universidad del Norte, Barranquilla, Colombia
- Jacques Ravel
- Institute for Genome Sciences and Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, MD
- Larry J. Forney
- Institute for Interdisciplinary Data Science and Department of Biological Sciences, University of Idaho, Moscow, ID
3
Shao Y, Ahmed A, Zamrini EY, Cheng Y, Goulet JL, Zeng-Treitler Q. Enhancing Clinical Data Analysis by Explaining Interaction Effects between Covariates in Deep Neural Network Models. J Pers Med 2023; 13:jpm13020217. [PMID: 36836451 PMCID: PMC9967882 DOI: 10.3390/jpm13020217] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2022] [Revised: 01/21/2023] [Accepted: 01/24/2023] [Indexed: 01/28/2023]
Abstract
Deep neural network (DNN) is a powerful technology that is being utilized by a growing number and range of research projects, including disease risk prediction models. One of the key strengths of DNN is its ability to model non-linear relationships, which include covariate interactions. We developed a novel method called interaction scores for measuring the covariate interactions captured by DNN models. As the method is model-agnostic, it can also be applied to other types of machine learning models. It is designed to be a generalization of the coefficient of the interaction term in a logistic regression; hence, its values are easily interpretable. The interaction score can be calculated at both an individual level and population level. The individual-level score provides an individualized explanation for covariate interactions. We applied this method to two simulated datasets and a real-world clinical dataset on Alzheimer's disease and related dementia (ADRD). We also applied two existing interaction measurement methods to those datasets for comparison. The results on the simulated datasets showed that the interaction score method can explain the underlying interaction effects, there are strong correlations between the population-level interaction scores and the ground truth values, and the individual-level interaction scores vary when the interaction was designed to be non-uniform. Another validation of our new method is that the interactions discovered from the ADRD data included both known and novel relationships.
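As a rough illustration of what "a generalization of the coefficient of the interaction term in a logistic regression" could mean, the sketch below computes a second-order finite difference of a model's log-odds over two binary covariates. This is an assumption made for illustration and not necessarily the authors' definition of the interaction score; `interaction_score` and `toy_model` are hypothetical names, and the toy coefficients are invented. For a plain logistic regression, this quantity reduces to the coefficient of the interaction term.

```python
import math


def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def interaction_score(predict_proba, x, i, j):
    """Second-order finite difference of the log-odds over binary covariates i and j.

    Hypothetical definition for illustration only: all other covariates
    in x are held at their observed values.
    """
    def log_odds(vi, vj):
        z = list(x)
        z[i], z[j] = vi, vj
        return logit(predict_proba(z))

    return log_odds(1, 1) - log_odds(1, 0) - log_odds(0, 1) + log_odds(0, 0)


def toy_model(x, b0=-1.0, b1=0.5, b2=0.8, b12=1.5, b3=0.3):
    """Toy logistic regression with an x[0]*x[1] interaction (invented coefficients)."""
    eta = b0 + b1 * x[0] + b2 * x[1] + b12 * x[0] * x[1] + b3 * x[2]
    return 1.0 / (1.0 + math.exp(-eta))


# Individual-level score for one subject; for this toy model it recovers b12.
score = interaction_score(toy_model, [0, 0, 1], 0, 1)
print(round(score, 6))
```

Because the function only calls the model's prediction routine, the same computation applies to any classifier that outputs probabilities, which is in line with the model-agnostic framing of the abstract. Under the assumption above, evaluating it at each subject's covariate vector would give individual-level values that can then be summarized across subjects.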
Affiliation(s)
- Yijun Shao
- Department of Clinical Research and Leadership, School of Medicine and Health Sciences, George Washington University, Washington, DC 20037, USA
- Washington DC VA Medical Center, Washington, DC 20422, USA
- Ali Ahmed
- Department of Clinical Research and Leadership, School of Medicine and Health Sciences, George Washington University, Washington, DC 20037, USA
- Washington DC VA Medical Center, Washington, DC 20422, USA
- Department of Medicine, School of Medicine, Georgetown University, Washington, DC 20057, USA
- Edward Y. Zamrini
- Department of Clinical Research and Leadership, School of Medicine and Health Sciences, George Washington University, Washington, DC 20037, USA
- Washington DC VA Medical Center, Washington, DC 20422, USA
- Department of Neurology, School of Medicine, University of Utah, Salt Lake City, UT 84108, USA
- Irvine Clinical Research, Irvine, CA 92614, USA
- Cognitive Neurology Consulting, Newport Beach, CA 92614, USA
- Yan Cheng
- Department of Clinical Research and Leadership, School of Medicine and Health Sciences, George Washington University, Washington, DC 20037, USA
- Washington DC VA Medical Center, Washington, DC 20422, USA
- Joseph L. Goulet
- VA Connecticut Healthcare System, New Haven, CT 06516, USA
- Department of Emergency Medicine, Yale School of Medicine, Yale University, New Haven, CT 06516, USA
- Qing Zeng-Treitler
- Department of Clinical Research and Leadership, School of Medicine and Health Sciences, George Washington University, Washington, DC 20037, USA
- Washington DC VA Medical Center, Washington, DC 20422, USA
4
Taper ML, Ponciano JM, Dennis B. Entropy, Statistical Evidence, and Scientific Inference: Evidence Functions in Theory and Applications. Entropy (Basel) 2022; 24:1273. [PMID: 36141159 PMCID: PMC9498250 DOI: 10.3390/e24091273] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2022] [Accepted: 08/26/2022] [Indexed: 06/16/2023]
Abstract
Scope and Goals of the Special Issue: There is a growing realization that despite being the essential tool of modern data-based scientific discovery and model testing, statistics has major problems [...].
Affiliation(s)
- Mark L. Taper
- Department of Ecology, Montana State University, Bozeman, MT 59717, USA
- José Miguel Ponciano
- Biology Department, University of Florida, Gainesville, FL 32611, USA
- Mathematics Department, University of Florida, Gainesville, FL 32611, USA
- Brian Dennis
- Department of Mathematics and Statistical Science, University of Idaho, Moscow, ID 83844, USA
- Department of Fish and Wildlife Sciences, University of Idaho, Moscow, ID 83844, USA
5
Aslam M. Design of a new Z-test for the uncertainty of Covid-19 events under Neutrosophic statistics. BMC Med Res Methodol 2022; 22:99. [PMID: 35387604 PMCID: PMC8983806 DOI: 10.1186/s12874-022-01593-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/27/2021] [Accepted: 03/31/2022] [Indexed: 12/05/2022]
Abstract
BACKGROUND The existing Z-test for uncertainty events does not give information about the measure of indeterminacy/uncertainty associated with the test. METHODS This paper introduces the Z-test for uncertainty events under neutrosophic statistics. The test statistic of the existing test is modified under the philosophy of neutrosophy. The testing process is introduced and applied to Covid-19 data. RESULTS Based on the information, the proposed test is interpreted as follows: the hypothesis that there is no reduction in the uncertainty of Covid-19 is accepted with a probability of 0.95, the probability of committing a type-I error is 0.05, and the measure of indeterminacy is 0.10. Based on the analysis, it is concluded that the proposed test is more informative than the existing test. The proposed test is also better than the Z-test for uncertainty under fuzzy logic, as the test using fuzzy logic gives the value of the statistic from 2.20 to 2.42 without any information about the measure of indeterminacy. The test under interval statistics only considers the values within the interval rather than the crisp value. CONCLUSIONS From the Covid-19 data analysis, it is found that the proposed Z-test for uncertainty events under neutrosophic statistics is more efficient than the existing tests under classical statistics, the fuzzy approach, and interval statistics in terms of information, flexibility, power of the test, and adequacy.
Affiliation(s)
- Muhammad Aslam
- Department of Statistics, Faculty of Science, King Abdulaziz University, Jeddah, 21551, Saudi Arabia.
6
A Survey of Uncertainty Quantification in Machine Learning for Space Weather Prediction. Geosciences 2022. [DOI: 10.3390/geosciences12010027] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 12/29/2022]
Abstract
With the availability of data and computational technologies in the modern world, machine learning (ML) has emerged as a preferred methodology for data analysis and prediction. While ML holds great promise, the results from such models are not fully reliable due to the challenges introduced by uncertainty. An ML model generates an optimal solution based on its training data. However, if the uncertainty in the data and the model parameters is not considered, such optimal solutions have a high risk of failure in real-world deployment. This paper surveys the different approaches used in ML to quantify uncertainty. The paper also exhibits the implications of quantifying uncertainty when using ML by performing two case studies with space physics in focus. The first case study consists of the classification of auroral images into predefined labels. In the second case study, the horizontal component of the perturbed magnetic field measured at the Earth’s surface was predicted for the study of Geomagnetically Induced Currents (GICs) by training the model using time series data. In both cases, a Bayesian Neural Network (BNN) was trained to generate predictions, along with epistemic and aleatoric uncertainties. Finally, the pros and cons of both Gaussian Process Regression (GPR) models and Bayesian Deep Learning (DL) are weighed. The paper also provides recommendations for the models that need exploration, focusing on space weather prediction.
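To make the distinction between epistemic and aleatoric uncertainty concrete, the sketch below applies a common entropy-based decomposition to class probabilities sampled from a classifier (for example, repeated stochastic forward passes of a BNN). It is a minimal illustration under stated assumptions and not necessarily the procedure used in the surveyed case studies; `entropy` and `decompose` are hypothetical helper names, and the probability samples are invented.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(q * math.log(q) for q in p if q > 0.0)


def decompose(prob_samples):
    """Split predictive uncertainty for one input into three parts.

    prob_samples: list of class-probability vectors, one per sampled model.
    total     = entropy of the averaged prediction
    aleatoric = average entropy of the individual predictions
    epistemic = total - aleatoric (disagreement between sampled models)
    """
    n = len(prob_samples)
    k = len(prob_samples[0])
    mean = [sum(s[c] for s in prob_samples) / n for c in range(k)]
    total = entropy(mean)
    aleatoric = sum(entropy(s) for s in prob_samples) / n
    return total, aleatoric, total - aleatoric


# Sampled models agree that the input is ambiguous: uncertainty is aleatoric.
print(decompose([[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]))
# Sampled models are each confident but disagree: uncertainty is epistemic.
print(decompose([[1.0, 0.0], [0.0, 1.0]]))
```

The first call attributes all uncertainty to noise in the data, while the second attributes it to disagreement between models, which is the part that more training data could in principle reduce.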
7
Taper ML, Lele SR, Ponciano JM, Dennis B, Jerde CL. Assessing the Global and Local Uncertainty of Scientific Evidence in the Presence of Model Misspecification. Front Ecol Evol 2021. [DOI: 10.3389/fevo.2021.679155] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 12/24/2022]
Abstract
Scientists need to compare the support for models based on observed phenomena. The main goal of the evidential paradigm is to quantify the strength of evidence in the data for a reference model relative to an alternative model. This is done via an evidence function, such as ΔSIC, an estimator of the sample size scaled difference of divergences between the generating mechanism and the competing models. To use evidence, either for decision making or as a guide to the accumulation of knowledge, an understanding of the uncertainty in the evidence is needed. This uncertainty is well characterized by the standard statistical theory of estimation. Unfortunately, the standard theory breaks down if the models are misspecified, as is commonly the case in scientific studies. We develop non-parametric bootstrap methodologies for estimating the sampling distribution of the evidence estimator under model misspecification. This sampling distribution allows us to determine how secure we are in our evidential statement. We characterize this uncertainty in the strength of evidence with two different types of confidence intervals, which we term “global” and “local.” We discuss how evidence uncertainty can be used to improve scientific inference and illustrate this with a reanalysis of the model identification problem in a prominent landscape ecology study using structural equations.
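The idea of bootstrapping an evidence function can be sketched as follows. The example resamples the data with replacement, recomputes a difference in Schwarz information criterion (SIC) between two candidate models on each resample, and reports a percentile interval. Everything specific here is an assumption for illustration: the two normal models, the sign convention of `delta_sic`, the simulated data, and the plain percentile interval, which is not claimed to match the paper's "global" or "local" intervals.

```python
import math
import random


def sic_normal(data, fixed_mean=None):
    """SIC of a normal model fitted by maximum likelihood.

    If fixed_mean is given, only the variance is estimated (1 parameter);
    otherwise both mean and variance are estimated (2 parameters).
    """
    n = len(data)
    mu = sum(data) / n if fixed_mean is None else fixed_mean
    var = sum((x - mu) ** 2 for x in data) / n  # ML variance estimate
    loglik = -0.5 * n * (math.log(2.0 * math.pi * var) + 1.0)
    k = 2 if fixed_mean is None else 1
    return -2.0 * loglik + k * math.log(n)


def delta_sic(data):
    """SIC(mean fixed at 0) minus SIC(free mean); positive favors the free-mean model."""
    return sic_normal(data, fixed_mean=0.0) - sic_normal(data)


def bootstrap_delta_sic(data, n_boot=1000, alpha=0.05, seed=1):
    """Point estimate and percentile interval from a non-parametric bootstrap."""
    rng = random.Random(seed)
    n = len(data)
    reps = sorted(delta_sic(rng.choices(data, k=n)) for _ in range(n_boot))
    lower = reps[int(n_boot * alpha / 2)]
    upper = reps[int(n_boot * (1 - alpha / 2)) - 1]
    return delta_sic(data), (lower, upper)


# Simulated data (invented): 80 draws from a normal with mean 1 and sd 1.
gen = random.Random(0)
sample = [gen.gauss(1.0, 1.0) for _ in range(80)]
estimate, (lower, upper) = bootstrap_delta_sic(sample)
print(f"delta SIC = {estimate:.2f}, interval = ({lower:.2f}, {upper:.2f})")
```

Resampling the observations directly, instead of simulating from either fitted model, is what keeps the interval meaningful when neither candidate model is the true generating mechanism.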