1
Zou T, Wu W, Liu K, Wang K, Lv C. Bayesian Averaging Evaluation Method of Accelerated Degradation Testing Considering Model Uncertainty Based on Relative Entropy. Sensors (Basel) 2024; 24:1426. [PMID: 38474962 DOI: 10.3390/s24051426]
Abstract
To evaluate the lifetime and reliability of long-life, high-reliability products under limited resources, accelerated degradation testing (ADT) technology has been widely applied. The Bayesian evaluation method for ADT can make comprehensive use of historical information and overcome the limitations caused by small sample sizes, and it has therefore attracted significant attention from scholars. However, the traditional ADT Bayesian evaluation method has inherent shortcomings and limitations. Because of the constraints of small samples and an incomplete understanding of degradation or acceleration mechanisms, the selected evaluation model may be inaccurate, leading to potentially inaccurate evaluation results. Describing and quantifying the impact of model uncertainty on evaluation results is therefore a challenging issue that urgently needs resolution in the theoretical research on Bayesian ADT methods. This article addresses model uncertainty in the Bayesian ADT evaluation process. It analyzes the Bayesian modeling process for ADT and proposes a new relative-entropy-based model averaging evaluation method which, to a certain extent, can resolve the evaluation inaccuracy caused by model selection uncertainty. This study holds theoretical and engineering application value for conducting Bayesian ADT evaluation under model uncertainty.
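A minimal sketch of the general idea behind this entry: weight candidate lifetime models by how close each fitted density is to the observed data (an empirical relative-entropy proxy) and average their reliability predictions. The distribution set, weighting rule, and data below are illustrative assumptions, not the authors' exact procedure.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
data = rng.weibull(1.5, size=30) * 100.0  # small-sample pseudo-lifetimes (synthetic)

# Candidate lifetime models (an illustrative set, not the paper's)
candidates = {
    "weibull": stats.weibull_min(*stats.weibull_min.fit(data, floc=0)),
    "lognorm": stats.lognorm(*stats.lognorm.fit(data, floc=0)),
    "gamma":   stats.gamma(*stats.gamma.fit(data, floc=0)),
}

# Score each model by an empirical relative-entropy proxy: the average
# negative log-density of the observed data (lower = closer to the data).
scores = {name: -np.mean(m.logpdf(data)) for name, m in candidates.items()}

# Turn scores into averaging weights (softmax on the negated scores),
# so models closer to the data in this sense receive larger weight.
s = np.array(list(scores.values()))
w = np.exp(-(s - s.min()))
weights = dict(zip(scores, w / w.sum()))

# Model-averaged reliability at t = 150 (probability of surviving past t)
t = 150.0
R_avg = sum(weights[name] * candidates[name].sf(t) for name in candidates)
```

Averaging the survival functions rather than committing to one fitted distribution is the step that hedges against model selection uncertainty.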
Affiliation(s)
- Tianji Zou
- University of Chinese Academy of Sciences, Chinese Academy of Sciences, Beijing 100049, China
- Technology and Engineering Center for Space Utilization, Chinese Academy of Sciences, Beijing 100094, China
- Wenbo Wu
- University of Chinese Academy of Sciences, Chinese Academy of Sciences, Beijing 100049, China
- Technology and Engineering Center for Space Utilization, Chinese Academy of Sciences, Beijing 100094, China
- Kai Liu
- University of Chinese Academy of Sciences, Chinese Academy of Sciences, Beijing 100049, China
- Technology and Engineering Center for Space Utilization, Chinese Academy of Sciences, Beijing 100094, China
- Ke Wang
- University of Chinese Academy of Sciences, Chinese Academy of Sciences, Beijing 100049, China
- Technology and Engineering Center for Space Utilization, Chinese Academy of Sciences, Beijing 100094, China
- Congmin Lv
- University of Chinese Academy of Sciences, Chinese Academy of Sciences, Beijing 100049, China
- Technology and Engineering Center for Space Utilization, Chinese Academy of Sciences, Beijing 100094, China
2
Pan T, Shen W, Davis-Stober CP, Hu G. A Bayesian nonparametric approach for handling item and examinee heterogeneity in assessment data. Br J Math Stat Psychol 2024; 77:196-211. [PMID: 37727141 DOI: 10.1111/bmsp.12322]
Abstract
We propose a novel nonparametric Bayesian item response theory model that estimates clusters at the question level, while simultaneously allowing for heterogeneity at the examinee level under each question cluster, characterized by a mixture of binomial distributions. The main contribution of this work is threefold. First, we present our new model and demonstrate that it is identifiable under a set of conditions. Second, we show that our model can correctly identify question-level clusters asymptotically, and the parameters of interest that measure the proficiency of examinees in solving certain questions can be estimated at an rate (up to a log term). Third, we present a tractable sampling algorithm to obtain valid posterior samples from our proposed model. Compared to existing methods, our model reveals the multi-dimensionality of the examinees' proficiency in handling different types of questions parsimoniously by imposing a nested clustering structure. The proposed model is evaluated via a series of simulations and applied to an English proficiency assessment data set. This data analysis example illustrates how our model can be used by test makers to distinguish different types of students and aid in the design of future tests.
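The examinee-level heterogeneity under a question cluster, characterized by a mixture of binomial distributions, can be illustrated with a simple EM fit of a two-component binomial mixture. This fixes the number of components for clarity, whereas the paper's nonparametric prior infers it; all data and settings here are synthetic.

```python
import numpy as np

rng = np.random.default_rng(1)
m = 20  # number of questions in one (hypothetical) question cluster
# Two latent examinee groups with different proficiency on this cluster
true_p = [0.35, 0.85]
z = rng.random(300) < 0.4
counts = np.where(z, rng.binomial(m, true_p[1], 300), rng.binomial(m, true_p[0], 300))

# EM for a two-component binomial mixture
pi, p = np.array([0.5, 0.5]), np.array([0.3, 0.7])
for _ in range(200):
    # E-step: responsibilities from binomial log-likelihoods
    # (the binomial coefficient is constant per examinee and cancels)
    logpmf = counts[:, None] * np.log(p) + (m - counts[:, None]) * np.log1p(-p)
    log_r = np.log(pi) + logpmf
    log_r -= log_r.max(axis=1, keepdims=True)
    r = np.exp(log_r)
    r /= r.sum(axis=1, keepdims=True)
    # M-step: update mixing weights and per-group success probabilities
    pi = r.mean(axis=0)
    p = (r * counts[:, None]).sum(axis=0) / (r.sum(axis=0) * m)

p_sorted = np.sort(p)  # estimated group proficiencies on this cluster
```

The fitted success probabilities play the role of the proficiency parameters measured per question cluster.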
Affiliation(s)
- Tianyu Pan
- Department of Statistics, University of California, Irvine, California, USA
- Weining Shen
- Department of Statistics, University of California, Irvine, California, USA
- Clintin P Davis-Stober
- Department of Psychological Sciences, University of Missouri - Columbia, Columbia, Missouri, USA
- Guanyu Hu
- Department of Biostatistics and Data Science, Center for Spatial Temporal Modeling for Applications in Population Sciences, The University of Texas Health Science Center at Houston, Houston, Texas, USA
3
Verbeeck J, Geroldinger M, Thiel K, Hooker AC, Ueckert S, Karlsson M, Bathke AC, Bauer JW, Molenberghs G, Zimmermann G. How to analyze continuous and discrete repeated measures in small-sample cross-over trials? Biometrics 2023; 79:3998-4011. [PMID: 37587671 DOI: 10.1111/biom.13920]
Abstract
To optimize the use of data from a small number of subjects in rare disease trials, an at first sight advantageous design is the repeated measures cross-over design. However, it is unclear how these within-treatment-period and within-subject clustered data are best analyzed in small-sample trials. In a real-data simulation study based upon a recent epidermolysis bullosa simplex trial using this design, we compare non-parametric marginal models, generalized pairwise comparison models, GEE-type models, and parametric model averaging for both repeated binary and count data. Which methodology to recommend for rare disease trials with a repeated measures cross-over design depends on the type of outcome and the number of time points on which the treatment has an effect. The non-parametric marginal model testing the treatment-time interaction effect is suitable for detecting between-group differences in the shapes of the longitudinal profiles. For binary outcomes with a treatment effect at a single time point, the parametric model averaging method is recommended, while in the other cases the unmatched generalized pairwise comparison methodology is recommended. Both provide an easily interpretable effect size measure and do not require exclusion of periods or subjects due to incompleteness.
Affiliation(s)
- Johan Verbeeck
- Data Science Institute (DSI), Interuniversity Institute for Biostatistics and statistical Bioinformatics (I-BioStat), Hasselt University, Hasselt, Belgium
- Martin Geroldinger
- Team Biostatistics and Big Medical Data, Intelligent Data Analytics (IDA) Lab Salzburg, Paracelsus Medical University, Salzburg, Austria
- Research and Innovation Management, Paracelsus Medical University Salzburg, Salzburg, Austria
- Konstantin Thiel
- Team Biostatistics and Big Medical Data, Intelligent Data Analytics (IDA) Lab Salzburg, Paracelsus Medical University, Salzburg, Austria
- Research and Innovation Management, Paracelsus Medical University Salzburg, Salzburg, Austria
- Mats Karlsson
- Department of Pharmacy, Uppsala University, Uppsala, Sweden
- Arne Cornelius Bathke
- Intelligent Data Analytics (IDA) Lab Salzburg, Department of Artificial Intelligence and Human Interfaces, University of Salzburg, Salzburg, Austria
- Johann Wolfgang Bauer
- Department of Dermatology and Allergology, Paracelsus Medical University, Salzburg, Austria
- Geert Molenberghs
- Data Science Institute (DSI), Interuniversity Institute for Biostatistics and statistical Bioinformatics (I-BioStat), Hasselt University, Hasselt, Belgium
- Interuniversity Institute for Biostatistics and statistical Bioinformatics (I-BioStat), KULeuven, Leuven, Belgium
- Georg Zimmermann
- Team Biostatistics and Big Medical Data, Intelligent Data Analytics (IDA) Lab Salzburg, Paracelsus Medical University, Salzburg, Austria
- Research and Innovation Management, Paracelsus Medical University Salzburg, Salzburg, Austria
4
Zhang Z, Wang Z, Luo Y, Zhang J, Tian D, Zhang Y. Rapid Estimation of Soil Pb Concentration Based on Spectral Feature Screening and Multi-Strategy Spectral Fusion. Sensors (Basel) 2023; 23:7707. [PMID: 37765764 PMCID: PMC10538168 DOI: 10.3390/s23187707]
Abstract
Traditional methods for obtaining soil heavy metal content are expensive, inefficient, and limited in monitoring range. To meet the needs of soil environmental quality evaluation and health status assessment, visible and near-infrared (vis-NIR) spectroscopy and X-ray fluorescence (XRF) spectroscopy for monitoring heavy metal content in soil have attracted much attention because of their rapid, nondestructive, economical, and environmentally friendly features. Either spectrum used alone cannot meet the accuracy requirements of traditional measurements, whereas the synergistic use of the two can further improve the accuracy of monitoring soil lead (Pb) content. Therefore, this study applied various spectral transformations and preprocessing to vis-NIR and XRF spectra; used the whale optimization algorithm (WOA) and competitive adaptive re-weighted sampling (CARS) algorithms to identify feature spectra; designed a combination variable model (CVM) based on multi-layer spectral data fusion, which improved the spectral preprocessing and spectral feature screening process to increase the efficiency of spectral fusion; and established a quantitative model for soil Pb concentration using partial least squares regression (PLSR). The estimation performance of three spectral fusion strategies, CVM, outer-product analysis (OPA), and Granger-Ramanathan averaging (GRA), was discussed. The results showed that the accuracy and efficiency of the CARS algorithm in the fused spectra estimation model were superior to those of the WOA algorithm, with an average coefficient of determination (R2) of 0.9226 and an average root mean square error (RMSE) of 0.1984. The accuracy of the estimation models established on the different spectral types to predict the Pb content of the soil was ranked as follows: the CVM model > the XRF spectral model > the vis-NIR spectral model. Within the CVM fusion strategy, the estimation model based on CARS and PLSR (CARS_D1+D2) performed best, with R2 and RMSE values of 0.9546 and 0.2035, respectively. Among the three spectral fusion strategies, CVM had the highest accuracy, OPA had the smallest errors, and GRA showed a more balanced performance. This study provides technical means for on-site rapid estimation of Pb content based on multi-source spectral fusion and lays the foundation for subsequent research on dynamic, real-time, and large-scale quantitative monitoring of soil heavy metal pollution using hyperspectral remote sensing images.
Affiliation(s)
- Zhe Wang
- College of Environment and Resources, Southwest University of Science & Technology, Mianyang 621010, China
5
Liu H, Zhang X. Frequentist model averaging for undirected Gaussian graphical models. Biometrics 2023; 79:2050-2062. [PMID: 36106680 DOI: 10.1111/biom.13758]
Abstract
Advances in information technologies have made network data increasingly frequent in a spectrum of big data applications, and such data are often explored with probabilistic graphical models. To estimate the precision matrix precisely, we propose an optimal model averaging estimator for Gaussian graphs. We prove that the proposed estimator is asymptotically optimal when the candidate models are misspecified. The consistency and asymptotic distribution of the model averaging estimator, as well as the convergence of the weights, are also studied when at least one correct model is included in the candidate set. Furthermore, numerical simulations and a real data analysis of yeast genetic data illustrate that the proposed method is promising.
Affiliation(s)
- Huihang Liu
- School of Management, University of Science and Technology of China, Hefei, China
- Xinyu Zhang
- School of Management, University of Science and Technology of China, Hefei, China
- Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
6
van Erp B, Nuijten WWL, van de Laar T, de Vries B. Automating Model Comparison in Factor Graphs. Entropy (Basel) 2023; 25:1138. [PMID: 37628168 PMCID: PMC10453220 DOI: 10.3390/e25081138]
Abstract
Bayesian state and parameter estimation are automated effectively in a variety of probabilistic programming languages. The process of model comparison, on the other hand, which still requires error-prone and time-consuming manual derivations, is often overlooked despite its importance. This paper efficiently automates Bayesian model averaging, selection, and combination by message passing on a Forney-style factor graph with a custom mixture node. Parameter and state inference and model comparison can then be executed simultaneously using message passing with scale factors. This approach shortens the model design cycle and allows for straightforward extension to hierarchical and temporal model priors to accommodate modeling complicated time-varying processes.
Affiliation(s)
- Bart van Erp
- Department of Electrical Engineering, Eindhoven University of Technology, 5612 AP Eindhoven, The Netherlands
- Wouter W. L. Nuijten
- Department of Electrical Engineering, Eindhoven University of Technology, 5612 AP Eindhoven, The Netherlands
- Thijs van de Laar
- Department of Electrical Engineering, Eindhoven University of Technology, 5612 AP Eindhoven, The Netherlands
- Bert de Vries
- Department of Electrical Engineering, Eindhoven University of Technology, 5612 AP Eindhoven, The Netherlands
- GN Hearing, 5612 AB Eindhoven, The Netherlands
7
Wang M, Zhang X, Wan ATK, You K, Zou G. Jackknife model averaging for high-dimensional quantile regression. Biometrics 2023; 79:178-189. [PMID: 34608993 DOI: 10.1111/biom.13574]
Abstract
In this paper, we propose a frequentist model averaging method for quantile regression with high-dimensional covariates. Although research on these subjects has proliferated as separate approaches, no study has considered them in conjunction. Our method first reduces the covariate dimension by ranking the covariates based on marginal quantile utilities. The second step implements model averaging on the models containing the covariates that survive the screening of the first step. We use a delete-one cross-validation method to select the model weights, and prove that the resultant estimator possesses an optimal asymptotic property uniformly over quantile indices in any compact subset of (0,1). Our proof, which relies on empirical process theory, is arguably more challenging than proofs of similar results in other contexts, owing to the high-dimensional nature of the problem and our relaxation of the conventional assumption that the weights sum to one. Our investigation of finite-sample performance demonstrates that the proposed method compares very favorably with the least absolute shrinkage and selection operator (LASSO) and smoothly clipped absolute deviation (SCAD) penalized regression methods. The method is applied to a microarray gene expression data set.
Affiliation(s)
- Miaomiao Wang
- School of Chinese Materia Medica, Beijing University of Chinese Medicine, Beijing, China
- Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
- University of the Chinese Academy of Sciences, Beijing, China
- Xinyu Zhang
- Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
- Beijing Academy of Artificial Intelligence, Beijing, China
- Alan T K Wan
- Department of Management Sciences, City University of Hong Kong, Kowloon, Hong Kong
- Kang You
- School of Mathematical Sciences, Capital Normal University, Beijing, China
- Guohua Zou
- School of Mathematical Sciences, Capital Normal University, Beijing, China
8
Liu P, Li J, Kosorok MR. Change plane model averaging for subgroup identification. Stat Methods Med Res 2023; 32:773-788. [PMID: 36775991 DOI: 10.1177/09622802231154327]
Abstract
Central to personalized medicine and tailored therapies is discovering the subpopulations that account for treatment effect heterogeneity and are likely to benefit more from given interventions. In this article, we introduce a change plane model averaging method to identify subgroups characterized by linear combinations of predictive variables and multiple cut-offs. We first fit a sequence of statistical models, each incorporating the thresholding effect of one particular covariate. The estimation of the submodels is accomplished through an iterative integration of a change point detection method and numerical optimization algorithms. A frequentist model averaging approach is then employed to linearly combine the submodels with optimal weights. Our approach can handle high-dimensional settings with enormous numbers of potential grouping variables by adopting sparsity-inducing penalties. Simulation studies are conducted to investigate the prediction and subgrouping performance of the proposed method, with a comparison to various competing subgroup detection methods. Our method is applied to a dataset from a warfarin pharmacogenetics study, producing some new findings.
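A toy version of the ingredients: one thresholding submodel per candidate grouping variable, a grid-search change point per submodel, and an averaging step. The single-covariate cut-offs and the smoothed-AIC weighting rule are simplifying assumptions; the paper uses linear combinations of variables, change point detection with numerical optimization, and an optimal-weight criterion, none of which are reproduced here.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200
X = rng.normal(size=(n, 3))
# True subgroup defined by a cut-off on the first covariate (synthetic)
y = 0.5 + 2.0 * (X[:, 0] > 0.3) + rng.normal(0, 0.5, n)

def fit_threshold(x, y):
    """Grid-search a single-covariate change point; return (cutoff, sse, means)."""
    best = (None, np.inf, None)
    for c in np.quantile(x, np.linspace(0.1, 0.9, 41)):
        g = x > c
        mu0, mu1 = y[~g].mean(), y[g].mean()
        sse = ((y - np.where(g, mu1, mu0)) ** 2).sum()
        if sse < best[1]:
            best = (c, sse, (mu0, mu1))
    return best

# One thresholding submodel per candidate grouping variable
fits = [fit_threshold(X[:, j], y) for j in range(X.shape[1])]
preds = np.column_stack([np.where(X[:, j] > c, mu1, mu0)
                         for j, (c, _, (mu0, mu1)) in enumerate(fits)])

# Smoothed-AIC-style averaging weights from in-sample fit
sse = np.array([f[1] for f in fits])
aic = n * np.log(sse / n) + 2 * 3  # 3 parameters: two means and a cut-off
w = np.exp(-0.5 * (aic - aic.min()))
w /= w.sum()
y_hat = preds @ w  # model-averaged subgroup-aware prediction
```

The submodel built on the truly predictive covariate ends up dominating the weights, which is the behavior the averaging step relies on.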
Affiliation(s)
- Pan Liu
- Department of Statistics and Data Science, National University of Singapore, Singapore, Singapore
- Jialiang Li
- Department of Statistics and Data Science, National University of Singapore, Singapore, Singapore
- Duke University NUS Graduate Medical School, Singapore, Singapore
- Michael R Kosorok
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, USA
9
Friedrich S, Groll A, Ickstadt K, Kneib T, Pauly M, Rahnenführer J, Friede T. Regularization approaches in clinical biostatistics: A review of methods and their applications. Stat Methods Med Res 2023; 32:425-440. [PMID: 36384320 PMCID: PMC9896544 DOI: 10.1177/09622802221133557]
Abstract
A range of regularization approaches have been proposed in the data sciences to overcome overfitting, to exploit sparsity, or to improve prediction. Using a broad definition of regularization, namely controlling model complexity by adding information in order to solve ill-posed problems or to prevent overfitting, we review a range of approaches within this framework, including penalization, early stopping, ensembling, and model averaging. Aspects of their practical implementation, including available R packages, are discussed, and examples are provided. To assess the extent to which these approaches are used in medicine, we conducted a review of three general medical journals. It revealed that regularization approaches are rarely applied in practical clinical applications, with the exception of random effects models. Hence, we suggest a more frequent use of regularization approaches in medical research. In situations where other approaches also work well, the only downside of the regularization approaches is increased complexity in the conduct of the analyses, which can pose challenges in terms of computational resources and expertise on the side of the data analyst. In our view, both can and should be overcome by investments in appropriate computing facilities and educational resources.
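As a small illustration of penalization, one of the reviewed regularization families, the sketch below compares cross-validated R2 of unregularized least squares with ridge and lasso on a synthetic sparse problem; the data dimensions and penalty strengths are arbitrary choices.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(5)
n, p = 60, 40  # nearly as many parameters as observations: ill-posed territory
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:3] = [2.0, -1.5, 1.0]  # sparse truth: only 3 of 40 covariates matter
y = X @ beta + rng.normal(0, 1.0, n)

# Penalization controls model complexity by adding information (the penalty)
ols = cross_val_score(LinearRegression(), X, y, cv=5, scoring="r2").mean()
ridge = cross_val_score(Ridge(alpha=5.0), X, y, cv=5, scoring="r2").mean()
lasso = cross_val_score(Lasso(alpha=0.1), X, y, cv=5, scoring="r2").mean()
```

Unpenalized least squares overfits badly here, while the penalized fits generalize, which is precisely the ill-posed-problem scenario the review's definition of regularization targets.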
Affiliation(s)
- Sarah Friedrich
- Institute of Mathematics, University of Augsburg, Augsburg, Germany
- Centre for Advanced Analytics and Predictive Sciences, University of Augsburg, Augsburg, Germany
- Andreas Groll
- Department of Statistics, TU Dortmund University, Dortmund, Germany
- Katja Ickstadt
- Department of Statistics, TU Dortmund University, Dortmund, Germany
- Thomas Kneib
- Chair of Statistics and Campus Institute Data Science, Georg-August-University Göttingen, Göttingen, Germany
- Markus Pauly
- Department of Statistics, TU Dortmund University, Dortmund, Germany
- Tim Friede
- Department of Medical Statistics, University Medical Center Göttingen, Göttingen, Germany
- DZHK (German Center for Cardiovascular Research), partner site Göttingen, Göttingen, Germany
10
Chasseloup E, Karlsson MO. Comparison of Seven Non-Linear Mixed Effect Model-Based Approaches to Test for Treatment Effect. Pharmaceutics 2023; 15:460. [PMID: 36839782 PMCID: PMC9959233 DOI: 10.3390/pharmaceutics15020460]
Abstract
Analyses of longitudinal data with non-linear mixed-effects models (NLMEM) are typically associated with high power, but sometimes at the cost of an inflated type I error. Approaches to overcome this problem were published recently, such as model averaging across drug models (MAD), individual model averaging (IMA), and the combined likelihood ratio test (cLRT). This work aimed to assess seven NLMEM approaches in the same framework: treatment effect assessment in balanced two-armed designs using real natural history data with or without the addition of a simulated treatment effect. The approaches are MAD, IMA, cLRT, standard model selection (STDs), structural similarity selection (SSs), randomized cLRT (rcLRT), and model averaging across placebo and drug models (MAPD). The assessment included type I error, using Alzheimer's Disease Assessment Scale-cognitive (ADAS-cog) scores from 817 untreated patients, and power and accuracy in the treatment effect estimates after the addition of simulated treatment effects. The model selection and averaging among a set of pre-selected candidate models were driven by the Akaike information criterion (AIC). The type I error rate was controlled only for IMA and rcLRT; the inflation observed otherwise was explained by placebo model misspecification and selection bias. Both IMA and rcLRT had reasonable power and accuracy except under a low typical treatment effect.
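AIC-driven model averaging, as used here to combine pre-selected candidate models, typically forms Akaike weights from AIC differences; the sketch below shows only that generic mechanics, with invented numbers rather than anything from the study.

```python
import numpy as np

# Candidate-model AIC values (illustrative numbers, not from the paper)
aic = np.array([1012.4, 1010.1, 1015.8, 1010.9])

# Akaike weights: w_k proportional to exp(-delta_k / 2),
# where delta_k is each model's AIC difference from the best model
delta = aic - aic.min()
w = np.exp(-0.5 * delta)
w /= w.sum()

# Model-averaged treatment effect from per-model estimates (made-up values)
effects = np.array([0.42, 0.55, 0.31, 0.50])
effect_avg = float(w @ effects)
```

Because the weights decay exponentially in the AIC gap, models within about 2 AIC units of the best still contribute noticeably, while clearly worse models are effectively discarded.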
11
Greenberg I, Vohland M, Seidel M, Hutengs C, Bezard R, Ludwig B. Evaluation of Mid-Infrared and X-ray Fluorescence Data Fusion Approaches for Prediction of Soil Properties at the Field Scale. Sensors (Basel) 2023; 23:662. [PMID: 36679480 PMCID: PMC9861566 DOI: 10.3390/s23020662]
Abstract
Previous studies investigating multi-sensor fusion for the collection of soil information have shown variable improvements, and the underlying prediction mechanisms are not sufficiently understood for spectrally active and inactive properties. Our objective was to study the prediction mechanisms and benefits of model fusion by measuring mid-infrared (MIR) and X-ray fluorescence (XRF) spectra, texture, total and labile organic carbon (OC) and nitrogen (N) content, pH, and cation exchange capacity (CEC) for n = 117 soils from an arable field in Germany. Partial least squares regression models underwent a three-fold training/testing procedure using MIR spectra or elemental concentrations derived from XRF spectra. Additionally, two sequential hybrid and two high-level fusion approaches were tested. For the studied field, MIR was superior for organic properties (ratio of prediction to interquartile distance of validation (RPIQV) for total OC = 7.7 and N = 5.0), while XRF was superior for inorganic properties (RPIQV for clay = 3.4, silt = 3.0, and sand = 1.8). Even the optimal fusion approach brought little to no accuracy improvement for these properties. The high XRF accuracy for clay and silt is explained by the large number of elements with variable importance in the projection scores >1 (Fe ≈ Ni > Si ≈ Al ≈ Mg > Mn ≈ K ≈ Pb (clay only) ≈ Cr) with strong Spearman correlations (±0.57 < rs < ±0.90) with clay and silt. For spectrally inactive properties relying on indirect prediction mechanisms, the relative improvements from the optimal fusion approach compared to the best single spectrometer were marginal for pH (3.2% increase in RPIQV versus MIR alone) but more pronounced for labile OC (9.3% versus MIR) and CEC (12% versus XRF). Dominance of a suboptimal spectrometer in a fusion approach worsened performance compared to the best single spectrometer. Granger-Ramanathan averaging, which weights predictions according to their accuracy in training, is therefore recommended as a robust approach to capturing the potential benefits of multiple sensors.
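Granger-Ramanathan averaging in its basic form regresses the outcome on the candidate models' training predictions and uses the least-squares coefficients as combination weights. The sketch below uses synthetic stand-ins for "MIR" and "XRF" predictions; variants of the method add an intercept or constrain the weights.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 100
y = rng.normal(size=n)
# Training-set predictions from two hypothetical single-sensor models:
# the "MIR" model is more accurate (smaller error) than the "XRF" model
pred_mir = y + rng.normal(0, 0.3, n)
pred_xrf = y + rng.normal(0, 0.6, n)

# Granger-Ramanathan averaging: least-squares weights for combining the
# predictions, fit on training data (no intercept, no sum-to-one constraint)
P = np.column_stack([pred_mir, pred_xrf])
w, *_ = np.linalg.lstsq(P, y, rcond=None)
combined = P @ w
```

Because the weights are fitted to training accuracy, the more accurate sensor automatically receives the larger weight, which is why a dominant suboptimal spectrometer cannot drag the combination below the better single model on the training data.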
Affiliation(s)
- Isabel Greenberg
- Department of Environmental Chemistry, University of Kassel, 37213 Witzenhausen, Germany
- Michael Vohland
- Geoinformatics and Remote Sensing, Institute for Geography, Leipzig University, 04103 Leipzig, Germany
- Remote Sensing Centre for Earth System Research, Leipzig University, 04103 Leipzig, Germany
- German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, 04103 Leipzig, Germany
- Michael Seidel
- Geoinformatics and Remote Sensing, Institute for Geography, Leipzig University, 04103 Leipzig, Germany
- Remote Sensing Centre for Earth System Research, Leipzig University, 04103 Leipzig, Germany
- Christopher Hutengs
- Geoinformatics and Remote Sensing, Institute for Geography, Leipzig University, 04103 Leipzig, Germany
- Remote Sensing Centre for Earth System Research, Leipzig University, 04103 Leipzig, Germany
- German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, 04103 Leipzig, Germany
- Rachel Bezard
- Department of Geochemistry and Isotope Geology, University of Göttingen, Goldschmidtstrasse 1, 37077 Göttingen, Germany
- Bernard Ludwig
- Department of Environmental Chemistry, University of Kassel, 37213 Witzenhausen, Germany
12
De Wael A, De Backer A, Yu CP, Sentürk DG, Lobato I, Faes C, Van Aert S. Three Approaches for Representing the Statistical Uncertainty on Atom-Counting Results in Quantitative ADF STEM. Microsc Microanal 2022; 29:1-9. [PMID: 36117265 DOI: 10.1017/s1431927622012284]
Abstract
A decade ago, a statistics-based method was introduced to count the number of atoms from annular dark-field scanning transmission electron microscopy (ADF STEM) images. In the past years, this method was successfully applied to nanocrystals of arbitrary shape, size, and composition, and its high accuracy and precision have been demonstrated. However, the counting results obtained from this statistical framework have so far been presented without a visualization of the actual uncertainty about the estimate. In this paper, we present three approaches that can be used to represent counting results together with their statistical error, and we discuss which approach is most suited for further use based on simulations and an experimental ADF STEM image.
Affiliation(s)
- Annelies De Wael
- EMAT, University of Antwerp, Antwerp, Belgium
- NANOlab Center of Excellence, University of Antwerp, Antwerp, Belgium
- Annick De Backer
- EMAT, University of Antwerp, Antwerp, Belgium
- NANOlab Center of Excellence, University of Antwerp, Antwerp, Belgium
- Chu-Ping Yu
- EMAT, University of Antwerp, Antwerp, Belgium
- NANOlab Center of Excellence, University of Antwerp, Antwerp, Belgium
- Duygu Gizem Sentürk
- EMAT, University of Antwerp, Antwerp, Belgium
- NANOlab Center of Excellence, University of Antwerp, Antwerp, Belgium
- Ivan Lobato
- EMAT, University of Antwerp, Antwerp, Belgium
- NANOlab Center of Excellence, University of Antwerp, Antwerp, Belgium
- Christel Faes
- I-BioStat, Data Science Institute, Hasselt University, Hasselt, Belgium
- Sandra Van Aert
- EMAT, University of Antwerp, Antwerp, Belgium
- NANOlab Center of Excellence, University of Antwerp, Antwerp, Belgium
13
Wali K, Khan HA, Farrell M, Henten EJV, Meers E. Determination of Bio-Based Fertilizer Composition Using Combined NIR and MIR Spectroscopy: A Model Averaging Approach. Sensors (Basel) 2022; 22:5919. [PMID: 35957475 PMCID: PMC9371422 DOI: 10.3390/s22155919]
Abstract
Application of bio-based fertilizers is considered a practical solution to enhance soil fertility and maintain soil quality. However, the composition of bio-based fertilizers needs to be quantified before their application to the soil. Non-destructive techniques such as near-infrared (NIR) and mid-infrared (MIR) spectroscopy are generally used to quantify the composition of bio-based fertilizers in a speedy and cost-effective manner. However, the prediction performance of these techniques needs to be quantified before deployment. With this motive, this study investigates the potential of these techniques to characterize a diverse set of bio-based fertilizers for 25 different properties, including nutrients, minerals, heavy metals, pH, and EC. A partial least squares model with wavelength selection is employed to estimate each property of interest. Then a model averaging approach is tested to examine whether combining the model outcomes of NIR with MIR could improve the prediction performance of these sensors. In total, 17 of the 25 properties could be predicted with good performance using the individual spectral methods. Combining the model outcomes of NIR with MIR resulted in an improvement, increasing the number of properties that could be predicted from 17 to 21. Most notably, the improvement in prediction performance was observed for Cd, Cr, Zn, Al, Ca, Fe, S, Cu, EC, and Na. It was concluded that the combined use of NIR and MIR spectral methods can be used to monitor the composition of a diverse set of bio-based fertilizers.
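The combining step described in this abstract, averaging the NIR and MIR model outcomes, can be sketched generically. The weighting scheme below (weights inversely proportional to each model's validation mean squared error) is an illustrative assumption, not necessarily the scheme used in the paper:

```python
import numpy as np

def average_models(preds, val_mses):
    """Combine predictions from several models (e.g. an NIR and an MIR
    PLS model) with weights inversely proportional to each model's
    validation mean squared error."""
    w = 1.0 / np.asarray(val_mses, dtype=float)
    w = w / w.sum()
    combined = sum(wi * np.asarray(p, dtype=float) for wi, p in zip(w, preds))
    return w, combined

# Hypothetical predictions for two samples from an NIR and an MIR model
w, combined = average_models(
    preds=[[10.0, 12.0], [14.0, 8.0]],
    val_mses=[1.0, 3.0],  # the NIR model validated better here
)
# w -> [0.75, 0.25]; combined -> [11.0, 11.0]
```

The better-validated sensor dominates the combination, which matches the intuition of model averaging: poorly predicting models are down-weighted rather than discarded outright.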
Affiliation(s)
- Khan Wali
- Farm Technology Group, Wageningen University & Research, 6708 PB Wageningen, The Netherlands
- Haris Ahmad Khan
- Farm Technology Group, Wageningen University & Research, 6708 PB Wageningen, The Netherlands
- Mark Farrell
- CSIRO Agriculture and Food, Kaurna Country, Locked Bag 2, Glen Osmond, SA 5064, Australia
- Eldert J. Van Henten
- Farm Technology Group, Wageningen University & Research, 6708 PB Wageningen, The Netherlands
- Erik Meers
- Department of Green Chemistry and Technology, University of Gent, 9820 Merelbeke, Belgium
14
Martínez-Huertas JÁ, Olmos R, Ferrer E. Model Selection and Model Averaging for Mixed-Effects Models with Crossed Random Effects for Subjects and Items. Multivariate Behav Res 2022; 57:603-619. [PMID: 33635157 DOI: 10.1080/00273171.2021.1889946] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]
Abstract
A good deal of experimental research is characterized by the presence of random effects for subjects and items. A standard modeling approach that includes such sources of variability is the mixed-effects model (MEM) with crossed random effects. However, under-parameterizing or over-parameterizing the random structure of MEMs biases the estimation of the standard errors (SEs) of fixed effects. In this simulation study, we examined two different but complementary perspectives: model selection with likelihood-ratio tests, AIC, and BIC; and model averaging with Akaike weights. Results showed that true model selection was constant across the different strategies examined (including ML and REML estimators). However, sample size and the variance of random slopes were found to explain true model selection and the SE bias of fixed effects. No relevant differences in SE bias were found between model selection and model averaging. Sample size and the variance of random slopes interacted with the estimator to explain SE bias. Only the within-subjects effect showed significant underestimation of SEs with a smaller number of items and larger item random slopes. SE bias was higher for ML than for REML, but the variability of SE bias showed the opposite pattern. Such variability can translate into high rates of unacceptable bias across replications.
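The Akaike weights used for model averaging in this study follow a standard formula: each candidate model's AIC difference from the best model is converted into a relative likelihood, then normalized. A minimal sketch (illustrative code, not from the paper), assuming the AIC values are already computed:

```python
import math

def akaike_weights(aics):
    """Convert a list of AIC values into Akaike model weights:
    w_i = exp(-0.5 * delta_i) / sum_j exp(-0.5 * delta_j),
    where delta_i = AIC_i - min(AIC)."""
    best = min(aics)
    rel = [math.exp(-0.5 * (a - best)) for a in aics]
    total = sum(rel)
    return [r / total for r in rel]

# Hypothetical AICs for three candidate random-effects structures
weights = akaike_weights([100.0, 102.0, 110.0])
```

The weights sum to one and can be used directly to average fixed-effect estimates across the candidate MEMs.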
Affiliation(s)
- Ricardo Olmos
- Department of Psychology, Universidad Autónoma de Madrid
- Emilio Ferrer
- Department of Psychology, University of California, Davis
15
Vervaart M, Strong M, Claxton KP, Welton NJ, Wisløff T, Aas E. An Efficient Method for Computing Expected Value of Sample Information for Survival Data from an Ongoing Trial. Med Decis Making 2022; 42:612-625. [PMID: 34967237 PMCID: PMC9189722 DOI: 10.1177/0272989x211068019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/09/2021] [Accepted: 11/30/2021] [Indexed: 11/15/2022]
Abstract
BACKGROUND Decisions about new health technologies are increasingly being made while trials are still in an early stage, which may result in substantial uncertainty around key decision drivers such as estimates of life expectancy and time to disease progression. Additional data collection can reduce uncertainty, and its value can be quantified by computing the expected value of sample information (EVSI), which has typically been described in the context of designing a future trial. In this article, we develop new methods for computing the EVSI of extending an existing trial's follow-up, first for an assumed survival model and then extending to capture uncertainty about the true survival model. METHODS We developed a nested Markov Chain Monte Carlo procedure and a nonparametric regression-based method. We compared the methods by computing single-model and model-averaged EVSI for collecting additional follow-up data in 2 synthetic case studies. RESULTS There was good agreement between the 2 methods. The regression-based method was fast and straightforward to implement, and scales easily to include any number of candidate survival models in the model uncertainty case. The nested Monte Carlo procedure, on the other hand, was extremely computationally demanding when we included model uncertainty. CONCLUSIONS We present a straightforward regression-based method for computing the EVSI of extending an existing trial's follow-up, both where a single known survival model is assumed and where we are uncertain about the true survival model. EVSI for ongoing trials can help decision makers determine whether early patient access to a new technology can be justified on the basis of the current evidence or whether more mature evidence is needed. 
HIGHLIGHTS Decisions about new health technologies are increasingly being made while trials are still in an early stage, which may result in substantial uncertainty around key decision drivers such as estimates of life expectancy and time to disease progression. Additional data collection can reduce uncertainty, and its value can be quantified by computing the expected value of sample information (EVSI), which has typically been described in the context of designing a future trial. In this article, we have developed new methods for computing the EVSI of extending a trial's follow-up, both where a single known survival model is assumed and where we are uncertain about the true survival model. We extend a previously described nonparametric regression-based method for computing EVSI, which we demonstrate in synthetic case studies is fast, straightforward to implement, and scales easily to include any number of candidate survival models in the EVSI calculations. The EVSI methods that we present in this article can quantify the need for collecting additional follow-up data before making an adoption decision given any decision-making context.
Affiliation(s)
- Mathyn Vervaart
- Department of Health Management and Health Economics, University of Oslo, Oslo, Norway
- Norwegian Medicines Agency, Oslo, Norway
- Mark Strong
- School of Health and Related Research, University of Sheffield, Sheffield, UK
- Karl P. Claxton
- Centre for Health Economics, University of York, York, UK
- Department of Economics and Related Studies, University of York, York, UK
- Nicky J. Welton
- Population Health Sciences, University of Bristol, Bristol, UK
- Torbjørn Wisløff
- Department of Community Medicine, UiT The Arctic University of Norway, Oslo, Norway
- Norwegian Institute of Public Health, Oslo, Norway
- Eline Aas
- Department of Health Management and Health Economics, University of Oslo, Oslo, Norway
16
Yu Q, Zhou Y, Li H, Jiang X. Reliability analysis of motorcycle crash severity outcomes: Consideration of model selection uncertainty. Traffic Inj Prev 2022; 23:377-383. [PMID: 35709312 DOI: 10.1080/15389588.2022.2086979] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/08/2021] [Revised: 06/03/2022] [Accepted: 06/03/2022] [Indexed: 06/15/2023]
Abstract
OBJECTIVE While a large amount of work has been conducted on different types of crash injury severity models, model selection uncertainty remains a critical issue in traffic safety research. The objective of this study is to handle model selection uncertainty by combining multiple models. METHODS Motorcycle crashes in Michigan from 2010 to 2014 are collected for the analysis. A model averaging approach is used to integrate useful information from three commonly used crash injury severity models (multinomial logit, ordered logit, and ordered probit) to deal with situations where model selection uncertainty exists in crash data analysis. The ratios of model posterior probabilities between models are used to quantify the model selection uncertainty. In addition, the effectiveness of the method is illustrated by comparing it with the single-best model. RESULTS The ratios of model posterior probabilities among the models are approximately 1, meaning that the three models carry the same weight in the statistical analysis of motorcycle injury severity and that model selection uncertainty is present. The comparison between the model averaging approach and the single-best model shows that the single-best model tends to overestimate the effects of risk factors on motorcycle injury severities because it ignores model selection uncertainty; the parameter errors and confidence intervals of model averaging are greater and wider than those of the single-best model because between-model uncertainty is included in the model averaging; and some risk factors are significant in the model averaging approach but not in the single-best model. Results from the model averaging approach reveal that riding drunk or under the influence, angle/sideswipe/head-on crashes, a speed limit of 35 mph or higher, and signal control play significant roles in motorcycle crashes.
CONCLUSIONS The study contributes to the existing crash injury severity literature by developing a model averaging approach to explore the relationship between motorcyclists' injury severity and its contributing factors. The model averaging approach overcomes the limitations of current crash injury severity modeling approaches by (1) revealing the potential model selection uncertainty among injury severity models via model posterior probabilities; (2) more reliably accounting for the effects of risk factors on motorcyclists' injury severities by integrating information from all candidate models; and (3) better presenting the underlying unreliability of the analysis results from each individual model.
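The "ratios of model posterior probabilities" used above can be approximated in a standard way from information criteria. The sketch below uses the common BIC approximation with equal prior model probabilities; this is an illustrative assumption, not the paper's exact derivation:

```python
import math

def bic_posterior_probs(bics):
    """Approximate posterior model probabilities from BIC values,
    assuming equal prior probabilities:
    P(M_i | data) ~ exp(-0.5 * (BIC_i - min BIC)), normalized."""
    best = min(bics)
    rel = [math.exp(-0.5 * (b - best)) for b in bics]
    total = sum(rel)
    return [r / total for r in rel]

# Hypothetical BICs for three severity models that fit almost equally
# well: the posterior-probability ratios are all close to 1, which is
# exactly the signature of model selection uncertainty.
probs = bic_posterior_probs([2400.1, 2400.3, 2400.2])
ratios = [probs[0] / p for p in probs]
```

When the ratios are near 1, no single model dominates, and averaging over the candidates (weighted by these probabilities) is the natural way to propagate between-model uncertainty into the reported standard errors.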
Affiliation(s)
- Qiong Yu
- School of Transportation and Logistics, Southwest Jiaotong University, Chengdu, China
- National Engineering Laboratory of Integrated Transportation Big Data Application Technology, Chengdu, China
- Yue Zhou
- School of Transportation and Logistics, Southwest Jiaotong University, Chengdu, China
- National Engineering Laboratory of Integrated Transportation Big Data Application Technology, Chengdu, China
- Haibo Li
- School of Economics & Management, Southwest Jiaotong University, Chengdu, China
- Xinguo Jiang
- School of Transportation and Logistics, Southwest Jiaotong University, Chengdu, China
- National Engineering Laboratory of Integrated Transportation Big Data Application Technology, Chengdu, China
- School of Transportation, Fujian University of Technology, Fuzhou, China
17
Sugahara S, Aomi I, Ueno M. Bayesian Network Model Averaging Classifiers by Subbagging. Entropy (Basel) 2022; 24:743. [PMID: 35626626 DOI: 10.3390/e24050743] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 04/10/2022] [Revised: 05/15/2022] [Accepted: 05/19/2022] [Indexed: 12/10/2022]
Abstract
When applied to classification problems, Bayesian networks are often used to infer a class variable given feature variables. Earlier reports have described that the classification accuracy of Bayesian network structures achieved by maximizing the marginal likelihood (ML) is lower than that achieved by maximizing the conditional log likelihood (CLL) of a class variable given the feature variables. Nevertheless, because ML has asymptotic consistency, the performance of Bayesian network structures achieved by maximizing ML is not necessarily worse than that achieved by maximizing CLL for large data. However, the error of learning structures by maximizing the ML becomes much larger for small sample sizes, and that large error degrades the classification accuracy. As a method to resolve this shortcoming, model averaging has been proposed to marginalize the class variable posterior over all structures. However, the posterior standard error of each structure in the model averaging becomes large as the sample size becomes small, which in turn degrades the classification accuracy. The main idea of this study is to improve the classification accuracy using subbagging, a modification of bagging that uses random sampling without replacement, to reduce the posterior standard error of each structure in model averaging. Moreover, to guarantee asymptotic consistency, we use the K-best method with the ML score. The experimentally obtained results demonstrate that our proposed method provides more accurate classification than earlier Bayesian network classifier (BNC) methods and other state-of-the-art ensemble methods.
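The subbagging idea itself is generic: fit the base learner on subsamples drawn without replacement and average the fitted predictors. The sketch below shows that skeleton with a deliberately trivial base learner; the paper's actual base learner (K-best Bayesian network structures scored by marginal likelihood) is not reproduced here:

```python
import numpy as np

def subbag(X, y, fit, n_bags=10, frac=0.6, seed=0):
    """Subbagging: fit the base learner on random subsamples drawn
    WITHOUT replacement, then average the fitted predictors' outputs."""
    rng = np.random.default_rng(seed)
    n, m = len(y), int(frac * len(y))
    models = [fit(X[idx], y[idx])
              for idx in (rng.choice(n, size=m, replace=False)
                          for _ in range(n_bags))]
    return lambda Xnew: np.mean([mdl(Xnew) for mdl in models], axis=0)

# Toy base learner: always predicts its subsample's mean class label
def fit_mean(X, y):
    mu = y.mean()
    return lambda Xnew: np.full(len(Xnew), mu)

X = np.arange(20.0).reshape(-1, 1)
y = (X[:, 0] > 9.5).astype(float)   # half zeros, half ones
predict = subbag(X, y, fit_mean)
p = predict(X[:2])                  # averaged class-probability estimates
```

Because each subsample omits a fraction of the data, the averaged predictor has a lower variance than any single fit, which is the mechanism the paper exploits to shrink the posterior standard error of each structure.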
18
Heus A, Uster DW, Grootaert V, Vermeulen N, Somers A, In't Veld DH, Wicha SG, De Cock PA. Model-informed precision dosing of vancomycin via continuous infusion: a clinical fit-for-purpose evaluation of published PK models. Int J Antimicrob Agents 2022; 59:106579. [PMID: 35341931 DOI: 10.1016/j.ijantimicag.2022.106579] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/27/2021] [Revised: 03/08/2022] [Accepted: 03/20/2022] [Indexed: 11/18/2022]
Abstract
BACKGROUND Model-informed precision dosing (MIPD) is an innovative approach used to guide bedside vancomycin dosing. The use of Bayesian software requires suitable and externally validated population pharmacokinetic (popPK) models. OBJECTIVES Therefore, we aimed to identify suitable popPK models for a priori prediction and a posteriori forecasting of vancomycin given by continuous infusion. Additionally, a model averaging approach (MAA) and a model selection approach (MSA) were compared with the identified popPK models. METHODS Clinical PK data were retrospectively collected from patients receiving continuous vancomycin therapy and admitted to a general ward of three large Belgian hospitals. The predictive performance of the popPK models identified in a systematic literature search, as well as of the MAA/MSA, was evaluated for the a priori and a posteriori scenarios using bias, root mean square errors, normalized prediction distribution errors, and visual predictive checks. RESULTS The predictive performance of 23 popPK models was evaluated based on clinical data from 169 patients and 923 therapeutic drug monitoring samples. Overall, the best predictive performance was found using the Okada model (bias < -0.1 mg/L), followed by the Colin model. The MAA/MSA predicted with consistently high precision and low inaccuracy and were clinically acceptable in the Bayesian forecasting. CONCLUSION This study identified the two-compartment models of Okada et al. and Colin et al. as most suitable for non-ICU patients to forecast individual exposure profiles after continuous vancomycin infusion. The MAA/MSA performed equally well as the individual popPK models. Both approaches could therefore be used in clinical practice to guide dosing decisions.
Affiliation(s)
- Astrid Heus
- Department of Pharmacy, Ghent University Hospital, Ghent, Belgium; Department of Pharmacy, General Hospital Sint-Jan Brugge-Oostende AV, Bruges, Belgium
- David W Uster
- Department of Clinical Pharmacy, Institute of Pharmacy, University of Hamburg, Hamburg, Germany
- Veerle Grootaert
- Department of Pharmacy, General Hospital Sint-Jan Brugge-Oostende AV, Bruges, Belgium
- Nele Vermeulen
- Department of Pharmacy, General hospital OLV Aalst, Aalst, Belgium
- Annemie Somers
- Department of Pharmacy, Ghent University Hospital, Ghent, Belgium
- Diana Huis In't Veld
- Department of Internal Medicine and Infectious Diseases, Ghent University Hospital, Ghent, Belgium
- Sebastian G Wicha
- Department of Clinical Pharmacy, Institute of Pharmacy, University of Hamburg, Hamburg, Germany
- Pieter A De Cock
- Department of Pharmacy, Ghent University Hospital, Ghent, Belgium; Department of Paediatric Intensive Care, Ghent University Hospital, Ghent, Belgium; Faculty of Medicine and Health Sciences, Department of Basic and Applied Medical Sciences, Ghent University, Ghent, Belgium
19
He B, Zhong T, Huang J, Liu Y, Zhang Q, Ma S. Histopathological imaging-based cancer heterogeneity analysis via penalized fusion with model averaging. Biometrics 2021; 77:1397-1408. [PMID: 32822084 PMCID: PMC9367644 DOI: 10.1111/biom.13357] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/06/2020] [Revised: 08/15/2020] [Accepted: 08/17/2020] [Indexed: 04/17/2024]
Abstract
Heterogeneity is a hallmark of cancer. For various cancer outcomes/phenotypes, supervised heterogeneity analysis has been conducted, leading to a deeper understanding of disease biology and customized clinical decisions. In the literature, such analysis has oftentimes been based on demographic, clinical, and omics measurements. Recent studies have shown that high-dimensional histopathological imaging features contain valuable information on cancer outcomes. However, comparatively, heterogeneity analysis based on imaging features has been very limited. In this article, we conduct supervised cancer heterogeneity analysis using histopathological imaging features. The penalized fusion technique, which has notable advantages, such as greater flexibility, over finite mixture modeling and other techniques, is adopted. A sparse penalization is further imposed to accommodate high dimensionality and select relevant imaging features. To improve computational feasibility and generate more reliable estimation, we employ model averaging. Computational and statistical properties of the proposed approach are carefully investigated. Simulation demonstrates its favorable performance. The analysis of The Cancer Genome Atlas (TCGA) data may provide a new way of defining/examining breast cancer heterogeneity.
Affiliation(s)
- Baihua He
- School of Mathematics and Statistics, Wuhan University, Wuhan, China
- Tingyan Zhong
- SJTU-Yale Joint Center for Biostatistics, Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
- Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut
- Jian Huang
- Department of Statistics and Actuarial Science, University of Iowa, Iowa City, Iowa
- Yanyan Liu
- School of Mathematics and Statistics, Wuhan University, Wuhan, China
- Qingzhao Zhang
- Department of Statistics, School of Economics, Key Laboratory of Econometrics, Ministry of Education, The Wang Yanan Institute for Studies in Economics, Xiamen University, Xiamen, China
- Shuangge Ma
- Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut
20
Kellogg M, Mogstad M, Pouliot GA, Torgovitsky A. Combining Matching and Synthetic Control to Trade off Biases from Extrapolation and Interpolation. J Am Stat Assoc 2021; 116:1804-1816. [PMID: 35706442 PMCID: PMC9197080 DOI: 10.1080/01621459.2021.1979562] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/15/2019] [Revised: 08/30/2021] [Accepted: 08/30/2021] [Indexed: 10/20/2022]
Abstract
The synthetic control (SC) method is widely used in comparative case studies to adjust for differences in pre-treatment characteristics. SC limits extrapolation bias at the potential expense of interpolation bias, whereas traditional matching estimators have the opposite properties. This complementarity motivates us to propose a matching and synthetic control (MASC) estimator: a model averaging estimator that combines the standard SC and matching estimators. We show how to use a rolling-origin cross-validation procedure to train the MASC to resolve trade-offs between interpolation and extrapolation bias. We use a series of empirically based placebo and Monte Carlo simulations to shed light on when the SC, matching, MASC, and penalized SC estimators do (and do not) perform well. We then apply these estimators to examine the economic costs of conflicts in the context of Spain.
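The core of the MASC estimator is a convex combination of the matching and SC counterfactual predictions, with the mixing weight tuned on pre-treatment fit. The sketch below is a deliberate simplification: it tunes the weight on plain pre-period fit, whereas the paper uses rolling-origin cross-validation, and the inputs are assumed to be precomputed matching/SC predictions:

```python
import numpy as np

def masc_combine(y_pre, match_pre, sc_pre, match_post, sc_post):
    """Schematic MASC: choose the convex weight on the matching
    estimator by minimizing pre-treatment prediction error over a grid,
    then combine the post-treatment counterfactual predictions."""
    grid = np.linspace(0.0, 1.0, 101)
    errs = [np.mean((y_pre - (lam * match_pre + (1 - lam) * sc_pre)) ** 2)
            for lam in grid]
    lam = grid[int(np.argmin(errs))]
    return lam, lam * match_post + (1 - lam) * sc_post

# Synthetic check: matching reproduces the pre-period exactly,
# so it should receive all of the weight
y_pre = np.array([1.0, 2.0, 3.0])
lam, counterfactual = masc_combine(y_pre, y_pre.copy(), y_pre + 1.0, 5.0, 7.0)
```

Interpolation-heavy matching and extrapolation-limiting SC thus trade off through a single data-driven weight, which is the "model averaging estimator" framing used in the abstract.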
Affiliation(s)
- Magne Mogstad
- Kenneth C. Griffin Department of Economics, University of Chicago; Statistics Norway; NBER
21
Wang SH, Satapathy SC, Anderson D, Chen SX, Zhang YD. Deep Fractional Max Pooling Neural Network for COVID-19 Recognition. Front Public Health 2021; 9:726144. [PMID: 34447739 PMCID: PMC8383320 DOI: 10.3389/fpubh.2021.726144] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/16/2021] [Accepted: 07/09/2021] [Indexed: 01/19/2023] Open
Abstract
Aim: Coronavirus disease 2019 (COVID-19) is a disease triggered by a new strain of coronavirus. This paper proposes a novel model termed "deep fractional max pooling neural network" (DFMPNN) to diagnose COVID-19 more efficiently. Methods: This 12-layer DFMPNN replaces the max pooling (MP) and average pooling (AP) of ordinary neural networks with a pooling method called "fractional max-pooling" (FMP). In addition, multiple-way data augmentation (DA) is employed to reduce overfitting, and model averaging (MA) is used to reduce randomness. Results: We ran our algorithm on a four-category dataset that contained COVID-19, community-acquired pneumonia, secondary pulmonary tuberculosis (SPT), and healthy control (HC) images. Over 10 runs on the test set, the micro-averaged F1 (MAF) score of our DFMPNN is 95.88%. Discussion: The proposed DFMPNN is superior to 10 state-of-the-art models. Besides, FMP outperforms traditional MP, AP, and L2-norm pooling (L2P).
Affiliation(s)
- Shui-Hua Wang
- School of Mathematics and Actuarial Science, University of Leicester, Leicester, United Kingdom
- Donovan Anderson
- School of Mathematics and Actuarial Science, University of Leicester, Leicester, United Kingdom
- Shi-Xin Chen
- Nursing Department, The Fourth People's Hospital of Huai'an, Huai'an, China
- Yu-Dong Zhang
- School of Informatics, University of Leicester, Leicester, United Kingdom
22
Piepho HP, Williams ER. Regression models for order-of-addition experiments. Biom J 2021; 63:1673-1687. [PMID: 34327728 DOI: 10.1002/bimj.202100048] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/11/2021] [Revised: 06/21/2021] [Accepted: 06/27/2021] [Indexed: 11/06/2022]
Abstract
The purpose of order-of-addition (OofA) experiments is to identify the best order in a sequence of m components in a system. Such experiments may be analyzed by various regression models, the most popular ones being based on pairwise-ordering (PWO) factors or on component-position (CP) factors. This paper reviews these models and their extensions and proposes a new class of models based on response surface (RS) regression using component position numbers as predictor variables. Using two published examples, it is shown that RS models can be quite competitive. In the case of model uncertainty, we advocate the use of model averaging for analysis. The averaging idea leads naturally to a design approach based on a compound optimality criterion that assigns weights to each candidate model.
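The PWO factors mentioned in the abstract have a simple construction: for every unordered pair of components (i, j) with i < j, the predictor is +1 if i is added before j in a given run and -1 otherwise. A minimal sketch of that encoding (illustrative code, not from the paper):

```python
from itertools import combinations

def pwo_factors(order):
    """Pairwise-ordering (PWO) predictors for one run of an
    order-of-addition experiment: z_ij = +1 if component i is added
    before component j, else -1, for every pair i < j."""
    pos = {c: k for k, c in enumerate(order)}   # component -> position
    comps = sorted(order)
    return {(i, j): (1 if pos[i] < pos[j] else -1)
            for i, j in combinations(comps, 2)}

# Run in which component 2 is added first, then 0, then 1
z = pwo_factors([2, 0, 1])
```

Stacking these m(m-1)/2 predictors across runs yields the design matrix of the PWO regression model; the CP and RS models reviewed in the paper use component position numbers instead.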
Affiliation(s)
- Hans-Peter Piepho
- Biostatistics Unit, Institute of Crop Science, University of Hohenheim, Stuttgart, Germany
- Emlyn R Williams
- Statistical Consulting Unit, Australian National University, Canberra, ACT 2600, Australia
23
Okuzaki Y. Effects of body size divergence on male mating tactics in the ground beetle Carabus japonicus. Evolution 2021; 75:2269-2285. [PMID: 34231214 DOI: 10.1111/evo.14302] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/31/2020] [Revised: 05/14/2021] [Accepted: 05/18/2021] [Indexed: 11/27/2022]
Abstract
Animal body size is involved in reproduction in various ways. Carabus japonicus exhibits considerable variation in adult body size across geographical locations depending on the larval environment. To investigate the effects of body size divergence on male mating traits, spermatophore deposition and weight, copulation duration, and post-copulatory mounting were observed using male-female pairs from C. japonicus populations with different body sizes. Then, variables with high predictive power on the mating traits were identified from individual characteristics. When the male was slightly smaller than his mate, spermatophore deposition likely succeeded, suggesting that mechanical size-assortative insemination determined male body size. Although male reproductive organ size was positively correlated with male body size, spermatophore weight was not significantly affected by male body size, whereas copulation duration decreased with increasing male body size. Enlarged males, with a high capacity for spermatophore production, could increase paternity by decreasing copulation duration and increasing mating frequency. Such shifts in mating tactics would alter selection pressures of intra- and intersexual interactions (e.g., sperm competition and sexual conflict). Genital dimensions also affected mating traits other than copulatory duration. Thus, ecological heterogeneity has the potential to lead to divergences in sexual traits, such as genital morphology, through body size divergence.
Affiliation(s)
- Yutaka Okuzaki
- Department of General Systems Studies, Graduate School of Arts and Sciences, The University of Tokyo, Meguro, Tokyo, Japan
24
Abstract
The modified Cholesky decomposition (MCD) is a powerful tool for estimating a covariance matrix. Regularization can be conveniently imposed on the associated linear regressions to encourage sparsity in the estimated covariance matrix and thereby accommodate high-dimensional data. In this paper, we propose a Cholesky-based sparse ensemble estimate of the covariance matrix by averaging a set of Cholesky factor estimates obtained from multiple variable orderings used in the MCD. The sparse estimation is enabled by encouraging sparsity in the Cholesky factor. The theoretical consistency property is established under some regularity conditions. The merits of the proposed method are illustrated through simulation and a maize gene data set.
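The ordering-averaging idea can be sketched schematically: estimate a sparse Cholesky-based covariance under each random variable ordering, map each estimate back to the original ordering, and average. The sketch below substitutes simple soft-thresholding of the Cholesky factor for the authors' penalized MCD regressions, so it illustrates only the ensemble structure, not the paper's estimator:

```python
import numpy as np

def mcd_estimate(X, order, thresh=0.1):
    """One ordering: Cholesky factor of the (jittered) sample covariance
    in that ordering, soft-thresholded off the diagonal for sparsity,
    rebuilt and permuted back to the original variable order."""
    Xp = X[:, order]
    S = np.cov(Xp, rowvar=False) + 1e-6 * np.eye(X.shape[1])
    L = np.linalg.cholesky(S)
    off = L - np.diag(np.diag(L))
    off = np.sign(off) * np.maximum(np.abs(off) - thresh, 0.0)
    Ls = off + np.diag(np.diag(L))
    Sig = Ls @ Ls.T                  # sparse-factor covariance estimate
    inv = np.argsort(order)          # undo the permutation
    return Sig[np.ix_(inv, inv)]

def ensemble_cov(X, n_orders=20, seed=0):
    """Average the per-ordering estimates over random orderings."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    return np.mean([mcd_estimate(X, rng.permutation(p))
                    for _ in range(n_orders)], axis=0)

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
C = ensemble_cov(X, n_orders=5)
```

Averaging removes the arbitrary dependence of a single MCD fit on the chosen variable ordering, and each summand is positive semi-definite by construction, so the ensemble estimate is as well.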
Affiliation(s)
- Chunshi Li
- The Third People's Hospital of Dalian, Dalian, China
- Mo Yang
- School of Finance, Dongbei University of Finance and Economics, Dalian, China
- Mingqiu Wang
- School of Statistics, Qufu Normal University, Qufu, China
- Hong Kang
- The Third People's Hospital of Dalian, Dalian, China
- Xiaoning Kang
- International Business College and Institute of Supply Chain Analytics, Dongbei University of Finance and Economics, Dalian, China
25
Buatois S, Ueckert S, Frey N, Retout S, Mentré F. cLRT-Mod: An efficient methodology for pharmacometric model-based analysis of longitudinal phase II dose finding studies under model uncertainty. Stat Med 2021; 40:2435-2451. [PMID: 33650148 DOI: 10.1002/sim.8913] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/16/2019] [Revised: 12/14/2020] [Accepted: 02/01/2021] [Indexed: 11/07/2022]
Abstract
Within the challenging context of phase II dose-finding trials, longitudinal analyses may increase drug effect detection power compared to an end-of-treatment (EOT) analysis. This work proposes cLRT-Mod, a pharmacometric adaptation of the MCP-Mod methodology, which allows the use of nonlinear mixed effect models to first detect a dose-response signal and then identify the doses for the confirmatory phase while accounting for model structure uncertainty. The method was evaluated through extensive clinical trial simulations of a hypothetical phase II dose-finding trial using different scenarios and comparing different methods such as MCP-Mod. The results show an increase in power using cLRT with longitudinal data compared to an EOT multiple contrast test for scenarios with small sample sizes and weak drug effects, while maintaining pre-specifiability of the models prior to data analysis and the nominal type I error. This work shows how model averaging provides better coverage probability of the drug effect in the prediction step and avoids underestimation of the size of the confidence interval. Finally, for illustration purposes, cLRT-Mod was applied to the analysis of a real phase II dose-finding trial.
Affiliation(s)
- Simon Buatois
- IAME, UMR 1137, INSERM, University Paris Diderot, Paris, France.,Roche Pharma Research and Early Development, Pharmaceutical Sciences, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd, Basel, Switzerland
| | - Sebastian Ueckert
- Department of Pharmaceutical Biosciences, Uppsala University, Uppsala, Sweden
- Nicolas Frey
- Roche Pharma Research and Early Development, Pharmaceutical Sciences, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd, Basel, Switzerland
- Sylvie Retout
- Roche Pharma Research and Early Development, Pharmaceutical Sciences, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd, Basel, Switzerland
- France Mentré
- IAME, UMR 1137, INSERM, University Paris Diderot, Paris, France
26
Zhu R, Zhang X, Ma Y, Zou G. Model averaging estimation for high-dimensional covariance matrices with a network structure. Econom J 2021; 24:177-197. PMID: 33746562. PMCID: PMC7946866. DOI: 10.1093/ectj/utaa030.
Abstract
In this paper, we develop a model averaging method to estimate a high-dimensional covariance matrix, where the candidate models are constructed by different orders of polynomial functions. We propose a Mallows-type model averaging criterion and select the weights by minimizing this criterion, which is an unbiased estimator of the expected in-sample squared error plus a constant. Then, we prove the asymptotic optimality of the resulting model average covariance estimators. Finally, we conduct numerical simulations and a case study on Chinese airport network structure data to demonstrate the usefulness of the proposed approaches.
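The weight-selection idea can be illustrated with a toy sketch: two candidate covariance estimators (full and diagonal, standing in for the paper's polynomial-order candidates) are averaged with a weight chosen to minimize a data-driven loss. The hold-out Frobenius loss below is a simple stand-in for the paper's Mallows-type unbiased risk estimate, and all data are synthetic.

```python
import random

random.seed(0)
# Simulate correlated 2-D data and split into fit/validation halves.
data = []
for _ in range(400):
    z1, z2 = random.gauss(0, 1), random.gauss(0, 1)
    data.append((z1, 0.6 * z1 + 0.8 * z2))  # true cov: [[1, 0.6], [0.6, 1]]
fit, val = data[:200], data[200:]

def sample_cov(xs):
    n = len(xs)
    mx = sum(x for x, _ in xs) / n
    my = sum(y for _, y in xs) / n
    cxx = sum((x - mx) ** 2 for x, _ in xs) / n
    cyy = sum((y - my) ** 2 for _, y in xs) / n
    cxy = sum((x - mx) * (y - my) for x, y in xs) / n
    return [[cxx, cxy], [cxy, cyy]]

S_full = sample_cov(fit)                             # candidate 1: full covariance
S_diag = [[S_full[0][0], 0.0], [0.0, S_full[1][1]]]  # candidate 2: diagonal
S_val = sample_cov(val)                              # held-out target

def loss(w):
    """Frobenius loss of w*S_full + (1-w)*S_diag against the held-out estimate."""
    return sum((w * S_full[i][j] + (1 - w) * S_diag[i][j] - S_val[i][j]) ** 2
               for i in range(2) for j in range(2))

w_hat = min((k / 100 for k in range(101)), key=loss)  # grid search over [0, 1]
S_avg = [[w_hat * S_full[i][j] + (1 - w_hat) * S_diag[i][j]
          for j in range(2)] for i in range(2)]
```

Because the simulated data are genuinely correlated, the weight should land near the full-covariance candidate.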
27
Wheeler MW, Westerhout J, Baumert JL, Remington BC. Bayesian Stacked Parametric Survival with Frailty Components and Interval-Censored Failure Times: An Application to Food Allergy Risk. Risk Anal 2021; 41:56-66. PMID: 33063372. PMCID: PMC7894991. DOI: 10.1111/risa.13585.
Abstract
To better understand the risk of exposure to food allergens, food challenge studies are designed to slowly increase the dose of an allergen delivered to allergic individuals until an objective reaction occurs. These dose-to-failure studies are used to determine acceptable intake levels and are analyzed using parametric failure time models. Though these models can provide estimates of the survival curve and risk, their parametric form may misrepresent the survival function for doses of interest. Different models that describe the data similarly may produce different dose-to-failure estimates. Motivated by predictive inference, we developed a Bayesian approach to combine survival estimates based on posterior predictive stacking, where the weights are formed to maximize posterior predictive accuracy. The approach defines a model space that is much larger than traditional parametric failure time modeling approaches. In our case, we use the approach to include random effects accounting for frailty components. The methodology is investigated in simulation, and is used to estimate allergic population eliciting doses for multiple food allergens.
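The stacking step can be sketched as choosing simplex weights that maximize a held-out log score of the mixture of candidate predictive densities. The two densities below (exponential and log-normal, with made-up parameters) are illustrative stand-ins for the paper's Bayesian parametric survival models with frailty components.

```python
import math

# Held-out dose-to-failure observations (invented for illustration).
obs = [0.8, 1.1, 1.4, 2.0, 2.3, 3.1]

def dens_exp(t, rate=0.7):              # candidate 1: exponential model
    return rate * math.exp(-rate * t)

def dens_lognorm(t, mu=0.4, sig=0.5):   # candidate 2: log-normal model
    return math.exp(-(math.log(t) - mu) ** 2 / (2 * sig ** 2)) / (
        t * sig * math.sqrt(2 * math.pi))

def log_score(w):
    """Held-out log score of the w-weighted mixture of the two candidates."""
    return sum(math.log(w * dens_exp(t) + (1 - w) * dens_lognorm(t))
               for t in obs)

# Grid search over the one-dimensional simplex for the stacking weight.
w_stack = max((w / 200 for w in range(201)), key=log_score)
```

By construction the stacked weight can never score worse than either single model on the held-out points.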
Affiliation(s)
- Matthew W Wheeler
- Biostatistics and Computational Biology Branch, National Institute of Environmental Health Sciences Research, Triangle Park, NC, USA
- Joost Westerhout
- The Netherlands Organization, Utrechtseweg, Zeist, 3704 HE, The Netherlands
- Joe L Baumert
- Department of Food Science and Technology, FARRP, University of Nebraska-Lincoln, Lincoln, NE, USA
28
Mehrotra DV, Marceau West R. Survival analysis using a 5-step stratified testing and amalgamation routine (5-STAR) in randomized clinical trials. Stat Med 2020; 39:4724-4744. PMID: 32954531. DOI: 10.1002/sim.8750.
Abstract
Randomized clinical trials are often designed to assess whether a test treatment prolongs survival relative to a control treatment. Increased patient heterogeneity, while desirable for generalizability of results, can weaken the ability of common statistical approaches to detect treatment differences, potentially hampering the regulatory approval of safe and efficacious therapies. A novel solution to this problem is proposed. A list of baseline covariates that have the potential to be prognostic for survival under either treatment is pre-specified in the analysis plan. At the analysis stage, using all observed survival times but blinded to patient-level treatment assignment, "noise" covariates are removed with elastic net Cox regression. The shortened covariate list is used by a conditional inference tree algorithm to segment the heterogeneous trial population into subpopulations of prognostically homogeneous patients (risk strata). After patient-level treatment unblinding, a treatment comparison is done within each formed risk stratum and stratum-level results are combined for overall statistical inference. The impressive power-boosting performance of our proposed 5-step stratified testing and amalgamation routine (5-STAR), relative to that of the logrank test and other common approaches that do not leverage inherently structured patient heterogeneity, is illustrated using a hypothetical and two real datasets along with simulation results. Furthermore, the importance of reporting stratum-level comparative treatment effects (time ratios from accelerated failure time model fits in conjunction with model averaging and, as needed, hazard ratios from Cox proportional hazard model fits) is highlighted as a potential enabler of personalized medicine. An R package is available at https://github.com/rmarceauwest/fiveSTAR.
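The final amalgamation step can be sketched as an inverse-variance combination of stratum-level effect estimates into one overall statistic; the estimates and standard errors below are made-up numbers, not output of 5-STAR itself, and the paper's actual combination details may differ.

```python
import math

# (log time-ratio estimate, standard error) per risk stratum -- invented values.
strata = [
    (0.35, 0.20),
    (0.10, 0.15),
    (0.25, 0.25),
]

# Inverse-variance weights: precise strata count for more.
weights = [1 / se ** 2 for _, se in strata]
est = sum(w * b for (b, _), w in zip(strata, weights)) / sum(weights)
se = math.sqrt(1 / sum(weights))   # standard error of the combined estimate
z = est / se                       # overall z-statistic for inference
```

The combined estimate necessarily lies between the smallest and largest stratum estimates, pulled toward the most precisely estimated stratum.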
Affiliation(s)
- Devan V Mehrotra
- Biostatistics and Research Decision Sciences, Merck & Co., Inc., North Wales, Pennsylvania, USA
- Rachel Marceau West
- Biostatistics and Research Decision Sciences, Merck & Co., Inc., North Wales, Pennsylvania, USA
29
van Havre Z, Maruff P, Villemagne VL, Mengersen K, Rousseau J, White N, Doecke JD. Identification of Pre-Clinical Alzheimer's Disease in a Population of Elderly Cognitively Normal Participants. J Alzheimers Dis 2020; 73:683-693. PMID: 31868673. DOI: 10.3233/jad-191095.
Abstract
Alzheimer's disease (AD) has a long pathological process, with an approximate lead-time of 20 years. During the early stages of the disease process, little evidence of the building pathology is identifiable without cerebrospinal fluid and/or imaging analyses. Clinical manifestations of AD do not present until irreversible pathological changes have occurred. Given an opportunity to provide treatment prior to irreversible pathological change, this study aims to identify a subgroup of cognitively normal (CN) participants from the Australian Imaging, Biomarker & Lifestyle Flagship Study of Ageing (AIBL), where subtle changes in cognition are indicative of early AD-related pathology. Using a Bayesian method for unsupervised clustering via mixture models, we define an aggregate measure of posterior probabilities (AMPP score) establishing the likelihood of pre-clinical AD. From Baseline through to 54 months, visuo-spatial function had the greatest contribution to the AMPP score, followed by attention and processing speed and visual memory. Participants with the highest AMPP scores had both increasing neo-cortical amyloid burden and decreasing hippocampus volume over 54 months, compared to those in the lowest category with stable amyloid burden and hippocampus volume. The identification of a possible pre-clinical stage in CN participants via this method, without the aid of disease specific biomarkers, represents an important step in utilizing the strength of cognitive composite scores for the early detection of AD pathology.
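The aggregation idea can be caricatured in one dimension: given a fitted two-component Gaussian mixture on a standardized cognitive score (component parameters assumed known here for simplicity), each visit's posterior probability of the "at-risk" component follows from Bayes' rule and is then averaged across visits. This is only a sketch of the posterior-probability aggregation, not the AIBL model or the actual AMPP construction.

```python
import math

def normal_pdf(x, mu, sigma):
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (
        sigma * math.sqrt(2 * math.pi))

def posterior_at_risk(x, p_risk=0.2, mu_risk=-1.0, mu_norm=0.0, sigma=0.8):
    """Bayes' rule: posterior probability that score x came from the
    'at-risk' mixture component (all parameters are illustrative)."""
    num = p_risk * normal_pdf(x, mu_risk, sigma)
    den = num + (1 - p_risk) * normal_pdf(x, mu_norm, sigma)
    return num / den

# Standardized scores for one participant over follow-up visits (invented).
visits = [-0.9, -1.2, -0.4, -1.1]
ampp_like = sum(posterior_at_risk(x) for x in visits) / len(visits)
```

Lower (worse) scores push the aggregated probability toward the at-risk component.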
Affiliation(s)
- Zoe van Havre
- ACEMS, Queensland University of Technology, Queensland, Australia; CEREMADE, Universite Paris Dauphine, Paris, France
- Paul Maruff
- Mental Health Research Institute, The University of Melbourne, Parkville, Victoria, Australia; CogState Ltd., Victoria, Australia
- Victor L Villemagne
- Florey Institute of Neuroscience and Mental Health, The University of Melbourne, Victoria, Australia; Department of Nuclear Medicine and Centre for PET, Austin Health, Heidelberg, Victoria, Australia
- Kerrie Mengersen
- ACEMS, Queensland University of Technology, Queensland, Australia
- Nicole White
- ACEMS, Queensland University of Technology, Queensland, Australia
- James D Doecke
- CSIRO Health and Biosecurity/Australian e-Health Research Centre, Herston, Queensland, Australia
30
Aerts M, Wheeler MW, Abrahantes JC. An extended and unified modeling framework for benchmark dose estimation for both continuous and binary data. Environmetrics 2020; 31:e2630. PMID: 36052215. PMCID: PMC9432821. DOI: 10.1002/env.2630.
Abstract
Protection and safety authorities recommend the use of model averaging to determine the benchmark dose approach as a scientifically more advanced method compared with the no-observed-adverse-effect-level approach for obtaining a reference point and deriving health-based guidance values. Model averaging however highly depends on the set of candidate dose-response models and such a set should be rich enough to ensure that a well-fitting model is included. The currently applied set of candidate models for continuous endpoints is typically limited to two models, the exponential and Hill model, and differs completely from the richer set of candidate models currently used for binary endpoints. The objective of this article is to propose a general and wide framework of dose response models, which can be applied both to continuous and binary endpoints and covers the current models for both type of endpoints. In combination with the bootstrap, this framework offers a unified approach to benchmark dose estimation. The methodology is illustrated using two data sets, one with a continuous and another with a binary endpoint.
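The benchmark-dose step common to any candidate model can be sketched for a binary endpoint: given a fitted dose-response curve, the BMD is the dose at which extra risk reaches the benchmark response (BMR), found here by bisection. The log-logistic model and all parameter values are illustrative assumptions, not the paper's framework, and the bootstrap step for the BMDL is omitted.

```python
import math

def p_response(d, background=0.05, beta=1.2, ed50=10.0):
    """Illustrative log-logistic dose-response with a background risk."""
    if d == 0:
        return background
    f = 1 / (1 + (ed50 / d) ** beta)
    return background + (1 - background) * f

def extra_risk(d):
    """Extra risk relative to background: (P(d) - P(0)) / (1 - P(0))."""
    p0 = p_response(0)
    return (p_response(d) - p0) / (1 - p0)

def bmd(bmr=0.10, lo=1e-6, hi=1000.0):
    """Invert extra_risk(d) = bmr by bisection (extra risk is increasing in d)."""
    for _ in range(200):
        mid = (lo + hi) / 2
        if extra_risk(mid) < bmr:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

bmd10 = bmd(0.10)   # benchmark dose at a 10% extra-risk BMR
```

A bootstrap BMDL would refit the curve to resampled data and take a lower percentile of the resulting BMD values.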
Affiliation(s)
- Marc Aerts
- Data Science Institute, Interuniversity Institute for Biostatistics and statistical Bioinformatics, Hasselt University, Diepenbeek, Belgium
31
Abstract
Recent progress in theoretical systems biology, applied mathematics and computational statistics allows us to compare quantitatively the performance of different candidate models at describing a particular biological system. Model selection has been applied with great success to problems where a small number of models (typically fewer than 10) are compared, but recent studies have started to consider thousands and even millions of candidate models. Often, however, we are left with sets of models that are compatible with the data, and then we can use ensembles of models to make predictions. These ensembles can have very desirable characteristics but, as I show here, are not guaranteed to improve on individual estimators or predictors. I show, for the cases of model selection and network inference, when we can trust ensembles and when we should be cautious. The analyses suggest that the careful construction of an ensemble (choosing good predictors) is of paramount importance, more than had perhaps been realized before: merely adding different methods does not suffice. The success of ensemble network inference methods is also shown to rest on their ability to suppress false-positive results. A Jupyter notebook that allows carrying out an assessment of ensemble estimators is provided.
Affiliation(s)
- Michael P H Stumpf
- School of BioSciences and School of Mathematics and Statistics, University of Melbourne, Parkville, VIC 3010, Australia; Centre for Integrative Systems Biology and Bioinformatics, Department of Life Sciences, Imperial College London, London SW7 2AZ, UK
32
Van Soom M, de Boer B. Detrending the Waveforms of Steady-State Vowels. Entropy (Basel) 2020; 22:E331. PMID: 33286105. DOI: 10.3390/e22030331.
Abstract
Steady-state vowels are vowels that are uttered with a momentarily fixed vocal tract configuration and with steady vibration of the vocal folds. In this steady state, the vowel waveform appears as a quasi-periodic string of elementary units called pitch periods. Humans perceive this quasi-periodic regularity as a definite pitch. Likewise, so-called pitch-synchronous methods exploit this regularity by using the duration of the pitch periods as a natural time scale for their analysis. In this work, we present a simple pitch-synchronous Bayesian method for estimating formants that slightly generalizes the basic approach of modeling the pitch periods as a superposition of decaying sinusoids, one for each vowel formant, by explicitly taking into account the additional low-frequency content in the waveform, which arises not from formants but rather from the glottal pulse. We model this low-frequency content in the time domain as a polynomial trend function that is added to the decaying sinusoids. The problem then reduces to a rather familiar one in macroeconomics: estimate the cycles (our decaying sinusoids) independently from the trend (our polynomial trend function); in other words, detrend the waveforms of steady-state vowels. We show how to do this efficiently.
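A minimal version of the model class can be sketched as one decaying sinusoid (a single "formant") plus a linear trend, fitted jointly by ordinary least squares once the frequency and decay are held fixed. The Bayesian machinery of the paper is omitted, and all signal parameters below are synthetic.

```python
import math

N, fs = 80, 8000.0                 # samples in one pitch period, sample rate
f0, a0 = 700.0, 300.0              # assumed formant frequency (Hz) and decay (1/s)
t = [n / fs for n in range(N)]

# Synthetic waveform: decaying sinusoid + linear "glottal" trend.
y = [1.0 * math.exp(-a0 * ti) * math.sin(2 * math.pi * f0 * ti)
     + 0.3 - 2.0 * ti for ti in t]

# Design matrix: decaying sin/cos pair plus polynomial trend basis {1, t}.
X = [[math.exp(-a0 * ti) * math.sin(2 * math.pi * f0 * ti),
      math.exp(-a0 * ti) * math.cos(2 * math.pi * f0 * ti),
      1.0, ti] for ti in t]

def lstsq(X, y):
    """Solve the normal equations (X'X) b = X'y by Gaussian elimination."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    b = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    for i in range(k):                       # forward elimination
        for j in range(i + 1, k):
            fac = A[j][i] / A[i][i]
            A[j] = [aj - fac * ai for aj, ai in zip(A[j], A[i])]
            b[j] -= fac * b[i]
    coef = [0.0] * k
    for i in range(k - 1, -1, -1):           # back substitution
        coef[i] = (b[i] - sum(A[i][j] * coef[j]
                              for j in range(i + 1, k))) / A[i][i]
    return coef

coef = lstsq(X, y)   # [sin amplitude, cos amplitude, trend intercept, trend slope]
```

With noiseless data the fit recovers the true amplitudes and trend exactly, illustrating that cycles and trend are estimated jointly, not sequentially.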
33
Abstract
The paucity of experimental data makes both inference and prediction particularly challenging in viral dynamic models. In the presence of several candidate models, a common strategy is model selection (MS), in which models are fitted to the data but only results obtained with the "best model" are presented. However, this approach ignores model uncertainty, which may lead to inaccurate predictions. When several models provide a good fit to the data, another approach is model averaging (MA), which weights the predictions of each model according to its consistency with the data. Here, we evaluated by simulations, in a nonlinear mixed-effect model framework, the performance of MS and MA in two realistic cases of acute viral infection: (1) inference in the presence of poorly identifiable parameters, namely the initial viral inoculum and the eclipse phase duration; and (2) uncertainty about the mechanisms of action of the immune response. MS was associated in some scenarios with a large rate of false selection. This led to a coverage rate lower than the nominal rate of 0.95 in the majority of cases, and below 0.50 in some scenarios. In contrast, MA provided better estimation of parameter uncertainty, with coverage rates ranging from 0.72 to 0.98 and mostly close to the nominal rate. Finally, MA provided predictions similar to those obtained with MS. In conclusion, parameter estimates obtained with MS should be taken with caution, especially when several models describe the data equally well. In this situation, MA performs better and can be used to account for model uncertainty.
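The MS-versus-MA contrast can be sketched with AIC-based weights, one common weighting choice (the paper works in a nonlinear mixed-effects framework and its weighting may differ); the candidate names, AIC values, and predictions below are invented for illustration.

```python
import math

# (name, AIC, predicted peak log10 viral load) -- invented candidates.
candidates = [
    ("target-cell-limited", 210.4, 6.1),
    ("eclipse-phase",       209.8, 5.7),
    ("immune-clearance",    212.9, 6.5),
]

# Akaike weights: w_m proportional to exp(-0.5 * (AIC_m - AIC_min)).
aic_min = min(a for _, a, _ in candidates)
raw = [math.exp(-0.5 * (a - aic_min)) for _, a, _ in candidates]
weights = [r / sum(raw) for r in raw]

pred_ms = min(candidates, key=lambda c: c[1])[2]                    # model selection
pred_ma = sum(w * p for w, (_, _, p) in zip(weights, candidates))   # model averaging
```

When several AIC values are close, the averaged prediction spreads credit across models instead of committing entirely to the narrowly best one.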
Affiliation(s)
- Antonio Gonçalves
- Université de Paris, IAME, INSERM, Henri Huchard, F-75018, Paris, France.
| | - France Mentré
- Université de Paris, IAME, INSERM, Henri Huchard, F-75018, Paris, France
| | - Annabelle Lemenuel-Diot
- Roche Pharmaceutical Research and Early Development, Pharmaceutical Sciences, Roche Innovation Center, Basel, Switzerland
| | - Jérémie Guedj
- Université de Paris, IAME, INSERM, Henri Huchard, F-75018, Paris, France
34
Bast L, Calzolari F, Strasser MK, Hasenauer J, Theis FJ, Ninkovic J, Marr C. Increasing Neural Stem Cell Division Asymmetry and Quiescence Are Predicted to Contribute to the Age-Related Decline in Neurogenesis. Cell Rep 2019; 25:3231-3240.e8. PMID: 30566852. DOI: 10.1016/j.celrep.2018.11.088.
Abstract
Adult murine neural stem cells (NSCs) generate neurons in drastically declining numbers with age. How cellular dynamics sustain neurogenesis and how alterations with age may result in this decline are unresolved issues. We therefore clonally traced NSC lineages using confetti reporters in young and middle-aged adult mice. To understand the underlying mechanisms, we derived mathematical models that explain observed clonal cell type abundances. The best models consistently show self-renewal of transit-amplifying progenitors and rapid neuroblast cell cycle exit. In middle-aged mice, we identified an increased probability of asymmetric stem cell divisions at the expense of symmetric differentiation, accompanied by an extended persistence of quiescence between activation phases. Our model explains existing longitudinal population data and identifies particular cellular properties underlying adult NSC homeostasis and the aging of this stem cell compartment.
Affiliation(s)
- Lisa Bast
- Institute of Computational Biology, Helmholtz Zentrum München-German Research Center for Environmental Health, Neuherberg, Germany; Department of Mathematics, Chair of Mathematical Modeling of Biological Systems, Technische Universität München, Garching, Germany
- Filippo Calzolari
- Institute for Physiological Chemistry, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany; Institute of Stem Cell Research, Helmholtz Zentrum München-German Research Center for Environmental Health, Neuherberg, Germany; Department of Physiological Genomics, Ludwig-Maximilians University Munich, Munich, Germany.
- Michael K Strasser
- Institute of Computational Biology, Helmholtz Zentrum München-German Research Center for Environmental Health, Neuherberg, Germany
- Jan Hasenauer
- Institute of Computational Biology, Helmholtz Zentrum München-German Research Center for Environmental Health, Neuherberg, Germany; Department of Mathematics, Chair of Mathematical Modeling of Biological Systems, Technische Universität München, Garching, Germany
- Fabian J Theis
- Institute of Computational Biology, Helmholtz Zentrum München-German Research Center for Environmental Health, Neuherberg, Germany; Department of Mathematics, Chair of Mathematical Modeling of Biological Systems, Technische Universität München, Garching, Germany
- Jovica Ninkovic
- Institute of Stem Cell Research, Helmholtz Zentrum München-German Research Center for Environmental Health, Neuherberg, Germany; Department of Physiological Genomics, Ludwig-Maximilians University Munich, Munich, Germany; Department for Cell Biology and Anatomy, Biomedical Center of LMU, Ludwig-Maximilians University Munich, Munich, Germany.
- Carsten Marr
- Institute of Computational Biology, Helmholtz Zentrum München-German Research Center for Environmental Health, Neuherberg, Germany.
35
Nakagawa S, De Villemereuil P. A General Method for Simultaneously Accounting for Phylogenetic and Species Sampling Uncertainty via Rubin's Rules in Comparative Analysis. Syst Biol 2019; 68:632-641. PMID: 30597116. DOI: 10.1093/sysbio/syy089.
Abstract
Phylogenetic comparative methods (PCMs), especially ones based on linear models, have played a central role in understanding species' trait evolution. These methods, however, usually assume that phylogenetic trees are known without error or uncertainty, but this assumption is most likely incorrect. So far, Markov chain Monte Carlo (MCMC)-based Bayesian methods have mainly been deployed to account for such "phylogenetic uncertainty" in PCMs. Herein, we propose an approach with which phylogenetic uncertainty is incorporated in a simple, readily implementable and reliable manner. Our approach uses Rubin's rules, which are an integral part of a standard multiple imputation procedure, often employed to recover missing data. We see true phylogenetic trees as missing data under this approach. Further, unmeasured species in comparative data (i.e., missing trait data) can be seen as another source of uncertainty in PCMs because arbitrary sampling of species in a given taxon, or "species sampling uncertainty", can affect estimation in PCMs. Using two simulation studies, we show our method can account for phylogenetic uncertainty under many different scenarios (e.g., uncertainty in topology and branch lengths) and, at the same time, handle missing trait data (i.e., species sampling uncertainty). A unique property of the multiple imputation procedure is that an index, named "relative efficiency", can be used to quantify the number of trees required for incorporating phylogenetic uncertainty. Thus, using the relative efficiency, we show the required tree number is surprisingly small (~50 trees). However, the most notable advantage of our method is that it can be combined seamlessly with PCMs that utilize multiple imputation to simultaneously handle phylogenetic uncertainty (i.e., missing true trees) and species sampling uncertainty (i.e., missing trait data).
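Rubin's rules themselves are simple to state: fit the comparative model once per candidate tree, pool the m point estimates, and combine the within-fit and between-fit variances, inflating the latter by (1 + 1/m). The estimates below are made-up numbers, not from the paper.

```python
# (slope estimate, squared standard error) from m = 4 per-tree fits (invented).
m_fits = [
    (0.52, 0.010),
    (0.47, 0.012),
    (0.55, 0.011),
    (0.50, 0.009),
]
m = len(m_fits)
q_bar = sum(q for q, _ in m_fits) / m                    # pooled point estimate
u_bar = sum(u for _, u in m_fits) / m                    # within-imputation variance
b = sum((q - q_bar) ** 2 for q, _ in m_fits) / (m - 1)   # between-imputation variance
t_var = u_bar + (1 + 1 / m) * b                          # Rubin's total variance
```

The total variance always exceeds the naive within-fit variance whenever the trees disagree, which is exactly how phylogenetic uncertainty propagates into the standard error.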
Affiliation(s)
- Shinichi Nakagawa
- Evolution & Ecology Research Centre and School of Biological, Earth and Environmental Sciences, University of New South Wales, Sydney, NSW 2052, Australia; Diabetes and Metabolism Division, Garvan Institute of Medical Research, Sydney, NSW 2010, Australia
- Pierre De Villemereuil
- CEFE, CNRS, Université de Montpellier, Université Paul Valéry Montpellier 3, EPHE, IRD, Montpellier, France. Shinichi Nakagawa and Pierre de Villemereuil contributed equally to this article.
36
Ponciano JM, Taper ML. Model Projections in Model Space: A Geometric Interpretation of the AIC Allows Estimating the Distance Between Truth and Approximating Models. Front Ecol Evol 2019; 7:413. PMID: 33796541. PMCID: PMC8011695. DOI: 10.3389/fevo.2019.00413.
Abstract
Information criteria have had a profound impact on modern ecological science. They allow researchers to estimate which probabilistic approximating models are closest to the generating process. Unfortunately, information criterion comparison does not tell us how good the best model is. In this work, we show that this shortcoming can be resolved by extending the geometric interpretation of Hirotugu Akaike's original work. Standard information criterion analysis considers only the divergences of each model from the generating process, ignoring that there are also estimable divergence relationships amongst all of the approximating models. We then show that, using both sets of divergences and an estimator of the negative self-entropy, a model space can be constructed that includes an estimated location for the generating process. Thus, not only can an analyst determine which model is closest to the generating process, she/he can also determine how close to the generating process the best approximating model is. Properties of the generating process estimated from these projections are more accurate than those estimated by model averaging. We illustrate our findings and our methods in detail with two ecological examples, for which we use and test two different negative self-entropy estimators. The applications of our proposed model projection in model space extend to all areas of science where model selection through information criteria is done.
Affiliation(s)
- Mark L. Taper
- Biology Department, University of Florida, Gainesville, FL, United States
- Department of Ecology, Montana State University, Bozeman, MT, United States
37
Jensen SM, Kluxen FM, Ritz C. A Review of Recent Advances in Benchmark Dose Methodology. Risk Anal 2019; 39:2295-2315. PMID: 31046141. DOI: 10.1111/risa.13324.
Abstract
In this review, recent methodological developments for the benchmark dose (BMD) methodology are summarized. Specifically, we introduce the advances for the main steps in BMD derivation: selecting the procedure for defining a BMD from a predefined benchmark response (BMR), setting a BMR, selecting a dose-response model, and estimating the corresponding BMD lower limit (BMDL). Although the last decade has shown major progress in the development of BMD methodology, there is still room for improvement. Remaining challenges are the implementation of new statistical methods in user-friendly software and the lack of consensus about how to derive the BMDL.
Affiliation(s)
- Signe M Jensen
- Department of Plant and Environmental Sciences, University of Copenhagen, Copenhagen, Denmark
- Christian Ritz
- Department of Nutrition, Sports and Exercise, University of Copenhagen, Copenhagen, Denmark
38
Cao X, Shen Q, Shang C, Yang H, Liu L, Cheng J. Determinants of Shoot Biomass Production in Mulberry: Combined Selection with Leaf Morphological and Physiological Traits. Plants (Basel) 2019; 8:plants8050118. PMID: 31064066. PMCID: PMC6571901. DOI: 10.3390/plants8050118.
Abstract
Physiological and morphological traits have a considerable impact on the biomass production of fast-growing trees. To compare cultivar difference in shoot biomass and investigate its relationships with leaf functional traits in mulberry, agronomic traits and 20 physiological and morphological attributes of 3-year-old mulberry trees from eight cultivars growing in a common garden were analyzed. The cultivars Xiang7920, Yu711, and Yunsang2 had higher shoot fresh biomass (SFB), which was closely associated with their rapid leaf expansion rate, large leaf area, and high stable carbon isotope composition (δ13C). Conversely, the cultivars 7307, Husang32, Wupu, Yunguo1, and Liaolu11 were less productive, and this was primarily the result of slower leaf expansion and smaller leaf size. Growth performance was negatively correlated with leaf δ13C and positively correlated with the total nitrogen concentration, indicating that a compromise exists in mulberry between water use efficiency (WUE) (low δ13C) and high nitrogen consumption for rapid growth. Several morphological traits, including the maximum leaf area (LAmax), leaf width and length, petiole width and length, leaf number per shoot, and final shoot height were correlated with SFB. The physiological traits that were also influential factors of shoot biomass were the leaf δ13C, the total nitrogen concentration, and the water content. Among the studied leaf traits, LAmax, leaf δ13C, and concentrations of chlorophyll a and b were identified as the most representative predictor variables for SFB, accounting for 73% of the variability in SFB. In conclusion, a combination of LAmax, leaf δ13C, and chlorophyll should be considered in selection programs for high-yield mulberry cultivars.
Affiliation(s)
- Xu Cao
- College of Biotechnology, Jiangsu University of Science and Technology, Zhenjiang 212003, China.
- Key Laboratory of Silkworm and Mulberry Genetic Improvement, Ministry of Agriculture and Rural Areas, Sericultural Research Institute, Chinese Academy of Agricultural Sciences, Zhenjiang 212018, China.
- Qiudi Shen
- College of Biotechnology, Jiangsu University of Science and Technology, Zhenjiang 212003, China.
- Key Laboratory of Silkworm and Mulberry Genetic Improvement, Ministry of Agriculture and Rural Areas, Sericultural Research Institute, Chinese Academy of Agricultural Sciences, Zhenjiang 212018, China.
- Chunqiong Shang
- College of Biotechnology, Jiangsu University of Science and Technology, Zhenjiang 212003, China.
- Key Laboratory of Silkworm and Mulberry Genetic Improvement, Ministry of Agriculture and Rural Areas, Sericultural Research Institute, Chinese Academy of Agricultural Sciences, Zhenjiang 212018, China.
- Honglei Yang
- College of Biotechnology, Jiangsu University of Science and Technology, Zhenjiang 212003, China.
- Key Laboratory of Silkworm and Mulberry Genetic Improvement, Ministry of Agriculture and Rural Areas, Sericultural Research Institute, Chinese Academy of Agricultural Sciences, Zhenjiang 212018, China.
- Li Liu
- College of Biotechnology, Jiangsu University of Science and Technology, Zhenjiang 212003, China.
- Key Laboratory of Silkworm and Mulberry Genetic Improvement, Ministry of Agriculture and Rural Areas, Sericultural Research Institute, Chinese Academy of Agricultural Sciences, Zhenjiang 212018, China.
- Jialing Cheng
- College of Biotechnology, Jiangsu University of Science and Technology, Zhenjiang 212003, China.
- Key Laboratory of Silkworm and Mulberry Genetic Improvement, Ministry of Agriculture and Rural Areas, Sericultural Research Institute, Chinese Academy of Agricultural Sciences, Zhenjiang 212018, China.
39
Abstract
Analysis of "big data" frequently involves statistical comparison of millions of competing hypotheses to discover hidden processes underlying observed patterns of data, for example, in the search for genetic determinants of disease in genome-wide association studies (GWAS). Controlling the familywise error rate (FWER) is considered the strongest protection against false positives but makes it difficult to reach the multiple testing-corrected significance threshold. Here, I introduce the harmonic mean p-value (HMP), which controls the FWER while greatly improving statistical power by combining dependent tests using generalized central limit theorem. I show that the HMP effortlessly combines information to detect statistically significant signals among groups of individually nonsignificant hypotheses in examples of a human GWAS for neuroticism and a joint human-pathogen GWAS for hepatitis C viral load. The HMP simultaneously tests all ways to group hypotheses, allowing the smallest groups of hypotheses that retain significance to be sought. The power of the HMP to detect significant hypothesis groups is greater than the power of the Benjamini-Hochberg procedure to detect significant hypotheses, although the latter only controls the weaker false discovery rate (FDR). The HMP has broad implications for the analysis of large datasets, because it enhances the potential for scientific discovery.
40
Correia HE. Spatiotemporally explicit model averaging for forecasting of Alaskan groundfish catch. Ecol Evol 2018; 8:12308-12321. [PMID: 30619547] [PMCID: PMC6308877] [DOI: 10.1002/ece3.4488]
Abstract
Fisheries management is dominated by the need to forecast catch and abundance of commercially and ecologically important species. The influence of spatial information and environmental factors on forecasting error is not often considered. I propose a forecasting method called spatiotemporally explicit model averaging (STEMA) to combine spatial and temporal information through model averaging. I examine the performance of STEMA against two popular forecasting models and a modern spatial prediction model: the autoregressive integrated moving averages with explanatory variables (ARIMAX) model, the Bayesian hierarchical model, and the varying coefficient model. I focus on applying the methods to four species of Alaskan groundfish for which catch data are available. My method reduces forecasting errors significantly for most of the tested models when compared to ARIMAX, Bayesian, and varying coefficient methods. I also consider the effect of sea surface temperature (SST) on the forecasting of catch, as multiple studies reveal a potential influence of water temperature on the survival and growth of juvenile groundfish. For most of the preferred models, inclusion of SST in the model improved forecasting of catch. It is advisable to consider both spatial information and relevant environmental factors in forecasting models to obtain more accurate projections of population abundance. The STEMA method is capable of accounting for spatial information in forecasting and can be applied to various types of data because of its flexible varying coefficient model structure. It is therefore a suitable forecasting method for application to many fields including ecology, epidemiology, and climatology.
Affiliation(s)
- Hannah E Correia
- Department of Biological Sciences, Auburn University, Auburn, Alabama; Norwegian Institute for Nature Research (NINA), Tromsø, Norway
41
Cenci S, Saavedra S. Uncertainty quantification of the effects of biotic interactions on community dynamics from nonlinear time-series data. J R Soc Interface 2018; 15:20180695. [PMID: 30381342] [DOI: 10.1098/rsif.2018.0695]
Abstract
Biotic interactions are expected to play a major role in shaping the dynamics of ecological systems. Yet, quantifying the effects of biotic interactions has been challenging due to a lack of appropriate methods to extract accurate measurements of interaction parameters from experimental data. One of the main limitations of existing methods is that the parameters inferred from noisy, sparsely sampled, nonlinear data are seldom uniquely identifiable. That is, many different parameters can be compatible with the same dataset and can generalize to independent data equally well. Hence, it is difficult to justify conclusive assertions about the effect of biotic interactions without information about their associated uncertainty. Here, we develop an ensemble method based on model averaging to quantify the uncertainty associated with the effect of biotic interactions on community dynamics from non-equilibrium ecological time-series data. Our method is able to detect the most informative time intervals for each biotic interaction within a multivariate time series and can be easily adapted to different regression schemes. Overall, this novel approach can be used to associate a time-dependent uncertainty with the effect of biotic interactions. Moreover, because we quantify uncertainty with minimal assumptions about the data-generating process, our approach can be applied to any data for which interactions among variables strongly affect the overall dynamics of the system.
Affiliation(s)
- Simone Cenci
- Department of Civil and Environmental Engineering, MIT, 77 Massachusetts Avenue, Cambridge, MA 02139, USA
- Serguei Saavedra
- Department of Civil and Environmental Engineering, MIT, 77 Massachusetts Avenue, Cambridge, MA 02139, USA
42
Xu R, Mehrotra DV, Shaw PA. Incorporating baseline measurements into the analysis of crossover trials with time-to-event endpoints. Stat Med 2018; 37:3280-3292. [PMID: 29888552] [DOI: 10.1002/sim.7834]
Abstract
Two-period two-treatment (2×2) crossover designs are commonly used in clinical trials. For continuous endpoints, it has been shown that baseline (pretreatment) measurements collected before the start of each treatment period can be useful in improving the power of the analysis. Methods to achieve a corresponding gain for censored time-to-event endpoints have not been adequately studied. We propose a method in which censored values are treated as missing data and multiply imputed using prespecified parametric event time models. The event times in each imputed data set are then log-transformed and analyzed using a linear model suitable for a 2×2 crossover design with continuous endpoints, with the difference in period-specific baselines included as a covariate. Results obtained from the imputed data sets are synthesized for point and confidence interval estimation of the treatment ratio of geometric mean event times using model averaging in conjunction with Rubin's combination rule. We use simulations to illustrate the favorable operating characteristics of our method relative to two other methods for crossover trials with censored time-to-event data, ie, a hierarchical rank test that ignores the baselines and a stratified Cox model that uses each study subject as a stratum and includes period-specific baselines as a covariate. Application to a real data example is provided.
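Rubin's combination rule used in the synthesis step has a standard closed form. Below is a minimal sketch with hypothetical per-imputation estimates of the log treatment ratio; the model-averaging layer the authors combine with this rule is not shown.

```python
import numpy as np

def rubin_pool(estimates, variances):
    """Pool results from m imputed data sets with Rubin's rule.

    estimates/variances: per-imputation point estimates and their
    squared standard errors (here, hypothetically, for the log
    treatment ratio). Returns the pooled estimate and its total
    variance T = W + (1 + 1/m) * B.
    """
    q = np.asarray(estimates, dtype=float)
    u = np.asarray(variances, dtype=float)
    m = q.size
    qbar = q.mean()               # pooled point estimate
    w = u.mean()                  # within-imputation variance W
    b = q.var(ddof=1)             # between-imputation variance B
    return qbar, w + (1.0 + 1.0 / m) * b

# Three imputed data sets, each analyzed by the same linear model:
est, tot_var = rubin_pool([0.12, 0.15, 0.10], [0.02, 0.02, 0.02])
```

Exponentiating the pooled log-scale estimate and its confidence limits then gives the treatment ratio of geometric mean event times.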
Affiliation(s)
- Rengyi Xu
- Department of Epidemiology and Biostatistics, University of Pennsylvania, Philadelphia, USA
- Devan V Mehrotra
- Biostatistics and Research Decision Sciences, Merck & Co, Inc, Philadelphia, USA
- Pamela A Shaw
- Department of Epidemiology and Biostatistics, University of Pennsylvania, Philadelphia, USA
43
Burgess S, Zuber V, Gkatzionis A, Foley CN. Modal-based estimation via heterogeneity-penalized weighting: model averaging for consistent and efficient estimation in Mendelian randomization when a plurality of candidate instruments are valid. Int J Epidemiol 2018; 47:1242-1254. [PMID: 29846613] [PMCID: PMC6124628] [DOI: 10.1093/ije/dyy080]
Abstract
Background A robust method for Mendelian randomization does not require all genetic variants to be valid instruments to give consistent estimates of a causal parameter. Several such methods have been developed, including a mode-based estimation method giving consistent estimates if a plurality of genetic variants are valid instruments; i.e. there is no larger subset of invalid instruments estimating the same causal parameter than the subset of valid instruments. Methods We here develop a model-averaging method that gives consistent estimates under the same 'plurality of valid instruments' assumption. The method considers a mixture distribution of estimates derived from each subset of genetic variants. The estimates are weighted such that subsets with more genetic variants receive more weight, unless variants in the subset have heterogeneous causal estimates, in which case that subset is severely down-weighted. The mode of this mixture distribution is the causal estimate. This heterogeneity-penalized model-averaging method has several technical advantages over the previously proposed mode-based estimation method. Results The heterogeneity-penalized model-averaging method outperformed the mode-based estimation in terms of efficiency and outperformed other robust methods in terms of Type 1 error rate in an extensive simulation analysis. The proposed method suggests two distinct mechanisms by which inflammation affects coronary heart disease risk, with subsets of variants suggesting both positive and negative causal effects. Conclusions The heterogeneity-penalized model-averaging method is an additional robust method for Mendelian randomization with excellent theoretical and practical properties, and can reveal features in the data such as the presence of multiple causal mechanisms.
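The subset-averaging idea can be illustrated with a toy sketch. Everything below is an illustrative assumption: the published estimator takes the mode of the weighted mixture rather than its mean, and its exact weighting formula differs from the simple precision-times-exp(-Q) penalty used here.

```python
import itertools

import numpy as np

def hp_model_average(beta, se, penalty=1.0):
    """Toy heterogeneity-penalized averaging over subsets of variants.

    beta, se: per-variant causal (ratio) estimates and standard errors.
    Each subset contributes its inverse-variance-weighted (IVW)
    estimate; subsets with more precision get more weight, and
    heterogeneous subsets (large Cochran's Q) are down-weighted.
    """
    beta = np.asarray(beta, dtype=float)
    se = np.asarray(se, dtype=float)
    n = beta.size
    ests, wts = [], []
    for r in range(1, n + 1):
        for subset in itertools.combinations(range(n), r):
            i = list(subset)
            w = 1.0 / se[i] ** 2                   # inverse-variance weights
            ivw = np.sum(w * beta[i]) / np.sum(w)  # subset IVW estimate
            q = np.sum(w * (beta[i] - ivw) ** 2)   # Cochran's Q heterogeneity
            ests.append(ivw)
            wts.append(np.sum(w) * np.exp(-penalty * q))
    wts = np.asarray(wts)
    wts = wts / wts.sum()
    # The paper reports the MODE of this weighted mixture (which is what
    # confers robustness); for brevity this sketch returns the mean.
    return float(np.sum(wts * np.asarray(ests)))
```

When all variants agree, every subset estimates the same value and the average recovers it; an outlying variant inflates Q for every subset containing it, so those subsets are heavily down-weighted.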
Affiliation(s)
- Stephen Burgess
- MRC Biostatistics Unit, University of Cambridge, Cambridge, UK
- Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Verena Zuber
- MRC Biostatistics Unit, University of Cambridge, Cambridge, UK
44
Soch J, Allefeld C. MACS - a new SPM toolbox for model assessment, comparison and selection. J Neurosci Methods 2018; 306:19-31. [PMID: 29842901] [DOI: 10.1016/j.jneumeth.2018.05.017]
Abstract
BACKGROUND In cognitive neuroscience, functional magnetic resonance imaging (fMRI) data are widely analyzed using general linear models (GLMs). However, model quality of GLMs for fMRI is rarely assessed, in part due to the lack of formal measures for statistical model inference. NEW METHOD We introduce a new SPM toolbox for model assessment, comparison and selection (MACS) of GLMs applied to fMRI data. MACS includes classical, information-theoretic and Bayesian methods of model assessment previously applied to GLMs for fMRI as well as recent methodological developments of model selection and model averaging in fMRI data analysis. RESULTS The toolbox - which is freely available from GitHub - directly builds on the Statistical Parametric Mapping (SPM) software package and is easy-to-use, general-purpose, modular, readable and extendable. We validate the toolbox by reproducing model selection and model averaging results from earlier publications. COMPARISON WITH EXISTING METHODS A previous toolbox for model diagnosis in fMRI has been discontinued and other approaches to model comparison between GLMs have not been translated into reusable computational resources in the past. CONCLUSIONS Increased attention on model quality will lead to lower false-positive rates in cognitive neuroscience and increased application of the MACS toolbox will increase the reproducibility of GLM analyses and is likely to increase the replicability of fMRI studies.
Affiliation(s)
- Joram Soch
- Bernstein Center for Computational Neuroscience, Berlin, Germany; Department of Psychology, Humboldt-Universität zu Berlin, Germany.
- Carsten Allefeld
- Bernstein Center for Computational Neuroscience, Berlin, Germany; Berlin Center for Advanced Neuroimaging, Berlin, Germany
45
Vaudour E, Cerovic ZG, Ebengo DM, Latouche G. Predicting Key Agronomic Soil Properties with UV-Vis Fluorescence Measurements Combined with Vis-NIR-SWIR Reflectance Spectroscopy: A Farm-Scale Study in a Mediterranean Viticultural Agroecosystem. Sensors (Basel) 2018; 18:E1157. [PMID: 29642640] [DOI: 10.3390/s18041157]
Abstract
For adequate crop and soil management, rapid and accurate techniques for monitoring soil properties are particularly important when a farmer starts up his activities and needs a diagnosis of his cultivated fields. This study aimed to evaluate the potential of fluorescence measured directly on 146 whole soil solid samples, for predicting key soil properties at the scale of a 6 ha Mediterranean wine estate with contrasting soils. UV-Vis fluorescence measurements were carried out in conjunction with reflectance measurements in the Vis-NIR-SWIR range. Combining PLSR predictions from Vis-NIR-SWIR reflectance spectra and from a set of fluorescence signals enabled us to improve the power of prediction of a number of key agronomic soil properties including SOC, Ntot, CaCO₃, iron, fine particle-sizes (clay, fine silt, fine sand), CEC, pH and exchangeable Ca2+ with cross-validation RPD ≥ 2 and R² ≥ 0.75, while exchangeable K⁺, Na⁺, Mg2+, coarse silt and coarse sand contents were fairly predicted (1.42 ≤ RPD < 2 and 0.54 ≤ R² < 0.75). Predictions of SOC, Ntot, CaCO₃, iron contents, and pH were still good (RPD ≥ 1.8, R² ≥ 0.68) when using a single fluorescence signal or index such as SFR_R or FERARI, highlighting the unexpected importance of red excitations and indices derived from plant studies. The predictive ability of single fluorescence indices or original signals was very significant for topsoil: this is very important for a farmer who wishes to update information on soil nutrient for the purpose of fertility diagnosis and particularly nitrogen fertilization. These results open encouraging perspectives for using miniaturized fluorescence devices enabling red excitation coupled with red or far-red fluorescence emissions directly in the field.
46
Haber LT, Dourson ML, Allen BC, Hertzberg RC, Parker A, Vincent MJ, Maier A, Boobis AR. Benchmark dose (BMD) modeling: current practice, issues, and challenges. Crit Rev Toxicol 2018. [PMID: 29516780] [DOI: 10.1080/10408444.2018.1430121]
Abstract
Benchmark dose (BMD) modeling is now the state of the science for determining the point of departure for risk assessment. Key advantages include the fact that the modeling takes account of all of the data for a particular effect from a particular experiment, increased consistency, and better accounting for statistical uncertainties. Despite these strong advantages, disagreements remain as to several specific aspects of the modeling, including differences in the recommendations of the US Environmental Protection Agency (US EPA) and the European Food Safety Authority (EFSA). Differences exist in the choice of the benchmark response (BMR) for continuous data, the use of unrestricted models, and the mathematical models used; these can lead to differences in the final BMDL. It is important to take confidence in the model into account in choosing the BMDL, rather than simply choosing the lowest value. The field is moving in the direction of model averaging, which will avoid many of the challenges of choosing a single best model when the underlying biology does not suggest one, but additional research would be useful into methods of incorporating biological considerations into the weights used in the averaging. Additional research is also needed regarding the interplay between the BMR and the UF to ensure appropriate use for studies supporting a lower BMR than default values, such as for epidemiology data. Addressing these issues will aid in harmonizing methods and moving the field of risk assessment forward.
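One common way to implement dose-response model averaging is to weight candidate models by AIC. A minimal sketch follows; the models, AICs and BMDLs are hypothetical, frequentist Akaike weights stand in for the Bayesian or bootstrap weights used in regulatory tools, and in practice one averages fitted curves or BMD distributions rather than the BMDLs themselves.

```python
import numpy as np

def akaike_weights(aic):
    """Akaike weights: w_i = exp(-0.5 * dAIC_i) / sum_j exp(-0.5 * dAIC_j)."""
    aic = np.asarray(aic, dtype=float)
    delta = aic - aic.min()       # dAIC relative to the best model
    w = np.exp(-0.5 * delta)
    return w / w.sum()

# Hypothetical fits of three dose-response models to the same data.
# Averaging BMDLs directly is a simplification used only to keep the
# sketch short; it is not the recommended practice.
bmdl = np.array([2.1, 2.6, 3.4])           # hypothetical BMDLs, mg/kg-day
w = akaike_weights([210.3, 211.1, 214.8])  # hypothetical AICs
averaged_bmdl = float(np.sum(w * bmdl))    # ≈ 2.37
```

The averaged value sits between the candidate BMDLs, weighted toward the better-supported models, which is exactly the behavior that avoids an arbitrary single-model choice.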
Affiliation(s)
- Lynne T Haber
- Risk Science Center, University of Cincinnati, Cincinnati, OH, USA
- Michael L Dourson
- Risk Science Center, University of Cincinnati, Cincinnati, OH, USA
- Richard C Hertzberg
- Department of Environmental Health, Emory University, Atlanta, GA, USA
- Ann Parker
- Risk Science Center, University of Cincinnati, Cincinnati, OH, USA
- Melissa J Vincent
- Risk Science Center, University of Cincinnati, Cincinnati, OH, USA
- Andrew Maier
- Risk Science Center, University of Cincinnati, Cincinnati, OH, USA
47
Marrot P, Charmantier A, Blondel J, Garant D. Current spring warming as a driver of selection on reproductive timing in a wild passerine. J Anim Ecol 2018; 87:754-764. [PMID: 29337354] [DOI: 10.1111/1365-2656.12794]
Abstract
Evolutionary adaptation as a response to climate change is expected for fitness-related traits affected by climate and exhibiting genetic variance. Although the relationship between warmer spring temperature and earlier timing of reproduction is well documented, quantifications and predictions of the impact of global warming on natural selection acting on phenology in wild populations remain rare. If global warming affects fitness in a similar way across individuals within a population, or if fitness consequences are independent of phenotypic variation in key-adaptive traits, then no evolutionary response is expected for these traits. Here, we quantified the selection pressures acting on laying date during a 24-year monitoring of blue tits in southern Mediterranean France, a hot spot of climate warming. We explored the temporal fluctuation in annual selection gradients and we determined its temperature-related drivers. We first investigated the month-specific warming since 1970 in our study site and tested its influence on selection pressures, using a model averaging approach. Then, we quantified the selection strength associated with temperature anomalies experienced by the blue tit population. We found that natural selection acting on laying date significantly fluctuated both in magnitude and in sign across years. After identifying a significant warming in spring and summer, we showed that warmer daily maximum temperatures in April were significantly associated with stronger selection pressures for reproductive timing. Our results indicated an increase in the strength of selection by 46% for every +1°C anomaly. Our results confirm the general assumption that recent climate change translates into strong selection favouring earlier breeders in passerine birds. Our findings also suggest that differences in fitness among individuals varying in their breeding phenology increase with climate warming. 
Such climate-driven influence on the strength of directional selection acting on laying date could favour an adaptive response in this trait, since it is heritable.
Affiliation(s)
- Pascal Marrot
- Département de Biologie, Faculté des Sciences, Université de Sherbrooke, Sherbrooke, Québec, Canada; CEFE-UMR 5175, Montpellier, France
- Dany Garant
- Département de Biologie, Faculté des Sciences, Université de Sherbrooke, Sherbrooke, Québec, Canada
| |
48
Antonelli J, Han B, Cefalu M. A synthetic estimator for the efficacy of clinical trials with all-or-nothing compliance. Stat Med 2017; 36:4604-4615. [PMID: 28833307] [DOI: 10.1002/sim.7447]
Abstract
A critical issue in the analysis of clinical trials is patients' noncompliance to assigned treatments. In the context of a binary treatment with all or nothing compliance, the intent-to-treat analysis is a straightforward approach to estimating the effectiveness of the trial. In contrast, there exist 3 commonly used estimators with varying statistical properties for the efficacy of the trial, formally known as the complier-average causal effect. The instrumental variable estimator may be unbiased but can be extremely variable in many settings. The as treated and per protocol estimators are usually more efficient than the instrumental variable estimator, but they may suffer from selection bias. We propose a synthetic approach that incorporates all 3 estimators in a data-driven manner. The synthetic estimator is a linear convex combination of the instrumental variable, per protocol, and as treated estimators, resembling the popular model-averaging approach in the statistical literature. However, our synthetic approach is nonparametric; thus, it is applicable to a variety of outcome types without specific distributional assumptions. We also discuss the construction of the synthetic estimator using an analytic form derived from a simple normal mixture distribution. We apply the synthetic approach to a clinical trial for post-traumatic stress disorder.
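The "linear convex combination" of the three estimators can be illustrated with a toy precision-weighted version. The paper's weights are chosen data-adaptively and also account for the selection bias of the per-protocol (PP) and as-treated (AT) estimators; the sketch below ignores bias and weights by precision alone, and all numbers are hypothetical.

```python
import numpy as np

def synthetic_estimate(estimates, variances):
    """Toy convex combination of the IV, PP and AT estimates.

    The published synthetic estimator trades the IV estimator's
    variance against the selection bias of PP/AT when choosing the
    weights; this sketch uses inverse-variance weights only.
    """
    est = np.asarray(estimates, dtype=float)
    w = 1.0 / np.asarray(variances, dtype=float)
    w = w / w.sum()               # convex weights: nonnegative, sum to 1
    return float(np.sum(w * est))

# IV, PP, AT estimates, with the IV estimate much noisier:
combo = synthetic_estimate([0.40, 0.55, 0.60], [0.25, 0.04, 0.05])  # ≈ 0.558
```

The combination is pulled toward the more precise PP/AT estimates, mirroring the motivation for synthesizing all three rather than relying on the highly variable IV estimator alone.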
Affiliation(s)
- Joseph Antonelli
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, 655 Huntington Avenue, Boston, MA 02115, U.S.A
- Bing Han
- RAND Corporation, 1776 Main Street, Santa Monica, CA 90401, U.S.A
- Matthew Cefalu
- RAND Corporation, 1776 Main Street, Santa Monica, CA 90401, U.S.A
49
Vieira RAM, Rohem Júnior NM, Gomes RS, Oliveira TS, Bendia LCR, Azevedo FHV, Barbosa DL, Glória LS, Rodrigues MT. The ontogenetic allometry of body morphology and chemical composition in dairy goat wethers. Animal 2018; 12:538-553. [PMID: 28770697] [DOI: 10.1017/S1751731117001884]
Abstract
We studied the ontogenetic growth of goat wethers (castrated male goats) of the Saanen and Swiss Alpine breeds based on a large range of intraspecific body mass (BM). The body parts and the chemical constituents of the empty body were described by the allometric function by using BM and the empty body mass (EBM) as the predictors for morphological traits and chemical composition, respectively. We fitted the allometric scaling function by applying the SAS NLMIXED procedure, but to evaluate assumptions regarding variances in morphological and compositional traits, we combined the scaling function with homoscedastic (MOD1), and the heteroscedastic exponential (MOD2) and power-of-the-mean (MOD3) variance functions. We also predicted the ontogenetic growth by using the traditional log-log transformation and back-transformed results into the arithmetic scale (MOD4). We obtained predictions from MOD4 in the arithmetic scale by a two-step process, and evaluated MOD1, MOD2 and MOD3 by a model selection framework, and compared MOD4 with MOD1, MOD2 and MOD3 based on goodness-of-fit measures. Based on information criteria for model selection, heterogeneous variance functions were more likely to describe 10 over 36 traits with a low level of model selection uncertainty. One trait was predicted by averaging the MOD1 and MOD2 variance functions; and nine traits were better described by averaging the MOD2 and MOD3 variance functions. The predictions for other 16 traits were averaged from MOD1, MOD2 and MOD3. However, MOD4 better described 11 traits according to the goodness-of-fit measures. Depending on the variable being analyzed, the body parts and the chemical amounts exhibited the three types of allometric behavior with respect to BM and EBM, that is, positive, negative and isometric ontogenetic growth. 
Reference BMs, that is, 20, 27, 35 and 45 kg, were used to compute the net protein and energy requirements based on the first derivative of the scaling function, and the results were presented in reference to the EBM and EBM0.75. Both the net protein and energy requirements scaled to EBM0.75 increased from 20 to 45 kg of BM.
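The allometric scaling function referred to throughout, y = a·M^b, can be fitted on the log-log scale in a few lines. A minimal sketch of that baseline (corresponding to the paper's MOD4 two-step route; the nonlinear fits with heteroscedastic variance functions are not shown, and the data and coefficients below are synthetic, not the paper's estimates):

```python
import numpy as np

def fit_allometry(mass, trait):
    """Fit the allometric function y = a * M^b by OLS on the log-log
    scale: log y = log a + b * log M.
    """
    b, log_a = np.polyfit(np.log(mass), np.log(trait), 1)
    return np.exp(log_a), b

# Synthetic data with a known exponent; b > 1 indicates positive
# allometry, b < 1 negative allometry, and b = 1 isometry.
mass = np.array([20.0, 27.0, 35.0, 45.0])   # the reference BMs, kg
trait = 0.1 * mass ** 0.75                  # hypothetical trait values
a, b = fit_allometry(mass, trait)           # recovers a = 0.1, b = 0.75
```

Differentiating the fitted function, dy/dM = a·b·M^(b-1), is the step the authors use to compute net protein and energy requirements at the reference body masses.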
50
Dosne AG, Bergstrand M, Karlsson MO, Renard D, Heimann G. Model averaging for robust assessment of QT prolongation by concentration-response analysis. Stat Med 2017; 36:3844-3857. [PMID: 28703360] [DOI: 10.1002/sim.7395]
Abstract
Assessing the QT prolongation potential of a drug is typically done based on pivotal safety studies called thorough QT studies. Model-based estimation of the drug-induced QT prolongation at the estimated mean maximum drug concentration could increase efficiency over the currently used intersection-union test. However, robustness against model misspecification needs to be guaranteed in pivotal settings. The objective of this work was to develop an efficient, fully prespecified model-based inference method for thorough QT studies, which controls the type I error and provides satisfactory test power. This is achieved by model averaging: The proposed estimator of the concentration-response relationship is a weighted average of a parametric (linear) and a nonparametric (monotonic I-splines) estimator, with weights based on mean integrated square error. The desired properties of the method were confirmed in an extensive simulation study, which demonstrated that the proposed method controlled the type I error adequately, and that its power was higher than the power of the nonparametric method alone. The method can be extended from thorough QT studies to the analysis of QT data from pooled phase I studies.
Collapse
Affiliation(s)
- D Renard
- Novartis Pharma AG, Basel, Switzerland
- G Heimann
- Novartis Pharma AG, Basel, Switzerland
| |