1
Zamanian A, Ahmidi N, Drton M. Assessable and interpretable sensitivity analysis in the pattern graph framework for nonignorable missingness mechanisms. Stat Med 2023; 42:5419-5450. [PMID: 37759370 DOI: 10.1002/sim.9920] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2022] [Revised: 06/12/2023] [Accepted: 09/13/2023] [Indexed: 09/29/2023]
Abstract
The pattern graph framework solves a wide range of missing data problems with nonignorable mechanisms. However, it faces two challenges, assessability and interpretability, which are particularly important in safety-critical problems such as clinical diagnosis: (i) How can one assess the validity of the framework's a priori assumption and make necessary adjustments to accommodate known information about the problem? (ii) How can one interpret the process of exponential tilting used for sensitivity analysis in the pattern graph framework and choose the tilt perturbations based on meaningful real-world quantities? In this paper, we introduce Informed Sensitivity Analysis, an extension of the pattern graph framework that enables us to incorporate substantive knowledge about the missingness mechanism. Our extension allows us to examine the validity of assumptions underlying pattern graphs and interpret sensitivity analysis results in terms of realistic problem characteristics. We apply our method to a prevalent nonignorable missing data scenario in clinical research. We validate our method and compare its results with a number of widely used missing data methods, including Unweighted CCA, KNN Imputer, MICE, and MissForest. The validation uses both bootstrapped simulated experiments and real-world clinical observations in the public MIMIC-III dataset.
Affiliation(s)
- Alireza Zamanian
  - TUM School of Computation, Information and Technology, Department of Computer Science, Technical University of Munich, Munich, Germany
  - Department of Reasoned AI Decisions, Fraunhofer Institute for Cognitive Systems IKS, Munich, Germany
- Narges Ahmidi
  - Department of Reasoned AI Decisions, Fraunhofer Institute for Cognitive Systems IKS, Munich, Germany
  - Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany
- Mathias Drton
  - TUM School of Computation, Information and Technology, Department of Mathematics, Technical University of Munich, Munich, Germany

2
Drton M, Shi H, Strieder D. Discussion of “A note on universal inference” by Timmy Tse and Anthony Davison. Stat (Int Stat Inst) 2023. [DOI: 10.1002/sta4.572] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/08/2023]
Affiliation(s)
- Mathias Drton
  - TUMAM04 Lehrstuhl für Mathematische Statistik, Boltzmannstr. 3, 85748 Garching b. München, Germany
- Hongjian Shi
  - TUMAM04 Lehrstuhl für Mathematische Statistik, Boltzmannstr. 3, 85748 Garching b. München, Germany
- David Strieder
  - TUMAM04 Lehrstuhl für Mathematische Statistik, Boltzmannstr. 3, 85748 Garching b. München, Germany

3
Barber RF, Drton M, Sturma N, Weihs L. Half-trek criterion for identifiability of latent variable models. Ann Stat 2022. [DOI: 10.1214/22-aos2221] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/24/2022]
Affiliation(s)
- Mathias Drton
  - School of Computation, Information and Technology, Technical University of Munich
- Nils Sturma
  - School of Computation, Information and Technology, Technical University of Munich

4
Affiliation(s)
- Hongjian Shi
  - Department of Mathematics, Technical University of Munich
- Marc Hallin
  - ECARES and Department of Mathematics, Université Libre de Bruxelles
- Mathias Drton
  - Department of Mathematics, Technical University of Munich
- Fang Han
  - Department of Statistics, University of Washington

5
Abstract
Estimation of density functions supported on general domains arises when the data are naturally restricted to a proper subset of the real space. This problem is complicated by typically intractable normalizing constants. Score matching provides a powerful tool for estimating densities with such intractable normalizing constants but as originally proposed is limited to densities on $\mathbb{R}^m$ and $\mathbb{R}_+^m$. In this paper, we offer a natural generalization of score matching that accommodates densities supported on a very general class of domains. We apply the framework to truncated graphical and pairwise interaction models and provide theoretical guarantees for the resulting estimators. We also generalize a recently proposed method from bounded to unbounded domains and empirically demonstrate the advantages of our method.
Affiliation(s)
- Mathias Drton
  - Department of Mathematics, Technical University of Munich, 85748 Garching bei München, Germany
- Ali Shojaie
  - Department of Biostatistics, University of Washington, Seattle, Washington, 98195, USA

6
Strieder D, Drton M. On the choice of the splitting ratio for the split likelihood ratio test. Electron J Stat 2022. [DOI: 10.1214/22-ejs2099] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/30/2022]
Affiliation(s)
- David Strieder
  - Technical University of Munich; TUM School of Computation, Information and Technology, Munich Center for Machine Learning (MCML), Munich Data Science Institute (MDSI)
- Mathias Drton
  - Technical University of Munich; TUM School of Computation, Information and Technology, Munich Center for Machine Learning (MCML), Munich Data Science Institute (MDSI)

7
Yu S, Drton M, Promislow DEL, Shojaie A. CorDiffViz: an R package for visualizing multi-omics differential correlation networks. BMC Bioinformatics 2021; 22:486. [PMID: 34627139 PMCID: PMC8501646 DOI: 10.1186/s12859-021-04383-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2020] [Accepted: 09/20/2021] [Indexed: 11/22/2022]
Abstract
BACKGROUND: Differential correlation networks are increasingly used to delineate changes in interactions among biomolecules. They characterize differences between omics networks under two different conditions, and can be used to delineate mechanisms of disease initiation and progression.
RESULTS: We present a new R package, CorDiffViz, that facilitates the estimation and visualization of differential correlation networks using multiple correlation measures and inference methods. The software is implemented in R, HTML and JavaScript, and is available at https://github.com/sqyu/CorDiffViz. Visualization has been tested for the Chrome and Firefox web browsers. A demo is available at https://diffcornet.github.io/CorDiffViz/demo.html.
CONCLUSIONS: Our software offers considerable flexibility by allowing the user to interact with the visualization and choose from different estimation methods and visualizations. It also allows the user to easily toggle between correlation networks for samples under one condition and differential correlations between samples under two conditions. Moreover, the software facilitates integrative analysis of cross-correlation networks between two omics data sets.
Affiliation(s)
- Shiqing Yu
  - Department of Statistics, University of Washington, NE Stevens Way, Seattle, WA, 98195, USA
- Mathias Drton
  - Department of Mathematics, Technical University of Munich, Boltzmannstraße, 85748, Garching bei München, Germany
- Daniel E L Promislow
  - Departments of Pathology and Biology, University of Washington, NE Pacific St, Seattle, WA, 98195, USA
- Ali Shojaie
  - Department of Biostatistics, University of Washington, NE Pacific St, Seattle, WA, 98195, USA

8
Abstract
Summary
Chatterjee (2021) introduced a simple new rank correlation coefficient that has attracted much attention recently. The coefficient has the unusual appeal that it not only estimates a population quantity first proposed by Dette et al. (2013) that is zero if and only if the underlying pair of random variables is independent, but also is asymptotically normal under independence. This paper compares Chatterjee’s new correlation coefficient with three established rank correlations that also facilitate consistent tests of independence, namely Hoeffding’s $D$, Blum–Kiefer–Rosenblatt’s $R$, and Bergsma–Dassios–Yanagimoto’s $\tau^*$. We compare the computational efficiency of these rank correlation coefficients in light of recent advances, and investigate their power against local rotation and mixture alternatives. Our main results show that Chatterjee’s coefficient is unfortunately rate-suboptimal compared to $D$, $R$ and $\tau^*$. The situation is more subtle for a related earlier estimator of Dette et al. (2013). These results favour $D$, $R$ and $\tau^*$ over Chatterjee’s new correlation coefficient for the purpose of testing independence.
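As a rough illustration of the coefficient discussed in this abstract, the sketch below computes Chatterjee's $\xi_n$ in pure Python. It assumes the standard no-ties formula from Chatterjee (2021), $\xi_n = 1 - 3\sum_{i=1}^{n-1}|r_{i+1}-r_i|/(n^2-1)$, where $r_i$ is the rank of the $Y$ value paired with the $i$th smallest $X$. The formula is not stated in the abstract itself, and the function name `chatterjee_xi` is hypothetical.

```python
def chatterjee_xi(x, y):
    """Chatterjee's rank correlation xi_n for paired samples without ties.

    Assumption: neither x nor y contains repeated values; the tie-corrected
    variant of the coefficient is not implemented in this sketch.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    # Order the pairs by increasing x.
    order = sorted(range(n), key=lambda i: x[i])
    y_sorted = [y[i] for i in order]
    # Rank of each y value among all y values (1 = smallest).
    rank_of = {v: r for r, v in enumerate(sorted(y_sorted), start=1)}
    ranks = [rank_of[v] for v in y_sorted]
    # Sum of absolute differences between consecutive ranks.
    total = sum(abs(ranks[i + 1] - ranks[i]) for i in range(n - 1))
    return 1.0 - 3.0 * total / (n * n - 1)


if __name__ == "__main__":
    print(chatterjee_xi([1, 2, 3, 4, 5], [1, 4, 9, 16, 25]))
```

For five strictly increasing pairs the function returns 0.5, the largest value this formula can give at n = 5, because the maximum over all rank sequences is (n - 2)/(n + 1). The coefficient is not symmetric in its arguments, so swapping `x` and `y` can change the result.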
Affiliation(s)
- H Shi
  - Department of Statistics, University of Washington, Box 354322, Seattle, Washington 98195, U.S.A
- M Drton
  - Department of Mathematics, Technical University of Munich, Boltzmannstraße 3, 85748 Garching b. München, Germany
- F Han
  - Department of Statistics, University of Washington, Box 354322, Seattle, Washington 98195, U.S.A

9
Affiliation(s)
- Mathias Drton
  - Department of Mathematics, Technical University of Munich
- Peter Hoff
  - Department of Statistical Science, Duke University

12
Abstract
This paper concerns the development of an inferential framework for high-dimensional linear mixed effect models. These are suitable models, for instance, when we have n repeated measurements for M subjects. We consider a scenario where the number of fixed effects p is large (and may be larger than M), but the number of random effects q is small. Our framework is inspired by a recent line of work that proposes de-biasing penalized estimators to perform inference for high-dimensional linear models with fixed effects only. In particular, we demonstrate how to correct a 'naive' ridge estimator in extension of work by Bühlmann (2013) to build asymptotically valid confidence intervals for mixed effect models. We validate our theoretical results with numerical experiments, in which we show our method outperforms those that fail to account for correlation induced by the random effects. For a practical demonstration we consider a riboflavin production dataset that exhibits group structure, and show that conclusions drawn using our method are consistent with those obtained on a similar dataset without group structure.
Affiliation(s)
- Lina Lin
  - Department of Statistics, University of Washington
- Mathias Drton
  - Department of Mathematics, Technical University of Munich
- Ali Shojaie
  - Department of Biostatistics, University of Washington

13
Affiliation(s)
- Hongjian Shi
  - Department of Statistics, University of Washington, Seattle, WA
- Mathias Drton
  - Department of Mathematics, Technical University of Munich, Garching bei München, Germany
- Fang Han
  - Department of Statistics, University of Washington, Seattle, WA

14
Jin K, Wilson KA, Beck JN, Nelson CS, Brownridge GW, Harrison BR, Djukovic D, Raftery D, Brem RB, Yu S, Drton M, Shojaie A, Kapahi P, Promislow D. Genetic and metabolomic architecture of variation in diet restriction-mediated lifespan extension in Drosophila. PLoS Genet 2020; 16:e1008835. [PMID: 32644988 PMCID: PMC7347105 DOI: 10.1371/journal.pgen.1008835] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Received: 01/08/2020] [Accepted: 05/06/2020] [Indexed: 01/08/2023]
Abstract
In most organisms, dietary restriction (DR) increases lifespan. However, several studies have found that genotypes within the same species vary widely in how they respond to DR. To explore the mechanisms underlying this variation, we exposed 178 inbred Drosophila melanogaster lines to a DR or ad libitum (AL) diet, and measured a panel of 105 metabolites under both diets. Twenty-four out of 105 metabolites were associated with the magnitude of the lifespan response. These included proteinogenic amino acids and metabolites involved in α-ketoglutarate (α-KG)/glutamine metabolism. We confirm the role of α-KG/glutamine synthesis pathways in the DR response through genetic manipulations. We used covariance network analysis to investigate diet-dependent interactions between metabolites, identifying the essential amino acids threonine and arginine as “hub” metabolites in the DR response. Finally, we employ a novel metabolic and genetic bipartite network analysis to reveal multiple genes that influence DR lifespan response, some of which have not previously been implicated in DR regulation. One of these is CCHa2R, a gene that encodes a neuropeptide receptor that influences satiety response and insulin signaling. Across the lines, variation in an intronic single nucleotide variant of CCHa2R correlated with variation in levels of five metabolites, all of which in turn were correlated with DR lifespan response. Inhibition of adult CCHa2R expression extended DR lifespan of flies, confirming the role of CCHa2R in lifespan response. These results provide support for the power of combined genomic and metabolomic analysis to identify key pathways underlying variation in this complex quantitative trait.
Dietary restriction extends lifespan across most organisms in which it has been tested. However, several studies have now demonstrated that this effect can vary dramatically across different genotypes within a population. Within a population, dietary restriction might be beneficial for some, yet detrimental for others. Here, we measure the metabolome of 178 genetically characterized fly strains on fully fed and restricted diets. The fly strains vary widely in their lifespan response to dietary restriction. We then use information about each strain’s genome and metabolome (a measure of small molecules circulating in flies) to pinpoint cellular pathways that govern this variation in response. We identify a novel pathway involving the gene CCHa2R, which encodes a neuropeptide receptor that has not previously been implicated in dietary restriction or age-related signaling pathways. This study demonstrates the power of leveraging systems biology and network biology methods to understand how and why different individuals vary in their response to health and lifespan-extending interventions.
Affiliation(s)
- Kelly Jin
  - Department of Pathology, University of Washington School of Medicine, Seattle, Washington, United States of America
- Kenneth A. Wilson
  - Buck Institute for Research on Aging, Novato, California, United States of America
  - Davis School of Gerontology, University of Southern California, University Park, Los Angeles, California, United States of America
- Jennifer N. Beck
  - Buck Institute for Research on Aging, Novato, California, United States of America
- George W. Brownridge
  - Buck Institute for Research on Aging, Novato, California, United States of America
  - Dominican University of California, San Rafael, California, United States of America
- Benjamin R. Harrison
  - Department of Pathology, University of Washington School of Medicine, Seattle, Washington, United States of America
- Danijel Djukovic
  - Northwest Metabolomics Research Center, Department of Anesthesiology and Pain Medicine, University of Washington, Seattle, Washington, United States of America
- Daniel Raftery
  - Northwest Metabolomics Research Center, Department of Anesthesiology and Pain Medicine, University of Washington, Seattle, Washington, United States of America
- Rachel B. Brem
  - Buck Institute for Research on Aging, Novato, California, United States of America
  - Davis School of Gerontology, University of Southern California, University Park, Los Angeles, California, United States of America
  - Department of Plant and Microbial Biology, University of California, Berkeley, Berkeley, California, United States of America
- Shiqing Yu
  - Department of Statistics, University of Washington, Seattle, Washington, United States of America
- Mathias Drton
  - Department of Mathematics, Technical University of Munich, Munich, Germany
- Ali Shojaie
  - Department of Biostatistics, University of Washington, Seattle, Washington, United States of America
- Pankaj Kapahi
  - Buck Institute for Research on Aging, Novato, California, United States of America
  - Davis School of Gerontology, University of Southern California, University Park, Los Angeles, California, United States of America
- Daniel Promislow
  - Department of Pathology, University of Washington School of Medicine, Seattle, Washington, United States of America
  - Department of Biology, University of Washington, Seattle, Washington, United States of America

15
Abstract
Summary
We consider graphical models based on a recursive system of linear structural equations. This implies that there is an ordering, $\sigma$, of the variables such that each observed variable $Y_v$ is a linear function of a variable-specific error term and the other observed variables $Y_u$ with $\sigma(u) < \sigma (v)$. The causal relationships, i.e., which other variables the linear functions depend on, can be described using a directed graph. It has previously been shown that when the variable-specific error terms are non-Gaussian, the exact causal graph, as opposed to a Markov equivalence class, can be consistently estimated from observational data. We propose an algorithm that yields consistent estimates of the graph also in high-dimensional settings in which the number of variables may grow at a faster rate than the number of observations, but in which the underlying causal structure features suitable sparsity; specifically, the maximum in-degree of the graph is controlled. Our theoretical analysis is couched in the setting of log-concave error distributions.
Affiliation(s)
- Y Samuel Wang
  - Booth School of Business, The University of Chicago, 5807 South Woodlawn Avenue, Chicago, Illinois, U.S.A
- Mathias Drton
  - Department of Mathematics, Technical University of Munich, Boltzmannstraße 3, Garching bei München, Germany

16
Abstract
Summary
Prior work has shown that causal structure can be uniquely identified from observational data when these follow a structural equation model whose error terms have equal variance. We show that this fact is implied by an ordering among conditional variances. We demonstrate that ordering estimates of these variances yields a simple yet state-of-the-art method for causal structure learning that is readily extendable to high-dimensional problems.
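The idea of ordering variables by conditional variances can be sketched as a greedy procedure. The code below is only one possible reading of the abstract, not necessarily the authors' algorithm: it assumes a known (population) covariance matrix, repeatedly picks the variable with the smallest variance conditional on the variables already chosen, and uses a Schur complement update to condition on each pick. The function name `equal_variance_order` and the example matrix are hypothetical.

```python
def equal_variance_order(cov):
    """Greedy causal ordering from a covariance matrix given as a list of lists.

    Assumption: the data follow a linear structural equation model whose error
    terms share one variance, and cov is positive definite.
    """
    p = len(cov)
    resid = [row[:] for row in cov]  # covariance given the chosen variables
    remaining = list(range(p))
    order = []
    while remaining:
        # The next variable is the one with the smallest conditional variance.
        k = min(remaining, key=lambda j: resid[j][j])
        pivot = resid[k][k]
        if pivot <= 0:
            raise ValueError("covariance matrix is not positive definite")
        order.append(k)
        remaining.remove(k)
        col = [resid[i][k] for i in range(p)]
        # Schur complement update: condition the remaining variables on X_k.
        for i in remaining:
            for j in remaining:
                resid[i][j] -= col[i] * col[j] / pivot
    return order


if __name__ == "__main__":
    # Chain A -> B -> C with unit coefficients and unit error variances,
    # stored in the order (C, A, B).
    cov = [[3.0, 1.0, 2.0],
           [1.0, 1.0, 1.0],
           [2.0, 1.0, 2.0]]
    print(equal_variance_order(cov))
```

In this example the procedure returns the indices in the order A, B, C, that is `[1, 2, 0]`. With real data the covariance matrix would be replaced by a sample estimate, and parents would still have to be selected among the predecessors in the ordering.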
Affiliation(s)
- Wenyu Chen
  - Department of Statistics, University of Washington, Box 354322, Seattle, Washington, U.S.A
- Mathias Drton
  - Department of Mathematics, Technical University of Munich, Boltzmannstraße 3, 85748 Garching bei München, Germany
- Y Samuel Wang
  - Booth School of Business, The University of Chicago, 5807 South Woodlawn Avenue, Chicago, Illinois, U.S.A

18
Abstract
Bulk gene expression experiments relied on aggregations of thousands of cells to measure the average expression in an organism. Advances in microfluidic and droplet sequencing now permit expression profiling in single cells. This study of cell-to-cell variation reveals that individual cells lack detectable expression of transcripts that appear abundant on a population level, giving rise to zero-inflated expression patterns. To infer gene co-regulatory networks from such data, we propose a multivariate Hurdle model, which comprises a mixture of singular Gaussian distributions. We employ neighborhood selection with the pseudo-likelihood and a group lasso penalty to select and fit undirected graphical models that capture conditional independences between genes. The proposed method is more sensitive than existing approaches in simulations, even under departures from our Hurdle model. The method is applied to data for T follicular helper cells, and a high-dimensional profile of mouse dendritic cells. It infers network structure not revealed by other methods or in bulk data sets. An R implementation is available at https://github.com/amcdavid/HurdleNormal.
Affiliation(s)
- Andrew McDavid
  - Department of Biostatistics and Computational Biology, University of Rochester Medical Center, Rochester, New York
- Raphael Gottardo
  - Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center
  - Department of Statistics, University of Washington, Seattle, Washington
- Noah Simon
  - Department of Biostatistics, University of Washington, Seattle, Washington
- Mathias Drton
  - Department of Statistics, University of Washington, Seattle, Washington
  - Department of Mathematical Sciences, University of Copenhagen, Denmark

19
Yu S, Drton M, Shojaie A. Generalized score matching for non-negative data. J Mach Learn Res 2019; 20:76. [PMID: 34290571 PMCID: PMC8291733] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
A common challenge in estimating parameters of probability density functions is the intractability of the normalizing constant. While in such cases maximum likelihood estimation may be implemented using numerical integration, the approach becomes computationally intensive. The score matching method of Hyvärinen (2005) avoids direct calculation of the normalizing constant and yields closed-form estimates for exponential families of continuous distributions over $\mathbb{R}^m$. Hyvärinen (2007) extended the approach to distributions supported on the non-negative orthant, $\mathbb{R}_+^m$. In this paper, we give a generalized form of score matching for non-negative data that improves estimation efficiency. As an example, we consider a general class of pairwise interaction models. Addressing an overlooked inexistence problem, we generalize the regularized score matching method of Lin et al. (2016) and improve its theoretical guarantees for non-negative Gaussian graphical models.
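To make the closed-form property of the original score matching method concrete, here is a minimal sketch for the simplest case, chosen for illustration and not taken from the paper: a centered univariate Gaussian with unknown precision $k$, so that $\log p(x) = -k x^2/2 + \text{const}$. The empirical score matching objective of Hyvärinen (2005) is then $J(k) = \frac{1}{n}\sum_i \left(\tfrac{1}{2}k^2 x_i^2 - k\right)$, which is minimized at $k = 1/\overline{x^2}$ without ever computing the normalizing constant. The function names are hypothetical.

```python
def score_matching_objective(k, xs):
    """Empirical score matching loss for p(x) proportional to exp(-k * x**2 / 2).

    The score is d/dx log p = -k * x and its derivative is -k, so each sample
    contributes 0.5 * (k * x)**2 - k.
    """
    n = len(xs)
    return sum(0.5 * (k * x) ** 2 - k for x in xs) / n


def score_matching_precision(xs):
    """Closed-form minimizer of the objective above: 1 / mean(x**2)."""
    n = len(xs)
    second_moment = sum(x * x for x in xs) / n
    if second_moment == 0:
        raise ValueError("all observations are zero; the estimate does not exist")
    return 1.0 / second_moment


if __name__ == "__main__":
    data = [-2.0, -1.0, 0.0, 1.0, 2.0]
    k_hat = score_matching_precision(data)
    print(k_hat, score_matching_objective(k_hat, data))
```

In this toy case the estimate coincides with the maximum likelihood estimate of the precision. The generalized method described in the abstract targets non-negative data and is not implemented here.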
Affiliation(s)
- Shiqing Yu
  - Department of Statistics, University of Washington, Seattle, WA, U.S.A
- Mathias Drton
  - Department of Mathematical Sciences, University of Copenhagen, Copenhagen, Denmark
  - Department of Statistics, University of Washington, Seattle, WA, U.S.A
- Ali Shojaie
  - Department of Biostatistics, University of Washington, Seattle, WA, U.S.A

20
Drton M, Fox C, Wang YS. Computation of maximum likelihood estimates in cyclic structural equation models. Ann Stat 2019. [DOI: 10.1214/17-aos1602] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]

21
Affiliation(s)
- L Weihs
  - Department of Statistics, University of Washington, Box 354322, Seattle, Washington, U.S.A
- M Drton
  - Department of Statistics, University of Washington, Box 354322, Seattle, Washington, U.S.A
- N Meinshausen
  - Seminar for Statistics, Eidgenössische Technische Hochschule Zürich, Rämistrasse 101, Zürich, Switzerland

22
Affiliation(s)
- Shota Katayama
  - Department of Industrial Engineering and Economics; Tokyo Institute of Technology; 2-12-1 Ookayama Meguro-ku 152-8552 Tokyo Japan
- Hironori Fujisawa
  - The Institute of Statistical Mathematics; 10-3 Midori-cho, Tachikawa; 190-8562 Tokyo Japan
  - Graduate School of Medicine; Nagoya University; 65 Tsurumai-cho, Showa-ku Nagoya 466-8550 Japan
- Mathias Drton
  - Department of Statistics; University of Washington; Seattle 98195-4322 WA USA

24
Affiliation(s)
- Y. Samuel Wang
  - Department of Statistics; University of Washington; Seattle 98103 WA USA
- Mathias Drton
  - Department of Statistics; University of Washington; Seattle 98103 WA USA

26
Keller JP, Drton M, Larson T, Kaufman JD, Sandler DP, Szpiro AA. Covariate-adaptive clustering of exposures for air pollution epidemiology cohorts. Ann Appl Stat 2017; 11:93-113. [PMID: 28572869 PMCID: PMC5448716 DOI: 10.1214/16-aoas992] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Indexed: 12/15/2022]
Abstract
Cohort studies in air pollution epidemiology aim to establish associations between health outcomes and air pollution exposures. Statistical analysis of such associations is complicated by the multivariate nature of the pollutant exposure data as well as the spatial misalignment that arises from the fact that exposure data are collected at regulatory monitoring network locations distinct from cohort locations. We present a novel clustering approach for addressing this challenge. Specifically, we present a method that uses geographic covariate information to cluster multi-pollutant observations and predict cluster membership at cohort locations. Our predictive k-means procedure identifies centers using a mixture model and is followed by multi-class spatial prediction. In simulations, we demonstrate that predictive k-means can reduce misclassification error by over 50% compared to ordinary k-means, with minimal loss in cluster representativeness. The improved prediction accuracy results in large gains of 30% or more in power for detecting effect modification by cluster in a simulated health analysis. In an analysis of the NIEHS Sister Study cohort using predictive k-means, we find that the association between systolic blood pressure (SBP) and long-term fine particulate matter (PM2.5) exposure varies significantly between different clusters of PM2.5 component profiles. Our cluster-based analysis shows that for subjects assigned to a cluster located in the Midwestern U.S., a 10 μg/m³ difference in exposure is associated with 4.37 mmHg (95% CI: 2.38, 6.35) higher SBP.
Affiliation(s)
- Joshua P Keller
  - Department of Biostatistics, University of Washington, Box 357232, Health Sciences Building, F-600, 1705 NE Pacific Street, Seattle, WA 98195
- Mathias Drton
  - Department of Statistics, University of Washington, Box 354322, Seattle, WA 98195
- Timothy Larson
  - Department of Civil and Environmental Engineering, University of Washington, Box 352700, 201 More Hall, Seattle, WA 98195
- Joel D Kaufman
  - Department of Environmental and Occupational Health Sciences, University of Washington, Box 354695, 4225 Roosevelt Way NE, Seattle, WA 98105
- Dale P Sandler
  - Epidemiology Branch, National Institute of Environmental Health Sciences, P.O. Box 12233, Mail Drop A3-05, 111 T W Alexander Dr, Research Triangle Park, NC 27709
- Adam A Szpiro
  - Department of Biostatistics, University of Washington, Box 357232, Health Sciences Building, F-600, 1705 NE Pacific Street, Seattle, WA 98195

28
Affiliation(s)
- Luca Weihs
  - Department of Statistics, University of Washington

34
Abstract
Graphical models are widely used to model stochastic dependences among large collections of variables. We introduce a new method of estimating undirected conditional independence graphs based on the score matching loss, introduced by Hyvärinen (2005), and subsequently extended in Hyvärinen (2007). The regularized score matching method we propose applies to settings with continuous observations and allows for computationally efficient treatment of possibly non-Gaussian exponential family models. In the well-explored Gaussian setting, regularized score matching avoids issues of asymmetry that arise when applying the technique of neighborhood selection, and compared to existing methods that directly yield symmetric estimates, the score matching approach has the advantage that the considered loss is quadratic and gives piecewise linear solution paths under ℓ1 regularization. Under suitable irrepresentability conditions, we show that ℓ1-regularized score matching is consistent for graph estimation in sparse high-dimensional settings. Through numerical experiments and an application to RNAseq data, we confirm that regularized score matching achieves state-of-the-art performance in the Gaussian case and provides a valuable tool for computationally efficient estimation in non-Gaussian graphical models.
Affiliation(s)
- Lina Lin
  - Department of Statistics, University of Washington, Seattle, WA 98195, U.S.A
- Mathias Drton
  - Department of Statistics, University of Washington, Seattle, WA 98195, U.S.A
- Ali Shojaie
  - Department of Biostatistics, University of Washington, Seattle, WA 98195, U.S.A

35
Kwok H, Coult J, Drton M, Rea TD, Sherman L. Adaptive rhythm sequencing: A method for dynamic rhythm classification during CPR. Resuscitation 2015; 91:26-31. [DOI: 10.1016/j.resuscitation.2015.02.031] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 11/27/2014] [Revised: 01/31/2015] [Accepted: 02/18/2015] [Indexed: 10/23/2022]

41
Abstract
In phylogenetic inference, an evolutionary model describes the substitution processes along each edge of a phylogenetic tree. Misspecification of the model has important implications for the analysis of phylogenetic data. Conventionally, however, the selection of a suitable evolutionary model is based on heuristics or relies on the choice of an approximate input tree. We introduce a method for model Selection in Phylogenetics based on linear INvariants (SPIn), which uses recent insights on linear invariants to characterize a model of nucleotide evolution for phylogenetic mixtures on any number of components. Linear invariants are constraints among the joint probabilities of the bases in the operational taxonomic units that hold irrespective of the tree topologies appearing in the mixtures. SPIn therefore requires no input tree and is designed to deal with nonhomogeneous phylogenetic data consisting of multiple sequence alignments showing different patterns of evolution, for example, concatenated genes, exons, and/or introns. Here, we report on the results of the proposed method evaluated on multiple sequence alignments simulated under a variety of single-tree and mixture settings for both continuous- and discrete-time models. In the simulations, SPIn successfully recovers the underlying evolutionary model and is shown to perform better than existing approaches.
Affiliation(s)
- A M Kedzierska
  - Bioinformatics and Genomics Group, Centre for Genomic Regulation (CRG) and UPF, Barcelona, Catalonia, Spain