26. Biasing the input: A yoked-scientist demonstration of the distorting effects of optional stopping on Bayesian inference. Behav Res Methods 2021; 54:1131-1147. PMID: 34494220. DOI: 10.3758/s13428-021-01618-1.
Abstract
Prior work by Michael R. Dougherty and colleagues (Yu et al., 2014) shows that when a scientist monitors the p value during data collection and uses a critical p as the signal to stop collecting data, the resulting p is distorted due to Type I error-rate inflation. They argued similarly that the use of a critical Bayes factor (BFcrit) for stopping distorts the obtained Bayes factor (BF), a position that has met with controversy. The present paper clarifies that when BFcrit is used as a stopping criterion, the sample becomes biased: data consistent with large effects have a greater chance of being included than other data, thus biasing the input to Bayesian inference. We report simulations of yoked pairs of scientists in which Scientist A uses BFcrit to optionally stop, while Scientist B, sampling from the same population, stops when A stops. Thus, optional stopping is compared not to a hypothetical in which no stopping occurs, but to a situation in which B stops for reasons unrelated to the characteristics of B's sample. The results indicated that optional stopping biased the input for Bayesian inference. We also simulated the use of effect-size stabilization as a stopping criterion and found no bias in that case.
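The p-value half of this argument is easy to reproduce. Below is a minimal simulation, assuming a one-sample t-test with peeking after every observation under a true null; it illustrates the Type I error-rate inflation described above, not the authors' yoked-scientist code (which pairs a stopping scientist with a yoked control).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
n_sims, n_min, n_max, alpha = 5_000, 10, 100, 0.05
stopped_significant = 0

for _ in range(n_sims):
    x = rng.standard_normal(n_max)       # true effect is exactly zero
    for n in range(n_min, n_max + 1):    # peek after every new observation
        if stats.ttest_1samp(x[:n], 0.0).pvalue < alpha:
            stopped_significant += 1     # optional stopping: quit while "significant"
            break

# A single fixed-n test would reject ~5% of the time; monitoring inflates this severely.
print("Type I error under optional stopping:", stopped_significant / n_sims)
```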
27. Khan U, Khan AM, Alkatheery N, Khan U. Pandemic and its effect on professional environment on the Kingdom of Saudi Arabia. Environ Sci Pollut Res Int 2021; 28:41162-41168. PMID: 33779902. PMCID: PMC8006104. DOI: 10.1007/s11356-021-13501-9.
Abstract
The pandemic has affected the world from many different perspectives, including environmental change. This research study aims to investigate the pandemic and its associated effect on the professional environment by measuring some of the parameters that are likely to disclose the impact of the pandemic. A structured questionnaire was designed to capture the effect of COVID-19; 284 respondents participated and presented their views on a series of statements using a Likert scale. The factor analysis revealed five factors, which were further tested by hypothesis testing and binary logistic regression; factors 2, 3, and 5 were found to be significant in both tests.
28. Chen S, Lin X. Analysis in case-control sequencing association studies with different sequencing depths. Biostatistics 2021; 21:577-593. PMID: 30590456. PMCID: PMC7308042. DOI: 10.1093/biostatistics/kxy073.
Abstract
With the advent of next-generation sequencing, investigators have access to higher-quality sequencing data. However, sequencing all samples in a study using next-generation sequencing can still be prohibitively expensive. One potential remedy is to combine next-generation sequencing data from cases with publicly available sequencing data for controls, but there can be a systematic difference in the quality of the sequenced data, such as the sequencing depths, between sequenced study cases and publicly available controls. We propose a regression calibration (RC)-based method and a maximum-likelihood method for conducting an association study with such a combined sample while accounting for differential sequencing errors between cases and controls. The methods allow for adjusting for covariates, such as population stratification, as confounders. Both methods control the type I error and have power comparable to an analysis conducted using the true genotypes with sufficiently high but different sequencing depths. We show that the RC method allows for analysis using a naive variance estimate (which closely approximates the true variance in practice) and standard software under certain circumstances. We evaluate the performance of the proposed methods using simulation studies and apply our methods to a combined data set of exome-sequenced acute lung injury cases and healthy controls from the 1000 Genomes Project.
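As a rough illustration of the regression-calibration idea invoked above: replace the error-prone genotype call with an estimate of its conditional expectation, then fit a standard logistic regression. The sketch below calibrates against a hypothetical validation subsample with known genotypes; the paper instead derives the calibration from the differential sequencing-error structure, so treat all names, the data-generating process, and the calibration model here as illustrative assumptions.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
n = 2_000
g_true = rng.binomial(2, 0.3, size=n)                  # true genotype (0/1/2)
depth = rng.choice([5.0, 30.0], size=n)                # low vs high sequencing depth
g_obs = g_true + rng.normal(0, 1.5 / np.sqrt(depth))   # noisier calls at low depth
y = rng.binomial(1, 1 / (1 + np.exp(-(-1.0 + 0.5 * g_true))))  # disease status

# Step 1 (calibration): on a hypothetical validation subset with true genotypes,
# model E[G | observed call, depth] with a simple linear regression.
val = rng.random(n) < 0.2
X_cal = sm.add_constant(np.column_stack([g_obs[val], depth[val]]))
cal = sm.OLS(g_true[val], X_cal).fit()

# Step 2 (substitution): impute E[G | g_obs, depth] for everyone, then run a
# standard logistic regression with the naive variance estimate.
g_hat = cal.predict(sm.add_constant(np.column_stack([g_obs, depth])))
fit = sm.Logit(y, sm.add_constant(g_hat)).fit(disp=0)
print(fit.summary2().tables[1])
```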
29.
Abstract
Recently, optional stopping has been a subject of debate in the Bayesian psychology community. Rouder (Psychonomic Bulletin & Review 21(2), 301-308, 2014) argues that optional stopping is no problem for Bayesians, and even recommends the use of optional stopping in practice, as do Wagenmakers, Wetzels, Borsboom, van der Maas, and Kievit (Perspectives on Psychological Science 7, 627-633, 2012). This article addresses the question of whether optional stopping is problematic for Bayesian methods, and specifies under which circumstances, and in which sense, it is and is not. By slightly varying and extending Rouder's (2014) experiments, we illustrate that, as soon as the parameters of interest are equipped with default or pragmatic priors (which is the case in most practical applications of Bayes factor hypothesis testing), resilience to optional stopping can break down. We distinguish between three types of default priors, each with its own specific issues with optional stopping, ranging from no problem at all (type 0 priors) to quite severe (type II priors).
30. Liu Z, Shen J, Barfield R, Schwartz J, Baccarelli AA, Lin X. Large-Scale Hypothesis Testing for Causal Mediation Effects with Applications in Genome-wide Epigenetic Studies. J Am Stat Assoc 2021; 117:67-81. PMID: 35989709. PMCID: PMC9385159. DOI: 10.1080/01621459.2021.1914634.
Abstract
In genome-wide epigenetic studies, it is of great scientific interest to assess whether the effect of an exposure on a clinical outcome is mediated through DNA methylation. However, statistical inference for causal mediation effects is challenged by the fact that one needs to test a large number of composite null hypotheses across the whole epigenome. Two popular tests, the Wald-type Sobel's test and the joint significance test using the traditional null distribution, are underpowered and thus can miss important scientific discoveries. In this paper, we show that the null distribution of Sobel's test is not the standard normal distribution, and the null distribution of the joint significance test is not uniform, under the composite null of no mediation effect, especially in finite samples and under the singular point null case in which the exposure has no effect on the mediator and the mediator has no effect on the outcome. Our results explain why these two tests are underpowered and, more importantly, motivate us to develop a more powerful Divide-Aggregate Composite-null Test (DACT) for the composite null hypothesis of no mediation effect by leveraging epigenome-wide data. We adopted Efron's empirical null framework for assessing the statistical significance of the DACT test. We showed analytically that the proposed DACT method had improved power and could well control the type I error rate. Our extensive simulation studies showed that, in finite samples, the DACT method properly controlled the type I error rate and outperformed Sobel's test and the joint significance test for detecting mediation effects. We applied the DACT method to the US Department of Veterans Affairs Normative Aging Study, an ongoing prospective cohort study which included men who were aged 21 to 80 years at entry. We identified multiple DNA methylation CpG sites that might mediate the effect of smoking on lung function, with effect sizes ranging from -0.18 to -0.79 and the false discovery rate controlled at level 0.05, including CpG sites in the genes AHRR and F2RL3. Our sensitivity analysis found small residual correlations (less than 0.01) of the error terms between the outcome and mediator regressions, suggesting that our results are robust to unmeasured confounding factors.
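The non-uniformity of the joint significance test under the singular null is straightforward to check numerically. A minimal sketch (not the authors' DACT code), assuming the two path-specific p-values are independent Uniform(0,1) when both effects are zero:

```python
import numpy as np

rng = np.random.default_rng(0)
n_sims = 100_000

# Under the singular composite null, both path p-values are independent Uniform(0,1).
p_alpha = rng.uniform(size=n_sims)   # exposure -> mediator path
p_beta = rng.uniform(size=n_sims)    # mediator -> outcome path

p_joint = np.maximum(p_alpha, p_beta)  # joint significance test "p-value"

# If p_joint were uniform, rejection at 0.05 would occur 5% of the time.
print("P(p_joint <= 0.05):", np.mean(p_joint <= 0.05))       # ~0.0025 = 0.05**2
# Squaring restores uniformity in this singular case: P(max <= t) = t^2.
print("P(p_joint**2 <= 0.05):", np.mean(p_joint**2 <= 0.05))  # ~0.05
```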
31.
Abstract
When data are not normally distributed, researchers are often uncertain whether it is legitimate to use tests that assume Gaussian errors, or whether one has to either model a more specific error structure or use randomization techniques. Here we use Monte Carlo simulations to explore the pros and cons of fitting Gaussian models to non-normal data in terms of risk of type I error, power and utility for parameter estimation. We find that Gaussian models are robust to non-normality over a wide range of conditions, meaning that p values remain fairly reliable except for data with influential outliers judged at strict alpha levels. Gaussian models also performed well in terms of power across all simulated scenarios. Parameter estimates were mostly unbiased and precise except if sample sizes were small or the distribution of the predictor was highly skewed. Transformation of data before analysis is often advisable and visual inspection for outliers and heteroscedasticity is important for assessment. In strong contrast, some non-Gaussian models and randomization techniques bear a range of risks that are often insufficiently known. High rates of false-positive conclusions can arise for instance when overdispersion in count data is not controlled appropriately or when randomization procedures ignore existing non-independencies in the data. Hence, newly developed statistical methods not only bring new opportunities, but they can also pose new threats to reliability. We argue that violating the normality assumption bears risks that are limited and manageable, while several more sophisticated approaches are relatively error prone and particularly difficult to check during peer review. Scientists and reviewers who are not fully aware of the risks might benefit from preferentially trusting Gaussian mixed models in which random effects account for non-independencies in the data.
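A minimal sketch in the spirit of these simulations: draw two groups from the same skewed distribution (so the null is true), fit the Gaussian-error model, and record how often it falsely rejects. The distribution and sample sizes are illustrative choices, not those of the paper.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_sims, n = 10_000, 30
false_pos = 0

for _ in range(n_sims):
    # Two groups drawn from the SAME skewed (lognormal) distribution: H0 is true.
    a = rng.lognormal(mean=0.0, sigma=1.0, size=n)
    b = rng.lognormal(mean=0.0, sigma=1.0, size=n)
    # Gaussian-error model: ordinary two-sample t-test.
    if stats.ttest_ind(a, b).pvalue < 0.05:
        false_pos += 1

print("Empirical type I error:", false_pos / n_sims)  # typically close to 0.05
```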
32. Robinson MA, Vanrenterghem J, Pataky TC. Sample size estimation for biomechanical waveforms: Current practice, recommendations and a comparison to discrete power analysis. J Biomech 2021; 122:110451. PMID: 33933866. DOI: 10.1016/j.jbiomech.2021.110451.
Abstract
Testing a prediction is fundamental to scientific experiments. Where biomechanical experiments involve analysis of 1-Dimensional (waveform) data, sample size estimation should consider both 1D variance and hypothesised 1D effects. This study exemplifies 1D sample size estimation using typical biomechanical signals and contrasts this with 0D (discrete) power analysis. For context, biomechanics papers from 2018 and 2019 were reviewed to characterise current practice. Sample size estimation occurred in approximately 4% of 653 papers and reporting practice was mixed. To estimate sample sizes, common biomechanical signals were sourced from the literature and 1D effects were generated artificially using the open-source power1d software. Smooth Gaussian noise was added to the modelled 1D effect to numerically estimate the sample size required. Sample sizes estimated using 1D power procedures varied according to the characteristics of the dataset, requiring only small-to-moderate sample sizes of approximately 5-40 to achieve target powers of 0.8 for reported 1D effects, but were always larger than 0D sample sizes (from N + 1 to >N + 20). The importance of a priori sample size estimation is highlighted and recommendations are provided to improve the consistency of reporting. This study should enable researchers to construct 1D biomechanical effects to address adequately powered, hypothesis-driven, predictive research questions.
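A generic numerical sketch of the procedure described above: model a smooth 1D effect, add smooth Gaussian noise, and estimate power from the maximum t-statistic across the waveform. The power1d package named in the abstract implements this properly; here the threshold is calibrated by brute-force null simulation, and all signal and smoothness parameters are illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

rng = np.random.default_rng(2)
Q = 101                                                 # nodes per waveform
t = np.linspace(0, 1, Q)
effect = 0.8 * np.exp(-0.5 * ((t - 0.5) / 0.1) ** 2)    # hypothesised 1D effect (peak d = 0.8)

def smooth_noise(n, sigma=10.0):
    """n smooth, unit-variance Gaussian noise waveforms."""
    e = gaussian_filter1d(rng.standard_normal((n, Q)), sigma, axis=1)
    return e / e.std(axis=1, keepdims=True)

def max_t(n, signal=0.0):
    """Maximum |t| across the waveform for a one-sample design."""
    y = signal + smooth_noise(n)
    return np.abs(y.mean(0) / (y.std(0, ddof=1) / np.sqrt(n))).max()

for n in (5, 10, 20, 40):
    crit = np.quantile([max_t(n) for _ in range(500)], 0.95)        # null max-t threshold
    power = np.mean([max_t(n, effect) > crit for _ in range(500)])  # 1D power estimate
    print(f"n = {n:2d}   1D power ~ {power:.2f}")
```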
33. Li G, Walter SD, Thabane L. Shifting the focus away from binary thinking of statistical significance and towards education for key stakeholders: revisiting the debate on whether it's time to de-emphasize or get rid of statistical significance. J Clin Epidemiol 2021; 137:104-112. PMID: 33839240. DOI: 10.1016/j.jclinepi.2021.03.033.
Abstract
There has been a long-standing controversy among scientists regarding the appropriate use of P-values and statistical significance in clinical research. This debate has resurfaced through recent calls to modify the threshold of P-value required to declare significance, or to retire statistical significance entirely. In this article, we revisit the issue by discussing: i) the connection between statistical thinking and evidence-based practice; ii) some history of statistical significance and P-values; iii) some practical challenges with statistical significance or P-value thresholds in clinical research; iv) the on-going debate on what to do with statistical significance; v) suggestions to shift the focus away from binary thinking of statistical significance and towards education for key stakeholders on research essentials including statistical thinking, critical thinking, good reporting, basic clinical research concepts and methods, and more. We then conclude with remarks and illustrations of the potential deleterious public health consequences of poor methods including selective choice of analysis approach and misguided reliance on binary use of P-values to report and interpret scientific findings.
34. Huang Y, Cho J, Fong Y. Threshold-based subgroup testing in logistic regression models in two-phase sampling designs. J R Stat Soc Ser C Appl Stat 2021; 70:291-311. PMID: 33840863. PMCID: PMC8032557. DOI: 10.1111/rssc.12459.
Abstract
The effect of treatment on a binary disease outcome can differ across subgroups characterized by other covariates. Testing for the existence of subgroups that are associated with heterogeneous treatment effects can provide valuable insight regarding the optimal treatment recommendation in practice. Our research in this paper is motivated by the question of whether host genetics could modify a vaccine's effect on HIV acquisition risk. To answer this question, we used data from an HIV vaccine trial with a two-phase sampling design and developed a general threshold-based model framework to test for the existence of subgroups associated with heterogeneity in disease risks, allowing for subgroups based on multivariate covariates. We developed a testing procedure based on the maximum of likelihood-ratio statistics over change planes and demonstrated its advantage over alternative methods. We further extended the testing procedure to account for biased sampling of expensive (i.e., resource-intensive to measure) covariates through the incorporation of inverse probability weighting techniques. We used the proposed method to analyze the motivating HIV vaccine trial data. Our proposed testing procedure also has broad applications in epidemiological studies for assessing heterogeneity in disease risk with respect to univariate or multivariate predictors.
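A simplified sketch of the core testing idea: scan candidate thresholds, compute a likelihood-ratio statistic for a threshold-defined treatment-by-subgroup interaction at each, and take the maximum, which is no longer chi-square distributed. This collapses the paper's change planes to a single univariate threshold and uses a global-null permutation for calibration, so it illustrates the logic rather than the authors' procedure (which also adds inverse probability weighting for the two-phase design); all names are illustrative.

```python
import numpy as np
import statsmodels.api as sm

def max_lr(y, treat, x, grid):
    """Max likelihood-ratio statistic over subgroup thresholds x > c."""
    base = sm.Logit(y, sm.add_constant(np.column_stack([treat, x]))).fit(disp=0)
    lr = []
    for c in grid:
        sub = (x > c).astype(float)
        X = sm.add_constant(np.column_stack([treat, x, sub * treat]))
        full = sm.Logit(y, X).fit(disp=0)
        lr.append(2 * (full.llf - base.llf))
    return max(lr)

rng = np.random.default_rng(3)
n = 500
treat = rng.integers(0, 2, n)
x = rng.normal(size=n)
logit = -1.0 + 0.3 * treat + 0.8 * treat * (x > 0.5)   # true subgroup effect for x > 0.5
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))

grid = np.quantile(x, np.linspace(0.1, 0.9, 17))
obs = max_lr(y, treat, x, grid)
# Permutation calibration: the max over thresholds is not chi-square distributed.
# (Permuting treatment tests the global null; heterogeneity-specific inference needs more care.)
perm = [max_lr(y, rng.permutation(treat), x, grid) for _ in range(200)]
print("permutation p-value:", np.mean([p >= obs for p in perm]))
```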
35. Artificial cognition: How experimental psychology can help generate explainable artificial intelligence. Psychon Bull Rev 2020; 28:454-475. PMID: 33159244. DOI: 10.3758/s13423-020-01825-5.
Abstract
Artificial intelligence powered by deep neural networks has reached a level of complexity where it can be difficult or impossible to express how a model makes its decisions. This black-box problem is especially concerning when the model makes decisions with consequences for human well-being. In response, an emerging field called explainable artificial intelligence (XAI) aims to increase the interpretability, fairness, and transparency of machine learning. In this paper, we describe how cognitive psychologists can make contributions to XAI. The human mind is also a black box, and cognitive psychologists have over 150 years of experience modeling it through experimentation. We ought to translate the methods and rigor of cognitive psychology to the study of artificial black boxes in the service of explainability. We provide a review of XAI for psychologists, arguing that current methods possess a blind spot that can be complemented by the experimental cognitive tradition. We also provide a framework for research in XAI, highlight exemplary cases of experimentation within XAI inspired by psychological science, and provide a tutorial on experimenting with machines. We end by noting the advantages of an experimental approach and invite other psychologists to conduct research in this exciting new field.
36. Fudge DS, Turko AJ. The best predictions in experimental biology are critical and persuasive. J Exp Biol 2020; 223:jeb231894. PMID: 33046579. DOI: 10.1242/jeb.231894.
Abstract
A powerful way to evaluate scientific explanations (hypotheses) is to test the predictions that they make. In this way, predictions serve as an important bridge between abstract hypotheses and concrete experiments. Experimental biologists, however, generally receive little guidance on how to generate quality predictions. Here, we identify two important components of good predictions - criticality and persuasiveness - which relate to the ability of a prediction (and the experiment it implies) to disprove a hypothesis or to convince a skeptic that the hypothesis has merit. Using a detailed example, we demonstrate how striving for predictions that are both critical and persuasive can speed scientific progress by leading us to more powerful experiments. Finally, we provide a quality control checklist to assist students and researchers as they navigate the hypothetico-deductive method from puzzling observations to experimental tests.
37.
Abstract
While the applied psychology community relies on statistics to assist in drawing conclusions from quantitative data, most of the methods in use today do not reflect the advances in statistics realized over the past decades. We show in this paper how a number of issues in the way statistical analyses are presently executed and reported in the literature can be addressed by applying more modern methods. Unfortunately, such new methods are not always supported by widely available statistical packages, such as SPSS, which is why we also introduce a new software platform, called ILLMO (Interactive Log-Likelihood MOdeling), which offers an intuitive interface to such modern statistical methods. To limit the complexity of the material covered in this paper, we focus the discussion on a fairly simple but nevertheless very frequent and important statistical task: comparing two experimental conditions.
38. Garrido Wainer JM, Espinosa JF, Hirmas N, Trujillo N. Free-viewing as experimental system to test the Temporal Correlation Hypothesis: A case of theory-generative experimental practice. Stud Hist Philos Biol Biomed Sci 2020; 83:101307. PMID: 32467019. DOI: 10.1016/j.shpsc.2020.101307.
Abstract
Theory-free characterizations of experimental systems miss normative and conceptual components that sometimes are crucial to understanding their historical development. In the following paper, we show that these components may be part of the intrinsic capacities of experimental systems themselves. We study a case of non-exploratory and theory-oriented research in experimental neuroscience that concerns the construction of free-viewing as an experimental system to test one particular pre-existing hypothesis, the Temporal Correlation Hypothesis (TCH), at a laboratory in Santiago de Chile, during 2002-2008. We show that the system does not take well-formulated pre-existing predictions or hypotheses to test them directly, but re-creates them and re-signifies them in terms that are not implied by the theoretical background from which they originally derived. Therefore, we conclude that there is a sui generis way in which experimental systems produce proper theoretical knowledge.
39.
Abstract
Background: In medical research and practice, the p-value is arguably the most often used statistic, and yet it is widely misconstrued as the probability of the type I error, which comes with serious consequences. This misunderstanding can greatly affect the reproducibility of research, treatment selection in medical practice, and model specification in empirical analyses. By using plain language and concrete examples, this paper is intended to elucidate the p-value confusion from its root, to explicate the difference between significance and hypothesis testing, to illuminate the consequences of the confusion, and to present a viable alternative to the conventional p-value.

Main text: The confusion with p-values has plagued the research community and medical practitioners for decades. However, efforts to clarify it have been largely futile, in part because intuitive yet mathematically rigorous educational materials are scarce. Additionally, the lack of a practical alternative to the p-value for guarding against randomness also plays a role. The p-value confusion is rooted in the misconception of significance and hypothesis testing. Most, including many statisticians, are unaware that the p-values and significance testing developed by Fisher are incomparable to the hypothesis testing paradigm created by Neyman and Pearson. And most otherwise great statistics textbooks tend to cobble the two paradigms together and make no effort to elucidate the subtle but fundamental differences between them. The p-value is a practical tool for gauging the "strength of evidence" against the null hypothesis. It informs investigators that a p-value of 0.001, for example, is stronger than 0.05. However, p-values produced in significance testing are not the probabilities of type I errors, as commonly misconceived. For a p-value of 0.05, the chance a treatment does not work is not 5%; rather, it is at least 28.9%.

Conclusions: A long-overdue effort to understand p-values correctly is much needed. However, in medical research and practice, just banning significance testing and accepting uncertainty are not enough. Researchers, clinicians, and patients alike need to know the probability a treatment will or will not work. Thus, calibrated p-values (the probability that a treatment does not work) should be reported in research papers.
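The 28.9% figure is consistent with a standard p-value calibration: the -e*p*ln(p) bound on the Bayes factor, valid for p < 1/e, combined with 50:50 prior odds. Assuming that is the calibration intended, a minimal sketch:

```python
import math

def calibrated_p(p: float) -> float:
    """Lower bound on the posterior probability of the null hypothesis,
    valid for 0 < p < 1/e; matches the abstract's 28.9% figure at p = 0.05."""
    if not 0 < p < 1 / math.e:
        raise ValueError("calibration requires 0 < p < 1/e")
    bf_bound = -math.e * p * math.log(p)   # bound on the Bayes factor for H0 vs H1
    return 1 / (1 + 1 / bf_bound)          # assumes 50:50 prior odds

print(round(calibrated_p(0.05), 3))    # 0.289 -> "at least 28.9%"
print(round(calibrated_p(0.001), 3))   # ~0.018
```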
40. Sreekumar S, Cohen A, Gündüz D. Privacy-Aware Distributed Hypothesis Testing. Entropy 2020; 22:e22060665. PMID: 33286437. PMCID: PMC7517198. DOI: 10.3390/e22060665.
Abstract
A distributed binary hypothesis testing (HT) problem involving two parties, a remote observer and a detector, is studied. The remote observer has access to a discrete memoryless source, and communicates its observations to the detector via a rate-limited noiseless channel. The detector observes another discrete memoryless source, and performs a binary hypothesis test on the joint distribution of its own observations with those of the observer. While the goal of the observer is to maximize the type II error exponent of the test for a given type I error probability constraint, it also wants to keep a private part of its observations as oblivious to the detector as possible. Considering both equivocation and average distortion under a causal disclosure assumption as possible measures of privacy, the trade-off between the communication rate from the observer to the detector, the type II error exponent, and privacy is studied. For the general HT problem, we establish single-letter inner bounds on both the rate-error exponent-equivocation and rate-error exponent-distortion trade-offs. Subsequently, single-letter characterizations for both trade-offs are obtained (i) for testing against conditional independence of the observer's observations from those of the detector, given some additional side information at the detector; and (ii) when the communication rate constraint over the channel is zero. Finally, we provide a counter-example showing that the strong converse, which holds for distributed HT without a privacy constraint, does not hold when a privacy constraint is imposed. This implies that, in general, the rate-error exponent-equivocation and rate-error exponent-distortion trade-offs are not independent of the type I error probability constraint.
41. A method for reducing animal use whilst maintaining statistical power in electrophysiological recordings from rodent nerves. Heliyon 2020; 6:e04143. PMID: 32529085. PMCID: PMC7281824. DOI: 10.1016/j.heliyon.2020.e04143.
Abstract
The stimulus-evoked compound action potential, recorded from ex vivo nerve trunks such as the rodent optic and sciatic nerve, is a popular model system used to study aspects of nervous system metabolism, including (1) the role of glycogen in supporting axon conduction, (2) the injury mechanisms resulting from metabolic insults, and (3) the putative benefits of clinically relevant neuroprotective strategies. We demonstrate the benefit of simultaneously recording from pairs of nerves in the same superfusion chamber compared with conventional recordings from single nerves. Experiments carried out on mouse optic and sciatic nerves demonstrate that our new recording configuration decreased the relative standard deviation of samples when compared with recordings from an equivalent number of individually recorded nerves. The new method reduces the number of animals required to produce equivalent statistical power compared with the existing method, in which single nerves are used. Adopting this method leads to increased experimental efficiency and productivity. We demonstrate that reduced animal use and increased power can be achieved by recording from pairs of rodent nerve trunks simultaneously.
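The animal-use saving follows from standard power arithmetic: shrinking the between-sample standard deviation inflates the standardized effect size, which shrinks the required group size. A minimal sketch with illustrative numbers (not the paper's data):

```python
from statsmodels.stats.power import TTestIndPower

solver = TTestIndPower()
delta = 1.0  # raw effect of interest (arbitrary units)

# Smaller SD, as from paired simultaneous recordings, means fewer animals per group.
for sd in (1.0, 0.7, 0.5):
    n = solver.solve_power(effect_size=delta / sd, alpha=0.05, power=0.8)
    print(f"SD = {sd:.1f}  ->  n per group ~ {n:.1f}")
```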
42. Russo L, Russo S. Search engines, cognitive biases and the man-computer interaction: a theoretical framework for empirical researches about cognitive biases in online search on health-related topics. Med Health Care Philos 2020; 23:237-246. PMID: 32056071. DOI: 10.1007/s11019-020-09940-9.
Abstract
The widespread use of online search engines to answer the general public's needs for information has raised concerns about possible biases and the emergence of a 'filter bubble' in which users are isolated from attitude-discordant messages. Research is split between approaches that largely focus on the intrinsic limitations of search engines and approaches that investigate user search behavior. This work evaluates the findings and limitations of both approaches and advances a theoretical framework for empirical investigations of cognitive biases in online search activities about health-related topics. We aim to investigate the interaction between the user and the search engine as a whole. Online search activity about health-related topics is considered as a hypothesis-testing process. Two questions emerge: whether the information retrieved by search engines is fit to fulfill its role as evidence, and whether the use of this information by users is cognitively and epistemologically valid and unbiased.
43. Xia Y. Correlation and association analyses in microbiome study integrating multiomics in health and disease. Prog Mol Biol Transl Sci 2020; 171:309-491. PMID: 32475527. DOI: 10.1016/bs.pmbts.2020.04.003.
Abstract
Correlation and association analyses are among the most widely used statistical methods across research fields, including microbiome and integrative multiomics studies. Correlation and association have two implications: dependence and co-occurrence. Microbiome data are structured as a phylogenetic tree and have several unique characteristics, including high dimensionality, compositionality, sparsity with excess zeros, and heterogeneity. These unique characteristics cause several statistical issues when analyzing microbiome data and integrating multiomics data, such as large p and small n, dependency, overdispersion, and zero-inflation. In microbiome research, on the one hand, classic correlation and association methods are still applied in real studies and used for the development of new methods; on the other hand, new methods have been developed to target the statistical issues arising from the unique characteristics of microbiome data. Here, we first provide a comprehensive view of classic and newly developed univariate correlation- and association-based methods. We discuss the appropriateness and limitations of using classic methods and demonstrate how the newly developed methods mitigate the issues of microbiome data. Second, we emphasize that the concepts of correlation and association analyses have been broadened by the introduction of network analysis, microbe-metabolite interactions, functional analysis, etc. Third, we introduce multivariate correlation- and association-based methods, organized by the categories of exploratory, interpretive, and discriminatory analyses and classification methods. Fourth, we focus on the hypothesis testing of univariate and multivariate regression-based association methods, including alpha and beta diversity-based, count-based, and relative abundance (or compositional)-based association analyses. We demonstrate the characteristics and limitations of each approach. Fifth, we introduce two specific microbiome-based methods: phylogenetic tree-based association analysis and testing for survival outcomes. Sixth, we provide an overall view of longitudinal methods for the analysis of microbiome and omics data, covering standard, static, regression-based time series methods, principal trend analysis, and newly developed univariate overdispersed and zero-inflated models, as well as multivariate distance/kernel-based longitudinal models. Finally, we comment on current association analysis and future directions for association analysis in microbiome and multiomics studies.
44. Hu M, Crainiceanu C, Schindler MK, Dewey B, Reich DS, Shinohara RT, Eloyan A. Matrix decomposition for modeling lesion development processes in multiple sclerosis. Biostatistics 2020; 23:83-100. PMID: 32318692. DOI: 10.1093/biostatistics/kxaa016.
Abstract
Our main goal is to study and quantify the evolution of multiple sclerosis lesions observed longitudinally over many years in multi-sequence structural magnetic resonance imaging (sMRI). To achieve that, we propose a class of functional models for capturing the temporal dynamics and spatial distribution of the voxel-specific intensity trajectories in all sMRI sequences. To accommodate the hierarchical data structure (observations nested within voxels, which are nested within lesions, which, in turn, are nested within study participants), we use structured functional principal component analysis. We propose hypothesis tests of therapeutic intervention effects on lesion evolution that account for the multilevel structure of the data, and we evaluate their finite-sample properties. Using this novel testing strategy, we found statistically significant differences in lesion evolution between treatment groups.
45. Baduashvili A, Evans AT, Cutler T. How to understand and teach P values: a diagnostic test framework. J Clin Epidemiol 2020; 122:49-55. PMID: 32169596. DOI: 10.1016/j.jclinepi.2020.03.003.
Abstract
OBJECTIVES: The aim of the tutorial is to help educators address misconceptions about P values and provide a tool that can be used to teach a more contemporary interpretation.

STUDY DESIGN AND SETTING: A scripted tutorial using problem-based learning and a diagnostic test analogy to deconstruct the misunderstandings about P values and develop a more Bayesian approach to study interpretation.

RESULTS: A diagnostic test analogy is an effective teaching tool. Learners' understanding of Bayes' theorem in diagnostic testing can be used as a bridge to the realization that the prestudy probability of a true difference is crucial for study interpretation. The analogy has several caveats and shortcomings; its limitations and the conceptual difficulties with Bayesian study analyses are addressed.

CONCLUSION: P values do not provide the information many assume they do; they are not equivalent to the probability of a chance finding. This tutorial helps move learners from these incorrect notions to new insights.
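The analogy maps study power to sensitivity, the alpha level to the false-positive rate, and the prestudy probability to disease prevalence, so the poststudy probability of a true effect is the analogue of positive predictive value. A minimal sketch with illustrative numbers (not from the tutorial itself):

```python
def poststudy_probability(prior: float, power: float = 0.8, alpha: float = 0.05) -> float:
    """P(true effect | p < alpha): the analogue of positive predictive value."""
    return power * prior / (power * prior + alpha * (1 - prior))

for prior in (0.05, 0.2, 0.5):
    print(f"prestudy P = {prior:.2f}  ->  poststudy P ~ {poststudy_probability(prior):.2f}")
# A "significant" result from a long-shot hypothesis (prior 0.05) is still more
# likely false than true (~0.46), which is the realization the analogy teaches.
```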
46.
Abstract
Preclinical studies using animals to study the potential of a therapeutic drug or strategy are important steps before translation to clinical trials. However, evidence has shown that poor quality in the design and conduct of these studies has not only impeded clinical translation but also led to significant waste of valuable research resources. It is clear that experimental biases are related to the poor quality seen with preclinical studies. In this chapter, we focus on hypothesis-testing preclinical studies and explain general concepts and principles in relation to the design of in vivo experiments, provide definitions of experimental biases and how to avoid them, and discuss major sources contributing to experimental biases and how to mitigate these sources. We also explore the differences between confirmatory and exploratory studies, and discuss available guidelines on preclinical studies and how to use them. This chapter, together with relevant information in other chapters in the handbook, provides a powerful tool to enhance scientific rigour for preclinical studies without restricting creativity.
47. Adams RH, Castoe TA. Probabilistic Species Tree Distances: Implementing the Multispecies Coalescent to Compare Species Trees Within the Same Model-Based Framework Used to Estimate Them. Syst Biol 2020; 69:194-207. PMID: 31086978. DOI: 10.1093/sysbio/syz031.
Abstract
Despite the ubiquitous use of statistical models for phylogenomic and population genomic inferences, this model-based rigor is rarely applied to post hoc comparison of trees. In a recent study, Garba et al. derived new methods for measuring the distance between two gene trees computed as the difference in their site pattern probability distributions. Unlike traditional metrics that compare trees solely in terms of geometry, these measures consider gene trees and associated parameters as probabilistic models that can be compared using standard information theoretic approaches. Consequently, probabilistic measures of phylogenetic tree distance can be far more informative than simply comparisons of topology and/or branch lengths alone. However, in their current form, these distance measures are not suitable for the comparison of species tree models in the presence of gene tree heterogeneity. Here, we demonstrate an approach for how the theory of Garba et al. (2018), which is based on gene tree distances, can be extended naturally to the comparison of species tree models. Multispecies coalescent (MSC) models parameterize the discrete probability distribution of gene trees conditioned upon a species tree with a particular topology and set of divergence times (in coalescent units), and thus provide a framework for measuring distances between species tree models in terms of their corresponding gene tree topology probabilities. We describe the computation of probabilistic species tree distances in the context of standard MSC models, which assume complete genetic isolation postspeciation, as well as recent theoretical extensions to the MSC in the form of network-based MSC models that relax this assumption and permit hybridization among taxa. We demonstrate these metrics using simulations and empirical species tree estimates and discuss both the benefits and limitations of these approaches. We make our species tree distance approach available as an R package called pSTDistanceR, for open use by the community.
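The underlying quantity is simple: each species tree model induces a discrete probability distribution over gene-tree topologies under the MSC, and two such distributions can be compared with standard information-theoretic measures. A minimal sketch using the classic three-taxon MSC topology probabilities and Kullback-Leibler divergence (the pSTDistanceR package computes the real model-based distances; this conveys only the flavor of the computation):

```python
import numpy as np

def kl_divergence(p, q):
    """KL(p || q) between two discrete gene-tree topology distributions."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def triplet_probs(T):
    """MSC probabilities of the 3 rooted triplet gene-tree topologies, given a
    species tree with internal branch length T in coalescent units: the matching
    topology has probability 1 - (2/3)exp(-T), each mismatch (1/3)exp(-T)."""
    mismatch = np.exp(-T) / 3
    return np.array([1 - 2 * mismatch, mismatch, mismatch])

p = triplet_probs(1.0)   # species tree model with internal branch T = 1.0
q = triplet_probs(0.2)   # same topology, shorter internal branch
print("KL(p || q) =", round(kl_divergence(p, q), 4))
```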
48. Raybould A. Problem formulation and phenotypic characterisation for the development of novel crops. Transgenic Res 2020; 28:135-145. PMID: 31321696. DOI: 10.1007/s11248-019-00147-0.
Abstract
Phenotypic characterisation provides important information about novel crops that helps their developers make technical and commercial decisions. Phenotypic characterisation comprises two activities. Product characterisation checks that the novel crop has the qualities of a viable product: the intended traits have been introduced and work as expected, and no unintended changes have been made that will adversely affect the performance of the final product. Risk assessment evaluates whether the intended and unintended changes are likely to harm human health or the environment. Product characterisation follows the principles of problem formulation, namely that the characteristics required in the final product are defined and criteria to decide whether the novel crop will have these properties are set. The hypothesis that the novel crop meets the criteria is tested during product development. If the hypothesis is corroborated, development continues; if the hypothesis is falsified, the product is redesigned or its development is halted. Risk assessment should follow the same principles: criteria that indicate the crop poses unacceptable risk should be set, and the hypothesis that the crop does not possess those properties should be tested. However, risk assessment, particularly when considering unintended changes introduced by new plant breeding methods such as gene editing, often ignores these principles. Instead, phenotypic characterisation seeks to catalogue all unintended changes by profiling methods and then proceeds to work out whether any of the changes are important. This paper argues that profiling is an inefficient and ineffective method of phenotypic characterisation for risk assessment. It discusses reasons why profiling is favoured and corrects some misconceptions about problem formulation.
49. Yang Q, An X, Pan W. Computing and graphing probability values of pearson distributions: a SAS/IML macro. Source Code Biol Med 2020; 14:6. PMID: 31889995. PMCID: PMC6923921. DOI: 10.1186/s13029-019-0076-2.
Abstract
Background: Any empirical data can be approximated by one of the Pearson distributions using the first four moments of the data (Elderton WP, Johnson NL. Systems of Frequency Curves. 1969; Pearson K. Philos Trans R Soc Lond Ser A. 186:343–414, 1895; Solomon H, Stephens MA. J Am Stat Assoc. 73(361):153–60, 1978). Pearson distributions thus make statistical analysis possible for data with unknown distributions. There are both extant, old-fashioned in-print tables (Pearson ES, Hartley HO. Biometrika Tables for Statisticians, vol. II. 1972) and contemporary computer programs (Amos DE, Daniel SL. Tables of percentage points of standardized pearson distributions. 1971; Bouver H, Bargmann RE. Tables of the standardized percentage points of the pearson system of curves in terms of β1 and β2. 1974; Bowman KO, Shenton LR. Biometrika. 66(1):147–51, 1979; Davis CS, Stephens MA. Appl Stat. 32(3):322–7, 1983; Pan W. J Stat Softw. 31(Code Snippet 2):1–6, 2009) available for obtaining percentage points of Pearson distributions corresponding to certain pre-specified percentages (or probability values; e.g., 1.0%, 2.5%, 5.0%, etc.), but they are of little use in statistical analysis because one has to rely on unwieldy second-difference interpolation to calculate a probability value of a Pearson distribution corresponding to a given percentage point, such as an observed test statistic in hypothesis testing.

Results: The present study develops a SAS/IML macro program to identify the appropriate type of Pearson distribution, based on either an input dataset or the values of the four moments, and then to compute and graph probability values of Pearson distributions for any given percentage points.

Conclusions: The SAS macro program returns accurate approximations to Pearson distributions and can efficiently help researchers conduct statistical analysis on data with unknown distributions.
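The type-identification step is driven entirely by the moment ratios β1 (squared skewness) and β2 (kurtosis) through Pearson's criterion κ. A simplified Python selector sketching that logic (main cases only; not the SAS/IML macro, and the tolerance-based boundary handling is a loose assumption):

```python
import numpy as np
from scipy import stats

def pearson_kappa(x):
    """Pearson's criterion kappa from sample skewness and (non-excess) kurtosis."""
    b1 = stats.skew(x) ** 2
    b2 = stats.kurtosis(x, fisher=False)
    return b1 * (b2 + 3) ** 2 / (4 * (4 * b2 - 3 * b1) * (2 * b2 - 3 * b1 - 6))

def pearson_type(x, tol=0.05):
    """Coarse Pearson family selection from kappa (main cases only).
    Type III corresponds to kappa -> +/-infinity (denominator near 0)."""
    k = pearson_kappa(x)
    if abs(k) < tol:
        return "symmetric family (Normal / Type II / Type VII)"
    if k < 0:
        return "Type I (beta-like)"
    if k < 1 - tol:
        return "Type IV"
    if k < 1 + tol:
        return "Type V"
    return "Type VI"

rng = np.random.default_rng(4)
print(pearson_type(rng.beta(2, 5, size=100_000)))   # beta distributions are Pearson Type I
print(pearson_type(rng.standard_normal(100_000)))   # kappa ~ 0: symmetric family
```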
50.
Abstract
Expression quantitative trait loci (eQTL) analysis identifies genetic variants that regulate the expression level of a gene. The genetic regulation may persist or vary in different tissues. When data are available on multiple tissues, it is often desired to borrow information across tissues and conduct an integrative analysis. Here we describe a multi-tissue eQTL analysis procedure, which improves the identification of different types of eQTL and facilitates the assessment of tissue specificity.