1
Novack GD. Pipeline: Ocular biostatistics: Proper use of proportions. Ocul Surf 2024; 32:120-122. [PMID: 38387782 DOI: 10.1016/j.jtos.2024.02.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/24/2024]
Affiliation(s)
- Gary D Novack
- Department of Ophthalmology & Visual Sciences, University of California, Davis, USA; PharmaLogic Development Inc., San Rafael CA, USA; Department of Ophthalmology, USA.
2
Cao C, Zhang S, Wang J, Tian M, Ji X, Huang D, Yang S, Gu N. PGS-Depot: a comprehensive resource for polygenic scores constructed by summary statistics based methods. Nucleic Acids Res 2024; 52:D963-D971. [PMID: 37953384 PMCID: PMC10767792 DOI: 10.1093/nar/gkad1029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Revised: 10/04/2023] [Accepted: 10/20/2023] [Indexed: 11/14/2023] Open
Abstract
Polygenic score (PGS) is an important tool for the genetic prediction of complex traits. However, there are currently no resources providing comprehensive PGSs computed from published summary statistics, and it is difficult to implement and run different PGS methods due to the complexity of their pipelines and parameter settings. To address these issues, we introduce a new resource called PGS-Depot containing the most comprehensive set of publicly available disease-related GWAS summary statistics. PGS-Depot includes 5585 high quality summary statistics (1933 quantitative and 3652 binary trait statistics) curated from 1564 traits in European and East Asian populations. A standardized best-practice pipeline is used to implement 11 summary statistics-based PGS methods, each with different model assumptions and estimation procedures. The prediction performance of each method can be compared for both in- and cross-ancestry populations, and users can also submit their own summary statistics to obtain custom PGS with the available methods. Other features include searching for PGSs by trait name, publication, cohort information, population, or the MeSH ontology tree and searching for trait descriptions with the experimental factor ontology (EFO). All scores, SNP effect sizes and summary statistics can be downloaded via FTP. PGS-Depot is freely available at http://www.pgsdepot.net.
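As a rough illustration of what a summary-statistics-based score computes, a PGS for one individual is commonly a weighted sum of allele dosages, with weights taken from GWAS effect sizes. The sketch below uses made-up dosages and effect sizes and a hypothetical function name; it is not the PGS-Depot pipeline or any of the 11 methods the resource implements.

```python
import numpy as np

def polygenic_score(dosages, effect_sizes):
    """Weighted sum of allele dosages (0, 1 or 2 copies) for each individual."""
    dosages = np.asarray(dosages, dtype=float)            # shape: individuals x SNPs
    effect_sizes = np.asarray(effect_sizes, dtype=float)  # shape: SNPs
    if dosages.shape[1] != effect_sizes.shape[0]:
        raise ValueError("one effect size is needed per SNP column")
    return dosages @ effect_sizes

# Hypothetical data: 3 individuals x 4 SNPs, invented effect sizes.
dosages = [[0, 1, 2, 0],
           [1, 1, 0, 2],
           [2, 0, 1, 1]]
effect_sizes = [0.10, -0.05, 0.20, 0.00]
scores = polygenic_score(dosages, effect_sizes)
```

The methods compared in the resource differ mainly in how the per-SNP weights are estimated from summary statistics, not in this final summation step.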
Affiliation(s)
- Chen Cao
- Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, Jiangsu 211166, China
- Shuting Zhang
- Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, Jiangsu 211166, China
- Jianhua Wang
- Department of Pharmacology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300203, China
- Min Tian
- Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, Jiangsu 211166, China
- Xiaolong Ji
- Department of Biostatistics, Centre for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu 211166, China
- Dandan Huang
- Department of Pharmacology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300203, China
- Sheng Yang
- Department of Biostatistics, Centre for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu 211166, China
- Ning Gu
- Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, Jiangsu 211166, China
- Medical School, Nanjing University, Nanjing, Jiangsu 210093, China
3
Rodriguez Duque D, Moodie EEM, Stephens DA. Bayesian inference for optimal dynamic treatment regimes in practice. Int J Biostat 2023; 19:309-331. [PMID: 37192544 DOI: 10.1515/ijb-2022-0073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2022] [Accepted: 03/21/2023] [Indexed: 05/18/2023]
Abstract
In this work, we examine recently developed methods for Bayesian inference of optimal dynamic treatment regimes (DTRs). DTRs are a set of treatment decision rules aimed at tailoring patient care to patient-specific characteristics, thereby falling within the realm of precision medicine. In this field, researchers seek to tailor therapy with the intention of improving health outcomes; therefore, they are most interested in identifying optimal DTRs. Recent work has developed Bayesian methods for identifying optimal DTRs in a family indexed by ψ via Bayesian dynamic marginal structural models (MSMs) (Rodriguez Duque D, Stephens DA, Moodie EEM, Klein MB. Semiparametric Bayesian inference for dynamic treatment regimes via dynamic regime marginal structural models. Biostatistics; 2022. (In Press)); we review the proposed estimation procedure and illustrate its use via the new BayesDTR R package. Although methods in Rodriguez Duque D, Stephens DA, Moodie EEM, Klein MB. (Semiparametric Bayesian inference for dynamic treatment regimes via dynamic regime marginal structural models. Biostatistics; 2022. (In Press)) can estimate optimal DTRs well, they may lead to biased estimators when the model for the expected outcome if everyone in a population were to follow a given treatment strategy, known as a value function, is misspecified or when a grid search for the optimum is employed. We describe recent work that uses a Gaussian process (GP) prior on the value function as a means to robustly identify optimal DTRs (Rodriguez Duque D, Stephens DA, Moodie EEM. Estimation of optimal dynamic treatment regimes using Gaussian processes; 2022. Available from: https://doi.org/10.48550/arXiv.2105.12259). We demonstrate how a GP approach may be implemented with the BayesDTR package and contrast it with other value-search approaches to identifying optimal DTRs.
We use data from an HIV therapeutic trial in order to illustrate a standard analysis with these methods, using both the original observed trial data and an additional simulated component to showcase a longitudinal (two-stage DTR) analysis.
Affiliation(s)
- Erica E M Moodie
- Department of Epidemiology & Biostatistics, McGill University, Montréal, QC, Canada
- David A Stephens
- Department of Mathematics and Statistics, McGill University, Montréal, QC, Canada
4
Lu H, Cai F, Li Y, Ou X. Accurate interval estimation for the risk difference in an incomplete correlated 2 × 2 table: Calf immunity analysis. PLoS One 2022; 17:e0272007. [PMID: 35867721 PMCID: PMC9307212 DOI: 10.1371/journal.pone.0272007] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/31/2022] [Accepted: 07/11/2022] [Indexed: 11/19/2022] Open
Abstract
Interval estimation with accurate coverage for risk difference (RD) in a correlated 2 × 2 table with structural zero is a fundamental and important problem in biostatistics. The score test-based and Bayesian tail-based confidence intervals (CIs) have good coverage performance among the existing methods. However, as approximation approaches, they have coverage probabilities lower than the nominal confidence level for finite and moderate sample sizes. In this paper, we propose three new CIs for RD based on the fiducial, inferential model (IM) and modified IM (MIM) methods. The IM interval is proven to be valid. Moreover, simulation studies show that the CIs of fiducial and MIM methods can guarantee the preset coverage rate even for small sample sizes. More importantly, in terms of coverage probability and expected length, the MIM interval outperforms other intervals. Finally, a real example illustrates the application of the proposed methods.
Affiliation(s)
- Hezhi Lu
- School of Economics and Statistics, Guangzhou University, Guangzhou, 510006, PRC
- Fengjing Cai
- College of Mathematics and Physics, Wenzhou University, Wenzhou, 325035, PRC
- Yuan Li
- Research Centre for Applied Mathematics, Shenzhen Polytechnic, Shenzhen, 518000, PRC
- Xionghui Ou
- School of Mathematical Science, South China Normal University, Guangzhou, 510631, PRC
5
Cornel T. Contested Numbers: The failed negotiation of objective statistics in a methodological review of Kinsey et al.'s sex research. Hist Philos Life Sci 2021; 43:13. [PMID: 33528820 DOI: 10.1007/s40656-020-00363-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2020] [Accepted: 12/30/2020] [Indexed: 06/12/2023]
Abstract
From 1950 to 1952, statisticians W.G. Cochran, C.F. Mosteller, and J.W. Tukey reviewed A.C. Kinsey and colleagues' methodology. Neither the history-and-philosophy of science literature nor contemporary theories of interdisciplinarity seem to offer a conceptual model that fits this forced interaction, which was characterized by significant power asymmetries and disagreements on multiple levels. The statisticians initially attempted to exclude all non-technical matters from their evaluation, but their political and personal investments interfered with this agenda. In the face of McCarthy's witch hunts, negotiations with Kinsey and his funding institutions became integral to the review group's work. This paper analyzes the heavy burden of emotional and affective labor in this collaboration, the conflicts caused by competing visions of objectivity, and the uses of statistical knowledge to gain and sustain authority. Kinsey's refusal to adopt the recommended probability sample damaged his already precarious position even further and marked him as a biased researcher who put his personal agenda above methodological rigor. Kinsey's uncooperative demeanor can be explained by distrust resulting from numerous adverse reactions to his work and by fear of having his sexuality exposed. This case study illustrates that the very concept of valid numbers can become an arena for power struggles and that quantification alone does not guarantee productive exchanges across disciplines. It calls for a deeper conceptual analysis of the prerequisites for successful scientific collaborations.
Affiliation(s)
- Tabea Cornel
- Division of Humanities, New College of Florida, Sarasota, FL, USA.
6
Voets PJGM, Vogtländer NPJ, Kaasjager KAH. Comparing the Voets equation and the Adrogue-Madias equation for predicting the plasma sodium response to intravenous fluid therapy in SIADH patients. PLoS One 2021; 16:e0245499. [PMID: 33449937 PMCID: PMC7810276 DOI: 10.1371/journal.pone.0245499] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/28/2020] [Accepted: 12/30/2020] [Indexed: 11/18/2022] Open
Abstract
Background: The syndrome of inappropriate antidiuretic hormone secretion (SIADH) is one of the most common causes of hypotonic hyponatremia. In our previous work, we have derived a novel model (Voets equation) that can be used by clinicians to predict the effect of crystalloid intravenous fluid therapy on the plasma sodium concentration in SIADH. Methods: In this retrospective chart review, the predictive accuracy of the Voets equation and the Adrogue-Madias equation for the plasma sodium response to crystalloid infusate was compared for fifteen plasma sodium response measurements (n = 15) in twelve SIADH patients. The medical records of these patients were accessed anonymously and none of the authors were their treating physicians. The Pearson correlation coefficient r and corresponding p-value were calculated for the predictions by the Voets model compared to the measured plasma sodium response and for the predictions by the Adrogue-Madias model compared to the measured plasma sodium response. Results and conclusion: The presented results show that the Voets model (r = 0.94, p < 0.001) predicted the aforementioned plasma sodium response significantly more accurately than the Adrogue-Madias model (r = 0.49, p = 0.07) in SIADH patients and could therefore be a clinically useful addition to the existing prediction models.
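To make the reported statistics concrete, the sketch below computes a Pearson correlation coefficient and the t statistic on n − 2 degrees of freedom from which its p-value is usually derived. The function names and the five data pairs are hypothetical, and treating the 15 measurements as independent pairs is an assumption of this sketch; the study data are not reproduced here.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length, at least 3 pairs")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t statistic for testing zero correlation, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

# Invented predicted vs. measured values, only to exercise pearson_r.
predicted = [1.0, 2.0, 3.0, 4.0, 5.0]
measured = [1.2, 1.9, 3.2, 3.8, 5.1]
r_demo = pearson_r(predicted, measured)

# t statistics implied by the correlations reported in the abstract (n = 15).
t_voets = t_statistic(0.94, 15)    # about 9.93
t_adrogue = t_statistic(0.49, 15)  # about 2.03
```

With 13 degrees of freedom, a t statistic near 2 sits close to the conventional two-sided 0.05 threshold, which is in line with the reported p = 0.07 for the Adrogue-Madias model.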
Affiliation(s)
- Philip J. G. M. Voets
- Department of Nephrology, University Medical Centre Utrecht, Utrecht, The Netherlands
- Department of Nephrology, Gelre Hospital, Apeldoorn, The Netherlands
- Karin A. H. Kaasjager
- Department of Nephrology, University Medical Centre Utrecht, Utrecht, The Netherlands
7
Foroughi Pour A, Loveless I, Rempala G, Pietrzak M. Binary Classification for Failure Risk Assessment. Methods Mol Biol 2021; 2194:77-105. [PMID: 32926363 DOI: 10.1007/978-1-0716-0849-4_6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
Survival analysis is tremendously powerful, and is a popular methodology for analyzing time to event models in bioinformatics. Furthermore, several of its extensions can simultaneously perform variable selection in conjunction with model estimation. While this flexibility is extremely desirable, under certain scenarios, binary class variable selection and classification methods might provide more reliable risk estimates. Synthetic simulations and real data case studies suggest that when (1) randomly censored points comprise only a small portion of data, (2) biological markers are weak, (3) it is desired to compute risk across predetermined time intervals, and (4) the assumptions of the competing time to event models are violated, binary class models tend to perform better. In practice, it might be prudent to test both model families to guarantee adequate analysis. Here we describe the pipeline of binary class feature selection and classification for time to event risk assessment.
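One common way to recast time-to-event data as a binary classification problem at a predetermined horizon is sketched below. The abstract does not state the labeling rule used in the chapter, so the rule, the function name and the example data here are all assumptions.

```python
def binary_labels(times, events, horizon):
    """Label each subject at a fixed time horizon.

    1    = event observed at or before the horizon
    0    = known to be event-free at the horizon
    None = censored before the horizon, so the status is unknown
    """
    labels = []
    for t, observed in zip(times, events):
        if observed and t <= horizon:
            labels.append(1)
        elif t >= horizon:
            # Followed up to the horizon without an event
            # (either censored later or event occurred later).
            labels.append(0)
        else:
            labels.append(None)
    return labels

# Hypothetical follow-up times (years) and event indicators.
times = [2.0, 5.0, 1.5, 7.0]
events = [True, False, False, True]
labels = binary_labels(times, events, horizon=3.0)
# labels == [1, 0, None, 0]
```

Subjects labeled None would have to be dropped or handled separately, which is one reason such an approach is more attractive when few points are randomly censored, as condition (1) in the abstract suggests.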
Affiliation(s)
- Ali Foroughi Pour
- Department of Electrical and Computer Engineering, The Ohio State University, Columbus, OH, USA
- Department of Mathematics, The Ohio State University, Columbus, OH, USA
- Ian Loveless
- College of Public Health, The Ohio State University, Columbus, OH, USA
- Grzegorz Rempala
- Department of Mathematics, The Ohio State University, Columbus, OH, USA
- College of Public Health, The Ohio State University, Columbus, OH, USA
- Maciej Pietrzak
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH, USA.
8
Bach P, Wallisch C, Klein N, Hafermann L, Sauerbrei W, Steyerberg EW, Heinze G, Rauch G. Systematic review of education and practical guidance on regression modeling for medical researchers who lack a strong statistical background: Study protocol. PLoS One 2020; 15:e0241427. [PMID: 33347441 PMCID: PMC7751867 DOI: 10.1371/journal.pone.0241427] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/16/2020] [Accepted: 10/14/2020] [Indexed: 12/23/2022] Open
Abstract
In the last decades, statistical methodology has developed rapidly, in particular in the field of regression modeling. Multivariable regression models are applied in almost all medical research projects. Therefore, the potential impact of statistical misconceptions within this field can be enormous. Indeed, the current theoretical statistical knowledge is not always adequately transferred to the current practice in medical statistics. Some medical journals have identified this problem and published isolated statistical articles and even whole series thereof. In this systematic review, we aim to assess the current level of education on regression modeling that is provided to medical researchers via series of statistical articles published in medical journals. The present manuscript is a protocol for a systematic review that aims to assess which aspects of regression modeling are covered by statistical series published in medical journals that intend to train and guide applied medical researchers with limited statistical knowledge. Statistical paper series cannot easily be summarized and identified by common keywords in an electronic search engine like Scopus. We therefore identified series by a systematic request to statistical experts who are part of or related to the STRATOS Initiative (STRengthening Analytical Thinking for Observational Studies). Within each identified article, two raters will independently check the content of the articles with respect to a predefined list of key aspects related to regression modeling. The content analysis of the topic-relevant articles will be performed using a predefined report form to assess the content as objectively as possible. Any disputes will be resolved by a third reviewer. Summary analyses will identify potential methodological gaps and misconceptions that may have an important impact on the quality of analyses in medical research.
This review will thus provide a basis for future guidance papers and tutorials in the field of regression modeling which will enable medical researchers 1) to interpret publications in a correct way, 2) to perform basic statistical analyses in a correct way and 3) to identify situations when the help of a statistical expert is required.
Affiliation(s)
- Paul Bach
- Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Institute of Biometry and Clinical Epidemiology, Charité - Universitätsmedizin Berlin, Berlin, Germany
- Berlin Institute of Health (BIH), Berlin, Germany
- School of Business and Economics, Applied Statistics, Humboldt-Universität zu Berlin, Berlin, Germany
- Christine Wallisch
- Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Institute of Biometry and Clinical Epidemiology, Charité - Universitätsmedizin Berlin, Berlin, Germany
- Berlin Institute of Health (BIH), Berlin, Germany
- Section for Clinical Biometrics, Center for Medical Statistics, Informatics and Intelligent Systems, Medical University of Vienna, Vienna, Austria
- Nadja Klein
- School of Business and Economics, Applied Statistics, Humboldt-Universität zu Berlin, Berlin, Germany
- Lorena Hafermann
- Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Institute of Biometry and Clinical Epidemiology, Charité - Universitätsmedizin Berlin, Berlin, Germany
- Berlin Institute of Health (BIH), Berlin, Germany
- Willi Sauerbrei
- Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center—University of Freiburg, Freiburg, Germany
- Ewout W. Steyerberg
- Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands
- Georg Heinze
- Section for Clinical Biometrics, Center for Medical Statistics, Informatics and Intelligent Systems, Medical University of Vienna, Vienna, Austria
- Geraldine Rauch
- Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Institute of Biometry and Clinical Epidemiology, Charité - Universitätsmedizin Berlin, Berlin, Germany
- Berlin Institute of Health (BIH), Berlin, Germany
9
Molenberghs G, Buyse M, Abrams S, Hens N, Beutels P, Faes C, Verbeke G, Van Damme P, Goossens H, Neyens T, Herzog S, Theeten H, Pepermans K, Abad AA, Van Keilegom I, Speybroeck N, Legrand C, De Buyser S, Hulstaert F. Infectious diseases epidemiology, quantitative methodology, and clinical research in the midst of the COVID-19 pandemic: Perspective from a European country. Contemp Clin Trials 2020; 99:106189. [PMID: 33132155 PMCID: PMC7581408 DOI: 10.1016/j.cct.2020.106189] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 08/08/2020] [Revised: 10/04/2020] [Accepted: 10/16/2020] [Indexed: 01/08/2023]
Abstract
Starting from historic reflections, the current SARS-CoV-2 induced COVID-19 pandemic is examined from various perspectives, in terms of what it implies for the implementation of non-pharmaceutical interventions, the modeling and monitoring of the epidemic, the development of early-warning systems, the study of mortality, prevalence estimation, diagnostic and serological testing, vaccine development, and ultimately clinical trials. Emphasis is placed on how the pandemic has led to unprecedented speed in methodological and clinical development, the pitfalls thereof, but also the opportunities that it engenders for national and international collaboration, and how it has simplified and sped up procedures. We also study the impact of the pandemic on clinical trials in other indications. We note that it has placed biostatistics, epidemiology, virology, infectiology, vaccinology, and related fields in the spotlight in an unprecedented way, implying great opportunities, but also the need to communicate effectively, often amidst controversy.
Affiliation(s)
- Geert Molenberghs
- Interuniversity Institute for Biostatistics and statistical Bioinformatics, Data Science Institute, Hasselt University, Belgium; Interuniversity Institute for Biostatistics and statistical Bioinformatics, KU Leuven, Belgium
- Marc Buyse
- Interuniversity Institute for Biostatistics and statistical Bioinformatics, Data Science Institute, Hasselt University, Belgium; International Drug Development Institute, Belgium; CluePoints, Belgium.
- Steven Abrams
- Interuniversity Institute for Biostatistics and statistical Bioinformatics, Data Science Institute, Hasselt University, Belgium; Global Health Institute, Department of Epidemiology and Social Medicine, University of Antwerp, Belgium
- Niel Hens
- Interuniversity Institute for Biostatistics and statistical Bioinformatics, Data Science Institute, Hasselt University, Belgium; Centre for Health Economics Research and Modelling of Infectious Diseases, University of Antwerp, Belgium; Vaccine & Infectious Disease Institute, University of Antwerp, Belgium
- Philippe Beutels
- Centre for Health Economics Research and Modelling of Infectious Diseases, University of Antwerp, Belgium; Vaccine & Infectious Disease Institute, University of Antwerp, Belgium
- Christel Faes
- Interuniversity Institute for Biostatistics and statistical Bioinformatics, Data Science Institute, Hasselt University, Belgium
- Geert Verbeke
- Interuniversity Institute for Biostatistics and statistical Bioinformatics, Data Science Institute, Hasselt University, Belgium; Interuniversity Institute for Biostatistics and statistical Bioinformatics, KU Leuven, Belgium
- Pierre Van Damme
- Centre for Health Economics Research and Modelling of Infectious Diseases, University of Antwerp, Belgium; Vaccine & Infectious Disease Institute, University of Antwerp, Belgium
- Thomas Neyens
- Interuniversity Institute for Biostatistics and statistical Bioinformatics, Data Science Institute, Hasselt University, Belgium; Interuniversity Institute for Biostatistics and statistical Bioinformatics, KU Leuven, Belgium
- Sereina Herzog
- Centre for Health Economics Research and Modelling of Infectious Diseases, University of Antwerp, Belgium; Vaccine & Infectious Disease Institute, University of Antwerp, Belgium
- Heidi Theeten
- Centre for Health Economics Research and Modelling of Infectious Diseases, University of Antwerp, Belgium; Vaccine & Infectious Disease Institute, University of Antwerp, Belgium
- Koen Pepermans
- Centre for Health Economics Research and Modelling of Infectious Diseases, University of Antwerp, Belgium; Vaccine & Infectious Disease Institute, University of Antwerp, Belgium
- Ariel Alonso Abad
- Interuniversity Institute for Biostatistics and statistical Bioinformatics, KU Leuven, Belgium
- Catherine Legrand
- Institute of Statistics, Biostatistics and Actuarial Sciences, UC Louvain, Belgium
10
Baessler F, Zafar A, Ciprianidis A, Wagner FL, Klein SB, Schweizer S, Bartolovic M, Roesch-Ely D, Ditzen B, Nikendei C, Schultz JH. Analysis of risk communication teaching in psychosocial and other medical departments. Med Educ Online 2020; 25:1746014. [PMID: 32249706 PMCID: PMC7170276 DOI: 10.1080/10872981.2020.1746014] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 05/11/2023]
Abstract
Aims: Teaching students about risk communication is an important aspect at medical schools given the growing importance of informed consent in healthcare. This observational study analyzes the quality of teaching content on risk communication and biostatistics at a medical school. Methods: Based on the concept of curriculum mapping, purpose-designed questionnaires were used via participant observers to record the frequency, characteristics and context of risk communication employed by lecturers during teaching sessions for one semester. The data was analyzed quantitatively and descriptively. Results: Teaching about risk communication was observed in 24.4% (n = 95 of 390) of sessions. Prevalence varied significantly among different departments, with dermatology having the highest rate (67.9%) but less in-depth teaching than medical psychology, where risk communication concepts were discussed on a higher scale in 61.4% of sessions. Relevant statistical values were not mentioned at all in 69% of these 95 sessions and clinical contexts were used rarely (55.8%). Supplementary teaching material was provided in 50.5% of sessions while students asked questions in 18.9% of sessions. Conclusions: Students are infrequently taught about communicating risks. When they are, the teaching does not include the mention of core biostatistics values nor does the teaching involve methods for demonstrating risk communication.
Affiliation(s)
- Franziska Baessler
- Department of General Internal and Psychosomatic Medicine, Heidelberg University Hospital, Heidelberg, Germany
- Contact: Franziska Baessler, Department for General Internal and Psychosomatic Medicine, Centre for Psychosocial Medicine, Heidelberg University Hospital, Im Neuenheimer Feld 410, Heidelberg 69120, Germany
- Ali Zafar
- Department of General Internal and Psychosomatic Medicine, Heidelberg University Hospital, Heidelberg, Germany
- Anja Ciprianidis
- Department of General Internal and Psychosomatic Medicine, Heidelberg University Hospital, Heidelberg, Germany
- Fabienne Louise Wagner
- Department of General Internal and Psychosomatic Medicine, Heidelberg University Hospital, Heidelberg, Germany
- Sonja Bettina Klein
- Department of General Internal and Psychosomatic Medicine, Heidelberg University Hospital, Heidelberg, Germany
- Sophie Schweizer
- Department of Gynecology and Obstetrics, Heidelberg University Hospital, Heidelberg, Germany
- Marina Bartolovic
- Department of General Adult Psychiatry, Centre for Psychosocial Medicine, University of Heidelberg, Heidelberg, Germany
- Daniela Roesch-Ely
- Department of General Adult Psychiatry, Centre for Psychosocial Medicine, University of Heidelberg, Heidelberg, Germany
- Beate Ditzen
- Department of Gynecology and Obstetrics, Heidelberg University Hospital, Heidelberg, Germany
- Christoph Nikendei
- Department of General Internal and Psychosomatic Medicine, Heidelberg University Hospital, Heidelberg, Germany
- Jobst-Hendrik Schultz
- Department of General Internal and Psychosomatic Medicine, Heidelberg University Hospital, Heidelberg, Germany
11
Ma Y, Jenkins HE, Sebastiani P, Ellner JJ, Jones-López EC, Dietze R, Horsburgh, Jr. CR, White LF. Using Cure Models to Estimate the Serial Interval of Tuberculosis With Limited Follow-up. Am J Epidemiol 2020; 189:1421-1426. [PMID: 32458995 PMCID: PMC7731991 DOI: 10.1093/aje/kwaa090] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 02/18/2019] [Revised: 05/14/2020] [Accepted: 05/15/2020] [Indexed: 12/26/2022] Open
Abstract
Serial interval (SI), defined as the time between symptom onset in an infector and infectee pair, is commonly used to understand infectious diseases transmission. Slow progression to active disease, as well as the small percentage of individuals who will eventually develop active disease, complicate the estimation of the SI for tuberculosis (TB). In this paper, we showed via simulation studies that when there is credible information on the percentage of those who will develop TB disease following infection, a cure model, first introduced by Boag in 1949, should be used to estimate the SI for TB. This model includes a parameter in the likelihood function to account for the study population being composed of those who will have the event of interest and those who will never have the event. We estimated the SI for TB to be approximately 0.5 years for the United States and Canada (January 2002 to December 2006) and approximately 2.0 years for Brazil (March 2008 to June 2012), which might imply a higher occurrence of reinfection TB in a developing country like Brazil.
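For orientation, a mixture cure likelihood of the kind the abstract describes is often written as below. The notation is an assumption of this note, not taken from the paper: π is the proportion who will eventually have the event, f and S are the density and survival function of the event time among them (with parameters θ), t_i is the observed time, and δ_i = 1 if the event was observed for pair i and 0 if follow-up ended first.

\[
L(\pi, \theta) = \prod_{i=1}^{n} \bigl[\pi\, f(t_i; \theta)\bigr]^{\delta_i} \bigl[(1 - \pi) + \pi\, S(t_i; \theta)\bigr]^{1 - \delta_i}
\]

The second factor shows why limited follow-up matters: a pair without an observed event may belong either to the never-event group, with probability 1 − π, or to the group whose event has simply not happened yet.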
Affiliation(s)
- Yicheng Ma
- Correspondence to Dr. Yicheng Ma, Department of Biostatistics, 801 Massachusetts Avenue, Boston, MA 02118
12
Sparapani RA, Rein LE, Tarima SS, Jackson TA, Meurer JR. Non-parametric recurrent events analysis with BART and an application to the hospital admissions of patients with diabetes. Biostatistics 2020; 21:69-85. [PMID: 30059992 DOI: 10.1093/biostatistics/kxy032] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 01/07/2017] [Accepted: 04/23/2018] [Indexed: 11/12/2022] Open
Abstract
Much of survival analysis is concerned with absorbing events, i.e., subjects can only experience a single event such as mortality. This article is focused on non-absorbing or recurrent events, i.e., subjects are capable of experiencing multiple events. Recurrent events have been studied by many; however, most rely on the restrictive assumptions of linearity and proportionality. We propose a new method for analyzing recurrent events with Bayesian Additive Regression Trees (BART) avoiding such restrictive assumptions. We explore this new method via a motivating example of hospital admissions for diabetes patients and simulated data sets.
Affiliation(s)
- Rodney A Sparapani
- Institute for Health and Equity, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, WI 53226, USA
- Lisa E Rein
- Institute for Health and Equity, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, WI 53226, USA
- Sergey S Tarima
- Institute for Health and Equity, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, WI 53226, USA
- Tourette A Jackson
- Institute for Health and Equity, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, WI 53226, USA
- John R Meurer
- Institute for Health and Equity, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, WI 53226, USA
|
13
|
Kroc E. Measurement protocols, random-variable-valued measurements, and response process error: Estimation and inference when sample data are not deterministic. PLoS One 2020; 15:e0239821. [PMID: 33002051 PMCID: PMC7529193 DOI: 10.1371/journal.pone.0239821] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2020] [Accepted: 09/15/2020] [Indexed: 12/03/2022] Open
Abstract
Random-variable-valued measurements (RVVMs) are proposed as a new framework for treating measurement processes that generate non-deterministic sample data. They operate by assigning a probability measure to each observed sample instantiation of a global measurement process for some particular random quantity of interest, thus allowing for the explicit quantification of response process error. Common methodologies to date treat only measurement processes that generate fixed values for each sample unit, thus generating full (though possibly inaccurate) information on the random quantity of interest. However, many applied research situations in the non-experimental sciences naturally contain response process error, e.g. when psychologists assess patient agreement with various diagnostic survey items or when conservation biologists perform formal assessments to classify species-at-risk. Ignoring the sample-unit-level uncertainty of response process error in such measurement processes can greatly compromise the quality of resulting inferences. In this paper, a general theory of RVVMs is proposed to handle response process error, and several applications are considered.
Affiliation(s)
- Edward Kroc
- Measurement, Evaluation, and Research Methodology Program, Department of Educational and Counselling Psychology, and Special Education, University of British Columbia, Vancouver, Canada
|
14
|
Suibkitwanchai K, Sykulski AM, Perez Algorta G, Waller D, Walshe C. Nonparametric time series summary statistics for high-frequency accelerometry data from individuals with advanced dementia. PLoS One 2020; 15:e0239368. [PMID: 32976498 PMCID: PMC7518630 DOI: 10.1371/journal.pone.0239368] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 04/29/2020] [Accepted: 09/06/2020] [Indexed: 11/18/2022] Open
Abstract
Accelerometry data has been widely used to measure activity and the circadian rhythm of individuals across the health sciences, in particular with people with advanced dementia. Modern accelerometers can record continuous observations on a single individual for several days at a sampling frequency of the order of one hertz. Such rich and lengthy data sets provide new opportunities for statistical insight, but also pose challenges in selecting from a wide range of possible summary statistics, and how the calculation of such statistics should be optimally tuned and implemented. In this paper, we build on existing approaches, as well as propose new summary statistics, and detail how these should be implemented with high frequency accelerometry data. We test and validate our methods on an observed data set from 26 recordings from individuals with advanced dementia and 14 recordings from individuals without dementia. We study four metrics: Interdaily stability (IS), intradaily variability (IV), the scaling exponent from detrended fluctuation analysis (DFA), and a novel nonparametric estimator which we call the proportion of variance (PoV), which calculates the strength of the circadian rhythm using spectral density estimation. We perform a detailed analysis indicating how the time series should be optimally subsampled to calculate IV, and recommend a subsampling rate of approximately 5 minutes for the dataset that has been studied. In addition, we propose the use of the DFA scaling exponent separately for daytime and nighttime, to further separate effects between individuals. We compare the relationships between all these methods and show that they effectively capture different features of the time series.
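Two of the metrics named in this abstract, interdaily stability (IS) and intradaily variability (IV), have widely used textbook definitions that can be sketched in a few lines. The code below is a minimal illustration of those standard formulas, not the authors' implementation: the function names are hypothetical, the input is assumed to be already binned to a fixed number of samples per day, and the subsampling and tuning recommendations of the article are not reproduced.

```python
def interdaily_stability(x, per_day):
    """IS: variance of the average 24-hour profile relative to total variance.

    x       : activity counts, already binned at a fixed rate
    per_day : number of bins per day (len(x) is assumed to be a multiple of it)
    """
    n = len(x)
    mean = sum(x) / n
    total_ss = sum((v - mean) ** 2 for v in x)
    # Average value of each time-of-day bin across all recorded days
    profile = []
    for h in range(per_day):
        vals = x[h::per_day]
        profile.append(sum(vals) / len(vals))
    between_ss = sum((m - mean) ** 2 for m in profile)
    return (n * between_ss) / (per_day * total_ss)

def intradaily_variability(x):
    """IV: mean squared successive difference relative to total variance."""
    n = len(x)
    mean = sum(x) / n
    total_ss = sum((v - mean) ** 2 for v in x)
    diff_ss = sum((x[i] - x[i - 1]) ** 2 for i in range(1, n))
    return (n * diff_ss) / ((n - 1) * total_ss)
```

A signal that repeats identically every day gives IS equal to 1, while a signal that flips between two levels at every sample gives a high IV, which is why the choice of bin width matters for IV.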
Affiliation(s)
- Keerati Suibkitwanchai
- Department of Mathematics and Statistics, Lancaster University, Lancaster, United Kingdom
- Adam M. Sykulski
- Department of Mathematics and Statistics, Lancaster University, Lancaster, United Kingdom
- Daniel Waller
- Department of Mathematics and Statistics, Lancaster University, Lancaster, United Kingdom
- Catherine Walshe
- Division of Health Research, Lancaster University, Lancaster, United Kingdom
|
15
|
van Eenige R, Verhave PS, Koemans PJ, Tiebosch IACW, Rensen PCN, Kooijman S. RandoMice, a novel, user-friendly randomization tool in animal research. PLoS One 2020; 15:e0237096. [PMID: 32756603 PMCID: PMC7406044 DOI: 10.1371/journal.pone.0237096] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 04/22/2020] [Accepted: 07/20/2020] [Indexed: 11/18/2022] Open
Abstract
Careful design of experiments using living organisms (e.g. mice) is of critical importance from both an ethical and a scientific standpoint. Randomization should, whenever possible, be an integral part of such experimental design to reduce bias thereby increasing its reliability and reproducibility. To keep the sample size as low as possible, one might take randomization one step further by controlling for baseline variations in the dependent variable(s) and/or certain known covariates. To give an example, in animal experiments aimed to study atherosclerosis development, one would want to control for baseline characteristics such as plasma triglyceride and total cholesterol levels and body weight. This can be done by first defining blocks to create balance among groups in terms of group size and baseline characteristics, followed by random assignment of the blocks to the various control and intervention groups. In the current study we developed a novel, user-friendly tool that allows users to easily randomize animals into blocks and identify random block divisions that are well-balanced based on given baseline characteristics, making randomization time-efficient and easy-to-use. Here, we present the resulting software tool that we have named RandoMice.
Affiliation(s)
- Robin van Eenige
- Department of Medicine, Division of Endocrinology, Leiden University Medical Center, Leiden, the Netherlands
- Einthoven Laboratory for Experimental Vascular Medicine, Leiden University Medical Center, Leiden, The Netherlands
- Patrick C. N. Rensen
- Department of Medicine, Division of Endocrinology, Leiden University Medical Center, Leiden, the Netherlands
- Einthoven Laboratory for Experimental Vascular Medicine, Leiden University Medical Center, Leiden, The Netherlands
- Sander Kooijman
- Department of Medicine, Division of Endocrinology, Leiden University Medical Center, Leiden, the Netherlands
- Einthoven Laboratory for Experimental Vascular Medicine, Leiden University Medical Center, Leiden, The Netherlands
|
16
|
Abstract
Oncology clinical trials are undergoing transformation to evaluate targeted therapies addressing a wider variety of biologically defined cancer subgroups. Multiarm basket and umbrella trials conducted under master protocols have become more prominent mechanisms for the clinical evaluation of promising new biologically driven anticancer therapies that are integral to precision oncology medicine. These new trial designs permit efficient clinical evaluation of multiple therapies in a variety of histologically and biologically defined cancers. These complex trials require extensive planning and attention to many factors, including choice of biomarker assay platform, mechanism for processing clinicopathologic and biomarker data to assign patients to substudies, and statistical design, monitoring, and analysis of substudies. Trial teams have expanded to include expertise in the interface between biology, clinical oncology, bioinformatics, and statistics. Strategies for the design, conduct, and analysis of these complex trials will continue to evolve to meet new challenges and opportunities in precision oncology medicine.
Affiliation(s)
- Laura M. Yee
- Division of Cancer Treatment and Diagnosis, National Cancer Institute, Bethesda, MD
- Lisa M. McShane
- Division of Cancer Treatment and Diagnosis, National Cancer Institute, Bethesda, MD
- Boris Freidlin
- Division of Cancer Treatment and Diagnosis, National Cancer Institute, Bethesda, MD
- Margaret M. Mooney
- Division of Cancer Treatment and Diagnosis, National Cancer Institute, Bethesda, MD
- Edward L. Korn
- Division of Cancer Treatment and Diagnosis, National Cancer Institute, Bethesda, MD
|
17
|
Li Z, Chang C, Kundu S, Long Q. Bayesian generalized biclustering analysis via adaptive structured shrinkage. Biostatistics 2020; 21:610-624. [PMID: 30596887 PMCID: PMC7307984 DOI: 10.1093/biostatistics/kxy081] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 06/14/2018] [Revised: 09/18/2018] [Accepted: 11/21/2018] [Indexed: 12/13/2022] Open
Abstract
Biclustering techniques can identify local patterns of a data matrix by clustering feature space and sample space at the same time. Various biclustering methods have been proposed and successfully applied to the analysis of gene expression data. While existing biclustering methods have many desirable features, most of them were developed for continuous data, and few can efficiently handle -omics data of various types, for example, binomial data as in single nucleotide polymorphism data or negative binomial data as in RNA-seq data. In addition, none of the existing methods can utilize biological information such as that from functional genomics or proteomics. Recent work has shown that incorporating biological information can improve variable selection and prediction performance in analyses such as linear regression and multivariate analysis. In this article, we propose a novel Bayesian biclustering method that can handle multiple data types including Gaussian, Binomial, and Negative Binomial. In addition, our method uses a Bayesian adaptive structured shrinkage prior that enables feature selection guided by existing biological information. Our simulation studies and application to multi-omics datasets demonstrate robust and superior performance of the proposed method, compared to other existing biclustering methods.
Affiliation(s)
- Ziyi Li
- Department of Biostatistics and Bioinformatics, Emory University, 1518 Clifton Road, NE, Atlanta, GA, USA
- Changgee Chang
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, 423 Guardian Drive, Philadelphia, PA, USA
- Suprateek Kundu
- Department of Biostatistics and Bioinformatics, Emory University, 1518 Clifton Road, NE, Atlanta, GA, USA
- Qi Long
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, 423 Guardian Drive, Philadelphia, PA, USA
|
18
|
Kirby T. Matthias Egger: a man with a method. Lancet Infect Dis 2020; 19:250. [PMID: 30833067 DOI: 10.1016/s1473-3099(19)30077-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/27/2022]
|
19
|
Wit EC, Augugliaro L, Pazira H, González J, Abegaz F. Sparse relative risk regression models. Biostatistics 2020; 21:e131-e147. [PMID: 30380025 PMCID: PMC7868056 DOI: 10.1093/biostatistics/kxy060] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 05/13/2017] [Revised: 09/20/2018] [Accepted: 09/24/2018] [Indexed: 11/15/2022] Open
Abstract
Clinical studies where patients are routinely screened for many genomic features are becoming more common. In principle, this holds the promise of being able to find genomic signatures for a particular disease. In particular, cancer survival is thought to be closely linked to the genomic constitution of the tumor. Discovering such signatures will be useful in the diagnosis of the patient, may be used for treatment decisions and, perhaps, even the development of new treatments. However, genomic data are typically noisy and high-dimensional, with the number of features often exceeding the number of patients included in the study. Regularized survival models have been proposed to deal with such scenarios. These methods typically induce sparsity by means of a coincidental match of the geometry of the convex likelihood and a (near) non-convex regularizer. The disadvantages of such methods are that they are typically non-invariant to scale changes of the covariates, they struggle with highly correlated covariates, and they have a practical problem of determining the amount of regularization. In this article, we propose an extension of the differential geometric least angle regression method for sparse inference in relative risk regression models. A software implementation of our method is available on GitHub (https://github.com/LuigiAugugliaro/dgcox).
Affiliation(s)
- Ernst C Wit
- Institute of Computational Science, USI, Via Buffi 13, Lugano, Switzerland
- Luigi Augugliaro
- Department of Economics, Business and Statistics, University of Palermo, Building 13, Viale delle Scienze, Palermo, Italy
- Hassan Pazira
- Bernoulli Institute, University of Groningen, Nijenborg 9, AG Groningen, The Netherlands
- Javier González
- Amazon Research Cambridge, Poseidon House, Castle Park, Cambridge, UK
- Fentaw Abegaz
- Bernoulli Institute, University of Groningen, Nijenborg 9, AG Groningen, The Netherlands
- Department of Pediatrics and Systems Biology Centre for Energy Metabolism and Ageing, University of Groningen, University Medical Center Groningen, AD Groningen, The Netherlands
|
20
|
Chen LW, Yavuz I, Cheng Y, Wahed AS. Cumulative incidence regression for dynamic treatment regimens. Biostatistics 2020; 21:e113-e130. [PMID: 30371745 PMCID: PMC7868058 DOI: 10.1093/biostatistics/kxy062] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/27/2017] [Revised: 10/01/2018] [Accepted: 10/02/2018] [Indexed: 11/14/2022] Open
Abstract
Recently, dynamic treatment regimens (DTRs) have drawn considerable attention as an effective tool for personalizing medicine. Sequential Multiple Assignment Randomized Trials (SMARTs) are often used to gather data for making inference on DTRs. In this article, we focus on regression analysis of DTRs from a two-stage SMART for competing risk outcomes based on cumulative incidence functions (CIFs). Even though there is extensive work on the regression problem for DTRs, no research has been done on modeling the CIF for SMARTs. We extend existing CIF regression models to handle covariate effects for DTRs. Asymptotic properties are established for our proposed estimators. The models can be implemented using existing software by an augmented-data approximation. We show the improvement provided by our proposed methods by simulation and illustrate their practical utility through an analysis of a SMART neuroblastoma study, where disease progression cannot be observed after death.
Affiliation(s)
- Ling-Wan Chen
- Department of Statistics, University of Pittsburgh, 230 S Bouquet St, Pittsburgh, PA, USA
- Idil Yavuz
- Department of Statistics, Dokuz Eylul University, Tinaztepe, Buca, Izmir, Turkey
- Yu Cheng
- Departments of Statistics and Biostatistics, University of Pittsburgh, 230 S Bouquet St, Pittsburgh, PA, USA
- Abdus S Wahed
- Department of Biostatistics, University of Pittsburgh, 130 DeSoto Street, Pittsburgh, PA, USA
|
21
|
Abstract
This article considers Bayesian approaches for incorporating information from a historical model into a current analysis when the historical model includes only a subset of covariates currently of interest. The statistical challenge is 2-fold. First, the parameters in the nested historical model are not generally equal to their counterparts in the larger current model, neither in value nor interpretation. Second, because the historical information will not be equally informative for all parameters in the current analysis, additional regularization may be required beyond that provided by the historical information. We propose several novel extensions of the so-called power prior that adaptively combine a prior based upon the historical information with a variance-reducing prior that shrinks parameter values toward zero. The ideas are directly motivated by our work building mortality risk prediction models for pediatric patients receiving extracorporeal membrane oxygenation (ECMO). We have developed a model on a registry-based cohort of ECMO patients and now seek to expand this model with additional biometric measurements, not available in the registry, collected on a small auxiliary cohort. Our adaptive priors are able to use the information in the original model and identify novel mortality risk factors. We support this with a simulation study, which demonstrates the potential for efficiency gains in estimation under a variety of scenarios.
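For orientation, the standard power prior that this abstract builds on can be written as follows. This is the textbook form only; the symbols $\theta$ (parameters), $D_0$ (historical data), $a_0$ (discounting weight) and $\pi_0$ (initial prior) are notation chosen for this sketch, and the article's adaptive extensions for a nested historical model are not reproduced here.

```latex
\pi(\theta \mid D_0, a_0) \;\propto\; L(\theta \mid D_0)^{a_0}\, \pi_0(\theta),
\qquad 0 \le a_0 \le 1
```

Setting $a_0 = 0$ discards the historical information entirely, while $a_0 = 1$ weights it as if it were part of the current data.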
Affiliation(s)
- Philip S Boonstra
- Department of Biostatistics, University of Michigan, 1415 Washington Hts, SPHII, Ann Arbor, MI, USA
- Ryan P Barbaro
- Division of Pediatric Critical Care and Child Health Evaluation and Research Unit, University of Michigan, 1500 East Medical Center Drive, Mott, Ann Arbor, MI, USA
|
22
|
White LF, Jiang W, Ma Y, So-Armah K, Samet JH, Cheng DM. Tutorial in Biostatistics: The use of generalized additive models to evaluate alcohol consumption as an exposure variable. Drug Alcohol Depend 2020; 209:107944. [PMID: 32145664 PMCID: PMC7171980 DOI: 10.1016/j.drugalcdep.2020.107944] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 12/19/2019] [Revised: 02/24/2020] [Accepted: 02/25/2020] [Indexed: 11/30/2022]
Abstract
Alcohol consumption is a commonly studied risk factor for many poor health outcomes. Various instruments exist to measure alcohol consumption, including the AUDIT-C, Single Alcohol Screening Questionnaire (SASQ) and Timeline Followback. The information gathered by these instruments is often simplified and analyzed as a dichotomous measure, risking the loss of information of potentially prognostic value. We discuss generalized additive models (GAMs) as a useful tool to understand the association between alcohol consumption and a health outcome. We demonstrate how this analytic strategy can guide the development of a regression model that retains maximal information about alcohol consumption. We illustrate these approaches using data from the Russia ARCH (Alcohol Research Collaboration on HIV/AIDS) study to analyze the association between alcohol consumption and a biomarker of systemic inflammation, interleukin-6 (IL-6). We provide SAS and R code to implement these methods. GAMs have the potential to increase statistical power and allow for better elucidation of more nuanced and non-linear associations between alcohol consumption and important health outcomes.
Affiliation(s)
- Laura F White
- Department of Biostatistics, Boston University School of Public Health, 801 Massachusetts Ave, Boston, MA 02118 United States
- Wenqing Jiang
- Department of Biostatistics, Boston University School of Public Health, 801 Massachusetts Ave, Boston, MA 02118 United States
- Yicheng Ma
- Department of Biostatistics, Boston University School of Public Health, 801 Massachusetts Ave, Boston, MA 02118 United States; Department of Biostatistics, University of Michigan, 1415 Washington Hts, Ann Arbor, MI 48109 United States
- Kaku So-Armah
- General Internal Medicine, Boston University School of Medicine and Boston Medical Center, 801 Massachusetts Ave, Boston, MA 02118 United States
- Jeffrey H Samet
- General Internal Medicine, Boston University School of Medicine and Boston Medical Center, 801 Massachusetts Ave, Boston, MA 02118 United States
- Debbie M Cheng
- Department of Biostatistics, Boston University School of Public Health, 801 Massachusetts Ave, Boston, MA 02118 United States
|
23
|
Abstract
We consider high-dimensional regression over subgroups of observations. Our work is motivated by biomedical problems, where subsets of samples, representing for example disease subtypes, may differ with respect to underlying regression models. In the high-dimensional setting, estimating a different model for each subgroup is challenging due to limited sample sizes. Focusing on the case in which subgroup-specific models may be expected to be similar but not necessarily identical, we treat subgroups as related problem instances and jointly estimate subgroup-specific regression coefficients. This is done in a penalized framework, combining an $\ell_1$ term with an additional term that penalizes differences between subgroup-specific coefficients. This gives solutions that are globally sparse but that allow information-sharing between the subgroups. We present algorithms for estimation and empirical results on simulated data and using Alzheimer's disease, amyotrophic lateral sclerosis, and cancer datasets. These examples demonstrate the gains joint estimation can offer in prediction as well as in providing subgroup-specific sparsity patterns.
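The penalized framework described in this abstract can be sketched, under assumptions, as an objective of the following general shape. Everything beyond "an $\ell_1$ term plus a term penalizing differences between subgroup-specific coefficients" is assumed for illustration: the $1/n_k$ scaling, the pairwise weights $\tau_{k,k'}$, the tuning parameters $\lambda$ and $\gamma$, and the choice of norm $q$ are not taken from the article.

```latex
(\hat{\beta}_1,\dots,\hat{\beta}_K)
= \arg\min_{\beta_1,\dots,\beta_K}
\sum_{k=1}^{K} \frac{1}{n_k}\,\lVert y_k - X_k \beta_k \rVert_2^2
+ \lambda \sum_{k=1}^{K} \lVert \beta_k \rVert_1
+ \gamma \sum_{k < k'} \tau_{k,k'}\, \lVert \beta_k - \beta_{k'} \rVert_q^q
```

Here $k$ indexes subgroups, $(X_k, y_k)$ are the $n_k$ observations in subgroup $k$, the second term encourages global sparsity, and the third term shrinks subgroup-specific coefficient vectors toward one another; with $\gamma = 0$ the problem separates into $K$ independent lasso fits.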
Affiliation(s)
- Frank Dondelinger
- Lancaster Medical School, Lancaster University, Furness College, Bailrigg, Lancaster, UK
- Sach Mukherjee
- Statistics and Machine Learning, German Center for Neurodegenerative Diseases (DZNE), Sigmund-Freud-Straße 27, Bonn, Germany
|
24
|
E Alifieris C, Souferi Chronopoulou E, T Trafalis D, Arvelakis A. The arbitrary magic of p<0.05: Beyond statistics. J BUON 2020; 25:588-593. [PMID: 32521838] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
Modern research and scientific conclusions are widely regarded as valid when the study design and analysis are interpreted correctly. The P-value is considered to be the most commonly used method to provide a dichotomy between true and false data in evidence-based medicine. However, many authors, reviewers and editors may be unfamiliar with the true definition and correct interpretation of this number. This article intends to point out how misunderstanding or misuse of this value can have an impact on both the scientific community and the society we live in. The foundation of the medical education system rewards the abundance of scientific papers rather than the careful search for the truth. Appropriate research ethics should be practised in all stages of the publication process.
Affiliation(s)
- Constantinos E Alifieris
- Recanati/Miller Transplantation Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
|
25
|
Ramanathan K, Thenmozhi M, George S, Anandan S, Veeraraghavan B, Naumova EN, Jeyaseelan L. Assessing Seasonality Variation with Harmonic Regression: Accommodations for Sharp Peaks. Int J Environ Res Public Health 2020; 17:ijerph17041318. [PMID: 32085630 PMCID: PMC7068504 DOI: 10.3390/ijerph17041318] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 12/25/2019] [Revised: 02/06/2020] [Accepted: 02/13/2020] [Indexed: 11/16/2022]
Abstract
The use of the harmonic regression model is well accepted in the epidemiological and biostatistical communities as a standard procedure to examine seasonal patterns in disease occurrence. While these models may provide good fit to periodic patterns with relatively symmetric rises and falls, for some diseases the incidence fluctuates in a more complex manner. We propose a two-step harmonic regression approach to improve the model fit for data exhibiting sharp seasonal peaks. To capture such specific behavior, we first build a basic model and estimate the seasonal peak. At the second step, we apply an extended model using sine and cosine transform functions. These newly proposed functions mimic a quadratic term in the harmonic regression models and thus allow us to better fit the seasonal spikes. We illustrate the proposed method using actual and simulated data and recommend the new approach to assess seasonality in a broad spectrum of diseases manifesting sharp seasonal peaks.
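The first step mentioned in this abstract, fitting a basic harmonic model and estimating the seasonal peak, can be sketched as below. This is a minimal illustration of ordinary single-harmonic regression, y_t = b0 + b_sin * sin(wt) + b_cos * cos(wt), and does not include the article's extended transform functions. The function name `fit_harmonic` is hypothetical, and the closed-form coefficients assume evenly spaced observations covering a whole number of periods (otherwise a general least-squares solver would be needed).

```python
import math

def fit_harmonic(y, period):
    """Fit y_t = b0 + b_sin*sin(w t) + b_cos*cos(w t) for t = 0, 1, ..., n-1.

    Assumes evenly spaced data spanning a whole number of periods, so the
    sine and cosine regressors are orthogonal and least squares reduces
    to simple projections.
    Returns (mean level, amplitude, time of seasonal peak within one period).
    """
    n = len(y)
    w = 2.0 * math.pi / period
    b0 = sum(y) / n
    b_sin = 2.0 / n * sum(v * math.sin(w * t) for t, v in enumerate(y))
    b_cos = 2.0 / n * sum(v * math.cos(w * t) for t, v in enumerate(y))
    amplitude = math.hypot(b_sin, b_cos)
    # b_cos*cos(wt) + b_sin*sin(wt) = A*cos(w*(t - peak)), so peak = atan2(b_sin, b_cos)/w
    peak = (math.atan2(b_sin, b_cos) / w) % period
    return b0, amplitude, peak
```

For monthly counts one would call `fit_harmonic(counts, 12)`; the symmetric cosine shape this model imposes is what the abstract describes as fitting poorly when the seasonal peak is sharp.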
Affiliation(s)
- Kavitha Ramanathan
- Department of Biostatistics, Christian Medical College, Vellore 632002, India
- Mani Thenmozhi
- Department of Biostatistics, Christian Medical College, Vellore 632002, India
- Sebastian George
- Department of Statistics, St. Thomas College, Palai, Kerala 686575, India
- Shalini Anandan
- Department of Clinical Microbiology, Christian Medical College, Vellore 632004, India
- Balaji Veeraraghavan
- Department of Clinical Microbiology, Christian Medical College, Vellore 632004, India
- Elena N. Naumova
- Friedman School of Nutrition Science and Policy, Tufts University, Boston, MA 02111, USA
- Department of Gastrointestinal Sciences, Christian Medical College, Vellore 632004, India
- Lakshmanan Jeyaseelan
- Department of Biostatistics, Christian Medical College, Vellore 632002, India
|
26
|
Abstract
OBJECTIVE: Because it is impossible to know in advance which statistical learning algorithm performs best on a prediction task, it is common to use stacking methods to ensemble individual learners into a more powerful single learner. Stacking algorithms are usually based on linear models, which may run into problems, especially when predictions are highly correlated. In this study, we develop a greedy algorithm for model stacking that overcomes this issue while still being very fast and easy to interpret. We evaluate our greedy algorithm on 7 different data sets from various biomedical disciplines and compare it to linear stacking, genetic algorithm stacking and a brute force approach in different prediction settings. We further apply this algorithm on a task to optimize the weighting of the single domains (e.g., income, education) that build the German Index of Multiple Deprivation (GIMD) to be highly correlated with mortality. RESULTS: The greedy stacking algorithm provides good ensemble weights and outperforms the linear stacker in many tasks. Still, the brute force approach is slightly superior, but is computationally expensive. The greedy weighting algorithm has a variety of possible applications and is fast and efficient. A Python implementation is provided.
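To make the idea of greedy stacking concrete, the sketch below shows a generic greedy forward selection with replacement: at each round, the model whose addition most reduces the squared error of the running average is added to the ensemble. This is a well-known general technique and is offered only as an illustration; it is not the authors' algorithm, and the function name `greedy_stack` and the `rounds` parameter are hypothetical.

```python
def greedy_stack(preds, y, rounds=20):
    """Greedy ensemble weights by forward selection with replacement.

    preds  : list of prediction lists, one per base model
    y      : observed outcomes
    rounds : number of greedy picks (an assumed tuning parameter)
    Returns non-negative weights that sum to 1.
    """
    counts = [0] * len(preds)
    ens_sum = [0.0] * len(y)  # running sum of the selected models' predictions
    for step in range(1, rounds + 1):
        best_j, best_err = None, None
        for j, p in enumerate(preds):
            # Squared error of the ensemble average if model j were added now
            err = sum(((s + pj) / step - yi) ** 2
                      for s, pj, yi in zip(ens_sum, p, y))
            if best_err is None or err < best_err:
                best_j, best_err = j, err
        counts[best_j] += 1
        ens_sum = [s + pj for s, pj in zip(ens_sum, preds[best_j])]
    return [c / rounds for c in counts]
```

Because each weight is just the fraction of rounds in which a model was picked, the result is easy to interpret and does not require inverting a matrix of highly correlated predictions, which is the difficulty the abstract attributes to linear stackers.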
Affiliation(s)
- Christoph F. Kurz
- Institute of Health Economics and Health Care Management, Helmholtz Zentrum München, Ingolstädter Landstraße 1, Neuherberg, Germany
- Werner Maier
- Institute of Health Economics and Health Care Management, Helmholtz Zentrum München, Ingolstädter Landstraße 1, Neuherberg, Germany
- Christian Rink
- MAN Truck & Bus AG Munich, Elisabeth-Selbert-Strasse 1, 80939 München, Germany
|
27
|
Abstract
The quality of medical research importantly depends, among other aspects, on a valid statistical planning of the study, analysis of the data, and reporting of the results, which is usually guaranteed by a biostatistician. However, there are several related professions next to the biostatistician, for example epidemiologists, medical informaticians and bioinformaticians. For medical experts, it is often not clear what the differences between these professions are and how the specific role of a biostatistician can be described. For physicians involved in medical research, this is problematic because false expectations often lead to frustration on both sides. Therefore, the aim of this article is to outline the tasks and responsibilities of biostatisticians in clinical trials as well as in other fields of application in medical research.
Affiliation(s)
- Antonia Zapf
- Department of Medical Biometry and Epidemiology, University Medical Center Hamburg-Eppendorf, Martinistr. 52, 20246 Hamburg, Germany
- Geraldine Rauch
- Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Institute of Biometry and Clinical Epidemiology, Charitéplatz 1, 10117 Berlin, Germany
- Meinhard Kieser
- Institute of Medical Biometry and Informatics, Heidelberg University Hospital, Im Neuenheimer Feld 130.3, 69120 Heidelberg, Germany
|
28
|
Yang Z, Dehmer M, Yli-Harja O, Emmert-Streib F. Combining deep learning with token selection for patient phenotyping from electronic health records. Sci Rep 2020; 10:1432. [PMID: 31996705 PMCID: PMC6989657 DOI: 10.1038/s41598-020-58178-1] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 06/26/2019] [Accepted: 01/13/2020] [Indexed: 01/05/2023] Open
Abstract
Artificial intelligence provides the opportunity to reveal important information buried in large amounts of complex data. Electronic health records (eHRs) are a source of such big data that provide a multitude of health-related clinical information about patients. However, text data from eHRs, e.g., discharge summary notes, are challenging to analyze because these notes are free-form texts and the writing formats and styles vary considerably between different records. For this reason, in this paper we study deep learning neural networks in combination with natural language processing to analyze text data from clinical discharge summaries. We provide a detailed analysis of patient phenotyping, i.e., the automatic prediction of ten patient disorders, by investigating the influence of network architectures, sample sizes and information content of tokens. Importantly, for patients suffering from Chronic Pain, the disorder that is the most difficult one to classify, we find the largest performance gain for a combined word- and sentence-level input convolutional neural network (ws-CNN). As a general result, we find that the combination of data quality and data quantity of the text data plays a crucial role in whether more complex network architectures improve significantly beyond a word-level input CNN model. From our investigations of learning curves and token selection mechanisms, we conclude that for such a transition one requires larger sample sizes because the amount of information per sample is quite small and only carried by few tokens and token categories. Interestingly, we found that the token frequency in the eHRs follows a Zipf law, and we utilized this behavior to investigate the information content of tokens by defining a token selection mechanism. The latter also addresses issues of explainable AI.
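The Zipf-law observation in this abstract can be checked on any token list by regressing log frequency on log rank; a slope near -1 indicates classic Zipf behavior. The sketch below is a generic illustration of that check, not the authors' token selection mechanism, and the function name `zipf_slope` is hypothetical.

```python
import math
from collections import Counter

def zipf_slope(tokens):
    """Least-squares slope of log(frequency) against log(rank).

    Needs at least two distinct tokens. Under a Zipf law, frequency is
    proportional to rank**(-s), so the returned slope estimates -s.
    """
    freqs = sorted(Counter(tokens).values(), reverse=True)
    xs = [math.log(r) for r in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx
```

In practice the low-frequency tail is noisy, so a fit restricted to the top-ranked tokens may be more informative than one over the whole vocabulary.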
Affiliation(s)
- Zhen Yang
- Predictive Society and Data Analytics Lab, Tampere University, Tampere, Korkeakoulunkatu 10, 33720, Tampere, Finland
| | - Matthias Dehmer
- Steyr School of Management, University of Applied Sciences Upper Austria, 4400, Steyr Campus, Austria
- College of Artificial Intelligence, Nankai University, Tianjin, 300350, China
- Department of Biomedical Computer Science and Mechatronics, UMIT-The Health and Life Science University, 6060, Hall in Tyrol, Austria
| | - Olli Yli-Harja
- Computational Systems Biology Lab, Tampere University, Korkeakoulunkatu 10, 33720, Tampere, Finland
- Institute of Biosciences and Medical Technology, Tampere University, Tampere, Korkeakoulunkatu 10, 33720, Tampere, Finland
- Institute for Systems Biology, Seattle, WA, 98109, USA
| | - Frank Emmert-Streib
- Predictive Society and Data Analytics Lab, Tampere University, Tampere, Korkeakoulunkatu 10, 33720, Tampere, Finland.
- Institute of Biosciences and Medical Technology, Tampere University, Tampere, Korkeakoulunkatu 10, 33720, Tampere, Finland.
|
29
|
Choi JY, Kyung M, Hwang H, Park JH. Bayesian Extended Redundancy Analysis: A Bayesian Approach to Component-based Regression with Dimension Reduction. Multivariate Behav Res 2020; 55:30-48. [PMID: 31021267 DOI: 10.1080/00273171.2019.1598837] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]
Abstract
Extended redundancy analysis (ERA) combines linear regression with dimension reduction to explore the directional relationships between multiple sets of predictors and outcome variables in a parsimonious manner. It aims to extract a component from each set of predictors in such a way that it accounts for the maximum variance of outcome variables. In this article, we extend ERA into the Bayesian framework, called Bayesian ERA (BERA). The advantages of BERA are threefold. First, BERA enables statistical inferences based on samples drawn from the joint posterior distribution of parameters obtained from a Markov chain Monte Carlo algorithm. As such, it does not necessitate any resampling method, which is, on the other hand, required for (frequentist) ordinary ERA to test the statistical significance of parameter estimates. Second, it formally incorporates relevant information obtained from previous research into analyses by specifying informative power prior distributions. Third, BERA handles missing data by implementing multiple imputation using a Markov chain Monte Carlo algorithm, avoiding the potential bias of parameter estimates due to missing data. We assess the performance of BERA through simulation studies and apply BERA to real data regarding academic achievement.
Affiliation(s)
- Ji Yeh Choi
- Department of Psychology, National University of Singapore, Singapore, Singapore
| | - Minjung Kyung
- Department of Statistics, Duksung Women's University, Seoul, Korea
| | - Heungsun Hwang
- Department of Psychology, McGill University, Montreal, Quebec, Canada
| | - Ju-Hyun Park
- Department of Statistics, Dongguk University, Seoul, Korea
|
30
|
Marsh HW, Guo J, Dicke T, Parker PD, Craven RG. Confirmatory Factor Analysis (CFA), Exploratory Structural Equation Modeling (ESEM), and Set-ESEM: Optimal Balance Between Goodness of Fit and Parsimony. Multivariate Behav Res 2020; 55:102-119. [PMID: 31204844 DOI: 10.1080/00273171.2019.1602503] [Citation(s) in RCA: 63] [Impact Index Per Article: 15.8] [Indexed: 06/09/2023]
Abstract
CFAs of multidimensional constructs often fail to meet standards of good measurement (e.g., goodness-of-fit, measurement invariance, and well-differentiated factors). Exploratory structural equation modeling (ESEM) represents a compromise between exploratory factor analysis' (EFA) flexibility, and CFA/SEM's rigor and parsimony, but lacks parsimony (particularly in large models) and might confound constructs that need to be kept separate. In Set-ESEM, two or more a priori sets of constructs are modeled within a single model such that cross-loadings are permissible within the same set of factors (as in Full-ESEM) but are constrained to be zero for factors in different sets (as in CFA). The different sets can reflect the same set of constructs on multiple occasions, and/or different constructs measured within the same wave. Hence, Set-ESEM represents a middle ground between the flexibility of traditional ESEM (hereafter referred to as Full-ESEM) and the rigor and parsimony of CFA/SEM. Thus, the purposes of this article are to provide an overview tutorial on Set-ESEM, juxtapose it with Full-ESEM, and to illustrate its application with simulated data and diverse "real" data applications with accessible, heuristic explanations of best practice.
Affiliation(s)
- Herbert W Marsh
- Institute of Positive Psychology and Education, Australian Catholic University, Sydney, Australia
| | - Jiesi Guo
- Institute of Positive Psychology and Education, Australian Catholic University, Sydney, Australia
| | - Theresa Dicke
- Institute of Positive Psychology and Education, Australian Catholic University, Sydney, Australia
| | - Philip D Parker
- Institute of Positive Psychology and Education, Australian Catholic University, Sydney, Australia
| | - Rhonda G Craven
- Institute of Positive Psychology and Education, Australian Catholic University, Sydney, Australia
|
31
|
Abstract
Exploratory mediation analysis via regularization, or XMed, is a recently developed technique that allows one to identify potential mediators of a process of interest. However, as currently implemented, it can only be applied to continuous outcomes. We extend this method to allow application to dichotomous outcomes, including both mediators and dependent variables. Simulation results show that XMed can achieve the same sensitivity as more conventional methods for mediation analysis such as the Sobel test, percentile bootstrap, and bias-corrected bootstrap, but in general requires only half the sample size to do so. We demonstrate the implementation of this approach using an illustrative example examining the relationship between youth behavioral/emotional problems and alcohol use.
Affiliation(s)
- Sarfaraz Serang
- Department of Psychology, Utah State University, Logan, UT, USA
| | - Ross Jacobucci
- Department of Psychology, University of Notre Dame, Notre Dame, IN, USA
|
32
|
Abstract
Inference of variance components in linear mixed modeling (LMM) provides evidence of heterogeneity between individuals or clusters. When only nonnegative variances are allowed, there is a boundary (i.e., 0) in the variances' parameter space, and regular statistical inference procedures for such a parameter could be problematic. The goal of this article is to introduce a practically feasible permutation method to make inferences about variance components while considering the boundary issue in LMM. The permutation tests with different settings (i.e., constrained vs. unconstrained estimation, specific vs. generalized test, different ways of calculating p values, and different ways of permutation) were examined with both normal data and non-normal data. In addition, the permutation tests were compared to likelihood ratio (LR) tests with a mixture of chi-squared distributions as the reference distribution. We found that the unconstrained permutation test with the one-sided p-value approach performed better than the other permutation tests and is a useful alternative when the LR tests are not applicable. An R function is provided to facilitate the implementation of the permutation tests, and a real data example is used to illustrate the application. We hope our results will help researchers choose appropriate tests when testing variance components in LMM.
Affiliation(s)
- Han Du
- Department of Psychology, University of California, Los Angeles, Los Angeles, California, USA
| | - Lijuan Wang
- University of Notre Dame, Notre Dame, Indiana, USA
|
33
|
Abstract
A general modeling framework of response accuracy and response times is proposed to track skill acquisition and provide additional diagnostic information on the change of latent speed in a learning environment. This framework consists of two types of models: a dynamic response model that captures the response accuracy and the change of discrete latent attribute profile upon factors such as practice, intervention effects, and other latent and observable covariates, and a dynamic response time model that describes the change of the continuous response latency due to change of latent attribute profile. These two types of models are connected through a parameter, describing the change rate of the latent speed through the learning process, and a covariate defined as a function of the latent attribute profile. A Bayesian estimation procedure is developed to calibrate the model parameters and measure the latent variables. The estimation algorithm is evaluated through several simulation studies under various conditions. The proposed models are applied to a real data set collected through a spatial rotation diagnostic assessment paired with learning tools.
Affiliation(s)
- Shiyu Wang
- Department of Educational Psychology, University of Georgia
| | - Susu Zhang
- Department of Statistics, Columbia University
| | - Yawei Shen
- Department of Educational Psychology, University of Georgia
|
34
|
Yamashita N, Adachi K. Permutimin: Factor Rotation to Simple Structure with Permutation of Variables. Multivariate Behav Res 2020; 55:17-29. [PMID: 31021266 DOI: 10.1080/00273171.2019.1598331] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]
Abstract
Factor rotation is usually performed for a p-variables [Formula: see text]-factors loading matrix so that the resulting rotated matrix has a simple structure. This simple structure was originally defined by Thurstone (1947) by specifying how zero elements are arranged in the loading matrix. In this article, we propose a new rotation technique, which is directly based on Thurstone's definition. It can give a p-variables [Formula: see text]-factors target matrix of zero and nonzero elements, which stands for the properties to be possessed by the rotated loading matrix. However, it is unknown how the rows of the target matrix are associated with those of the loading matrix. In the proposed procedure, a loading matrix is rotated simultaneously with a permutation of the rows of the target matrix, so that the rotated loading matrix is optimally matched to the permuted target matrix in a least squares sense. Its novel feature is the use of permutation, thus we call the technique Permutimin. Its algorithm is presented, with Thurstone's definition of simple structure modified so as to specify the target matrix uniquely. Permutimin is illustrated with real data examples. Finally, we discuss the relationships between Permutimin and Procrustes rotation.
Affiliation(s)
- Naoto Yamashita
- Graduate School of Human Sciences, Osaka University, Osaka, Japan
| | - Kohei Adachi
- Graduate School of Human Sciences, Osaka University, Osaka, Japan
|
35
|
Wu H, Fai Cheung S, On Leung S. Simple use of BIC to Assess Model Selection Uncertainty: An Illustration using Mediation and Moderation Models. Multivariate Behav Res 2020; 55:1-16. [PMID: 30932709 DOI: 10.1080/00273171.2019.1574546] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 06/09/2023]
Abstract
The Bayesian information criterion (BIC) has sometimes been used in SEM, even when adopting a frequentist approach. Using simple mediation and moderation models as examples, we form a posterior probability distribution using BIC, which we call the BIC posterior, to assess model selection uncertainty over a finite number of models. This is simple but rarely used. The posterior probability distribution can be used to form a credibility set of models and to incorporate prior probabilities for model comparisons and selections. This was validated by a large-scale simulation, and results showed that the approximation via the BIC posterior is very good except when both the sample sizes and the magnitude of parameters are small. We applied the BIC posterior to a real data set, and it has the advantages of flexibility in incorporating priors, addressing overfitting problems, and giving a full picture of the posterior distribution to assess model selection uncertainty.
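The idea of turning BIC values into model probabilities can be sketched with the standard approximation P(M_k | data) ∝ π_k · exp(−BIC_k / 2), normalized over the candidate models. The code below is a minimal illustration under that assumption; the function names `bic_posterior` and `credibility_set`, the model labels, and the BIC values are hypothetical and are not taken from the article.

```python
import math


def bic_posterior(bics, priors=None):
    """Approximate posterior model probabilities from BIC values.

    Uses P(M_k | data) proportional to prior_k * exp(-BIC_k / 2).
    Subtracting the smallest BIC first avoids numerical underflow.
    """
    if priors is None:
        priors = [1.0 / len(bics)] * len(bics)  # equal prior probabilities
    best = min(bics)
    weights = [p * math.exp(-0.5 * (b - best)) for b, p in zip(bics, priors)]
    total = sum(weights)
    return [w / total for w in weights]


def credibility_set(names, probs, level=0.95):
    """Smallest set of models whose cumulative probability reaches `level`."""
    ordered = sorted(zip(names, probs), key=lambda pair: pair[1], reverse=True)
    chosen, cumulative = [], 0.0
    for name, prob in ordered:
        chosen.append(name)
        cumulative += prob
        if cumulative >= level:
            break
    return chosen


# Hypothetical candidate models and BIC values, invented for this example.
names = ["no mediation", "partial mediation", "full mediation"]
bics = [1012.4, 1003.1, 1004.9]
probs = bic_posterior(bics)
cred = credibility_set(names, probs, level=0.95)
```

With equal priors the model with the smallest BIC always receives the largest probability; passing unequal `priors` shows how prior information can shift the credibility set.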
Affiliation(s)
- Huiping Wu
- College of Mathematics and Informatics, Fujian Normal University, Fujian, China
| | - Shu Fai Cheung
- Department of Psychology, University of Macau, Macau, China
|
36
|
Chen PY, Wu W, Garnier-Villarreal M, Kite BA, Jia F. Testing Measurement Invariance with Ordinal Missing Data: A Comparison of Estimators and Missing Data Techniques. Multivariate Behav Res 2020; 55:87-101. [PMID: 31099262 DOI: 10.1080/00273171.2019.1608799] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 06/09/2023]
Abstract
Ordinal missing data are common in measurement equivalence/invariance (ME/I) testing studies. However, there is a lack of guidance on the appropriate method to deal with ordinal missing data in ME/I testing. Five methods may be used to deal with ordinal missing data in ME/I testing, including the continuous full information maximum likelihood estimation method (FIML), continuous robust FIML (rFIML), FIML with probit links (pFIML), FIML with logit links (lFIML), and the mean and variance adjusted weighted least squares estimation method combined with pairwise deletion (WLSMV_PD). The current study evaluates the relative performance of these methods in producing valid chi-square difference tests ([Formula: see text]) and accurate parameter estimates. The results suggest that all methods except for WLSMV_PD can reasonably control the type I error rates of [Formula: see text] tests and maintain sufficient power to detect noninvariance in most conditions. Only pFIML and lFIML yield accurate factor loading estimates and standard errors across all the conditions. Recommendations are provided to researchers based on the results.
Affiliation(s)
- Po-Yi Chen
- Department of Psychology, University of Kansas
| | - Wei Wu
- Department of Psychology, Indiana University-Purdue University Indianapolis
|
37
|
Serhier Z, Bendahhou K, Ben Abdelaziz A, Bennani MO. Methodological sheet n°1: How to calculate the size of a sample for an observational study? Tunis Med 2020; 98:1-7. [PMID: 32395771] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
|
38
|
Gadd SC, Tennant PWG, Heppenstall AJ, Boehnke JR, Gilthorpe MS. Analysing trajectories of a longitudinal exposure: A causal perspective on common methods in lifecourse research. PLoS One 2019; 14:e0225217. [PMID: 31800576 PMCID: PMC6892534 DOI: 10.1371/journal.pone.0225217] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 08/02/2019] [Accepted: 10/29/2019] [Indexed: 11/18/2022] Open
Abstract
Longitudinal data is commonly analysed to inform prevention policies for diseases that may develop throughout life. Common methods interpret the longitudinal data as a series of discrete measurements or as continuous patterns. Some of the latter methods condition on the outcome, aiming to capture ‘average’ patterns within outcome groups, while others capture individual-level pattern features before relating these to the outcome. Conditioning on the outcome may prevent meaningful interpretation. Repeated measurements of a longitudinal exposure (weight) and later outcome (glycated haemoglobin levels) were simulated to match three scenarios: one with no causal relationship between growth rate and glycated haemoglobin; two with a positive causal effect of growth rate on glycated haemoglobin. Two methods that condition on the outcome and one that does not were applied to the data in 1000 simulations. The interpretation of the two-step method matched the simulation in all causal scenarios, but that of the methods conditioning on the outcome did not. Methods that condition on the outcome do not accurately represent a causal relationship between a longitudinal pattern and outcome. Researchers considering longitudinal data should carefully determine if they wish to analyse longitudinal data as a series of discrete time points or by extracting pattern features.
Affiliation(s)
- Sarah C. Gadd
- Leeds Institute of Data Analytics, University of Leeds, Leeds, England, United Kingdom
- School of Geography, University of Leeds, Leeds, England, United Kingdom
| | - Peter W. G. Tennant
- Leeds Institute of Data Analytics, University of Leeds, Leeds, England, United Kingdom
- School of Medicine, University of Leeds, Leeds, England, United Kingdom
- The Alan Turing Institute, London, England, United Kingdom
| | - Alison J. Heppenstall
- Leeds Institute of Data Analytics, University of Leeds, Leeds, England, United Kingdom
- School of Geography, University of Leeds, Leeds, England, United Kingdom
- The Alan Turing Institute, London, England, United Kingdom
| | - Jan R. Boehnke
- School of Nursing and Health Sciences, University of Dundee, Dundee, Scotland, United Kingdom
| | - Mark S. Gilthorpe
- Leeds Institute of Data Analytics, University of Leeds, Leeds, England, United Kingdom
- School of Medicine, University of Leeds, Leeds, England, United Kingdom
- The Alan Turing Institute, London, England, United Kingdom
|
39
|
Abstract
Numerical data in biology and medicine are commonly presented as mean or median with error or confidence limits, to the exclusion of individual values. Analysis of our own and others' data indicates that this practice risks excluding 'Goldilocks' effects in which a biological variable falls within a range between 'too much' and 'too little' with a region between where its function is 'just right'; a concept captured by the Swedish term 'Lagom'. This was confirmed by a narrative search of the literature using the PubMed database, which revealed numerous relationships of biological and clinical phenomena of the Goldilocks/Lagom form including quantitative and qualitative examples from the health and social sciences. Some possible mechanisms underlying these phenomena are considered. We conclude that retrospective analysis of existing data will most likely reveal a vast number of such distributions to the benefit of medical understanding and clinical care and that a transparent approach of presenting each value within a dataset individually should be adopted to ensure a more complete evaluation of research studies in future.
Affiliation(s)
- Henry J Leese
- Centre for Atherothrombosis and Metabolic Disease, Hull York Medical School, University of Hull, Hull, UK
| | - Thozhukat Sathyapalan
- Academic Diabetes, Endocrinology and Metabolism, Hull York Medical School, University of Hull, Hull, UK
| | - Victoria Allgar
- Hull York Medical School, Department of Health Sciences, University of York, York, UK
| | - Daniel R Brison
- Department of Reproductive Medicine, Manchester University NHS Foundation Trust, Manchester, UK
| | - Roger Sturmey
- Centre for Atherothrombosis and Metabolic Disease, Hull York Medical School, University of Hull, Hull, UK
|
40
|
Tang ZZ, Chen G. Zero-inflated generalized Dirichlet multinomial regression model for microbiome compositional data analysis. Biostatistics 2019; 20:698-713. [PMID: 29939212 PMCID: PMC7410344 DOI: 10.1093/biostatistics/kxy025] [Citation(s) in RCA: 38] [Impact Index Per Article: 7.6] [Received: 12/20/2017] [Revised: 04/26/2018] [Accepted: 05/06/2018] [Indexed: 12/19/2022] Open
Abstract
There is heightened interest in using high-throughput sequencing technologies to quantify abundances of microbial taxa and linking the abundance to human diseases and traits. Proper modeling of multivariate taxon counts is essential to the power of detecting this association. Existing models are limited in handling excessive zero observations in taxon counts and in flexibly accommodating complex correlation structures and dispersion patterns among taxa. In this article, we develop a new probability distribution, zero-inflated generalized Dirichlet multinomial (ZIGDM), that overcomes these limitations in modeling multivariate taxon counts. Based on this distribution, we propose a ZIGDM regression model to link microbial abundances to covariates (e.g. disease status) and develop a fast expectation-maximization algorithm to efficiently estimate parameters in the model. The derived tests enable us to reveal rich patterns of variation in microbial compositions including differential mean and dispersion. The advantages of the proposed methods are demonstrated through simulation studies and an analysis of a gut microbiome dataset.
Affiliation(s)
- Zheng-Zheng Tang
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, USA and Wisconsin Institute for Discovery, Madison, WI, USA
| | - Guanhua Chen
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, USA
|
41
|
Abstract
The human microbiome is a complex ecological system, and describing its structure and function under different environmental conditions is important from both basic scientific and medical perspectives. Viewed through a biostatistical lens, many microbiome analysis goals can be formulated as latent variable modeling problems. However, although probabilistic latent variable models are a cornerstone of modern unsupervised learning, they are rarely applied in the context of microbiome data analysis, in spite of the evolutionary, temporal, and count structure that could be directly incorporated through such models. We explore the application of probabilistic latent variable models to microbiome data, with a focus on Latent Dirichlet allocation, Non-negative matrix factorization, and Dynamic Unigram models. To develop guidelines for when different methods are appropriate, we perform a simulation study. We further illustrate and compare these techniques using the data of Dethlefsen and Relman (2011, Incomplete recovery and individualized responses of the human distal gut microbiota to repeated antibiotic perturbation. Proceedings of the National Academy of Sciences 108, 4554-4561), a study on the effects of antibiotics on bacterial community composition. Code and data for all simulations and case studies are available publicly.
Affiliation(s)
- Kris Sankaran
- Department of Statistics, Stanford University, 390 Serra Mall, Stanford, CA, USA
| | - Susan P Holmes
- Department of Statistics, Stanford University, 390 Serra Mall, Stanford, CA, USA
|
42
|
Hassan AK, Lampkin SJ, Hutcherson TC. Students' perceptions of biostatistics following integration into an evidence-based medicine course series. Curr Pharm Teach Learn 2019; 11:614-620. [PMID: 31213318 DOI: 10.1016/j.cptl.2019.02.024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2018] [Revised: 01/04/2019] [Accepted: 02/18/2019] [Indexed: 06/09/2023]
Abstract
BACKGROUND AND PURPOSE Student pharmacists are expected to demonstrate an understanding of commonly employed statistical tests. This study describes the integration of biostatistics in an evidence-based medicine course series using a learner-centered model tailored to students' needs and interests. EDUCATIONAL ACTIVITY AND SETTING This course series included thirteen two-hour biostatistics sessions focused on interpreting results and critiquing statistical methods. Three lab sessions were also included, which focused on producing summary reports from clinical data. Journal club presentations were the key method of assessing knowledge. A survey to evaluate students' perceptions of the course and their level of confidence in applying biostatistical concepts was administered twice to measure change over time within two student cohorts. FINDINGS Results of the survey showed that a significantly higher proportion of students agreed they understood the analyses covered in class (97% vs. 44%, p < 0.001) and felt more confident interpreting results (82% vs. 41%, p < 0.001) in their third year compared to the second year. Students who agreed that they learned important skills for future practice had a significantly higher mean exam score (82.5% vs. 76.2%, p = 0.001). SUMMARY The results indicate an improvement in the students' perceptions over time with regard to knowledge and usefulness of the course content. Although integrating biostatistics in a literature-evaluation course is common, this is the first study that evaluated teaching it in more than one semester beyond inclusion in assessment rubrics.
Affiliation(s)
- Amany K Hassan
- D'Youville School of Pharmacy, 320 Porter Avenue, DAC 427, Buffalo, NY 14201, United States.
| | - Stacie J Lampkin
- D'Youville School of Pharmacy, 320 Porter Avenue, DAC 330, Buffalo, NY 14201, United States.
| | - Timothy C Hutcherson
- D'Youville School of Pharmacy, 320 Porter Avenue, DAC 320, Buffalo, NY 14201, United States.
|
43
|
Abstract
Simulation studies are computer experiments that involve creating data by pseudo-random sampling. A key strength of simulation studies is the ability to understand the behavior of statistical methods because some "truth" (usually some parameter/s of interest) is known from the process of generating the data. This allows us to consider properties of methods, such as bias. While widely used, simulation studies are often poorly designed, analyzed, and reported. This tutorial outlines the rationale for using simulation studies and offers guidance for design, execution, analysis, reporting, and presentation. In particular, this tutorial provides a structured approach for planning and reporting simulation studies, which involves defining aims, data-generating mechanisms, estimands, methods, and performance measures ("ADEMP"); coherent terminology for simulation studies; guidance on coding simulation studies; a critical discussion of key performance measures and their estimation; guidance on structuring tabular and graphical presentation of results; and new graphical presentations. With a view to describing recent practice, we review 100 articles taken from Volume 34 of Statistics in Medicine, which included at least one simulation study and identify areas for improvement.
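To make the "ADEMP" structure described in the abstract above concrete, here is a minimal sketch of a simulation study. The aim is to compare the bias of two variance estimators, the data-generating mechanism is a normal sample, the estimand is the true variance, the methods are the divide-by-n and divide-by-(n−1) estimators, and the performance measure is bias with its Monte Carlo standard error. The function name `simulate`, the scenario, and all settings (sample size, number of repetitions, seed) are assumptions chosen for illustration, not material from the tutorial.

```python
import math
import random
import statistics


def simulate(n_sim=2000, n_obs=10, sigma=2.0, seed=2019):
    """Estimate bias (and its Monte Carlo SE) of two variance estimators."""
    rng = random.Random(seed)   # fixed seed so the run is reproducible
    true_var = sigma ** 2       # estimand, known from the data-generating mechanism
    estimates = {"divide_by_n": [], "divide_by_n_minus_1": []}

    for _ in range(n_sim):
        # Data-generating mechanism: n_obs draws from Normal(0, sigma^2).
        sample = [rng.gauss(0.0, sigma) for _ in range(n_obs)]
        # Methods: population-style and sample-style variance estimators.
        estimates["divide_by_n"].append(statistics.pvariance(sample))
        estimates["divide_by_n_minus_1"].append(statistics.variance(sample))

    results = {}
    for name, values in estimates.items():
        bias = statistics.fmean(values) - true_var
        # Monte Carlo SE of the bias: SD of the estimates over sqrt(n_sim).
        mcse = statistics.stdev(values) / math.sqrt(n_sim)
        results[name] = (bias, mcse)
    return results


results = simulate()
for name, (bias, mcse) in results.items():
    print(f"{name}: bias = {bias:.3f} (Monte Carlo SE {mcse:.3f})")
```

Reporting the Monte Carlo standard error next to each bias estimate shows whether an apparent difference between methods exceeds simulation noise, and whether the number of repetitions needs to grow.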
Affiliation(s)
- Tim P. Morris
- London Hub for Trials Methodology Research, MRC Clinical Trials Unit at UCL, London, United Kingdom
| | - Ian R. White
- London Hub for Trials Methodology Research, MRC Clinical Trials Unit at UCL, London, United Kingdom
| | - Michael J. Crowther
- Biostatistics Research Group, Department of Health Sciences, University of Leicester, Leicester, United Kingdom
|
44
|
Janzén DLI, Jirstrand M, Chappell MJ, Evans ND. Three novel approaches to structural identifiability analysis in mixed-effects models. Comput Methods Programs Biomed 2019; 171:141-152. [PMID: 27181677 DOI: 10.1016/j.cmpb.2016.04.024] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 12/15/2015] [Revised: 03/21/2016] [Accepted: 04/21/2016] [Indexed: 06/05/2023]
Abstract
BACKGROUND AND OBJECTIVE Structural identifiability is a concept that considers whether the structure of a model together with a set of input-output relations uniquely determines the model parameters. In the mathematical modelling of biological systems, structural identifiability is an important concept since biological interpretations are typically made from the parameter estimates. For a system defined by ordinary differential equations, several methods have been developed to analyse whether the model is structurally identifiable or otherwise. Another well-used modelling framework, which is particularly useful when the experimental data are sparsely sampled and the population variance is of interest, is mixed-effects modelling. However, established identifiability analysis techniques for ordinary differential equations are not directly applicable to such models. METHODS In this paper, we present and apply three different methods that can be used to study structural identifiability in mixed-effects models. The first method, called the repeated measurement approach, is based on applying a set of previously established statistical theorems. The second method, called the augmented system approach, is based on augmenting the mixed-effects model to an extended state-space form. The third method, called the Laplace transform mixed-effects extension, is based on considering the moment invariants of the system's transfer function as functions of random variables. RESULTS To illustrate, compare and contrast the application of the three methods, they are applied to a set of mixed-effects models. CONCLUSIONS Three structural identifiability analysis methods applicable to mixed-effects models have been presented in this paper. As method development of structural identifiability techniques for mixed-effects models has been given very little attention, despite mixed-effects models being widely used, the methods presented in this paper provide a way of handling structural identifiability in mixed-effects models that was previously not possible.
Affiliation(s)
- David L I Janzén
- Department of Systems and Data Analysis, Fraunhofer-Chalmers Centre, Chalmers Science Park, SE-412 88 Gothenburg, Sweden; AstraZeneca RD, SE-431 83 Mölndal, Sweden; School of Engineering, University of Warwick, Coventry CV4 7AL, UK.
| | - Mats Jirstrand
- Department of Systems and Data Analysis, Fraunhofer-Chalmers Centre, Chalmers Science Park, SE-412 88 Gothenburg, Sweden
| | | | - Neil D Evans
- School of Engineering, University of Warwick, Coventry CV4 7AL, UK
|
45
|
Abstract
BACKGROUND With progress on both the theoretical and the computational fronts, the use of spline modelling has become an established tool in statistical regression analysis. An important issue in spline modelling is the availability of user-friendly, well-documented software packages. Following the idea of the STRengthening Analytical Thinking for Observational Studies initiative to provide users with guidance documents on the application of statistical methods in observational research, the aim of this article is to provide an overview of the most widely used spline-based techniques and their implementation in R. METHODS In this work, we focus on the R Language for Statistical Computing, which has become hugely popular statistical software. We identified a set of packages that include functions for spline modelling within a regression framework. Using simulated and real data we provide an introduction to spline modelling and an overview of the most popular spline functions. RESULTS We present a series of simple scenarios of univariate data, where different basis functions are used to identify the correct functional form of an independent variable. Even in simple data, using routines from different packages would lead to different results. CONCLUSIONS This work illustrates challenges that an analyst faces when working with data. Most differences can be attributed to the choice of hyper-parameters rather than the basis used. In fact, an experienced user will know how to obtain a reasonable outcome, regardless of the type of spline used. However, many analysts do not have sufficient knowledge to use these powerful tools adequately and will need more guidance.
Affiliation(s)
- Aris Perperoglou
- Department of Mathematical Sciences, University of Essex, Colchester, UK
- Willi Sauerbrei
- Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center, University of Freiburg, Freiburg, Germany
- Matthias Schmid
- Medical Biometry, Informatics and Epidemiology, Faculty of Medicine, University of Bonn, Bonn, Germany
|
46
|
Abstract
INTRODUCTION The requirement for medical services fluctuates. This study was carried out to extrapolate the service requirements for various cardiology services at Mater Dei Hospital, Malta over the coming five years, based on service demands from previous years. METHODS Past annual data were obtained from hospital records for various services (to 2017). Linear regression was carried out using a bespoke Excel™ spreadsheet in order to extrapolate possible service requirements up to 2022. RESULTS All services are expected to increase, with forecasts ranging between 41 and 354%, depending on the service being considered. DISCUSSION It is easy to "get on with it" and perform the work required at the workplace, but this study has shown that it is equally important to anticipate demands lest lack of planning leads to long and important waiting lists for critical diagnostics and treatments. Health care provision requirements are increasing worldwide. Even using conservative estimates and in the absence of the creation of new services, the demands for extant services are likely to continue to grow. Unless medium-term plans are made for hardware, software, physical space and staffing, and the funding thereof, waiting lists for investigations in this speciality are bound to rise. This may be mitigated by novel treatments, but since these cannot be predicted, it would be safer and wiser to plan ahead lest we are overwhelmed. This paper has also shown how WASP (Write a Scientific Paper) precepts can be applied to elegantly study a problem and write up a paper.
Affiliation(s)
- Victor Grech
- University of Malta and Consultant Paediatric Cardiologist, Mater Dei Hospital, Malta.
|
47
|
Maqsood I, Bukhari SM, Ejaz R, Kausar S, Abbas MN, Ali B, Ke R. Biostatistical Options for Quantitative Diet Analysis. J Agric Food Chem 2019; 67:5-12. [PMID: 30520629 DOI: 10.1021/acs.jafc.8b05156] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]
Abstract
Sufficient statistics knowledge is crucial for the correct design of a research plan. The elucidations of results are interpretive only if appropriate statistical methods are applied. Statistical strategies are a particular approach to demonstrate complicated information in broad and explicable conclusions. The emergence of biostatistical approaches for diet evaluation has improved the accuracy of diet estimation, and different methodologies of data integration promise to magnify our understanding of ecological communities. The present study aimed to compile multiple statistical methods used for diet analysis. More specifically, the significant analyses used in diet assessment, and the central expectations and preferences related to each measure, were conceptualized. In addition, the ability of each test to evaluate diversity, richness, differentiation, fluctuation, similarity, and quantification of multiple diet items was summarized. Moreover, different options were proposed for researchers to select the appropriate statistical tests. This study covers a framework, aim, and understanding of the statistical test methods of diet analysis.
Affiliation(s)
- Iram Maqsood
- College of Wildlife Resources, Northeast Forestry University, Hexing Road 59 Street, Xiang Fang District, Harbin City 150040, China
- Department of Zoology, Shaheed Benazir Bhutto Women University Peshawar, Peshawar 25000, Pakistan
- Syed Moshin Bukhari
- College of Wildlife and Ecology, University of Veterinary and Animal Sciences, Lahore 54500, Pakistan
- Rabea Ejaz
- Department of Zoology, Shaheed Benazir Bhutto Women University Peshawar, Peshawar 25000, Pakistan
- Saima Kausar
- College of Life Sciences, Anhui Agricultural University, Hefei 230036, China
- Bahar Ali
- College of Plant Sciences and Technology, Hubei Insect Resources Utilization and Sustainable Pest Management Key Laboratory, Huazhong Agriculture University, Wuhan, Hubei 430070, China
- Rong Ke
- College of Wildlife Resources, Northeast Forestry University, Hexing Road 59 Street, Xiang Fang District, Harbin City 150040, China
|
48
|
Wu Z, Casciola-Rosen L, Shah AA, Rosen A, Zeger SL. Estimating autoantibody signatures to detect autoimmune disease patient subsets. Biostatistics 2019; 20:30-47. [PMID: 29140482 PMCID: PMC6657300 DOI: 10.1093/biostatistics/kxx061] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 04/17/2017] [Revised: 08/03/2017] [Accepted: 10/02/2017] [Indexed: 11/12/2022] Open
Abstract
Autoimmune diseases are characterized by highly specific immune responses against molecules in self-tissues. Different autoimmune diseases are characterized by distinct immune responses, making autoantibodies useful for diagnosis and prediction. In many diseases, the targets of autoantibodies are incompletely defined. Although the technologies for autoantibody discovery have advanced dramatically over the past decade, each of these techniques generates hundreds of possibilities, which are onerous and expensive to validate. We set out to establish a method to greatly simplify autoantibody discovery, using a pre-filtering step to define subgroups with similar specificities based on migration of radiolabeled, immunoprecipitated proteins on sodium dodecyl sulfate (SDS) gels and autoradiography [Gel Electrophoresis and band detection on Autoradiograms (GEA)]. Human recognition of patterns is not optimal when the patterns are complex or scattered across many samples. Multiple sources of errors, including irrelevant intensity differences and warping of gels, have challenged automation of pattern discovery from autoradiograms. In this article, we address these limitations using a Bayesian hierarchical model with shrinkage priors for pattern alignment and spatial dewarping. The Bayesian model combines information from multiple gel sets and corrects spatial warping for coherent estimation of autoantibody signatures defined by presence or absence of a grid of landmark proteins. We show the pre-processing creates more clearly separated clusters and improves the accuracy of autoantibody subset detection via hierarchical clustering. Finally, we demonstrate the utility of the proposed methods with GEA data from scleroderma patients.
Affiliation(s)
- Zhenke Wu
- Department of Biostatistics and Michigan Institute of Data Science, University of Michigan, 1415 Washington Heights, Ann Arbor, MI, USA
- Livia Casciola-Rosen
- Division of Rheumatology, The Johns Hopkins University School of Medicine, Bayview Medical Center, 5200 Eastern Avenue, Mason F. Lord Building, Center Tower, Baltimore, MD, USA
- Ami A Shah
- Division of Rheumatology, The Johns Hopkins University School of Medicine, Bayview Medical Center, 5200 Eastern Avenue, Mason F. Lord Building, Center Tower, Baltimore, MD, USA
- Antony Rosen
- Division of Rheumatology, The Johns Hopkins University School of Medicine, Bayview Medical Center, 5200 Eastern Avenue, Mason F. Lord Building, Center Tower, Baltimore, MD, USA
- Scott L Zeger
- Department of Biostatistics, The Johns Hopkins University, 615 N Wolfe Street, Baltimore, MD, USA
|
49
|
Lee JE, Sung JH, Sarpong D, Efird JT, Tchounwou PB, Ofili E, Norris K. Knowledge Management for Fostering Biostatistical Collaboration within a Research Network: The RTRN Case Study. Int J Environ Res Public Health 2018; 15:ijerph15112533. [PMID: 30424550 PMCID: PMC6266008 DOI: 10.3390/ijerph15112533] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2018] [Revised: 10/04/2018] [Accepted: 11/05/2018] [Indexed: 11/25/2022]
Abstract
Purpose: While the intellectual and scientific rationale for research collaboration has been articulated, a paucity of information is available on a strategic approach to facilitate the collaboration within a research network designed to reduce health disparities. This study aimed to (1) develop a conceptual model to facilitate collaboration among biostatisticians in a research network; (2) describe collaborative engagement performed by the Network’s Data Coordinating Center (DCC); and (3) discuss potential challenges and opportunities in engaging the collaboration. Methods: Key components of the strategic approach will be developed through a systematic literature review. The Network’s initiatives for the biostatistical collaboration will be described in the areas of infrastructure, expertise and knowledge management and experiential lessons will be discussed. Results: Components of the strategic approach model included three Ps (people, processes and programs) which were integrated into expert management, infrastructure management and knowledge management, respectively. Ongoing initiatives for collaboration with non-DCC biostatisticians included both web-based and face-to-face interaction approaches: Network’s biostatistical capacities and needs assessment, webinar statistical seminars, mobile statistical workshop and clinics, adjunct appointment program, one-on-one consulting, and on-site workshop. The outreach program, as a face-to-face interaction approach, especially resulted in a useful tool for expertise management and needs assessment as well as knowledge exchange. Conclusions: Although fostering a partnered research culture, sustaining senior management commitment and ongoing monitoring are a challenge for this collaborative engagement, the proposed strategies centrally performed by the DCC may be useful in accelerating the pace and enhancing the quality of the scientific outcomes within a multidisciplinary clinical and translational research network.
Affiliation(s)
- Jae Eun Lee
- Research Centers in Minority Institutions Translational Research Network Data Coordinating Center, Mississippi e-Center, Jackson State University, 1230 Raymond Rd., Jackson, MS 39204, USA.
- Department of Biostatistics and Epidemiology, College of Public Services, Jackson State University, 350 W. Woodrow Wilson Drive Jackson Medical Mall, Suite 301, Jackson, MS 39213, USA.
- Jung Hye Sung
- Department of Biostatistics and Epidemiology, College of Public Services, Jackson State University, 350 W. Woodrow Wilson Drive Jackson Medical Mall, Suite 301, Jackson, MS 39213, USA.
- Daniel Sarpong
- Center for Minority Health and Health Disparities Research and Education, Xavier University, 1 Drexel Drive, New Orleans, LA 70125, USA.
- Jimmy T Efird
- Center for Clinical Epidemiology and Biostatistics (CCEB), School of Medicine and Public Health, the University of Newcastle (UoN), Callaghan, NSW 2308, Australia.
- Paul B Tchounwou
- Research Centers in Minority Institutions Translational Research Network Data Coordinating Center, Mississippi e-Center, Jackson State University, 1230 Raymond Rd., Jackson, MS 39204, USA.
- Elizabeth Ofili
- Clinical Research Center & Clinical and Translational Research, Morehouse School of Medicine, 720 Westview Drive, Atlanta, GA 30310, USA.
- Keith Norris
- Department of Medicine, David Geffen School of Medicine, UCLA, 911 Broxton Ave, Room 103, Los Angeles, CA 90024, USA.
|
50
|
Nelson A. An Interactive Workshop Reviewing Basic Biostatistics and Applying Bayes' Theorem to Diagnostic Testing and Clinical Decision-Making. MedEdPORTAL 2018; 14:10771. [PMID: 30800971 PMCID: PMC6346275 DOI: 10.15766/mep_2374-8265.10771] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 07/06/2018] [Accepted: 10/04/2018] [Indexed: 06/09/2023]
Abstract
INTRODUCTION Sensitivity, specificity, and predictive values, the basic statistics behind using and interpreting screening and diagnostic tests, are taught in all medical schools, yet studies have shown that a majority of physicians cannot correctly define and apply these concepts. Previous work has not rigorously examined this disconnect and attempted to address it. METHODS We used adult learning theory to design a case-based interactive workshop to review biostatistics and apply them to clinical decision-making using Bayes' theorem. Participants took an anonymous multiple-choice pretest, posttest, and delayed posttest on definitions and application of the concepts, and we compared the scores between the three tests. Several experiences with early iterations provided feedback to improve the workshop but were not included for analysis. RESULTS We conducted the finalized workshop with 54 pediatrics students, residents, and faculty. All learners completed the immediate pre- and posttests, and eight completed the delayed posttest. Average scores rose from 4.5/8 (56%) on the pretest to 6.5/8 (81%) on the posttest and 6.4/8 (80%) on the delayed posttest. Two-tailed t tests showed p < .001 for the difference between the pretest and both posttests, and post hoc power analysis showed a power of 99% to detect the observed differences. There was no significant difference (p = .8) between the posttest and delayed posttest. DISCUSSION Our work demonstrates that an interactive workshop reviewing basic biostatistics and teaching rational diagnostic testing using Bayes' theorem can be effective in connecting theoretical knowledge of biostatistics to evidence-based decision-making in real clinical practice.
Affiliation(s)
- Adin Nelson
- Assistant Professor, Department of Pediatrics, Rutgers New Jersey Medical School
|