1
Pokutnaya D, Van Panhuis WG, Childers B, Hawkins MS, Arcury-Quandt AE, Matlack M, Carpio K, Hochheiser H. Inter-rater reliability of the infectious disease modeling reproducibility checklist (IDMRC) as applied to COVID-19 computational modeling research. BMC Infect Dis 2023; 23:733. [PMID: 37891462 PMCID: PMC10612332 DOI: 10.1186/s12879-023-08729-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/22/2023] [Accepted: 10/19/2023] [Indexed: 10/29/2023] [Open Access]
Abstract
BACKGROUND Infectious disease computational modeling studies have been widely published during the coronavirus disease 2019 (COVID-19) pandemic, yet they have limited reproducibility. Developed through an iterative testing process with multiple reviewers, the Infectious Disease Modeling Reproducibility Checklist (IDMRC) enumerates the minimal elements necessary to support reproducible infectious disease computational modeling publications. The primary objective of this study was to assess the reliability of the IDMRC and to identify which reproducibility elements were unreported in a sample of COVID-19 computational modeling publications. METHODS Four reviewers used the IDMRC to assess 46 preprint and peer reviewed COVID-19 modeling studies published between March 13th, 2020, and July 30th, 2020. The inter-rater reliability was evaluated by mean percent agreement and Fleiss' kappa coefficients (κ). Papers were ranked based on the average number of reported reproducibility elements, and average proportion of papers that reported each checklist item were tabulated. RESULTS Questions related to the computational environment (mean κ = 0.90, range = 0.90-0.90), analytical software (mean κ = 0.74, range = 0.68-0.82), model description (mean κ = 0.71, range = 0.58-0.84), model implementation (mean κ = 0.68, range = 0.39-0.86), and experimental protocol (mean κ = 0.63, range = 0.58-0.69) had moderate or greater (κ > 0.41) inter-rater reliability. Questions related to data had the lowest values (mean κ = 0.37, range = 0.23-0.59). Reviewers ranked similar papers in the upper and lower quartiles based on the proportion of reproducibility elements each paper reported. While over 70% of the publications provided data used in their models, less than 30% provided the model implementation. CONCLUSIONS The IDMRC is the first comprehensive, quality-assessed tool for guiding researchers in reporting reproducible infectious disease computational modeling studies. The inter-rater reliability assessment found that most scores were characterized by moderate or greater agreement. These results suggest that the IDMRC might be used to provide reliable assessments of the potential for reproducibility of published infectious disease modeling publications. Results of this evaluation identified opportunities for improvement to the model implementation and data questions that can further improve the reliability of the checklist.
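As an illustration of the agreement statistic named in the abstract above, Fleiss' kappa for a single checklist question can be computed from a table of rating counts. The following is a minimal sketch, not the authors' analysis code: the function name `fleiss_kappa` and the example ratings are hypothetical, and it assumes every paper is rated by the same number of reviewers.

```python
import math
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a table where counts[i][j] is the number of
    raters who assigned item i to category j."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_items == 0 or n_raters < 2:
        raise ValueError("need at least one item and two raters")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item must be rated by the same number of raters")
    n_cats = len(counts[0])

    # Observed agreement: share of agreeing rater pairs, averaged over items.
    p_items = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_items) / n_items

    # Chance agreement: sum of squared overall category proportions.
    p_cats = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_cats)
    ]
    p_expected = sum(p * p for p in p_cats)

    if math.isclose(p_expected, 1.0):
        return float("nan")  # undefined when every rating falls in one category
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical data: four reviewers answer one yes/no checklist question
# for five papers; each row is [number of "yes", number of "no"].
example_counts = [[4, 0], [3, 1], [0, 4], [4, 0], [2, 2]]
print(f"kappa = {fleiss_kappa(example_counts):.3f}")
```

A value above 0.41 would fall in the "moderate or greater" band that the abstract refers to.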
Affiliation(s)
- Darya Pokutnaya
  - Department of Epidemiology, University of Pittsburgh, Pittsburgh, PA, United States of America
- Willem G Van Panhuis
  - Office of Data Science and Emerging Technologies, National Institute of Allergy and Infectious Diseases, Rockville, MD, United States of America
- Bruce Childers
  - Department of Computer Science, University of Pittsburgh, Pittsburgh, PA, United States of America
- Marquis S Hawkins
  - Department of Epidemiology, University of Pittsburgh, Pittsburgh, PA, United States of America
- Alice E Arcury-Quandt
  - Department of Epidemiology, University of Pittsburgh, Pittsburgh, PA, United States of America
- Meghan Matlack
  - Department of Environmental and Occupational Health, University of Pittsburgh, Pittsburgh, PA, USA
- Kharlya Carpio
  - Department of Epidemiology, University of Pittsburgh, Pittsburgh, PA, United States of America
- Harry Hochheiser
  - Department of Biomedical Informatics, Intelligent Systems Program, and Clinical and Translational Science Institute, University of Pittsburgh, Pittsburgh, PA, United States of America
2
Mendes P. Reproducibility and FAIR principles: the case of a segment polarity network model. Front Cell Dev Biol 2023; 11:1201673. [PMID: 37346177 PMCID: PMC10279958 DOI: 10.3389/fcell.2023.1201673] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/06/2023] [Accepted: 05/30/2023] [Indexed: 06/23/2023] [Open Access]
Abstract
The issue of reproducibility of computational models and the related FAIR principles (findable, accessible, interoperable, and reusable) are examined in a specific test case. I analyze a computational model of the segment polarity network in Drosophila embryos published in 2000. Despite the high number of citations to this publication, 23 years later the model is barely accessible, and consequently not interoperable. Following the text of the original publication allowed successfully encoding the model for the open source software COPASI. Subsequently saving the model in the SBML format allowed it to be reused in other open source software packages. Submission of this SBML encoding of the model to the BioModels database enables its findability and accessibility. This demonstrates how the FAIR principles can be successfully enabled by using open source software, widely adopted standards, and public repositories, facilitating reproducibility and reuse of computational cell biology models that will outlive the specific software used.
Affiliation(s)
- Pedro Mendes
  - Center for Cell Analysis and Modeling, University of Connecticut School of Medicine, Farmington, CT, United States
  - Department of Cell Biology, University of Connecticut School of Medicine, Farmington, CT, United States
3
Pokutnaya D, Van Panhuis WG, Childers B, Hawkins MS, Arcury-Quandt AE, Matlack M, Carpio K, Hochheiser H. Inter-rater reliability of the Infectious Disease Modeling Reproducibility Checklist (IDMRC) as applied to COVID-19 computational modeling research. medRxiv 2023:2023.03.21.23287529. [PMID: 36993426 PMCID: PMC10055605 DOI: 10.1101/2023.03.21.23287529] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/31/2023]
Abstract
Background Infectious disease computational modeling studies have been widely published during the coronavirus disease 2019 (COVID-19) pandemic, yet they have limited reproducibility. Developed through an iterative testing process with multiple reviewers, the Infectious Disease Modeling Reproducibility Checklist (IDMRC) enumerates the minimal elements necessary to support reproducible infectious disease computational modeling publications. The primary objective of this study was to assess the reliability of the IDMRC and to identify which reproducibility elements were unreported in a sample of COVID-19 computational modeling publications. Methods Four reviewers used the IDMRC to assess 46 preprint and peer reviewed COVID-19 modeling studies published between March 13th, 2020, and July 31st, 2020. The inter-rater reliability was evaluated by mean percent agreement and Fleiss' kappa coefficients (κ). Papers were ranked based on the average number of reported reproducibility elements, and average proportion of papers that reported each checklist item were tabulated. Results Questions related to the computational environment (mean κ = 0.90, range = 0.90-0.90), analytical software (mean κ = 0.74, range = 0.68-0.82), model description (mean κ = 0.71, range = 0.58-0.84), model implementation (mean κ = 0.68, range = 0.39-0.86), and experimental protocol (mean κ = 0.63, range = 0.58-0.69) had moderate or greater (κ > 0.41) inter-rater reliability. Questions related to data had the lowest values (mean κ = 0.37, range = 0.23-0.59). Reviewers ranked similar papers in the upper and lower quartiles based on the proportion of reproducibility elements each paper reported. While over 70% of the publications provided data used in their models, less than 30% provided the model implementation. Conclusions The IDMRC is the first comprehensive, quality-assessed tool for guiding researchers in reporting reproducible infectious disease computational modeling studies. The inter-rater reliability assessment found that most scores were characterized by moderate or greater agreement. These results suggest that the IDMRC might be used to provide reliable assessments of the potential for reproducibility of published infectious disease modeling publications. Results of this evaluation identified opportunities for improvement to the model implementation and data questions that can further improve the reliability of the checklist.
Affiliation(s)
- Darya Pokutnaya
  - University of Pittsburgh, Department of Epidemiology; Pittsburgh, Pennsylvania, United States of America
- Willem G Van Panhuis
  - Office of Data Science and Emerging Technologies, National Institute of Allergy and Infectious Diseases; Rockville, Maryland, United States of America [note that Dr. Van Panhuis completed the research described in this paper during his time at the University of Pittsburgh, before starting his position at NIAID]
- Bruce Childers
  - University of Pittsburgh, Department of Computer Science; Pittsburgh, Pennsylvania, United States of America
- Marquis S Hawkins
  - University of Pittsburgh, Department of Epidemiology; Pittsburgh, Pennsylvania, United States of America
- Alice E Arcury-Quandt
  - University of Pittsburgh, Department of Epidemiology; Pittsburgh, Pennsylvania, United States of America
- Meghan Matlack
  - University of Pittsburgh, Department of Environmental and Occupational Health, Pittsburgh, PA, USA
- Kharlya Carpio
  - University of Pittsburgh, Department of Epidemiology; Pittsburgh, Pennsylvania, United States of America
- Harry Hochheiser
  - University of Pittsburgh, Department of Biomedical Informatics, Intelligent Systems Program, and Clinical and Translational Science Institute; Pittsburgh, Pennsylvania, United States of America
4
Parker C, Nelson E, Zhang T. VeVaPy, a Python Platform for Efficient Verification and Validation of Systems Biology Models with Demonstrations Using Hypothalamic-Pituitary-Adrenal Axis Models. Entropy (Basel) 2022; 24:1747. [PMID: 36554152 PMCID: PMC9777964 DOI: 10.3390/e24121747] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2022] [Revised: 11/21/2022] [Accepted: 11/24/2022] [Indexed: 06/17/2023]
Abstract
In order for mathematical models to make credible contributions, it is essential for them to be verified and validated. Currently, verification and validation (V&V) of these models does not meet the expectations of the systems biology and systems pharmacology communities. Partially as a result of this shortfall, systemic V&V of existing models currently requires a lot of time and effort. In order to facilitate systemic V&V of chosen hypothalamic-pituitary-adrenal (HPA) axis models, we have developed a computational framework named VeVaPy, taking care to follow the recommended best practices regarding the development of mathematical models. VeVaPy includes four functional modules coded in Python, and the source code is publicly available. We demonstrate that VeVaPy can help us efficiently verify and validate the five HPA axis models we have chosen. Supplied with new and independent data, VeVaPy outputs objective V&V benchmarks for each model. We believe that VeVaPy will help future researchers with basic modeling and programming experience to efficiently verify and validate mathematical models from the fields of systems biology and systems pharmacology.
Affiliation(s)
- Christopher Parker
  - Department of Pharmacology & Systems Physiology, College of Medicine, University of Cincinnati, Cincinnati, OH 45221, USA
- Erik Nelson
  - Department of Psychiatry & Behavioral Neuroscience, College of Medicine, University of Cincinnati, Cincinnati, OH 45221, USA
- Tongli Zhang
  - Department of Pharmacology & Systems Physiology, College of Medicine, University of Cincinnati, Cincinnati, OH 45221, USA
5
Mattison KA, Merchak AR, Wieman ST, Zimmer S, Fankhauser SC. Engaging young scholars in science through publication: A survey analysis of published middle and high school authors. Learned Publishing 2022. [DOI: 10.1002/leap.1480] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
Affiliation(s)
- Kari A. Mattison
  - Journal of Emerging Investigators, Oro Valley, Arizona, USA
  - Department of Human Genetics, Emory University, Atlanta, Georgia, USA
- Andrea R. Merchak
  - Journal of Emerging Investigators, Oro Valley, Arizona, USA
  - Department of Neuroscience, College of Medicine, University of Virginia, Charlottesville, Virginia, USA
- Scott T. Wieman
  - Journal of Emerging Investigators, Oro Valley, Arizona, USA
  - Department of Geology and Geophysics, Woods Hole Oceanographic Institution, Woods Hole, Massachusetts, USA
  - Department of Earth, Atmospheric, and Planetary Sciences, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
- Stephanie Zimmer
  - Journal of Emerging Investigators, Oro Valley, Arizona, USA
  - Department of Dermatology, Pennsylvania State University, College of Medicine, Hershey, Pennsylvania, USA
- Sarah C. Fankhauser
  - Journal of Emerging Investigators, Oro Valley, Arizona, USA
  - Department of Biology, Oxford College of Emory University, Oxford, Georgia, USA
6
Dynamic publication media with the COPASI R Connector (CoRC). Math Biosci 2022; 348:108822. [PMID: 35452633 DOI: 10.1016/j.mbs.2022.108822] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/31/2021] [Revised: 04/07/2022] [Accepted: 04/08/2022] [Indexed: 11/27/2022]
Abstract
In this article we show how dynamic publication media and the COPASI R Connector (CoRC) can be combined in a natural and synergistic way to communicate (biochemical) models. Dynamic publication media are becoming a popular tool for authors to effectively compose and publish their work. They are built from templates and the final documents are created dynamically. In addition, they can also be interactive. Working with dynamic publication media is made easy with the programming environment R via its integration with tools such as R Markdown, Jupyter and Shiny. Additionally, the COmplex PAthway SImulator COPASI (http://www.copasi.org), a widely used biochemical modelling toolkit, is available in R through the use of the COPASI R Connector (CoRC, https://jpahle.github.io/CoRC). Models are a common tool in the mathematical biosciences, in particular kinetic models of biochemical networks in (computational) systems biology. We focus on three application areas of dynamic publication media and CoRC: Documentation (reproducible workflows), Teaching (creating self-paced lessons) and Science Communication (immersive and engaging presentation). To illustrate these, we created six dynamic document examples in the form of R Markdown and Jupyter notebooks, hosted on the platforms GitHub, shinyapps.io, Google Colaboratory. Having code and output in one place, creating documents in template-form and the option of interactivity make the combination of dynamic documents and CoRC a versatile tool. All our example documents are freely available at https://jpahle.github.io/DynamiCoRC under the Creative Commons BY 4.0 licence.
7
Ioannidis JP. Pre-registration of mathematical models. Math Biosci 2022; 345:108782. [DOI: 10.1016/j.mbs.2022.108782] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/29/2021] [Revised: 01/13/2022] [Accepted: 01/13/2022] [Indexed: 11/28/2022]
8
Renardy M, Joslyn LR, Millar JA, Kirschner DE. To Sobol or not to Sobol? The effects of sampling schemes in systems biology applications. Math Biosci 2021; 337:108593. [PMID: 33865847 PMCID: PMC8184610 DOI: 10.1016/j.mbs.2021.108593] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 02/03/2021] [Revised: 03/21/2021] [Accepted: 03/22/2021] [Indexed: 12/13/2022]
Abstract
Computational and mathematical models in biology rely heavily on the parameters that characterize them. However, robust estimates for their values are typically elusive and thus a large parameter space becomes necessary for model study, particularly to make translationally impactful predictions. Sampling schemes exploring parameter spaces for models are used for a variety of purposes in systems biology, including model calibration and sensitivity analysis. Typically, random sampling is used; however, when models have a high number of unknown parameters or the models are highly complex, computational cost becomes an important factor. This issue can be reduced through the use of efficient sampling schemes such as Latin hypercube sampling (LHS) and Sobol sequences. In this work, we compare and contrast three sampling schemes - random sampling, LHS, and Sobol sequences - for the purposes of performing both parameter sensitivity analysis and model calibration. In addition, we apply these analyses to different types of computational and mathematical models of varying complexity: a simple ODE model, a complex ODE model, and an agent-based model. In general, the sampling scheme had little effect when used for calibration efforts, but when applied to sensitivity analyses, Sobol sequences exhibited faster convergence. While the observed benefit to convergence is relatively small, Sobol sequences are computationally less expensive to compute than LHS samples and also have the benefit of being deterministic, which allows for better reproducibility of results.
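The difference between plain random sampling and Latin hypercube sampling described in the abstract above can be sketched with the standard library alone. This is an illustrative sketch under stated assumptions, not the authors' code: the function names, the seed, and the sample size are hypothetical. Sobol sequences are not implemented here; for practical use, SciPy's `scipy.stats.qmc` module offers `Sobol` and `LatinHypercube` generators.

```python
import random
from typing import List


def random_sample(n: int, dims: int, rng: random.Random) -> List[List[float]]:
    """n independent uniform points in the unit hypercube [0, 1)^dims."""
    return [[rng.random() for _ in range(dims)] for _ in range(n)]


def latin_hypercube(n: int, dims: int, rng: random.Random) -> List[List[float]]:
    """n points such that, in every dimension, each of the n equal-width
    strata [k/n, (k+1)/n) contains exactly one point."""
    columns = []
    for _ in range(dims):
        # One jittered value per stratum, then shuffle so that strata are
        # paired at random across dimensions.
        column = [(k + rng.random()) / n for k in range(n)]
        rng.shuffle(column)
        columns.append(column)
    return [[columns[d][i] for d in range(dims)] for i in range(n)]


def strata_covered(points: List[List[float]], dim: int) -> int:
    """Number of distinct strata (out of len(points)) hit along one dimension."""
    n = len(points)
    return len({min(int(p[dim] * n), n - 1) for p in points})


# Hypothetical setup: 8 samples of a 3-parameter model, fixed seed.
n_samples, n_params = 8, 3
plain = random_sample(n_samples, n_params, random.Random(42))
lhs = latin_hypercube(n_samples, n_params, random.Random(42))

for d in range(n_params):
    print(
        f"parameter {d}: random covers {strata_covered(plain, d)}/{n_samples} strata, "
        f"LHS covers {strata_covered(lhs, d)}/{n_samples}"
    )
```

Fixing the seed makes both schemes repeatable in this sketch; a Sobol sequence is deterministic by construction, which is the reproducibility benefit the abstract points to.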
Affiliation(s)
- Marissa Renardy
  - Department of Microbiology and Immunology, University of Michigan Medical School, Ann Arbor, MI, United States
- Louis R Joslyn
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, United States
- Jess A Millar
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, United States
- Denise E Kirschner
  - Department of Microbiology and Immunology, University of Michigan Medical School, Ann Arbor, MI, United States; Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, United States
9
Tiwari K, Kananathan S, Roberts MG, Meyer JP, Sharif Shohan MU, Xavier A, Maire M, Zyoud A, Men J, Ng S, Nguyen TVN, Glont M, Hermjakob H, Malik-Sheriff RS. Reproducibility in systems biology modelling. Mol Syst Biol 2021; 17:e9982. [PMID: 33620773 PMCID: PMC7901289 DOI: 10.15252/msb.20209982] [Citation(s) in RCA: 57] [Impact Index Per Article: 19.0] [Indexed: 11/16/2022] [Open Access]
Abstract
Reproducibility of scientific results is a key element of science and credibility. The lack of reproducibility across many scientific fields has emerged as an important concern. In this piece, we assess mathematical model reproducibility and propose a scorecard for improving reproducibility in this field.
Affiliation(s)
- Krishna Tiwari
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
  - Babraham Institute, Babraham Research Campus, Cambridge, UK
- Sarubini Kananathan
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
- Matthew G Roberts
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
- Johannes P Meyer
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
- Mohammad Umer Sharif Shohan
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
- Ashley Xavier
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
- Matthieu Maire
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
- Ahmad Zyoud
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
- Jinghao Men
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
- Szeyi Ng
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
- Tung V N Nguyen
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
- Mihai Glont
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
- Henning Hermjakob
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
  - Beijing Institute of Lifeomics, National Center for Protein Sciences (The Phoenix Center), Beijing, China
- Rahuman S Malik-Sheriff
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
10
Iwanaga T, Wang HH, Hamilton SH, Grimm V, Koralewski TE, Salado A, Elsawah S, Razavi S, Yang J, Glynn P, Badham J, Voinov A, Chen M, Grant WE, Peterson TR, Frank K, Shenk G, Barton CM, Jakeman AJ, Little JC. Socio-technical scales in socio-environmental modeling: Managing a system-of-systems modeling approach. Environ Model Softw 2021; 135:104885. [PMID: 33041631 PMCID: PMC7537632 DOI: 10.1016/j.envsoft.2020.104885] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Accepted: 09/29/2020] [Indexed: 05/05/2023]
Abstract
System-of-systems approaches for integrated assessments have become prevalent in recent years. Such approaches integrate a variety of models from different disciplines and modeling paradigms to represent a socio-environmental (or social-ecological) system aiming to holistically inform policy and decision-making processes. Central to the system-of-systems approaches is the representation of systems in a multi-tier framework with nested scales. Current modeling paradigms, however, have disciplinary-specific lineage, leading to inconsistencies in the conceptualization and integration of socio-environmental systems. In this paper, a multidisciplinary team of researchers, from engineering, natural and social sciences, have come together to detail socio-technical practices and challenges that arise in the consideration of scale throughout the socio-environmental modeling process. We identify key paths forward, focused on explicit consideration of scale and uncertainty, strengthening interdisciplinary communication, and improvement of the documentation process. We call for a grand vision (and commensurate funding) for holistic system-of-systems research that engages researchers, stakeholders, and policy makers in a multi-tiered process for the co-creation of knowledge and solutions to major socio-environmental problems.
Affiliation(s)
- Takuya Iwanaga
  - Institute for Water Futures and Fenner School of Environment and Society, The Australian National University, Canberra, Australia
- Hsiao-Hsuan Wang
  - Ecological Systems Laboratory, Department of Ecology and Conservation Biology, Texas A&M University, College Station, TX, 77843, USA
- Serena H Hamilton
  - Institute for Water Futures and Fenner School of Environment and Society, The Australian National University, Canberra, Australia
  - CSIRO Land & Water, Canberra, Australia
- Volker Grimm
  - Helmholtz Centre for Environmental Research - UFZ, Department of Ecological Modelling, Leipzig, Germany
  - University of Potsdam, Plant Ecology and Nature Conservation, Potsdam, Germany
- Tomasz E Koralewski
  - Ecological Systems Laboratory, Department of Ecology and Conservation Biology, Texas A&M University, College Station, TX, 77843, USA
- Alejandro Salado
  - Grado Department of Industrial and Systems Engineering, Virginia Tech, Blacksburg, VA, 24061, USA
- Sondoss Elsawah
  - Institute for Water Futures and Fenner School of Environment and Society, The Australian National University, Canberra, Australia
  - School of Electrical Engineering and Information Technology, University of New South Wales, Australian Defence Force Academy, Canberra, ACT, Australia
- Saman Razavi
  - Global Institute for Water Security, School of Environment and Sustainability, Department of Civil, Geological, and Environmental Engineering, University of Saskatchewan, Saskatoon, Saskatchewan, Canada
- Jing Yang
  - National Institute of Water and Atmospheric Research, New Zealand
- Pierre Glynn
  - U.S. Department of the Interior, U.S. Geological Survey, Reston, VA, USA
- Jennifer Badham
  - Centre for Research in Social Simulation, University of Surrey, Guildford, GU2 7XH, United Kingdom
- Alexey Voinov
  - Center on Persuasive Systems for Wise Adaptive Living (PERSWADE), Faculty of Engineering & IT, University of Technology, Sydney, Australia
  - Faculty of Engineering Technology, University of Twente, Netherlands
- Min Chen
  - Key Laboratory of Virtual Geographic Environment (Ministry of Education of PRC), Nanjing Normal University, Nanjing, 210023, China
- William E Grant
  - Ecological Systems Laboratory, Department of Ecology and Conservation Biology, Texas A&M University, College Station, TX, 77843, USA
- Tarla Rai Peterson
  - Environmental Science and Engineering Program, University of Texas at El Paso, El Paso, TX, 79968, USA
- Karin Frank
  - Helmholtz Centre for Environmental Research - UFZ, Department of Ecological Modelling, Leipzig, Germany
- Gary Shenk
  - U.S. Geological Survey, Chesapeake Bay Program, Annapolis, MD, 21403, USA
- C Michael Barton
  - Center for Social Dynamics & Complexity, School of Human Evolution & Social Change, Arizona State University, Tempe, AZ, USA
- Anthony J Jakeman
  - Institute for Water Futures and Fenner School of Environment and Society, The Australian National University, Canberra, Australia
- John C Little
  - Department of Civil and Environmental Engineering, Virginia Tech, Blacksburg, VA, 24061, USA
11
Liu DM, Salganik MJ. Successes and Struggles with Computational Reproducibility: Lessons from the Fragile Families Challenge. Socius 2019; 5:10.1177/2378023119849803. [PMID: 37309413 PMCID: PMC10260256 DOI: 10.1177/2378023119849803] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 06/14/2023]
Abstract
Reproducibility is fundamental to science, and an important component of reproducibility is computational reproducibility: the ability of a researcher to recreate the results of a published study using the original author's raw data and code. Although most people agree that computational reproducibility is important, it is still difficult to achieve in practice. In this article, the authors describe their approach to enabling computational reproducibility for the 12 articles in this special issue of Socius about the Fragile Families Challenge. The approach draws on two tools commonly used by professional software engineers but not widely used by academic researchers: software containers (e.g., Docker) and cloud computing (e.g., Amazon Web Services). These tools made it possible to standardize the computing environment around each submission, which will ease computational reproducibility both today and in the future. Drawing on their successes and struggles, the authors conclude with recommendations to researchers and journals.