1
Porubsky VL, Sauro HM. A Practical Guide to Reproducible Modeling for Biochemical Networks. Methods Mol Biol 2023; 2634:107-138. [PMID: 37074576 DOI: 10.1007/978-1-0716-3008-2_5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/20/2023]
Abstract
While scientific disciplines revere reproducibility, many studies - experimental and computational alike - fall short of this ideal and cannot be reproduced or even repeated when the model is shared. For computational modeling of biochemical networks, there is a dearth of formal training and resources available describing how to practically implement reproducible methods, despite a wealth of existing tools and formats which could be used to support reproducibility. This chapter points the reader to useful software tools and standardized formats that support reproducible modeling of biochemical networks and provides suggestions on how to implement reproducible methods in practice. Many of the suggestions encourage readers to use best practices from the software development community in order to automate, test, and version control their model components. A Jupyter Notebook demonstrating several of the key steps in building a reproducible biochemical network model is included to supplement the recommendations in the text.
Affiliation(s)
- Herbert M Sauro
- University of Washington, Department of Bioengineering, Seattle, WA, USA
2
Porubsky VL, Goldberg AP, Rampadarath AK, Nickerson DP, Karr JR, Sauro HM. Best Practices for Making Reproducible Biochemical Models. Cell Syst 2020; 11:109-120. [PMID: 32853539 DOI: 10.1016/j.cels.2020.06.012] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 09/27/2019] [Revised: 05/15/2020] [Accepted: 06/24/2020] [Indexed: 01/03/2023]
Abstract
Like many scientific disciplines, dynamical biochemical modeling is hindered by irreproducible results. This limits the utility of biochemical models by making them difficult to understand, trust, or reuse. We comprehensively list the best practices that biochemical modelers should follow to build reproducible biochemical model artifacts (all data, model descriptions, and custom software used by the model) that can be understood and reused. The best practices provide advice for all steps of a typical biochemical modeling workflow in which a modeler collects data; constructs, trains, simulates, and validates the model; uses the predictions of a model to advance knowledge; and publicly shares the model artifacts. The best practices emphasize the benefits obtained by using standard tools and formats and provide guidance to modelers who do not or cannot use standards in some stages of their modeling workflow. Adoption of these best practices will enhance the ability of researchers to reproduce, understand, and reuse biochemical models.
Affiliation(s)
- Veronica L Porubsky
- Department of Bioengineering, University of Washington, Seattle, WA 98105, USA
- Arthur P Goldberg
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Anand K Rampadarath
- Auckland Bioengineering Institute, University of Auckland, Auckland, New Zealand
- David P Nickerson
- Auckland Bioengineering Institute, University of Auckland, Auckland, New Zealand
- Jonathan R Karr
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Herbert M Sauro
- Department of Bioengineering, University of Washington, Seattle, WA 98105, USA
3
Poline JB. From data sharing to data publishing [version 2; peer review: 2 approved, 1 approved with reservations]. MNI Open Res 2019; 2. [PMID: 31157322 PMCID: PMC6540973 DOI: 10.12688/mniopenres.12772.2] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 11/20/2022]
Abstract
Data sharing, i.e. depositing data in research community accessible repositories, is not becoming as rapidly widespread across the life science research community as hoped or expected. I consider the sociological and cultural context of research and lay out why the community should instead move to data publishing with a focus on neuroscience data, and outline practical steps that can be taken to realize this goal.
Affiliation(s)
- Jean-Baptiste Poline
- Montreal Neurological Institute and Hospital, McGill University, Montréal, QC, H3A 2B4, Canada; Henry H. Wheeler, Jr. Brain Imaging Center, Helen Wills Neuroscience Institute, University of California, Berkeley, CA, 94720, USA
4
Mahmud M, Vassanelli S. Processing and Analysis of Multichannel Extracellular Neuronal Signals: State-of-the-Art and Challenges. Front Neurosci 2016; 10:248. [PMID: 27313507 PMCID: PMC4889584 DOI: 10.3389/fnins.2016.00248] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.6] [Received: 02/29/2016] [Accepted: 05/19/2016] [Indexed: 12/02/2022] [Open access]
Abstract
In recent years, multichannel neuronal signal acquisition systems have allowed scientists to focus on research questions that were otherwise impossible to address. They act as a powerful means to study brain (dys)functions in in vivo and in vitro animal models. Typically, each session of electrophysiological experiments with multichannel data acquisition systems generates a large amount of raw data. For example, a 128-channel signal acquisition system with 16-bit A/D conversion and a 20 kHz sampling rate will generate approximately 17 GB of data per hour (uncompressed). This poses an important and challenging problem of inferring conclusions from the large amounts of acquired data. Thus, automated signal processing and analysis tools are becoming a key component in neuroscience research, facilitating extraction of relevant information from neuronal recordings in a reasonable time. The purpose of this review is to introduce the reader to the current state of the art of open-source packages for (semi)automated processing and analysis of multichannel extracellular neuronal signals (i.e., neuronal spikes, local field potentials, electroencephalogram, etc.), and the existing neuroinformatics infrastructure for tool and data sharing. The review concludes by pinpointing some major challenges, which include the development of novel benchmarking techniques, cloud-based distributed processing and analysis tools, as well as defining novel means to share and standardize data.
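The data-rate figure quoted in the abstract above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function and variable names are hypothetical, and it assumes that samples are stored as whole bytes with no container or metadata overhead. Under those assumptions the raw rate is about 18.4 decimal gigabytes per hour, which matches the quoted "approximately 17 GB" when gigabytes are read in the binary sense (GiB).

```python
# Back-of-the-envelope check of the raw (uncompressed) data rate.
# All names here are illustrative; only the three parameters come from the text.
CHANNELS = 128
BITS_PER_SAMPLE = 16
SAMPLING_RATE_HZ = 20_000
SECONDS_PER_HOUR = 3_600


def raw_bytes_per_hour(channels, bits_per_sample, sampling_rate_hz):
    """Bytes produced in one hour, assuming whole-byte samples and no overhead."""
    bytes_per_sample = bits_per_sample // 8
    return channels * bytes_per_sample * sampling_rate_hz * SECONDS_PER_HOUR


total_bytes = raw_bytes_per_hour(CHANNELS, BITS_PER_SAMPLE, SAMPLING_RATE_HZ)
gigabytes = total_bytes / 10**9  # decimal GB
gibibytes = total_bytes / 2**30  # binary GiB

print(f"{total_bytes} bytes/hour = {gigabytes:.2f} GB = {gibibytes:.2f} GiB")
```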
Affiliation(s)
- Mufti Mahmud
- NeuroChip Laboratory, Department of Biomedical Sciences, University of Padova, Padova, Italy
- Stefano Vassanelli
- NeuroChip Laboratory, Department of Biomedical Sciences, University of Padova, Padova, Italy
5
Abstract
Routine data sharing is greatly benefiting several scientific disciplines, such as molecular biology, particle physics, and astronomy. Neuroscience data, in contrast, are still rarely shared, greatly limiting the potential for secondary discovery and the acceleration of research progress. Although the attitude toward data sharing is non-uniform across neuroscience subdomains, widespread adoption of data sharing practice will require a cultural shift in the community. Digital reconstructions of axonal and dendritic morphology constitute a particularly "sharable" kind of data. The popularity of the public repository NeuroMorpho.Org demonstrates that data sharing can benefit both users and contributors. Increased data availability is also catalyzing the grassroots development and spontaneous integration of complementary resources, research tools, and community initiatives. Even in this rare successful subfield, however, more data are still unshared than shared. Our experience as developers and curators of NeuroMorpho.Org suggests that greater transparency regarding the expectations and consequences of sharing (or not sharing) data, combined with public disclosure of which datasets are shared and which are not, may expedite the transition to community-wide data sharing.
Affiliation(s)
- Giorgio A. Ascoli
- Krasnow Institute for Advanced Study, George Mason University, Fairfax, Virginia, United States of America
6
Affiliation(s)
- Leonardo Candela
- Istituto di Scienza e Tecnologie dell'Informazione “A. Faedo”, Italian National Research Council, via G. Moruzzi 1, Pisa 56124, Italy
- Donatella Castelli
- Istituto di Scienza e Tecnologie dell'Informazione “A. Faedo”, Italian National Research Council, via G. Moruzzi 1, Pisa 56124, Italy
- Paolo Manghi
- Istituto di Scienza e Tecnologie dell'Informazione “A. Faedo”, Italian National Research Council, via G. Moruzzi 1, Pisa 56124, Italy
- Alice Tani
- Istituto di Scienza e Tecnologie dell'Informazione “A. Faedo”, Italian National Research Council, via G. Moruzzi 1, Pisa 56124, Italy
7
Haselgrove C, Poline JB, Kennedy DN. A simple tool for neuroimaging data sharing. Front Neuroinform 2014; 8:52. [PMID: 24904398 PMCID: PMC4033259 DOI: 10.3389/fninf.2014.00052] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 12/20/2013] [Accepted: 04/28/2014] [Indexed: 11/13/2022] [Open access]
Abstract
Data sharing is becoming increasingly common, but despite encouragement and facilitation by funding agencies, journals, and some research efforts, most neuroimaging data acquired today is still not shared due to political, financial, social, and technical barriers to sharing data that remain. In particular, technical solutions are few for researchers that are not a part of larger efforts with dedicated sharing infrastructures, and social barriers such as the time commitment required to share can keep data from becoming publicly available. We present a system for sharing neuroimaging data, designed to be simple to use and to provide benefit to the data provider. The system consists of a server at the International Neuroinformatics Coordinating Facility (INCF) and user tools for uploading data to the server. The primary design principle for the user tools is ease of use: the user identifies a directory containing Digital Imaging and Communications in Medicine (DICOM) data, provides their INCF Portal authentication, and provides identifiers for the subject and imaging session. The user tool anonymizes the data and sends it to the server. The server then runs quality control routines on the data, and the data and the quality control reports are made public. The user retains control of the data and may change the sharing policy as they need. The result is that in a few minutes of the user's time, DICOM data can be anonymized and made publicly available, and an initial quality control assessment can be performed on the data. The system is currently functional, and user tools and access to the public image database are available at http://xnat.incf.org/.
Affiliation(s)
- David N Kennedy
- University of Massachusetts Medical School, Worcester, MA, USA
8
Belter CW. Measuring the value of research data: a citation analysis of oceanographic data sets. PLoS One 2014; 9:e92590. [PMID: 24671177 PMCID: PMC3966791 DOI: 10.1371/journal.pone.0092590] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.5] [Received: 07/24/2013] [Accepted: 02/25/2014] [Indexed: 11/24/2022] [Open access]
Abstract
Evaluation of scientific research is becoming increasingly reliant on publication-based bibliometric indicators, which may result in the devaluation of other scientific activities--such as data curation--that do not necessarily result in the production of scientific publications. This issue may undermine the movement to openly share and cite data sets in scientific publications because researchers are unlikely to devote the effort necessary to curate their research data if they are unlikely to receive credit for doing so. This analysis attempts to demonstrate the bibliometric impact of properly curated and openly accessible data sets by attempting to generate citation counts for three data sets archived at the National Oceanographic Data Center. My findings suggest that all three data sets are highly cited, with estimated citation counts in most cases higher than 99% of all the journal articles published in Oceanography during the same years. I also find that methods of citing and referring to these data sets in scientific publications are highly inconsistent, despite the fact that a formal citation format is suggested for each data set. These findings have important implications for developing a data citation format, encouraging researchers to properly curate their research data, and evaluating the bibliometric impact of individuals and institutions.
Affiliation(s)
- Christopher W. Belter
- LAC Group, Central Library, National Oceanic and Atmospheric Administration, Silver Spring, Maryland, United States of America
9
Castelli D, Manghi P, Thanos C. A vision towards Scientific Communication Infrastructures. Int J Digit Libr 2013. [DOI: 10.1007/s00799-013-0106-7] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 10/26/2022]
10
Comparison of models for IP3 receptor kinetics using stochastic simulations. PLoS One 2013; 8:e59618. [PMID: 23630568 PMCID: PMC3629942 DOI: 10.1371/journal.pone.0059618] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 11/14/2012] [Accepted: 02/15/2013] [Indexed: 12/07/2022] [Open access]
Abstract
Inositol 1,4,5-trisphosphate receptor (IP3R) is a ubiquitous intracellular calcium (Ca2+) channel which has a major role in controlling Ca2+ levels in neurons. A variety of computational models have been developed to describe the kinetic function of IP3R under different conditions. In the field of computational neuroscience, it is of great interest to apply the existing models of IP3R when modeling local Ca2+ transients in dendrites or overall Ca2+ dynamics in large neuronal models. The goal of this study was to evaluate existing IP3R models, based on electrophysiological data, in order to suggest suitable models for neuronal modeling. Altogether, four models (Othmer and Tang, 1993; Dawson et al., 2003; Fraiman and Dawson, 2004; Doi et al., 2005) were selected for a more detailed comparison. The selection was based on the computational efficiency of the models and the type of experimental data that was used in developing the model. The kinetics of all four models were simulated by stochastic means, using the simulation software STEPS, which implements the Gillespie stochastic simulation algorithm. The results show major differences in the statistical properties of model functionality. Of the four compared models, the one by Fraiman and Dawson (2004) proved most satisfactory in producing the specific features of experimental findings reported in the literature. To our knowledge, the present study is the first detailed evaluation of IP3R models using stochastic simulation methods, thus providing an important setting for constructing a new, realistic model of IP3R channel kinetics for compartmental modeling of neuronal functions. We conclude that the kinetics of IP3R with different concentrations of Ca2+ and IP3 should be more carefully addressed when new models for IP3R are developed.
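For readers unfamiliar with the Gillespie stochastic simulation algorithm mentioned in the abstract above, the following is a minimal sketch of its direct method. It is not the STEPS API and it is not any of the four IP3R models compared in the study: it assumes a hypothetical population of independent two-state channels (closed or open) with made-up rate constants `k_open` and `k_close`, chosen only so that the expected steady-state open fraction, k_open / (k_open + k_close), is easy to verify.

```python
import math
import random


def gillespie_channels(n_channels, k_open, k_close, t_end, seed=0):
    """Direct-method Gillespie simulation of two-state channels.

    Hypothetical toy model: each closed channel opens at rate k_open and
    each open channel closes at rate k_close. Returns the time-averaged
    fraction of open channels over [0, t_end], starting with all closed.
    """
    rng = random.Random(seed)
    t, n_open = 0.0, 0
    open_time_integral = 0.0
    while True:
        # Propensities of the two possible reactions in the current state.
        a_open = k_open * (n_channels - n_open)
        a_close = k_close * n_open
        a_total = a_open + a_close
        # Waiting time to the next event is exponential with rate a_total.
        tau = -math.log(1.0 - rng.random()) / a_total
        if t + tau >= t_end:
            open_time_integral += n_open * (t_end - t)
            break
        open_time_integral += n_open * tau
        t += tau
        # Choose which reaction fires, with probability proportional
        # to its propensity.
        if rng.random() * a_total < a_open:
            n_open += 1
        else:
            n_open -= 1
    return open_time_integral / (n_channels * t_end)


# Illustrative parameters only; the expected open fraction is 2 / (2 + 8) = 0.2.
p_open = gillespie_channels(n_channels=100, k_open=2.0, k_close=8.0,
                            t_end=200.0, seed=1)
print(f"simulated open fraction: {p_open:.3f}")
```

Published IP3R models have more states and Ca2+- and IP3-dependent transition rates, but the loop has the same two steps: draw an exponential waiting time from the total propensity, then pick one reaction in proportion to its propensity.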
11
Gorgolewski KJ, Margulies DS, Milham MP. Making data sharing count: a publication-based solution. Front Neurosci 2013; 7:9. [PMID: 23390412 PMCID: PMC3565154 DOI: 10.3389/fnins.2013.00009] [Citation(s) in RCA: 72] [Impact Index Per Article: 6.0] [Received: 10/22/2012] [Accepted: 01/11/2013] [Indexed: 11/22/2022] [Open access]
Abstract
The neuroimaging community has been increasingly called upon to openly share data. Although data sharing has been a cornerstone of large-scale data consortia, the incentive for the individual researcher remains unclear. Other fields have benefited from embracing a data publication form (the data paper) that allows researchers to publish their datasets as a citable scientific publication. Such publishing mechanisms both give credit that is recognizable within the scientific ecosystem and ensure the quality of the published data and metadata through the peer review process. We discuss the specific challenges of adapting data papers to the needs of the neuroimaging community, and we propose guidelines for the structure as well as the review process.