1. Many nonnormalities, one simulation: Do different data generation algorithms affect study results? Behav Res Methods 2024. [PMID: 38389030] [DOI: 10.3758/s13428-024-02364-w]
Abstract
Monte Carlo simulation studies are among the primary scientific outputs contributed by methodologists, guiding application of various statistical tools in practice. Although methodological researchers routinely extend simulation study findings through follow-up work, few studies are ever replicated. Simulation studies are susceptible to factors that can contribute to replicability failures, however. This paper sought to conduct a meta-scientific study by replicating one highly cited simulation study (Curran et al., Psychological Methods, 1, 16-29, 1996) that investigated the robustness of normal theory maximum likelihood (ML)-based chi-square fit statistics under multivariate nonnormality. We further examined the generalizability of the original study findings across different nonnormal data generation algorithms. Our replication results were generally consistent with original findings, but we discerned several differences. Our generalizability results were more mixed. Only two results observed under the original data generation algorithm held completely across other algorithms examined. One of the most striking findings we observed was that results associated with the independent generator (IG) data generation algorithm vastly differed from other procedures examined and suggested that ML was robust to nonnormality for the particular factor model used in the simulation. Findings point to the reality that extant methodological recommendations may not be universally valid in contexts where multiple data generation algorithms exist for a given data characteristic. We recommend that researchers consider multiple approaches to generating a specific data or model characteristic (when more than one is available) to optimize the generalizability of simulation results.
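The paper's central point — that different generation algorithms can target the same nonnormality characteristic yet yield different distributions — can be illustrated with a small sketch (not the authors' code). Here a lognormal generator is tuned by bisection to the skewness of a unit exponential (skewness 2), and the two are then seen to differ in excess kurtosis; the helper names and tolerances are ours.

```python
import math
import random

def lognormal_skew(sigma2):
    """Population skewness of a lognormal with log-variance sigma2."""
    w = math.exp(sigma2)
    return (w + 2.0) * math.sqrt(w - 1.0)

def lognormal_exkurt(sigma2):
    """Population excess kurtosis of the same lognormal."""
    w = math.exp(sigma2)
    return w ** 4 + 2 * w ** 3 + 3 * w ** 2 - 6

def solve_sigma2(target_skew, lo=1e-6, hi=4.0, tol=1e-12):
    """Bisection: skewness is increasing in sigma2, so bracket and halve."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if lognormal_skew(mid) < target_skew:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# A unit exponential has skewness 2 and excess kurtosis 6.
sigma2 = solve_sigma2(2.0)
print(lognormal_exkurt(sigma2))  # exceeds 6: same skewness, heavier tails

# Two "equivalent" nonnormal samples for a simulation study:
rng = random.Random(1)
sample_expo = [rng.expovariate(1.0) for _ in range(1000)]
sample_logn = [rng.lognormvariate(0.0, math.sqrt(sigma2)) for _ in range(1000)]
```

Matching one moment (skewness) leaves the rest of the distribution free, which is one concrete way two "nonnormal data" algorithms can disagree downstream.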
2. Computational reproducibility of Jupyter notebooks from biomedical publications. Gigascience 2024; 13:giad113. [PMID: 38206590] [PMCID: PMC10783158] [DOI: 10.1093/gigascience/giad113]
Abstract
BACKGROUND Jupyter notebooks facilitate the bundling of executable code with its documentation and output in one interactive environment, and they represent a popular mechanism to document and share computational workflows, including for research publications. The reproducibility of computational aspects of research is a key component of scientific reproducibility but has not yet been assessed at scale for Jupyter notebooks associated with biomedical publications. APPROACH We address computational reproducibility at 2 levels: (i) using fully automated workflows, we analyzed the computational reproducibility of Jupyter notebooks associated with publications indexed in the biomedical literature repository PubMed Central. We identified such notebooks by mining the article's full text, trying to locate them on GitHub, and attempting to rerun them in an environment as close to the original as possible. We documented reproduction success and exceptions and explored relationships between notebook reproducibility and variables related to the notebooks or publications. (ii) This study represents a reproducibility attempt in and of itself, using essentially the same methodology twice on PubMed Central over the course of 2 years, during which the corpus of Jupyter notebooks from articles indexed in PubMed Central has grown in a highly dynamic fashion. RESULTS Out of 27,271 Jupyter notebooks from 2,660 GitHub repositories associated with 3,467 publications, 22,578 notebooks were written in Python, including 15,817 that had their dependencies declared in standard requirement files and that we attempted to rerun automatically. For 10,388 of these, all declared dependencies could be installed successfully, and we reran them to assess reproducibility. Of these, 1,203 notebooks ran through without any errors, including 879 that produced results identical to those reported in the original notebook and 324 for which our results differed from the originally reported ones. 
Running the other notebooks resulted in exceptions. CONCLUSIONS We zoom in on common problems and practices, highlight trends, and discuss potential improvements to Jupyter-related workflows associated with biomedical publications.
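The study's comparison of reran notebooks against their stored outputs can be approximated over the notebook JSON (nbformat) structure; the sketch below, with helper names and toy notebooks that are ours rather than the authors' pipeline, flags code cells whose text outputs drifted between runs.

```python
def cell_text_outputs(cell):
    """Collect the plain-text outputs of one code cell (streams and results)."""
    texts = []
    for out in cell.get("outputs", []):
        if out.get("output_type") == "stream":
            texts.append("".join(out.get("text", [])))
        elif out.get("output_type") in ("execute_result", "display_data"):
            texts.append("".join(out.get("data", {}).get("text/plain", [])))
    return texts

def changed_cells(original, rerun):
    """Indices of code cells whose text outputs differ between the two runs."""
    orig = [c for c in original["cells"] if c["cell_type"] == "code"]
    new = [c for c in rerun["cells"] if c["cell_type"] == "code"]
    return [i for i, (a, b) in enumerate(zip(orig, new))
            if cell_text_outputs(a) != cell_text_outputs(b)]

# Toy example: the second code cell's output drifted between runs.
nb_old = {"cells": [
    {"cell_type": "code", "outputs": [{"output_type": "stream", "text": ["42\n"]}]},
    {"cell_type": "code", "outputs": [{"output_type": "execute_result",
                                       "data": {"text/plain": ["0.314"]}}]},
]}
nb_new = {"cells": [
    {"cell_type": "code", "outputs": [{"output_type": "stream", "text": ["42\n"]}]},
    {"cell_type": "code", "outputs": [{"output_type": "execute_result",
                                       "data": {"text/plain": ["0.315"]}}]},
]}
print(changed_cells(nb_old, nb_new))  # [1]
```

A real pipeline would also normalize volatile output (timestamps, memory addresses, figure bytes) before comparing, which is one reason "differs from the original" is harder to judge than "raised an exception".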
3. Towards reproducible radiomics research: introduction of a database for radiomics studies. Eur Radiol 2024; 34:436-443. [PMID: 37572188] [DOI: 10.1007/s00330-023-10095-3]
Abstract
OBJECTIVES To investigate the model-, code-, and data-sharing practices in the current radiomics research landscape and to introduce a radiomics research database. METHODS A total of 1254 articles published between January 1, 2021, and December 31, 2022, in leading radiology journals (European Radiology, European Journal of Radiology, Radiology, Radiology: Artificial Intelligence, Radiology: Cardiothoracic Imaging, Radiology: Imaging Cancer) were retrospectively screened, and 257 original research articles were included in this study. Categorical variables were compared using Fisher's exact test or the chi-square test, and numerical variables using Student's t test, in relation to the year of publication. RESULTS Half of the articles (128 of 257) shared the model by either including the final model formula or reporting the coefficients of selected radiomics features. A total of 73 (28%) models were validated on an external independent dataset. Only 16 (6%) articles shared the data or used publicly available open datasets. Similarly, only 20 (7%) of the articles shared the code. A total of 7 (3%) articles shared both code and data. All data collected in this study are presented in a radiomics research database (RadBase), which can be accessed at https://github.com/EuSoMII/RadBase . CONCLUSION According to the results of this study, the majority of published radiomics models were not technically reproducible, since they shared neither the model nor the code and data. There is still room for improvement in carrying out reproducible and open research in the field of radiomics. CLINICAL RELEVANCE STATEMENT To date, the reproducibility of radiomics research and open science practices within the radiomics research community are still very low. Ensuring reproducible radiomics research with model-, code-, and data-sharing practices will facilitate faster clinical translation.
KEY POINTS • There is a discrepancy between the number of published radiomics papers and the clinical implementation of these published radiomics models. • The main obstacle to clinical implementation is the lack of model-, code-, and data-sharing practices. • In order to translate radiomics research into clinical practice, the radiomics research community should adopt open science practices.
4. Replicability of simulation studies for the investigation of statistical methods: the RepliSims project. R Soc Open Sci 2024; 11:231003. [PMID: 38234442] [PMCID: PMC10791519] [DOI: 10.1098/rsos.231003]
Abstract
Results of simulation studies evaluating the performance of statistical methods can have a major impact on the way empirical research is implemented. However, so far there is limited evidence of the replicability of simulation studies. Eight highly cited statistical simulation studies were selected, and their replicability was assessed by teams of replicators with formal training in quantitative methodology. The teams used information in the original publications to write simulation code with the aim of replicating the results. The primary outcome was to determine the feasibility of replicability based on reported information in the original publications and supplementary materials. Replicability varied greatly: some original studies provided detailed information leading to almost perfect replication of results, whereas other studies did not provide enough information to implement any of the reported simulations. Factors facilitating replication included availability of code, detailed reporting or visualization of data-generating procedures and methods, and replicator expertise. Replicability of statistical simulation studies was mainly impeded by lack of information and sustainability of information sources. We encourage researchers publishing simulation studies to transparently report all relevant implementation details either in the research paper itself or in easily accessible supplementary material and to make their simulation code publicly available using permanent links.
5. A Critical Review of Risk Assessment Models for Listeria monocytogenes in Dairy Products. Foods 2023; 12:4436. [PMID: 38137240] [PMCID: PMC10742501] [DOI: 10.3390/foods12244436]
Abstract
A review of the published quantitative risk assessment (QRA) models of L. monocytogenes in dairy products was undertaken in order to identify and appraise the relative effectiveness of control measures and intervention strategies implemented at primary production, processing, retail, and consumer practices. A systematic literature search retrieved 18 QRA models, most of them (9) investigated raw and pasteurized milk cheeses, with the majority covering long supply chains (4 farm-to-table and 3 processing-to-table scopes). On-farm contamination sources, either from shedding animals or from the broad environment, have been demonstrated by different QRA models to impact the risk of listeriosis, in particular for raw milk cheeses. Through scenarios and sensitivity analysis, QRA models demonstrated the importance of the modeled growth rate and lag phase duration and showed that the risk contribution of consumers' practices is greater than in retail conditions. Storage temperature was proven to be more determinant of the final risk than storage time. Despite the pathogen's known ability to reside in damp spots or niches, re-contamination and/or cross-contamination were modeled in only two QRA studies. Future QRA models in dairy products should entail the full farm-to-table scope, should represent cross-contamination and the use of novel technologies, and should estimate L. monocytogenes growth more accurately by means of better-informed kinetic parameters and realistic time-temperature trajectories.
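The sensitivity of risk estimates to growth rate, lag phase, and storage temperature can be sketched with a standard primary growth model (log-linear with lag and a stationary-phase cap) coupled to a Ratkowsky-type square-root secondary model. All parameter values below are illustrative placeholders, not figures from the reviewed QRAs.

```python
import math

def sqrt_mu(temp_c, b=0.02, t_min_c=-1.0):
    """Square-root secondary model: sqrt(growth rate) rises linearly above T_min.
    b and t_min_c are illustrative, not fitted values."""
    return max(0.0, b * (temp_c - t_min_c))

def log10_count(t_h, temp_c, log10_n0=1.0, lag_h=10.0, log10_nmax=8.0):
    """Log-linear primary growth model with a lag phase and a stationary cap."""
    mu_per_h = sqrt_mu(temp_c) ** 2          # ln-units per hour
    if t_h <= lag_h:
        return log10_n0                      # no growth during the lag phase
    grown = log10_n0 + mu_per_h / math.log(10) * (t_h - lag_h)
    return min(grown, log10_nmax)

# Why storage temperature dominates: one week (168 h) at 4 degC vs 12 degC.
print(log10_count(168, 4.0), log10_count(168, 12.0))
```

Because the rate enters quadratically through the square-root model, a modest temperature abuse widens the final count by orders of magnitude, consistent with the reviewed finding that temperature matters more than storage time.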
6. Establishing a national research software award. Open Res Eur 2023; 3:185. [PMID: 38009089] [PMCID: PMC10674088] [DOI: 10.12688/openreseurope.16069.1]
Abstract
Software development has become an integral part of the scholarly ecosystem, spanning all fields and disciplines. To support the sharing and creation of knowledge in line with open science principles, and particularly to enable the reproducibility of research results, it is crucial to make the source code of research software available, allowing for modification, reuse, and distribution. Recognizing the significance of open-source software contributions in academia, the second French Plan for Open Science, announced by the Minister of Higher Education and Research in 2021, introduced a National Award to promote open-source research software. This award serves multiple objectives: firstly, to highlight the software projects and teams that have devoted time and effort to develop outstanding research software, sometimes for decades, and often with little recognition; secondly, to draw attention to the importance of software as a valuable research output and to inspire new generations of researchers to follow and learn from these examples. We present here an in-depth analysis of the design and implementation of this unique initiative. As a national award established explicitly to foster Open Science practices by the French Minister of Research, it faced the intricate challenge of fairly evaluating open research software across all fields, striving for inclusivity across domains, applications, and participants. We provide a comprehensive report on the results of the first edition, which received 129 high-quality submissions. Additionally, we emphasize the impact of this initiative on the open science landscape, promoting software as a valuable research outcome, on par with publications.
7. Computational Reproducibility of Molecular Phylogenies. Mol Biol Evol 2023; 40:msad165. [PMID: 37467477] [PMCID: PMC10370456] [DOI: 10.1093/molbev/msad165]
Abstract
Repeated runs of the same program can generate different molecular phylogenies from identical data sets under the same analytical conditions. This lack of reproducibility of inferred phylogenies casts a long shadow on downstream research employing these phylogenies in areas such as comparative genomics, systematics, and functional biology. We have assessed the relative accuracies and log-likelihoods of alternative phylogenies generated for computer-simulated and empirical data sets. Our findings indicate that these alternative phylogenies reconstruct evolutionary relationships with comparable accuracy. They also have similar log-likelihoods that are not inferior to the log-likelihoods of the true tree. We determined that the direct relationship between irreproducibility and inaccuracy is due to their common dependence on the amount of phylogenetic information in the data. While computational reproducibility can be enhanced through more extensive heuristic searches for the maximum likelihood tree, this does not lead to higher accuracy. We conclude that computational irreproducibility plays a minor role in molecular phylogenetics.
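A toy stand-in (not the authors' phylogenetic software) for why independently seeded heuristic searches return different optima of nearly equal score: stochastic hill climbing on a rugged surface whose local maxima differ only slightly in height, just as alternative tree topologies differ only slightly in log-likelihood.

```python
import math
import random

def hill_climb(score, start, step=0.1, iters=2000, seed=0):
    """Greedy stochastic hill climbing: accept a nearby move if it scores at least as well."""
    rng = random.Random(seed)
    x = start
    for _ in range(iters):
        cand = x + rng.uniform(-step, step)
        if score(cand) >= score(x):
            x = cand
    return x

def score(x):
    """Rugged surface: many local maxima whose heights differ only slightly."""
    return math.cos(8.0 * x) - 0.01 * x * x

a = hill_climb(score, start=0.0, seed=1)
b = hill_climb(score, start=0.8, seed=2)
print(a, b)                  # clearly different optima ("trees")...
print(score(a), score(b))    # ...with almost identical scores ("log-likelihoods")
```

The two runs settle in different basins yet their scores are nearly tied, mirroring the paper's observation that irreproducible trees can be equally well supported by the data.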
8. On the Analyses of Medical Images Using Traditional Machine Learning Techniques and Convolutional Neural Networks. Arch Comput Methods Eng 2023; 30:3173-3233. [PMID: 37260910] [PMCID: PMC10071480] [DOI: 10.1007/s11831-023-09899-9]
Abstract
Convolutional neural networks (CNNs) have shown impressive accomplishments in many areas, especially object detection, segmentation, 2D and 3D reconstruction, information retrieval, medical image registration, multilingual translation, natural language processing, anomaly detection in video, and speech recognition. A CNN is a special type of neural network with a compelling and effective ability to learn features at several stages, aided by augmentation of the data. Recently, different interesting and inspiring ideas in deep learning (DL), such as new activation functions, hyperparameter optimization, regularization, momentum, and loss functions, have improved the performance and execution of CNNs. Innovations in the internal architecture of CNNs and in their representational styles have also significantly improved performance. This survey focuses on the internal taxonomy of deep learning and on different convolutional neural network models, especially the depth and width of models, as well as CNN components, applications, and current challenges of deep learning.
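The core CNN component the survey covers, the convolution, fits in a few lines; here is a pure-Python "valid" cross-correlation sketch (real frameworks add padding, strides, channels, and learned kernels, and typically implement cross-correlation under the name "convolution").

```python
def conv2d_valid(image, kernel):
    """2-D 'valid' cross-correlation over nested lists: the output shrinks by
    (kernel height - 1, kernel width - 1) because the kernel never leaves the image."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(ih - kh + 1):
        row = []
        for c in range(iw - kw + 1):
            # Sum of elementwise products between the kernel and the image patch.
            row.append(sum(image[r + i][c + j] * kernel[i][j]
                           for i in range(kh) for j in range(kw)))
        out.append(row)
    return out

# A 2x2 box kernel over a 3x3 image of ones: every patch sums to 4.
print(conv2d_valid([[1, 1, 1], [1, 1, 1], [1, 1, 1]], [[1, 1], [1, 1]]))  # [[4, 4], [4, 4]]
```

Stacking such operations with nonlinearities and learned kernels is what gives the architectures surveyed above their feature-learning ability.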
9. Analysis of Network Models with Neuron-Astrocyte Interactions. Neuroinformatics 2023; 21:375-406. [PMID: 36959372] [PMCID: PMC10085960] [DOI: 10.1007/s12021-023-09622-w]
Abstract
Neural networks, composed of many neurons and governed by complex interactions between them, are a widely accepted formalism for modeling and exploring global dynamics and emergent properties in brain systems. In the past decades, experimental evidence of computationally relevant neuron-astrocyte interactions, as well as the astrocytic modulation of global neural dynamics, have accumulated. These findings motivated advances in computational glioscience and inspired several models integrating mechanisms of neuron-astrocyte interactions into the standard neural network formalism. These models were developed to study, for example, synchronization, information transfer, synaptic plasticity, and hyperexcitability, as well as classification tasks and hardware implementations. We here focus on network models of at least two neurons interacting bidirectionally with at least two astrocytes that include explicitly modeled astrocytic calcium dynamics. In this study, we analyze the evolution of these models and the biophysical, biochemical, cellular, and network mechanisms used to construct them. Based on our analysis, we propose how to systematically describe and categorize interaction schemes between cells in neuron-astrocyte networks. We additionally study the models in view of the existing experimental data and present future perspectives. Our analysis is an important first step towards understanding astrocytic contribution to brain functions. However, more advances are needed to collect comprehensive data about astrocyte morphology and physiology in vivo and to better integrate them in data-driven computational models. Broadening the discussion about theoretical approaches and expanding the computational tools is necessary to better understand astrocytes' roles in brain functions.
10. The Issues with Journal Issues: Let Journals Be Digital Libraries. Publications 2023. [DOI: 10.3390/publications11010007]
Abstract
Science depends on a communication system, and today, that is largely provided by digital technologies such as the internet and web. Despite the fact that digital technologies provide the infrastructure for this communication system, peer-reviewed journals continue to mimic workflows and processes from the print era. This paper focuses on one artifact from the print era, the journal issue, and describes how this artifact has been detrimental to the communication of science, and therefore, to science itself. To replace the journal issue, this paper argues that scholarly publishing and journals could more fully embrace digital technologies by creating digital libraries to present and organize scholarly output.
11. Replication of the natural selection of bad science. R Soc Open Sci 2023; 10:221306. [PMID: 36844805] [PMCID: PMC9943874] [DOI: 10.1098/rsos.221306]
Abstract
This study reports an independent replication of the findings presented by Smaldino and McElreath (Smaldino, McElreath 2016 R. Soc. Open Sci. 3, 160384 (doi:10.1098/rsos.160384)). The replication was successful with one exception. We find that selection acting on scientists' propensity for replication frequency caused a brief period of exuberant replication that was not observed in the original paper due to a coding error. This difference does not, however, change the authors' original conclusions. We call for more replication studies of simulations as unique contributions to scientific quality assurance.
12. It's time! Ten reasons to start replicating simulation studies. Front Epidemiol 2022; 2:973470. [PMID: 38455335] [PMCID: PMC10911016] [DOI: 10.3389/fepid.2022.973470]
Abstract
The quantitative analysis of research data is a core element of empirical research. The performance of statistical methods that are used for analyzing empirical data can be evaluated and compared using computer simulations. A single simulation study can influence the analyses of thousands of empirical studies to follow. With great power comes great responsibility. Here, we argue that this responsibility includes replication of simulation studies to ensure a sound foundation for data-analytic decisions. Furthermore, being designed, run, and reported by humans, simulation studies face challenges similar to other experimental empirical research and hence should not be exempt from replication attempts. We highlight that the potential replicability of simulation studies is an opportunity that quantitative methodology, as a field, should pay more attention to.
13. Detection of Klebsiella pneumoniae human gut carriage: a comparison of culture, qPCR, and whole metagenomic sequencing methods. Gut Microbes 2022; 14:2118500. [PMID: 36045603] [PMCID: PMC9450895] [DOI: 10.1080/19490976.2022.2118500]
Abstract
Klebsiella pneumoniae is an important opportunistic healthcare-associated pathogen and major contributor to the global spread of antimicrobial resistance. Gastrointestinal colonization with K. pneumoniae is a major predisposing risk factor for infection and forms an important hub for the dispersal of resistance. Current culture-based detection methods are time consuming, give limited intra-sample abundance and strain diversity information, and have uncertain sensitivity. Here we investigated the presence and abundance of K. pneumoniae at the species and strain level within fecal samples from 103 community-based adults by qPCR and whole metagenomic sequencing (WMS) compared to culture-based detection. qPCR demonstrated the highest sensitivity, detecting K. pneumoniae in 61.2% and 75.8% of direct-fecal and culture-enriched sweep samples, respectively, including 52/52 culture-positive samples. WMS displayed lower sensitivity, detecting K. pneumoniae in 71.2% of culture-positive fecal samples at a 0.01% abundance cutoff, and was inclined to false positives in proportion to the relative abundance of other Enterobacterales present. qPCR accurately quantified K. pneumoniae to 16 genome copies/reaction while WMS could estimate relative abundance to at least 0.01%. Quantification by both methods correlated strongly with each other (Spearman's rho = 0.91). WMS also supported accurate intra-sample K. pneumoniae sequence type (ST)-level diversity detection from fecal microbiomes to 0.1% relative abundance, agreeing with the culture-based detected ST in 16/19 samples. Our results show that qPCR and WMS are sensitive and reliable tools for detection, quantification, and strain analysis of K. pneumoniae from fecal samples with potential to support infection control and enhance insights in K. pneumoniae gastrointestinal ecology.
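The reported agreement between qPCR and WMS quantification (Spearman's rho = 0.91) uses rank correlation, which is just the Pearson correlation of the ranks. A stdlib sketch with our own helper names (in practice one would call an established routine such as scipy.stats.spearmanr):

```python
def rank(values):
    """Average ranks, 1-based; tied values share the mean of their rank range."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1                       # extend over the tie group
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation computed on the ranks."""
    rx, ry = rank(x), rank(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Any strictly monotone relation between the two assays gives rho = 1.
print(spearman_rho([1.0, 2.0, 3.0, 4.0], [10.0, 40.0, 90.0, 160.0]))
```

Rank correlation is a natural choice here because qPCR copy numbers and WMS relative abundances live on different scales; only their orderings need to agree.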
14. Connectivity concepts in neuronal network modeling. PLoS Comput Biol 2022; 18:e1010086. [PMID: 36074778] [PMCID: PMC9455883] [DOI: 10.1371/journal.pcbi.1010086]
Abstract
Sustainable research on computational models of neuronal networks requires published models to be understandable, reproducible, and extendable. Missing details or ambiguities about mathematical concepts and assumptions, algorithmic implementations, or parameterizations hinder progress. Such flaws are unfortunately frequent and one reason is a lack of readily applicable standards and tools for model description. Our work aims to advance complete and concise descriptions of network connectivity but also to guide the implementation of connection routines in simulation software and neuromorphic hardware systems. We first review models made available by the computational neuroscience community in the repositories ModelDB and Open Source Brain, and investigate the corresponding connectivity structures and their descriptions in both manuscript and code. The review comprises the connectivity of networks with diverse levels of neuroanatomical detail and exposes how connectivity is abstracted in existing description languages and simulator interfaces. We find that a substantial proportion of the published descriptions of connectivity is ambiguous. Based on this review, we derive a set of connectivity concepts for deterministically and probabilistically connected networks and also address networks embedded in metric space. Beside these mathematical and textual guidelines, we propose a unified graphical notation for network diagrams to facilitate an intuitive understanding of network properties. Examples of representative network models demonstrate the practical use of the ideas. We hope that the proposed standardizations will contribute to unambiguous descriptions and reproducible implementations of neuronal network connectivity in computational neuroscience.
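Two of the probabilistic connectivity concepts such a framework must distinguish — pairwise Bernoulli and fixed in-degree — can be sketched as generator routines. This is our minimal reading of those concepts, not the paper's reference code; note how the second routine permits multapses (repeated source draws), exactly the kind of detail that is often left ambiguous in publications.

```python
import random

def pairwise_bernoulli(n_source, n_target, p, rng):
    """Each source-target pair is connected independently with probability p."""
    return [(s, t) for s in range(n_source) for t in range(n_target)
            if rng.random() < p]

def fixed_indegree(n_source, n_target, k, rng):
    """Each target draws exactly k sources, with replacement (multapses allowed)."""
    return [(rng.randrange(n_source), t) for t in range(n_target)
            for _ in range(k)]

edges = fixed_indegree(10, 5, 3, random.Random(1))
print(len(edges))  # 5 targets x 3 draws = 15 edges
```

Under pairwise Bernoulli the in-degree is binomially distributed; under fixed in-degree it is exactly k. A description that only says "connection probability p" leaves such distinctions unresolved.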
15. Mitigating Computer Limitations in Replicating Numerical Simulations of a Neural Network Model With Hodgkin-Huxley-Type Neurons. Front Neuroinform 2022; 16:874234. [PMID: 35645756] [PMCID: PMC9135410] [DOI: 10.3389/fninf.2022.874234]
Abstract
Computational experiments have become very important for numerically simulating real phenomena in several areas. Many studies in computational biology discuss the necessity of numerical replicability for building on previous investigations. However, even when well-established rules from the literature are followed, numerical replicability fails once the computer's limitations in representing real numbers are taken into consideration. In this study, we used a previously published recurrent network model composed of Hodgkin-Huxley-type neurons to simulate neural activity during development. The original source code in C/C++ was carefully refactored to mitigate the lack of replicability; moreover, it was re-implemented in other programming languages/software (XPP/XPPAUT, Python, and Matlab) and executed under two operating systems (Windows and Linux). The commutation and association of the input current values during the summation of pre-synaptic activity were also analyzed. A total of 72 simulations, all of which should produce the same result, were executed to cover these scenarios. The results were replicated when high floating-point precision (supplied by third-party libraries) was used. However, using the default floating-point precision type, none of the results were replicated when compared with previous results. Several new procedures proposed during the source code refactoring allowed replication of only a few scenarios, regardless of language and operating system; in those cases, the generated computational "errors" were the same. Even with a simple computational model, numerical replicability was very difficult to achieve and required computational expertise. Ultimately, the research community must be aware that analyses based on numerical simulations involving real-number operations can lead to different conclusions.
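A minimal demonstration of the non-associativity behind such replication failures (standard Python, not the authors' C/C++ model): regrouping changes a naive floating-point sum, while an exactly rounded summation such as math.fsum is order-independent, playing the role of the high-precision third-party libraries mentioned in the abstract.

```python
import math
import random

# Floating-point addition is not associative: regrouping changes the result.
assert (0.1 + 0.2) + 0.3 != 0.1 + (0.2 + 0.3)

# Synaptic-current-like values spanning several orders of magnitude.
rng = random.Random(42)
currents = [rng.uniform(-1.0, 1.0) * 10.0 ** rng.randint(-8, 0) for _ in range(10_000)]

shuffled = currents[:]
rng.shuffle(shuffled)

naive_a = sum(currents)
naive_b = sum(shuffled)      # usually differs from naive_a in the last bits

# An exactly rounded sum is independent of the summation order:
assert math.fsum(currents) == math.fsum(shuffled)
print(naive_a - naive_b)     # tiny, but often nonzero
```

In a chaotic Hodgkin-Huxley network, such last-bit discrepancies in the synaptic input sum are amplified exponentially over time, which is why reordering the summation alone can break replicability.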
16. A Modular Workflow for Performance Benchmarking of Neuronal Network Simulations. Front Neuroinform 2022; 16:837549. [PMID: 35645755] [PMCID: PMC9131021] [DOI: 10.3389/fninf.2022.837549]
Abstract
Modern computational neuroscience strives to develop complex network models to explain dynamics and function of brains in health and disease. This process goes hand in hand with advancements in the theory of neuronal networks and increasing availability of detailed anatomical data on brain connectivity. Large-scale models that study interactions between multiple brain areas with intricate connectivity and investigate phenomena on long time scales such as system-level learning require progress in simulation speed. The corresponding development of state-of-the-art simulation engines relies on information provided by benchmark simulations which assess the time-to-solution for scientifically relevant, complementary network models using various combinations of hardware and software revisions. However, maintaining comparability of benchmark results is difficult due to a lack of standardized specifications for measuring the scaling performance of simulators on high-performance computing (HPC) systems. Motivated by the challenging complexity of benchmarking, we define a generic workflow that decomposes the endeavor into unique segments consisting of separate modules. As a reference implementation for the conceptual workflow, we develop beNNch: an open-source software framework for the configuration, execution, and analysis of benchmarks for neuronal network simulations. The framework records benchmarking data and metadata in a unified way to foster reproducibility. For illustration, we measure the performance of various versions of the NEST simulator across network models with different levels of complexity on a contemporary HPC system, demonstrating how performance bottlenecks can be identified, ultimately guiding the development toward more efficient simulation technology.
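The workflow's central measurement — time-to-solution across model scales and repetitions — reduces to a small harness. The sketch below uses a toy stand-in for the simulator call; beNNch itself wraps real NEST runs and additionally records hardware and software metadata for comparability.

```python
import statistics
import time

def benchmark(simulate, sizes, repeats=3):
    """Median wall-clock time-to-solution for each network size.
    The median over repeats damps outliers from a shared HPC node."""
    results = {}
    for n in sizes:
        times = []
        for _ in range(repeats):
            t0 = time.perf_counter()
            simulate(n)
            times.append(time.perf_counter() - t0)
        results[n] = statistics.median(times)
    return results

def toy_simulation(n):
    """Stand-in for building and running a network model; quadratic in n,
    mimicking the cost of dense connectivity."""
    total = 0
    for i in range(n * n):
        total += i
    return total

print(benchmark(toy_simulation, [100, 200, 400]))
```

Plotting such results against size (and against simulator versions) is how scaling bottlenecks are identified; the framework's contribution is making the surrounding configuration and metadata capture reproducible.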
17. Neuron-Glia Interactions and Brain Circuits. Adv Exp Med Biol 2022; 1359:87-103. [PMID: 35471536] [DOI: 10.1007/978-3-030-89439-9_4]
Abstract
Recent evidence suggests that glial cells take an active role in a number of brain functions that were previously attributed solely to neurons. For example, astrocytes, one type of glial cells, have been shown to promote coordinated activation of neuronal networks, modulate sensory-evoked neuronal network activity, and influence brain state transitions during development. This reinforces the idea that astrocytes not only provide the "housekeeping" for the neurons, but that they also play a vital role in supporting and expanding the functions of brain circuits and networks. Despite this accumulated knowledge, the field of computational neuroscience has mostly focused on modeling neuronal functions, ignoring the glial cells and the interactions they have with the neurons. In this chapter, we introduce the biology of neuron-glia interactions, summarize the existing computational models and tools, and emphasize the glial properties that may be important in modeling brain functions in the future.
|
18
|
Open-source Software Sustainability Models: Initial White Paper From the Informatics Technology for Cancer Research Sustainability and Industry Partnership Working Group. J Med Internet Res 2021; 23:e20028. [PMID: 34860667 PMCID: PMC8686402 DOI: 10.2196/20028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/17/2020] [Revised: 12/14/2020] [Accepted: 09/23/2021] [Indexed: 11/13/2022] Open
Abstract
Background The National Cancer Institute Informatics Technology for Cancer Research (ITCR) program provides a series of funding mechanisms to create an ecosystem of open-source software (OSS) that serves the needs of cancer research. As the ITCR ecosystem substantially grows, it faces the challenge of the long-term sustainability of the software being developed by ITCR grantees. To address this challenge, the ITCR sustainability and industry partnership working group (SIP-WG) was convened in 2019. Objective The charter of the SIP-WG is to investigate options to enhance the long-term sustainability of the OSS being developed by ITCR, in part by developing a collection of business model archetypes that can serve as sustainability plans for ITCR OSS development initiatives. The working group assembled models from the ITCR program, from other studies, and from the engagement of its extensive network of relationships with other organizations (eg, Chan Zuckerberg Initiative, Open Source Initiative, and Software Sustainability Institute) in support of this objective. Methods This paper reviews the existing sustainability models and describes 10 OSS use cases disseminated by the SIP-WG and others, including 3D Slicer, Bioconductor, Cytoscape, Globus, i2b2 (Informatics for Integrating Biology and the Bedside) and tranSMART, Insight Toolkit, Linux, Observational Health Data Sciences and Informatics tools, R, and REDCap (Research Electronic Data Capture), in 10 sustainability aspects: governance, documentation, code quality, support, ecosystem collaboration, security, legal, finance, marketing, and dependency hygiene. Results Information available to the public reveals that all 10 OSS have effective governance, comprehensive documentation, high code quality, reliable dependency hygiene, strong user and developer support, and active marketing. 
These OSS include a variety of licensing models (eg, general public license version 2, general public license version 3, Berkeley Software Distribution, and Apache) and financial models (eg, federal research funding, industry and membership support, and commercial support). However, detailed information on ecosystem collaboration and security is not publicly provided by most OSS. Conclusions We recommend 6 essential attributes for research software: alignment with unmet scientific needs, a dedicated development team, a vibrant user community, a feasible licensing model, a sustainable financial model, and effective product management. We also stress important actions to be considered in future ITCR activities: discussing sustainability and licensing models for ITCR OSS, establishing a central library, and allocating consulting resources to code quality control, ecosystem collaboration, security, and dependency hygiene.
|
19
|
Unbiased Recursive Partitioning Enables Robust and Reliable Outcome Prediction in Acute Spinal Cord Injury. J Neurotrauma 2021; 39:266-276. [PMID: 33619988 DOI: 10.1089/neu.2020.7407] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/18/2022] Open
Abstract
Neurological disorders usually present very heterogeneous recovery patterns. Nonetheless, accurate prediction of future clinical end-points and robust definition of homogeneous cohorts are necessary for scientific investigation and targeted care. For this, unbiased recursive partitioning with conditional inference trees (URP-CTREE) has received increasing attention in medical research, especially, but not limited to, traumatic spinal cord injury (SCI). URP-CTREE was introduced to SCI as a clinical guidance tool to explore and define homogeneous outcome groups by clinical means, while providing high accuracy in predicting future clinical outcomes. Whether URP-CTREE is valid and provides predictive improvements over more common approaches applied by clinicians has recently come under critical scrutiny. Therefore, a comprehensive simulation study based on traumatic, cervical complete spinal cord injuries provides a framework to investigate and quantify the issues raised. First, we assessed the replicability and robustness of URP-CTREE in identifying homogeneous subgroups. Second, we compared the prediction performance of URP-CTREE with that of traditional statistical techniques, such as linear or logistic regression, and a novel machine learning method. URP-CTREE's ability to identify homogeneous subgroups proved to be replicable and robust. In terms of prediction, URP-CTREE yielded prognostic performance comparable to that of a machine learning algorithm. The simulation study provides strong evidence for the robustness of URP-CTREE, which is achieved without substantially compromising prediction accuracy. URP-CTREE's slightly lower prediction performance is offset by its straightforward interpretation and application in clinical settings based on simple, data-driven decision rules.
|
20
|
Executable Simulation Model of the Liver. SYSTEMS MEDICINE 2021. [DOI: 10.1016/b978-0-12-801238-3.11682-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/24/2022] Open
|
21
|
Astrocyte-mediated spike-timing-dependent long-term depression modulates synaptic properties in the developing cortex. PLoS Comput Biol 2020; 16:e1008360. [PMID: 33170856 PMCID: PMC7654831 DOI: 10.1371/journal.pcbi.1008360] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/14/2020] [Accepted: 09/22/2020] [Indexed: 12/26/2022] Open
Abstract
Astrocytes have been shown to modulate synaptic transmission and plasticity in specific cortical synapses, but our understanding of the underlying molecular and cellular mechanisms remains limited. Here we present a new biophysicochemical model of a somatosensory cortical layer 4 to layer 2/3 synapse to study the role of astrocytes in spike-timing-dependent long-term depression (t-LTD) in vivo. By applying the synapse model and electrophysiological data recorded from rodent somatosensory cortex, we show that a signal from a postsynaptic neuron, orchestrated by endocannabinoids, astrocytic calcium signaling, and presynaptic N-methyl-D-aspartate receptors coupled with calcineurin signaling, induces t-LTD which is sensitive to the temporal difference between post- and presynaptic firing. We predict for the first time the dynamics of astrocyte-mediated molecular mechanisms underlying t-LTD and link complex biochemical networks at presynaptic, postsynaptic, and astrocytic sites to the time window of t-LTD induction. During t-LTD a single astrocyte acts as a delay factor for fast neuronal activity and integrates fast neuronal sensory processing with slow non-neuronal processing to modulate synaptic properties in the brain. Our results suggest that astrocytes play a critical role in synaptic computation during postnatal development and are of paramount importance in guiding the development of brain circuit functions, learning and memory.
|
22
|
Calculating with Permanent Marker: How Blockchains Record Immutable Mistakes in Computational Chemistry. J Phys Chem Lett 2020; 11:6618-6620. [PMID: 32787338 DOI: 10.1021/acs.jpclett.0c02159] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
Computational science experiments within an open blockchain environment have recently been demonstrated and can improve transparency, reproducibility, and censorship resistance in theoretical scientific work. However, the append-only nature of these records also means that historical calculation errors cannot be effectively removed or changed. This process preserves otherwise unavailable data on the scientific process of error correction and is shown here for simulations of carbon monoxide.
|
23
|
Overview of green business practices within the Bangladeshi RMG industry: competitiveness and sustainable development perspective. ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2020; 27:22888-22901. [PMID: 32329005 DOI: 10.1007/s11356-020-08816-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/19/2020] [Accepted: 04/07/2020] [Indexed: 06/11/2023]
Abstract
Green business initiatives have become a prime driving force towards sustainable development across the world. The economy of Bangladesh is growing rapidly as the country moves towards its target of becoming a middle-income economy by 2021, and attaining this goal makes the development of industrial sectors unavoidable. The booming industrial sectors have resulted in massive depletion of natural resources, greenhouse gas emissions, and toxic waste disposal, which can further cause uncontrolled degradation of air, soil, and water. In this highly competitive world, like most other businesses, ready-made garment (RMG) firms face tremendous pressure to become more competitive while taking a decisive position on reducing pollution and their ecological footprint. This situation has brought about a new paradigm of doing business called green business (GB). Principles of green business concern emerging approaches for producing, marketing, and disposing of products that maintain both environmental safety and business competitiveness. A green-oriented business strategy can act as a crucial tactic for gaining a competitive advantage over potential competitors and offer firms a better way to uphold their standing in attaining sustainable development. Modern purchasers are increasingly attracted to green-oriented RMG firms because they want to fulfill their responsibility to nature. Environmental concern is especially crucial for the RMG sector of Bangladesh, since it is the most significant economic sector of the country. The prime objective of this study is to provide an overview of green business strategy in the RMG sector of Bangladesh, which can further assist the sector with competitive advantages. To fulfill this objective, the authors consulted various books, journals, and research papers related to green business and competitiveness in the context of the Bangladeshi RMG sector. Furthermore, the study included informal discussions with industry professionals in corporate social responsibility (CSR), environmental protection, sustainable development, and ecology-friendly performance to help focus the review and synthesis in a precise direction. In the context of intensifying green business strategy and environmental degradation, we propose a framework to assess green business strategy with a view to gaining competitive advantages across the RMG sector in Bangladesh.
|
24
|
Computational chemistry experiments performed directly on a blockchain virtual computer. Chem Sci 2020; 11:4644-4647. [PMID: 34122919 PMCID: PMC8159212 DOI: 10.1039/d0sc01523g] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/13/2020] [Accepted: 04/15/2020] [Indexed: 01/24/2023] Open
Abstract
Blockchain technology has had a substantial impact across multiple disciplines, creating new methods for storing and processing data with improved transparency, immutability, and reproducibility. These developments come at a time when the reproducibility of many scientific findings has been called into question, including computational studies. Here we present a computational chemistry simulation run directly on a blockchain virtual machine, using a harmonic potential to model the vibration of carbon monoxide. The results demonstrate for the first time that computational science calculations are feasible entirely within a blockchain environment and that they can be used to increase transparency and accessibility across the computational sciences.
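The harmonic-potential model of carbon monoxide vibration described here can be sketched off-chain in a few lines of Python. The force constant below is an approximate literature value for CO, and the integration settings are illustrative assumptions, not the paper's exact parameters:

```python
import math

AMU = 1.66053906660e-27                          # kg per atomic mass unit
K = 1857.0                                       # N/m, approx. CO force constant
MU = (12.0 * 15.995) / (12.0 + 15.995) * AMU     # reduced mass of 12C16O

def simulate_co_vibration(x0=5e-12, dt=1e-17, steps=20000):
    """Integrate mu * x'' = -K * x with velocity Verlet; return the
    displacement trajectory (x0 is the initial bond-length displacement)."""
    x, v = x0, 0.0
    a = -K * x / MU
    traj = []
    for _ in range(steps):
        traj.append(x)
        x += v * dt + 0.5 * a * dt * dt
        a_new = -K * x / MU
        v += 0.5 * (a + a_new) * dt
        a = a_new
    return traj

def estimated_period(traj, dt):
    """Period from the average spacing of downward zero crossings."""
    crossings = [i for i in range(1, len(traj)) if traj[i - 1] > 0.0 >= traj[i]]
    return (crossings[-1] - crossings[0]) / (len(crossings) - 1) * dt

traj = simulate_co_vibration()
T_analytic = 2.0 * math.pi * math.sqrt(MU / K)
T_numeric = estimated_period(traj, 1e-17)
print(f"analytic period:  {T_analytic:.3e} s")
print(f"numerical period: {T_numeric:.3e} s")
```

The numerical period should match the analytic value 2π√(μ/K) to well under one percent, the kind of sanity check one could apply to an on-chain run of the same model.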
|
25
|
International Collaboration in Open Access Publications: How Income Shapes International Collaboration. PUBLICATIONS 2020. [DOI: 10.3390/publications8010013] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022] Open
Abstract
Does the rise of open access journals change the way researchers collaborate? Specifically, since publishing in open access journals requires a publication fee, does income affect how researchers form international collaborations? To answer this question, we create a new data set by scraping bibliographic data from Multidisciplinary Digital Publishing Institute (MDPI) journals. Using the four income group classifications from the World Bank Analytical Classifications, we find that researchers from low-income nations are more likely to form international collaborations than researchers from wealthier nations. This result is verified to be significant using a series of pairwise Kolmogorov–Smirnov tests. We then study which nations most frequently form international collaborations with other nations and find that the USA, China, Germany, and France are the most preferred nations for forming international collaborations. While most nations prefer to form international collaborations with high-income nations, some exceptions exist, where a nation most often forms international collaborations with a nearby nation that is either an upper-middle-income or lower-middle-income nation. We further this analysis by showing that these results are apparent across the six different research categories established in the Frascati Manual. Finally, trends in publications in MDPI journals mirror trends seen in all journals, such as the continued increase in the percentage of published papers involving international collaboration.
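The pairwise Kolmogorov–Smirnov comparison used to verify the income-group differences can be sketched in pure Python. The statistic is the largest gap between the two empirical CDFs; the collaboration-share numbers below are made-up placeholders, not the study's data:

```python
def ks_two_sample(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum absolute
    difference between the empirical CDFs of samples a and b."""
    a, b = sorted(a), sorted(b)
    d = 0.0
    for x in sorted(set(a) | set(b)):
        f_a = sum(1 for v in a if v <= x) / len(a)
        f_b = sum(1 for v in b if v <= x) / len(b)
        d = max(d, abs(f_a - f_b))
    return d

# Hypothetical shares of internationally collaborative papers per nation.
low_income = [0.62, 0.71, 0.58, 0.66, 0.74]
high_income = [0.41, 0.39, 0.52, 0.45, 0.48]
print(f"D = {ks_two_sample(low_income, high_income):.2f}")
```

Because the two toy samples do not overlap at all, the statistic reaches its maximum of 1.0; real bibliographic data would give intermediate values whose significance is then assessed against the KS distribution.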
|
26
|
Finite element analysis of the rotator cuff: A systematic review. Clin Biomech (Bristol, Avon) 2020; 71:73-85. [PMID: 31707188 PMCID: PMC7086380 DOI: 10.1016/j.clinbiomech.2019.10.006] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 05/08/2019] [Revised: 07/30/2019] [Accepted: 10/05/2019] [Indexed: 02/07/2023]
Abstract
BACKGROUND Finite element modeling serves as a promising tool for investigating underlying rotator cuff biomechanics and pathology. However, there are currently no concrete guidelines for reporting in finite element model studies. This has compromised the reliability, validity, and reproducibility of literature due to omission of pertinent items within publications. Recently a Finite Element Model Grading Procedure has been proposed as a reporting guideline for model developers. The aim of this study was to conduct a systematic review of rotator cuff focused finite element models and characterize the reporting quality of those articles. METHODS A comprehensive literature search was performed in PubMed, Web of Science, and Embase to find relevant articles. Each article was graded and given a reporting quality ranking based on a score generated from the Finite Element Model Grading Procedure. FINDINGS We found that only 5/22 articles had scores of 75% or higher and fell within the "exceptional" reporting quality range. Most of the articles (16/22) fell within the "good" reporting quality range with scores between 50% and 75%. However, 9/16 articles within the "good" reporting quality range had scores below 60%. INTERPRETATION This study indicates that improved guidelines and standards for good reporting practices must be made in the field of finite element modeling. Furthermore, it supports the use of the Finite Element Model Grading Procedure as an objective method for evaluating the quality of finite element model reporting in the literature.
|
27
|
A Review of Microsoft Academic Services for Science of Science Studies. Front Big Data 2019; 2:45. [PMID: 33693368 PMCID: PMC7931949 DOI: 10.3389/fdata.2019.00045] [Citation(s) in RCA: 51] [Impact Index Per Article: 10.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/28/2019] [Accepted: 11/18/2019] [Indexed: 11/25/2022] Open
Abstract
Since the relaunch of Microsoft Academic Services (MAS) 4 years ago, scholarly communications have undergone dramatic changes: more ideas are being exchanged online, more authors are sharing their data, and more software tools used to make discoveries and reproduce the results are being distributed openly. The sheer amount of information available is overwhelming for individual humans to keep up with and digest. In the meantime, artificial intelligence (AI) technologies have made great strides and the cost of computing has plummeted to the extent that it has become practical to employ intelligent agents to comprehensively collect and analyze scholarly communications. MAS is one such effort and this paper describes its recent progress since the last disclosure. As there are plenty of independent studies affirming the effectiveness of MAS, this paper focuses on the use of three key AI technologies that underlie its prowess in capturing scholarly communications with adequate quality and broad coverage: (1) natural language understanding in extracting factoids from individual articles at web scale, (2) knowledge-assisted inference and reasoning in assembling the factoids into a knowledge graph, and (3) a reinforcement learning approach to assessing the scholarly importance of entities participating in scholarly communications, called saliency, which serves as both an analytic and a predictive metric in MAS. These elements enhance the capabilities of MAS in supporting studies of the science of science based on the GOTO principle, i.e., good and open data with transparent and objective methodologies. The current direction of development and how to access the regularly updated data and tools from MAS, including the knowledge graph, a REST API, and a website, are also described.
|
28
|
Can You Repeat That? Exploring the Definition of a Successful Model Replication in Health Economics. PHARMACOECONOMICS 2019; 37:1371-1381. [PMID: 31531833 DOI: 10.1007/s40273-019-00836-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/22/2023]
Abstract
The International Society for Pharmacoeconomics and Outcomes Research (ISPOR) modelling taskforce suggests decision models should be thoroughly reported and transparent. However, the level of transparency, and indeed how transparency should be assessed, are yet to be defined. One way may be to attempt to replicate the model and its outputs: the ability to replicate a decision model could demonstrate adequate reporting transparency. This review aims to explore published definitions of replication success across all scientific disciplines and to consider how such a definition should be tailored for use in health economic models. A literature review was conducted to identify published definitions of a 'successful replication'. Using these as a foundation, several definitions of replication success were constructed to be applicable to replications of economic decision models, and the associated strengths and weaknesses of these definitions are discussed. A substantial body of literature discussing replicability was found; however, relatively few studies (ten) explicitly defined a successful replication. These definitions varied from subjective assessments to expecting exactly the same results to be reproduced. While the definitions found may help to construct a definition specific to health economics, none completely encompassed the unique requirements of decision models. Replication is widely discussed in other scientific disciplines; however, there is as yet no consensus on how replicable models should be within health economics or what constitutes a successful replication. Replication studies can demonstrate how transparently a model is reported, identify potential calculation errors, and inform future reporting practices. Replication may therefore be a useful adjunct to other transparency or quality measures.
|
29
|
Abstract
Bolstered by ever more affordable computational power and open big datasets, artificial intelligence (AI) technologies are bringing revolutionary changes to our lives. This article examines the current trends and elaborates the future potential of AI in its role of making science more open and accessible. Based on experience derived from a research project called Microsoft Academic, there is reason to be optimistic about the future of open science, as the advanced discovery, ranking, and distribution technologies enabled by AI offer strong incentives for scientists, funders and research managers to make research articles, data and software freely available and accessible.
|
30
|
Abstract
Brian 2 allows scientists to simply and efficiently simulate spiking neural network models. These models can feature novel dynamical equations, their interactions with the environment, and experimental protocols. To preserve high performance when defining new models, most simulators offer two options: low-level programming or description languages. The first option requires expertise, is prone to errors, and is problematic for reproducibility. The second option cannot describe all aspects of a computational experiment, such as the potentially complex logic of a stimulation protocol. Brian addresses these issues using runtime code generation. Scientists write code with simple and concise high-level descriptions, and Brian transforms them into efficient low-level code that can run interleaved with their code. We illustrate this with several challenging examples: a plastic model of the pyloric network, a closed-loop sensorimotor model, a programmatic exploration of a neuron model, and an auditory model with real-time input.
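The runtime code-generation approach can be illustrated with a deliberately tiny sketch: a model equation written as a high-level string is turned into Python source for an Euler update and compiled on the fly. This is a toy illustration of the concept only; Brian's actual pipeline generates efficient low-level code and handles far more general equations:

```python
def generate_euler_step(equation, **params):
    """Compile an equation string 'dv/dt = <expr>' into an Euler update
    function. Parameters (e.g. I, tau) are bound as the generated
    function's globals. Toy sketch only: no units, no error checking."""
    lhs, rhs = (s.strip() for s in equation.split("="))
    var = lhs[1:lhs.index("/")]                     # 'dv/dt' -> 'v'
    src = f"def step({var}, dt):\n    return {var} + dt * ({rhs})\n"
    namespace = dict(params)                        # parameter lookup table
    exec(compile(src, "<generated>", "exec"), namespace)
    return namespace["step"]

# A leaky integrator defined the way a user would write it in a script.
step = generate_euler_step("dv/dt = (I - v) / tau", I=1.0, tau=1.0)
v = 0.0
for _ in range(1000):                               # 10 time constants at dt=0.01
    v = step(v, 0.01)
print(f"v after 10 time constants: {v:.3f}")        # approaches I = 1.0
```

The point is that the scientist only ever writes the concise equation string; the low-level update loop is generated, which is what lets a simulator combine high-level model descriptions with high performance.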
|
31
|
Automatised pharmacophoric deconvolution of plant extracts - application to Cinchona bark crude extract. Faraday Discuss 2019; 218:441-458. [PMID: 31120045 DOI: 10.1039/c8fd00242h] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022]
Abstract
We present a development of the "Plasmodesma" dereplication method [Margueritte et al., Magn. Reson. Chem., 2018, 56, 469]. This method is based on the automatic acquisition of a standard set of NMR experiments from a medium sized set of samples differing by their bioactivity. From this raw data, an analysis pipeline is run and the data is analysed by leveraging machine learning approaches in order to extract the spectral fingerprints of the active compounds. The optimal conditions for the analysis are determined and tested on two different systems, a synthetic sample where a single active molecule is to be isolated and characterized, and a complex bioactive matrix with synergetic interactions between the components. The method allows the identification of the active compounds and performs a pharmacophoric deconvolution. The program is freely available on the Internet, with an interactive visualisation of the statistical analysis, at https://plasmodesma.igbmc.science.
|
32
|
Abstract
The increasing pursuit of replicable research and actual replication of research is a political project that articulates a very specific technology of accountability for science. This project was initiated in response to concerns about the openness and trustworthiness of science. Though applicable and valuable in many fields, here we argue that this value cannot be extended everywhere, since the epistemic content of fields, as well as their accountability infrastructures, differ. Furthermore, we argue that there are limits to replicability across all fields; but in some fields, including parts of the humanities, these limits severely undermine the value of replication to account for the value of research.
|
33
|
Open collaborative writing with Manubot. PLoS Comput Biol 2019; 15:e1007128. [PMID: 31233491 PMCID: PMC6611653 DOI: 10.1371/journal.pcbi.1007128] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/06/2018] [Revised: 07/05/2019] [Accepted: 05/24/2019] [Indexed: 01/08/2023] Open
Abstract
Open, collaborative research is a powerful paradigm that can immensely strengthen the scientific process by integrating broad and diverse expertise. However, traditional research and multi-author writing processes break down at scale. We present new software named Manubot, available at https://manubot.org, to address the challenges of open scholarly writing. Manubot adopts the contribution workflow used by many large-scale open source software projects to enable collaborative authoring of scholarly manuscripts. With Manubot, manuscripts are written in Markdown and stored in a Git repository to precisely track changes over time. By hosting manuscript repositories publicly, such as on GitHub, multiple authors can simultaneously propose and review changes. A cloud service automatically evaluates proposed changes to catch errors. Publication with Manubot is continuous: When a manuscript's source changes, the rendered outputs are rebuilt and republished to a web page. Manubot automates bibliographic tasks by implementing citation by identifier, where users cite persistent identifiers (e.g. DOIs, PubMed IDs, ISBNs, URLs), whose metadata is then retrieved and converted to a user-specified style. Manubot modernizes publishing to align with the ideals of open science by making it transparent, reproducible, immediate, versioned, collaborative, and free of charge.
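Manubot's citation-by-identifier mechanism can be sketched with a short extraction pass over manuscript text. The `[@source:identifier]` citekey form follows Manubot's documented style, but the regex below is a simplified stand-in for the real parser, and the metadata-retrieval step is omitted:

```python
import re

def find_citekeys(markdown):
    """Extract citation-by-identifier keys such as doi:..., pmid:...,
    isbn:... or url:... from manuscript text (simplified sketch)."""
    return re.findall(r"\[@((?:doi|pmid|isbn|url):[^\]\s]+)\]", markdown)

text = "Prior work [@doi:10.1371/journal.pcbi.1007128] built on [@pmid:31233491]."
print(find_citekeys(text))
```

In the real system, each extracted identifier is resolved to full bibliographic metadata and rendered in a user-specified citation style, so authors never maintain a bibliography file by hand.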
|
34
|
Pixel: a content management platform for quantitative omics data. PeerJ 2019; 7:e6623. [PMID: 30944779 PMCID: PMC6441322 DOI: 10.7717/peerj.6623] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/05/2018] [Accepted: 02/14/2019] [Indexed: 12/30/2022] Open
Abstract
BACKGROUND In biology, high-throughput experimental technologies, also referred to as "omics" technologies, are increasingly used in research laboratories. Several thousand gene expression measurements can be obtained in a single experiment. Researchers routinely face the challenge of annotating, storing, exploring and mining all the biological information they have at their disposal. We present here the Pixel web application (Pixel Web App), an original content management platform to help people involved in a multi-omics biological project. METHODS The Pixel Web App is built with open source technologies and hosted on the collaborative development platform GitHub (https://github.com/Candihub/pixel). It is written in Python using the Django framework and stores all the data in a PostgreSQL database. It is developed in the open and licensed under the BSD 3-clause license. The Pixel Web App is also heavily tested with both unit and functional tests, strong code coverage and continuous integration provided by CircleCI. To ease the development and the deployment of the Pixel Web App, Docker and Docker Compose are used to bundle the application as well as its dependencies. RESULTS The Pixel Web App offers researchers an intuitive way to annotate, store, explore and mine their multi-omics results. It can be installed on a personal computer or on a server to fit the needs of many users. In addition, anyone can enhance the application to better suit their needs, either by contributing directly on GitHub (encouraged) or by extending Pixel on their own. The Pixel Web App does not provide any computational programs to analyze the data. Still, it helps to rapidly explore and mine existing results and holds a strategic position in the management of research data.
|
35
|
Successes and Struggles with Computational Reproducibility: Lessons from the Fragile Families Challenge. SOCIUS : SOCIOLOGICAL RESEARCH FOR A DYNAMIC WORLD 2019; 5:10.1177/2378023119849803. [PMID: 37309413 PMCID: PMC10260256 DOI: 10.1177/2378023119849803] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]
Abstract
Reproducibility is fundamental to science, and an important component of reproducibility is computational reproducibility: the ability of a researcher to recreate the results of a published study using the original author's raw data and code. Although most people agree that computational reproducibility is important, it is still difficult to achieve in practice. In this article, the authors describe their approach to enabling computational reproducibility for the 12 articles in this special issue of Socius about the Fragile Families Challenge. The approach draws on two tools commonly used by professional software engineers but not widely used by academic researchers: software containers (e.g., Docker) and cloud computing (e.g., Amazon Web Services). These tools made it possible to standardize the computing environment around each submission, which will ease computational reproducibility both today and in the future. Drawing on their successes and struggles, the authors conclude with recommendations to researchers and journals.
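Containers freeze the full software stack, but even a lightweight fingerprint of the environment and the analysis code helps a later rerun detect drift. The sketch below is illustrative only, not the authors' actual pipeline:

```python
import hashlib
import json
import platform
import sys

def environment_fingerprint(code_text):
    """Record interpreter and platform details plus a hash of the
    analysis code so a rerun can verify it matches the original."""
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "code_sha256": hashlib.sha256(code_text.encode()).hexdigest(),
    }

fp = environment_fingerprint("print('analysis')")
print(json.dumps(fp, indent=2))
```

Storing such a record next to each submission is a cheap complement to the container image itself: if the hash or interpreter version differs at rerun time, the discrepancy is caught before results are compared.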
|
36
|
Replicability or reproducibility? On the replication crisis in computational neuroscience and sharing only relevant detail. J Comput Neurosci 2018; 45:163-172. [PMID: 30377880 PMCID: PMC6306493 DOI: 10.1007/s10827-018-0702-z] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/28/2018] [Revised: 10/05/2018] [Accepted: 10/17/2018] [Indexed: 01/25/2023]
Abstract
Replicability and reproducibility of computational models have been somewhat understudied by "the replication movement." In this paper, we draw on methodological studies into the replicability of psychological experiments and on the mechanistic account of explanation to analyze the functions of model replications and model reproductions in computational neuroscience. We contend that model replicability, or independent researchers' ability to obtain the same output using original code and data, and model reproducibility, or independent researchers' ability to recreate a model without original code, serve different functions and fail for different reasons. This means that measures designed to improve model replicability may not enhance (and, in some cases, may actually damage) model reproducibility. We claim that although both are undesirable, low model reproducibility poses more of a threat to long-term scientific progress than low model replicability. In our opinion, low model reproducibility stems mostly from authors' omitting to provide crucial information in scientific papers, and we stress that sharing all computer code and data is not a solution. Reports of computational studies should remain selective and include all and only relevant bits of code.
|
37
|
"Reproducible" Research in Mathematical Sciences Requires Changes in our Peer Review Culture and Modernization of our Current Publication Approach. Bull Math Biol 2018; 80:3095-3105. [PMID: 30232583 PMCID: PMC6240027 DOI: 10.1007/s11538-018-0500-9] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/29/2018] [Accepted: 08/29/2018] [Indexed: 01/30/2023]
Abstract
The nature of scientific research in mathematical and computational biology allows editors and reviewers to evaluate the findings of a scientific paper. Replication of a research study should be the minimum standard for judging its scientific claims and considering it for publication. This requires changes in the current peer review practice and a strict adoption of a replication policy similar to those adopted in experimental fields such as organic synthesis. In the future, the culture of replication can be easily adopted by publishing papers through dynamic computational notebooks combining formatted text, equations, computer algebra and computer code.
|
38
|
Neuroinformatics and Computational Modelling as Complementary Tools for Neurotoxicology Studies. Basic Clin Pharmacol Toxicol 2018; 123 Suppl 5:56-61. [DOI: 10.1111/bcpt.13075] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/23/2018] [Accepted: 06/18/2018] [Indexed: 11/28/2022]
|
39
|
Reproducing Polychronization: A Guide to Maximizing the Reproducibility of Spiking Network Models. Front Neuroinform 2018; 12:46. [PMID: 30123121 PMCID: PMC6085985 DOI: 10.3389/fninf.2018.00046] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/01/2018] [Accepted: 06/26/2018] [Indexed: 01/02/2023] Open
Abstract
Any modeler who has attempted to reproduce a spiking neural network model from its description in a paper has discovered what a painful endeavor this is. Even when all parameters appear to have been specified, which is rare, typically the initial attempt to reproduce the network does not yield results that are recognizably akin to those in the original publication. Causes include inaccurately reported or hidden parameters (e.g., wrong unit or the existence of an initialization distribution), differences in implementation of model dynamics, and ambiguities in the text description of the network experiment. The very fact that adequate reproduction often cannot be achieved until a series of such causes have been tracked down and resolved is in itself disconcerting, as it reveals unreported model dependencies on specific implementation choices that either were not clear to the original authors, or that they chose not to disclose. In either case, such dependencies diminish the credibility of the model's claims about the behavior of the target system. To demonstrate these issues, we provide a worked example of reproducing a seminal study for which, unusually, source code was provided at time of publication. Despite this seemingly optimal starting position, reproducing the results was time consuming and frustrating. Further examination of the correctly reproduced model reveals that it is highly sensitive to implementation choices such as the realization of background noise, the integration timestep, and the thresholding parameter of the analysis algorithm. From this process, we derive a guideline of best practices that would substantially reduce the investment in reproducing neural network studies, whilst simultaneously increasing their scientific quality. We propose that this guideline can be used by authors and reviewers to assess and improve the reproducibility of future network models.
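The sensitivity to the integration timestep that the abstract reports is easy to demonstrate even in a deliberately tiny example. The sketch below is a single forward-Euler leaky integrate-and-fire neuron, not the polychronization model itself, and all parameter values are made up for illustration.

```python
def spike_count(dt, t_max=1000.0, tau=10.0, v_th=15.0, drive=16.0):
    """Count spikes of a leaky integrate-and-fire neuron driven by a
    constant input, integrated with forward Euler at timestep dt (ms)."""
    v, count = 0.0, 0
    for _ in range(int(round(t_max / dt))):
        v += (dt / tau) * (drive - v)   # Euler step toward steady state
        if v >= v_th:                   # threshold crossing
            count += 1
            v = 0.0                     # reset
    return count

coarse = spike_count(dt=1.0)   # coarse integration
fine = spike_count(dt=0.1)     # 10x finer integration
print(coarse, fine)            # same model, different spike counts
```

Nothing about the model changed between the two runs; only the solver resolution did. In a large network, such single-neuron discrepancies compound, which is why the timestep belongs in the published model description rather than in an unstated implementation choice.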
|
40
|
Challenges in Reproducibility, Replicability, and Comparability of Computational Models and Tools for Neuronal and Glial Networks, Cells, and Subcellular Structures. Front Neuroinform 2018; 12:20. [PMID: 29765315 PMCID: PMC5938413 DOI: 10.3389/fninf.2018.00020] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/01/2018] [Accepted: 04/06/2018] [Indexed: 01/26/2023] Open
Abstract
The possibility to replicate and reproduce published research results is one of the biggest challenges in all areas of science. In computational neuroscience, there are thousands of models available. However, it is rarely possible to reimplement the models based on the information in the original publication, let alone rerun the models, simply because the model implementations have not been made publicly available. We evaluate and discuss the comparability of a versatile choice of simulation tools: tools for biochemical reactions and spiking neuronal networks, and relatively new tools for growth in cell cultures. The replicability and reproducibility issues are considered for computational models that are equally diverse, including the models for intracellular signal transduction of neurons and glial cells, in addition to single glial cells, neuron-glia interactions, and selected examples of spiking neuronal networks. We also address the comparability of the simulation results with one another to determine whether the studied models can be used to answer similar research questions. In addition to presenting the challenges in reproducibility and replicability of published results in computational neuroscience, we highlight the need for developing recommendations and good practices for publishing simulation tools and computational models. Model validation and flexible model description must be an integral part of the tool used to simulate and develop computational models. Constant improvement on experimental techniques and recording protocols leads to increasing knowledge about the biophysical mechanisms in neural systems. This poses new challenges for computational neuroscience: extended or completely new computational methods and models may be required. Careful evaluation and categorization of the existing models and tools provide a foundation for these future needs, for constructing multiscale models or extending the models to incorporate additional or more detailed biophysical mechanisms. Improving the quality of publications in computational neuroscience, enabling progressive building of advanced computational models and tools, can be achieved only through adopting publishing standards which underline replicability and reproducibility of research results.
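The call for flexible model descriptions can be made concrete: keep the model as plain data rather than code, so it can be published, diffed, and reloaded by any tool. The structure and field names below are hypothetical, not an existing standard such as NeuroML or SBML, and the equations shown are the standard Izhikevich neuron used purely as filler content.

```python
import json

# A model description as data: parameters, equations, and solver
# settings are all explicit and serializable.
model = {
    "name": "toy_izhikevich_population",
    "units": {"v": "mV", "t": "ms"},
    "parameters": {"a": 0.02, "b": 0.2, "c": -65.0, "d": 8.0},
    "equations": [
        "dv/dt = 0.04*v**2 + 5*v + 140 - u + I",
        "du/dt = a*(b*v - u)",
    ],
    "solver": {"method": "euler", "dt_ms": 0.1},
}

# Publish the description alongside the paper...
published = json.dumps(model, indent=2, sort_keys=True)

# ...and an independent group reloads it with nothing lost in transit.
reloaded = json.loads(published)
print(reloaded == model)  # → True
```

A description like this does not replace the simulator, but it makes the simulator's inputs checkable, which is exactly where, per the abstract, reimplementation attempts usually fail.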
|
41
|
Computational Models for Calcium-Mediated Astrocyte Functions. Front Comput Neurosci 2018; 12:14. [PMID: 29670517 PMCID: PMC5893839 DOI: 10.3389/fncom.2018.00014] [Citation(s) in RCA: 45] [Impact Index Per Article: 7.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/06/2018] [Accepted: 02/28/2018] [Indexed: 12/16/2022] Open
Abstract
The computational neuroscience field has heavily concentrated on the modeling of neuronal functions, largely ignoring other brain cells, including one type of glial cell, the astrocytes. Despite the short history of modeling astrocytic functions, we were delighted to find hundreds of models developed so far to study the role of astrocytes, most often in calcium dynamics, synchronization, information transfer, and plasticity in vitro, but also in vascular events, hyperexcitability, and homeostasis. Our goal here is to present the state-of-the-art in computational modeling of astrocytes in order to facilitate better understanding of the functions and dynamics of astrocytes in the brain. Due to the large number of models, we concentrated on a hundred models that include biophysical descriptions for calcium signaling and dynamics in astrocytes. We categorized the models into four groups: single astrocyte models, astrocyte network models, neuron-astrocyte synapse models, and neuron-astrocyte network models to ease their use in future modeling projects. We characterized the models based on which earlier models were used for building the models and which type of biological entities were described in the astrocyte models. Features of the models were compared and contrasted so that similarities and differences were more readily apparent. We discovered that most of the models were basically generated from a small set of previously published models with small variations. However, neither citations to all the previous models with similar core structure nor explanations of what was built on top of the previous models were provided, which made it possible, in some cases, to have the same models published several times without an explicit intention to make new predictions about the roles of astrocytes in brain functions. Furthermore, only a few of the models are available online, which makes it difficult to reproduce the simulation results and further develop the models. Thus, we would like to emphasize that only via reproducible research are we able to build better computational models for astrocytes, which truly advance science. Our study is the first to characterize in detail the biophysical and biochemical mechanisms that have been modeled for astrocytes.
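To make the class of models the review surveys tangible, here is a deliberately minimal one-pool calcium sketch (constant influx, linear extrusion). It is not any of the published astrocyte models the review categorizes, and the parameter values are arbitrary.

```python
#   dC/dt = j_in - k_out * C,   steady state C* = j_in / k_out
def simulate_calcium(j_in=0.4, k_out=0.2, c0=0.05, dt=0.01, t_max=100.0):
    """Forward-Euler integration of cytosolic [Ca2+] (arbitrary units):
    constant influx j_in balanced by linear extrusion k_out * C."""
    c = c0
    for _ in range(int(round(t_max / dt))):
        c += dt * (j_in - k_out * c)
    return c

final = simulate_calcium()
print(final)  # relaxes toward j_in / k_out = 2.0
```

Published astrocyte models add IP3-gated ER release, pumps, and buffering on top of this balance; writing even the minimal version down makes clear how many parameters a full model must report for anyone to reproduce it.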
|
42
|
Re-run, Repeat, Reproduce, Reuse, Replicate: Transforming Code into Scientific Contributions. Front Neuroinform 2018; 11:69. [PMID: 29354046 PMCID: PMC5758530 DOI: 10.3389/fninf.2017.00069] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/28/2017] [Accepted: 11/17/2017] [Indexed: 12/04/2022] Open
Abstract
Scientific code is different from production software. Scientific code, by producing results that are then analyzed and interpreted, participates in the elaboration of scientific conclusions. This imposes specific constraints on the code that are often overlooked in practice. We articulate, with a small example, five characteristics that a scientific code in computational science should possess: re-runnable, repeatable, reproducible, reusable, and replicable. The code should be executable (re-runnable) and produce the same result more than once (repeatable); it should allow an investigator to reobtain the published results (reproducible) while being easy to use, understand and modify (reusable), and it should act as an available reference for any ambiguity in the algorithmic descriptions of the article (replicable).
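The five characteristics can be illustrated with even a trivial analysis. This sketch is mine, not the article's example; it shows the two properties code alone can guarantee, re-runnable (it executes) and repeatable (a fixed seed makes every run produce the same result), with a recorded provenance line pointing toward reproducibility.

```python
import json
import random

def noisy_mean(n, seed):
    """Mean of n standard-normal draws from an explicit, seeded RNG,
    so repeated runs with the same inputs give the same output."""
    rng = random.Random(seed)
    samples = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return sum(samples) / n

result = noisy_mean(1000, seed=42)

# Record what was run with which inputs, so others can reobtain the
# published number (reproducible) from the stated inputs.
record = {"function": "noisy_mean", "n": 1000, "seed": 42, "result": result}
print(json.dumps(record))
```

Reusability and replicability are properties of the surrounding text and structure rather than of any single line: clear naming and a docstring help the former, and an unambiguous algorithmic description in the article enables the latter.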
|