1
Anderson KR, Harris JA, Ng L, Prins P, Memar S, Ljungquist B, Fürth D, Williams RW, Ascoli GA, Dumitriu D. Highlights from the Era of Open Source Web-Based Tools. J Neurosci 2021; 41:927-936. [PMID: 33472826 PMCID: PMC7880282 DOI: 10.1523/jneurosci.1657-20.2020] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 06/30/2020] [Revised: 11/22/2020] [Accepted: 11/29/2020] [Indexed: 12/20/2022] Open
Abstract
High digital connectivity and a focus on reproducibility are contributing to an open science revolution in neuroscience. Repositories and platforms have emerged across the whole spectrum of subdisciplines, paving the way for a paradigm shift in the way we share, analyze, and reuse vast amounts of data collected across many laboratories. Here, we describe how open access web-based tools are changing the landscape and culture of neuroscience, highlighting six free resources that span subdisciplines from behavior to whole-brain mapping, circuits, neurons, and gene variants.
Affiliation(s)
- Kristin R Anderson
  - Departments of Pediatrics and Psychiatry, Columbia University, New York, New York 10032
  - Division of Developmental Psychobiology, New York State Psychiatric Institute, New York, New York 10032
  - The Sackler Institute for Developmental Psychobiology, Columbia University, New York, New York 10032
  - Columbia Population Research Center, Columbia University, New York, New York 10027
  - Zuckerman Institute, Columbia University, New York, New York 10027
- Julie A Harris
  - Allen Institute for Brain Science, Seattle, Washington 98109
- Lydia Ng
  - Allen Institute for Brain Science, Seattle, Washington 98109
- Pjotr Prins
  - Department of Genetics, Genomics and Informatics, Center for Integrative and Translational Genomics, University of Tennessee Health Science Center, Memphis, Tennessee 38163
- Sara Memar
  - Robarts Research Institute, BrainsCAN, Schulich School of Medicine & Dentistry, Western University, London, Ontario N6A 3K7, Canada
- Bengt Ljungquist
  - Center for Neural Informatics, Structures, and Plasticity, Krasnow Institute for Advanced Study; and Department of Bioengineering, Volgenau School of Engineering, George Mason University, Fairfax, Virginia 22030
- Daniel Fürth
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724
- Robert W Williams
  - Department of Genetics, Genomics and Informatics, Center for Integrative and Translational Genomics, University of Tennessee Health Science Center, Memphis, Tennessee 38163
- Giorgio A Ascoli
  - Center for Neural Informatics, Structures, and Plasticity, Krasnow Institute for Advanced Study; and Department of Bioengineering, Volgenau School of Engineering, George Mason University, Fairfax, Virginia 22030
- Dani Dumitriu
  - Departments of Pediatrics and Psychiatry, Columbia University, New York, New York 10032
  - Division of Developmental Psychobiology, New York State Psychiatric Institute, New York, New York 10032
  - The Sackler Institute for Developmental Psychobiology, Columbia University, New York, New York 10032
  - Columbia Population Research Center, Columbia University, New York, New York 10027
  - Zuckerman Institute, Columbia University, New York, New York 10027
2
McDougal RA, Bulanova AS, Lytton WW. Reproducibility in Computational Neuroscience Models and Simulations. IEEE Trans Biomed Eng 2016; 63:2021-35. [PMID: 27046845 PMCID: PMC5016202 DOI: 10.1109/tbme.2016.2539602] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.1] [Indexed: 12/28/2022]
Abstract
OBJECTIVE Like all scientific research, computational neuroscience research must be reproducible. Big data science, including simulation research, cannot depend exclusively on journal articles as the method to provide the sharing and transparency required for reproducibility. METHODS Ensuring model reproducibility requires the use of multiple standard software practices and tools, including version control, strong commenting and documentation, and code modularity. RESULTS Building on these standard practices, model-sharing sites and tools have been developed that fit into several categories: 1) standardized neural simulators; 2) shared computational resources; 3) declarative model descriptors, ontologies, and standardized annotations; and 4) model-sharing repositories and sharing standards. CONCLUSION A number of complementary innovations have been proposed to enhance sharing, transparency, and reproducibility. The individual user can be encouraged to make use of version control, commenting, documentation, and modularity in development of models. The community can help by requiring model sharing as a condition of publication and funding. SIGNIFICANCE Model management will become increasingly important as multiscale models become larger, more detailed, and correspondingly more difficult to manage by any single investigator or single laboratory. Additional big data management complexity will come as the models become more useful in interpreting experiments, thus increasing the need to ensure clear alignment between modeling data, both parameters and results, and experiment.
3
Abstract
Routine data sharing is greatly benefiting several scientific disciplines, such as molecular biology, particle physics, and astronomy. Neuroscience data, in contrast, are still rarely shared, greatly limiting the potential for secondary discovery and the acceleration of research progress. Although the attitude toward data sharing is non-uniform across neuroscience subdomains, widespread adoption of data sharing practice will require a cultural shift in the community. Digital reconstructions of axonal and dendritic morphology constitute a particularly "sharable" kind of data. The popularity of the public repository NeuroMorpho.Org demonstrates that data sharing can benefit both users and contributors. Increased data availability is also catalyzing the grassroots development and spontaneous integration of complementary resources, research tools, and community initiatives. Even in this rare successful subfield, however, more data are still unshared than shared. Our experience as developers and curators of NeuroMorpho.Org suggests that greater transparency regarding the expectations and consequences of sharing (or not sharing) data, combined with public disclosure of which datasets are shared and which are not, may expedite the transition to community-wide data sharing.
Affiliation(s)
- Giorgio A. Ascoli
  - Krasnow Institute for Advanced Study, George Mason University, Fairfax, Virginia, United States of America
4
A Digital Repository and Execution Platform for Interactive Scholarly Publications in Neuroscience. Neuroinformatics 2015; 14:23-40. [PMID: 26306864 DOI: 10.1007/s12021-015-9276-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
Abstract
The CARMEN Virtual Laboratory (VL) is a cloud-based platform which allows neuroscientists to store, share, develop, execute, reproduce and publicise their work. This paper describes new functionality in the CARMEN VL: an interactive publications repository. This new facility allows users to link data and software to publications. This enables other users to examine data and software associated with the publication and execute the associated software within the VL using the same data as the authors used in the publication. The cloud-based architecture and SaaS (Software as a Service) framework allows vast data sets to be uploaded and analysed using software services. Thus, this new interactive publications facility allows others to build on research results through reuse. This aligns with recent developments by funding agencies, institutions, and publishers with a move to open access research. Open access provides reproducibility and verification of research resources and results. Publications and their associated data and software will be assured of long-term preservation and curation in the repository. Further, analysing research data and the evaluations described in publications frequently requires a number of execution stages many of which are iterative. The VL provides a scientific workflow environment to combine software services into a processing tree. These workflows can also be associated with publications and executed by users. The VL also provides a secure environment where users can decide the access rights for each resource to ensure copyright and privacy restrictions are met.
5
Making big data open: data sharing in neuroimaging. Nat Neurosci 2014; 17:1510-7. [PMID: 25349916 DOI: 10.1038/nn.3818] [Citation(s) in RCA: 240] [Impact Index Per Article: 21.8] [Received: 04/28/2014] [Accepted: 08/21/2014] [Indexed: 12/11/2022]
6
Marenco LN, Wang R, Bandrowski AE, Grethe JS, Shepherd GM, Miller PL. Extending the NIF DISCO framework to automate complex workflow: coordinating the harvest and integration of data from diverse neuroscience information resources. Front Neuroinform 2014; 8:58. [PMID: 25018728 PMCID: PMC4071641 DOI: 10.3389/fninf.2014.00058] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Received: 01/22/2014] [Accepted: 05/06/2014] [Indexed: 11/15/2022] Open
Abstract
This paper describes how DISCO, the data aggregator that supports the Neuroscience Information Framework (NIF), has been extended to play a central role in automating the complex workflow required to support and coordinate the NIF’s data integration capabilities. The NIF is an NIH Neuroscience Blueprint initiative designed to help researchers access the wealth of data related to the neurosciences available via the Internet. A central component is the NIF Federation, a searchable database that currently contains data from 231 data and information resources regularly harvested, updated, and warehoused in the DISCO system. In the past several years, DISCO has greatly extended its functionality and has evolved to play a central role in automating the complex, ongoing process of harvesting, validating, integrating, and displaying neuroscience data from a growing set of participating resources. This paper provides an overview of DISCO’s current capabilities and discusses a number of the challenges and future directions related to the process of coordinating the integration of neuroscience data within the NIF Federation.
Affiliation(s)
- Luis N Marenco
  - Center for Medical Informatics, Yale University School of Medicine, New Haven, CT, USA
  - VA Connecticut Healthcare System, US Department of Veterans Affairs, West Haven, CT, USA
  - Department of Neurobiology, Yale University School of Medicine, New Haven, CT, USA
- Rixin Wang
  - Center for Medical Informatics, Yale University School of Medicine, New Haven, CT, USA
- Anita E Bandrowski
  - Department of Neurosciences, Center for Research in Biological Systems, University of California at San Diego, La Jolla, CA, USA
- Jeffrey S Grethe
  - Department of Neurosciences, Center for Research in Biological Systems, University of California at San Diego, La Jolla, CA, USA
- Gordon M Shepherd
  - Department of Neurobiology, Yale University School of Medicine, New Haven, CT, USA
- Perry L Miller
  - Center for Medical Informatics, Yale University School of Medicine, New Haven, CT, USA
  - VA Connecticut Healthcare System, US Department of Veterans Affairs, West Haven, CT, USA
  - Department of Anesthesiology, Yale University School of Medicine, New Haven, CT, USA
  - Department of Molecular, Cellular and Developmental Biology, Yale University, New Haven, CT, USA
7
Arbib MA, Bonaiuto JJ, Bornkessel-Schlesewsky I, Kemmerer D, MacWhinney B, Nielsen FÅ, Oztop E. Action and language mechanisms in the brain: data, models and neuroinformatics. Neuroinformatics 2014; 12:209-25. [PMID: 24234916 PMCID: PMC4101894 DOI: 10.1007/s12021-013-9210-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 10/26/2022]
Abstract
We assess the challenges of studying action and language mechanisms in the brain, both singly and in relation to each other to provide a novel perspective on neuroinformatics, integrating the development of databases for encoding – separately or together – neurocomputational models and empirical data that serve systems and cognitive neuroscience.
Affiliation(s)
- Michael A. Arbib
  - Computer Science and Neuroscience Graduate Program, University of Southern California, Los Angeles, CA, USA
- James J. Bonaiuto
  - Division of Biology, California Institute of Technology, Pasadena, CA, USA
- David Kemmerer
  - Speech, Language, & Hearing Sciences and Psychological Sciences, Purdue University, West Lafayette, IN, USA
- Brian MacWhinney
  - Psychology, Computational Linguistics, and Modern Languages, Carnegie Mellon University, Pittsburgh, PA, USA
8
Poldrack RA, Barch DM, Mitchell JP, Wager TD, Wagner AD, Devlin JT, Cumba C, Koyejo O, Milham MP. Toward open sharing of task-based fMRI data: the OpenfMRI project. Front Neuroinform 2013; 7:12. [PMID: 23847528 PMCID: PMC3703526 DOI: 10.3389/fninf.2013.00012] [Citation(s) in RCA: 203] [Impact Index Per Article: 16.9] [Received: 01/14/2013] [Accepted: 06/03/2013] [Indexed: 11/13/2022] Open
Abstract
The large-scale sharing of task-based functional neuroimaging data has the potential to allow novel insights into the organization of mental function in the brain, but the field of neuroimaging has lagged behind other areas of bioscience in the development of data sharing resources. This paper describes the OpenFMRI project (accessible online at http://www.openfmri.org), which aims to provide the neuroimaging community with a resource to support open sharing of task-based fMRI studies. We describe the motivation behind the project, focusing particularly on how this project addresses some of the well-known challenges to sharing of task-based fMRI data. Results from a preliminary analysis of the current database are presented, which demonstrate the ability to classify between task contrasts with high generalization accuracy across subjects, and the ability to identify individual subjects from their activation maps with moderately high accuracy. Clustering analyses show that the similarity relations between statistical maps have a somewhat orderly relation to the mental functions engaged by the relevant tasks. These results highlight the potential of the project to support large-scale multivariate analyses of the relation between mental processes and brain function.
9
A survey of the neuroscience resource landscape: perspectives from the neuroscience information framework. Int Rev Neurobiol 2013. [PMID: 23195120 DOI: 10.1016/b978-0-12-388408-4.00003-4] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Indexed: 04/05/2024]
Abstract
The number of neuroscience resources (databases, tools, materials, and networks) available via the Web continues to expand, particularly in light of newly implemented data sharing policies required by funding agencies and journals. However, the nature of dense, multifaceted neuroscience data and the design of classic search engine systems make efficient, reliable, and relevant discovery of such resources a significant challenge. This challenge is especially pertinent for online databases, whose dynamic content is largely opaque to contemporary search engines. The Neuroscience Information Framework was initiated to address this problem of finding and utilizing neuroscience-relevant resources. Since its first production release in 2008, NIF has been surveying the resource landscape for the neurosciences, identifying relevant resources and working to make them easily discoverable by the neuroscience community. In this chapter, we provide a survey of the resource landscape for neuroscience: what types of resources are available, how many there are, what they contain, and most importantly, ways in which these resources can be utilized by the research community to advance neuroscience research.
10
Kennedy DN, Hodge SM, Gao Y, Frazier JA, Haselgrove C. The internet brain volume database: a public resource for storage and retrieval of volumetric data. Neuroinformatics 2012; 10:129-40. [PMID: 21931990 DOI: 10.1007/s12021-011-9130-1] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 11/28/2022]
Abstract
Every month, numerous publications appear that include neuroanatomic volumetric observations. The current and past literature that includes volumetric measurements is vast, but variable with respect to specific species, structures, and subject characteristics (such as gender, age, pathology, etc.). In this report we introduce the Internet Brain Volume Database (IBVD), www.nitrc.org/projects/ibvd, a site devoted to facilitating access to and utilization of neuroanatomic volumetric observations as published in the literature. We review the design and functionality of the site. The IBVD is the first database dedicated to integrating, exposing and sharing brain volumetric observations across species and disease. It offers valuable functionality for quality assurance assessment of results as well as support for meta-analysis across large segments of the published literature that are obscured from traditional text-based search engines.
Affiliation(s)
- David N Kennedy
  - Department of Psychiatry, University of Massachusetts Medical School, 356 Plantation St, Biotech 1, Suite 100, Worcester, MA 01605, USA.
11
Bandrowski AE, Cachat J, Li Y, Müller HM, Sternberg PW, Ciccarese P, Clark T, Marenco L, Wang R, Astakhov V, Grethe JS, Martone ME. A hybrid human and machine resource curation pipeline for the Neuroscience Information Framework. Database (Oxford) 2012; 2012:bas005. [PMID: 22434839 PMCID: PMC3308161 DOI: 10.1093/database/bas005] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Indexed: 11/16/2022]
Abstract
The breadth of information resources available to researchers on the Internet continues to expand, particularly in light of recently implemented data-sharing policies required by funding agencies. However, the nature of dense, multifaceted neuroscience data and the design of contemporary search engine systems make efficient, reliable and relevant discovery of such information a significant challenge. This challenge is specifically pertinent for online databases, whose dynamic content is ‘hidden’ from search engines. The Neuroscience Information Framework (NIF; http://www.neuinfo.org) was funded by the NIH Blueprint for Neuroscience Research to address the problem of finding and utilizing neuroscience-relevant resources such as software tools, data sets, experimental animals and antibodies across the Internet. From the outset, NIF sought to provide an accounting of available resources while developing technical solutions to finding, accessing and utilizing them. The curators, therefore, are tasked with identifying and registering resources, examining data, writing configuration files to index and display data and keeping the contents current. In the initial phases of the project, all aspects of the registration and curation processes were manual. However, as the number of resources grew, manual curation became impractical. This report describes our experiences and successes with developing automated resource discovery and semiautomated type characterization with text-mining scripts that facilitate curation team efforts to discover, integrate and display new content. We also describe the DISCO framework, a suite of automated web services that significantly reduce manual curation efforts to periodically check for resource updates. Lastly, we discuss DOMEO, a semi-automated annotation tool that improves the discovery and curation of resources that are not necessarily website-based (i.e. reagents, software tools).
Although the ultimate goal of automation was to reduce the workload of the curators, it has resulted in valuable analytic by-products that address accessibility, use and citation of resources that can now be shared with resource owners and the larger scientific community. Database URL: http://neuinfo.org
Affiliation(s)
- A E Bandrowski
  - Center for Research in Biological Systems, University of California San Diego, CA, USA.
12
The NIF DISCO Framework: facilitating automated integration of neuroscience content on the web. Neuroinformatics 2010; 8:101-12. [PMID: 20387131 DOI: 10.1007/s12021-010-9068-8] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Indexed: 10/19/2022]
Abstract
This paper describes the capabilities of DISCO, an extensible approach that supports integrative Web-based information dissemination. DISCO is a component of the Neuroscience Information Framework (NIF), an NIH Neuroscience Blueprint initiative that facilitates integrated access to diverse neuroscience resources via the Internet. DISCO facilitates the automated maintenance of several distinct capabilities using a collection of files 1) that are maintained locally by the developers of participating neuroscience resources and 2) that are "harvested" on a regular basis by a central DISCO server. This approach allows central NIF capabilities to be updated as each resource's content changes over time. DISCO currently supports the following capabilities: 1) resource descriptions, 2) "LinkOut" to a resource's data items from NCBI Entrez resources such as PubMed, 3) Web-based interoperation with a resource, 4) sharing a resource's lexicon and ontology, 5) sharing a resource's database schema, and 6) participation by the resource in neuroscience-related RSS news dissemination. The developers of a resource are free to choose which DISCO capabilities their resource will participate in. Although DISCO is used by NIF to facilitate neuroscience data integration, its capabilities have general applicability to other areas of research.
13
The neuroscience information framework: a data and knowledge environment for neuroscience. Neuroinformatics 2008; 6:149-60. [PMID: 18946742 DOI: 10.1007/s12021-008-9024-z] [Citation(s) in RCA: 136] [Impact Index Per Article: 8.0] [Received: 08/15/2008] [Accepted: 08/26/2008] [Indexed: 10/21/2022]
Abstract
With support from the Institutes and Centers forming the NIH Blueprint for Neuroscience Research, we have designed and implemented a new initiative for integrating access to and use of Web-based neuroscience resources: the Neuroscience Information Framework. The Framework arises from the expressed need of the neuroscience community for neuroinformatic tools and resources to aid scientific inquiry, builds upon prior development of neuroinformatics by the Human Brain Project and others, and directly derives from the Society for Neuroscience's Neuroscience Database Gateway. Partnered with the Society, its Neuroinformatics Committee, and volunteer consultant-collaborators, our multi-site consortium has developed: (1) a comprehensive, dynamic inventory of Web-accessible neuroscience resources, (2) an extended and integrated terminology describing resources and contents, and (3) a framework accepting and aiding concept-based queries. Evolving instantiations of the Framework may be viewed at http://nif.nih.gov, http://neurogateway.org, and other sites as they come on line.
14
Halavi M, Polavaram S, Donohue DE, Hamilton G, Hoyt J, Smith KP, Ascoli GA. NeuroMorpho.Org implementation of digital neuroscience: dense coverage and integration with the NIF. Neuroinformatics 2008; 6:241-52. [PMID: 18949582 PMCID: PMC2655120 DOI: 10.1007/s12021-008-9030-1] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.6] [Received: 09/04/2008] [Accepted: 09/22/2008] [Indexed: 02/03/2023]
Abstract
Neuronal morphology affects network connectivity, plasticity, and information processing. Uncovering the design principles and functional consequences of dendritic and axonal shape necessitates quantitative analysis and computational modeling of detailed experimental data. Digital reconstructions provide the required neuromorphological descriptions in a parsimonious, comprehensive, and reliable numerical format. NeuroMorpho.Org is the largest web-accessible repository service for digitally reconstructed neurons and one of the integrated resources in the Neuroscience Information Framework (NIF). Here we describe the NeuroMorpho.Org approach as an exemplary experience in designing, creating, populating, and curating a neuroscience digital resource. The simple three-tier architecture of NeuroMorpho.Org (web client, web server, and relational database) encompasses all necessary elements to support a large-scale, integrate-able repository. The data content, while heterogeneous in scientific scope and experimental origin, is unified in format and presentation by an in-house standardization protocol. The server application (MRALD) is secure, customizable, and developer-friendly. Centralized processing and expert annotation yield a comprehensive set of metadata that enriches and complements the raw data. The thoroughly tested interface design allows for optimal and effective data search and retrieval. Availability of data in both original and standardized formats ensures compatibility with existing resources and fosters further tool development. Other key functions enable extensive exploration and discovery, including 3D and interactive visualization of branching, frequently measured morphometrics, and reciprocal links to the original PubMed publications.
The integration of NeuroMorpho.Org with version-1 of the NIF (NIFv1) provides the opportunity to access morphological data in the context of other relevant resources and diverse subdomains of neuroscience, opening exciting new possibilities in data mining and knowledge discovery. The outcome of such coordination is the rapid and powerful advancement of neuroscience research at both the conceptual and technological level.
Affiliation(s)
- Maryam Halavi
  - Center for Neural Informatics, Structure, & Plasticity, and Molecular Neuroscience Department, Krasnow Institute for Advanced Study, George Mason University, Fairfax, VA, USA