1
Abstract
Brain scientists are now capable of collecting more data in a single experiment than researchers a generation ago might have collected over an entire career. Indeed, the brain itself seems to thirst for more and more data. Such digital information not only comprises individual studies but is also increasingly shared and made openly available for secondary, confirmatory, and/or combined analyses. Numerous web resources now host data spanning spatiotemporal scales. Data processing workflow technologies running on cloud-enabled computing infrastructures allow for large-scale processing. Such a move toward greater openness is fundamentally changing how brain science results are communicated and linked to available raw data and processed results. Ethical, professional, and motivational issues challenge the wholesale commitment to data-driven neuroscience. Nevertheless, fueled by government investments in primary brain data collection, coupled with increased sharing and community pressure challenging the dominant publishing model, large-scale brain and data science is here to stay.
Affiliation(s)
- John Darrell Van Horn
- Department of Psychology, University of Virginia, Charlottesville, Virginia, USA
- School of Data Science, University of Virginia, Charlottesville, Virginia, USA
2
Ieong PU, Sørensen J, Vemu PL, Wong CW, Demir Ö, Williams NP, Wang J, Crawl D, Swift RV, Malmstrom RD, Altintas I, Amaro RE. Progress towards automated Kepler scientific workflows for computer-aided drug discovery and molecular simulations. Procedia Computer Science 2014; 29:1745-1755. [PMID: 29399238 DOI: 10.1016/j.procs.2014.05.159]
Abstract
We describe the development of automated workflows that support computer-aided drug discovery (CADD) and molecular dynamics (MD) simulations and are included as part of the National Biomedical Computation Resource (NBCR). The main workflow components include: file-management tasks, ligand force field parameterization, receptor-ligand MD simulations, job submission and monitoring on relevant high-performance computing (HPC) resources, receptor structural clustering, virtual screening (VS), and statistical analyses of the VS results. The workflows aim to standardize simulation and analysis and promote best practices within the molecular simulation and CADD communities. Each component is developed as a stand-alone workflow, which allows easy integration into larger frameworks built to suit user needs, while remaining intuitive and easy to extend.
Affiliation(s)
- Pek U Ieong
- Department of Chemistry and Biochemistry, University of California San Diego, 9500 Gilman Drive, MC 0340, La Jolla, CA 92093, USA; National Biomedical Computation Resource, University of California San Diego, 9500 Gilman Drive, MC 0340, La Jolla, CA 92093, USA
- Jesper Sørensen
- Department of Chemistry and Biochemistry, University of California San Diego, 9500 Gilman Drive, MC 0340, La Jolla, CA 92093, USA
- Prasantha L Vemu
- Department of Chemistry and Biochemistry, University of California San Diego, 9500 Gilman Drive, MC 0340, La Jolla, CA 92093, USA
- Celia W Wong
- Department of Chemistry and Biochemistry, University of California San Diego, 9500 Gilman Drive, MC 0340, La Jolla, CA 92093, USA
- Özlem Demir
- Department of Chemistry and Biochemistry, University of California San Diego, 9500 Gilman Drive, MC 0340, La Jolla, CA 92093, USA
- Nadya P Williams
- San Diego Supercomputer Center, University of California San Diego, 9500 Gilman Drive, MC 0340, La Jolla, CA 92093, USA; National Biomedical Computation Resource, University of California San Diego, 9500 Gilman Drive, MC 0340, La Jolla, CA 92093, USA
- Jianwu Wang
- San Diego Supercomputer Center, University of California San Diego, 9500 Gilman Drive, MC 0340, La Jolla, CA 92093, USA
- Daniel Crawl
- San Diego Supercomputer Center, University of California San Diego, 9500 Gilman Drive, MC 0340, La Jolla, CA 92093, USA; National Biomedical Computation Resource, University of California San Diego, 9500 Gilman Drive, MC 0340, La Jolla, CA 92093, USA
- Robert V Swift
- Department of Chemistry and Biochemistry, University of California San Diego, 9500 Gilman Drive, MC 0340, La Jolla, CA 92093, USA
- Robert D Malmstrom
- Department of Chemistry and Biochemistry, University of California San Diego, 9500 Gilman Drive, MC 0340, La Jolla, CA 92093, USA; National Biomedical Computation Resource, University of California San Diego, 9500 Gilman Drive, MC 0340, La Jolla, CA 92093, USA
- Ilkay Altintas
- San Diego Supercomputer Center, University of California San Diego, 9500 Gilman Drive, MC 0340, La Jolla, CA 92093, USA; National Biomedical Computation Resource, University of California San Diego, 9500 Gilman Drive, MC 0340, La Jolla, CA 92093, USA
- Rommie E Amaro
- Department of Chemistry and Biochemistry, University of California San Diego, 9500 Gilman Drive, MC 0340, La Jolla, CA 92093, USA; National Biomedical Computation Resource, University of California San Diego, 9500 Gilman Drive, MC 0340, La Jolla, CA 92093, USA
3
A survey of the neuroscience resource landscape: perspectives from the neuroscience information framework. International Review of Neurobiology 2013. [PMID: 23195120 DOI: 10.1016/b978-0-12-388408-4.00003-4]
Abstract
The number of neuroscience resources (databases, tools, materials, and networks) available via the Web continues to expand, particularly in light of newly implemented data sharing policies required by funding agencies and journals. However, the nature of dense, multifaceted neuroscience data and the design of classic search engine systems make efficient, reliable, and relevant discovery of such resources a significant challenge. This challenge is especially pertinent for online databases, whose dynamic content is largely opaque to contemporary search engines. The Neuroscience Information Framework (NIF) was initiated to address this problem of finding and utilizing neuroscience-relevant resources. Since its first production release in 2008, NIF has been surveying the resource landscape for the neurosciences, identifying relevant resources and working to make them easily discoverable by the neuroscience community. In this chapter, we provide a survey of the resource landscape for neuroscience: what types of resources are available, how many there are, what they contain, and, most importantly, ways in which these resources can be utilized by the research community to advance neuroscience research.