1
Reiter T, Brooks† PT, Irber† L, Joslin† SEK, Reid† CM, Scott† C, Brown CT, Pierce-Ward NT. Streamlining data-intensive biology with workflow systems. Gigascience 2021; 10:giaa140. [PMID: 33438730 PMCID: PMC8631065 DOI: 10.1093/gigascience/giaa140] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 08/16/2020] [Revised: 11/06/2020] [Accepted: 11/13/2020] [Indexed: 11/14/2022] [Open Access]
Abstract
As the scale of biological data generation has increased, the bottleneck of research has shifted from data generation to analysis. Researchers commonly need to build computational workflows that include multiple analytic tools and require incremental development as experimental insights demand tool and parameter modifications. These workflows can produce hundreds to thousands of intermediate files and results that must be integrated for biological insight. Data-centric workflow systems that internally manage computational resources, software, and conditional execution of analysis steps are reshaping the landscape of biological data analysis and empowering researchers to conduct reproducible analyses at scale. Adoption of these tools can facilitate and expedite robust data analysis, but knowledge of these techniques is still lacking. Here, we provide a series of strategies for leveraging workflow systems with structured project, data, and resource management to streamline large-scale biological analysis. We present these practices in the context of high-throughput sequencing data analysis, but the principles are broadly applicable to biologists working beyond this field.
Affiliation(s)
- Taylor Reiter
  - Department of Population Health and Reproduction, University of California, Davis, 1 Shields Avenue, Davis, CA 95616, USA
- Phillip T Brooks†
  - Department of Population Health and Reproduction, University of California, Davis, 1 Shields Avenue, Davis, CA 95616, USA
- Luiz Irber†
  - Department of Population Health and Reproduction, University of California, Davis, 1 Shields Avenue, Davis, CA 95616, USA
- Shannon E K Joslin†
  - Department of Animal Science, University of California, Davis, 1 Shields Avenue, Davis, CA 95616, USA
- Charles M Reid†
  - Department of Population Health and Reproduction, University of California, Davis, 1 Shields Avenue, Davis, CA 95616, USA
- Camille Scott†
  - Department of Population Health and Reproduction, University of California, Davis, 1 Shields Avenue, Davis, CA 95616, USA
- C Titus Brown
  - Department of Population Health and Reproduction, University of California, Davis, 1 Shields Avenue, Davis, CA 95616, USA
- N Tessa Pierce-Ward
  - Department of Population Health and Reproduction, University of California, Davis, 1 Shields Avenue, Davis, CA 95616, USA
2
Bokulich NA, Ziemski M, Robeson MS, Kaehler BD. Measuring the microbiome: Best practices for developing and benchmarking microbiomics methods. Comput Struct Biotechnol J 2020; 18:4048-4062. [PMID: 33363701 PMCID: PMC7744638 DOI: 10.1016/j.csbj.2020.11.049] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Received: 09/25/2020] [Revised: 11/27/2020] [Accepted: 11/28/2020] [Indexed: 12/12/2022] [Open Access]
Abstract
Microbiomes are integral components of diverse ecosystems, and increasingly recognized for their roles in the health of humans, animals, plants, and other hosts. Given their complexity (both in composition and function), the effective study of microbiomes (microbiomics) relies on the development, optimization, and validation of computational methods for analyzing microbial datasets, such as from marker-gene (e.g., 16S rRNA gene) and metagenome data. This review describes best practices for benchmarking and implementing computational methods (and software) for studying microbiomes, with particular focus on unique characteristics of microbiomes and microbiomics data that should be taken into account when designing and testing microbiomics methods.
Affiliation(s)
- Nicholas A. Bokulich
  - Laboratory of Food Systems Biotechnology, Institute of Food, Nutrition, and Health, ETH Zurich, Switzerland
- Michal Ziemski
  - Laboratory of Food Systems Biotechnology, Institute of Food, Nutrition, and Health, ETH Zurich, Switzerland
- Michael S. Robeson
  - University of Arkansas for Medical Sciences, Department of Biomedical Informatics, Little Rock, AR, USA
3
Abstract
This paper places observational studies of women’s work in historical perspective. We present some of the very early studies (carried out in the period from 1900 to 1930), as well as several examples of fieldwork-based studies of women’s work, undertaken from different perspectives and in varied locations between the 1960s and the mid 1990s. We outline and discuss several areas of thought which have influenced studies of women’s work – the automation debate; the focus on the skills women need in their work; labour market segregation; women’s health; and technology and the redesign of work – and the research methods they used. Our main motivation in this paper is threefold: to demonstrate how fieldwork-based studies which have focussed on women’s work have attempted to locate women’s work in a larger context that addresses its visibility and value; to provide a thematic historiography of studies of women’s work, thereby also demonstrating the value of an historical perspective, and a means through which to link it to contemporary themes; and to increase awareness of varied methodological perspectives on how to study work.
4
Cereceda O, Quinn DE. A graduate student perspective on overcoming barriers to interacting with open-source software. Facets (Ott) 2020. [DOI: 10.1139/facets-2019-0020] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/22/2022] [Open Access]
Abstract
Computational methods, coding, and software are important tools for conducting research. In both academic and industry data analytics, open-source software (OSS) has gained massive popularity. Collaborative source code allows students to interact with researchers, code developers, and users from a variety of disciplines. Based on the authors’ experiences as graduate students and coding instructors, this paper provides a unique overview of the obstacles that graduate students face in obtaining the knowledge and skills required to complete their research and in transitioning from an OSS user to a contributor: psychological, practical, and cultural barriers and challenges specific to graduate students including cognitive load in graduate school, the importance of a knowledgeable mentor, seeking help from both the online and local communities, and the ongoing campaign to recognize software as research output in career and degree progression. Specific and practical steps are recommended to provide a foundation for graduate students, supervisors, administrators, and members of the OSS community to help overcome these obstacles. In conclusion, the objective of these recommendations is to describe a possible framework that individuals from across the scientific community can adapt to their needs and facilitate a sustainable feedback loop between graduate students and OSS.
Affiliation(s)
- Oihane Cereceda
  - Faculty of Engineering and Applied Science, Memorial University of Newfoundland, St. John’s, NL A1C 5S7, Canada
- Danielle E.A. Quinn
  - Faculty of Science, Memorial University of Newfoundland, St. John’s, NL A1C 5S7, Canada
5
Ram K, Boettiger C, Chamberlain S, Ross N, Salmon M, Butland S. A Community of Practice Around Peer Review for Long-Term Research Software Sustainability. Comput Sci Eng 2019. [DOI: 10.1109/mcse.2018.2882753] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 11/09/2022]