1. Wogu E, Ogoh G, Filima P, Nsaanee B, Caron B, Pestilli F, Eke D. FAIR African brain data: challenges and opportunities. Front Neuroinform 2025; 19:1530445. [PMID: 40098921; PMCID: PMC11911527; DOI: 10.3389/fninf.2025.1530445] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2024] [Accepted: 02/12/2025] [Indexed: 03/19/2025] Open
Abstract
Introduction: The effectiveness of research and innovation often relies on the diversity or heterogeneity of datasets that are Findable, Accessible, Interoperable and Reusable (FAIR). However, the global landscape of brain data is yet to achieve desired levels of diversity that can facilitate generalisable outputs. Brain datasets from low- and middle-income countries of Africa are still missing in the global open science ecosystem. This can mean that decades of brain research and innovation may not be generalisable to populations in Africa. Methods: This research combined experiential learning or experiential research with a survey questionnaire. The experiential research involved deriving insights from direct, hands-on experiences of collecting African brain data with a view to making it FAIR. This was a critical process of action, reflection, and learning from doing data collection. A questionnaire was then used to validate the findings from the experiential research and provide wider contexts for these findings. Results: The experiential research revealed major challenges to FAIR African brain data that can be categorised as socio-cultural, economic, technical, ethical and legal challenges. It also highlighted opportunities for growth that include capacity development, development of technical infrastructure, funding as well as policy and regulatory changes. The questionnaire then showed that the wider African neuroscience community believes that these challenges can be ranked in order of priority as follows: technical, economic, socio-cultural, and ethical and legal challenges. Conclusion: We conclude that African researchers need to work together as a community to address these challenges in a way that maximises efforts and builds a thriving FAIR brain data ecosystem that is socially acceptable, ethically responsible, technically robust and legally compliant.
Affiliation(s)
- Eberechi Wogu
  - Department of Anatomy, University of Port Harcourt, Port Harcourt, Nigeria
- George Ogoh
  - School of Computer Science, University of Nottingham, Nottingham, United Kingdom
- Patrick Filima
  - Department of Anatomy, University of Port Harcourt, Port Harcourt, Nigeria
- Barisua Nsaanee
  - Department of Anatomy, University of Port Harcourt, Port Harcourt, Nigeria
- Bradley Caron
  - Department of Psychology and Neuroscience, The University of Texas at Austin, Austin, TX, United States
- Franco Pestilli
  - Department of Psychology and Neuroscience, The University of Texas at Austin, Austin, TX, United States
- Damian Eke
  - School of Computer Science, University of Nottingham, Nottingham, United Kingdom

2. Clark T, Caufield H, Parker JA, Al Manir S, Amorim E, Eddy J, Gim N, Gow B, Goar W, Haendel M, Hansen JN, Harris N, Hermjakob H, McWeeney SK, Nebeker C, Nikolov M, Shaffer J, Sheffield N, Sheynkman G, Stevenson J, Mungall C, Chen JY, Wagner A, Kong SW, Ghosh SS, Patel B, Williams A, Munoz-Torres MC. AI-readiness for Biomedical Data: Bridge2AI Recommendations. bioRxiv 2024:2024.10.23.619844. [PMID: 39484409; PMCID: PMC11526931; DOI: 10.1101/2024.10.23.619844] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2024]
Abstract
Biomedical research and clinical practice are in the midst of a transition toward significantly increased use of artificial intelligence (AI) and machine learning (ML) methods. These advances promise to enable qualitatively deeper insight into complex challenges formerly beyond the reach of analytic methods and human intuition while placing increased demands on ethical and explainable artificial intelligence (XAI), given the opaque nature of many deep learning methods. The U.S. National Institutes of Health (NIH) has initiated a significant research and development program, Bridge2AI, aimed at producing new "flagship" datasets designed to support AI/ML analysis of complex biomedical challenges, elucidate best practices, develop tools and standards in AI/ML data science, and disseminate these datasets, tools, and methods broadly to the biomedical community. An essential set of concepts to be developed and disseminated in this program along with the data and tools produced are criteria for AI-readiness of data, including critical considerations for XAI and ethical, legal, and social implications (ELSI) of AI technologies. NIH Bridge to Artificial Intelligence (Bridge2AI) Standards Working Group members prepared this article to present methods for assessing the AI-readiness of biomedical data and the data standards perspectives and criteria we have developed throughout this program. While the field is rapidly evolving, these criteria are foundational for scientific rigor and the ethical design and application of biomedical AI methods.

3. Leigh DM, Vandergast AG, Hunter ME, Crandall ED, Funk WC, Garroway CJ, Hoban S, Oyler-McCance SJ, Rellstab C, Segelbacher G, Schmidt C, Vázquez-Domínguez E, Paz-Vinas I. Best practices for genetic and genomic data archiving. Nat Ecol Evol 2024; 8:1224-1232. [PMID: 38789640; DOI: 10.1038/s41559-024-02423-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2023] [Accepted: 04/25/2024] [Indexed: 05/26/2024]
Abstract
Genetic and genomic data are collected for a vast array of scientific and applied purposes. Despite mandates for public archiving, data are typically used only by the generating authors. The reuse of genetic and genomic datasets remains uncommon because it is difficult, if not impossible, due to non-standard archiving practices and lack of contextual metadata. But as the new field of macrogenetics is demonstrating, if genetic data and their metadata were more accessible and FAIR (findable, accessible, interoperable and reusable) compliant, they could be reused for many additional purposes. We discuss the main challenges with existing genetic and genomic data archives, and suggest best practices for archiving genetic and genomic data. Recognizing that this is a longstanding issue due to little formal data management training within the fields of ecology and evolution, we highlight steps that research institutions and publishers could take to improve data archiving.
Affiliation(s)
- Deborah M Leigh
  - Swiss Federal Research Institute WSL, Birmensdorf, Switzerland
- Amy G Vandergast
  - US Geological Survey, Western Ecological Research Center, San Diego, CA, USA
- Margaret E Hunter
  - US Geological Survey, Wetland & Aquatic Research Center, Gainesville, FL, USA
- Eric D Crandall
  - Department of Biology, Pennsylvania State University, University Park, PA, USA
- W Chris Funk
  - Department of Biology, Graduate Degree Program in Ecology, Colorado State University, Fort Collins, CO, USA
- Colin J Garroway
  - Department of Biological Sciences, University of Manitoba, Winnipeg, Manitoba, Canada
- Sean Hoban
  - Center for Tree Science, The Morton Arboretum, Lisle, IL, USA
- Chloé Schmidt
  - German Centre for Integrative Biodiversity Research Halle-Jena-Leipzig, Leipzig, Germany
- Ella Vázquez-Domínguez
  - Departamento de Ecología de la Biodiversidad, Instituto de Ecología, Universidad Nacional Autónoma de México, Coyoacán, Ciudad de México, México
- Ivan Paz-Vinas
  - Department of Biology, Graduate Degree Program in Ecology, Colorado State University, Fort Collins, CO, USA
  - Universite Claude Bernard Lyon 1, LEHNA UMR 5023, CNRS, ENTPE, Villeurbanne, France

4. Tranfield EM, Lippens S. Future proofing core facilities with a seven-pillar model. J Microsc 2024; 294:411-419. [PMID: 38700841; DOI: 10.1111/jmi.13314] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2024] [Revised: 04/23/2024] [Accepted: 04/23/2024] [Indexed: 05/21/2024]
Abstract
Centralised core facilities have evolved into vital components of life science research, transitioning from a primary focus on centralising equipment to ensuring access to technology experts across all facets of an experimental workflow. Herein, we put forward a seven-pillar model to define what a core facility needs to meet its overarching goal of facilitating research. The seven equally weighted pillars are Technology, Core Facility Team, Training, Career Tracks, Technical Support, Community and Transparency. These seven pillars stand on a solid foundation of cultural, operational and framework policies including the elements of transparent and stable funding strategies, modern human resources support, progressive facility leadership and management as well as clear institute strategies and policies. This foundation, among other things, ensures a tight alignment of the core facilities to the vision and mission of the institute. To future-proof core facilities, it is crucial to foster all seven of these pillars, particularly focusing on newly identified pillars such as career tracks, thus enabling core facilities to continue supporting research and catalysing scientific advancement. Lay abstract: In research, there is a growing trend to bring advanced, high-performance equipment together into a centralised location. This is done to streamline how the equipment purchase is financed, how the equipment is maintained, and to enable an easier approach for research scientists to access these tools in a location that is supported by a team of technology experts who can help scientists use the equipment. These centralised equipment centres are called Core Facilities. The core facility model is relatively new in science and it requires an adapted approach to how core facilities are built and managed. In this paper, we put forward a seven-pillar model of the important supporting elements of core facilities. 
These supporting elements are: Technology (the instruments themselves), Core Facility Team (the technology experts who operate the instruments), Training (of the staff and research community), Career Tracks (for the core facility staff), Technical Support (the process of providing help to apply the technology to a scientific question), Community (of research scientists, technology experts and developers) and Transparency (of how the core facility works and the costs associated with using the service). These pillars stand on the bigger foundation of clear policies, guidelines, and leadership approaches at the institutional level. With a focus on these elements, the authors feel core facilities will be well positioned to support scientific discovery in the future.
Affiliation(s)
- Erin M Tranfield
  - VIB Bioimaging Core Ghent, VIB, Zwijnaarde, Belgium
  - VIB Center for Inflammation Research, Ghent University, Zwijnaarde, Belgium

5. Steffens S, Schröder K, Krüger M, Maack C, Streckfuss-Bömeke K, Backs J, Backofen R, Baeßler B, Devaux Y, Gilsbach R, Heijman J, Knaus J, Kramann R, Linz D, Lister AL, Maatz H, Maegdefessel L, Mayr M, Meder B, Nussbeck SY, Rog-Zielinska EA, Schulz MH, Sickmann A, Yigit G, Kohl P. The challenges of research data management in cardiovascular science: a DGK and DZHK position paper-executive summary. Clin Res Cardiol 2024; 113:672-679. [PMID: 37847314; PMCID: PMC11026239; DOI: 10.1007/s00392-023-02303-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/07/2023] [Accepted: 09/01/2023] [Indexed: 10/18/2023]
Abstract
The sharing and documentation of cardiovascular research data are essential for efficient use and reuse of data, thereby aiding scientific transparency, accelerating the progress of cardiovascular research and healthcare, and contributing to the reproducibility of research results. However, challenges remain. This position paper, written on behalf of and approved by the German Cardiac Society and German Centre for Cardiovascular Research, summarizes our current understanding of the challenges in cardiovascular research data management (RDM). These challenges include lack of time, awareness, incentives, and funding for implementing effective RDM; lack of standardization in RDM processes; a need to better identify meaningful and actionable data among the increasing volume and complexity of data being acquired; and a lack of understanding of the legal aspects of data sharing. While several tools exist to increase the degree to which data are findable, accessible, interoperable, and reusable (FAIR), more work is needed to lower the threshold for effective RDM not just in cardiovascular research but in all biomedical research, with data sharing and reuse being factored in at every stage of the scientific process. A culture of open science with FAIR research data should be fostered through education and training of early-career and established research professionals. Ultimately, FAIR RDM requires permanent, long-term effort at all levels. If outcomes can be shown to be superior and to promote better (and better value) science, modern RDM will make a positive difference to cardiovascular science and practice. The full position paper is available in the supplementary materials.
Affiliation(s)
- Sabine Steffens
  - Institute for Cardiovascular Prevention (IPEK), Ludwig-Maximilians-Universität, Munich, Germany
  - DZHK (German Centre for Cardiovascular Research), Partner Site Munich Heart Alliance, Munich, Germany
- Katrin Schröder
  - Institute for Cardiovascular Physiology, Goethe University, Frankfurt am Main, Germany
  - DZHK (German Centre for Cardiovascular Research), Partner Site RheinMain, Frankfurt, Germany
- Martina Krüger
  - Institute of Cardiovascular Physiology, University Hospital Düsseldorf, Düsseldorf, Germany
  - Cardiovascular Research Institute Düsseldorf (CARID), Düsseldorf, Germany
- Christoph Maack
  - Comprehensive Heart Failure Center (CHFC), University Clinic Würzburg, Würzburg, Germany
  - Medical Clinic 1, University Clinic Würzburg, Würzburg, Germany
- Katrin Streckfuss-Bömeke
  - Clinic for Cardiology and Pneumology, Georg-August University Göttingen, Göttingen, Germany
  - DZHK (German Center for Cardiovascular Research), Partner Site Göttingen, Göttingen, Germany
  - Institute of Pharmacology and Toxicology, University of Würzburg, Würzburg, Germany
- Johannes Backs
  - Institute of Experimental Cardiology, University Hospital Heidelberg, Heidelberg, Germany
  - DZHK (German Center for Cardiovascular Research), Partner Site Heidelberg/Mannheim, Heidelberg, Germany
- Rolf Backofen
  - Faculty of Medicine, Institute for Experimental and Clinical Pharmacology and Toxicology, Albert-Ludwigs-University, Freiburg, Germany
- Bettina Baeßler
  - Department of Diagnostic and Interventional Radiology, University Hospital Würzburg, Würzburg, Germany
- Yvan Devaux
  - Cardiovascular Research Unit, Department of Precision Health, Luxembourg Institute of Health, Strassen, Luxembourg
- Ralf Gilsbach
  - Institute of Experimental Cardiology, University Hospital Heidelberg, Heidelberg, Germany
  - DZHK (German Center for Cardiovascular Research), Partner Site Heidelberg/Mannheim, Heidelberg, Germany
- Jordi Heijman
  - Department of Cardiology, CARIM School for Cardiovascular Diseases, Maastricht University, Maastricht, The Netherlands
- Jochen Knaus
  - Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center, University of Freiburg, Freiburg, Germany
- Rafael Kramann
  - Institute of Experimental Medicine and Systems Biology, RWTH Aachen Medical Faculty, Aachen, Germany
  - Department of Nephrology and Clinical Immunology, RWTH Aachen Medical Faculty, Aachen, Germany
  - Department of Internal Medicine, Nephrology and Transplantation, Erasmus MC, Rotterdam, The Netherlands
- Dominik Linz
  - Department of Cardiology, Maastricht University Medical Centre and Cardiovascular Research Institute Maastricht, Maastricht, The Netherlands
  - Department of Biomedical Sciences, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Allyson L Lister
  - Oxford E-Research Centre (OeRC), Department of Engineering Science, University of Oxford, Oxford, UK
- Henrike Maatz
  - Max Delbrück Center for Molecular Medicine in the Helmholtz Association, Berlin, Germany
  - DZHK (German Centre for Cardiovascular Research), Partner Site Berlin, Berlin, Germany
- Lars Maegdefessel
  - DZHK (German Centre for Cardiovascular Research), Partner Site Munich Heart Alliance, Munich, Germany
  - Department for Vascular and Endovascular Surgery, Klinikum rechts der Isar, Technical University Munich, Munich, Germany
  - Department of Medicine, Karolinska Institute, Stockholm, Sweden
- Manuel Mayr
  - School of Cardiovascular Medicine and Sciences, King's College London British Heart Foundation Centre, London, UK
  - Division of Cardiology, Department of Internal Medicine II, Medical University of Vienna, Vienna, Austria
- Benjamin Meder
  - DZHK (German Center for Cardiovascular Research), Partner Site Heidelberg/Mannheim, Heidelberg, Germany
  - Department of Internal Medicine III (Cardiology, Angiology, and Pneumology), University Hospital Heidelberg, Heidelberg, Germany
- Sara Y Nussbeck
  - Department of Medical Informatics, University Medical Center Göttingen (UMG), Göttingen, Germany
  - Central Biobank UMG, UMG, Göttingen, Germany
- Eva A Rog-Zielinska
  - Institute for Experimental Cardiovascular Medicine, University Heart Center Freiburg-Bad Krozingen, University of Freiburg, Freiburg, Germany
  - Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Marcel H Schulz
  - DZHK (German Centre for Cardiovascular Research), Partner Site RheinMain, Frankfurt, Germany
  - Institute of Cardiovascular Regeneration, Goethe University, Frankfurt, Germany
- Albert Sickmann
  - Leibniz-Institut für Analytische Wissenschaften, ISAS, e.V., Dortmund, Germany
  - Department of Chemistry, College of Physical Sciences, University of Aberdeen, Aberdeen, UK
  - Institute for Virology, University Hospital Essen, University of Duisburg-Essen, Essen, Germany
- Gökhan Yigit
  - Institute of Human Genetics, University Medical Center Göttingen, Göttingen, Germany
  - German Center of Cardiovascular Research (DZHK), Partner Site Göttingen, Göttingen, Germany
- Peter Kohl
  - Institute for Experimental Cardiovascular Medicine, University Heart Center Freiburg-Bad Krozingen, University of Freiburg, Freiburg, Germany
  - Faculty of Medicine, University of Freiburg, Freiburg, Germany
  - CIBSS Centre for Integrative Biological Signalling Studies, University of Freiburg, Freiburg, Germany

6. Ross KE, Bastian FB, Buys M, Cook CE, D’Eustachio P, Harrison M, Hermjakob H, Li D, Lord P, Natale DA, Peters B, Sternberg PW, Su AI, Thakur M, Thomas PD, Bateman A. Perspectives on tracking data reuse across biodata resources. Bioinformatics Advances 2024; 4:vbae057. [PMID: 38721398; PMCID: PMC11076920; DOI: 10.1093/bioadv/vbae057] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2024] [Revised: 03/13/2024] [Accepted: 04/11/2024] [Indexed: 06/14/2024]
Abstract
Motivation: Data reuse is a common and vital practice in molecular biology and enables the knowledge gathered over recent decades to drive discovery and innovation in the life sciences. Much of this knowledge has been collated into molecular biology databases, such as UniProtKB, and these resources derive enormous value from sharing data among themselves. However, quantifying and documenting this kind of data reuse remains a challenge. Results: The article reports on a one-day virtual workshop hosted by the UniProt Consortium in March 2023, attended by representatives from biodata resources, experts in data management, and NIH program managers. Workshop discussions focused on strategies for tracking data reuse, best practices for reusing data, and the challenges associated with data reuse and tracking. Surveys and discussions showed that data reuse is widespread, but critical information for reproducibility is sometimes lacking. Challenges include costs of tracking data reuse, tensions between tracking data and open sharing, restrictive licenses, and difficulties in tracking commercial data use. Recommendations that emerged from the discussion include: development of standardized formats for documenting data reuse, education about the obstacles posed by restrictive licenses, and continued recognition by funding agencies that data management is a critical activity that requires dedicated resources. Availability and implementation: Summaries of survey results are available at: https://docs.google.com/forms/d/1j-VU2ifEKb9C-sW6l3ATB79dgHdRk5v_lESv2hawnso/viewanalytics (survey of data providers) and https://docs.google.com/forms/d/18WbJFutUd7qiZoEzbOytFYXSfWFT61hVce0vjvIwIjk/viewanalytics (survey of users).
Affiliation(s)
- Karen E Ross
  - Protein Information Resource, Department of Biochemistry and Molecular & Cellular Biology, Georgetown University Medical Center, Washington, DC 20007, United States
- Frederic B Bastian
  - Evolutionary Bioinformatics Group, SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
  - Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland
- Peter D’Eustachio
  - Department of Biochemistry & Molecular Pharmacology, NYU Grossman School of Medicine, New York, NY 10012, United States
- Melissa Harrison
  - Literature Services, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, United Kingdom
- Henning Hermjakob
  - Molecular Systems, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, United Kingdom
- Donghui Li
  - Chan Zuckerberg Initiative, Redwood City, CA 94063, United States
- Phillip Lord
  - School of Computing, Newcastle University, Newcastle upon Tyne NE4 5TG, United Kingdom
- Darren A Natale
  - Protein Information Resource, Department of Biochemistry and Molecular & Cellular Biology, Georgetown University Medical Center, Washington, DC 20007, United States
- Bjoern Peters
  - Center for Vaccine Innovation, La Jolla Institute of Immunology, La Jolla, CA 92037, United States
- Paul W Sternberg
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, United States
- Andrew I Su
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, United States
- Matthew Thakur
  - Data Services, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SA, United Kingdom
- Paul D Thomas
  - Department of Population and Public Health Sciences, University of Southern California, Los Angeles, CA 90089, United States
- Alex Bateman
  - MSCB, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, United Kingdom

7. Forester S, Jennings-Dobbs E, Burton-Freeman B. Development of a Comprehensive Food Data Citation Standard: A Surprising Gap in the Nutrition Research Literature. Curr Dev Nutr 2024; 8:102048. [PMID: 38156342; PMCID: PMC10751823; DOI: 10.1016/j.cdnut.2023.102048] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2023] [Revised: 11/06/2023] [Accepted: 11/20/2023] [Indexed: 12/30/2023] Open
Abstract
Currently, there is no standard for the citation of food composition data. This leads to the questions: how are food and nutrient data cited in research papers, and are they presented in a way that allows studies to be reproduced? To answer these questions, we performed a review of the literature and quantified the accuracy and completeness of data citations from publications (January to December 2020) in the top 5 nutrition journals as ranked by the Scimago Journal Rankings. We then performed a review of citation guidelines currently in place in other disciplines. Similar to the requirement of completing the Preferred Reporting Items for Systematic Reviews and Meta-Analyses checklist for systematic reviews, we have developed a comprehensive data citation checklist, the Comprehensive Food Data Citation (CFDC) checklist. The CFDC checklist was developed through a benchmarking assessment against established data citation standards. Its purpose is to establish a standardized, best-practice approach for reporting food composition data. The CFDC checklist has been designed to cater to both publishers and authors, ensuring consistency and accuracy in food composition data reporting. The CFDC checklist is also available as an interactive citation generator to facilitate the adoption of consistent and comprehensive citation of food composition data and is available at https://www.nutrientinstitute.org/cfdc. Despite general agreement that accurate data citation is paramount, this is the first citation standard specifically developed to capture food composition data. Because food composition data are the foundation of nutrition research, our proposed guidelines aim to provide the field with a much-needed foundation for acknowledging and sharing data in a way that fosters reproducibility.
Affiliation(s)
- Shavawn Forester
  - Nutrient Institute, a 501(c)(3) not-for-profit organization, Reno, NV, United States
- Emily Jennings-Dobbs
  - Nutrient Institute, a 501(c)(3) not-for-profit organization, Reno, NV, United States
- Britt Burton-Freeman
  - Department of Food Science and Nutrition, Illinois Institute of Technology, Chicago, IL, United States

8. Kiermer V. Authorship practices must evolve to support collaboration and open science. PLoS Biol 2023; 21:e3002364. [PMID: 37831717; PMCID: PMC10599500; DOI: 10.1371/journal.pbio.3002364] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Revised: 10/25/2023] [Indexed: 10/15/2023] Open
Abstract
Journal authorship practices have not sufficiently evolved to reflect the way research is now done. Improvements to support teams, collaboration, and open science are urgently needed.
Affiliation(s)
- Veronique Kiermer
  - Public Library of Science, San Francisco, California, United States of America

9. Stall S, Bilder G, Cannon M, Chue Hong N, Edmunds S, Erdmann CC, Evans M, Farmer R, Feeney P, Friedman M, Giampoala M, Hanson RB, Harrison M, Karaiskos D, Katz DS, Letizia V, Lizzi V, MacCallum C, Muench A, Perry K, Ratner H, Schindler U, Sedora B, Stockhause M, Townsend R, Yeston J, Clark T. Journal Production Guidance for Software and Data Citations. Sci Data 2023; 10:656. [PMID: 37752153; PMCID: PMC10522580; DOI: 10.1038/s41597-023-02491-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2023] [Accepted: 08/16/2023] [Indexed: 09/28/2023] Open
Affiliation(s)
- Shelley Stall
  - American Geophysical Union, 2000 Florida Ave. NW, Washington, DC, 20009, USA
- Matthew Giampoala
  - American Geophysical Union, 2000 Florida Ave. NW, Washington, DC, 20009, USA
- R Brooks Hanson
  - American Geophysical Union, 2000 Florida Ave. NW, Washington, DC, 20009, USA
- Daniel S Katz
  - University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
- August Muench
  - American Astronomical Society, Washington, DC, 20006, USA
- Howard Ratner
  - CHORUS, 72 Dreyer Avenue, Staten Island, NY, 10314, USA
- Brian Sedora
  - American Geophysical Union, 2000 Florida Ave. NW, Washington, DC, 20009, USA
- Randy Townsend
  - PLOS, 1265 Battery Street, San Francisco, CA, 94111, USA
- Jake Yeston
  - AAAS, 1200 New York Ave NW, Washington, DC, 20005, USA
- Timothy Clark
  - University of Virginia, Charlottesville, VA, 22904, USA

10. Of data and transparency. Nature Computational Science 2023; 3:571. [PMID: 38177745; DOI: 10.1038/s43588-023-00499-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/06/2024]

11. Tomaszewski R. Visibility, impact, and applications of bibliometric software tools through citation analysis. Scientometrics 2023; 128:4007-4028. [PMID: 37287881; PMCID: PMC10234239; DOI: 10.1007/s11192-023-04725-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/26/2022] [Accepted: 04/21/2023] [Indexed: 06/09/2023]
Abstract
This study examines the visibility, impact, and applications of bibliometric software tools in the peer-reviewed literature through a "Cited Reference Search" using the Web of Science (WOS) database. A total of 2882 citing research articles to eight bibliometric software tools were extracted from the WOS Core Collection between 2010 and 2021. These citing articles are analyzed by publication year, country, publication title, publisher, open access level, funding agency, and WOS category. Mentions of bibliometric software tools in Author Keywords and KeyWords Plus are also compared. The VOSviewer software is utilized to identify specific research areas by discipline from the keyword co-occurrences of the citing articles. The findings reveal that while bibliometric software tools are making a noteworthy impact and contribution to research, their visibility through referencing, Author Keywords, and KeyWords Plus is limited. This study serves as a clarion call to raise awareness and initiate discussions on the citing practices of software tools in scholarly publications.
Affiliation(s)
- Robert Tomaszewski
  - California State University, Fullerton, 800 North State College Blvd, Fullerton, CA 92831, USA

12. Melero R, Boté‐Vericad J, López‐Borrull A. Perceptions regarding open science appraised by editors of scholarly publications published in Spain. Learned Publishing 2022. [DOI: 10.1002/leap.1511] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/30/2022]
Affiliation(s)
- Remedios Melero
- Instituto de Agroquímica y Tecnología de Alimentos‐CSIC Paterna Valencia Spain
| | - Juan‐José Boté‐Vericad
- Departament de Biblioteconomia, Documentació i Comunicació Audiovisual & Centre de Recerca en Informació Comunicació i Cultura. Universitat de Barcelona Barcelona Spain
| | - Alexandre López‐Borrull
- Universitat Oberta de Catalunya Estudis de Ciències de la Informació i la Comunicació Rambla del Poblenou, 156 Barcelona 08018 Barcelona Spain
|
13
|
Strasser C, Hertweck K, Greenberg J, Taraborelli D, Vu E. Ten simple rules for funding scientific open source software. PLoS Comput Biol 2022; 18:e1010627. [PMID: 36395089 PMCID: PMC9671312 DOI: 10.1371/journal.pcbi.1010627] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022] Open
Abstract
Scientific research increasingly relies on open source software (OSS). Funding OSS development requires intentional focus on issues of scholarly credit, unique forms of labor, maintenance, governance, and inclusive community-building. Such issues cut across different scientific disciplines that make them of interest to a variety of funders and institutions but may present challenges in understanding generalized needs. Here we present 10 simple rules for investing in scientific OSS and the teams who build and maintain it.
Affiliation(s)
- Carly Strasser
- Chan Zuckerberg Initiative, Redwood City, California, United States of America
| | - Kate Hertweck
- Chan Zuckerberg Initiative, Redwood City, California, United States of America
| | - Josh Greenberg
- Alfred P. Sloan Foundation, New York, New York, United States of America
| | - Dario Taraborelli
- Chan Zuckerberg Initiative, Redwood City, California, United States of America
| | - Elizabeth Vu
- Alfred P. Sloan Foundation, New York, New York, United States of America
|
14
|
Laufs D, Peters M, Schultz C. Data platforms for open life sciences-A systematic analysis of management instruments. PLoS One 2022; 17:e0276204. [PMID: 36282849 PMCID: PMC9595524 DOI: 10.1371/journal.pone.0276204] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2022] [Accepted: 10/02/2022] [Indexed: 11/05/2022] Open
Abstract
Open data platforms are interfaces between data demand of and supply from their users. Yet, data platform providers frequently struggle to aggregate data to suit their users' needs and to establish a high intensity of data exchange in a collaborative environment. Here, using open life science data platforms as an example for a diverse data structure, we systematically categorize these platforms based on their technology intermediation and the range of domains they cover to derive general and specific success factors for their management instruments. Our qualitative content analysis is based on 39 in-depth interviews with experts employed by data platforms and external stakeholders. We thus complement peer initiatives which focus solely on data quality, by additionally highlighting the data platforms' role to enable data utilization for innovative output. Based on our analysis, we propose a clearly structured and detailed guideline for seven management instruments. This guideline helps to establish and operationalize data platforms and to best exploit the data provided. Our findings support further exploitation of the open innovation potential in the life sciences and beyond.
Affiliation(s)
- Daniel Laufs
- Technology Management Research Group, Faculty of Business, Economics and Social Sciences, Kiel University, Kiel, SH, Germany
| | - Mareike Peters
- Technology Management Research Group, Faculty of Business, Economics and Social Sciences, Kiel University, Kiel, SH, Germany
| | - Carsten Schultz
- Technology Management Research Group, Faculty of Business, Economics and Social Sciences, Kiel University, Kiel, SH, Germany
|
15
|
Kivinen K, van Luenen HGAM, Alcalay M, Bock C, Dodzian J, Hoskova K, Hoyle D, Hradil O, Christensen SK, Korn B, Kosteas T, Morales M, Skowronek K, Theodorou V, Van Minnebruggen G, Salamero J, Premvardhan L. Acknowledging and citing core facilities. EMBO Rep 2022; 23:e55734. [PMID: 35997112 PMCID: PMC9442286 DOI: 10.15252/embr.202255734] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2022] [Accepted: 07/18/2022] [Indexed: 11/29/2022] Open
Affiliation(s)
- Katja Kivinen
- EU-LIFE Core Facilities Working Group, EU-LIFE Alliance Barcelona Spain
- Institute for Molecular Medicine Finland University of Helsinki Helsinki Finland
| | - Henri G A M van Luenen
- EU-LIFE Core Facilities Working Group, EU-LIFE Alliance Barcelona Spain
- The Netherlands Cancer Institute Amsterdam The Netherlands
| | - Myriam Alcalay
- EU-LIFE Core Facilities Working Group, EU-LIFE Alliance Barcelona Spain
- IEO European Institute of Oncology IRCCS Milan Italy
| | - Christoph Bock
- EU-LIFE Core Facilities Working Group, EU-LIFE Alliance Barcelona Spain
- CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences Vienna Austria
- Center for Medical Data Science, Institute of Artificial Intelligence Medical University of Vienna Vienna Austria
| | - Joanna Dodzian
- EU-LIFE Core Facilities Working Group, EU-LIFE Alliance Barcelona Spain
- IIMCB International Institute of Molecular and Cell Biology in Warsaw Warsaw Poland
| | - Katerina Hoskova
- EU-LIFE Core Facilities Working Group, EU-LIFE Alliance Barcelona Spain
- CEITEC MU Brno Czech Republic
| | - Danielle Hoyle
- EU-LIFE Core Facilities Working Group, EU-LIFE Alliance Barcelona Spain
- Babraham Institute Cambridge UK
| | - Ondrej Hradil
- EU-LIFE Core Facilities Working Group, EU-LIFE Alliance Barcelona Spain
- Masaryk University Brno Czech Republic
| | | | - Bernhard Korn
- EU-LIFE Core Facilities Working Group, EU-LIFE Alliance Barcelona Spain
- Friedrich Miescher Institute for Biomedical Research Basel Switzerland
| | - Theodoros Kosteas
- EU-LIFE Core Facilities Working Group, EU-LIFE Alliance Barcelona Spain
- Institute of Molecular Biology and Biotechnology Foundation for Research and Technology Hellas (IMBB‐FORTH) Heraklion Greece
| | - Mònica Morales
- EU-LIFE Core Facilities Working Group, EU-LIFE Alliance Barcelona Spain
- CRG Center for Genomic Regulation Barcelona Spain
| | - Krzysztof Skowronek
- EU-LIFE Core Facilities Working Group, EU-LIFE Alliance Barcelona Spain
- IIMCB International Institute of Molecular and Cell Biology in Warsaw Warsaw Poland
| | - Vasiliki Theodorou
- EU-LIFE Core Facilities Working Group, EU-LIFE Alliance Barcelona Spain
- Institute of Molecular Biology and Biotechnology Foundation for Research and Technology Hellas (IMBB‐FORTH) Heraklion Greece
| | - Geert Van Minnebruggen
- EU-LIFE Core Facilities Working Group, EU-LIFE Alliance Barcelona Spain
- VIB Core Facility Program, VIB Leuven Belgium
| | - Jean Salamero
- CNRS, INRIA, Rennes Bretagne Atlantique Rennes France
- Institut Curie PSL Research University Paris France
| | - Lavanya Premvardhan
- EU-LIFE Core Facilities Working Group, EU-LIFE Alliance Barcelona Spain
- Institut Curie PSL Research University Paris France
|
16
|
Big Geospatial Data or Geospatial Big Data? A Systematic Narrative Review on the Use of Spatial Data Infrastructures for Big Geospatial Sensing Data in Public Health. Remote Sensing 2022. [DOI: 10.3390/rs14132996] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/25/2023]
Abstract
Background: Often combined with other traditional and non-traditional types of data, geospatial sensing data have a crucial role in public health studies. We conducted a systematic narrative review to broaden our understanding of the usage of big geospatial sensing, ancillary data, and related spatial data infrastructures in public health studies. Methods: English-written, original research articles published during the last ten years were examined using three leading bibliographic databases (i.e., PubMed, Scopus, and Web of Science) in April 2022. Study quality was assessed by following well-established practices in the literature. Results: A total of thirty-two articles were identified through the literature search. We observed the included studies used various data-driven approaches to make better use of geospatial big data focusing on a range of health and health-related topics. We found the terms ‘big’ geospatial data and geospatial ‘big data’ have been inconsistently used in the existing geospatial sensing studies focusing on public health. We also learned that the existing research made good use of spatial data infrastructures (SDIs) for geospatial sensing data but did not fully use health SDIs for research. Conclusions: This study reiterates the importance of interdisciplinary collaboration as a prerequisite to fully taking advantage of geospatial big data for future public health studies.
|
17
|
Fan W, Jeng W, Tang M. Using data citation to define a knowledge domain: A case study of the Add‐Health dataset. J Assoc Inf Sci Technol 2022. [DOI: 10.1002/asi.24688] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/07/2022]
Affiliation(s)
- Wei‐Min Fan
- Department of Library and Information Science National Taiwan University Taipei Taiwan
| | - Wei Jeng
- Department of Library and Information Science National Taiwan University Taipei Taiwan
| | - Muh‐Chyun Tang
- Department of Library and Information Science National Taiwan University Taipei Taiwan
|
18
|
Rutz A, Sorokina M, Galgonek J, Mietchen D, Willighagen E, Gaudry A, Graham JG, Stephan R, Page R, Vondrášek J, Steinbeck C, Pauli GF, Wolfender JL, Bisson J, Allard PM. The LOTUS initiative for open knowledge management in natural products research. eLife 2022; 11:e70780. [PMID: 35616633 PMCID: PMC9135406 DOI: 10.7554/elife.70780] [Citation(s) in RCA: 129] [Impact Index Per Article: 43.0] [Received: 05/28/2021] [Accepted: 03/22/2022] [Indexed: 12/17/2022] Open
Abstract
Contemporary bioinformatic and chemoinformatic capabilities hold promise to reshape knowledge management, analysis and interpretation of data in natural products research. Currently, reliance on a disparate set of non-standardized, insular, and specialized databases presents a series of challenges for data access, both within the discipline and for integration and interoperability between related fields. The fundamental elements of exchange are referenced structure-organism pairs that establish relationships between distinct molecular structures and the living organisms from which they were identified. Consolidating and sharing such information via an open platform has strong transformative potential for natural products research and beyond. This is the ultimate goal of the newly established LOTUS initiative, which has now completed the first steps toward the harmonization, curation, validation and open dissemination of 750,000+ referenced structure-organism pairs. LOTUS data is hosted on Wikidata and regularly mirrored on https://lotus.naturalproducts.net. Data sharing within the Wikidata framework broadens data access and interoperability, opening new possibilities for community curation and evolving publication models. Furthermore, embedding LOTUS data into the vast Wikidata knowledge graph will facilitate new biological and chemical insights. The LOTUS initiative represents an important advancement in the design and deployment of a comprehensive and collaborative natural products knowledge base.
Affiliation(s)
- Adriano Rutz
- School of Pharmaceutical Sciences, University of Geneva, Geneva, Switzerland
- Institute of Pharmaceutical Sciences of Western Switzerland, University of Geneva, Geneva, Switzerland
| | - Maria Sorokina
- Institute for Inorganic and Analytical Chemistry, Friedrich-Schiller-University Jena, Jena, Germany
| | - Jakub Galgonek
- Institute of Organic Chemistry and Biochemistry of the CAS, Prague, Czech Republic
| | - Daniel Mietchen
- Ronin Institute, Montclair, United States
- Leibniz Institute of Freshwater Ecology and Inland Fisheries, Berlin, Germany
- School of Data Science, University of Virginia, Charlottesville, United States
| | - Egon Willighagen
- Department of Bioinformatics-BiGCaT, Maastricht University, Maastricht, Netherlands
| | - Arnaud Gaudry
- School of Pharmaceutical Sciences, University of Geneva, Geneva, Switzerland
- Institute of Pharmaceutical Sciences of Western Switzerland, University of Geneva, Geneva, Switzerland
| | - James G Graham
- Center for Natural Product Technologies and WHO Collaborating Centre for Traditional Medicine (WHO CC/TRM), Pharmacognosy Institute; College of Pharmacy, University of Illinois at Chicago, Chicago, United States
- Department of Pharmaceutical Sciences, College of Pharmacy, University of Illinois at Chicago, Chicago, United States
| | - Ralf Stephan
- Ontario Institute for Cancer Research (OICR), University Ave Suite, Toronto, Canada
| | | | - Jiří Vondrášek
- Institute of Organic Chemistry and Biochemistry of the CAS, Prague, Czech Republic
| | - Christoph Steinbeck
- Institute for Inorganic and Analytical Chemistry, Friedrich-Schiller-University Jena, Jena, Germany
| | - Guido F Pauli
- Center for Natural Product Technologies and WHO Collaborating Centre for Traditional Medicine (WHO CC/TRM), Pharmacognosy Institute; College of Pharmacy, University of Illinois at Chicago, Chicago, United States
- Department of Pharmaceutical Sciences, College of Pharmacy, University of Illinois at Chicago, Chicago, United States
| | - Jean-Luc Wolfender
- School of Pharmaceutical Sciences, University of Geneva, Geneva, Switzerland
- Institute of Pharmaceutical Sciences of Western Switzerland, University of Geneva, Geneva, Switzerland
| | - Jonathan Bisson
- Center for Natural Product Technologies and WHO Collaborating Centre for Traditional Medicine (WHO CC/TRM), Pharmacognosy Institute; College of Pharmacy, University of Illinois at Chicago, Chicago, United States
- Department of Pharmaceutical Sciences, College of Pharmacy, University of Illinois at Chicago, Chicago, United States
| | - Pierre-Marie Allard
- School of Pharmaceutical Sciences, University of Geneva, Geneva, Switzerland
- Institute of Pharmaceutical Sciences of Western Switzerland, University of Geneva, Geneva, Switzerland
- Department of Biology, University of Fribourg, Fribourg, Switzerland
|
19
|
Agostinetto G, Bozzi D, Porro D, Casiraghi M, Labra M, Bruno A. SKIOME Project: a curated collection of skin microbiome datasets enriched with study-related metadata. Database (Oxford) 2022; 2022:6586378. [PMID: 35576001 PMCID: PMC9216470 DOI: 10.1093/database/baac033] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 09/08/2021] [Revised: 02/25/2022] [Accepted: 05/09/2022] [Indexed: 04/07/2023]
Abstract
Large amounts of data from microbiome-related studies have been (and are currently being) deposited on international public databases. These datasets represent a valuable resource for the microbiome research community and could serve future researchers interested in integrating multiple datasets into powerful meta-analyses. However, this huge amount of data lacks harmonization and it is far from being completely exploited in its full potential to build a foundation that places microbiome research at the nexus of many subdisciplines within and beyond biology. Thus, it urges the need for data accessibility and reusability, according to findable, accessible, interoperable and reusable (FAIR) principles, as supported by National Microbiome Data Collaborative and FAIR Microbiome. To tackle the challenge of accelerating discovery and advances in skin microbiome research, we collected, integrated and organized existing microbiome data resources from human skin 16S rRNA amplicon-sequencing experiments. We generated a comprehensive collection of datasets, enriched in metadata, and organized this information into data frames ready to be integrated into microbiome research projects and advanced post-processing analyses, such as data science applications (e.g. machine learning). Furthermore, we have created a data retrieval and curation framework built on three different stages to maximize the retrieval of datasets and metadata associated with them. Lastly, we highlighted some caveats regarding metadata retrieval and suggested ways to improve future metadata submissions. Overall, our work resulted in a curated skin microbiome datasets collection accompanied by a state-of-the-art analysis of the last 10 years of the skin microbiome field. Database URL: https://github.com/giuliaago/SKIOMEMetadataRetrieval.
Affiliation(s)
- Giulia Agostinetto
- *Corresponding authors: Giulia Agostinetto and Antonia Bruno (Tel: +0039 0264483413)
| | | | - Danilo Porro
- Department of Biotechnology and Biosciences, University of Milano-Bicocca, Piazza della Scienza, 2, Milan 20126, Italy
- Institute of Molecular Bioimaging and Physiology (IBFM), National Research Council (CNR), via Fratelli Cervi, 93, Segrate (MI) 20054, Italy
| | - Maurizio Casiraghi
- Department of Biotechnology and Biosciences, University of Milano-Bicocca, Piazza della Scienza, 2, Milan 20126, Italy
| | - Massimo Labra
- Department of Biotechnology and Biosciences, University of Milano-Bicocca, Piazza della Scienza, 2, Milan 20126, Italy
| | - Antonia Bruno
- *Corresponding authors: Giulia Agostinetto and Antonia Bruno (Tel: +0039 0264483413)
|
20
|
Kowalczyk OS, Lautarescu A, Blok E, Dall'Aglio L, Westwood SJ. What senior academics can do to support reproducible and open research: a short, three-step guide. BMC Res Notes 2022; 15:116. [PMID: 35317865 PMCID: PMC8938725 DOI: 10.1186/s13104-022-05999-0] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 12/24/2021] [Accepted: 03/09/2022] [Indexed: 01/31/2023] Open
Abstract
Increasingly, policies are being introduced to reward and recognise open research practices, while the adoption of such practices into research routines is being facilitated by many grassroots initiatives. However, despite this widespread endorsement and support, as well as various efforts led by early career researchers, open research is yet to be widely adopted. For open research to become the norm, initiatives should engage academics from all career stages, particularly senior academics (namely senior lecturers, readers, professors) given their routine involvement in determining the quality of research. Senior academics, however, face unique challenges in implementing policy changes and supporting grassroots initiatives. Given that senior academics, like all researchers, are motivated by self-interest, this paper lays out three feasible steps that senior academics can take to improve the quality and productivity of their research and that also serve to engender open research. These steps include changing (a) hiring criteria, (b) how scholarly outputs are credited, and (c) how we fund and publish in line with open research principles. The guidance we provide is accompanied by material for further reading.
Affiliation(s)
- Olivia S Kowalczyk
- Department of Neuroimaging, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK
| | - Alexandra Lautarescu
- Forensic and Neurodevelopmental Sciences, Institute of Psychiatry, Psychology, Neuroscience, King's College London, London, UK
- Department of Perinatal Imaging and Health, Centre for the Developing Brain, School of Biomedical Imaging and Medical Sciences, King's College London, London, UK
| | - Elisabet Blok
- Department of Child and Adolescent Psychiatry/Psychology, Erasmus MC-Sophia Children's Hospital, University Medical Centre Rotterdam, Rotterdam, The Netherlands
- The Generation R Study Group, Erasmus MC, University Medical Centre Rotterdam, Rotterdam, The Netherlands
| | - Lorenza Dall'Aglio
- Department of Child and Adolescent Psychiatry/Psychology, Erasmus MC-Sophia Children's Hospital, University Medical Centre Rotterdam, Rotterdam, The Netherlands
- The Generation R Study Group, Erasmus MC, University Medical Centre Rotterdam, Rotterdam, The Netherlands
| | - Samuel J Westwood
- Institute of Psychiatry, Psychology, Neuroscience, King's College London, London, UK.
- Department of Psychology, School of Social Science, University of Westminster, 115 New Cavendish Street, London, W1W 6UW, UK.
|
21
|
Altman M, Cohen PN. The Scholarly Knowledge Ecosystem: Challenges and Opportunities for the Field of Information. Front Res Metr Anal 2022; 6:751553. [PMID: 35178498 PMCID: PMC8843814 DOI: 10.3389/frma.2021.751553] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/01/2021] [Accepted: 12/15/2021] [Indexed: 01/09/2023] Open
Abstract
The scholarly knowledge ecosystem presents an outstanding exemplar of the challenges of understanding, improving, and governing information ecosystems at scale. This article draws upon significant reports on aspects of the ecosystem to characterize the most important research challenges and promising potential approaches. The focus of this review article is the fundamental scientific research challenges related to developing a better understanding of the scholarly knowledge ecosystem. Across a range of disciplines, we identify reports that are conceived broadly, published recently, and written collectively. We extract the critical research questions, summarize these using quantitative text analysis, and use this quantitative analysis to inform a qualitative synthesis. Three broad themes emerge from this analysis: the need for multi-sectoral cooperation and coordination, for mixed methods analysis at multiple levels, and for interdisciplinary collaboration. Further, we draw attention to an emerging consensus that scientific research in this area should be guided by a set of core human values.
Affiliation(s)
- Micah Altman
- Center for Research in Equitable and Open Scholarship, MIT Libraries, Massachusetts Institute of Technology, Cambridge, MA, United States
- *Correspondence: Micah Altman
| | - Philip N. Cohen
- Department of Sociology, University of Maryland, College Park, MD, United States
|
22
|
Levinson MA, Niestroy J, Al Manir S, Fairchild K, Lake DE, Moorman JR, Clark T. FAIRSCAPE: a Framework for FAIR and Reproducible Biomedical Analytics. Neuroinformatics 2022; 20:187-202. [PMID: 34264488 PMCID: PMC8760356 DOI: 10.1007/s12021-021-09529-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Accepted: 06/01/2021] [Indexed: 01/09/2023]
Abstract
Results of computational analyses require transparent disclosure of their supporting resources, while the analyses themselves often can be very large scale and involve multiple processing steps separated in time. Evidence for the correctness of any analysis should include not only a textual description, but also a formal record of the computations which produced the result, including accessible data and software with runtime parameters, environment, and personnel involved. This article describes FAIRSCAPE, a reusable computational framework, enabling simplified access to modern scalable cloud-based components. FAIRSCAPE fully implements the FAIR data principles and extends them to provide fully FAIR Evidence, including machine-interpretable provenance of datasets, software and computations, as metadata for all computed results. The FAIRSCAPE microservices framework creates a complete Evidence Graph for every computational result, including persistent identifiers with metadata, resolvable to the software, computations, and datasets used in the computation; and stores a URI to the root of the graph in the result's metadata. An ontology for Evidence Graphs, EVI ( https://w3id.org/EVI ), supports inferential reasoning over the evidence. FAIRSCAPE can run nested or disjoint workflows and preserves provenance across them. It can run Apache Spark jobs, scripts, workflows, or user-supplied containers. All objects are assigned persistent IDs, including software. All results are annotated with FAIR metadata using the evidence graph model for access, validation, reproducibility, and re-use of archived data and software.
Affiliation(s)
- Maxwell Adam Levinson
- Department of Public Health Sciences (Biomedical Informatics), University of Virginia School of Medicine, Charlottesville, VA, USA
| | - Justin Niestroy
- Department of Public Health Sciences (Biomedical Informatics), University of Virginia School of Medicine, Charlottesville, VA, USA
| | - Sadnan Al Manir
- Department of Public Health Sciences (Biomedical Informatics), University of Virginia School of Medicine, Charlottesville, VA, USA
| | - Karen Fairchild
- Department of Pediatrics, University of Virginia School of Medicine, Charlottesville, VA, USA
- Center for Advanced Medical Analytics, University of Virginia School of Medicine, Charlottesville, VA, USA
| | - Douglas E Lake
- Center for Advanced Medical Analytics, University of Virginia School of Medicine, Charlottesville, VA, USA
- Department of Medicine, University of Virginia School of Medicine, Charlottesville, VA, USA
- Department of Statistics, University of Virginia College and Graduate School of Arts and Sciences, Charlottesville, VA, USA
| | - J Randall Moorman
- Center for Advanced Medical Analytics, University of Virginia School of Medicine, Charlottesville, VA, USA
- Department of Medicine, University of Virginia School of Medicine, Charlottesville, VA, USA
| | - Timothy Clark
- Department of Public Health Sciences (Biomedical Informatics), University of Virginia School of Medicine, Charlottesville, VA, USA.
- Center for Advanced Medical Analytics, University of Virginia School of Medicine, Charlottesville, VA, USA.
- University of Virginia School of Data Science, Charlottesville, VA, USA.
|
23
|
OLIVEIRA CCD, SILVA MCD, PAVÃO CMG, SILVA FCCD, MOURA AMMD, BARROS THB. A teoria da citação de dados: uma revisão da produção científica na América Latina. TRANSINFORMACAO 2022. [DOI: 10.1590/2318-0889202234e210062] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022] Open
Abstract
Summary: This is a qualitative bibliographic study that sought to identify the state of the art regarding the theory of data citation in scientific output produced in Latin America. To that end, search expressions on this topic were defined in Portuguese, English, and Spanish and used to explore the following databases, repositories, and search engines: Biblioteca Digital Brasileira de Teses e Dissertações, OasisBR, La referencia, Redalyc, Networked Digital Library of Theses and Dissertations, Portal de Periódicos Capes, Google Scholar, SciELO, and Brapci (Base de Dados Referenciais de Artigos de Periódicos em Ciência da Informação). After the retrieved works were analyzed, only those that discussed research data citation in depth, and could therefore contribute to reflection on a theory of data citation, were considered, for a total of 19 works. The study concludes that there is a significant absence of work in Latin America on the theory of data citation. At the same time, it identified works that, although they do not address a theory as such, make significant contributions to the topic of research data citation and can serve as a basis for developing work on the theory of data citation. It also found that Brazil stood out in the production of work on research data citation: of the 19 works analyzed in this study, 17 were Brazilian.
|
24
|
Lange M, Alako BTF, Cochrane G, Ghaffar M, Mascher M, Habekost PK, Hillebrand U, Scholz U, Schorch F, Freitag J, Scholz AH. Quantitative monitoring of nucleotide sequence data from genetic resources in context of their citation in the scientific literature. Gigascience 2021; 10:giab084. [PMID: 34966925 PMCID: PMC8716361 DOI: 10.1093/gigascience/giab084] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 05/01/2021] [Revised: 08/04/2021] [Accepted: 11/29/2021] [Indexed: 11/14/2022] Open
Abstract
BACKGROUND Linking nucleotide sequence data (NSD) to scientific publication citations can enhance understanding of NSD provenance, scientific use, and reuse in the community. By connecting publications with NSD records, NSD geographical provenance information, and author geographical information, it becomes possible to assess the contribution of NSD to infer trends in scientific knowledge gain at the global level. FINDINGS We extracted and linked records from the European Nucleotide Archive to citations in open-access publications aggregated at Europe PubMed Central. A total of 8,464,292 ENA accessions with geographical provenance information were associated with publications. We conducted a data quality review to uncover potential issues in publication citation information extraction and author affiliation tagging and developed and implemented best-practice recommendations for citation extraction. We constructed flat data tables and a data warehouse with an interactive web application to enable ad hoc exploration of NSD use and summary statistics. CONCLUSIONS The extraction and linking of NSD with associated publication citations enables transparency. The quality review contributes to enhanced text mining methods for identifier extraction and use. Furthermore, the global provision and use of NSD enable scientists worldwide to join literature and sequence databases in a multidimensional fashion. As a concrete use case, we visualized statistics of country clusters concerning NSD access in the context of discussions around digital sequence information under the United Nations Convention on Biological Diversity.
Affiliation(s)
- Matthias Lange
  - Leibniz Institute of Plant Genetics and Crop Plant Research, Department Breeding Research, OT Gatersleben, Corrensstrasse 3, 06466 Seeland, Germany
- Blaise T F Alako
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Guy Cochrane
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Mehmood Ghaffar
  - Leibniz Institute of Plant Genetics and Crop Plant Research, Department Breeding Research, OT Gatersleben, Corrensstrasse 3, 06466 Seeland, Germany
- Martin Mascher
  - Leibniz Institute of Plant Genetics and Crop Plant Research, Department Breeding Research, OT Gatersleben, Corrensstrasse 3, 06466 Seeland, Germany
  - German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Puschstraße 4, 04103 Leipzig, Germany
- Pia-Katharina Habekost
  - Leibniz Institute of Plant Genetics and Crop Plant Research, Department Breeding Research, OT Gatersleben, Corrensstrasse 3, 06466 Seeland, Germany
  - The Harz University of Applied Science, Department of Automation and Computer Science, Friedrichstraße 57, 38855 Wernigerode, Germany
- Upneet Hillebrand
  - Leibniz Institute DSMZ-German Collection of Microorganisms and Cell Cultures GmbH, Department Research - Microbial Ecology and Diversity, Inhoffenstraße 7B, 38124 Braunschweig, Germany
- Uwe Scholz
  - Leibniz Institute of Plant Genetics and Crop Plant Research, Department Breeding Research, OT Gatersleben, Corrensstrasse 3, 06466 Seeland, Germany
- Florian Schorch
  - Leibniz Institute of Plant Genetics and Crop Plant Research, Department Breeding Research, OT Gatersleben, Corrensstrasse 3, 06466 Seeland, Germany
  - The Harz University of Applied Science, Department of Automation and Computer Science, Friedrichstraße 57, 38855 Wernigerode, Germany
- Jens Freitag
  - Leibniz Institute of Plant Genetics and Crop Plant Research, Department Breeding Research, OT Gatersleben, Corrensstrasse 3, 06466 Seeland, Germany
- Amber Hartman Scholz
  - Leibniz Institute DSMZ-German Collection of Microorganisms and Cell Cultures GmbH, Department Research - Microbial Ecology and Diversity, Inhoffenstraße 7B, 38124 Braunschweig, Germany
|
25
|
Bliss‐Moreau E, Amara RR, Buffalo EA, Colman RJ, Embers ME, Morrison JH, Quillen EE, Sacha JB, Roberts CT. Improving rigor and reproducibility in nonhuman primate research. Am J Primatol 2021; 83:e23331. [PMID: 34541703 PMCID: PMC8629848 DOI: 10.1002/ajp.23331] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 06/30/2021] [Revised: 08/25/2021] [Accepted: 09/04/2021] [Indexed: 12/23/2022]
Abstract
Nonhuman primates (NHPs) are a critical component of translational/preclinical biomedical research due to the strong similarities between NHP and human physiology and disease pathology. In some cases, NHPs represent the most appropriate, or even the only, animal model for complex metabolic, neurological, and infectious diseases. The increased demand for and limited availability of these valuable research subjects requires that rigor and reproducibility be a prime consideration to ensure the maximal utility of this scarce resource. Here, we discuss a number of approaches that collectively can contribute to enhanced rigor and reproducibility in NHP research.
Affiliation(s)
- Eliza Bliss‐Moreau
  - California National Primate Research Center, Davis, California, USA
  - Department of Psychology, University of California Davis, Davis, California, USA
- Rama R. Amara
  - Division of Microbiology and Immunology, Yerkes National Primate Research Center, Atlanta, Georgia, USA
- Elizabeth A. Buffalo
  - Washington National Primate Research Center, Seattle, Washington, USA
  - Department of Physiology and Biophysics, University of Washington School of Medicine, Seattle, Washington, USA
- Ricki J. Colman
  - Wisconsin National Primate Research Center, Madison, Wisconsin, USA
  - Department of Cell and Regenerative Biology, University of Wisconsin, Madison, Wisconsin, USA
- Monica E. Embers
  - Division of Immunology, Tulane National Primate Research Center, Covington, Louisiana, USA
- John H. Morrison
  - California National Primate Research Center, Davis, California, USA
  - Department of Neurology, University of California Davis, Davis, California, USA
- Ellen E. Quillen
  - Department of Internal Medicine, Wake Forest School of Medicine, Winston‐Salem, North Carolina, USA
- Jonah B. Sacha
  - Divisions of Pathobiology and Immunology (JS) and Cardiometabolic Health (CR), Oregon National Primate Research Center, Beaverton, Oregon, USA
  - Vaccine and Gene Therapy Institute, Oregon Health & Science University, Beaverton, Oregon, USA
- Charles T. Roberts
  - Divisions of Pathobiology and Immunology (JS) and Cardiometabolic Health (CR), Oregon National Primate Research Center, Beaverton, Oregon, USA
|
26
|
Mandeville CP, Koch W, Nilsen EB, Finstad AG. Open Data Practices among Users of Primary Biodiversity Data. Bioscience 2021; 71:1128-1147. [PMID: 34733117 PMCID: PMC8560312 DOI: 10.1093/biosci/biab072] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/14/2022] Open
Abstract
Presence-only biodiversity data are increasingly relied on in biodiversity, ecology, and conservation research, driven by growing digital infrastructures that support open data sharing and reuse. Recent reviews of open biodiversity data have clearly documented the value of data sharing, but the extent to which the biodiversity research community has adopted open data practices remains unclear. We address this question by reviewing applications of presence-only primary biodiversity data, drawn from a variety of sources beyond open databases, in the indexed literature. We characterize how frequently researchers access open data relative to data from other sources, how often they share newly generated or collated data, and trends in metadata documentation and data citation. Our results indicate that biodiversity research commonly relies on presence-only data that are not openly available and neglects to make such data available. Improved data sharing and documentation will increase the value, reusability, and reproducibility of biodiversity research.
Affiliation(s)
- Caitlin P Mandeville
  - Department of Natural History, Norwegian University of Science and Technology, Trondheim, Norway
- Wouter Koch
  - Department of Natural History, Norwegian University of Science and Technology, Trondheim, Norway
- Erlend B Nilsen
  - Faculty of Biosciences and Aquaculture, Nord University, Steinkjer, Norway
- Anders G Finstad
  - Department of Natural History, Norwegian University of Science and Technology, Trondheim, Norway
|
27
|
Hood ASC, Sutherland WJ. The data-index: An author-level metric that values impactful data and incentivizes data sharing. Ecol Evol 2021; 11:14344-14350. [PMID: 34765110 PMCID: PMC8571609 DOI: 10.1002/ece3.8126] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 09/10/2020] [Revised: 08/04/2021] [Accepted: 08/24/2021] [Indexed: 11/08/2022] Open
Abstract
Author-level metrics are a widely used measure of scientific success. The h-index and its variants measure publication output (number of publications) and research impact (number of citations). They are often used to influence decisions, such as allocating funding or jobs. Here, we argue that the emphasis on publication output and impact hinders scientific progress in the fields of ecology and evolution because it disincentivizes two fundamental practices: generating impactful (and therefore often long-term) datasets and sharing data. We describe a new author-level metric, the data-index, which values both dataset output (number of datasets) and impact (number of data-index citations), so promotes generating and sharing data as a result. We discuss how it could be implemented and provide user guidelines. The data-index is designed to complement other metrics of scientific success, as scientific contributions are diverse and our value system should reflect that both for the benefit of scientific progress and to create a value system that is more equitable, diverse, and inclusive. Future work should focus on promoting other scientific contributions, such as communicating science, informing policy, mentoring other scientists, and providing open-access code and tools.
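For readers unfamiliar with the h-index mentioned in this abstract, the following is a minimal sketch of the standard h-index rule (the largest h such that at least h items each have at least h citations). Applying the same rule to per-dataset citation counts is an assumption made here for illustration only; the abstract does not give the exact formula for the data-index, and the function name and sample counts are hypothetical.

```python
def h_index(citations):
    """Return the largest h such that at least h items have >= h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` items all have at least `rank` citations
        else:
            break
    return h


# Hypothetical citation counts, one entry per publication.
paper_citations = [25, 8, 5, 3, 3, 1]

# Hypothetical citation counts, one entry per shared dataset.
# Reusing the h-index rule here is an assumption, not the published definition.
dataset_citations = [40, 12, 2]

print(h_index(paper_citations))
print(h_index(dataset_citations))
```

Under this sketch, an author with a few heavily reused datasets would earn a score of their own, separate from their publication record, which is the incentive the abstract describes.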
Affiliation(s)
- Amelia S. C. Hood
  - Conservation Science Group, Department of Zoology, University of Cambridge, Cambridge, UK
- William J. Sutherland
  - Conservation Science Group, Department of Zoology, University of Cambridge, Cambridge, UK
  - Biosecurity Research Initiative at St Catharine's (BioRISC), St Catharine's College, University of Cambridge, Cambridge, UK
|
28
|
Lessons Learnt from Engineering Science Projects Participating in the Horizon 2020 Open Research Data Pilot. DATA 2021. [DOI: 10.3390/data6090096] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/16/2022] Open
Abstract
Trends in the sciences are indicative of data management becoming established as a feature of the mainstream research process. In this context, the European Commission introduced an Open Research Data pilot at the start of the Horizon 2020 research programme. This initiative followed the success of the Open Access pilot implemented in the prior (FP7) research programme, which thereafter became an integral component of Horizon 2020. While the Open Access phenomenon can reasonably be argued to be one of many instances of web technologies disrupting established business models (namely publication practices and workflows established over several centuries in the case of Open Access), initiatives designed to promote research data management have no established foundation on which to build. For Open Data to become a reality and, more importantly, to contribute to the scientific process, data management best practices and workflows are required. Furthermore, with the scientific community having operated to good effect in the absence of data management, there is a need to demonstrate the merits of data management. This circumstance is complicated by the lack of the necessary ICT infrastructures, especially interoperability standards, required to facilitate the seamless transfer, aggregation and analysis of research data. Any activity aiming to promote Open Data thus needs to overcome a number of cultural and technological challenges. It is in this context that this paper examines the data management activities and outcomes of a number of projects participating in the Horizon 2020 Open Research Data pilot. The result has been to identify a number of commonly encountered benefits and issues; to assess the utilisation of data management plans; and through the close examination of specific cases, to gain insights into obstacles to data management and potential solutions. 
Although primarily anecdotal and difficult to quantify, the experiences reported in this paper tend to favour developing data management best practices rather than doggedly pursuing the Open Data mantra. While Open Data may prove valuable in certain circumstances, there is good reason to claim that managed access to scientific data of high inherent intellectual and financial value will prove more effective in driving knowledge discovery and innovation.
|
29
|
Murphy F, Bar-Sinai M, Martone ME. A tool for assessing alignment of biomedical data repositories with open, FAIR, citation and trustworthy principles. PLoS One 2021; 16:e0253538. [PMID: 34242248 PMCID: PMC8270168 DOI: 10.1371/journal.pone.0253538] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 12/16/2020] [Accepted: 06/08/2021] [Indexed: 11/19/2022] Open
Abstract
Increasing attention is being paid to the operation of biomedical data repositories in light of efforts to improve how scientific data is handled and made available for the long term. Multiple groups have produced recommendations for functions that biomedical repositories should support, with many using requirements of the FAIR data principles as guidelines. However, FAIR is but one set of principles that has arisen out of the open science community. They are joined by principles governing open science, data citation and trustworthiness, all of which are important aspects for biomedical data repositories to support. Together, these define a framework for data repositories that we call OFCT: Open, FAIR, Citable and Trustworthy. Here we developed an instrument using the open source PolicyModels toolkit that attempts to operationalize key aspects of OFCT principles and piloted the instrument by evaluating eight biomedical community repositories listed by the NIDDK Information Network (dkNET.org). Repositories included both specialist repositories that focused on a particular data type or domain, in this case diabetes and metabolomics, and generalist repositories that accept all data types and domains. The goal of this work was both to obtain a sense of how much the design of current biomedical data repositories align with these principles and to augment the dkNET listing with additional information that may be important to investigators trying to choose a repository, e.g., does the repository fully support data citation? The evaluation was performed from March to November 2020 through inspection of documentation and interaction with the sites by the authors. Overall, although there was little explicit acknowledgement of any of the OFCT principles in our sample, the majority of repositories provided at least some support for their tenets.
Affiliation(s)
- Fiona Murphy
  - MoreBrains Cooperative Ltd, Chichester, United Kingdom
- Michael Bar-Sinai
  - Department of Computer Science, Ben-Gurion University of the Negev and The Institute of Quantitative Social Science at Harvard University, Beersheba, Israel
- Maryann E. Martone
  - Department of Neurosciences, SciCrunch, Inc., University of California, San Diego, California, United States of America
|
30
|
Abstract
Brain scientists are now capable of collecting more data in a single experiment than researchers a generation ago might have collected over an entire career. Indeed, the brain itself seems to thirst for more and more data. Such digital information not only comprises individual studies but is also increasingly shared and made openly available for secondary, confirmatory, and/or combined analyses. Numerous web resources now exist containing data across spatiotemporal scales. Data processing workflow technologies running via cloud-enabled computing infrastructures allow for large-scale processing. Such a move toward greater openness is fundamentally changing how brain science results are communicated and linked to available raw data and processed results. Ethical, professional, and motivational issues challenge the whole-scale commitment to data-driven neuroscience. Nevertheless, fueled by government investments into primary brain data collection coupled with increased sharing and community pressure challenging the dominant publishing model, large-scale brain and data science is here to stay.
Affiliation(s)
- John Darrell Van Horn
- Department of Psychology, University of Virginia, Charlottesville, Virginia, USA
- School of Data Science, University of Virginia, Charlottesville, Virginia, USA
|
31
|
Badran S, Hassona Y. The Online Attention to Cleft Lip and Palate Research: An Altmetric Analysis. Cleft Palate Craniofac J 2021; 59:522-529. [PMID: 33973478 DOI: 10.1177/10556656211014077] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/20/2022] Open
Abstract
OBJECTIVES To identify research articles related to cleft lip and/or cleft palate (CL/P) that generated the highest online attention. METHODS Altmetric Explorer was used to identify the 100 articles with the highest Altmetric Attention Score (AAS). Descriptive and correlation statistics were performed to study the characteristics of these articles in relation to their publication data, research type and domain, number of Mendeley readers, and dimensions citations. Citation counts were extracted from Scopus and Google Scholar. RESULTS The median AAS for the top 100 outputs was 22 (range from 12 to 458). The outputs were mostly discussed on Twitter (median = 8; range = 0-131). Topics discussing treatment and care for patients with CL/P accounted for 38% of the articles with the highest AAS followed by etiology and risk factors (32%). The majority of articles originated from the USA (46%) followed by Europe (16%) and the United Kingdom (15%). No significant differences were observed in AAS among different study designs, topic domains, journals' ranking and impact factor, and the number of citations in Scopus and Google Scholar. CONCLUSIONS Researchers should consider use of social platforms to disseminate their work among scholars and nonscholars. Altmetrics can be combined with traditional metrics for a more comprehensive assessment of research impact.
Affiliation(s)
- Serene Badran
  - Department of Orthodontics, Pediatric Dentistry and Preventive Dentistry, School of Dentistry, The University of Jordan, Amman, Jordan
- Yazan Hassona
  - Department of Oral and Maxillofacial Surgery, Oral Medicine and Periodontics, School of Dentistry, The University of Jordan, Amman, Jordan
|
32
|
Agarwal DA, Damerow J, Varadharajan C, Christianson DS, Pastorello GZ, Cheah YW, Ramakrishnan L. Balancing the needs of consumers and producers for scientific data collections. ECOL INFORM 2021. [DOI: 10.1016/j.ecoinf.2021.101251] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/22/2022]
|
33
|
Thessen AE, Bogdan P, Patterson DJ, Casey TM, Hinojo-Hinojo C, de Lange O, Haendel MA. From Reductionism to Reintegration: Solving society's most pressing problems requires building bridges between data types across the life sciences. PLoS Biol 2021; 19:e3001129. [PMID: 33770077 PMCID: PMC7997011 DOI: 10.1371/journal.pbio.3001129] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 11/18/2022] Open
Abstract
Decades of reductionist approaches in biology have achieved spectacular progress, but the proliferation of subdisciplines, each with its own technical and social practices regarding data, impedes the growth of the multidisciplinary and interdisciplinary approaches now needed to address pressing societal challenges. Data integration is key to a reintegrated biology able to address global issues such as climate change, biodiversity loss, and sustainable ecosystem management. We identify major challenges to data integration and present a vision for a "Data as a Service"-oriented architecture to promote reuse of data for discovery. The proposed architecture includes standards development, new tools and services, and strategies for career-development and sustainability.
Affiliation(s)
- Anne E. Thessen
  - Department of Environmental and Molecular Toxicology, Oregon State University, Corvallis, Oregon, United States of America
- Paul Bogdan
  - Ming Hsieh Department of Electrical and Computer Engineering, Viterbi School of Engineering, University of Southern California, Los Angeles, California, United States of America
- Theresa M. Casey
  - Department of Animal Sciences, Purdue University, West Lafayette, Indiana, United States of America
- César Hinojo-Hinojo
  - Department of Earth System Science, University of California, Irvine, California, United States of America
- Orlando de Lange
  - Department of Electrical Engineering, University of Washington, Seattle, Washington, United States of America
- Melissa A. Haendel
  - Department of Environmental and Molecular Toxicology, Oregon State University, Corvallis, Oregon, United States of America
|
34
|
Microbiome Metadata Standards: Report of the National Microbiome Data Collaborative's Workshop and Follow-On Activities. mSystems 2021; 6:6/1/e01194-20. [PMID: 33622857 PMCID: PMC8573954 DOI: 10.1128/msystems.01194-20] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Indexed: 11/20/2022] Open
Abstract
Microbiome samples are inherently defined by the environment in which they are found. Therefore, data that provide context and enable interpretation of measurements produced from biological samples, often referred to as metadata, are critical. Important contributions have been made in the development of community-driven metadata standards; however, these standards have not been uniformly embraced by the microbiome research community. To understand how these standards are being adopted, or the barriers to adoption, across research domains, institutions, and funding agencies, the National Microbiome Data Collaborative (NMDC) hosted a workshop in October 2019. This report provides a summary of discussions that took place throughout the workshop, as well as outcomes of the working groups initiated at the workshop.
|
35
|
Hail, software! NATURE COMPUTATIONAL SCIENCE 2021; 1:89. [PMID: 38217219 DOI: 10.1038/s43588-021-00037-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/15/2024]
|
36
|
Katz DS, Chue Hong NP, Clark T, Muench A, Stall S, Bouquin D, Cannon M, Edmunds S, Faez T, Feeney P, Fenner M, Friedman M, Grenier G, Harrison M, Heber J, Leary A, MacCallum C, Murray H, Pastrana E, Perry K, Schuster D, Stockhause M, Yeston J. Recognizing the value of software: a software citation guide. F1000Res 2021; 9:1257. [PMID: 33500780 PMCID: PMC7805487 DOI: 10.12688/f1000research.26932.2] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Accepted: 01/07/2021] [Indexed: 11/20/2022] Open
Abstract
Software is as integral as a research paper, monograph, or dataset in terms of facilitating the full understanding and dissemination of research. This article provides broadly applicable guidance on software citation for the communities and institutions publishing academic journals and conference proceedings. We expect those communities and institutions to produce versions of this document with software examples and citation styles that are appropriate for their intended audience. This article (and those community-specific versions) are aimed at authors citing software, including software developed by the authors or by others. We also include brief instructions on how software can be made citable, directing readers to more comprehensive guidance published elsewhere. The guidance presented in this article helps to support proper attribution and credit, reproducibility, collaboration and reuse, and encourages building on the work of others to further research.
Affiliation(s)
- Daniel S Katz
  - University of Illinois at Urbana-Champaign, Urbana, IL, USA
- Tim Clark
  - University of Virginia, Charlottesville, VA, USA
- Daina Bouquin
  - Harvard-Smithsonian Center for Astrophysics, Cambridge, MA, USA
- Scott Edmunds
  - GigaScience Press, BGI Hong Kong, Hong Kong, Hong Kong
|
37
|
Luo M, Xu Z, Hirsch T, Aung TS, Xu W, Ji L, Qin H, Ma K. The use of Global Biodiversity Information Facility (GBIF)-mediated data in publications written in Chinese. Glob Ecol Conserv 2021. [DOI: 10.1016/j.gecco.2020.e01406] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/22/2022] Open
|
38
|
Yoon J, Chung E, Schalk J, Kim J. Examination of data citation guidelines in style manuals and data repositories. LEARNED PUBLISHING 2020. [DOI: 10.1002/leap.1349] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/07/2022]
Affiliation(s)
- JungWon Yoon
  - Department of Library and Information Science, Jeonbuk National University, Jeonju‐si, South Korea
- EunKyung Chung
  - Department of Library and Information Science, Ewha Womans University, Seoul, South Korea
- Janet Schalk
  - Pasco‐Hernando State College, Porter Campus at Wiregrass Ranch Library, Wesley Chapel, Florida, USA
- Jihyun Kim
  - Department of Library and Information Science, Ewha Womans University, Seoul, South Korea
|
39
|
Implementing the RDA Research Data Policy Framework in Slovenian Scientific Journals. DATA SCIENCE JOURNAL 2020. [DOI: 10.5334/dsj-2020-049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022] Open
|
40
|
Mendes PSF, Siradze S, Pirro L, Thybaut JW. Open Data in Catalysis: From Today's Big Picture to the Future of Small Data. ChemCatChem 2020. [DOI: 10.1002/cctc.202001132] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Indexed: 01/04/2023]
Affiliation(s)
- Pedro S. F. Mendes
  - Laboratory for Chemical Technology, Department of Materials, Textiles and Chemical Engineering, Ghent University, Technologiepark 125, 9052 Ghent, Belgium
- Sébastien Siradze
  - Laboratory for Chemical Technology, Department of Materials, Textiles and Chemical Engineering, Ghent University, Technologiepark 125, 9052 Ghent, Belgium
- Laura Pirro
  - Laboratory for Chemical Technology, Department of Materials, Textiles and Chemical Engineering, Ghent University, Technologiepark 125, 9052 Ghent, Belgium
- Joris W. Thybaut
  - Laboratory for Chemical Technology, Department of Materials, Textiles and Chemical Engineering, Ghent University, Technologiepark 125, 9052 Ghent, Belgium
|
41
|
ODDPub – a Text-Mining Algorithm to Detect Data Sharing in Biomedical Publications. DATA SCIENCE JOURNAL 2020. [DOI: 10.5334/dsj-2020-042] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 11/20/2022] Open
|
42
|
Katz DS, Chue Hong NP, Clark T, Muench A, Stall S, Bouquin D, Cannon M, Edmunds S, Faez T, Feeney P, Fenner M, Friedman M, Grenier G, Harrison M, Heber J, Leary A, MacCallum C, Murray H, Pastrana E, Perry K, Schuster D, Stockhause M, Yeston J. Recognizing the value of software: a software citation guide. F1000Res 2020; 9:1257. [PMID: 33500780 PMCID: PMC7805487 DOI: 10.12688/f1000research.26932.1] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Accepted: 01/07/2021] [Indexed: 08/25/2023] Open
Abstract
Software is as integral as a research paper, monograph, or dataset in terms of facilitating the full understanding and dissemination of research. This article provides broadly applicable guidance on software citation for the communities and institutions publishing academic journals and conference proceedings. We expect those communities and institutions to produce versions of this document with software examples and citation styles that are appropriate for their intended audience. This article (and those community-specific versions) are aimed at authors citing software, including software developed by the authors or by others. We also include brief instructions on how software can be made citable, directing readers to more comprehensive guidance published elsewhere. The guidance presented in this article helps to support proper attribution and credit, reproducibility, collaboration and reuse, and encourages building on the work of others to further research.
Affiliation(s)
- Daniel S. Katz
  - University of Illinois at Urbana-Champaign, Urbana, IL, USA
- Tim Clark
  - University of Virginia, Charlottesville, VA, USA
- Daina Bouquin
  - Harvard-Smithsonian Center for Astrophysics, Cambridge, MA, USA
- Scott Edmunds
  - GigaScience Press, BGI Hong Kong, Hong Kong, Hong Kong
|
43
|
Hollaway MJ, Dean G, Blair GS, Brown M, Henrys PA, Watkins J. Tackling the Challenges of 21st-Century Open Science and Beyond: A Data Science Lab Approach. PATTERNS (NEW YORK, N.Y.) 2020; 1:100103. [PMID: 33205137 PMCID: PMC7660442 DOI: 10.1016/j.patter.2020.100103] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 02/17/2020] [Revised: 06/09/2020] [Accepted: 08/24/2020] [Indexed: 11/10/2022]
Abstract
In recent years, there has been a drive toward more open, cross-disciplinary science taking center stage. This has presented a number of challenges, including providing research platforms for collaborating scientists to explore big data, develop methods, and disseminate their results to stakeholders and decision makers. We present our vision of a "data science lab" as a collaborative space where scientists (from different disciplines), stakeholders, and policy makers can create data-driven solutions to environmental science's grand challenges. We set out a clear and defined research roadmap to serve as a focal point for an international research community progressing toward a more data-driven and transparent approach to environmental data science, centered on data science labs. This includes ongoing case studies of good practice, with the infrastructural and methodological developments required to enable data science labs to support significant increase in our cross- and trans-disciplinary science capabilities.
Affiliation(s)
- Michael J. Hollaway
  - UK Centre for Ecology and Hydrology, Lancaster Environment Centre, Lancaster, UK
- Graham Dean
  - UK Centre for Ecology and Hydrology, Lancaster Environment Centre, Lancaster, UK
- Gordon S. Blair
  - UK Centre for Ecology and Hydrology, Lancaster Environment Centre, Lancaster, UK
  - School of Computing and Communications, Lancaster University, Lancaster, UK
- Mike Brown
  - UK Centre for Ecology and Hydrology, Lancaster Environment Centre, Lancaster, UK
- Peter A. Henrys
  - UK Centre for Ecology and Hydrology, Lancaster Environment Centre, Lancaster, UK
- John Watkins
  - UK Centre for Ecology and Hydrology, Lancaster Environment Centre, Lancaster, UK
|
44
|
Abstract
Citation metrics have value because they aim to make scientific assessment a level playing field, but urgent transparency-based adjustments are necessary to ensure that measurements yield the most accurate picture of impact and excellence. One problematic area is the handling of self-citations, which are either excluded or inappropriately accounted for when using bibliometric indicators for research evaluation. Here, in favor of openly tracking self-citations, we report on self-referencing behavior among various academic disciplines as captured by the curated Clarivate Analytics Web of Science database. Specifically, we examined the behavior of 385,616 authors grouped into 15 subject areas like Biology, Chemistry, Science and Technology, Engineering, and Physics. These authors have published 3,240,973 papers that have accumulated 90,806,462 citations, roughly five percent of which are self-citations. Up until now, very little is known about the buildup of self-citations at the author-level and in field-specific contexts. Our view is that hiding self-citation data is indefensible and needlessly confuses any attempts to understand the bibliometric impact of one’s work. Instead we urge academics to embrace visibility of citation data in a community of peers, which relies on nuance and openness rather than curated scorekeeping.
Collapse
|
45
|
Shahin MH, Bhattacharya S, Silva D, Kim S, Burton J, Podichetty J, Romero K, Conrado DJ. Open Data Revolution in Clinical Research: Opportunities and Challenges. Clin Transl Sci 2020; 13:665-674. [PMID: 32004409 PMCID: PMC7359943 DOI: 10.1111/cts.12756] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 09/23/2019] [Accepted: 12/05/2019] [Indexed: 01/24/2023]
Abstract
Efforts for sharing individual clinical data are gaining momentum due to a heightened recognition that integrated data sets can catalyze biomedical discoveries and drug development. Among the benefits is the fact that data sharing can help generate and investigate new research hypotheses beyond those explored in the original study. Despite several accomplishments establishing public systems and guidance for data sharing in clinical trials, this practice is not the norm. Among the reasons are ethical challenges, such as privacy of individuals, data ownership, and control. This paper creates awareness of the potential benefits and challenges of sharing individual clinical data, how to overcome these challenges, and how as a clinical pharmacology community we can shape future directions in this field.
Affiliation(s)
- Sanchita Bhattacharya
- Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, California, USA
- Department of Pediatrics, University of California, San Francisco, San Francisco, California, USA
- Diego Silva
- Faculty of Health Sciences, Simon Fraser University, Vancouver, British Columbia, Canada
- Sydney Health Ethics, Faculty of Medicine and Health, University of Sydney, Sydney, Australia
- Sarah Kim
- Center for Pharmacometrics and Systems Pharmacology, Department of Pharmaceutics, College of Pharmacy, University of Florida, Orlando, Florida, USA

46

Groth P, Cousijn H, Clark T, Goble C. FAIR Data Reuse – the Path through Data Citation. DATA INTELLIGENCE 2020. [DOI: 10.1162/dint_a_00030] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Indexed: 11/04/2022]
Abstract
One of the key goals of the FAIR guiding principles is defined by its final principle – to optimize data sets for reuse by both humans and machines. To do so, data providers need to implement and support consistent machine readable metadata to describe their data sets. This can seem like a daunting task for data providers, whether it is determining what level of detail should be provided in the provenance metadata or figuring out what common shared vocabularies should be used. Additionally, for existing data sets it is often unclear what steps should be taken to enable maximal, appropriate reuse. Data citation already plays an important role in making data findable and accessible, providing persistent and unique identifiers plus metadata on over 16 million data sets. In this paper, we discuss how data citation and its underlying infrastructures, in particular associated metadata, provide an important pathway for enabling FAIR data reuse.
Affiliation(s)
- Paul Groth
- Informatics Institute, University of Amsterdam, Amsterdam 1090 GH, The Netherlands
- Tim Clark
- Data Science Institute, University of Virginia, Charlottesville, VA 22903-1738, USA
- Carole Goble
- Department of Computer Science, The University of Manchester, Oxford Road, Manchester M13 9PL, UK

47

Juty N, Wimalaratne SM, Soiland-Reyes S, Kunze J, Goble CA, Clark T. Unique, Persistent, Resolvable: Identifiers as the Foundation of FAIR. DATA INTELLIGENCE 2020. [DOI: 10.1162/dint_a_00025] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Indexed: 11/04/2022]
Abstract
The FAIR principles describe characteristics intended to support access to and reuse of digital artifacts in the scientific research ecosystem. Persistent, globally unique identifiers, resolvable on the Web, and associated with a set of additional descriptive metadata, are foundational to FAIR data. Here we describe some basic principles and exemplars for their design, use and orchestration with other system elements to achieve FAIRness for digital research objects.
Affiliation(s)
- Nick Juty
- Department of Computer Science, The University of Manchester, Oxford Road, Manchester M13 9PL, UK
- Stian Soiland-Reyes
- Department of Computer Science, The University of Manchester, Oxford Road, Manchester M13 9PL, UK
- John Kunze
- California Digital Library, Oakland, California 94612-2901, USA
- Carole A. Goble
- Department of Computer Science, The University of Manchester, Oxford Road, Manchester M13 9PL, UK
- Tim Clark
- Data Science Institute, University of Virginia, Charlottesville, VA 22903-1738, USA

48

Descoteaux D, Farinelli C, Soares e Silva M, de Waard A. Playing Well on the Data FAIRground: Initiatives and Infrastructure in Research Data Management. DATA INTELLIGENCE 2019. [DOI: 10.1162/dint_a_00020] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/04/2022]
Abstract
Over the past five years, Elsevier has focused on implementing FAIR and best practices in data management, from data preservation through reuse. In this paper we describe a series of efforts undertaken in this time to support proper data management practices. In particular, we discuss our journal data policies and their implementation, the current status and future goals for the research data management platform Mendeley Data, and clear and persistent linkages to individual data sets stored on external data repositories from corresponding published papers through partnership with Scholix. Early analysis of our data policies implementation confirms significant disparities at the subject level regarding data sharing practices, with most uptake within disciplines of Physical Sciences. Future directions at Elsevier include implementing better discoverability of linked data within an article and incorporating research data usage metrics.
Affiliation(s)
- Anita de Waard
- Elsevier, Inc, 50 Hampshire St, Cambridge, MA 02139, USA

49

Abstract
It is easy to argue that open data are critical to enabling faster and more effective research discovery. In this article, we describe the approach we have taken at Wiley to support open data and to start enabling more data to be FAIR data (Findable, Accessible, Interoperable and Reusable) with the implementation of four data policies: “Encourages”, “Expects”, “Mandates” and “Mandates and Peer Reviews Data”. We describe the rationale for these policies and levels of adoption so far. In the coming months we plan to measure and monitor the implementation of these policies via the publication of data availability statements and data citations. With this information, we'll be able to celebrate adoption of data-sharing practices by the research communities we work with and serve, and we hope to showcase researchers from those communities leading in open research.
Affiliation(s)
- Yan Wu
- Wiley, 805-808 Sun Palace, No. 12A Taiyanggong Middle Road, Beijing 100028, China
- Hope Inman
- Wiley, 111 River Street, Hoboken, NJ 07030, USA
- Chris Graf
- Wiley, 9600 Garsington Road, Oxford, OX4 2DQ, UK

50

Ochsner SA, Abraham D, Martin K, Ding W, McOwiti A, Kankanamge W, Wang Z, Andreano K, Hamilton RA, Chen Y, Hamilton A, Gantner ML, Dehart M, Qu S, Hilsenbeck SG, Becnel LB, Bridges D, Ma'ayan A, Huss JM, Stossi F, Foulds CE, Kralli A, McDonnell DP, McKenna NJ. The Signaling Pathways Project, an integrated 'omics knowledgebase for mammalian cellular signaling pathways. Sci Data 2019; 6:252. [PMID: 31672983 PMCID: PMC6823428 DOI: 10.1038/s41597-019-0193-4] [Citation(s) in RCA: 79] [Impact Index Per Article: 13.2] [Received: 02/22/2019] [Accepted: 09/11/2019] [Indexed: 12/28/2022]
Abstract
Mining of integrated public transcriptomic and ChIP-Seq (cistromic) datasets can illuminate functions of mammalian cellular signaling pathways not yet explored in the research literature. Here, we designed a web knowledgebase, the Signaling Pathways Project (SPP), which incorporates community classifications of signaling pathway nodes (receptors, enzymes, transcription factors and co-nodes) and their cognate bioactive small molecules. We then mapped over 10,000 public transcriptomic or cistromic experiments to their pathway node or biosample of study. To enable prediction of pathway node-gene target transcriptional regulatory relationships through SPP, we generated consensus 'omics signatures, or consensomes, which ranked genes based on measures of their significant differential expression or promoter occupancy across transcriptomic or cistromic experiments mapped to a specific node family. Consensomes were validated using alignment with canonical literature knowledge, gene target-level integration of transcriptomic and cistromic data points, and in bench experiments confirming previously uncharacterized node-gene target regulatory relationships. To expose the SPP knowledgebase to researchers, a web browser interface was designed that accommodates numerous routine data mining strategies. SPP is freely accessible at https://www.signalingpathways.org.
Affiliation(s)
- Scott A Ochsner
- Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, Texas, 77030, USA
- David Abraham
- Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, Texas, 77030, USA
- Kirt Martin
- Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, Texas, 77030, USA
- Wei Ding
- Duncan NCI Comprehensive Cancer Center, Baylor College of Medicine, Houston, Texas, 77030, USA
- Apollo McOwiti
- Duncan NCI Comprehensive Cancer Center, Baylor College of Medicine, Houston, Texas, 77030, USA
- Wasula Kankanamge
- Duncan NCI Comprehensive Cancer Center, Baylor College of Medicine, Houston, Texas, 77030, USA
- Zichen Wang
- Icahn School of Medicine, Mount Sinai University, New York, NY, 10029, USA
- Kaitlyn Andreano
- Department of Pharmacology and Cancer Biology, Duke University School of Medicine, Durham, NC, 27710, USA
- Ross A Hamilton
- Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, Texas, 77030, USA
- Yue Chen
- Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, Texas, 77030, USA
- Angelica Hamilton
- Diabetes & Metabolism Research Institute, City of Hope, Duarte, CA, 91010, USA
- Marin L Gantner
- Department of Chemical Physiology, Scripps Research Institute, La Jolla, CA, 92037, USA
- Michael Dehart
- Duncan NCI Comprehensive Cancer Center, Baylor College of Medicine, Houston, Texas, 77030, USA
- Shijing Qu
- Duncan NCI Comprehensive Cancer Center, Baylor College of Medicine, Houston, Texas, 77030, USA
- Susan G Hilsenbeck
- Duncan NCI Comprehensive Cancer Center, Baylor College of Medicine, Houston, Texas, 77030, USA
- Lauren B Becnel
- Duncan NCI Comprehensive Cancer Center, Baylor College of Medicine, Houston, Texas, 77030, USA
- Dave Bridges
- University of Michigan School of Public Health, Ann Arbor, MI, 48109, USA
- Avi Ma'ayan
- Icahn School of Medicine, Mount Sinai University, New York, NY, 10029, USA
- Janice M Huss
- Diabetes & Metabolism Research Institute, City of Hope, Duarte, CA, 91010, USA
- Fabio Stossi
- Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, Texas, 77030, USA
- Charles E Foulds
- Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, Texas, 77030, USA
- Anastasia Kralli
- Department of Chemical Physiology, Scripps Research Institute, La Jolla, CA, 92037, USA
- Donald P McDonnell
- Department of Pharmacology and Cancer Biology, Duke University School of Medicine, Durham, NC, 27710, USA
- Neil J McKenna
- Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, Texas, 77030, USA
