1
Brooks TG, Lahens NF, Mrčela A, Grant GR. Challenges and best practices in omics benchmarking. Nat Rev Genet 2024; 25:326-339. [PMID: 38216661 DOI: 10.1038/s41576-023-00679-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 11/14/2023] [Indexed: 01/14/2024]
Abstract
Technological advances enabling massively parallel measurement of biological features - such as microarrays, high-throughput sequencing and mass spectrometry - have ushered in the omics era, now in its third decade. The resulting complex landscape of analytical methods has naturally fostered the growth of an omics benchmarking industry. Benchmarking refers to the process of objectively comparing and evaluating the performance of different computational or analytical techniques when processing and analysing large-scale biological data sets, such as transcriptomics, proteomics and metabolomics. With thousands of omics benchmarking studies published over the past 25 years, the field has matured to the point where the foundations of benchmarking have been established and well described. However, generating meaningful benchmarking data and properly evaluating performance in this complex domain remains challenging. In this Review, we highlight some common oversights and pitfalls in omics benchmarking. We also establish a methodology to bring the issues that can be addressed into focus and to be transparent about those that cannot: this takes the form of a spreadsheet template of guidelines for comprehensive reporting, intended to accompany publications. In addition, a survey of recent developments in benchmarking is provided as well as specific guidance for commonly encountered difficulties.
Affiliation(s)
- Thomas G Brooks
- Institute for Translational Medicine and Therapeutics, University of Pennsylvania, Philadelphia, PA, USA
- Nicholas F Lahens
- Institute for Translational Medicine and Therapeutics, University of Pennsylvania, Philadelphia, PA, USA
- Antonijo Mrčela
- Institute for Translational Medicine and Therapeutics, University of Pennsylvania, Philadelphia, PA, USA
- Gregory R Grant
- Institute for Translational Medicine and Therapeutics, University of Pennsylvania, Philadelphia, PA, USA.
- Department of Genetics, University of Pennsylvania, Philadelphia, PA, USA.
2
Luecken MD, Gigante S, Burkhardt DB, Cannoodt R, Strobl DC, Markov NS, Zappia L, Palla G, Lewis W, Dimitrov D, Vinyard ME, Magruder DS, Andersson A, Dann E, Qin Q, Otto DJ, Klein M, Botvinnik OB, Deconinck L, Waldrant K, Bloom JM, Pisco AO, Saez-Rodriguez J, Wulsin D, Pinello L, Saeys Y, Theis FJ, Krishnaswamy S. Defining and benchmarking open problems in single-cell analysis. RESEARCH SQUARE 2024:rs.3.rs-4181617. [PMID: 38645152 PMCID: PMC11030530 DOI: 10.21203/rs.3.rs-4181617/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/23/2024]
Abstract
With the growing number of single-cell analysis tools, benchmarks are increasingly important to guide analysis and method development. However, a lack of standardisation and extensibility in current benchmarks limits their usability, longevity, and relevance to the community. We present Open Problems, a living, extensible, community-guided benchmarking platform including 10 current single-cell tasks that we envision will raise standards for the selection, evaluation, and development of methods in single-cell analysis.
Affiliation(s)
- Malte D Luecken
- Institute of Computational Biology, Helmholtz Munich, Neuherberg, Germany
- Institute of Lung Health & Immunity, Helmholtz Munich; Member of the German Center for Lung Research (DZL), Munich, Germany
- Robrecht Cannoodt
- Data Intuitive, Lebbeke, Belgium
- Data Mining and Modelling for Biomedicine group, VIB Center for Inflammation Research, Ghent, Belgium
- Department of Applied Mathematics, Computer Science, and Statistics, Ghent University, Ghent, Belgium
- Daniel C Strobl
- Institute of Computational Biology, Helmholtz Munich, Neuherberg, Germany
- Institute of Clinical Chemistry and Pathobiochemistry, School of Medicine, Technical University of Munich, Munich, Germany
- TUM School of Life Sciences Weihenstephan, Technical University of Munich, Germany
- Nikolay S Markov
- Division of Pulmonary and Critical Care Medicine, Feinberg School of Medicine, Northwestern University
- Luke Zappia
- Institute of Computational Biology, Helmholtz Munich, Neuherberg, Germany
- Department of Mathematics, School of Computing, Information and Technology, Technical University of Munich, Munich, Germany
- Giovanni Palla
- Institute of Computational Biology, Helmholtz Munich, Neuherberg, Germany
- TUM School of Life Sciences Weihenstephan, Technical University of Munich, Germany
- Wesley Lewis
- Interdepartmental Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06511, USA
- Daniel Dimitrov
- Heidelberg University, Faculty of Medicine, and Heidelberg University Hospital, Institute for Computational Biomedicine, Heidelberg, Germany
- Michael E Vinyard
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA, USA
- Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Molecular Pathology Unit, Center for Cancer Research, Massachusetts General Hospital, Boston, MA, USA
- D S Magruder
- Department of Computer Science, Yale University, New Haven CT, USA
- Alma Andersson
- Genentech Inc
- Royal Institute of Technology (KTH), Gene Technology
- Science for Life Laboratory (SciLifeLab)
- Emma Dann
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK
- Qian Qin
- Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Dominik J Otto
- Basic Sciences Division, Fred Hutchinson Cancer Center, Seattle WA
- Computational Biology Program, Public Health Sciences Division, Seattle WA
- Translational Data Science IRC, Fred Hutchinson Cancer Center, Seattle WA
- Olga Borisovna Botvinnik
- Data Sciences Platform, Chan Zuckerberg Biohub, 499 Illinois St, San Francisco, CA 94158
- Bridge Bio Pharma, 3160 Porter Drive, Suite 250, Palo Alto, CA, 94304
- Louise Deconinck
- Data Mining and Modelling for Biomedicine group, VIB Center for Inflammation Research, Ghent, Belgium
- Department of Applied Mathematics, Computer Science, and Statistics, Ghent University, Ghent, Belgium
- Angela Oliveira Pisco
- Data Sciences Platform, Chan Zuckerberg Biohub, 499 Illinois St, San Francisco, CA 94158
- Insitro, South San Francisco
- Julio Saez-Rodriguez
- Heidelberg University, Faculty of Medicine, and Heidelberg University Hospital, Institute for Computational Biomedicine, Heidelberg, Germany
- Luca Pinello
- Molecular Pathology Unit, Center for Cancer Research, Massachusetts General Hospital, Boston, MA, USA
- Yvan Saeys
- Data Mining and Modelling for Biomedicine group, VIB Center for Inflammation Research, Ghent, Belgium
- Department of Applied Mathematics, Computer Science, and Statistics, Ghent University, Ghent, Belgium
- VIB Center for AI & Computational Biology (VIB.AI), Gent, Belgium
- Fabian J Theis
- Institute of Computational Biology, Helmholtz Munich, Neuherberg, Germany
- Department of Mathematics, School of Computing, Information and Technology, Technical University of Munich, Munich, Germany
- Cellular Genetics Programme, Wellcome Sanger Institute, Hinxton, UK (associated faculty)
- Smita Krishnaswamy
- Interdepartmental Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06511, USA
- Department of Computer Science, Yale University, New Haven CT, USA
- Department of Genetics, Yale University, New Haven CT, USA
3
Pascual-Reguant A, Kroh S, Hauser AE. Tissue niches and immunopathology through the lens of spatial tissue profiling techniques. Eur J Immunol 2024; 54:e2350484. [PMID: 37985207 DOI: 10.1002/eji.202350484] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2023] [Revised: 11/13/2023] [Accepted: 11/15/2023] [Indexed: 11/22/2023]
Abstract
Spatial organization plays a fundamental role in biology, influencing the function of biological structures at various levels. The immune system, in particular, relies on the orchestrated interactions of immune cells with their microenvironment to mount protective or pathogenic immune responses. The COVID-19 pandemic has underscored the significance of studying immunity within target organs to understand disease progression and severity. To achieve this, multiplex histology and spatial transcriptomics have proven indispensable in providing a spatial context to protein and gene expression patterns. By combining these techniques, researchers gain a more comprehensive understanding of the complex interactions at the cellular and molecular level in distinct tissue niches, key functional units modulating health and disease. In this review, we discuss recent advances in spatial tissue profiling techniques, highlighting their advantages over traditional histopathology studies. The insights gained from these approaches have the potential to revolutionize the diagnosis and treatment of various diseases including cancer, autoimmune disorders, and infectious diseases. However, we also acknowledge their challenges and limitations. Despite these, spatial tissue profiling offers promising opportunities to improve our understanding of how tissue niches direct regional immunity, and their relevance in tissue immunopathology, as a basis for novel therapeutic strategies and personalized medicine.
Affiliation(s)
- Anna Pascual-Reguant
- Department of Rheumatology and Clinical Immunology, Charité-Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Immune Dynamics, Deutsches Rheuma-Forschungszentrum (DRFZ), Leibniz Institute, Berlin, Germany
- Spatial Genomics, Centre Nacional d'Anàlisi Genòmica, Barcelona, 08028, Spain
- Sandy Kroh
- Immune Dynamics, Deutsches Rheuma-Forschungszentrum (DRFZ), Leibniz Institute, Berlin, Germany
- Anja E Hauser
- Department of Rheumatology and Clinical Immunology, Charité-Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Immune Dynamics, Deutsches Rheuma-Forschungszentrum (DRFZ), Leibniz Institute, Berlin, Germany
4
Yang J, Liu Y, Shang J, Chen Q, Chen Q, Ren L, Zhang N, Yu Y, Li Z, Song Y, Yang S, Scherer A, Tong W, Hong H, Xiao W, Shi L, Zheng Y. The Quartet Data Portal: integration of community-wide resources for multiomics quality control. Genome Biol 2023; 24:245. [PMID: 37884999 PMCID: PMC10601216 DOI: 10.1186/s13059-023-03091-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/08/2022] [Accepted: 10/17/2023] [Indexed: 10/28/2023] Open
Abstract
The Quartet Data Portal facilitates community access to well-characterized reference materials, reference datasets, and related resources established based on a family of four individuals with identical twins from the Quartet Project. Users can request DNA, RNA, protein, and metabolite reference materials, as well as datasets generated across omics, platforms, labs, protocols, and batches. Reproducible analysis tools allow for objective performance assessment of user-submitted data, while interactive visualization tools support rapid exploration of reference datasets. A closed-loop "distribution-collection-evaluation-integration" workflow enables updates and integration of community-contributed multiomics data. Ultimately, this portal helps promote the advancement of reference datasets and multiomics quality control.
Affiliation(s)
- Jingcheng Yang
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
- Greater Bay Area Institute of Precision Medicine, Guangzhou, Guangdong, China
- Yaqing Liu
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
- Jun Shang
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
- Qiaochu Chen
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
- Qingwang Chen
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
- Luyao Ren
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
- Naixin Zhang
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
- Ying Yu
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
- Zhihui Li
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
- Yueqiang Song
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
- Shengpeng Yang
- Intelligent Storage, Alibaba Cloud, Alibaba Group, Hangzhou, Zhejiang, China
- Andreas Scherer
- Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland
- EATRIS ERIC-European Infrastructure for Translational Medicine, Amsterdam, the Netherlands
- Weida Tong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
- Huixiao Hong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
- Wenming Xiao
- Office of Oncological Diseases, Office of New Drugs, Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, MD, USA
- Leming Shi
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China.
- International Human Phenome Institutes (Shanghai), Shanghai, China.
- Yuanting Zheng
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China.