1
Mora A, Schmidt C, Balderson B, Frezza C, Bodén M. SiRCle (Signature Regulatory Clustering) model integration reveals mechanisms of phenotype regulation in renal cancer. Genome Med 2024; 16:144. [PMID: 39633487 PMCID: PMC11616309 DOI: 10.1186/s13073-024-01415-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2024] [Accepted: 11/18/2024] [Indexed: 12/07/2024] Open
Abstract
BACKGROUND Clear cell renal cell carcinoma (ccRCC) tumours develop and progress via complex remodelling of the kidney epigenome, transcriptome, proteome and metabolome. Given the resulting tumour and inter-patient heterogeneity, drug-based treatments report limited success, calling for multi-omics studies to extract regulatory relationships and, ultimately, to develop targeted therapies. Yet methods for multi-omics integration that reveal mechanisms of phenotype regulation are lacking.
METHODS Here, we present SiRCle (Signature Regulatory Clustering), a method to integrate DNA methylation, RNA-seq and proteomics data at the gene level by following the central dogma of biology, i.e. genetic information proceeds from DNA, to RNA, to protein. To identify regulatory clusters across the different omics layers, we group genes based on the layer where the gene's dysregulation first occurred. We combine the SiRCle clusters with a variational autoencoder (VAE) to reveal key features from the omics data for each SiRCle cluster and compare patient subpopulations in a ccRCC and a PanCan cohort.
RESULTS Applying SiRCle to a ccRCC cohort, we showed that glycolysis is upregulated by DNA hypomethylation, whilst mitochondrial enzymes and respiratory chain complexes are translationally suppressed. Additionally, we identified metabolic enzymes associated with survival, along with the possible molecular driver behind each gene's perturbations. By using the VAE to integrate the omics data, followed by statistical comparisons between tumour stages on the integrated space, we found a stage-dependent downregulation of proximal renal tubule genes, hinting at a loss of cellular identity in cancer cells. We also identified the regulatory layers responsible for their suppression. Lastly, we applied SiRCle to a PanCan cohort and found common signatures across ccRCC and PanCan, in addition to the regulatory layer that defines tissue identity.
CONCLUSIONS Our results highlight SiRCle's ability to reveal mechanisms of phenotype regulation in cancer, both specifically in ccRCC and broadly in a PanCan context. SiRCle ranks genes according to biological features and is available at https://github.com/ArianeMora/SiRCle_multiomics_integration.
Affiliation(s)
- Ariane Mora
- School of Chemistry and Molecular Biosciences, University of Queensland, Molecular Biosciences Building 76, St Lucia, QLD, 4072, Australia
- Christina Schmidt
- Medical Research Council Cancer Unit, Hutchison/MRC Research Centre, University of Cambridge, Cambridge Biomedical Campus, Box 197, Cambridge, CB2 0XZ, UK
- University of Cologne, Faculty of Medicine and University Hospital Cologne, Institute for Metabolomics in Ageing, Cluster of Excellence Cellular Stress Responses in Aging-associated Diseases (CECAD), Joseph-Stelzmann-Str. 26, Cologne, 50931, Germany
- Brad Balderson
- School of Chemistry and Molecular Biosciences, University of Queensland, Molecular Biosciences Building 76, St Lucia, QLD, 4072, Australia
- Christian Frezza
- University of Cologne, Faculty of Medicine and University Hospital Cologne, Institute for Metabolomics in Ageing, Cluster of Excellence Cellular Stress Responses in Aging-associated Diseases (CECAD), Joseph-Stelzmann-Str. 26, Cologne, 50931, Germany.
- University of Cologne, Faculty of Mathematics and Natural Sciences, Institute of Genetics, Cluster of Excellence Cellular Stress Responses in Aging-associated Diseases (CECAD), Cologne, Germany.
- Mikael Bodén
- School of Chemistry and Molecular Biosciences, University of Queensland, Molecular Biosciences Building 76, St Lucia, QLD, 4072, Australia.
2
Choi S, An JY. Multiomics in cancer biomarker discovery and cancer subtyping. Adv Clin Chem 2024; 124:161-195. [PMID: 39818436 DOI: 10.1016/bs.acc.2024.10.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/18/2025]
Abstract
The advent of multiomics has ushered in a new era of cancer research characterized by integrated genomic, transcriptomic and proteomic analysis to unravel the complexities of cancer biology and facilitate the discovery of novel biomarkers. This chapter provides a comprehensive overview of the concept of multiomics, detailing the significant advances in the underlying technologies and their contributions to our understanding of cancer. It delves into the evolution of genomics and transcriptomics, breakthroughs in proteomics, and overarching progress in multiomic methodologies, highlighting their collective impact on cancer biomarker discovery. Furthermore, this chapter explores the computational methods essential for multiomic studies, including clustering techniques for delineating cancer subtypes, strategies for estimating molecular features and activities, and utility of pathway enrichment analyses for interpreting multiomic datasets. Particular focus has been placed on the application of these methods for identifying distinct cancer subtypes, thereby enabling a more personalized approach to cancer treatment. Through a detailed discussion of the scientific principles, technological advancements, and practical applications of multiomics, this chapter aims to underscore the pivotal role of multiomics in advancing cancer research and paving the way for personalized medicine. The insights provided herein not only illuminate the current landscape of cancer biomarker discovery, but also forecast future directions of multiomics research in oncology, advocating for a more integrated and nuanced approach to understanding and combating cancer.
Affiliation(s)
- Seunghwan Choi
- School of Biosystem and Biomedical Science, College of Health Science, Korea University, Seoul, Republic of Korea
- Joon-Yong An
- School of Biosystem and Biomedical Science, College of Health Science, Korea University, Seoul, Republic of Korea; Department of Integrated Biomedical and Life Science, Korea University, Seoul, Republic of Korea; BK21FOUR R&E Center for Learning Health Systems, Korea University, Seoul, Republic of Korea; L-HOPE Program for Community-Based Total Learning Health Systems, Korea University, Seoul, Republic of Korea.
3
Acharya D, Mukhopadhyay A. A comprehensive review of machine learning techniques for multi-omics data integration: challenges and applications in precision oncology. Brief Funct Genomics 2024; 23:549-560. [PMID: 38600757 DOI: 10.1093/bfgp/elae013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2023] [Revised: 03/12/2024] [Accepted: 03/22/2024] [Indexed: 04/12/2024] Open
Abstract
Multi-omics data play a crucial role in precision medicine, mainly to understand the diverse biological interaction between different omics. Machine learning approaches have been extensively employed in this context over the years. This review aims to comprehensively summarize and categorize these advancements, focusing on the integration of multi-omics data, which includes genomics, transcriptomics, proteomics and metabolomics, alongside clinical data. We discuss various machine learning techniques and computational methodologies used for integrating distinct omics datasets and provide valuable insights into their application. The review emphasizes both the challenges and opportunities present in multi-omics data integration, precision medicine and patient stratification, offering practical recommendations for method selection in various scenarios. Recent advances in deep learning and network-based approaches are also explored, highlighting their potential to harmonize diverse biological information layers. Additionally, we present a roadmap for the integration of multi-omics data in precision oncology, outlining the advantages, challenges and implementation difficulties. Hence this review offers a thorough overview of current literature, providing researchers with insights into machine learning techniques for patient stratification, particularly in precision oncology. Contact: anirban@klyuniv.ac.in.
Affiliation(s)
- Debabrata Acharya
- Department of Computer Science & Engineering, University of Kalyani, Kalyani-741235, West Bengal, India
- Anirban Mukhopadhyay
- Department of Computer Science & Engineering, University of Kalyani, Kalyani-741235, West Bengal, India
4
Hernández-Lemus E, Ochoa S. Methods for multi-omic data integration in cancer research. Front Genet 2024; 15:1425456. [PMID: 39364009 PMCID: PMC11446849 DOI: 10.3389/fgene.2024.1425456] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2024] [Accepted: 08/28/2024] [Indexed: 10/05/2024] Open
Abstract
Multi-omics data integration refers to the process of combining and analyzing data from different omic experimental sources, such as genomics, transcriptomics, methylation assays, and microRNA sequencing, among others. Such data integration approaches have the potential to provide a more comprehensive functional understanding of biological systems and have numerous applications in areas such as disease diagnosis, prognosis and therapy. However, quantitative integration of multi-omic data is a complex task that requires the use of highly specialized methods and approaches. Here, we discuss a number of data integration methods that have been developed with multi-omics data in view, including statistical methods, machine learning approaches, and network-based approaches. We also discuss the challenges and limitations of such methods and provide examples of their applications in the literature. Overall, this review aims to provide an overview of the current state of the field and highlight potential directions for future research.
Affiliation(s)
- Enrique Hernández-Lemus
- Computational Genomics Division, National Institute of Genomic Medicine, Mexico City, Mexico
- Center for Complexity Sciences, Universidad Nacional Autónoma de México, Mexico City, Mexico
- Soledad Ochoa
- Computational Genomics Division, National Institute of Genomic Medicine, Mexico City, Mexico
- Department of Obstetrics and Gynecology, Cedars-Sinai Medical Center, Los Angeles, CA, United States
5
Valous NA, Popp F, Zörnig I, Jäger D, Charoentong P. Graph machine learning for integrated multi-omics analysis. Br J Cancer 2024; 131:205-211. [PMID: 38729996 PMCID: PMC11263675 DOI: 10.1038/s41416-024-02706-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2023] [Revised: 04/25/2024] [Accepted: 04/26/2024] [Indexed: 05/12/2024] Open
Abstract
Multi-omics experiments at bulk or single-cell resolution facilitate the discovery of hypothesis-generating biomarkers for predicting response to therapy, as well as aid in uncovering mechanistic insights into cellular and microenvironmental processes. Many methods for data integration have been developed for the identification of key elements that explain or predict disease risk or other biological outcomes. The heterogeneous graph representation of multi-omics data provides an advantage for discerning patterns suitable for predictive/exploratory analysis, thus permitting the modeling of complex relationships. Graph-based approaches, including graph neural networks, potentially offer a reliable methodological toolset that can provide a tangible alternative to scientists and clinicians who seek ideas and implementation strategies in the integrated analysis of their omics sets for biomedical research. Graph-based workflows continue to push the limits of the technological envelope, and this perspective provides a focused literature review of research articles in which graph machine learning is utilized for integrated multi-omics data analyses, with several examples that demonstrate the effectiveness of graph-based approaches.
Affiliation(s)
- Nektarios A Valous
- Applied Tumor Immunity Clinical Cooperation Unit, National Center for Tumor Diseases (NCT), German Cancer Research Center (DKFZ), Im Neuenheimer Feld 460, 69120, Heidelberg, Germany.
- Center for Quantitative Analysis of Molecular and Cellular Biosystems (Bioquant), Heidelberg University, Im Neuenheimer Feld 267, 69120, Heidelberg, Germany.
- Ferdinand Popp
- Applied Tumor Immunity Clinical Cooperation Unit, National Center for Tumor Diseases (NCT), German Cancer Research Center (DKFZ), Im Neuenheimer Feld 460, 69120, Heidelberg, Germany
- Division of Applied Bioinformatics, German Cancer Research Center (DKFZ), Im Neuenheimer Feld 280, 69120, Heidelberg, Germany
- Inka Zörnig
- Center for Quantitative Analysis of Molecular and Cellular Biosystems (Bioquant), Heidelberg University, Im Neuenheimer Feld 267, 69120, Heidelberg, Germany
- Department of Medical Oncology, National Center for Tumor Diseases (NCT), Heidelberg University Hospital (UKHD), Im Neuenheimer Feld 460, 69120, Heidelberg, Germany
- Dirk Jäger
- Applied Tumor Immunity Clinical Cooperation Unit, National Center for Tumor Diseases (NCT), German Cancer Research Center (DKFZ), Im Neuenheimer Feld 460, 69120, Heidelberg, Germany
- Center for Quantitative Analysis of Molecular and Cellular Biosystems (Bioquant), Heidelberg University, Im Neuenheimer Feld 267, 69120, Heidelberg, Germany
- Department of Medical Oncology, National Center for Tumor Diseases (NCT), Heidelberg University Hospital (UKHD), Im Neuenheimer Feld 460, 69120, Heidelberg, Germany
- Pornpimol Charoentong
- Center for Quantitative Analysis of Molecular and Cellular Biosystems (Bioquant), Heidelberg University, Im Neuenheimer Feld 267, 69120, Heidelberg, Germany
- Department of Medical Oncology, National Center for Tumor Diseases (NCT), Heidelberg University Hospital (UKHD), Im Neuenheimer Feld 460, 69120, Heidelberg, Germany
6
Vaparanta K, Merilahti JAM, Ojala VK, Elenius K. De Novo Multi-Omics Pathway Analysis Designed for Prior Data Independent Inference of Cell Signaling Pathways. Mol Cell Proteomics 2024; 23:100780. [PMID: 38703893 PMCID: PMC11259815 DOI: 10.1016/j.mcpro.2024.100780] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2023] [Revised: 04/07/2024] [Accepted: 04/30/2024] [Indexed: 05/06/2024] Open
Abstract
New tools for cell signaling pathway inference from multi-omics data that are independent of previous knowledge are needed. Here, we propose a new de novo method, the de novo multi-omics pathway analysis (DMPA), to model and combine omics data into network modules and pathways. DMPA was validated with published omics data and was found accurate in discovering reported molecular associations in transcriptome, interactome, phosphoproteome, methylome, and metabolomics data, and signaling pathways in multi-omics data. DMPA was benchmarked against module discovery and multi-omics integration methods and outperformed previous methods in module and pathway discovery especially when applied to datasets of relatively low sample sizes. Transcription factor, kinase, subcellular location, and function prediction algorithms were devised for transcriptome, phosphoproteome, and interactome modules and pathways, respectively. To apply DMPA in a biologically relevant context, interactome, phosphoproteome, transcriptome, and proteome data were collected from analyses carried out using melanoma cells to address gamma-secretase cleavage-dependent signaling characteristics of the receptor tyrosine kinase TYRO3. The pathways modeled with DMPA reflected the predicted function and its direction in validation experiments.
Affiliation(s)
- Katri Vaparanta
- Turku Bioscience Centre, University of Turku and Åbo Akademi University, Turku, Finland; Medicity Research Laboratories, University of Turku, Turku, Finland; Institute of Biomedicine, University of Turku, Turku, Finland.
- Johannes A M Merilahti
- Turku Bioscience Centre, University of Turku and Åbo Akademi University, Turku, Finland; Medicity Research Laboratories, University of Turku, Turku, Finland; Institute of Biomedicine, University of Turku, Turku, Finland
- Veera K Ojala
- Turku Bioscience Centre, University of Turku and Åbo Akademi University, Turku, Finland; Medicity Research Laboratories, University of Turku, Turku, Finland; Institute of Biomedicine, University of Turku, Turku, Finland
- Klaus Elenius
- Turku Bioscience Centre, University of Turku and Åbo Akademi University, Turku, Finland; Medicity Research Laboratories, University of Turku, Turku, Finland; Institute of Biomedicine, University of Turku, Turku, Finland; Department of Oncology, Turku University Hospital, Turku, Finland.
7
Guo Y, Luo L, Zhu J, Li C. Advance in Multi-omics Research Strategies on Cholesterol Metabolism in Psoriasis. Inflammation 2024; 47:839-852. [PMID: 38244176 DOI: 10.1007/s10753-023-01961-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/24/2023] [Revised: 11/29/2023] [Accepted: 12/25/2023] [Indexed: 01/22/2024]
Abstract
The skin is a complex and dynamic organ where homeostasis is maintained through the intricate interplay between the immune system and metabolism, particularly cholesterol metabolism. Various factors such as cytokines, inflammatory mediators, cholesterol metabolites, and metabolic enzymes play crucial roles in facilitating these interactions. Dysregulation of this delicate balance contributes to the pathogenic pathways of inflammatory skin conditions, notably psoriasis. In this article, we provide an overview of omics biomarkers associated with psoriasis in relation to cholesterol metabolism. We explore multi-omics approaches that reveal the communication between immunometabolism and psoriatic inflammation. Additionally, we summarize the use of multi-omics strategies to uncover the complexities of multifactorial and heterogeneous inflammatory diseases. Finally, we highlight potential future perspectives related to targeted drug therapies and research areas that can advance precision medicine. This review aims to serve as a valuable resource for those investigating the role of cholesterol metabolism in psoriasis.
Affiliation(s)
- Youming Guo
- Hospital for Skin Diseases, Institute of Dermatology, Chinese Academy of Medical Sciences & Peking Union Medical College, Nanjing, Jiangsu, China
- Jiangsu Key Laboratory of Molecular Biology for Skin Diseases and STIs, Nanjing, Jiangsu, China
- Lingling Luo
- Hospital for Skin Diseases, Institute of Dermatology, Chinese Academy of Medical Sciences & Peking Union Medical College, Nanjing, Jiangsu, China
- Jing Zhu
- Hospital for Skin Diseases, Institute of Dermatology, Chinese Academy of Medical Sciences & Peking Union Medical College, Nanjing, Jiangsu, China
- Chengrang Li
- Hospital for Skin Diseases, Institute of Dermatology, Chinese Academy of Medical Sciences & Peking Union Medical College, Nanjing, Jiangsu, China.
- Jiangsu Key Laboratory of Molecular Biology for Skin Diseases and STIs, Nanjing, Jiangsu, China.
8
Afroz S, Islam N, Habib MA, Reza MS, Ashad Alam M. Multi-omics data integration and drug screening of AML cancer using Generative Adversarial Network. Methods 2024; 226:138-150. [PMID: 38670415 DOI: 10.1016/j.ymeth.2024.04.017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2023] [Revised: 04/02/2024] [Accepted: 04/20/2024] [Indexed: 04/28/2024] Open
Abstract
In the era of precision medicine, accurate disease phenotype prediction for heterogeneous diseases, such as cancer, is emerging due to advanced technologies that link genotypes and phenotypes. However, it is difficult to integrate different types of biological data because they are so varied. In this study, we focused on predicting the traits of a blood cancer called Acute Myeloid Leukemia (AML) by combining different kinds of biological data. We used a recently developed method called Omics Generative Adversarial Network (GAN) to better classify cancer outcomes. The primary advantages of a GAN include its ability to create synthetic data that is nearly indistinguishable from real data, its high flexibility, and its wide range of applications, including multi-omics data analysis. In addition, the GAN was effective at combining two types of biological data. We created synthetic datasets for gene activity and DNA methylation. Our method was more accurate in predicting disease traits than using the original data alone. The experimental results provided evidence that the creation of synthetic data through interacting multi-omics data analysis using GANs improves the overall prediction quality. Furthermore, we identified the top-ranked significant genes through statistical methods and pinpointed potential candidate drug agents through in-silico studies. The proposed drugs, also supported by other independent studies, might play a crucial role in the treatment of AML cancer. The code is available on GitHub: https://github.com/SabrinAfroz/omicsGAN_codes.
Affiliation(s)
- Sabrin Afroz
- Department of Information and Communication Technology, Mawlana Bhashani Science and Technology University, Bangladesh
- Nadira Islam
- Department of Information and Communication Technology, Mawlana Bhashani Science and Technology University, Bangladesh
- Md Ahsan Habib
- Department of Information and Communication Technology, Mawlana Bhashani Science and Technology University, Bangladesh; Statistical Learning Group, Bangladesh
- Md Selim Reza
- Tulane Center for Biomedical Informatics and Genomics, Deming Department of Medicine, Tulane University, New Orleans, LA 70112, USA; Statistical Learning Group, Bangladesh
- Md Ashad Alam
- Ochsner Center for Outcomes Research, Ochsner Research, Ochsner Clinic Foundation, New Orleans, LA 70121, USA; Statistical Learning Group, Bangladesh.
9
Zhao C, Liu A, Zhang X, Cao X, Ding Z, Sha Q, Shen H, Deng HW, Zhou W. CLCLSA: Cross-omics linked embedding with contrastive learning and self attention for integration with incomplete multi-omics data. Comput Biol Med 2024; 170:108058. [PMID: 38295477 PMCID: PMC10959569 DOI: 10.1016/j.compbiomed.2024.108058] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2023] [Revised: 12/30/2023] [Accepted: 01/26/2024] [Indexed: 02/02/2024]
Abstract
Integration of heterogeneous and high-dimensional multi-omics data is becoming increasingly important in understanding the etiology of complex genetic diseases. Each omics technique provides only a limited view of the underlying biological process, and integrating heterogeneous omics layers simultaneously would lead to a more comprehensive and detailed understanding of diseases and phenotypes. However, one obstacle faced when performing multi-omics data integration is the existence of unpaired multi-omics data due to instrument sensitivity and cost. Studies may fail if certain aspects of the subjects are missing or incomplete. In this paper, we propose a deep learning method for multi-omics integration with incomplete data by Cross-omics Linked unified embedding with Contrastive Learning and Self Attention (CLCLSA). Utilizing complete multi-omics data as supervision, the model employs cross-omics autoencoders to learn the feature representation across different types of biological data. Multi-omics contrastive learning is employed to maximize the mutual information between different types of omics. In addition, feature-level and omics-level self-attention are employed to dynamically identify the most informative features for multi-omics data integration. Finally, a Softmax classifier is employed to perform multi-omics data classification. Extensive experiments were conducted on four public multi-omics datasets. The experimental results indicate that our proposed CLCLSA produces promising results in multi-omics data classification using both complete and incomplete multi-omics data.
Affiliation(s)
- Chen Zhao
- Department of Computer Science, Kennesaw State University, Marietta, GA, 30060, USA
- Anqi Liu
- Division of Biomedical Informatics and Genomics, Tulane Center of Biomedical Informatics and Genomics, Deming Department of Medicine, Tulane University, New Orleans, LA, 70112, USA
- Xiao Zhang
- Division of Biomedical Informatics and Genomics, Tulane Center of Biomedical Informatics and Genomics, Deming Department of Medicine, Tulane University, New Orleans, LA, 70112, USA
- Xuewei Cao
- Department of Mathematical Sciences, Michigan Technological University, 1400 Townsend Dr, Houghton, MI, 49931, USA
- Zhengming Ding
- Department of Computer Science, Tulane University, New Orleans, LA, 70118, USA
- Qiuying Sha
- Department of Mathematical Sciences, Michigan Technological University, 1400 Townsend Dr, Houghton, MI, 49931, USA
- Hui Shen
- Division of Biomedical Informatics and Genomics, Tulane Center of Biomedical Informatics and Genomics, Deming Department of Medicine, Tulane University, New Orleans, LA, 70112, USA
- Hong-Wen Deng
- Division of Biomedical Informatics and Genomics, Tulane Center of Biomedical Informatics and Genomics, Deming Department of Medicine, Tulane University, New Orleans, LA, 70112, USA.
- Weihua Zhou
- Department of Applied Computing, Michigan Technological University, 1400 Townsend Dr, Houghton, MI, 49931, USA; Center for Biocomputing and Digital Health, Institute of Computing and Cybersystems, and Health Research Institute, Michigan Technological University, Houghton, MI, 49931, USA.
10
Fernandez ME, Martinez-Romero J, Aon MA, Bernier M, Price NL, de Cabo R. How is Big Data reshaping preclinical aging research? Lab Anim (NY) 2023; 52:289-314. [PMID: 38017182 DOI: 10.1038/s41684-023-01286-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2023] [Accepted: 10/10/2023] [Indexed: 11/30/2023]
Abstract
The exponential scientific and technological progress during the past 30 years has favored the comprehensive characterization of aging processes with their multivariate nature, leading to the advent of Big Data in preclinical aging research. Spanning from molecular omics to organism-level deep phenotyping, Big Data demands large computational resources for storage and analysis, as well as new analytical tools and conceptual frameworks to gain novel insights leading to discovery. Systems biology has emerged as a paradigm that utilizes Big Data to gain insightful information enabling a better understanding of living organisms, visualized as multilayered networks of interacting molecules, cells, tissues and organs at different spatiotemporal scales. In this framework, where aging, health and disease represent emergent states from an evolving dynamic complex system, context given by, for example, strain, sex and feeding times, becomes paramount for defining the biological trajectory of an organism. Using bioinformatics and artificial intelligence, the systems biology approach is leading to remarkable advances in our understanding of the underlying mechanism of aging biology and assisting in creative experimental study designs in animal models. Future in-depth knowledge acquisition will depend on the ability to fully integrate information from different spatiotemporal scales in organisms, which will probably require the adoption of theories and methods from the field of complex systems. Here we review state-of-the-art approaches in preclinical research, with a focus on rodent models, that are leading to conceptual and/or technical advances in leveraging Big Data to understand basic aging biology and its full translational potential.
Affiliation(s)
- Maria Emilia Fernandez
- Experimental Gerontology Section, Translational Gerontology Branch, National Institute on Aging, National Institutes of Health, Baltimore, MD, USA
- Jorge Martinez-Romero
- Experimental Gerontology Section, Translational Gerontology Branch, National Institute on Aging, National Institutes of Health, Baltimore, MD, USA
- Laboratory of Epidemiology and Population Science, National Institute on Aging, National Institutes of Health, Baltimore, MD, USA
- Miguel A Aon
- Experimental Gerontology Section, Translational Gerontology Branch, National Institute on Aging, National Institutes of Health, Baltimore, MD, USA
- Laboratory of Cardiovascular Science, National Institute on Aging, National Institutes of Health, Baltimore, MD, USA
- Michel Bernier
- Experimental Gerontology Section, Translational Gerontology Branch, National Institute on Aging, National Institutes of Health, Baltimore, MD, USA
- Nathan L Price
- Experimental Gerontology Section, Translational Gerontology Branch, National Institute on Aging, National Institutes of Health, Baltimore, MD, USA
- Rafael de Cabo
- Experimental Gerontology Section, Translational Gerontology Branch, National Institute on Aging, National Institutes of Health, Baltimore, MD, USA.
11
Wu Y, Seufert I, Al-Shaheri FN, Kurilov R, Bauer AS, Manoochehri M, Moskalev EA, Brors B, Tjaden C, Giese NA, Hackert T, Büchler MW, Hoheisel JD. DNA-methylation signature accurately differentiates pancreatic cancer from chronic pancreatitis in tissue and plasma. Gut 2023; 72:2344-2353. [PMID: 37709492 PMCID: PMC10715533 DOI: 10.1136/gutjnl-2023-330155] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/24/2023] [Accepted: 08/31/2023] [Indexed: 09/16/2023]
Abstract
OBJECTIVE Pancreatic ductal adenocarcinoma (PDAC) is a lethal malignancy. Differentiation from chronic pancreatitis (CP) is currently inaccurate in about one-third of cases. Misdiagnoses in both directions, however, have severe consequences for patients. We set out to identify molecular markers for a clear distinction between PDAC and CP. DESIGN Genome-wide variations of DNA-methylation, messenger RNA and microRNA level as well as combinations thereof were analysed in 345 tissue samples for marker identification. To improve diagnostic performance, we established a random-forest machine-learning approach. Results were validated on another 48 samples and further corroborated in 16 liquid biopsy samples. RESULTS Machine-learning succeeded in defining markers to differentiate between patients with PDAC and CP, while low-dimensional embedding and cluster analysis failed to do so. DNA-methylation yielded the best diagnostic accuracy by far, dwarfing the importance of transcript levels. Identified changes were confirmed with data taken from public repositories and validated in independent sample sets. A signature of six DNA-methylation sites in a CpG-island of the protein kinase C beta type gene achieved a validated diagnostic accuracy of 100% in tissue and in circulating free DNA isolated from patient plasma. CONCLUSION The success of machine-learning to identify an effective marker signature documents the power of this approach. The high diagnostic accuracy of discriminating PDAC from CP could have tremendous consequences for treatment success, once the result from still a limited number of liquid biopsy samples would be confirmed in a larger cohort of patients with suspected pancreatic cancer.
Affiliation(s)
- Yenan Wu
- Division of Functional Genome Analysis, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Faculty of Biosciences, Heidelberg University, Heidelberg, Germany
| | - Isabelle Seufert
- Division of Functional Genome Analysis, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Faculty of Biosciences, Heidelberg University, Heidelberg, Germany
| | - Fawaz N Al-Shaheri
- Division of Functional Genome Analysis, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Medical Faculty Heidelberg, University of Heidelberg, Heidelberg, Germany
| | - Roman Kurilov
- Division of Applied Bioinformatics, German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Andrea S Bauer
- Division of Functional Genome Analysis, German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Mehdi Manoochehri
- Division of Functional Genome Analysis, German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Evgeny A Moskalev
- Institute of Pathology, Universitätsklinikum Erlangen, Friedrich Alexander Universität Erlangen-Nürnberg, Erlangen, Germany
| | - Benedikt Brors
- Division of Applied Bioinformatics, German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Christin Tjaden
- Department of Surgery, Heidelberg University Hospital, Heidelberg, Germany
| | - Nathalia A Giese
- Department of Surgery, Heidelberg University Hospital, Heidelberg, Germany
| | - Thilo Hackert
- Department of Surgery, Heidelberg University Hospital, Heidelberg, Germany
| | - Markus W Büchler
- Department of Surgery, Heidelberg University Hospital, Heidelberg, Germany
| | - Jörg D Hoheisel
- Division of Functional Genome Analysis, German Cancer Research Center (DKFZ), Heidelberg, Germany
| |
|
12
|
Duroux D, Wohlfart C, Van Steen K, Vladimirova A, King M. Graph-based multi-modality integration for prediction of cancer subtype and severity. Sci Rep 2023; 13:19653. [PMID: 37949935 PMCID: PMC10638406 DOI: 10.1038/s41598-023-46392-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2023] [Accepted: 10/31/2023] [Indexed: 11/12/2023] Open
Abstract
Personalised cancer screening before therapy paves the way toward improving diagnostic accuracy and treatment outcomes. Most approaches are limited to a single data type and do not consider interactions between features, leaving aside the complementary insights that multimodality and systems biology can provide. In this project, we demonstrate the use of graph theory for data integration via individual networks where nodes and edges are individual-specific. We showcase the consequences of early, intermediate, and late graph-based fusion of RNA-Seq data and histopathology whole-slide images for predicting cancer subtypes and severity. The methodology developed is as follows: (1) we create individual networks; (2) we compute the similarity between individuals from these graphs; (3) we train our model on the similarity matrices; (4) we evaluate the performance using the macro F1 score. Pros and cons of elements of the pipeline are evaluated on publicly available real-life datasets. We find that graph-based methods can increase performance over methods that do not study interactions. Additionally, merging multiple data sources often improves classification compared to models based on single data, especially through intermediate fusion. The proposed workflow can easily be adapted to other disease contexts to accelerate and enhance personalized healthcare.
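Step (4) of the methodology above evaluates predictions with the macro F1 score. As a minimal sketch (not the authors' code; the function name `macro_f1` and the toy labels are illustrative assumptions), macro F1 can be computed as the unweighted mean of the per-class F1 scores, so that every subtype counts equally regardless of how many samples it has:

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (hypothetical helper)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    # Score every class that appears in either the truth or the predictions.
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    scores = []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class is never hit.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example with two subtypes "a" and "b".
truth = ["a", "a", "b", "b"]
predicted = ["a", "b", "b", "b"]
score = macro_f1(truth, predicted)
```

In this toy case the per-class F1 values are 2/3 for "a" and 4/5 for "b", so the macro average sits between them; a majority-class-only predictor would be penalised by the zero score of the class it never predicts.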
Affiliation(s)
- Diane Duroux
- BIO3 - Systems Genetics, GIGA-R Medical Genomics, University of Liège, 4000, Liège, Belgium.
- Post-Doctoral Fellow, ETH AI center, Zürich, Switzerland.
| | | | - Kristel Van Steen
- BIO3 - Systems Genetics, GIGA-R Medical Genomics, University of Liège, 4000, Liège, Belgium
- Department of Human Genetics, BIO3 - Systems Medicine, 3000, Leuven, Belgium
| | - Antoaneta Vladimirova
- Roche Information Solutions, Roche Diagnostics Corporation, Santa Clara, California, United States of America
| | | |
|
13
|
Yousefi B, Melograna F, Galazzo G, van Best N, Mommers M, Penders J, Schwikowski B, Van Steen K. Capturing the dynamics of microbial interactions through individual-specific networks. Front Microbiol 2023; 14:1170391. [PMID: 37256048 PMCID: PMC10225591 DOI: 10.3389/fmicb.2023.1170391] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 02/20/2023] [Accepted: 04/21/2023] [Indexed: 06/01/2023] Open
Abstract
Longitudinal analysis of multivariate individual-specific microbiome profiles over time or across conditions remains daunting. Most statistical tools and methods that are available to study microbiomes are based on cross-sectional data. Over the past few years, several attempts have been made to model the dynamics of bacterial species over time or across conditions. However, the field needs novel views on handling microbial interactions in temporal analyses. This study proposes a novel data analysis framework, MNDA, that combines representation learning and individual-specific microbial co-occurrence networks to uncover taxon neighborhood dynamics. As a use case, we consider a cohort of newborns with microbiomes available at 6 and 9 months after birth, and extraneous data available on the mode of delivery and diet changes between the considered time points. Our results show that prediction models for these extraneous outcomes based on an MNDA measure of local neighborhood dynamics for each taxon outperform traditional prediction models solely based on individual-specific microbial abundances. Furthermore, our results show that unsupervised similarity analysis of newborns in the study, again using the notion of a taxon's dynamic neighborhood derived from time-matched individual-specific microbial networks, can reveal different subpopulations of individuals, compared to standard microbiome-based clustering, with potential relevance to clinical practice. This study highlights the complementarity of microbial interactions and abundances in downstream analyses and opens new avenues to personalized prediction or stratified medicine with temporal microbiome data.
Affiliation(s)
- Behnam Yousefi
- Computational Systems Biomedicine Lab, Institut Pasteur, University Paris City, Paris, France
- École Doctorale Complexite du vivant, Sorbonne University, Paris, France
- BIO3—Laboratory for Systems Medicine, Department of Human Genetics, Katholieke Universiteit Leuven, Leuven, Belgium
| | - Federico Melograna
- BIO3—Laboratory for Systems Medicine, Department of Human Genetics, Katholieke Universiteit Leuven, Leuven, Belgium
| | - Gianluca Galazzo
- Department of Medical Microbiology, Infectious Diseases and Infection Prevention, School of Nutrition and Translational Research in Metabolism, Maastricht University Medical Center+, Maastricht, Netherlands
| | - Niels van Best
- Department of Medical Microbiology, Infectious Diseases and Infection Prevention, School of Nutrition and Translational Research in Metabolism, Maastricht University Medical Center+, Maastricht, Netherlands
- Institute of Medical Microbiology, Rhine-Westphalia Technical University of Aachen, RWTH University, Aachen, Germany
| | - Monique Mommers
- Department of Epidemiology, Care and Public Health Research Institute (CAPHRI), Maastricht University, Maastricht, Netherlands
| | - John Penders
- Department of Medical Microbiology, Infectious Diseases and Infection Prevention, School of Nutrition and Translational Research in Metabolism, Maastricht University Medical Center+, Maastricht, Netherlands
- Department of Medical Microbiology, Infectious Diseases and Infection Prevention, Care and Public Health Research Institute (CAPHRI), Maastricht University Medical Center+, Maastricht, Netherlands
| | - Benno Schwikowski
- Computational Systems Biomedicine Lab, Institut Pasteur, University Paris City, Paris, France
| | - Kristel Van Steen
- BIO3—Laboratory for Systems Medicine, Department of Human Genetics, Katholieke Universiteit Leuven, Leuven, Belgium
- BIO3—Laboratory for Systems Genetics, GIGA-R Medical Genomics, University of Liège, Liège, Belgium
| |
|
14
|
Zhao C, Liu A, Zhang X, Cao X, Ding Z, Sha Q, Shen H, Deng HW, Zhou W. CLCLSA: Cross-omics Linked embedding with Contrastive Learning and Self Attention for multi-omics integration with incomplete multi-omics data. RESEARCH SQUARE 2023:rs.3.rs-2768563. [PMID: 37205427 PMCID: PMC10187371 DOI: 10.21203/rs.3.rs-2768563/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/21/2023]
Abstract
Integration of heterogeneous and high-dimensional multi-omics data is becoming increasingly important in understanding genetic data. Each omics technique only provides a limited view of the underlying biological process and integrating heterogeneous omics layers simultaneously would lead to a more comprehensive and detailed understanding of diseases and phenotypes. However, one obstacle faced when performing multi-omics data integration is the existence of unpaired multi-omics data due to instrument sensitivity and cost. Studies may fail if certain aspects of the subjects are missing or incomplete. In this paper, we propose a deep learning method for multi-omics integration with incomplete data by Cross-omics Linked unified embedding with Contrastive Learning and Self Attention (CLCLSA). Utilizing complete multi-omics data as supervision, the model employs cross-omics autoencoders to learn the feature representation across different types of biological data. The multi-omics contrastive learning, which is used to maximize the mutual information between different types of omics, is employed before latent feature concatenation. In addition, the feature-level self-attention and omics-level self-attention are employed to dynamically identify the most informative features for multi-omics data integration. Extensive experiments were conducted on four public multi-omics datasets. The experimental results indicated that the proposed CLCLSA outperformed the state-of-the-art approaches for multi-omics data classification using incomplete multi-omics data.
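The abstract does not state which contrastive objective is used. One common way to pull paired views of the same sample together, and thereby maximise a lower bound on their mutual information, is an InfoNCE-style loss. The sketch below assumes that form and is not the CLCLSA implementation; `cosine`, `info_nce`, `temperature` and the toy embeddings are all illustrative names and values, and the embeddings are assumed to be non-zero vectors.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(view_a, view_b, temperature=0.5):
    """InfoNCE-style loss: row i of view_a should match row i of view_b.

    view_a and view_b are lists of latent vectors for the same samples,
    e.g. embeddings of two omics types (an assumption for illustration).
    """
    n = len(view_a)
    total = 0.0
    for i in range(n):
        logits = [cosine(view_a[i], view_b[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp over all candidate partners.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(x - top) for x in logits))
        # Cross-entropy with the true partner (index i) as the target.
        total += log_denominator - logits[i]
    return total / n


# Toy latent vectors for two samples in two omics views.
omics_1 = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]     # same sample order as omics_1
misaligned = [[0.0, 1.0], [1.0, 0.0]]  # partners swapped

loss_aligned = info_nce(omics_1, aligned)
loss_misaligned = info_nce(omics_1, misaligned)
```

With these toy vectors the loss is small when each sample's two views point the same way and large when the partners are swapped, which is the behaviour a contrastive term is meant to reward before the latent features are concatenated.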
Affiliation(s)
- Chen Zhao
- Department of Applied Computing, Michigan Technological University, 1400 Townsend Dr, Houghton, MI 49931, USA
| | - Anqi Liu
- Division of Biomedical Informatics and Genomics, Tulane Center of Biomedical Informatics and Genomics, Deming Department of Medicine, Tulane University, New Orleans, LA 70112, USA
| | - Xiao Zhang
- Division of Biomedical Informatics and Genomics, Tulane Center of Biomedical Informatics and Genomics, Deming Department of Medicine, Tulane University, New Orleans, LA 70112, USA
| | - Xuewei Cao
- Department of Mathematical Sciences, Michigan Technological University, 1400 Townsend Dr, Houghton, MI 49931, USA
| | - Zhengming Ding
- Department of Computer Science, Tulane University, New Orleans, LA 70118, USA
| | - Qiuying Sha
- Department of Mathematical Sciences, Michigan Technological University, 1400 Townsend Dr, Houghton, MI 49931, USA
| | - Hui Shen
- Division of Biomedical Informatics and Genomics, Tulane Center of Biomedical Informatics and Genomics, Deming Department of Medicine, Tulane University, New Orleans, LA 70112, USA
| | - Hong-Wen Deng
- Division of Biomedical Informatics and Genomics, Tulane Center of Biomedical Informatics and Genomics, Deming Department of Medicine, Tulane University, New Orleans, LA 70112, USA
| | - Weihua Zhou
- Department of Applied Computing, Michigan Technological University, 1400 Townsend Dr, Houghton, MI 49931, USA
- Center for Biocomputing and Digital Health, Institute of Computing and Cybersystems, and Health Research Institute, Michigan Technological University, Houghton, MI 49931, USA
| |
|
15
|
Zhao C, Liu A, Zhang X, Cao X, Ding Z, Sha Q, Shen H, Deng HW, Zhou W. CLCLSA: Cross-omics Linked embedding with Contrastive Learning and Self Attention for multi-omics integration with incomplete multi-omics data. ARXIV 2023:arXiv:2304.05542v1. [PMID: 37090237 PMCID: PMC10120753] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/25/2023]
Abstract
Integration of heterogeneous and high-dimensional multi-omics data is becoming increasingly important in understanding genetic data. Each omics technique only provides a limited view of the underlying biological process and integrating heterogeneous omics layers simultaneously would lead to a more comprehensive and detailed understanding of diseases and phenotypes. However, one obstacle faced when performing multi-omics data integration is the existence of unpaired multi-omics data due to instrument sensitivity and cost. Studies may fail if certain aspects of the subjects are missing or incomplete. In this paper, we propose a deep learning method for multi-omics integration with incomplete data by Cross-omics Linked unified embedding with Contrastive Learning and Self Attention (CLCLSA). Utilizing complete multi-omics data as supervision, the model employs cross-omics autoencoders to learn the feature representation across different types of biological data. The multi-omics contrastive learning, which is used to maximize the mutual information between different types of omics, is employed before latent feature concatenation. In addition, the feature-level self-attention and omics-level self-attention are employed to dynamically identify the most informative features for multi-omics data integration. Extensive experiments were conducted on four public multi-omics datasets. The experimental results indicated that the proposed CLCLSA outperformed the state-of-the-art approaches for multi-omics data classification using incomplete multi-omics data.
Affiliation(s)
- Chen Zhao
- Department of Applied Computing, Michigan Technological University, 1400 Townsend Dr, Houghton, MI 49931, USA
| | - Anqi Liu
- Division of Biomedical Informatics and Genomics, Tulane Center of Biomedical Informatics and Genomics, Deming Department of Medicine, Tulane University, New Orleans, LA 70112, USA
| | - Xiao Zhang
- Division of Biomedical Informatics and Genomics, Tulane Center of Biomedical Informatics and Genomics, Deming Department of Medicine, Tulane University, New Orleans, LA 70112, USA
| | - Xuewei Cao
- Department of Mathematical Sciences, Michigan Technological University, 1400 Townsend Dr, Houghton, MI 49931, USA
| | - Zhengming Ding
- Department of Computer Science, Tulane University, New Orleans, LA 70118, USA
| | - Qiuying Sha
- Department of Mathematical Sciences, Michigan Technological University, 1400 Townsend Dr, Houghton, MI 49931, USA
| | - Hui Shen
- Division of Biomedical Informatics and Genomics, Tulane Center of Biomedical Informatics and Genomics, Deming Department of Medicine, Tulane University, New Orleans, LA 70112, USA
| | - Hong-Wen Deng
- Division of Biomedical Informatics and Genomics, Tulane Center of Biomedical Informatics and Genomics, Deming Department of Medicine, Tulane University, New Orleans, LA 70112, USA
| | - Weihua Zhou
- Department of Applied Computing, Michigan Technological University, 1400 Townsend Dr, Houghton, MI 49931, USA
- Center for Biocomputing and Digital Health, Institute of Computing and Cybersystems, and Health Research Institute, Michigan Technological University, Houghton, MI 49931, USA
| |
|
16
|
Donovan SM, Aghaeepour N, Andres A, Azad MB, Becker M, Carlson SE, Järvinen KM, Lin W, Lönnerdal B, Slupsky CM, Steiber AL, Raiten DJ. Evidence for human milk as a biological system and recommendations for study design-a report from "Breastmilk Ecology: Genesis of Infant Nutrition (BEGIN)" Working Group 4. Am J Clin Nutr 2023; 117 Suppl 1:S61-S86. [PMID: 37173061 PMCID: PMC10356565 DOI: 10.1016/j.ajcnut.2022.12.021] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 06/10/2022] [Revised: 12/06/2022] [Accepted: 12/08/2022] [Indexed: 05/15/2023] Open
Abstract
Human milk contains all of the essential nutrients required by the infant within a complex matrix that enhances the bioavailability of many of those nutrients. In addition, human milk is a source of bioactive components, living cells and microbes that facilitate the transition to life outside the womb. Our ability to fully appreciate the importance of this matrix relies on the recognition of short- and long-term health benefits and, as highlighted in previous sections of this supplement, its ecology (i.e., interactions among the lactating parent and breastfed infant as well as within the context of the human milk matrix itself). Designing and interpreting studies to address this complexity depends on the availability of new tools and technologies that account for such complexity. Past efforts have often compared human milk to infant formula, which has provided some insight into the bioactivity of human milk, as a whole, or of individual milk components supplemented with formula. However, this experimental approach cannot capture the contributions of the individual components to the human milk ecology, the interaction between these components within the human milk matrix, or the significance of the matrix itself to enhance human milk bioactivity on outcomes of interest. This paper presents approaches to explore human milk as a biological system and the functional implications of that system and its components. Specifically, we discuss study design and data collection considerations and how emerging analytical technologies, bioinformatics, and systems biology approaches could be applied to advance our understanding of this critical aspect of human biology.
Affiliation(s)
- Sharon M Donovan
- Department of Food Science and Human Nutrition, University of Illinois, Urbana-Champaign, IL, USA.
| | - Nima Aghaeepour
- Department of Anesthesiology, Pain, and Perioperative Medicine, Department of Pediatrics, and Department of Biomedical Data Sciences, School of Medicine, Stanford University, Stanford, CA, USA
| | - Aline Andres
- Arkansas Children's Nutrition Center and Department of Pediatrics, University of Arkansas for Medical Sciences, Little Rock, AR, USA
| | - Meghan B Azad
- Manitoba Interdisciplinary Lactation Centre (MILC), Children's Hospital Research Institute of Manitoba, Department of Pediatrics and Child Health and Department of Immunology, University of Manitoba, Winnipeg, Manitoba, Canada
| | - Martin Becker
- Department of Anesthesiology, Pain, and Perioperative Medicine, Department of Pediatrics, and Department of Biomedical Data Sciences, School of Medicine, Stanford University, Stanford, CA, USA
| | - Susan E Carlson
- Department of Dietetics and Nutrition, University of Kansas Medical Center, Kansas City, KS, USA
| | - Kirsi M Järvinen
- Department of Pediatrics, Division of Allergy and Immunology and Center for Food Allergy, University of Rochester Medical Center, New York, NY, USA
| | - Weili Lin
- Biomedical Research Imaging Center and Department of Radiology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Bo Lönnerdal
- Department of Nutrition, University of California, Davis, CA, USA
| | - Carolyn M Slupsky
- Department of Nutrition, University of California, Davis, CA, USA; Department of Food Science and Technology, University of California, Davis, CA, USA
| | | | - Daniel J Raiten
- Pediatric Growth and Nutrition Branch, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, MD, USA
| |
|
17
|
Fu J, Zhu F, Xu CJ, Li Y. Metabolomics meets systems immunology. EMBO Rep 2023; 24:e55747. [PMID: 36916532 PMCID: PMC10074123 DOI: 10.15252/embr.202255747] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 07/08/2022] [Revised: 12/24/2022] [Accepted: 02/24/2023] [Indexed: 03/16/2023] Open
Abstract
Metabolic processes play a critical role in immune regulation. Metabolomics is the systematic analysis of small molecules (metabolites) in organisms or biological samples, providing an opportunity to comprehensively study interactions between metabolism and immunity in physiology and disease. Integrating metabolomics into systems immunology allows the exploration of the interactions of multilayered features in the biological system and the molecular regulatory mechanism of these features. Here, we provide an overview on recent technological developments of metabolomic applications in immunological research. To begin, two widely used metabolomics approaches are compared: targeted and untargeted metabolomics. Then, we provide a comprehensive overview of the analysis workflow and the computational tools available, including sample preparation, raw spectra data preprocessing, data processing, statistical analysis, and interpretation. Third, we describe how to integrate metabolomics with other omics approaches in immunological studies using available tools. Finally, we discuss new developments in metabolomics and its prospects for immunology research. This review provides guidance to researchers using metabolomics and multiomics in immunity research, thus facilitating the application of systems immunology to disease research.
Affiliation(s)
- Jianbo Fu
- Centre for Individualised Infection Medicine (CiiM), a joint venture between the Helmholtz Centre for Infection Research (HZI) and Hannover Medical School (MHH), Hannover, Germany; TWINCORE Centre for Experimental and Clinical Infection Research, a joint venture between the Helmholtz Centre for Infection Research (HZI) and the Hannover Medical School (MHH), Hannover, Germany; College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
| | - Feng Zhu
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
| | - Cheng-Jian Xu
- Centre for Individualised Infection Medicine (CiiM), a joint venture between the Helmholtz Centre for Infection Research (HZI) and Hannover Medical School (MHH), Hannover, Germany; TWINCORE Centre for Experimental and Clinical Infection Research, a joint venture between the Helmholtz Centre for Infection Research (HZI) and the Hannover Medical School (MHH), Hannover, Germany; Department of Internal Medicine and Radboud Center for Infectious Diseases, Radboud University Medical Center, Nijmegen, The Netherlands
| | - Yang Li
- Centre for Individualised Infection Medicine (CiiM), a joint venture between the Helmholtz Centre for Infection Research (HZI) and Hannover Medical School (MHH), Hannover, Germany; TWINCORE Centre for Experimental and Clinical Infection Research, a joint venture between the Helmholtz Centre for Infection Research (HZI) and the Hannover Medical School (MHH), Hannover, Germany; Department of Internal Medicine and Radboud Center for Infectious Diseases, Radboud University Medical Center, Nijmegen, The Netherlands
| |
|
18
|
Flores JE, Claborne DM, Weller ZD, Webb-Robertson BJM, Waters KM, Bramer LM. Missing data in multi-omics integration: Recent advances through artificial intelligence. Front Artif Intell 2023; 6:1098308. [PMID: 36844425 PMCID: PMC9949722 DOI: 10.3389/frai.2023.1098308] [Citation(s) in RCA: 36] [Impact Index Per Article: 18.0] [Received: 11/14/2022] [Accepted: 01/23/2023] [Indexed: 02/11/2023] Open
Abstract
Biological systems function through complex interactions between various 'omics (biomolecules), and a more complete understanding of these systems is only possible through an integrated, multi-omic perspective. This has presented the need for the development of integration approaches that are able to capture the complex, often non-linear, interactions that define these biological systems and are adapted to the challenges of combining the heterogeneous data across 'omic views. A principal challenge to multi-omic integration is missing data because all biomolecules are not measured in all samples. Due to either cost, instrument sensitivity, or other experimental factors, data for a biological sample may be missing for one or more 'omic technologies. Recent methodological developments in artificial intelligence and statistical learning have greatly facilitated the analyses of multi-omics data; however, many of these techniques assume access to completely observed data. A subset of these methods incorporates mechanisms for handling partially observed samples, and these methods are the focus of this review. We describe recently developed approaches, noting their primary use cases and highlighting each method's approach to handling missing data. We additionally provide an overview of the more traditional missing data workflows and their limitations; and we discuss potential avenues for further developments as well as how the missing data issue and its current solutions may generalize beyond the multi-omics context.
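To make the missing-data setting above concrete, the following sketch shows view-level missingness, where a sample lacks an entire 'omic measurement. It assumes each 'omic view is a mapping from sample ID to a feature vector; the helper names `complete_cases` and `missing_views` and the toy data are hypothetical and are not taken from the review.

```python
def complete_cases(views):
    """Sample IDs measured in every 'omic view (complete-case subset)."""
    sample_sets = [set(samples) for samples in views.values()]
    return sorted(set.intersection(*sample_sets))


def missing_views(views):
    """For each sample, list the 'omic views in which it was not measured."""
    all_samples = set().union(*(set(samples) for samples in views.values()))
    return {
        sample: sorted(name for name, samples in views.items() if sample not in samples)
        for sample in sorted(all_samples)
    }


# Toy cohort: three samples, three 'omic views, two samples partially observed.
views = {
    "rna": {"s1": [0.2, 1.1], "s2": [0.4, 0.9], "s3": [1.3, 0.1]},
    "protein": {"s1": [5.0], "s3": [4.2]},
    "methylation": {"s1": [0.8, 0.7], "s2": [0.1, 0.6]},
}

usable = complete_cases(views)   # samples a complete-data method could use
gaps = missing_views(views)      # which views each sample is lacking
```

Here a method that assumes completely observed data could use only one of the three samples, which illustrates why approaches that handle partially observed samples are attractive.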
Affiliation(s)
- Javier E. Flores
- Pacific Northwest National Laboratory, Biological Sciences Division, Earth and Biological Sciences Directorate, Richland, WA, United States
| | - Daniel M. Claborne
- Pacific Northwest National Laboratory, Artificial Intelligence and Data Analytics Division, National Security Directorate, Richland, WA, United States
| | - Zachary D. Weller
- Pacific Northwest National Laboratory, Artificial Intelligence and Data Analytics Division, National Security Directorate, Richland, WA, United States
| | - Bobbie-Jo M. Webb-Robertson
- Pacific Northwest National Laboratory, Biological Sciences Division, Earth and Biological Sciences Directorate, Richland, WA, United States
| | - Katrina M. Waters
- Pacific Northwest National Laboratory, Biological Sciences Division, Earth and Biological Sciences Directorate, Richland, WA, United States
| | - Lisa M. Bramer
- Pacific Northwest National Laboratory, Biological Sciences Division, Earth and Biological Sciences Directorate, Richland, WA, United States
| |
|
19
|
Schumacher D, Kramann R. Multiomic Spatial Mapping of Myocardial Infarction and Implications for Personalized Therapy. Arterioscler Thromb Vasc Biol 2023; 43:192-202. [PMID: 36579644 DOI: 10.1161/atvbaha.122.318333] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 12/30/2022]
Abstract
Ischemic heart disease including myocardial infarction is still the leading cause of death worldwide. Although early survival after myocardial infarction has been significantly improved by the introduction of percutaneous coronary intervention, long-term morbidity and mortality remain high. The elevated long-term mortality is mainly driven by cardiac remodeling processes triggering ischemic heart failure and electric instability. Despite the new developments in pharmacotherapy of heart failure, we still lack targeted therapies for cardiac remodeling and fibrosis. Single-cell and genomic technologies allow us to map the human heart at unprecedented resolution and to gain insights into cellular and molecular heterogeneity. However, these technologies rely on digested tissue and isolated cells or nuclei and thus lack spatial information. Spatial information is critical to understand tissue homeostasis and disease and can be utilized to identify disease-driving cell populations and mechanisms including cellular cross-talk. Here, we discuss recent advances in single-cell and spatial genomic technologies that give insights into cellular and molecular mechanisms of cardiac remodeling after injury and can be utilized to identify novel therapeutic targets and pave the way toward new therapies in heart failure.
Affiliation(s)
- David Schumacher
- Institute of Experimental Medicine and Systems Biology (D.S., R.K.), RWTH Aachen University, Germany; Department of Anesthesiology, University Hospital (D.S.), RWTH Aachen University, Germany
| | - Rafael Kramann
- Institute of Experimental Medicine and Systems Biology (D.S., R.K.), RWTH Aachen University, Germany; Department of Nephrology and Clinical Immunology (R.K.), RWTH Aachen University, Germany; Department of Internal Medicine, Nephrology and Transplantation, Erasmus Medical Center, Rotterdam, the Netherlands (R.K.)
| |
|
20
|
Koh HW, Pilbrow AP, Tan SH, Zhao Q, Benke PI, Burla B, Torta F, Pickering JW, Troughton R, Pemberton C, Soo WM, Ling LH, Doughty RN, Choi H, Wenk MR, Richards AM, Chan MY. An integrated signature of extracellular matrix proteins and a diastolic function imaging parameter predicts post-MI long-term outcomes. Front Cardiovasc Med 2023; 10:1123682. [PMID: 37123479 PMCID: PMC10132266 DOI: 10.3389/fcvm.2023.1123682] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2022] [Accepted: 03/20/2023] [Indexed: 05/02/2023] Open
Abstract
BACKGROUND Patients suffering from acute myocardial infarction (AMI) are at risk of secondary outcomes including major adverse cardiovascular events (MACE) and heart failure (HF). Comprehensive molecular phenotyping and cardiac imaging during the post-discharge time window may provide cues for risk stratification for the outcomes. MATERIALS AND METHODS In a prospective AMI cohort in New Zealand (N = 464), we measured plasma proteins and lipids 30 days after hospital discharge and inferred a unified partial correlation network with echocardiographic variables and established clinical biomarkers (creatinine, C-reactive protein, cardiac troponin I and natriuretic peptides). Using a network-based data integration approach (iOmicsPASS+), we identified predictive signatures of long-term secondary outcomes based on plasma protein, lipid, imaging markers and clinical biomarkers and assessed the prognostic potential in an independent cohort from Singapore (N = 190). RESULTS The post-discharge levels of plasma proteins and lipids showed strong correlations within each molecular type, reflecting concerted homeostatic regulation after primary MI events. However, the two molecular types were largely independent with distinct correlation structures with established prognostic imaging parameters and clinical biomarkers. To deal with massively correlated predictive features, we used iOmicsPASS+ to identify subnetwork signatures of 211 and 189 data features (nodes) predictive of MACE and HF events, respectively (160 overlapping). The predictive features were primarily imaging parameters, including left ventricular and atrial parameters, tissue Doppler parameters, and proteins involved in extracellular matrix (ECM) organization, cell differentiation, chemotaxis, and inflammation.
The network signatures contained plasma protein pairs with area-under-the-curve (AUC) values up to 0.74 for HF prediction in the validation cohort, but the pair of NT-proBNP and fibulin-3 (EFEMP1) was the best predictor (AUC = 0.80). This suggests that there were a handful of plasma proteins with mechanistic and functional roles in predisposing patients to the secondary outcomes, although they may be weaker prognostic markers than natriuretic peptides individually. Among those, the diastolic function parameter (E/e' - an indicator of left ventricular filling pressure) and two ECM proteins, EFEMP1 and follistatin-like 3 (FSTL3) showed comparable performance to NT-proBNP and outperformed left ventricular measures as benchmark prognostic factors for post-MI HF. CONCLUSION Post-discharge levels of E/e', EFEMP1 and FSTL3 are promising complementary markers of secondary adverse outcomes in AMI patients.
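The AUC values quoted above summarise how well a marker score ranks patients with an event above those without. As a minimal sketch of that quantity (not the study's analysis code; `roc_auc` and the toy scores are illustrative assumptions), the AUC equals the fraction of event/non-event pairs in which the event case receives the higher score, with ties counted as one half:

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    labels: 1 for patients with the outcome, 0 for those without.
    scores: predicted risk, higher meaning more likely to have the outcome.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # event case correctly ranked higher
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Toy validation set: two patients without and two with the outcome.
outcome = [0, 0, 1, 1]
risk = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(outcome, risk)
```

On this toy set three of the four event/non-event pairs are ranked correctly. An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives a scale for reading the 0.74 and 0.80 reported in the abstract.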
Affiliation(s)
- Hiromi W.L. Koh
- Department of Medicine, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Institute of Molecular and Cell Biology (IMCB), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
- Anna P. Pilbrow
- Department of Medicine, Christchurch Heart Institute, University of Otago, Christchurch, New Zealand
- Sock Hwee Tan
- Department of Medicine, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- National University Heart Centre, National University Health System, Singapore, Singapore
- Qing Zhao
- Department of Medicine, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Peter I. Benke
- Singapore Lipidomics Incubator (SLING), Life Sciences Institute, National University of Singapore, Singapore, Singapore
- Bo Burla
- Singapore Lipidomics Incubator (SLING), Life Sciences Institute, National University of Singapore, Singapore, Singapore
- Federico Torta
- Singapore Lipidomics Incubator (SLING), Life Sciences Institute, National University of Singapore, Singapore, Singapore
- Precision Medicine Translational Research Programme and Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- John W. Pickering
- Department of Medicine, Christchurch Heart Institute, University of Otago, Christchurch, New Zealand
- Richard Troughton
- Department of Medicine, Christchurch Heart Institute, University of Otago, Christchurch, New Zealand
- Christopher Pemberton
- Department of Medicine, Christchurch Heart Institute, University of Otago, Christchurch, New Zealand
- Wern-Miin Soo
- Department of Medicine, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- National University Heart Centre, National University Health System, Singapore, Singapore
- Lieng Hsi Ling
- Department of Medicine, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- National University Heart Centre, National University Health System, Singapore, Singapore
- Robert N. Doughty
- Heart Health Research Group, University of Auckland, Auckland, New Zealand
- Hyungwon Choi
- Department of Medicine, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Markus R. Wenk
- Singapore Lipidomics Incubator (SLING), Life Sciences Institute, National University of Singapore, Singapore, Singapore
- Precision Medicine Translational Research Programme and Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- A. Mark Richards
- Department of Medicine, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Department of Medicine, Christchurch Heart Institute, University of Otago, Christchurch, New Zealand
- National University Heart Centre, National University Health System, Singapore, Singapore
- Correspondence: Mark Richards Mark Chan
- Mark Y. Chan
- Department of Medicine, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- National University Heart Centre, National University Health System, Singapore, Singapore
- Correspondence: Mark Richards Mark Chan
|
21
|
Johnson AC, Silva JAF, Kim SC, Larsen CP. Progress in kidney transplantation: The role for systems immunology. Front Med (Lausanne) 2022; 9:1070385. [PMID: 36590970 PMCID: PMC9800623 DOI: 10.3389/fmed.2022.1070385] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/14/2022] [Accepted: 11/16/2022] [Indexed: 12/23/2022] Open
Abstract
The development of systems biology represents an immense breakthrough in our ability to perform translational research and deliver personalized and precision medicine. A multidisciplinary approach in combination with use of novel techniques allows for the extraction and analysis of vast quantities of data even from the volume- and source-limited samples that can be obtained from human subjects. Continued advances in microfluidics, scalability and affordability of sequencing technologies, and development of data analysis tools have made the application of a multi-omics, or systems, approach more accessible for use outside of specialized centers. The study of alloimmune and protective immune responses after solid organ transplant offers innumerable opportunities for a multi-omics approach; however, transplant immunology labs are only just beginning to adopt the systems methodology. In this review, we focus on advances in biological techniques and how they are improving our understanding of the immune system and its interactions, highlighting potential applications in transplant immunology. First, we describe the techniques that are available, with emphasis on major advances that allow for increased scalability. Then, we review initial applications in the field of transplantation with a focus on topics that are nearing clinical integration. Finally, we examine major barriers to adapting these methods and discuss potential future developments.
|
22
|
Agamah FE, Bayjanov JR, Niehues A, Njoku KF, Skelton M, Mazandu GK, Ederveen THA, Mulder N, Chimusa ER, 't Hoen PAC. Computational approaches for network-based integrative multi-omics analysis. Front Mol Biosci 2022; 9:967205. [PMID: 36452456 PMCID: PMC9703081 DOI: 10.3389/fmolb.2022.967205] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Received: 06/12/2022] [Accepted: 10/20/2022] [Indexed: 08/27/2023] Open
Abstract
Advances in omics technologies allow for holistic studies into biological systems. These studies rely on integrative data analysis techniques to obtain a comprehensive view of the dynamics of cellular processes, and molecular mechanisms. Network-based integrative approaches have revolutionized multi-omics analysis by providing the framework to represent interactions between multiple different omics-layers in a graph, which may faithfully reflect the molecular wiring in a cell. Here we review network-based multi-omics/multi-modal integrative analytical approaches. We classify these approaches according to the type of omics data supported, the methods and/or algorithms implemented, their node and/or edge weighting components, and their ability to identify key nodes and subnetworks. We show how these approaches can be used to identify biomarkers, disease subtypes, crosstalk, causality, and molecular drivers of physiological and pathological mechanisms. We provide insight into the most appropriate methods and tools for research questions as showcased around the aetiology and treatment of COVID-19 that can be informed by multi-omics data integration. We conclude with an overview of challenges associated with multi-omics network-based analysis, such as reproducibility, heterogeneity, (biological) interpretability of the results, and we highlight some future directions for network-based integration.
Affiliation(s)
- Francis E. Agamah
- Division of Human Genetics, Department of Pathology, Institute of Infectious Disease and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa
- Computational Biology Division, Department of Integrative Biomedical Sciences, Institute of Infectious Disease and Molecular Medicine, CIDRI-Africa Wellcome Trust Centre, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa
- Jumamurat R. Bayjanov
- Center for Molecular and Biomolecular Informatics (CMBI), Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, Netherlands
- Anna Niehues
- Center for Molecular and Biomolecular Informatics (CMBI), Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, Netherlands
- Kelechi F. Njoku
- Division of Human Genetics, Department of Pathology, Institute of Infectious Disease and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa
- Michelle Skelton
- Computational Biology Division, Department of Integrative Biomedical Sciences, Institute of Infectious Disease and Molecular Medicine, CIDRI-Africa Wellcome Trust Centre, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa
- Gaston K. Mazandu
- Division of Human Genetics, Department of Pathology, Institute of Infectious Disease and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa
- Computational Biology Division, Department of Integrative Biomedical Sciences, Institute of Infectious Disease and Molecular Medicine, CIDRI-Africa Wellcome Trust Centre, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa
- African Institute for Mathematical Sciences, Cape Town, South Africa
- Thomas H. A. Ederveen
- Center for Molecular and Biomolecular Informatics (CMBI), Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, Netherlands
- Nicola Mulder
- Computational Biology Division, Department of Integrative Biomedical Sciences, Institute of Infectious Disease and Molecular Medicine, CIDRI-Africa Wellcome Trust Centre, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa
- Emile R. Chimusa
- Department of Applied Sciences, Faculty of Health and Life Sciences, Northumbria University, Newcastle, United Kingdom
- Peter A. C. 't Hoen
- Center for Molecular and Biomolecular Informatics (CMBI), Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, Netherlands
|
23
|
Sidak D, Schwarzerová J, Weckwerth W, Waldherr S. Interpretable machine learning methods for predictions in systems biology from omics data. Front Mol Biosci 2022; 9:926623. [PMID: 36387282 PMCID: PMC9650551 DOI: 10.3389/fmolb.2022.926623] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 04/22/2022] [Accepted: 08/15/2022] [Indexed: 12/02/2022] Open
Abstract
Machine learning has become a powerful tool for systems biologists, from diagnosing cancer to optimizing kinetic models and predicting the state, growth dynamics, or type of a cell. Potential predictions from complex biological data sets obtained by “omics” experiments seem endless, but are often not the main objective of biological research. Often we want to understand the molecular mechanisms of a disease to develop new therapies, or we need to justify a crucial decision that is derived from a prediction. In order to gain such knowledge from data, machine learning models need to be extended. A recent trend to achieve this is to design “interpretable” models. However, the notions around interpretability are sometimes ambiguous, and a universal recipe for building well-interpretable models is missing. With this work, we want to familiarize systems biologists with the concept of model interpretability in machine learning. We consider data sets, data preparation, machine learning methods, and software tools relevant to omics research in systems biology. Finally, we try to answer the question: “What is interpretability?” We introduce views from the interpretable machine learning community and propose a scheme for categorizing studies on omics data. We then apply these tools to review and categorize recent studies where predictive machine learning models have been constructed from non-sequential omics data.
Affiliation(s)
- David Sidak
- Department of Functional and Evolutionary Ecology, Faculty of Life Sciences, Molecular Systems Biology (MOSYS), University of Vienna, Vienna, Austria
- Jana Schwarzerová
- Department of Functional and Evolutionary Ecology, Faculty of Life Sciences, Molecular Systems Biology (MOSYS), University of Vienna, Vienna, Austria
- Department of Biomedical Engineering, Faculty of Electrical Engineering and Communication, Brno University of Technology, Brno, Czech Republic
- Wolfram Weckwerth
- Department of Functional and Evolutionary Ecology, Faculty of Life Sciences, Molecular Systems Biology (MOSYS), University of Vienna, Vienna, Austria
- Vienna Metabolomics Center (VIME), Faculty of Life Sciences, University of Vienna, Vienna, Austria
- Steffen Waldherr
- Department of Functional and Evolutionary Ecology, Faculty of Life Sciences, Molecular Systems Biology (MOSYS), University of Vienna, Vienna, Austria
- Correspondence: Steffen Waldherr
|
24
|
Beura S, Kundu P, Das AK, Ghosh A. Metagenome-scale community metabolic modelling for understanding the role of gut microbiota in human health. Comput Biol Med 2022; 149:105997. [DOI: 10.1016/j.compbiomed.2022.105997] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2022] [Revised: 07/03/2022] [Accepted: 08/14/2022] [Indexed: 11/03/2022]
|
25
|
Hiort P, Hugo J, Zeinert J, Müller N, Kashyap S, Rajapakse JC, Azuaje F, Renard BY, Baum K. DrDimont: explainable drug response prediction from differential analysis of multi-omics networks. Bioinformatics 2022; 38:ii113-ii119. [PMID: 36124784 PMCID: PMC9486584 DOI: 10.1093/bioinformatics/btac477] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/25/2022] Open
Abstract
MOTIVATION While it has been well established that drugs affect and help patients differently, personalized drug response predictions remain challenging. Solutions based on single omics measurements have been proposed, and networks provide means to incorporate molecular interactions into reasoning. However, how to integrate the wealth of information contained in multiple omics layers still poses a complex problem. RESULTS We present DrDimont, Drug response prediction from Differential analysis of multi-omics networks. It allows for comparative conclusions between two conditions and translates them into differential drug response predictions. DrDimont focuses on molecular interactions. It establishes condition-specific networks from correlation within an omics layer that are then reduced and combined into heterogeneous, multi-omics molecular networks. A novel semi-local, path-based integration step ensures integrative conclusions. Differential predictions are derived from comparing the condition-specific integrated networks. DrDimont's predictions are explainable, i.e. molecular differences that are the source of high differential drug scores can be retrieved. We predict differential drug response in breast cancer using transcriptomics, proteomics, phosphosite and metabolomics measurements and contrast estrogen receptor positive and receptor negative patients. DrDimont performs better than drug prediction based on differential protein expression or PageRank when evaluating it on ground truth data from cancer cell lines. We find proteomic and phosphosite layers to carry most information for distinguishing drug response. AVAILABILITY AND IMPLEMENTATION DrDimont is available on CRAN: https://cran.r-project.org/package=DrDimont. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Pauline Hiort
- Hasso Plattner Institute, Digital Engineering Faculty, University of Potsdam, Potsdam 14482, Germany
- Julian Hugo
- Hasso Plattner Institute, Digital Engineering Faculty, University of Potsdam, Potsdam 14482, Germany
- Justus Zeinert
- Hasso Plattner Institute, Digital Engineering Faculty, University of Potsdam, Potsdam 14482, Germany
- Nataniel Müller
- Hasso Plattner Institute, Digital Engineering Faculty, University of Potsdam, Potsdam 14482, Germany
- Spoorthi Kashyap
- Hasso Plattner Institute, Digital Engineering Faculty, University of Potsdam, Potsdam 14482, Germany
- Jagath C Rajapakse
- School of Computer Science and Engineering, Nanyang Technological University, Singapore 639798, Singapore
- Bernhard Y Renard
- Hasso Plattner Institute, Digital Engineering Faculty, University of Potsdam, Potsdam 14482, Germany
|
26
|
Zou Z, Sun W, Xu Y, Liu W, Zhong J, Lin X, Chen Y. Application of Multi-Omics Approach in Sarcomas: A Tool for Studying Mechanism, Biomarkers, and Therapeutic Targets. Front Oncol 2022; 12:946022. [PMID: 35875106 PMCID: PMC9304858 DOI: 10.3389/fonc.2022.946022] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/17/2022] [Accepted: 06/16/2022] [Indexed: 12/18/2022] Open
Abstract
Sarcomas are rare, heterogeneous mesenchymal neoplasms with various subtypes, each exhibiting unique genetic characteristics. Although studies have been conducted to improve the treatment of sarcomas, the specific development from normal somatic cells to sarcoma cells is still unclear and needs further research. The diagnosis of sarcomas depends heavily on pathological examination, which remains difficult and requires expert analysis. Advanced treatments such as precision medicine optimize the efficacy of treatment and the prognosis of sarcoma patients; in sarcomas, however, more studies are needed to bring such methods into clinical practice. Advances in technology have pushed the multi-omics approach to the forefront, and more could be learnt about sarcomas with such methods. Multi-omics combines the strengths of each omics technique and analyzes the mechanisms of tumor cells at different levels, which makes up for the shortcomings of single-omics and gives an integrated picture of the biological activity inside tumor cells. Multi-omics research on sarcomas has made appreciable progress in recent years, leading to a better understanding of the mutation, proliferation, and metastasis of sarcomas. With the help of the multi-omics approach, novel biomarkers have been found that show promise for improving diagnosis, prognosis prediction, and treatment decisions. By analyzing large numbers of biological features, subtype clustering can be done with better precision, which may be useful in clinical practice. In this review, we summarize recent discoveries using the multi-omics approach in sarcomas, discuss their merits and challenges, and conclude with future perspectives for sarcoma research.
Affiliation(s)
- Zijian Zou
- Department of Musculoskeletal Surgery, Fudan University Shanghai Cancer Center, Shanghai, China
- Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China
- Wei Sun
- Department of Musculoskeletal Surgery, Fudan University Shanghai Cancer Center, Shanghai, China
- Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China
- Yu Xu
- Department of Musculoskeletal Surgery, Fudan University Shanghai Cancer Center, Shanghai, China
- Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China
- Wanlin Liu
- Department of Musculoskeletal Surgery, Fudan University Shanghai Cancer Center, Shanghai, China
- Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China
- Jingqin Zhong
- Department of Musculoskeletal Surgery, Fudan University Shanghai Cancer Center, Shanghai, China
- Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China
- Xinyi Lin
- Department of Musculoskeletal Surgery, Fudan University Shanghai Cancer Center, Shanghai, China
- Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China
- Yong Chen
- Department of Musculoskeletal Surgery, Fudan University Shanghai Cancer Center, Shanghai, China
- Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China
|
27
|
Marazzi L, Shah M, Balakrishnan S, Patil A, Vera-Licona P. NETISCE: a network-based tool for cell fate reprogramming. NPJ Syst Biol Appl 2022; 8:21. [PMID: 35725577 PMCID: PMC9209484 DOI: 10.1038/s41540-022-00231-y] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 12/30/2021] [Accepted: 05/31/2022] [Indexed: 11/17/2022] Open
Abstract
The search for effective therapeutic targets in fields like regenerative medicine and cancer research has generated interest in cell fate reprogramming. This cellular reprogramming paradigm can drive cells to a desired target state from any initial state. However, methods for identifying reprogramming targets remain limited for biological systems that lack large sets of experimental data or a dynamical characterization. We present NETISCE, a novel computational tool for identifying cell fate reprogramming targets in static networks. In combination with machine learning algorithms, NETISCE estimates the attractor landscape and predicts reprogramming targets using signal flow analysis and feedback vertex set control, respectively. Through validations in studies of cell fate reprogramming from developmental, stem cell, and cancer biology, we show that NETISCE can predict previously identified cell fate reprogramming targets and identify potentially novel combinations of targets. NETISCE extends cell fate reprogramming studies to larger-scale biological networks without the need for full model parameterization and can be implemented by experimental and computational biologists to identify parts of a biological system relevant to the desired reprogramming task.
Affiliation(s)
- Lauren Marazzi
- Center for Quantitative Medicine, University of Connecticut School of Medicine, Farmington, CT, 06030, USA
- Milan Shah
- Center for Quantitative Medicine, University of Connecticut School of Medicine, Farmington, CT, 06030, USA
- Shreedula Balakrishnan
- Center for Quantitative Medicine, University of Connecticut School of Medicine, Farmington, CT, 06030, USA
- Ananya Patil
- Center for Quantitative Medicine, University of Connecticut School of Medicine, Farmington, CT, 06030, USA
- Paola Vera-Licona
- Center for Quantitative Medicine, University of Connecticut School of Medicine, Farmington, CT, 06030, USA
- Department of Cell Biology, University of Connecticut School of Medicine, Farmington, CT, 06030, USA
- Center for Cell Analysis and Modeling, University of Connecticut School of Medicine, Farmington, CT, 06030, USA
- Institute for Systems Genomics, University of Connecticut School of Medicine, Farmington, CT, 06030, USA
|
28
|
Rintala TJ, Ghosh A, Fortino V. Network approaches for modeling the effect of drugs and diseases. Brief Bioinform 2022; 23:6608969. [PMID: 35704883 PMCID: PMC9294412 DOI: 10.1093/bib/bbac229] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 02/22/2022] [Revised: 04/29/2022] [Accepted: 05/17/2021] [Indexed: 12/12/2022] Open
Abstract
The network approach is quickly becoming a fundamental building block of computational methods aiming at elucidating the mechanism of action (MoA) and therapeutic effect of drugs. By modeling the effect of drugs and diseases on different biological networks, it is possible to better explain the interplay between disease perturbations and drug targets as well as how drug compounds induce favorable biological responses and/or adverse effects. Omics technologies have been extensively used to generate the data needed to study the mechanisms of action of drugs and diseases. These data are often exploited to define condition-specific networks and to study whether drugs can reverse disease perturbations. In this review, we describe network data mining algorithms that are commonly used to study drug’s MoA and to improve our understanding of the basis of chronic diseases. These methods can support fundamental stages of the drug development process, including the identification of putative drug targets, the in silico screening of drug compounds and drug combinations for the treatment of diseases. We also discuss recent studies using biological and omics-driven networks to search for possible repurposed FDA-approved drug treatments for SARS-CoV-2 infections (COVID-19).
Affiliation(s)
- T J Rintala
- Institute of Biomedicine, University of Eastern Finland, 70210 Kuopio, Finland
- Arindam Ghosh
- Institute of Biomedicine, University of Eastern Finland, 70210 Kuopio, Finland
- V Fortino
- Institute of Biomedicine, University of Eastern Finland, 70210 Kuopio, Finland
|
29
|
Mokhtari A, Porte B, Belzeaux R, Etain B, Ibrahim EC, Marie-Claire C, Lutz PE, Delahaye-Duriez A. The molecular pathophysiology of mood disorders: From the analysis of single molecular layers to multi-omic integration. Prog Neuropsychopharmacol Biol Psychiatry 2022; 116:110520. [PMID: 35104608 DOI: 10.1016/j.pnpbp.2022.110520] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/07/2021] [Revised: 01/22/2022] [Accepted: 01/22/2022] [Indexed: 12/14/2022]
Abstract
Next-generation sequencing now enables the rapid and affordable production of reliable biological data at multiple molecular levels, collectively referred to as "omics". To maximize the potential for discovery, computational biologists have created and adapted integrative multi-omic analytical methods. When applied to diseases with traceable pathophysiology such as cancer, these new algorithms and statistical approaches have enabled the discovery of clinically relevant molecular mechanisms and biomarkers. In contrast, these methods have been much less applied to the field of molecular psychiatry, although diagnostic and prognostic biomarkers are similarly needed. In the present review, we first briefly summarize main findings from two decades of studies that investigated single molecular processes in relation to mood disorders. Then, we conduct a systematic review of multi-omic strategies that have been proposed and used more recently. We also list databases and types of data available to researchers for future work. Finally, we present the newest methodologies that have been employed for multi-omics integration in other medical fields, and discuss their potential for molecular psychiatry studies.
Affiliation(s)
- Amazigh Mokhtari
- NeuroDiderot, Inserm U1141, Université de Paris, F-75019 Paris, France
- Baptiste Porte
- NeuroDiderot, Inserm U1141, Université de Paris, F-75019 Paris, France
- Raoul Belzeaux
- Aix Marseille Université CNRS, Institut de Neurosciences de la Timone, F-13005 Marseille, France; Fondation FondaMental, F-94000 Créteil, France; Assistance Publique Hôpitaux de Marseille, Pôle de psychiatrie, pédopsychiatrie et addictologie, F-13005 Marseille, France
- Bruno Etain
- Assistance Publique des Hôpitaux de Paris, GHU Lariboisière-Saint Louis-Fernand Widal, DMU Neurosciences, Département de psychiatrie et de Médecine Addictologique, F-75010 Paris, France; Université de Paris, INSERM UMR-S 1144, Optimisation thérapeutique en neuropsychopharmacologie, OTeN, F-75006 Paris, France
- El Cherif Ibrahim
- Aix Marseille Université CNRS, Institut de Neurosciences de la Timone, F-13005 Marseille, France
- Cynthia Marie-Claire
- Université de Paris, INSERM UMR-S 1144, Optimisation thérapeutique en neuropsychopharmacologie, OTeN, F-75006 Paris, France
- Pierre-Eric Lutz
- Centre National de la Recherche Scientifique, Université de Strasbourg, Fédération de Médecine Translationnelle de Strasbourg, Institut des Neurosciences Cellulaires et Intégratives UPR3212, F-67000 Strasbourg, France; Douglas Mental Health University Institute, McGill University, QC H4H 1R3 Montréal, Canada.
- Andrée Delahaye-Duriez
- NeuroDiderot, Inserm U1141, Université de Paris, F-75019 Paris, France; Assistance Publique des Hôpitaux de Paris, Unité de médecine génomique, Département BioPhaReS, Hôpital Jean Verdier, Hôpitaux Universitaires de Paris Seine Saint Denis, F-93140 Bondy, France; Université Sorbonne Paris Nord, F-93000 Bobigny, France.
|
30
|
Avery CL, Howard AG, Ballou AF, Buchanan VL, Collins JM, Downie CG, Engel SM, Graff M, Highland HM, Lee MP, Lilly AG, Lu K, Rager JE, Staley BS, North KE, Gordon-Larsen P. Strengthening Causal Inference in Exposomics Research: Application of Genetic Data and Methods. ENVIRONMENTAL HEALTH PERSPECTIVES 2022; 130:55001. [PMID: 35533073 PMCID: PMC9084332 DOI: 10.1289/ehp9098] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 02/04/2021] [Revised: 04/08/2022] [Accepted: 04/12/2022] [Indexed: 05/11/2023]
Abstract
Advances in technologies to measure a broad set of exposures have led to a range of exposome research efforts. Yet, these efforts have insufficiently integrated methods that incorporate genetic data to strengthen causal inference, despite evidence that many exposome-associated phenotypes are heritable. Objective: We demonstrate how integration of methods and study designs that incorporate genetic data can strengthen causal inference in exposomics research by helping address six challenges: reverse causation and unmeasured confounding, comprehensive examination of phenotypic effects, low efficiency, replication, multilevel data integration, and characterization of tissue-specific effects. Examples are drawn from studies of biomarkers and health behaviors, exposure domains where the causal inference methods we describe are most often applied. Discussion: Technological, computational, and statistical advances in genotyping, imputation, and analysis, combined with broad data sharing and cross-study collaborations, offer multiple opportunities to strengthen causal inference in exposomics research. Full application of these opportunities will require an expanded understanding of genetic variants that predict exposome phenotypes as well as an appreciation that the utility of genetic variants for causal inference will vary by exposure and may depend on large sample sizes. However, several of these challenges can be addressed through international scientific collaborations that prioritize data sharing. Ultimately, we anticipate that efforts to better integrate methods that incorporate genetic data will extend the reach of exposomics research by helping address the challenges of comprehensively measuring the exposome and its health effects across studies, the life course, and in varied contexts and diverse populations. https://doi.org/10.1289/EHP9098.
Affiliation(s)
- Christy L Avery
- Department of Epidemiology, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Carolina Population Center, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Annie Green Howard
- Department of Biostatistics, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Carolina Population Center, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Anna F Ballou
- Department of Epidemiology, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Victoria L Buchanan
- Department of Epidemiology, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Jason M Collins
- Department of Epidemiology, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Carolina G Downie
- Department of Epidemiology, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Stephanie M Engel
- Department of Epidemiology, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Mariaelisa Graff
- Department of Epidemiology, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Heather M Highland
- Department of Epidemiology, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Moa P Lee
- Department of Epidemiology, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Adam G Lilly
- Carolina Population Center, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Department of Sociology, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Kun Lu
- Department of Environmental Sciences and Engineering, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Julia E Rager
- Department of Environmental Sciences and Engineering, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Brooke S Staley
- Department of Epidemiology, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Kari E North
- Department of Epidemiology, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Penny Gordon-Larsen
- Department of Nutrition, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Carolina Population Center, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
31
Panditrao G, Bhowmick R, Meena C, Sarkar RR. Emerging landscape of molecular interaction networks: Opportunities, challenges and prospects. J Biosci 2022. [PMID: 36210749 PMCID: PMC9018971 DOI: 10.1007/s12038-022-00253-y] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Indexed: 11/24/2022]
Abstract
Network biology finds application in interpreting molecular interaction networks and providing insightful inferences using graph theoretical analysis of biological systems. The integration of computational bio-modelling approaches with different hybrid network-based techniques provides additional information about the behaviour of complex systems. With increasing advances in high-throughput technologies in biological research, attempts have been made to incorporate this information into network structures, which has led to a continuous update of network biology approaches over time. The newly minted centrality measures accommodate the details of omics data and regulatory network structure information. The unification of graph network properties with classical mathematical and computational modelling approaches and technologically advanced approaches like machine-learning- and artificial intelligence-based algorithms leverages the potential application of these techniques. These computational advances prove beneficial and serve various applications such as essential gene prediction, identification of drug–disease interaction and gene prioritization. Hence, in this review, we have provided a comprehensive overview of the emerging landscape of molecular interaction networks using graph theoretical approaches. With the aim to provide information on the wide range of applications of network biology approaches in understanding the interaction and regulation of genes, proteins, enzymes and metabolites at different molecular levels, we have reviewed the methods that utilize network topological properties, emerging hybrid network-based approaches and applications that integrate machine learning techniques to analyse molecular interaction networks. Further, we have discussed the applications of these approaches in biomedical research with a note on future prospects.
Affiliation(s)
- Gauri Panditrao
- Chemical Engineering and Process Development Division, CSIR-National Chemical Laboratory, Pune, 411008 India
- Rupa Bhowmick
- Chemical Engineering and Process Development Division, CSIR-National Chemical Laboratory, Pune, 411008 India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, 201002 India
- Chandrakala Meena
- Chemical Engineering and Process Development Division, CSIR-National Chemical Laboratory, Pune, 411008 India
- Ram Rup Sarkar
- Chemical Engineering and Process Development Division, CSIR-National Chemical Laboratory, Pune, 411008 India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, 201002 India
32
Moon S, Lee H. MOMA: a multi-task attention learning algorithm for multi-omics data interpretation and classification. Bioinformatics 2022; 38:2287-2296. [PMID: 35157023 PMCID: PMC10060719 DOI: 10.1093/bioinformatics/btac080] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 06/29/2021] [Revised: 01/01/2022] [Accepted: 02/08/2022] [Indexed: 02/03/2023]
Abstract
MOTIVATION Accurate diagnostic classification and biological interpretation are important in biology and medicine, which are data-rich sciences. Thus, integration of different data types is necessary for the high predictive accuracy of clinical phenotypes, and more comprehensive analyses for predicting the prognosis of complex diseases are required. RESULTS Here, we propose a novel multi-task attention learning algorithm for multi-omics data, termed MOMA, which captures important biological processes for high diagnostic performance and interpretability. MOMA vectorizes features and modules using a geometric approach and focuses on important modules in multi-omics data via an attention mechanism. Experiments using public data on Alzheimer's disease and cancer with various classification tasks demonstrated the superior performance of this approach. The utility of MOMA was also verified using a comparison experiment with an attention mechanism that was turned on or off and biological analysis. AVAILABILITY AND IMPLEMENTATION The source codes are available at https://github.com/dmcb-gist/MOMA. SUPPLEMENTARY INFORMATION Supplementary materials are available at Bioinformatics online.
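As a rough illustration of what "focusing on important modules via an attention mechanism" can look like, the sketch below scores a few module vectors against a query vector, turns the scores into softmax weights, and forms a weighted summary. It is not MOMA's implementation (the repository linked in the abstract is the authoritative source); the function names `softmax` and `attend`, the dot-product scoring, and the toy vectors are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Convert raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(module_vectors, query):
    """Weight each (hypothetical) module vector by its similarity to the query."""
    # Dot-product similarity between the query and every module vector.
    scores = [sum(q * v for q, v in zip(query, vec)) for vec in module_vectors]
    weights = softmax(scores)
    # Attention-weighted average of the module vectors.
    dim = len(query)
    summary = [
        sum(w * vec[i] for w, vec in zip(weights, module_vectors))
        for i in range(dim)
    ]
    return weights, summary


# Toy data: three 2-dimensional module vectors and one query vector.
modules = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [2.0, 0.0]
weights, summary = attend(modules, query)
```

With these toy inputs the first and third modules score equally and receive the largest weights, while the second module, which is orthogonal to the query, is down-weighted. Inspecting such weights is one common way attention models are read for interpretability.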
Affiliation(s)
- Sehwan Moon
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology, Gwangju 61005, South Korea
- Hyunju Lee
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology, Gwangju 61005, South Korea
33
Non-Coding RNAs Are Brokers in Breast Cancer Interactome Networks and Add Discrimination Power between Subtypes. J Clin Med 2022; 11:jcm11082103. [PMID: 35456196 PMCID: PMC9029160 DOI: 10.3390/jcm11082103] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2022] [Revised: 04/01/2022] [Accepted: 04/06/2022] [Indexed: 02/04/2023]
Abstract
Despite the power of high-throughput genomics, most non-coding RNA (ncRNA) biotypes remain hard to identify, characterize, and validate. This is a clear indication that intensive next-generation sequencing research has led to great efficiency and accuracy in detecting ncRNAs, but not in their functionalization. Computational scientists continue to support the discovery process by spotting significant data features (expression or mutational profiles), elucidating phenotype uncertainty, and delineating complex regulation landscapes for biological pathways and pathophysiological processes. With reference to transcriptome regulation dynamics in cancer, this work introduces a novel network-driven inference approach designed to reveal the potential role of computationally identified ncRNAs in discriminating between breast cancer (BC) subtypes beyond the traditional gene expression signatures. As heterogeneity cast in the subtypes is a characteristic of most cancers, the proposed approach is generalizable beyond BC. Expression profiles of a wide transcriptome spectrum were obtained for a number of BC patients (and controls) listed in TCGA and processed with RNA-Seq. The well-known PAM50 subtype signature was available for the samples and used to move from differentially expressed transcript profiles to subtype-specific biclusters associating gene patterns with patients. Co-expressed gene networks were then generated and annotations were provided, focusing on the biclusters with basal and luminal signatures. These were used to build template maps, i.e., networks in which to embed the ncRNAs and contextually functionalize them based on their interactors. This inference approach is able to assess the influence of ncRNAs at the level of BC subtype. Network topology was considered through the brokerage measure to account for disruptiveness effects induced by the removal of nodes corresponding to ncRNAs. 
Equivalently, it is shown that ncRNAs can act as brokers of network interactome dynamics, and removing them allows the refinement of subtype-related characteristics previously obtained by gene signatures only. The results of the study elucidate the role of pseudogenes in two major BC subtypes, considering the contextual annotations. Put into a wider perspective, ncRNA brokers may help predictive functionalization studies targeted to new disease phenotypes, for instance those linked to the tumor microenvironment or metabolism, or those specifically involving metastasis. Overall, the approach may represent an in silico prioritization strategy toward the systems identification of new diagnostic and prognostic biomarkers.
34
Mallick P, Maity S, Chakrabarti O, Chakrabarti S. Role of systems biology and multi-omics analyses in delineating spatial interconnectivity and temporal dynamicity of ER stress mediated cellular responses. Biochim Biophys Acta Mol Cell Res 2022; 1869:119210. [PMID: 35032474 DOI: 10.1016/j.bbamcr.2022.119210] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2021] [Revised: 12/01/2021] [Accepted: 12/30/2021] [Indexed: 06/14/2023]
Abstract
The endoplasmic reticulum (ER) is a membranous organelle involved in calcium storage, lipid biosynthesis, protein folding and processing. Many patho-physiological conditions and pharmacological agents are known to perturb normal ER function and can lead to ER stress, which severely compromises the protein folding mechanism and hence poses a high risk of proteotoxicity. Upon sensing ER stress, the different stress signaling pathways interconnect with each other and work together to preserve cellular homeostasis. The ER stress response is part of the integrated stress response (ISR) and might play an important role in the pathogenesis of chronic neurodegenerative diseases, where misfolded protein accumulation and cell death are common. The initiation, manifestation and progression of the ER stress mediated unfolded protein response (UPR) is a complex process involving multiple proteins, pathways and cellular organelles. To understand the causes and consequences of such complex processes, an integrative, holistic approach is required to identify novel players and regulators of ER stress. As multi-omics data-based systems analyses have shown potential to unravel the underlying molecular mechanisms of complex biological systems, it is important to emphasize the utility of this approach in understanding ER stress biology. In this review we first discuss the ER stress signaling pathways and regulatory players, along with their inter-connectivity. We next highlight the importance of systems and network biology approaches using multi-omics data in understanding ER stress mediated cellular responses. This report should help advance our current understanding of the multivariate spatial interconnectivity and temporal dynamicity of ER stress.
Affiliation(s)
- Priyanka Mallick
- Structural Biology and Bioinformatics Division, CSIR-Indian Institute of Chemical Biology, IICB TRUE Campus, CN-6, Sector 5, Salt Lake, Kolkata Pin 700091, WB, India; Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh, India
- Sebabrata Maity
- Biophysics & Structural Genomics Division, Saha Institute of Nuclear Physics, 1/AF Bidhannagar, Kolkata 700064, India; Homi Bhabha National Institute, India
- Oishee Chakrabarti
- Biophysics & Structural Genomics Division, Saha Institute of Nuclear Physics, 1/AF Bidhannagar, Kolkata 700064, India; Homi Bhabha National Institute, India.
- Saikat Chakrabarti
- Structural Biology and Bioinformatics Division, CSIR-Indian Institute of Chemical Biology, IICB TRUE Campus, CN-6, Sector 5, Salt Lake, Kolkata Pin 700091, WB, India; Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh, India.
35
Rajczewski AT, Jagtap PD, Griffin TJ. An overview of technologies for MS-based proteomics-centric multi-omics. Expert Rev Proteomics 2022; 19:165-181. [PMID: 35466851 PMCID: PMC9613604 DOI: 10.1080/14789450.2022.2070476] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Indexed: 01/07/2023]
Abstract
INTRODUCTION Mass spectrometry-based proteomics reveals dynamic molecular signatures underlying phenotypes reflecting normal and perturbed conditions in living systems. Although valuable on its own, the proteome represents only one level of molecular information, with the genome, epigenome, transcriptome, and metabolome all providing complementary information. Multi-omic analysis integrating information from one or more of these other domains with proteomic information provides a more complete picture of molecular contributors to dynamic biological systems. AREAS COVERED Here, we discuss the improvements to mass spectrometry-based technologies, focused on peptide-based, bottom-up approaches that have enabled deep, quantitative characterization of complex proteomes. These advances are facilitating the integration of proteomics data with other 'omic information, providing a more complete picture of living systems. We also describe the current state of bioinformatics software and approaches for integrating proteomics and other 'omics data, critical for enabling new discoveries driven by multi-omics. EXPERT COMMENTARY Multi-omics, centered on the integration of proteomics information with other 'omic information, has tremendous promise for biological and biomedical studies. Continued advances in approaches for generating deep, reliable proteomic data and bioinformatics tools aimed at integrating data across 'omic domains will ensure the discoveries offered by these multi-omic studies continue to increase.
Affiliation(s)
- Andrew T. Rajczewski
- Department of Biochemistry, Molecular and Cell Biology Building, University of Minnesota, 420 Washington Ave SE 7-129, Minneapolis, MN, 55455, USA
- Pratik D. Jagtap
- Department of Biochemistry, Molecular and Cell Biology Building, University of Minnesota, 420 Washington Ave SE 7-129, Minneapolis, MN, 55455, USA
- Timothy J. Griffin
- Department of Biochemistry, Molecular and Cell Biology Building, University of Minnesota, 420 Washington Ave SE 7-129, Minneapolis, MN, 55455, USA
36
Network Biology and Artificial Intelligence Drive the Understanding of the Multidrug Resistance Phenotype in Cancer. Drug Resist Updat 2022; 60:100811. [DOI: 10.1016/j.drup.2022.100811] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/29/2021] [Revised: 01/22/2022] [Accepted: 01/24/2022] [Indexed: 02/07/2023]
37
Reiss JD, Peterson LS, Nesamoney SN, Chang AL, Pasca AM, Marić I, Shaw GM, Gaudilliere B, Wong RJ, Sylvester KG, Bonifacio SL, Aghaeepour N, Gibbs RS, Stevenson DK. Perinatal infection, inflammation, preterm birth, and brain injury: A review with proposals for future investigations. Exp Neurol 2022; 351:113988. [DOI: 10.1016/j.expneurol.2022.113988] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/30/2021] [Revised: 01/06/2022] [Accepted: 01/13/2022] [Indexed: 11/26/2022]
38
Ahmed KT, Sun J, Cheng S, Yong J, Zhang W. Multi-omics data integration by generative adversarial network. Bioinformatics 2021; 38:179-186. [PMID: 34415323 PMCID: PMC10060730 DOI: 10.1093/bioinformatics/btab608] [Citation(s) in RCA: 36] [Impact Index Per Article: 9.0] [Received: 06/08/2021] [Revised: 07/27/2021] [Accepted: 08/18/2021] [Indexed: 02/03/2023]
Abstract
MOTIVATION Accurate disease phenotype prediction plays an important role in the treatment of heterogeneous diseases like cancer in the era of precision medicine. With the advent of high-throughput technologies, more comprehensive multi-omics data is now available that can effectively link genotype to phenotype. However, the interactive relation of multi-omics datasets makes it particularly challenging to incorporate different biological layers to discover coherent biological signatures and predict phenotypic outcomes. In this study, we introduce omicsGAN, a generative adversarial network model that integrates two omics datasets and their interaction network. The model captures information from the interaction network as well as the two omics datasets and fuses them to generate synthetic data with better predictive signals. RESULTS Large-scale experiments on The Cancer Genome Atlas breast cancer, lung cancer and ovarian cancer datasets validate that (i) the model can effectively integrate two omics datasets (e.g. mRNA and microRNA expression data) and their interaction network (e.g. a microRNA-mRNA interaction network), and the synthetic omics data generated by the proposed model performs better on cancer outcome classification and patient survival prediction than the original omics datasets; and (ii) the integrity of the interaction network plays a vital role in the generation of synthetic data with higher predictive quality. Using a random interaction network does not allow the framework to learn meaningful information from the omics datasets and therefore results in synthetic data with weaker predictive signals. AVAILABILITY AND IMPLEMENTATION Source code is available at: https://github.com/CompbioLabUCF/omicsGAN. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Khandakar Tanvir Ahmed
- Department of Computer Science, University of Central Florida, Orlando, FL 32816, USA
- Genomics and Bioinformatics Cluster, University of Central Florida, Orlando, FL 32816, USA
- Jiao Sun
- Department of Computer Science, University of Central Florida, Orlando, FL 32816, USA
- Genomics and Bioinformatics Cluster, University of Central Florida, Orlando, FL 32816, USA
- Sze Cheng
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota Twin Cities, Minneapolis, MN 55455, USA
- Jeongsik Yong
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota Twin Cities, Minneapolis, MN 55455, USA
- Wei Zhang
- Department of Computer Science, University of Central Florida, Orlando, FL 32816, USA
- Genomics and Bioinformatics Cluster, University of Central Florida, Orlando, FL 32816, USA
39
Ju J, Wismans LV, Mustafa DAM, Reinders MJT, van Eijck CHJ, Stubbs AP, Li Y. Robust deep learning model for prognostic stratification of pancreatic ductal adenocarcinoma patients. iScience 2021; 24:103415. [PMID: 34901786 PMCID: PMC8637475 DOI: 10.1016/j.isci.2021.103415] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 06/07/2021] [Revised: 09/27/2021] [Accepted: 11/05/2021] [Indexed: 02/07/2023]
Abstract
A major challenge for treating patients with pancreatic ductal adenocarcinoma (PDAC) is the unpredictability of their prognoses due to high heterogeneity. We present Multi-Omics DEep Learning for Prognosis-correlated subtyping (MODEL-P) to identify PDAC subtypes and to predict prognoses of new patients. MODEL-P was trained on autoencoder-integrated multi-omics data from 146 patients with PDAC together with their survival outcomes. Using MODEL-P, we identified two PDAC subtypes with distinct survival outcomes (median survival 10.1 and 22.7 months, respectively, log-rank p = 1 × 10⁻⁶), which correspond to DNA damage repair and immune response. We rigorously validated MODEL-P by stratifying patients in five independent datasets into these two survival groups and achieved a significant survival difference, which is superior to current practice and other subtyping schemas. We believe the subtype-specific signatures would facilitate PDAC pathogenesis discovery, and MODEL-P can provide clinicians with prognostic information for treatment decision-making to better gauge the benefits versus the risks. We developed DL-based MODEL-P to identify prognosis-correlated PDAC subtypes. The identified subtypes are related to DNA damage repair and immune response processes. MODEL-P stratified patients from independent datasets into distinct survival groups. MODEL-P could be used in clinics to aid treatment decision-making.
Affiliation(s)
- Jie Ju
- Department of Pathology & Clinical Bioinformatics, Erasmus MC Cancer Institute, University Medical Center Rotterdam, Rotterdam, the Netherlands
- Leonoor V Wismans
- Department of Surgery, Erasmus MC Cancer Institute, University Medical Center Rotterdam, Rotterdam, the Netherlands
- Dana A M Mustafa
- Department of Pathology & Clinical Bioinformatics, Erasmus MC Cancer Institute, University Medical Center Rotterdam, Rotterdam, the Netherlands
- Marcel J T Reinders
- The Delft Bioinformatics Lab, Delft University of Technology, Rotterdam, the Netherlands
- Casper H J van Eijck
- Department of Surgery, Erasmus MC Cancer Institute, University Medical Center Rotterdam, Rotterdam, the Netherlands
- Andrew P Stubbs
- Department of Pathology & Clinical Bioinformatics, Erasmus MC Cancer Institute, University Medical Center Rotterdam, Rotterdam, the Netherlands
- Yunlei Li
- Department of Pathology & Clinical Bioinformatics, Erasmus MC Cancer Institute, University Medical Center Rotterdam, Rotterdam, the Netherlands
40
Demirel HC, Arici MK, Tuncbag N. Computational approaches leveraging integrated connections of multi-omic data toward clinical applications. Mol Omics 2021; 18:7-18. [PMID: 34734935 DOI: 10.1039/d1mo00158b] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 01/18/2023]
Abstract
In line with the advances in high-throughput technologies, multiple omic datasets have accumulated to study biological systems and diseases coherently. No single omics data type is capable of fully representing cellular activity. The complexity of the biological processes arises from the interactions between omic entities such as genes, proteins, and metabolites. Therefore, multi-omic data integration is crucial but challenging. The impact of the molecular alterations in multi-omic data is not local in the neighborhood of the altered gene or protein; rather, the impact diffuses in the network and changes the functionality of multiple signaling pathways and regulation of the gene expression. Additionally, multi-omic data is high-dimensional and has background noise. Several integrative approaches have been developed to accurately interpret the multi-omic datasets, including machine learning, network-based methods, and their combination. In this review, we overview the most recent integrative approaches and tools with a focus on network-based methods. We then discuss these approaches according to their specific applications, from disease-network and biomarker identification to patient stratification, drug discovery, and repurposing.
Affiliation(s)
- Habibe Cansu Demirel
- Graduate School of Informatics, Middle East Technical University, Ankara, 06800, Turkey
- Muslum Kaan Arici
- Graduate School of Informatics, Middle East Technical University, Ankara, 06800, Turkey; Foot and Mouth Diseases Institute, Ministry of Agriculture and Forestry, Ankara, 06044, Turkey
- Nurcan Tuncbag
- Chemical and Biological Engineering, College of Engineering, Koc University, Istanbul, 34450, Turkey; School of Medicine, Koc University, Istanbul, 34450, Turkey; Koc University Research Center for Translational Medicine (KUTTAM), Istanbul, Turkey
41
Arici MK, Tuncbag N. Performance Assessment of the Network Reconstruction Approaches on Various Interactomes. Front Mol Biosci 2021; 8:666705. [PMID: 34676243 PMCID: PMC8523993 DOI: 10.3389/fmolb.2021.666705] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 02/10/2021] [Accepted: 07/14/2021] [Indexed: 01/04/2023]
Abstract
Beyond the list of molecules, there is a necessity to collectively consider multiple sets of omic data and to reconstruct the connections between the molecules. Especially, pathway reconstruction is crucial to understanding disease biology because abnormal cellular signaling may be pathological. The main challenge is how to integrate the data together in an accurate way. In this study, we aim to comparatively analyze the performance of a set of network reconstruction algorithms on multiple reference interactomes. We first explored several human protein interactomes, including PathwayCommons, OmniPath, HIPPIE, iRefWeb, STRING, and ConsensusPathDB. The comparison is based on the coverage of each interactome in terms of cancer driver proteins, structural information of protein interactions, and the bias toward well-studied proteins. We next used these interactomes to evaluate the performance of network reconstruction algorithms including all-pair shortest path, heat diffusion with flux, personalized PageRank with flux, and prize-collecting Steiner forest (PCSF) approaches. Each approach has its own merits and weaknesses. Among them, PCSF had the most balanced performance in terms of precision and recall scores when 28 pathways from NetPath were reconstructed using the listed algorithms. Additionally, the reference interactome affects the performance of the network reconstruction approaches. The coverage and disease- or tissue-specificity of each interactome may vary, which may result in differences in the reconstructed networks.
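The precision and recall comparison mentioned in this abstract can be made concrete with a small sketch. The abstract does not say whether scoring was done on nodes or edges, so the example below assumes edge-level scoring on undirected interactions; the helper names `edge_set` and `precision_recall` and the toy gene pairs are hypothetical and are not taken from the study or from NetPath.

```python
def edge_set(edges):
    """Represent undirected interactions so that (a, b) equals (b, a)."""
    return {frozenset(edge) for edge in edges}


def precision_recall(reconstructed, reference):
    """Score a reconstructed network against a reference pathway.

    precision = shared edges / reconstructed edges
    recall    = shared edges / reference edges
    """
    rec = edge_set(reconstructed)
    ref = edge_set(reference)
    true_positives = len(rec & ref)
    precision = true_positives / len(rec) if rec else 0.0
    recall = true_positives / len(ref) if ref else 0.0
    return precision, recall


# Toy reference pathway and a toy reconstruction (illustrative only).
reference = [("EGFR", "GRB2"), ("GRB2", "SOS1"), ("SOS1", "KRAS"), ("KRAS", "RAF1")]
reconstructed = [("GRB2", "EGFR"), ("SOS1", "GRB2"), ("EGFR", "KRAS")]

precision, recall = precision_recall(reconstructed, reference)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here two of the three reconstructed edges appear in the reference, and two of the four reference edges are recovered. A method that returns many edges tends to raise recall at the cost of precision, which is why a balanced score across both is a sensible basis for comparing reconstruction algorithms.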
Affiliation(s)
- M Kaan Arici
- Graduate School of Informatics, Middle East Technical University, Ankara, Turkey; Foot and Mouth Diseases Institute, Ministry of Agriculture and Forestry, Ankara, Turkey
- Nurcan Tuncbag
- Chemical and Biological Engineering, College of Engineering, Koc University, Istanbul, Turkey; School of Medicine, Koc University, Istanbul, Turkey
42
Karimi MR, Karimi AH, Abolmaali S, Sadeghi M, Schmitz U. Prospects and challenges of cancer systems medicine: from genes to disease networks. Brief Bioinform 2021; 23:6361045. [PMID: 34471925 PMCID: PMC8769701 DOI: 10.1093/bib/bbab343] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/04/2021] [Revised: 08/02/2021] [Accepted: 08/03/2021] [Indexed: 12/20/2022]
Abstract
It is becoming evident that holistic perspectives toward cancer are crucial in deciphering the overwhelming complexity of tumors. Single-layer analysis of genome-wide data has greatly contributed to our understanding of cellular systems and their perturbations. However, fundamental gaps in our knowledge persist and hamper the design of effective interventions. It is becoming more apparent than ever that cancer should be viewed not only as a disease of the genome but as a disease of the cellular system. Integrative multilayer approaches are emerging as vigorous assets in our endeavors to achieve systemic views on cancer biology. Herein, we provide a comprehensive review of the approaches, methods and technologies that can serve to achieve systemic perspectives of cancer. We start with genome-wide single-layer approaches of omics analyses of cellular systems and move on to multilayer integrative approaches, for which in-depth descriptions of proteogenomics and network-based data analysis are provided. Proteogenomics is a remarkable example of how the integration of multiple levels of information can reduce our blind spots and increase the accuracy and reliability of our interpretations, while network-based data analysis is a major approach for data interpretation and a robust scaffold for data integration and modeling. Overall, this review aims to increase cross-field awareness of the approaches and challenges regarding the omics-based study of cancer and to facilitate the necessary shift toward holistic approaches.
Affiliation(s)
- Mehdi Sadeghi
- Department of Cell & Molecular Biology, Semnan University, Semnan, Iran
- Ulf Schmitz
- Department of Molecular & Cell Biology, James Cook University, Townsville, QLD 4811, Australia
43
Pan Y, Lei X, Zhang Y. Association predictions of genomics, proteinomics, transcriptomics, microbiome, metabolomics, pathomics, radiomics, drug, symptoms, environment factor, and disease networks: A comprehensive approach. Med Res Rev 2021; 42:441-461. [PMID: 34346083 DOI: 10.1002/med.21847] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Received: 12/26/2020] [Revised: 05/22/2021] [Accepted: 07/07/2021] [Indexed: 12/12/2022]
Abstract
Currently, multi-omics research, spanning genomics, proteinomics, transcriptomics, microbiome, metabolomics, pathomics, and radiomics, is a hot spot. The relationship between multi-omics data, drugs, and diseases has received extensive attention from researchers. At the same time, multi-omics can effectively predict the diagnosis, prognosis, and treatment of diseases. In essence, these research entities, such as genes, RNAs, proteins, microbes, metabolites, pathways as well as pathological and medical imaging data, can all be represented by networks at different levels, and some computer and biology scholars have tried to use computational methods to explore the potential relationships between biological entities. We summarize a comprehensive research strategy: build a multi-omics heterogeneous network covering multimodal data, and use currently popular computational methods to make predictions. In this study, we first introduce methods for calculating the similarity of biological entities at the data level, and second discuss multimodal data fusion and methods of feature extraction. Finally, the challenges and opportunities at this stage are summarized. Some scholars have used such a framework to calculate and predict; we also summarize their work and discuss the challenges. We hope that our review could help scholars who are interested in the fields of bioinformatics, biomedical imaging, and computer research.
Affiliation(s)
- Yi Pan
- Faculty of Computer Science and Control Engineering, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Xiujuan Lei
- School of Computer Science, Shaanxi Normal University, Xi'an, China
- Yuchen Zhang
- School of Computer Science, Shaanxi Normal University, Xi'an, China
|
44
|
Towle-Miller LM, Miecznikowski JC, Zhang F, Tritchler DL. SuMO-Fil: Supervised multi-omic filtering prior to performing network analysis. PLoS One 2021; 16:e0255579. [PMID: 34343218 PMCID: PMC8330944 DOI: 10.1371/journal.pone.0255579] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 12/08/2020] [Accepted: 07/20/2021] [Indexed: 11/18/2022] Open
Abstract
Multi-omic analyses that integrate many high-dimensional datasets often suffer from significant deficiencies in statistical power and require time-consuming computations. To remedy these issues, we present SuMO-Fil, a pre-processing method for Supervised Multi-Omic Filtering that removes variables or features considered to be irrelevant noise. SuMO-Fil is intended to be performed prior to downstream analyses that detect supervised gene networks in sparse settings. We accomplish this by implementing variable filters based on low similarity across the datasets in conjunction with low similarity with the outcome. This approach can improve accuracy, as well as reduce run times, for a variety of computationally expensive downstream analyses. The method applies in settings where the downstream analysis may include sparse canonical correlation analysis. Filtering methods specifically for cluster and network analysis are introduced and compared by simulating modular networks with known statistical properties. Compared with popular filtering techniques based on low means or low variances, the SuMO-Fil method performs favorably, eliminating non-network features while maintaining important biological signal under a variety of signal settings. We show that the speed and accuracy of methods such as supervised sparse canonical correlation are increased after using SuMO-Fil, thus greatly improving the scalability of these approaches.
Affiliation(s)
- Lorin M. Towle-Miller
- Department of Biostatistics, University at Buffalo, Buffalo, NY, United States of America
- Fan Zhang
- Department of Biostatistics, University at Buffalo, Buffalo, NY, United States of America
- David L. Tritchler
- Department of Biostatistics, University at Buffalo, Buffalo, NY, United States of America
- Biostatistics Division, University of Toronto, Toronto, Ontario, Canada
|
45
|
Mishra B, Kumar N, Mukhtar MS. Network biology to uncover functional and structural properties of the plant immune system. Curr Opin Plant Biol 2021; 62:102057. [PMID: 34102601 DOI: 10.1016/j.pbi.2021.102057] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 12/06/2020] [Revised: 04/15/2021] [Accepted: 04/18/2021] [Indexed: 06/12/2023]
Abstract
In the last two decades, advances in network science have facilitated the discovery of important systems' entities in diverse biological networks. This graph-based technique has revealed numerous emergent properties of a system that enable us to understand several complex biological processes including plant immune systems. With the accumulation of multiomics data sets, the comprehensive understanding of plant-pathogen interactions can be achieved through the analyses and efficacious integration of multidimensional qualitative and quantitative relationships among the components of hosts and their microbes. This review highlights comparative network topology analyses in plant-pathogen co-expression networks and interactomes, outlines dynamic network modeling for cell-specific immune regulatory networks, and discusses the new frontiers of single-cell sequencing as well as multiomics data integration that are necessary for unraveling the intricacies of plant immune systems.
Affiliation(s)
- Bharat Mishra
- Department of Biology, University of Alabama at Birmingham, 1300 University Blvd., Birmingham, AL, 35294, USA
- Nilesh Kumar
- Department of Biology, University of Alabama at Birmingham, 1300 University Blvd., Birmingham, AL, 35294, USA
- M Shahid Mukhtar
- Department of Biology, University of Alabama at Birmingham, 1300 University Blvd., Birmingham, AL, 35294, USA
|
46
|
Heo YJ, Hwa C, Lee GH, Park JM, An JY. Integrative Multi-Omics Approaches in Cancer Research: From Biological Networks to Clinical Subtypes. Mol Cells 2021; 44:433-443. [PMID: 34238766 PMCID: PMC8334347 DOI: 10.14348/molcells.2021.0042] [Citation(s) in RCA: 88] [Impact Index Per Article: 22.0] [Received: 02/20/2021] [Revised: 04/09/2021] [Accepted: 05/12/2021] [Indexed: 11/27/2022] Open
Abstract
Multi-omics approaches are novel frameworks that integrate multiple omics datasets generated from the same patients to better understand the molecular and clinical features of cancers. A wide range of emerging omics and multi-view clustering algorithms now provide unprecedented opportunities to further classify cancers into subtypes, improve the survival prediction and therapeutic outcome of these subtypes, and understand key pathophysiological processes through different molecular layers. In this review, we overview the concept and rationale of multi-omics approaches in cancer research. We also introduce recent advances in the development of multi-omics algorithms and integration methods for multiple-layered datasets from cancer patients. Finally, we summarize the latest findings from large-scale multi-omics studies of various cancers and their implications for patient subtyping and drug development.
Affiliation(s)
- Yong Jin Heo
- School of Biosystem and Biomedical Science, College of Health Science, Korea University, Seoul 02841, Korea
- Department of Integrated Biomedical and Life Science, Korea University, Seoul 02841, Korea
- Chanwoong Hwa
- School of Biosystem and Biomedical Science, College of Health Science, Korea University, Seoul 02841, Korea
- Gang-Hee Lee
- School of Biosystem and Biomedical Science, College of Health Science, Korea University, Seoul 02841, Korea
- Jae-Min Park
- School of Biosystem and Biomedical Science, College of Health Science, Korea University, Seoul 02841, Korea
- Joon-Yong An
- School of Biosystem and Biomedical Science, College of Health Science, Korea University, Seoul 02841, Korea
- Department of Integrated Biomedical and Life Science, Korea University, Seoul 02841, Korea
|
47
|
Stanton JE, Malijauskaite S, McGourty K, Grabrucker AM. The Metallome as a Link Between the "Omes" in Autism Spectrum Disorders. Front Mol Neurosci 2021; 14:695873. [PMID: 34290588 PMCID: PMC8289253 DOI: 10.3389/fnmol.2021.695873] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 04/22/2021] [Accepted: 06/14/2021] [Indexed: 12/26/2022] Open
Abstract
Metal dyshomeostasis plays a significant role in various neurological diseases such as Alzheimer's disease, Parkinson's disease, Autism Spectrum Disorders (ASD), and many more. Like studies investigating the proteome, transcriptome, epigenome, microbiome, etc., for years, metallomics studies have focused on data from their domain, i.e., trace metal composition, only. Still, few have considered the links between other "omes," which may together result in an individual's specific pathologies. In particular, ASD have been reported to have multitudes of possible causal effects. Metallomics data focusing on metal deficiencies and dyshomeostasis can be linked to functions of metalloenzymes, metal transporters, and transcription factors, thus affecting the proteome and transcriptome. Furthermore, recent studies in ASD have emphasized the gut-brain axis, with alterations in the microbiome being linked to changes in the metabolome and inflammatory processes. However, the microbiome and other "omes" are heavily influenced by the metallome. Thus, here, we will summarize the known implications of a changed metallome for other "omes" in the body in the context of "omics" studies in ASD. We will highlight possible connections and propose a model that may explain the so far independently reported pathologies in ASD.
Affiliation(s)
- Janelle E Stanton
- Department of Biological Sciences, University of Limerick, Limerick, Ireland
- Bernal Institute, University of Limerick, Limerick, Ireland
- Sigita Malijauskaite
- Bernal Institute, University of Limerick, Limerick, Ireland
- Department of Chemical Sciences, University of Limerick, Limerick, Ireland
- Kieran McGourty
- Bernal Institute, University of Limerick, Limerick, Ireland
- Department of Chemical Sciences, University of Limerick, Limerick, Ireland
- Health Research Institute, University of Limerick, Limerick, Ireland
- Andreas M Grabrucker
- Department of Biological Sciences, University of Limerick, Limerick, Ireland
- Bernal Institute, University of Limerick, Limerick, Ireland
- Health Research Institute, University of Limerick, Limerick, Ireland
|
48
|
Ding J, Blencowe M, Nghiem T, Ha SM, Chen YW, Li G, Yang X. Mergeomics 2.0: a web server for multi-omics data integration to elucidate disease networks and predict therapeutics. Nucleic Acids Res 2021; 49:W375-W387. [PMID: 34048577 PMCID: PMC8262738 DOI: 10.1093/nar/gkab405] [Citation(s) in RCA: 53] [Impact Index Per Article: 13.3] [Received: 02/28/2021] [Revised: 04/28/2021] [Accepted: 05/02/2021] [Indexed: 12/13/2022] Open
Abstract
The Mergeomics web server is a flexible online tool for multi-omics data integration to derive biological pathways, networks, and key drivers important to disease pathogenesis and is based on the open source Mergeomics R package. The web server takes summary statistics of multi-omics disease association studies (GWAS, EWAS, TWAS, PWAS, etc.) as input and features four functions: Marker Dependency Filtering (MDF) to correct for known dependency between omics markers, Marker Set Enrichment Analysis (MSEA) to detect disease relevant biological processes, Meta-MSEA to examine the consistency of biological processes informed by various omics datasets, and Key Driver Analysis (KDA) to identify essential regulators of disease-associated pathways and networks. The web server has been extensively updated and streamlined in version 2.0 including an overhauled user interface, improved tutorials and results interpretation for each analytical step, inclusion of numerous disease GWAS, functional genomics datasets, and molecular networks to allow for comprehensive omics integrations, increased functionality to decrease user workload, and increased flexibility to cater to user-specific needs. Finally, we have incorporated our newly developed drug repositioning pipeline PharmOmics for prediction of potential drugs targeting disease processes that were identified by Mergeomics. Mergeomics is freely accessible at http://mergeomics.research.idre.ucla.edu and does not require login.
Affiliation(s)
- Jessica Ding
- Department of Integrative Biology and Physiology, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, USA
- Interdepartmental Program of Molecular, Cellular and Integrative Physiology, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, USA
- Montgomery Blencowe
- Department of Integrative Biology and Physiology, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, USA
- Interdepartmental Program of Molecular, Cellular and Integrative Physiology, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, USA
- Thien Nghiem
- Department of Integrative Biology and Physiology, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, USA
- Sung-min Ha
- Department of Integrative Biology and Physiology, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, USA
- Yen-Wei Chen
- Department of Integrative Biology and Physiology, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, USA
- Interdepartmental Program of Molecular Toxicology, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, USA
- Gaoyan Li
- Department of Integrative Biology and Physiology, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, USA
- Interdepartmental Program of Molecular, Cellular and Integrative Physiology, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, USA
- Xia Yang
- Department of Integrative Biology and Physiology, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, USA
- Interdepartmental Program of Molecular, Cellular and Integrative Physiology, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, USA
- Interdepartmental Program of Molecular Toxicology, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, USA
- Interdepartmental Program of Bioinformatics, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, USA
- Institute for Quantitative and Computational Biosciences, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, USA
|
49
|
Odenkirk MT, Reif DM, Baker ES. Multiomic Big Data Analysis Challenges: Increasing Confidence in the Interpretation of Artificial Intelligence Assessments. Anal Chem 2021; 93:7763-7773. [PMID: 34029068 PMCID: PMC8465926 DOI: 10.1021/acs.analchem.0c04850] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Indexed: 02/08/2023]
Abstract
The need for holistic molecular measurements to better understand disease initiation, development, diagnosis, and therapy has led to an increasing number of multiomic analyses. The wealth of information available from multiomic assessments, however, requires both the evaluation and interpretation of extremely large data sets, limiting analysis throughput and ease of adoption. Computational methods utilizing artificial intelligence (AI) provide the most promising way to address these challenges, yet despite the conceptual benefits of AI and its successful application in singular omic studies, the widespread use of AI in multiomic studies remains limited. Here, we discuss present and future capabilities of AI techniques in multiomic studies while introducing analytical checks and balances to validate the computational conclusions.
Affiliation(s)
- Melanie T Odenkirk
- Department of Chemistry, North Carolina State University, Raleigh, North Carolina 27606, United States
- David M Reif
- Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27606, United States
- Bioinformatics Research Center, North Carolina State University, Raleigh, North Carolina 27606, United States
- Erin S Baker
- Department of Chemistry, North Carolina State University, Raleigh, North Carolina 27606, United States
|
50
|
multiSLIDE is a web server for exploring connected elements of biological pathways in multi-omics data. Nat Commun 2021; 12:2279. [PMID: 33863886 PMCID: PMC8052434 DOI: 10.1038/s41467-021-22650-x] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 10/06/2020] [Accepted: 03/24/2021] [Indexed: 12/12/2022] Open
Abstract
Quantitative multi-omics data are difficult to interpret and visualize due to large volume of data, complexity among data features, and heterogeneity of information represented by different omics platforms. Here, we present multiSLIDE, a web-based interactive tool for the simultaneous visualization of interconnected molecular features in heatmaps of multi-omics data sets. multiSLIDE visualizes biologically connected molecular features by keyword search of pathways or genes, offering convenient functionalities to query, rearrange, filter, and cluster data on a web browser in real time. Various querying mechanisms make it adaptable to diverse omics types, and visualizations are customizable. We demonstrate the versatility of multiSLIDE through three examples, showcasing its applicability to a wide range of multi-omics data sets, by allowing users to visualize established links between molecules from different omics data, as well as incorporate custom inter-molecular relationship information into the visualization. Online and stand-alone versions of multiSLIDE are available at https://github.com/soumitag/multiSLIDE. The integration and interpretation of different omics data types is an ongoing challenge for biologists. Here, the authors present a web-based, interactive tool called multiSLIDE for the visualization of protein, phosphoprotein, and RNA data presented as interlinked heatmaps.
|