51
Ouyang D, Liang Y, Li L, Ai N, Lu S, Yu M, Liu X, Xie S. Integration of multi-omics data using adaptive graph learning and attention mechanism for patient classification and biomarker identification. Comput Biol Med 2023; 164:107303. [PMID: 37586201 DOI: 10.1016/j.compbiomed.2023.107303] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/16/2023] [Revised: 07/08/2023] [Accepted: 07/28/2023] [Indexed: 08/18/2023]
Abstract
With the rapid development of high-throughput sequencing technology and the accumulation of omics data, many studies have sought a more comprehensive understanding of human diseases from a multi-omics perspective. Meanwhile, graph-based methods have been widely used to process multi-omics data due to their powerful expressive ability. However, most existing graph-based methods utilize fixed graphs to learn sample embedding representations, which often leads to sub-optimal results. Furthermore, treating the embedding representations of different omics equally usually does not yield reasonable integrated information. In addition, the complex correlation between omics is not fully taken into account. To this end, we propose an end-to-end interpretable multi-omics integration method, named MOGLAM, for disease classification prediction. A dynamic graph convolutional network with feature selection is first used to obtain higher-quality omic-specific embedding information by adaptively learning the graph structure and to discover important biomarkers. Then, a multi-omics attention mechanism is applied to adaptively weight the embedding representations of different omics, thereby obtaining more reasonable integrated information. Finally, we propose omic-integrated representation learning to capture complex common and complementary information between omics while performing multi-omics integration. Experimental results on three datasets show that MOGLAM outperforms other state-of-the-art multi-omics integration methods. Moreover, MOGLAM can identify important biomarkers from different omics data types in an end-to-end manner.
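The attention-based weighting of omics embeddings mentioned in this abstract can be illustrated with a minimal sketch. It is not MOGLAM's implementation: the function names, the three example embeddings, and the fixed attention scores are all hypothetical, and in a trained model the scores would be learned rather than supplied by hand.

```python
import math

def softmax(scores):
    """Turn raw attention scores into weights that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_integrate(embeddings, scores):
    """Fuse per-omics embeddings of one sample by an attention-weighted sum."""
    weights = softmax(scores)
    dim = len(embeddings[0])
    fused = [
        sum(w * emb[i] for w, emb in zip(weights, embeddings))
        for i in range(dim)
    ]
    return weights, fused

# Hypothetical embeddings of one patient from three omics types.
mrna = [0.2, 0.9, -0.1]
methylation = [0.5, 0.1, 0.3]
mirna = [-0.4, 0.6, 0.8]

# Hypothetical scores: the first omics type is judged most informative.
weights, fused = attention_integrate([mrna, methylation, mirna], [2.0, 0.5, 0.5])
print(weights, fused)
```

With equal scores the fused vector reduces to the plain average, which corresponds to the "treating all omics equally" baseline that the abstract argues against.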
Affiliation(s)
- Dong Ouyang
  - Peng Cheng Laboratory, Shenzhen, 518055, China; School of Computer Science and Engineering, Faculty of Innovation Engineering, Macau University of Science and Technology, 999078, Macao Special Administrative Region of China
- Yong Liang
  - Peng Cheng Laboratory, Shenzhen, 518055, China
- Le Li
  - School of Computer Science and Engineering, Faculty of Innovation Engineering, Macau University of Science and Technology, 999078, Macao Special Administrative Region of China
- Ning Ai
  - School of Computer Science and Engineering, Faculty of Innovation Engineering, Macau University of Science and Technology, 999078, Macao Special Administrative Region of China
- Shanghui Lu
  - School of Computer Science and Engineering, Faculty of Innovation Engineering, Macau University of Science and Technology, 999078, Macao Special Administrative Region of China
- Mingkun Yu
  - School of Computer Science and Engineering, Faculty of Innovation Engineering, Macau University of Science and Technology, 999078, Macao Special Administrative Region of China
- Xiaoying Liu
  - Computer Engineering Technical College, Guangdong Polytechnic of Science and Technology, Zhuhai, 519090, China
- Shengli Xie
  - Guangdong-HongKong-Macao Joint Laboratory for Smart Discrete Manufacturing, Guangzhou, 510000, China
52
Chen Y, Wen Y, Xie C, Chen X, He S, Bo X, Zhang Z. MOCSS: Multi-omics data clustering and cancer subtyping via shared and specific representation learning. iScience 2023; 26:107378. [PMID: 37559907 PMCID: PMC10407241 DOI: 10.1016/j.isci.2023.107378] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 02/17/2023] [Revised: 05/23/2023] [Accepted: 07/07/2023] [Indexed: 08/11/2023] Open
Abstract
Cancer is an extremely complex disease, and each type of cancer usually has several different subtypes. Multi-omics data can provide more comprehensive biological information for identifying and discovering cancer subtypes. However, existing unsupervised cancer subtyping methods cannot effectively learn comprehensive shared and specific information from multi-omics data. Therefore, a novel method is proposed based on shared and specific representation learning. For each omics data type, two autoencoders are applied to extract shared and specific information, respectively. To reduce redundancy and mutual interference, an orthogonality constraint is introduced to separate shared and specific information. In addition, contrastive learning is applied to align the shared information and strengthen its consistency. Finally, the obtained shared and specific information for all samples is used for clustering tasks to achieve cancer subtyping. Experimental results demonstrate that the proposed method can effectively capture shared and specific information of multi-omics data and outperform other state-of-the-art methods on cancer subtyping.
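One common way to express an orthogonality constraint between shared and specific representations is to penalize the squared Frobenius norm of the cross-product of the two embedding matrices. The sketch below shows that formulation only as an illustration; the abstract does not state which formulation MOCSS uses, and the function name and toy matrices are hypothetical.

```python
def orthogonality_penalty(shared, specific):
    """Squared Frobenius norm of shared^T @ specific.

    Both arguments are lists of rows (one row per sample). The penalty is
    zero exactly when every shared dimension is orthogonal to every
    specific dimension across the samples.
    """
    n_shared = len(shared[0])
    n_specific = len(specific[0])
    total = 0.0
    for i in range(n_shared):
        for j in range(n_specific):
            dot = sum(s[i] * p[j] for s, p in zip(shared, specific))
            total += dot * dot
    return total

# Hypothetical embeddings for three samples, two dimensions each.
shared = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
specific = [[0.0, 0.0], [0.0, 0.0], [1.0, 2.0]]

print(orthogonality_penalty(shared, specific))  # the two representations do not overlap
print(orthogonality_penalty(shared, shared))    # fully redundant representations
```

In a training loop such a term would be added to the reconstruction loss, so that minimizing it pushes the two autoencoders toward encoding different information.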
Affiliation(s)
- Yuxin Chen
  - School of Informatics, Xiamen University, Xiamen 361005, China
- Yuqi Wen
  - Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Chenyang Xie
  - School of Informatics, Xiamen University, Xiamen 361005, China
- Xinjian Chen
  - School of Informatics, Xiamen University, Xiamen 361005, China
- Song He
  - Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Xiaochen Bo
  - Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Zhongnan Zhang
  - School of Informatics, Xiamen University, Xiamen 361005, China
53
Gygi JP, Kleinstein SH, Guan L. Predictive overfitting in immunological applications: Pitfalls and solutions. Hum Vaccin Immunother 2023; 19:2251830. [PMID: 37697867 PMCID: PMC10498807 DOI: 10.1080/21645515.2023.2251830] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 05/01/2023] [Revised: 07/27/2023] [Accepted: 08/21/2023] [Indexed: 09/13/2023] Open
Abstract
Overfitting describes the phenomenon where a highly predictive model on the training data generalizes poorly to future observations. It is a common concern when applying machine learning techniques to contemporary medical applications, such as predicting vaccination response and disease status in infectious disease or cancer studies. This review examines the causes of overfitting and offers strategies to counteract it, focusing on model complexity reduction, reliable model evaluation, and harnessing data diversity. Through discussion of the underlying mathematical models and illustrative examples using both synthetic data and published real datasets, our objective is to equip analysts and bioinformaticians with the knowledge and tools necessary to detect and mitigate overfitting in their research.
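The gap between training and held-out performance that defines overfitting can be reproduced with a small synthetic example. The sketch below is not taken from the review: it uses a 1-nearest-neighbor classifier on hypothetical data whose labels are pure noise, so any apparent predictive power on the training set is memorization.

```python
import random

def nearest_label(x, points, labels):
    """Label of the training point closest to x (squared Euclidean distance)."""
    best = min(
        range(len(points)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(x, points[i])),
    )
    return labels[best]

def accuracy(xs, ys, train_x, train_y):
    """Fraction of samples whose 1-NN prediction matches the true label."""
    hits = sum(nearest_label(x, train_x, train_y) == y for x, y in zip(xs, ys))
    return hits / len(xs)

rng = random.Random(0)  # fixed seed so the run is repeatable

def make_data(n, n_features=5):
    """Random features with labels that are independent of the features."""
    xs = [[rng.random() for _ in range(n_features)] for _ in range(n)]
    ys = [rng.randint(0, 1) for _ in range(n)]
    return xs, ys

train_x, train_y = make_data(100)
test_x, test_y = make_data(200)

train_acc = accuracy(train_x, train_y, train_x, train_y)  # each point matches itself
test_acc = accuracy(test_x, test_y, train_x, train_y)     # expected to be near chance
print(f"training accuracy: {train_acc:.2f}, held-out accuracy: {test_acc:.2f}")
```

Evaluating only on the training samples would suggest a perfect model here, which is why the review's emphasis on reliable model evaluation with independent data matters.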
Affiliation(s)
- Jeremy P. Gygi
  - Program in Computational Biology & Bioinformatics, Yale University, New Haven, CT, USA
- Steven H. Kleinstein
  - Program in Computational Biology & Bioinformatics, Yale University, New Haven, CT, USA
  - Department of Pathology, Yale School of Medicine, New Haven, CT, USA
  - Department of Immunobiology, Yale School of Medicine, New Haven, CT, USA
- Leying Guan
  - Program in Computational Biology & Bioinformatics, Yale University, New Haven, CT, USA
  - Department of Biostatistics, Yale School of Public Health, New Haven, CT, USA
54
Erdem C, Gross SM, Heiser LM, Birtwistle MR. MOBILE pipeline enables identification of context-specific networks and regulatory mechanisms. Nat Commun 2023; 14:3991. [PMID: 37414767 PMCID: PMC10326020 DOI: 10.1038/s41467-023-39729-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2022] [Accepted: 06/27/2023] [Indexed: 07/08/2023] Open
Abstract
Robust identification of context-specific network features that control cellular phenotypes remains a challenge. We here introduce MOBILE (Multi-Omics Binary Integration via Lasso Ensembles) to nominate molecular features associated with cellular phenotypes and pathways. First, we use MOBILE to nominate mechanisms of interferon-γ (IFNγ) regulated PD-L1 expression. Our analyses suggest that IFNγ-controlled PD-L1 expression involves BST2, CLIC2, FAM83D, ACSL5, and HIST2H2AA3 genes, which were supported by prior literature. We also compare networks activated by related family members transforming growth factor-beta 1 (TGFβ1) and bone morphogenetic protein 2 (BMP2) and find that differences in ligand-induced changes in cell size and clustering properties are related to differences in laminin/collagen pathway activity. Finally, we demonstrate the broad applicability and adaptability of MOBILE by analyzing publicly available molecular datasets to investigate breast cancer subtype specific networks. Given the ever-growing availability of multi-omics datasets, we envision that MOBILE will be broadly useful for identification of context-specific molecular features and pathways.
Affiliation(s)
- Cemal Erdem
  - Department of Chemical and Biomolecular Engineering, Clemson University, Clemson, SC, USA
- Sean M Gross
  - Department of Biomedical Engineering, Oregon Health & Science University, Portland, OR, USA
- Laura M Heiser
  - Department of Biomedical Engineering, Oregon Health & Science University, Portland, OR, USA
- Marc R Birtwistle
  - Department of Chemical and Biomolecular Engineering, Clemson University, Clemson, SC, USA
  - Department of Bioengineering, Clemson University, Clemson, SC, USA
55
Mahdi-Esferizi R, Haji Molla Hoseyni B, Mehrpanah A, Golzade Y, Najafi A, Elahian F, Zadeh Shirazi A, Gomez GA, Tahmasebian S. DeeP4med: deep learning for P4 medicine to predict normal and cancer transcriptome in multiple human tissues. BMC Bioinformatics 2023; 24:275. [PMID: 37403016 DOI: 10.1186/s12859-023-05400-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2023] [Accepted: 06/25/2023] [Indexed: 07/06/2023] Open
Abstract
BACKGROUND: P4 medicine (predict, prevent, personalize, and participate) is a new approach to diagnosing and predicting diseases on a patient-by-patient basis. For the prevention and treatment of diseases, prediction plays a fundamental role. One of the intelligent strategies is the design of deep learning models that can predict the state of the disease using gene expression data. RESULTS: We create an autoencoder deep learning model called DeeP4med, including a Classifier and a Transferor, that predicts the cancer gene expression (mRNA) matrix from its matched normal sample and vice versa. Depending on tissue type, the F1 score of the model ranges from 0.935 to 0.999 in the Classifier and from 0.944 to 0.999 in the Transferor. The accuracy of DeeP4med for tissue and disease classification was 0.986 and 0.992, respectively, outperforming seven classic machine learning models (Support Vector Classifier, Logistic Regression, Linear Discriminant Analysis, Naive Bayes, Decision Tree, Random Forest, K Nearest Neighbors). CONCLUSIONS: Based on the idea of DeeP4med, by having the gene expression matrix of a normal tissue, we can predict its tumor gene expression matrix and, in this way, find effective genes in transforming a normal tissue into a tumor tissue. Results of Differentially Expressed Genes (DEGs) and enrichment analysis on the predicted matrices for 13 types of cancer showed a good correlation with the literature and biological databases. This suggests that, when the model is trained on gene expression matrices carrying the features of each person in the normal and cancer states, it could predict diagnosis based on gene expression data from healthy tissue and be used to identify possible therapeutic interventions for those patients.
Affiliation(s)
- Roohallah Mahdi-Esferizi
  - Department of Medical Biotechnology, School of Advanced Technologies, Shahrekord University of Medical Sciences, Shahrekord, Iran
- Amir Mehrpanah
  - Faculty of Mathematics, Shahid Beheshti University, Tehran, Iran
- Yazdan Golzade
  - Department of Mathematics, Faculty of Basic Sciences, Iran University of Science and Technology (IUST), Tehran, Iran
- Ali Najafi
  - Molecular Biology Research Center, Systems Biology and Poisonings Institute, Baqiyatallah University of Medical Sciences, Tehran, Iran
- Fatemeh Elahian
  - Department of Medical Biotechnology, School of Advanced Technologies, Shahrekord University of Medical Sciences, Shahrekord, Iran
- Amin Zadeh Shirazi
  - Centre for Cancer Biology, SA Pathology and University of South Australia, Adelaide, SA, 5000, Australia
- Guillermo A Gomez
  - Centre for Cancer Biology, SA Pathology and University of South Australia, Adelaide, SA, 5000, Australia
- Shahram Tahmasebian
  - Cellular and Molecular Research Center, Basic Health Sciences Institute, Shahrekord University of Medical Sciences, Shahrekord, Iran
56
Chicco D, Cumbo F, Angione C. Ten quick tips for avoiding pitfalls in multi-omics data integration analyses. PLoS Comput Biol 2023; 19:e1011224. [PMID: 37410704 DOI: 10.1371/journal.pcbi.1011224] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Indexed: 07/08/2023] Open
Abstract
Data are the most important elements of bioinformatics: Computational analysis of bioinformatics data, in fact, can help researchers infer new knowledge about biology, chemistry, biophysics, and sometimes even medicine, influencing treatments and therapies for patients. Bioinformatics and high-throughput biological data coming from different sources can be even more helpful, because each of these different data chunks can provide alternative, complementary information about a specific biological phenomenon, similar to multiple photos of the same subject taken from different angles. In this context, the integration of bioinformatics and high-throughput biological data plays a pivotal role in running a successful bioinformatics study. In the last decades, data originating from proteomics, metabolomics, metagenomics, phenomics, transcriptomics, and epigenomics have been labelled -omics data, as a unique name to refer to them, and the integration of these omics data has gained importance in all biological areas. Even if this omics data integration is useful and relevant, due to its heterogeneity, it is not uncommon to make mistakes during the integration phases. We therefore decided to present these ten quick tips to perform an omics data integration correctly, avoiding common mistakes we experienced or noticed in published studies in the past. Even if we designed our ten guidelines for beginners, using simple language that (we hope) can be understood by anyone, we believe our ten recommendations should be taken into account by all the bioinformaticians performing omics data integration, including experts.
Affiliation(s)
- Davide Chicco
  - Institute of Health Policy Management and Evaluation, University of Toronto, Toronto, Ontario, Canada
- Fabio Cumbo
  - Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, Ohio, United States of America
- Claudio Angione
  - School of Computing Engineering and Digital Technologies, Teesside University, Middlesbrough, United Kingdom
57
Salimy S, Lanjanian H, Abbasi K, Salimi M, Najafi A, Tapak L, Masoudi-Nejad A. A deep learning-based framework for predicting survival-associated groups in colon cancer by integrating multi-omics and clinical data. Heliyon 2023; 9:e17653. [PMID: 37455955 PMCID: PMC10344710 DOI: 10.1016/j.heliyon.2023.e17653] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2023] [Revised: 05/30/2023] [Accepted: 06/25/2023] [Indexed: 07/18/2023] Open
Abstract
Precise prognostic classification of patients and identifying survival subgroups and their associated genes can be important clinical references when designing treatment strategies for cancer patients. Multi-omics and data integration techniques are powerful tools to achieve this goal. This study aimed to introduce a machine learning method to integrate three types of biological data, and to investigate the performance of two other methods, in identifying the survival dependency of patients. The data included TCGA RNA-seq gene expression, DNA methylation, and clinical data from 368 patients with colon cancer; we also used an independent external validation data set containing 232 samples. Three methods, namely hyper-parameter optimized autoencoders (HPOAE), a normal autoencoder, and penalized principal component analysis (PPCA), were used for simultaneous data integration and estimation under a Cox hazards model. The HPOAE was thought to outperform other methods. The HPOAE had a Log Rank Mantel-Cox value of 14.27 ± 2 and a Breslow-Generalized Wilcoxon value of 13.13 ± 1. Ten miRNAs, 11 methylated genes, and 28 mRNAs were identified (all with importance of marginal cutoff > 0.95). The study demonstrated that hsa-miR-485-5p targets both ZMYM1 and TP53, the latter of which has been previously associated with cancer in numerous studies. Furthermore, compared to other methods, the HPOAE exhibited a greater capacity for identifying survival subgroups and the genes associated with them in patients with colon cancer. However, all of the results were obtained by computational methods, and clinical and experimental studies are needed to validate these results.
Affiliation(s)
- Siamak Salimy
  - Laboratory of System Biology and Bioinformatics (LBB), Department of Bioinformatics, University of Tehran, Kish International Campus, Kish, Iran
- Hossein Lanjanian
  - Cellular and Molecular Endocrine Research Center, Research Institute for Endocrine Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Karim Abbasi
  - Laboratory of System Biology, Bioinformatics & Artificial Intelligent in Medicine (LBBai), Faculty of Mathematics and Computer Science, Kharazmi University, Tehran, Iran
- Mahdieh Salimi
  - Department of Medical Genetics, Institute of Medical Biotechnology, National Institute of Genetic Engineering and Biotechnology (NIGEB), Tehran, Iran
- Ali Najafi
  - Molecular Biology Research Center, Systems Biology and Poisonings Institute, Tehran, Iran
- Leili Tapak
  - Department of Biostatistics, School of Public Health and Modeling of Noncommunicable Diseases Research Center, Hamadan University of Medical Sciences, Hamadan, Iran
- Ali Masoudi-Nejad
  - Laboratory of System Biology and Bioinformatics (LBB), Department of Bioinformatics, University of Tehran, Kish International Campus, Kish, Iran
58
Chatterjee B, Thakur SS. Proteins and metabolites fingerprints of gestational diabetes mellitus forming protein-metabolite interactomes are its potential biomarkers. Proteomics 2023; 23:e2200257. [PMID: 36919629 DOI: 10.1002/pmic.202200257] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2022] [Revised: 03/04/2023] [Accepted: 03/06/2023] [Indexed: 03/16/2023]
Abstract
Gestational diabetes mellitus (GDM) is a consequence of glucose intolerance with inadequate production of insulin that occurs during pregnancy and leads to adverse health consequences for both mother and fetus. GDM patients are at higher risk for preeclampsia and for developing type 2 diabetes mellitus in later life, while children born to GDM mothers are more prone to macrosomia and hypoglycemia. Universally accepted diagnostic criteria for GDM are lacking; therefore, there is a need for a diagnostic approach that can identify GDM at an early stage (first trimester). We have reviewed the literature on protein and metabolite fingerprints of GDM. Further, we have performed protein-protein, metabolite-metabolite, and protein-metabolite interaction network studies on GDM protein and metabolite fingerprints. Notably, some protein and metabolite fingerprints form strong interaction networks at high confidence scores. Therefore, we suggest that those proteins and metabolites that form protein-metabolite interactomes are potential biomarkers of GDM. The protein-metabolite biomarker interactome may help in a deeper understanding of the prognosis and pathogenesis of GDM, and also in the detection of GDM. The protein-metabolite interactome may be further applied in planning future therapeutic strategies to promote long-term health benefits in GDM mothers and their children.
Affiliation(s)
- Bhaswati Chatterjee
  - National Institute of Pharmaceutical Education and Research, Hyderabad, India
  - National Institute of Animal Biotechnology (NIAB), Hyderabad, India
- Suman S Thakur
  - Centre for Cellular and Molecular Biology, Hyderabad, India
59
Demir Karaman E, Işık Z. Multi-Omics Data Analysis Identifies Prognostic Biomarkers across Cancers. Med Sci (Basel) 2023; 11:44. [PMID: 37489460 PMCID: PMC10366886 DOI: 10.3390/medsci11030044] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2023] [Revised: 06/18/2023] [Accepted: 06/20/2023] [Indexed: 07/26/2023] Open
Abstract
Combining omics data from different layers using integrative methods provides a better understanding of the biology of a complex disease such as cancer. The discovery of biomarkers related to cancer development or prognosis helps to find more effective treatment options. This study integrates multi-omics data of different cancer types with a network-based approach to explore common gene modules among different tumors by running community detection methods on the integrated network. The common modules were evaluated by several biological metrics adapted to cancer. Then, a new prognostic scoring method was developed by weighting the mRNA expression, methylation, and mutation status of genes. The survival analysis yielded statistically significant results for the GNG11, CBX2, CDKN3, ARHGEF10, CLN8, SEC61G and PTDSS1 genes. The literature search reveals that the identified biomarkers are associated with the same or different types of cancers. Our method not only identifies known cancer-specific biomarker genes, but also proposes new potential biomarkers. Thus, this study provides a rationale for identifying new gene targets and expanding treatment options across cancer types.
Affiliation(s)
- Ezgi Demir Karaman
  - Department of Computer Engineering, Institute of Natural and Applied Sciences, Dokuz Eylul University, Izmir 35390, Turkey
- Zerrin Işık
  - Department of Computer Engineering, Faculty of Engineering, Dokuz Eylul University, Izmir 35390, Turkey
60
Kwoji ID, Aiyegoro OA, Okpeku M, Adeleke MA. 'Multi-omics' data integration: applications in probiotics studies. NPJ Sci Food 2023; 7:25. [PMID: 37277356 DOI: 10.1038/s41538-023-00199-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 08/12/2022] [Accepted: 05/22/2023] [Indexed: 06/07/2023] Open
Abstract
The concept of probiotics is receiving increasing attention due to its benefits in influencing the host microbiome and the modulation of host immunity through the strengthening of the gut barrier and stimulation of antibodies. These benefits, combined with the need for improved nutraceuticals, have resulted in the extensive characterization of probiotics, leading to a surge of data generated using several 'omics' technologies. Recent developments in systems biology approaches to microbial science are paving the way for integrating data generated from different omics techniques to understand the flow of molecular information from one 'omics' level to another, with clear information on regulatory features and phenotypes. The limitations of a 'single omics' application, and its tendency to ignore the influence of other molecular processes, justify the need for a 'multi-omics' application in selecting probiotics and understanding their action on the host. Different omics techniques, including genomics, transcriptomics, proteomics, metabolomics and lipidomics, used for studying probiotics and their influence on the host and the microbiome are discussed in this review. Furthermore, the rationale for 'multi-omics' and the multi-omics data integration platforms supporting probiotics and microbiome analyses are also elucidated. This review shows that multi-omics application is useful in selecting probiotics and understanding their functions on the host microbiome. Hence, we recommend a multi-omics approach for holistically understanding probiotics and the microbiome.
Affiliation(s)
- Iliya Dauda Kwoji
  - Discipline of Genetics, School of Life Sciences, College of Agriculture, Engineering and Sciences, University of KwaZulu-Natal, 4090, Durban, South Africa
- Olayinka Ayobami Aiyegoro
  - Unit for Environmental Sciences and Management, North-West University, Potchefstroom, Northwest, South Africa
- Moses Okpeku
  - Discipline of Genetics, School of Life Sciences, College of Agriculture, Engineering and Sciences, University of KwaZulu-Natal, 4090, Durban, South Africa
- Matthew Adekunle Adeleke
  - Discipline of Genetics, School of Life Sciences, College of Agriculture, Engineering and Sciences, University of KwaZulu-Natal, 4090, Durban, South Africa
61
Devonshire A, Gautam Y, Johansson E, Mersha TB. Multi-omics profiling approach in food allergy. World Allergy Organ J 2023; 16:100777. [PMID: 37214173 PMCID: PMC10199264 DOI: 10.1016/j.waojou.2023.100777] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 11/30/2022] [Revised: 04/05/2023] [Accepted: 04/05/2023] [Indexed: 05/24/2023] Open
Abstract
The prevalence of food allergy (FA) among children is increasing, affecting nearly 8% of children, and FA is the most common cause of anaphylaxis and anaphylaxis-related emergency department visits in children. Importantly, FA is a complex, multi-system, multifactorial disease mediated by food-specific immunoglobulin E (IgE) and type 2 immune responses and involving environmental and genetic factors and gene-environment interactions. Early exposure to external and internal environmental factors largely influences the development of immune responses to allergens. Genetic factors and gene-environment interactions have established roles in the FA pathophysiology. To improve diagnosis and identification of FA therapeutic targets, high-throughput omics approaches have emerged and been applied over the past decades to screen for potential FA biomarkers, such as genes, transcripts, proteins, and metabolites. In this article, we provide an overview of the current status of FA omics studies, namely genomic, transcriptomic, epigenomic, proteomic, exposomic, and metabolomic. The current development of multi-omics integration of FA studies is also briefly discussed. As individual omics technologies only provide limited information on the multi-system biological processes of FA, integration of population-based multi-omics data and clinical data may lead to robust biomarker discovery that could translate into advances in disease management and clinical care and ultimately lead to precision medicine approaches.
Affiliation(s)
- Ashley Devonshire
  - Division of Allergy and Immunology, Cincinnati Children's Hospital Medical Center, Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA
- Yadu Gautam
  - Division of Asthma Research, Cincinnati Children's Hospital Medical Center, Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA
- Elisabet Johansson
  - Division of Asthma Research, Cincinnati Children's Hospital Medical Center, Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA
- Tesfaye B. Mersha
  - Division of Asthma Research, Cincinnati Children's Hospital Medical Center, Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA
62
Choi JM, Chae H. moBRCA-net: a breast cancer subtype classification framework based on multi-omics attention neural networks. BMC Bioinformatics 2023; 24:169. [PMID: 37101124 PMCID: PMC10131354 DOI: 10.1186/s12859-023-05273-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2022] [Accepted: 04/05/2023] [Indexed: 04/28/2023] Open
Abstract
BACKGROUND: Breast cancer is a highly heterogeneous disease that comprises multiple biological components. Owing to its diversity, patients have different prognostic outcomes; hence, early diagnosis and accurate subtype prediction are critical for treatment. Standardized breast cancer subtyping systems, mainly based on single-omics datasets, have been developed to ensure proper treatment in a systematic manner. Recently, multi-omics data integration has attracted attention because it provides a comprehensive view of patients, but it poses a challenge due to its high dimensionality. In recent years, deep learning-based approaches have been proposed, but they still present several limitations. RESULTS: In this study, we describe moBRCA-net, an interpretable deep learning-based breast cancer subtype classification framework that uses multi-omics datasets. Three omics datasets comprising gene expression, DNA methylation and microRNA expression data were integrated while considering the biological relationships among them, and a self-attention module was applied to each omics dataset to capture the relative importance of each feature. The features were then transformed to new representations considering the respective learned importance, allowing moBRCA-net to predict the subtype. CONCLUSIONS: Experimental results confirmed that moBRCA-net performs significantly better than other methods, and the effectiveness of multi-omics integration and omics-level attention was demonstrated. moBRCA-net is publicly available at https://github.com/cbi-bioinfo/moBRCA-net.
Affiliation(s)
- Joung Min Choi
  - Department of Computer Science, Virginia Tech, Blacksburg, USA
- Heejoon Chae
  - Division of Computer Science, Sookmyung Women's University, Seoul, Republic of Korea
63
Yang D, Wu Y, Wan Z, Xu Z, Li W, Yuan P, Shang Q, Peng J, Tao L, Chen Q, Dan H, Xu H. HISMD: A Novel Immune Subtyping System for HNSCC. J Dent Res 2023; 102:270-279. [PMID: 36333876 DOI: 10.1177/00220345221134605] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 11/08/2022] Open
Abstract
Immune subtyping is an important way to reveal immune heterogeneity, which may contribute to the diversity of progression and treatment response in head and neck squamous cell carcinoma (HNSCC). However, reported immune subtypes mainly focus on levels of immune infiltration and are mostly based on a mono-omics profile. This study aimed to identify a comprehensive immune subtype for HNSCC via multi-omics clustering and build a novel subtype prediction system for clinical application. Data were obtained from The Cancer Genome Atlas database and our independent multicenter cohort. Multi-omics clustering was performed to identify 3 clusters of 499 patients in The Cancer Genome Atlas based on immune-related gene expression and somatic mutations. The immune characteristics and biological features of the obtained clusters were revealed by bioinformatics, and 3 immune subtypes were identified: 1) adaptive immune activation subtype predominantly enriched in T cells, 2) innate immune activation subtype predominantly enriched in macrophages, and 3) immune desert subtype. Subsequently, the clinical implications of each subtype were analyzed per clinical epidemiology. We found that adaptive immune activation showed better survival outcomes and a response to chemotherapy similar to that of innate immune activation, whereas immune desert might be relatively resistant to chemotherapy. Moreover, a subtype prediction system was developed by deep learning with whole slide images and named HISMD: HNSCC Immune Subtypes via Multi-omics and Deep Learning. We endowed HISMD with interpretability through image-based key feature extraction. The clinical implications, biological significance, and predictive stability of HISMD were successfully verified by using our independent multicenter cohort data set. In summary, this study revealed the immune heterogeneity of HNSCC and obtained a novel, highly accurate, and interpretable immune subtyping prediction system. For clinical implementation in the future, additional validation and utility studies are warranted.
Affiliation(s)
- D Yang
  - State Key Laboratory of Oral Diseases, National Clinical Research Center for Oral Diseases, Research Unit of Oral Carcinogenesis and Management, Chinese Academy of Medical Sciences, West China Hospital of Stomatology, Sichuan University, Chengdu, China
- Y Wu
  - State Key Laboratory of Oral Diseases, National Clinical Research Center for Oral Diseases, Research Unit of Oral Carcinogenesis and Management, Chinese Academy of Medical Sciences, West China Hospital of Stomatology, Sichuan University, Chengdu, China
- Z Wan
  - Department of Pathology, West China Hospital of Stomatology, Sichuan University, Chengdu, China
- Z Xu
  - State Key Laboratory of Oral Diseases, National Clinical Research Center for Oral Diseases, Research Unit of Oral Carcinogenesis and Management, Chinese Academy of Medical Sciences, West China Hospital of Stomatology, Sichuan University, Chengdu, China
- W Li
  - State Key Laboratory of Oral Diseases, National Clinical Research Center for Oral Diseases, Research Unit of Oral Carcinogenesis and Management, Chinese Academy of Medical Sciences, West China Hospital of Stomatology, Sichuan University, Chengdu, China
- P Yuan
  - State Key Laboratory of Oral Diseases, National Clinical Research Center for Oral Diseases, Research Unit of Oral Carcinogenesis and Management, Chinese Academy of Medical Sciences, West China Hospital of Stomatology, Sichuan University, Chengdu, China
- Q Shang
  - State Key Laboratory of Oral Diseases, National Clinical Research Center for Oral Diseases, Research Unit of Oral Carcinogenesis and Management, Chinese Academy of Medical Sciences, West China Hospital of Stomatology, Sichuan University, Chengdu, China
- J Peng
  - State Key Laboratory of Oral Diseases, National Clinical Research Center for Oral Diseases, Research Unit of Oral Carcinogenesis and Management, Chinese Academy of Medical Sciences, West China Hospital of Stomatology, Sichuan University, Chengdu, China
- L Tao
  - College of Mathematics, Sichuan University, Chengdu, China
- Q Chen
  - State Key Laboratory of Oral Diseases, National Clinical Research Center for Oral Diseases, Research Unit of Oral Carcinogenesis and Management, Chinese Academy of Medical Sciences, West China Hospital of Stomatology, Sichuan University, Chengdu, China
  - Key Laboratory of Oral Biomedical Research of Zhejiang Province, Affiliated Stomatology Hospital, Zhejiang University School of Stomatology, Hangzhou, China
- H Dan
  - State Key Laboratory of Oral Diseases, National Clinical Research Center for Oral Diseases, Research Unit of Oral Carcinogenesis and Management, Chinese Academy of Medical Sciences, West China Hospital of Stomatology, Sichuan University, Chengdu, China
- H Xu
  - State Key Laboratory of Oral Diseases, National Clinical Research Center for Oral Diseases, Research Unit of Oral Carcinogenesis and Management, Chinese Academy of Medical Sciences, West China Hospital of Stomatology, Sichuan University, Chengdu, China
64
Echegaray N, Yilmaz B, Sharma H, Kumar M, Pateiro M, Ozogul F, Lorenzo JM. A novel approach to Lactiplantibacillus plantarum: From probiotic properties to the omics insights. Microbiol Res 2023; 268:127289. [PMID: 36571922 DOI: 10.1016/j.micres.2022.127289] [Citation(s) in RCA: 48] [Impact Index Per Article: 24.0] [Received: 08/04/2022] [Revised: 10/24/2022] [Accepted: 12/15/2022] [Indexed: 12/24/2022]
Abstract
Lactiplantibacillus plantarum (previously known as Lactobacillus plantarum) strains are among the lactic acid bacteria (LAB) commonly used in fermentation, and their probiotic and functional properties, along with their health-promoting roles, have come to the fore. Food-derived L. plantarum strains have shown good resistance and adhesion in the gastrointestinal (GI) tract and excellent antioxidant and antimicrobial properties. Furthermore, many strains of L. plantarum can produce bacteriocins with interesting antimicrobial activity. These probiotic properties of L. plantarum, together with its presence in different niches, give it great potential to have beneficial effects on health. It has also been shown that L. plantarum can favorably regulate the composition of the intestinal microbiota. Recently, omics approaches such as metabolomics, secretomics, proteomics, transcriptomics and genomics have been used to understand the roles and mechanisms of L. plantarum that are related to its functional characteristics. This review provides an overview of the probiotic properties, including the specific interactions between microbiota and host, and omics insights of L. plantarum.
Affiliation(s)
- Noemí Echegaray
  - Centro Tecnológico de la Carne de Galicia, Avda. Galicia nº 4, Parque Tecnológico de Galicia, San Cibrao das Viñas, 32900 Ourense, Spain
- Birsen Yilmaz
  - Department of Nutrition and Dietetics, Cukurova University, Sarıcam, 01330 Adana, Turkey
- Heena Sharma
  - Dairy Technology Division, ICAR-National Dairy Research Institute, Karnāl, Haryana, 132001, India
- Manoj Kumar
  - Chemical and Biochemical Processing Division, Central Institute for Research on Cotton Technology, Mumbai 400019, India
- Mirian Pateiro
  - Centro Tecnológico de la Carne de Galicia, Avda. Galicia nº 4, Parque Tecnológico de Galicia, San Cibrao das Viñas, 32900 Ourense, Spain
- Fatih Ozogul
  - Department of Seafood Processing Technology, Faculty of Fisheries, Cukurova University, 01330, Adana, Turkey
- Jose Manuel Lorenzo
  - Centro Tecnológico de la Carne de Galicia, Avda. Galicia nº 4, Parque Tecnológico de Galicia, San Cibrao das Viñas, 32900 Ourense, Spain
  - Universidade de Vigo, Área de Tecnoloxía dos Alimentos, Facultade de Ciencias de Ourense, 32004 Ourense, Spain
65
Zafari N, Bathaei P, Velayati M, Khojasteh-Leylakoohi F, Khazaei M, Fiuji H, Nassiri M, Hassanian SM, Ferns GA, Nazari E, Avan A. Integrated analysis of multi-omics data for the discovery of biomarkers and therapeutic targets for colorectal cancer. Comput Biol Med 2023; 155:106639. [PMID: 36805214 DOI: 10.1016/j.compbiomed.2023.106639] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Received: 11/30/2022] [Revised: 01/14/2023] [Accepted: 02/05/2023] [Indexed: 02/12/2023]
Abstract
The considerable burden of colorectal cancer and its rising incidence in young adults emphasize the necessity of understanding its underlying mechanisms, providing new diagnostic and prognostic markers, and improving therapeutic approaches. Precision medicine is a new trend all over the world, and identification of novel biomarkers and therapeutic targets is a step forward towards this trend. In this context, multi-omics data and integrated analysis are being investigated to develop personalized medicine in the management of colorectal cancer. Given the large amount of data from multi-omics approaches, data integration and analysis is a great challenge. In this Review, we summarize how statistical and machine learning techniques are applied to analyze multi-omics data and how they contribute to the discovery of useful diagnostic and prognostic biomarkers and therapeutic targets. Moreover, we discuss the importance of these biomarkers and therapeutic targets in the clinical management of colorectal cancer in the future. Taken together, integrated analysis of multi-omics data has great potential for finding novel diagnostic and prognostic biomarkers and therapeutic targets; however, there are still challenges to overcome in future studies.
Affiliation(s)
- Nima Zafari
  - Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
- Parsa Bathaei
  - Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
- Mahla Velayati
  - Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
- Fatemeh Khojasteh-Leylakoohi
  - Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
  - Basic Sciences Research Institute, Mashhad University of Medical Sciences, Mashhad, Iran
  - Medical Genetics Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
- Majid Khazaei
  - Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
  - Basic Sciences Research Institute, Mashhad University of Medical Sciences, Mashhad, Iran
- Hamid Fiuji
  - Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
- Mohammadreza Nassiri
  - Recombinant Proteins Research Group, The Research Institute of Biotechnology, Ferdowsi University of Mashhad, Mashhad, Iran
- Seyed Mahdi Hassanian
  - Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
  - Basic Sciences Research Institute, Mashhad University of Medical Sciences, Mashhad, Iran
- Gordon A Ferns
  - Brighton & Sussex Medical School, Division of Medical Education, Falmer, Brighton, Sussex, BN1 9PH, UK
- Elham Nazari
  - Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
  - Basic Sciences Research Institute, Mashhad University of Medical Sciences, Mashhad, Iran
- Amir Avan
  - Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
  - Basic Sciences Research Institute, Mashhad University of Medical Sciences, Mashhad, Iran
  - Medical Genetics Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
66
Valles-Colomer M, Menni C, Berry SE, Valdes AM, Spector TD, Segata N. Cardiometabolic health, diet and the gut microbiome: a meta-omics perspective. Nat Med 2023; 29:551-561. [PMID: 36932240 PMCID: PMC11258867 DOI: 10.1038/s41591-023-02260-4] [Citation(s) in RCA: 37] [Impact Index Per Article: 18.5] [Received: 10/13/2022] [Accepted: 02/16/2023] [Indexed: 03/19/2023]
Abstract
Cardiometabolic diseases have become a leading cause of morbidity and mortality globally. They have been tightly linked to microbiome taxonomic and functional composition, with diet possibly mediating some of the associations described. Both the microbiome and diet are modifiable, which opens the way for novel therapeutic strategies. High-throughput omics techniques applied on microbiome samples (meta-omics) hold the unprecedented potential to shed light on the intricate links between diet, the microbiome, the metabolome and cardiometabolic health, with a top-down approach. However, effective integration of complementary meta-omic techniques is an open challenge and their application on large cohorts is still limited. Here we review meta-omics techniques and discuss their potential in this context, highlighting recent large-scale efforts and the novel insights they provided. Finally, we look to the next decade of meta-omics research and discuss various translational and clinical pathways to improving cardiometabolic health.
Affiliation(s)
- Mireia Valles-Colomer
  - Department of Cellular, Computational and Integrative Biology, University of Trento, Trento, Italy
- Cristina Menni
  - Department of Twin Research, King's College London, London, UK
- Sarah E Berry
  - Department of Nutritional Sciences, King's College London, London, UK
- Ana M Valdes
  - School of Medicine, University of Nottingham, Nottingham, UK
  - Nottingham National Institute for Health Research Biomedical Research Centre, Nottingham, UK
- Tim D Spector
  - Department of Twin Research, King's College London, London, UK
- Nicola Segata
  - Department of Cellular, Computational and Integrative Biology, University of Trento, Trento, Italy
  - European Institute of Oncology, Scientific Institute for Research, Hospitalization and Healthcare, Milan, Italy
67
Boßelmann CM, Hedrich UBS, Lerche H, Pfeifer N. Predicting functional effects of ion channel variants using new phenotypic machine learning methods. PLoS Comput Biol 2023; 19:e1010959. [PMID: 36877742 PMCID: PMC10019634 DOI: 10.1371/journal.pcbi.1010959] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 12/06/2022] [Revised: 03/16/2023] [Accepted: 02/19/2023] [Indexed: 03/07/2023] Open
Abstract
Missense variants in genes encoding ion channels are associated with a spectrum of severe diseases. Variant effects on biophysical function correlate with clinical features and can be categorized as gain- or loss-of-function. This information enables a timely diagnosis, facilitates precision therapy, and guides prognosis. Functional characterization presents a bottleneck in translational medicine. Machine learning models may be able to rapidly generate supporting evidence by predicting variant functional effects. Here, we describe a multi-task multi-kernel learning framework capable of harmonizing functional results and structural information with clinical phenotypes. This novel approach extends the human phenotype ontology towards kernel-based supervised machine learning. Our gain- or loss-of-function classifier achieves high performance (mean accuracy 0.853 SD 0.016, mean AU-ROC 0.912 SD 0.025), outperforming both conventional baseline and state-of-the-art methods. Performance is robust across different phenotypic similarity measures and largely insensitive to phenotypic noise or sparsity. Localized multi-kernel learning offered biological insight and interpretability by highlighting channels with implicit genotype-phenotype correlations or latent task similarity for downstream analysis.
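The core operation behind multi-kernel learning, combining several similarity (kernel) matrices into one, can be sketched as follows. This is a generic illustration, not the authors' framework: the two 2x2 kernels, their names and the fixed equal weights are hypothetical, whereas a multi-kernel learner would fit the weights from data (and a localized variant would let them vary across samples).

```python
def combine_kernels(kernels, weights):
    """Return the weighted sum of equally sized square kernel matrices."""
    if len(kernels) != len(weights):
        raise ValueError("one weight per kernel is required")
    n = len(kernels[0])
    return [
        [sum(w * k[i][j] for w, k in zip(weights, kernels)) for j in range(n)]
        for i in range(n)
    ]

# Hypothetical similarities between two variants from two information sources.
structure_kernel = [[1.0, 0.2], [0.2, 1.0]]   # e.g. structural similarity
phenotype_kernel = [[1.0, 0.8], [0.8, 1.0]]   # e.g. phenotypic similarity

# Equal weights chosen by hand for illustration only.
combined = combine_kernels([structure_kernel, phenotype_kernel], [0.5, 0.5])
```

With non-negative weights, the combined matrix remains a valid kernel whenever each input is one, so it can be passed to any kernel-based classifier.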
Affiliation(s)
- Christian Malte Boßelmann
  - Department of Neurology and Epileptology, Hertie Institute for Clinical Brain Research, University of Tuebingen, Tuebingen, Germany
  - Methods in Medical Informatics, Department of Computer Science, University of Tuebingen, Tuebingen, Germany
- Ulrike B. S. Hedrich
  - Department of Neurology and Epileptology, Hertie Institute for Clinical Brain Research, University of Tuebingen, Tuebingen, Germany
- Holger Lerche
  - Department of Neurology and Epileptology, Hertie Institute for Clinical Brain Research, University of Tuebingen, Tuebingen, Germany
  - * E-mail: (HL); (NP)
- Nico Pfeifer
  - Methods in Medical Informatics, Department of Computer Science, University of Tuebingen, Tuebingen, Germany
  - Institute for Bioinformatics and Medical Informatics, University of Tuebingen, Tuebingen, Germany
  - * E-mail: (HL); (NP)
68
Price BA, Marron JS, Mose LE, Perou CM, Parker JS. Translating transcriptomic findings from cancer model systems to humans through joint dimension reduction. Commun Biol 2023; 6:179. [PMID: 36797360 PMCID: PMC9935626 DOI: 10.1038/s42003-023-04529-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/12/2022] [Accepted: 01/25/2023] [Indexed: 02/18/2023] Open
Abstract
Model systems are an essential resource in cancer research. They simulate effects that we can extrapolate to humans, but come at a risk of inaccurately representing human biology. This inaccuracy can lead to inconclusive experiments or misleading results, underscoring the need for an improved process for translating model system findings into human-relevant data. We present a process for applying joint dimension reduction (jDR) to horizontally integrate gene expression data across model systems and human tumor cohorts. We then use this approach to combine human TCGA gene expression data with data from human cancer cell lines and mouse model tumors. By identifying the aspects of genomic variation joint-acting across cohorts, we demonstrate how predictive modeling and clinical biomarkers from model systems can be improved.
Affiliation(s)
- Brandon A Price
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- J S Marron
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
  - Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Lisle E Mose
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Charles M Perou
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Joel S Parker
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
69
Flores JE, Claborne DM, Weller ZD, Webb-Robertson BJM, Waters KM, Bramer LM. Missing data in multi-omics integration: Recent advances through artificial intelligence. Front Artif Intell 2023; 6:1098308. [PMID: 36844425 PMCID: PMC9949722 DOI: 10.3389/frai.2023.1098308] [Citation(s) in RCA: 36] [Impact Index Per Article: 18.0] [Received: 11/14/2022] [Accepted: 01/23/2023] [Indexed: 02/11/2023] Open
Abstract
Biological systems function through complex interactions between various 'omics (biomolecules), and a more complete understanding of these systems is only possible through an integrated, multi-omic perspective. This has presented the need for the development of integration approaches that are able to capture the complex, often non-linear, interactions that define these biological systems and are adapted to the challenges of combining the heterogeneous data across 'omic views. A principal challenge to multi-omic integration is missing data, because not all biomolecules are measured in all samples. Due to cost, instrument sensitivity, or other experimental factors, data for a biological sample may be missing for one or more 'omic technologies. Recent methodological developments in artificial intelligence and statistical learning have greatly facilitated the analyses of multi-omics data; however, many of these techniques assume access to completely observed data. A subset of these methods incorporate mechanisms for handling partially observed samples, and these methods are the focus of this review. We describe recently developed approaches, noting their primary use cases and highlighting each method's approach to handling missing data. We additionally provide an overview of the more traditional missing data workflows and their limitations, and we discuss potential avenues for further developments as well as how the missing data issue and its current solutions may generalize beyond the multi-omics context.
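To make the missing-data problem concrete, here is a minimal sketch of two widely used traditional strategies: dropping incomplete samples (complete-case analysis) and filling gaps with a per-feature mean. The abstract does not say which traditional workflows the review covers, so this choice, the function names and the toy values are assumptions made for illustration only.

```python
def complete_cases(samples):
    """Keep only samples with no missing (None) values."""
    return [row for row in samples if all(v is not None for v in row)]

def mean_impute(samples):
    """Replace each None with the mean of the observed values in that column.

    A column with no observed values is left as None.
    """
    n_cols = len(samples[0])
    means = []
    for j in range(n_cols):
        observed = [row[j] for row in samples if row[j] is not None]
        means.append(sum(observed) / len(observed) if observed else None)
    return [
        [means[j] if v is None else v for j, v in enumerate(row)]
        for row in samples
    ]

# Hypothetical data: rows are samples, columns are two 'omic measurements.
samples = [
    [1.0, 10.0],
    [3.0, None],   # second measurement missing
    [None, 30.0],  # first measurement missing
]

kept = complete_cases(samples)   # only the fully observed sample remains
imputed = mean_impute(samples)   # every sample is retained, with gaps filled
```

In this toy case, complete-case analysis discards two of the three samples, while mean imputation keeps them all at the cost of shrinking the variance of the filled columns. Both are commonly cited limitations of these simple approaches.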
Affiliation(s)
- Javier E. Flores
  - Pacific Northwest National Laboratory, Biological Sciences Division, Earth and Biological Sciences Directorate, Richland, WA, United States
- Daniel M. Claborne
  - Pacific Northwest National Laboratory, Artificial Intelligence and Data Analytics Division, National Security Directorate, Richland, WA, United States
- Zachary D. Weller
  - Pacific Northwest National Laboratory, Artificial Intelligence and Data Analytics Division, National Security Directorate, Richland, WA, United States
- Bobbie-Jo M. Webb-Robertson
  - Pacific Northwest National Laboratory, Biological Sciences Division, Earth and Biological Sciences Directorate, Richland, WA, United States
- Katrina M. Waters
  - Pacific Northwest National Laboratory, Biological Sciences Division, Earth and Biological Sciences Directorate, Richland, WA, United States
- Lisa M. Bramer
  - Pacific Northwest National Laboratory, Biological Sciences Division, Earth and Biological Sciences Directorate, Richland, WA, United States
70
Hu X, Carver BF, El-Kassaby YA, Zhu L, Chen C. Weighted kernels improve multi-environment genomic prediction. Heredity (Edinb) 2023; 130:82-91. [PMID: 36522412 PMCID: PMC9905581 DOI: 10.1038/s41437-022-00582-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2021] [Revised: 11/27/2022] [Accepted: 11/28/2022] [Indexed: 12/23/2022] Open
Abstract
Crucial to variety improvement programs is the reliable and accurate prediction of genotype performance across environments. However, due to the impactful presence of genotype by environment (G×E) interaction, which dictates how changes in expression and function of genes influence target traits in different environments, the prediction performance of genomic selection (GS) using single-environment models often falls short. Furthermore, despite the successes of genome-wide association studies (GWAS), the genetic insights derived from genome-to-phenome mapping have not yet been incorporated in predictive analytics, making GS models that use a Gaussian kernel primarily an estimator of genomic similarity rather than of the underlying genetic characteristics of the populations. Here, we developed a GS framework that, in addition to capturing the overall genomic relationship, can capitalize on the signal of genetic associations of the phenotypic variation as well as the genetic characteristics of the populations. The capacity of predicting the performance of populations across environments was demonstrated by an overall gain in predictability of up to 31% for the winter wheat DH population. Compared to Gaussian kernels, we showed that our multi-environment weighted kernels could better leverage the significance of genetic associations and yielded a marked improvement of 4-33% in prediction accuracy for half-sib families. Furthermore, the flexibility incorporated in our Bayesian implementation provides the generalizable capacity required for predicting multiple highly genetically heterogeneous populations across environments, allowing reliable GS for genetic improvement programs that have no access to genetically uniform material.
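The contrast between a plain Gaussian kernel and a marker-weighted one can be sketched as below. The abstract does not give the paper's kernel formula or say how weights are derived from GWAS results, so the parameterisation (squared distance divided by a bandwidth), the function names and the example weights are all assumptions for illustration.

```python
import math

def gaussian_kernel(x, y, bandwidth=1.0):
    """Similarity from the unweighted squared distance between marker vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / bandwidth)

def weighted_gaussian_kernel(x, y, weights, bandwidth=1.0):
    """Same kernel, but each marker's contribution is scaled by its weight."""
    sq_dist = sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y))
    return math.exp(-sq_dist / bandwidth)

# Hypothetical genotypes coded 0/1/2 at three markers for two lines.
line_a = [0, 1, 2]
line_b = [0, 2, 0]

# Hypothetical weights: only the second marker is treated as trait-associated.
association_weights = [0.0, 1.0, 0.0]

plain = gaussian_kernel(line_a, line_b)
weighted = weighted_gaussian_kernel(line_a, line_b, association_weights)
```

With all weights equal to 1, the weighted kernel reduces to the plain one. Here the two lines look more similar under the weighted kernel because their large difference at the third marker is given no weight.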
Affiliation(s)
- Xiaowei Hu
  - Department of Statistics, Oklahoma State University, Stillwater, OK, USA
  - Present Address: Center for Public Health Genomics, University of Virginia, Charlottesville, VA, USA
- Brett F. Carver
  - Department of Plant and Soil Sciences, Oklahoma State University, Stillwater, OK, USA
- Yousry A. El-Kassaby
  - Department of Forest and Conservation Sciences, University of British Columbia, Vancouver, BC, Canada
- Lan Zhu
  - Department of Statistics, Oklahoma State University, Stillwater, OK, USA
- Charles Chen
  - Department of Biochemistry and Molecular Biology, Oklahoma State University, Stillwater, OK, USA
71
Yu Y, Shi S. Development and Perspective of Rhodotorula toruloides as an Efficient Cell Factory. J Agric Food Chem 2023; 71:1802-1819. [PMID: 36688927 DOI: 10.1021/acs.jafc.2c07361] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 06/17/2023]
Abstract
Rhodotorula toruloides is receiving significant attention as a novel cell factory because of its high production of lipids and carotenoids, fast growth and high cell density, as well as its ability to utilize a wide variety of substrates. These attractive traits make R. toruloides a potential low-cost producer that can be engineered for the production of various fuels and chemicals. However, limited understanding of its biology and a lack of genetic engineering tools impede its metabolic engineering applications. A number of research efforts have been devoted to filling these gaps. This review focuses on recent developments in genetic engineering tools, advances in systems biology for improved understanding, and emerging engineered strains for metabolic engineering applications. Finally, future trends and barriers in developing R. toruloides as a cell factory are also discussed.
Affiliation(s)
- Yi Yu
  - Beijing Advanced Innovation Center for Soft Matter Science and Engineering, College of Life Science and Technology, Beijing University of Chemical Technology, Beijing 100029, China
- Shuobo Shi
  - Beijing Advanced Innovation Center for Soft Matter Science and Engineering, College of Life Science and Technology, Beijing University of Chemical Technology, Beijing 100029, China
72
Big Data in Gastroenterology Research. Int J Mol Sci 2023; 24:ijms24032458. [PMID: 36768780 PMCID: PMC9916510 DOI: 10.3390/ijms24032458] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 12/30/2022] [Revised: 01/18/2023] [Accepted: 01/20/2023] [Indexed: 01/28/2023] Open
Abstract
Studying individual data types in isolation provides only limited and incomplete answers to complex biological questions and particularly falls short in revealing sufficient mechanistic and kinetic details. In contrast, multi-omics approaches to studying health and disease permit the generation and integration of multiple data types on a much larger scale, offering a comprehensive picture of biological and disease processes. Gastroenterology and hepatobiliary research are particularly well-suited to such analyses, given the unique position of the luminal gastrointestinal (GI) tract at the nexus between the gut (mucosa and luminal contents), brain, immune and endocrine systems, and GI microbiome. The generation of 'big data' from multi-omic, multi-site studies can enhance investigations into the connections between these organ systems and organisms and more broadly and accurately appraise the effects of dietary, pharmacological, and other therapeutic interventions. In this review, we describe a variety of useful omics approaches and how they can be integrated to provide a holistic depiction of the human and microbial genetic and proteomic changes underlying physiological and pathophysiological phenomena. We highlight the potential pitfalls and alternatives to help avoid the common errors in study design, execution, and analysis. We focus on the application, integration, and analysis of big data in gastroenterology and hepatobiliary research.
73
Chen X, Han M, Li Y, Li X, Zhang J, Zhu Y. Identification of functional gene modules by integrating multi-omics data and known molecular interactions. Front Genet 2023; 14:1082032. [PMID: 36760999 PMCID: PMC9902936 DOI: 10.3389/fgene.2023.1082032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2022] [Accepted: 01/11/2023] [Indexed: 01/25/2023] Open
Abstract
Multi-omics data integration has emerged as a promising approach to identify patient subgroups. However, in terms of grouping genes (or gene products) into co-expression modules, data integration methods suffer from two main drawbacks. First, most existing methods only consider genes or samples measured in all different datasets. Second, known molecular interactions (e.g., transcriptional regulatory interactions, protein-protein interactions and biological pathways) cannot be utilized to assist in module detection. Herein, we present a novel data integration framework, Correlation-based Local Approximation of Membership (CLAM), which provides two methodological innovations to address these limitations: 1) constructing a trans-omics neighborhood matrix by integrating multi-omics datasets and known molecular interactions, and 2) using a local approximation procedure to define gene modules from the matrix. Applying CLAM to human colorectal cancer (CRC) and mouse B-cell differentiation multi-omics data obtained from The Cancer Genome Atlas (TCGA), Clinical Proteomic Tumor Analysis Consortium (CPTAC), Gene Expression Omnibus (GEO) and ProteomeXchange database, we demonstrated its superior ability to recover biologically relevant modules and gene ontology (GO) terms. Further investigation of the colorectal cancer modules revealed numerous transcription factors and KEGG pathways that played crucial roles in colorectal cancer progression. Module-based survival analysis constructed four survival-related networks in which pairwise gene correlations were significantly correlated with colorectal cancer patient survival. Overall, the series of evaluations demonstrated the great potential of CLAM for identifying modular biomarkers for complex diseases. We implemented CLAM as a user-friendly application available at https://github.com/free1234hm/CLAM.
Collapse
Affiliation(s)
- Xiaoqing Chen
- Basic Medical School, Anhui Medical University, Hefei, China,National Center for Protein Sciences (Beijing), Beijing Proteome Research Center, Beijing Institute of Lifeomics, Beijing, China
| | - Mingfei Han
- National Center for Protein Sciences (Beijing), Beijing Proteome Research Center, Beijing Institute of Lifeomics, Beijing, China
| | - Yingxing Li
- Central Research Laboratory, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
| | - Xiao Li
- National Center for Protein Sciences (Beijing), Beijing Proteome Research Center, Beijing Institute of Lifeomics, Beijing, China
| | - Jiaqi Zhang
- National Center for Protein Sciences (Beijing), Beijing Proteome Research Center, Beijing Institute of Lifeomics, Beijing, China
| | - Yunping Zhu
- Basic Medical School, Anhui Medical University, Hefei, China; National Center for Protein Sciences (Beijing), Beijing Proteome Research Center, Beijing Institute of Lifeomics, Beijing, China; *Correspondence: Yunping Zhu,
| |
|
74
|
Harbig TA, Fratte J, Krone M, Nieselt K. OmicsTIDE: interactive exploration of trends in multi-omics data. Bioinformatics Advances 2023; 3:vbac093. [PMID: 36698763 PMCID: PMC9869718 DOI: 10.1093/bioadv/vbac093] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2022] [Revised: 10/18/2022] [Accepted: 12/06/2022] [Indexed: 01/22/2023]
Abstract
Motivation: The increasing amount of data produced by omics technologies has enabled researchers to study phenomena across multiple omics layers. Besides data-driven analysis strategies, interactive visualization tools have been developed for a more transparent analysis. However, most state-of-the-art tools do not reconstruct the impact of a single omics layer on the integration result. Results: We developed a data classification scheme focusing on different aspects of multi-omics datasets for a systemic understanding. Based on this classification, we developed the Omics Trend-comparing Interactive Data Explorer (OmicsTIDE), an interactive visualization tool for the comparison of gene-based quantitative omics data. The tool consists of a computational part that clusters omics datasets to determine trends and an interactive visualization. The trends are visualized as profile plots and are connected by a Sankey diagram that allows for an interactive pairwise trend comparison to discover concordant and discordant trends. Moreover, large-scale omics datasets are broken down into small subsets that can be analyzed functionally using Gene Ontology enrichment within a few analysis steps. We demonstrate the interactive analysis using OmicsTIDE with two case studies focusing on different experimental designs. Availability and implementation: OmicsTIDE is a web tool available via http://omicstide-tuevis.cs.uni-tuebingen.de/. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
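The pairwise trend comparison described in this abstract can be sketched in a few lines. The following is a minimal illustration, not OmicsTIDE's implementation: it assumes each gene has already been assigned a trend label that is comparable across the two datasets, and the function name `trend_flows`, the gene names and the labels are all hypothetical.

```python
from collections import Counter


def trend_flows(trends_a, trends_b):
    """Compare per-gene trend labels between two omics datasets.

    Returns the link counts a Sankey diagram would draw, plus the genes
    whose trends agree (concordant) or differ (discordant).
    Only genes present in both datasets are compared.
    """
    shared = sorted(set(trends_a) & set(trends_b))
    flows = Counter((trends_a[g], trends_b[g]) for g in shared)
    concordant = [g for g in shared if trends_a[g] == trends_b[g]]
    discordant = [g for g in shared if trends_a[g] != trends_b[g]]
    return flows, concordant, discordant


# Hypothetical trend assignments for two omics layers.
transcriptome = {"geneA": "up", "geneB": "up", "geneC": "down", "geneD": "flat"}
proteome = {"geneA": "up", "geneB": "down", "geneC": "down", "geneE": "up"}

flows, concordant, discordant = trend_flows(transcriptome, proteome)
```

Each key of `flows` is a (trend in dataset A, trend in dataset B) pair, so its count corresponds to the width of one Sankey link; genes measured in only one dataset (`geneD`, `geneE` here) are left out of the comparison.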
Affiliation(s)
- Theresa A Harbig
- Institute for Bioinformatics and Medical Informatics, University of Tuebingen, Tuebingen 72076, Germany
| | - Julian Fratte
- Institute for Bioinformatics and Medical Informatics, University of Tuebingen, Tuebingen 72076, Germany
| | - Michael Krone
- Institute for Bioinformatics and Medical Informatics, University of Tuebingen, Tuebingen 72076, Germany
| | - Kay Nieselt
- Institute for Bioinformatics and Medical Informatics, University of Tuebingen, Tuebingen 72076, Germany
| |
|
75
|
Ochoa S, Hernández-Lemus E. Functional impact of multi-omic interactions in breast cancer subtypes. Front Genet 2023; 13:1078609. [PMID: 36685900 PMCID: PMC9850112 DOI: 10.3389/fgene.2022.1078609] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 10/24/2022] [Accepted: 12/15/2022] [Indexed: 01/07/2023]
Abstract
Multi-omic approaches are expected to deliver a broader molecular view of cancer. However, the promised mechanistic explanations have not yet fully materialized. Here, we propose a theoretical and computational analysis framework to semi-automatically produce network models of the regulatory constraints influencing a biological function. In this way, we identified functions significantly enriched in the analyzed omics and described associated features for each of the four breast cancer molecular subtypes. For instance, we identified functions sustaining over-representation of invasion-related processes in the basal subtype and DNA modification processes in the normal tissue. We found limited overlap in the omics-associated functions between subtypes; however, a startling feature intersection within subtype functions also emerged. The examples presented highlight new, potentially regulatory features, with sound biological reasons to expect a connection with the functions. Multi-omic regulatory networks thus constitute reliable models of the way omics are connected, demonstrating a capability for systematic generation of mechanistic hypotheses.
Affiliation(s)
- Soledad Ochoa
- Computational Genomics Division, National Institute of Genomic Medicine, Mexico City, Mexico; Programa de Doctorado en Ciencias Biomédicas, Universidad Nacional Autónoma de México, Mexico City, Mexico
| | - Enrique Hernández-Lemus
- Computational Genomics Division, National Institute of Genomic Medicine, Mexico City, Mexico; Center for Complexity Sciences, Universidad Nacional Autónoma de México, Mexico City, Mexico; *Correspondence: Enrique Hernández-Lemus,
| |
|
76
|
Jihad M, Yet İ. Multiomics Integration at Single-Cell Resolution Using Bayesian Networks: A Case Study in Hepatocellular Carcinoma. OMICS: A Journal of Integrative Biology 2023; 27:24-33. [PMID: 36602810 DOI: 10.1089/omi.2022.0170] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/06/2023]
Abstract
Multiomics data integration is one of the leading frontiers of complex disease research and integrative biology. The advances in single-cell sequencing technologies offer yet another crucial dimension in multiomics research. Single-cell studies enable the study and integration of multiomics data simultaneously in the same cell. We report in this study multiomics data integration at single-cell resolution using Bayesian networks (BNs) in a case study of hepatocellular carcinoma (HCC). A BN encodes the conditional dependencies/independencies of variables using a graphical model with an accompanying joint probability. RNA-seq and Reduced Representation Bisulfite Sequencing data were analyzed separately, and copy number variations were estimated by the hidden Markov model method. Several BN models were constructed to reveal omics' causal and associational relationships. These methods were subjected to a validation study using an independent data set. We show the heterogeneity of the multiple cellular layers of HCC at single-cell omics resolution by identifying best-fitted BN models of 295 genes. We also provide novel insights into the multiomics mechanistic relationships in the human leukocyte antigen class I genes in HCC. To the best of our knowledge, this is the first study to focus on integrating omics data using a machine learning algorithm, BNs, at single-cell resolution using a case study of HCC.
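The "accompanying joint probability" of a BN mentioned in this abstract has a standard form: the joint distribution factorizes over the directed acyclic graph, with each variable conditioned only on its parents. The three-node network in the second line is a hypothetical illustration (copy number C and methylation M both pointing at expression E of one gene), not a model taken from the paper.

```latex
% General factorization over a DAG with parent sets Pa(X_i)
P(X_1, \dots, X_n) = \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{Pa}(X_i)\bigr)

% Hypothetical example: C -> E <- M
P(C, M, E) = P(C)\, P(M)\, P(E \mid C, M)
```

In this example C and M have no parents, so they contribute marginal terms, and the graph asserts that C and M are marginally independent while both influence E.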
Affiliation(s)
- Muntadher Jihad
- Department of Bioinformatics, Graduate School of Health Sciences, Hacettepe University, Ankara, Turkey
| | - İdil Yet
- Department of Bioinformatics, Graduate School of Health Sciences, Hacettepe University, Ankara, Turkey
| |
|
77
|
Vos WAJW, Groenendijk AL, Blaauw MJT, van Eekeren LE, Navas A, Cleophas MCP, Vadaq N, Matzaraki V, dos Santos JC, Meeder EMG, Fröberg J, Weijers G, Zhang Y, Fu J, ter Horst R, Bock C, Knoll R, Aschenbrenner AC, Schultze J, Vanderkerckhove L, Hwandih T, Wonderlich ER, Vemula SV, van der Kolk M, de Vet SCP, Blok WL, Brinkman K, Rokx C, Schellekens AFA, de Mast Q, Joosten LAB, Berrevoets MAH, Stalenhoef JE, Verbon A, van Lunzen J, Netea MG, van der Ven AJAM. The 2000HIV study: Design, multi-omics methods and participant characteristics. Front Immunol 2022; 13:982746. [PMID: 36605197 PMCID: PMC9809279 DOI: 10.3389/fimmu.2022.982746] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2022] [Accepted: 10/25/2022] [Indexed: 01/07/2023]
Abstract
Background: Even during long-term combination antiretroviral therapy (cART), people living with HIV (PLHIV) have a dysregulated immune system, characterized by persistent immune activation, accelerated immune ageing and increased risk of non-AIDS comorbidities. A multi-omics approach is applied to a large cohort of PLHIV to understand pathways underlying these dysregulations in order to identify new biomarkers and novel genetically validated therapeutic drug targets. Methods: The 2000HIV study is a prospective longitudinal cohort study of PLHIV on cART. In addition, untreated HIV spontaneous controllers were recruited. In-depth multi-omics characterization will be performed, including genomics, epigenomics, transcriptomics, proteomics, metabolomics and metagenomics, functional immunological assays and extensive immunophenotyping. Furthermore, the latent viral reservoir will be assessed through cell-associated HIV-1 RNA and DNA, and full-length individual proviral sequencing on a subset. Clinical measurements include an ECG, carotid intima-media thickness and plaque measurement, hepatic steatosis and fibrosis measurement as well as psychological symptoms and recreational drug questionnaires. Additionally, considering the developing pandemic, COVID-19 history and vaccination were recorded. Participants return for a two-year follow-up visit. The 2000HIV study consists of a discovery and validation cohort collected at separate sites to immediately validate any finding in an independent cohort. Results: Overall, 1895 PLHIV from four sites were included for analysis, 1559 in the discovery and 336 in the validation cohort. The study population was representative of a Western European HIV population, including 288 (15.2%) cis-women, 463 (24.4%) non-whites, and 1360 (71.8%) MSM (Men who have Sex with Men). Extreme phenotypes included 114 spontaneous controllers, 81 rapid progressors and 162 immunological non-responders. According to the Framingham score, 321 (16.9%) had a cardiovascular risk of >20% in the next 10 years. COVID-19 infection was documented in 234 (12.3%) participants and 474 (25.0%) individuals had received a COVID-19 vaccine. Conclusion: The 2000HIV study established a cohort of 1895 PLHIV that employs multi-omics to discover new biological pathways and biomarkers to unravel non-AIDS comorbidities, extreme phenotypes and the latent viral reservoir that impact the health of PLHIV. The ultimate goal is to contribute to a more personalized approach to the best standard of care and a potential cure for PLHIV.
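The cohort figures quoted in the Results can be cross-checked with simple arithmetic. The sketch below assumes that every percentage uses the full cohort of 1895 participants as its denominator, which the abstract implies but does not state outright; the variable names are illustrative.

```python
# Counts and percentages as reported in the abstract.
discovery, validation = 1559, 336
total = discovery + validation

reported = {
    "cis-women": (288, 15.2),
    "non-whites": (463, 24.4),
    "MSM": (1360, 71.8),
    "Framingham risk >20%": (321, 16.9),
    "documented COVID-19 infection": (234, 12.3),
    "received COVID-19 vaccine": (474, 25.0),
}


def percent(count, denominator):
    """Percentage of the denominator, rounded to one decimal place."""
    return round(100 * count / denominator, 1)


# Collect any group whose recomputed percentage differs from the reported one.
mismatches = {
    name: percent(count, total)
    for name, (count, pct) in reported.items()
    if percent(count, total) != pct
}
```

Under that assumption the two cohort sizes sum to the stated total and `mismatches` stays empty, so the reported percentages are internally consistent.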
Affiliation(s)
- Wilhelm A. J. W. Vos
- Department of Internal Medicine and Infectious Diseases, Radboudumc, Radboud University, Nijmegen, Netherlands; Department of Internal Medicine and Infectious Diseases, OLVG, Amsterdam, Netherlands; *Correspondence: Wilhelm A. J. W. Vos,
| | - Albert L. Groenendijk
- Department of Internal Medicine and Infectious Diseases, Radboudumc, Radboud University, Nijmegen, Netherlands; Department of Internal Medicine and Department of Medical Microbiology and Infectious diseases, Erasmus Medical Center (MC), Erasmus University, Rotterdam, Netherlands
| | - Marc J. T. Blaauw
- Department of Internal Medicine and Infectious Diseases, Radboudumc, Radboud University, Nijmegen, Netherlands; Department of Internal Medicine and Infectious Diseases, Elizabeth-Tweesteden Ziekenhuis, Tilburg, Netherlands
| | - Louise E. van Eekeren
- Department of Internal Medicine and Infectious Diseases, Radboudumc, Radboud University, Nijmegen, Netherlands
| | - Adriana Navas
- Department of Internal Medicine and Infectious Diseases, Radboudumc, Radboud University, Nijmegen, Netherlands
| | - Maartje C. P. Cleophas
- Department of Internal Medicine and Infectious Diseases, Radboudumc, Radboud University, Nijmegen, Netherlands
| | - Nadira Vadaq
- Department of Internal Medicine and Infectious Diseases, Radboudumc, Radboud University, Nijmegen, Netherlands
| | - Vasiliki Matzaraki
- Department of Internal Medicine and Infectious Diseases, Radboudumc, Radboud University, Nijmegen, Netherlands
| | - Jéssica C. dos Santos
- Department of Internal Medicine and Infectious Diseases, Radboudumc, Radboud University, Nijmegen, Netherlands
| | - Elise M. G. Meeder
- Department of Psychiatry, Radboudumc, Radboud University, Nijmegen, Netherlands; Donders Institute for Brain, Cognition and Behavior, Radboud University, Nijmegen, Netherlands; Nijmegen Institute for Scientist-Practitioners in Addiction (NISPA), Radboud University, Nijmegen, Netherlands
| | - Janeri Fröberg
- Department of Internal Medicine and Infectious Diseases, Radboudumc, Radboud University, Nijmegen, Netherlands
| | - Gert Weijers
- Medical UltraSound Imaging Center (MUSIC) Department of Medical Imaging, Radboudumc, Radboud University, Nijmegen, Netherlands
| | - Yue Zhang
- Universitair Medisch Centrum Groningen, University of Groningen, Groningen, Netherlands
| | - Jingyuan Fu
- Universitair Medisch Centrum Groningen, University of Groningen, Groningen, Netherlands
| | - Rob ter Horst
- Center for Molecular Medicine (CeMM) Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria
| | - Christoph Bock
- Center for Molecular Medicine (CeMM) Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria; Medical University of Vienna, Center for Medical Statistics, Informatics and Intelligent Systems (CeMSIIS), Institute of Artificial Intelligence, Vienna, Austria
| | - Rainer Knoll
- Systems Medicine, Deutsches Zentrum für Neurodegenerative Erkrankungen (DZNE) eingetragener Verein (e.V.), Bonn, Germany; Genomics & Immunoregulation, Life & Medical Sciences (LIMES) Institute, University of Bonn, Bonn, Germany
| | - Anna C. Aschenbrenner
- Department of Internal Medicine and Infectious Diseases, Radboudumc, Radboud University, Nijmegen, Netherlands; Platform for Single Cell Genomics and Epigenomics (PRECISE), DZNE and University of Bonn, Bonn, Germany
| | - Joachim Schultze
- Systems Medicine, Deutsches Zentrum für Neurodegenerative Erkrankungen (DZNE) eingetragener Verein (e.V.), Bonn, Germany; Genomics & Immunoregulation, Life & Medical Sciences (LIMES) Institute, University of Bonn, Bonn, Germany; Platform for Single Cell Genomics and Epigenomics (PRECISE), DZNE and University of Bonn, Bonn, Germany
| | - Linos Vanderkerckhove
- HIV Cure Research Center, Department of Internal Medicine and Pediatrics, Ghent University Hospital, Ghent University, Ghent, Belgium
| | - Talent Hwandih
- Medical Science Department, Sysmex Europe Societas Europaea (SE), Norderstedt, Germany
| | | | - Sai V. Vemula
- Clinical Development, ViiV Healthcare, Durham, NC, United States
| | - Mike van der Kolk
- Translational Medical Research, ViiV Healthcare, Brentford, United Kingdom
| | - Sterre C. P. de Vet
- Department of Internal Medicine and Infectious Diseases, OLVG, Amsterdam, Netherlands
| | - Willem L. Blok
- Department of Internal Medicine and Infectious Diseases, OLVG, Amsterdam, Netherlands
| | - Kees Brinkman
- Department of Internal Medicine and Infectious Diseases, OLVG, Amsterdam, Netherlands
| | - Casper Rokx
- Department of Internal Medicine and Department of Medical Microbiology and Infectious diseases, Erasmus Medical Center (MC), Erasmus University, Rotterdam, Netherlands
| | - Arnt F. A. Schellekens
- Department of Psychiatry, Radboudumc, Radboud University, Nijmegen, Netherlands; Donders Institute for Brain, Cognition and Behavior, Radboud University, Nijmegen, Netherlands; Nijmegen Institute for Scientist-Practitioners in Addiction (NISPA), Radboud University, Nijmegen, Netherlands
| | - Quirijn de Mast
- Department of Internal Medicine and Infectious Diseases, Radboudumc, Radboud University, Nijmegen, Netherlands
| | - Leo A. B. Joosten
- Department of Internal Medicine and Infectious Diseases, Radboudumc, Radboud University, Nijmegen, Netherlands; Department of Medical Genetics, Iuliu Hatieganu University of Medicine and Pharmacy, Cluj-Napoca, Romania
| | - Marvin A. H. Berrevoets
- Department of Internal Medicine and Infectious Diseases, Elizabeth-Tweesteden Ziekenhuis, Tilburg, Netherlands
| | - Janneke E. Stalenhoef
- Department of Internal Medicine and Infectious Diseases, OLVG, Amsterdam, Netherlands
| | - Annelies Verbon
- Department of Internal Medicine and Department of Medical Microbiology and Infectious diseases, Erasmus Medical Center (MC), Erasmus University, Rotterdam, Netherlands
| | - Jan van Lunzen
- Translational Medical Research, ViiV Healthcare, Brentford, United Kingdom
| | - Mihai G. Netea
- Department of Internal Medicine and Infectious Diseases, Radboudumc, Radboud University, Nijmegen, Netherlands; Department of Immunology and Metabolism, Life and Medical Sciences Institute, University of Bonn, Bonn, Germany
| | - Andre J. A. M. van der Ven
- Department of Internal Medicine and Infectious Diseases, Radboudumc, Radboud University, Nijmegen, Netherlands
| |
|
78
|
Alfatemi A, Peng H, Rong W, Zhang B, Cai H. Patient subgrouping with distinct survival rates via integration of multiomics data on a Grassmann manifold. BMC Med Inform Decis Mak 2022; 22:190. [PMID: 35870923 PMCID: PMC9308936 DOI: 10.1186/s12911-022-01938-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2022] [Accepted: 07/15/2022] [Indexed: 11/10/2022]
Abstract
Background: Patient subgroups are important for easily understanding a disease and for providing precise yet personalized treatment through multiple omics dataset integration. Multiomics datasets are produced daily. Thus, the fusion of heterogeneous big data into intrinsic structures is an urgent problem. Novel mathematical methods are needed to process these data in a straightforward way. Results: We developed a novel method for subgrouping patients with distinct survival rates via the integration of multiple omics datasets and by using principal component analysis to reduce the high data dimensionality. Then, we constructed similarity graphs for patients, merged the graphs in a subspace, and analyzed them on a Grassmann manifold. The proposed method could identify patient subgroups that had not been reported previously by selecting the most critical information during the merging at each level of the omics dataset. Our method was tested on empirical multiomics datasets from The Cancer Genome Atlas. Conclusion: Through the integration of microRNA, gene expression, and DNA methylation data, our method accurately identified patient subgroups and achieved superior performance compared with popular methods. Supplementary Information: The online version contains supplementary material available at 10.1186/s12911-022-01938-y.
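The abstract does not say how subspaces are compared on the Grassmann manifold. One commonly used measure is the squared projection distance, k - ||U^T V||_F^2, between two k-dimensional subspaces with orthonormal bases U and V. The sketch below computes it in plain Python; choosing this particular distance is an assumption for illustration, and the function name and toy bases are hypothetical.

```python
def projection_distance_sq(U, V):
    """Squared projection distance between span(U) and span(V).

    U and V are lists of k orthonormal basis vectors (each a list of
    floats of equal length). The result is k - ||U^T V||_F^2, which is
    0 for identical subspaces and k for mutually orthogonal ones.
    """
    k = len(U)
    overlap = sum(
        sum(u_i * v_i for u_i, v_i in zip(u, v)) ** 2  # squared dot product
        for u in U
        for v in V
    )
    return k - overlap


# Toy 2-dimensional subspaces of R^3, e.g. one per omics layer.
xy_plane = [[1, 0, 0], [0, 1, 0]]
xz_plane = [[1, 0, 0], [0, 0, 1]]

same = projection_distance_sq(xy_plane, xy_plane)
different = projection_distance_sq(xy_plane, xz_plane)
```

The two planes share one direction (the x-axis), so they sit at distance 1 out of a maximum of 2. The distance depends only on the subspaces, not on which orthonormal basis represents them, which is what makes it a natural choice when merging per-omics subspaces.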
|
79
|
Pino JC, Lubbock AL, Harris LA, Gutierrez DB, Farrow MA, Muszynski N, Tsui T, Sherrod SD, Norris JL, McLean JA, Caprioli RM, Wikswo JP, Lopez CF. Processes in DNA damage response from a whole-cell multi-omics perspective. iScience 2022; 25:105341. [PMID: 36339253 PMCID: PMC9633746 DOI: 10.1016/j.isci.2022.105341] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 12/16/2021] [Revised: 08/10/2022] [Accepted: 10/10/2022] [Indexed: 11/09/2022]
Abstract
Technological advances have made it feasible to collect multi-condition multi-omic time courses of cellular response to perturbation, but the complexity of these datasets impedes discovery due to challenges in data management, analysis, visualization, and interpretation. Here, we report a whole-cell mechanistic analysis of HL-60 cellular response to bendamustine. We integrate both enrichment and network analysis to show the progression of DNA damage and programmed cell death over time in molecular, pathway, and process-level detail using an interactive analysis framework for multi-omics data. Our framework, Mechanism of Action Generator Involving Network analysis (MAGINE), automates network construction and enrichment analysis across multiple samples and platforms, which can be integrated into our annotated gene-set network to combine the strengths of networks and ontology-driven analysis. Taken together, our work demonstrates how multi-omics integration can be used to explore signaling processes at various resolutions and demonstrates multi-pathway involvement beyond the canonical bendamustine mechanism.
Affiliation(s)
- James C. Pino
- Chemical and Physical Biology Graduate Program, Vanderbilt University, Nashville, TN, USA
- Department of Biochemistry, Vanderbilt University School of Medicine, Nashville, TN, USA
- Pacific Northwest National Laboratory, Seattle, WA, USA
| | - Alexander L.R. Lubbock
- Department of Biochemistry, Vanderbilt University School of Medicine, Nashville, TN, USA
- Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Leonard A. Harris
- Department of Biomedical Engineering, University of Arkansas, Fayetteville, AR, USA
- Interdisciplinary Graduate Program in Cell & Molecular Biology, University of Arkansas, Fayetteville, AR, USA
- Cancer Biology Program, Winthrop P. Rockefeller Cancer Institute, University of Arkansas for Medical Sciences, Little Rock, AR, USA
| | - Danielle B. Gutierrez
- Department of Biochemistry, Vanderbilt University School of Medicine, Nashville, TN, USA
| | - Melissa A. Farrow
- Department of Pathology, Microbiology, and Immunology, Vanderbilt University School of Medicine, Nashville, TN, USA
| | - Nicole Muszynski
- Department of Biomedical Engineering, Vanderbilt University, Nashville, TN, USA
| | - Tina Tsui
- Department of Biochemistry, Vanderbilt University School of Medicine, Nashville, TN, USA
| | - Stacy D. Sherrod
- Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, TN, USA
- Department of Chemistry, Vanderbilt University, Nashville, TN, USA
- Center for Innovative Technology (CIT), Vanderbilt University, Nashville, TN, USA
| | - Jeremy L. Norris
- Department of Biochemistry, Vanderbilt University School of Medicine, Nashville, TN, USA
- Department of Chemistry, Vanderbilt University, Nashville, TN, USA
| | - John A. McLean
- Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, TN, USA
- Department of Chemistry, Vanderbilt University, Nashville, TN, USA
- Center for Innovative Technology (CIT), Vanderbilt University, Nashville, TN, USA
| | - Richard M. Caprioli
- Department of Biochemistry, Vanderbilt University School of Medicine, Nashville, TN, USA
- Department of Chemistry, Vanderbilt University, Nashville, TN, USA
- Department of Pharmacology, Vanderbilt University, Nashville, TN, USA
- Department of Medicine, Vanderbilt University, Nashville, TN, USA
| | - John P. Wikswo
- Department of Biomedical Engineering, Vanderbilt University, Nashville, TN, USA
- Department of Physics and Astronomy, Vanderbilt University, Nashville, TN, USA
- Department of Molecular Physiology and Biophysics, Vanderbilt University, Nashville, TN, USA
- Vanderbilt Institute for Integrative Biosystems Research and Education, Vanderbilt University, Nashville, TN, USA
| | - Carlos F. Lopez
- Department of Biochemistry, Vanderbilt University School of Medicine, Nashville, TN, USA
- Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, TN, USA
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
- Pacific Northwest National Laboratory, Seattle, WA, USA
| |
|
80
|
Guillotin S, Delcourt N. Studying the Impact of Persistent Organic Pollutants Exposure on Human Health by Proteomic Analysis: A Systematic Review. Int J Mol Sci 2022; 23:14271. [PMID: 36430748 PMCID: PMC9692675 DOI: 10.3390/ijms232214271] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 10/28/2022] [Revised: 11/14/2022] [Accepted: 11/15/2022] [Indexed: 11/19/2022]
Abstract
Persistent organic pollutants (POPs) are organic chemical substances that are widely distributed in environments around the globe. POPs accumulate in living organisms and are found at high concentrations in the food chain. Humans are thus continuously exposed to these chemical substances, which exert adverse hepatic, reproductive, developmental, behavioral, neurologic, endocrine, cardiovascular, and immunologic health effects. However, much remains unknown about the mechanisms by which POPs exert their adverse effects in humans, as well as the molecular and cellular responses involved. Data are notably lacking concerning the consequences of acute and chronic POP exposure on changes in gene expression, protein profile, and metabolic pathways. We conducted a systematic review to provide a synthesis of knowledge of POPs arising from proteomics-based research. The data source used for this review was PubMed. This study was carried out following the PRISMA guidelines. Of the 742 items originally identified, 89 were considered in the review. This review presents a comprehensive overview of the most recent research and available solutions to explore proteomics datasets to identify new features relevant to human health. Future perspectives in proteomics studies are discussed.
Affiliation(s)
- Sophie Guillotin
- Poison Control Centre, Toulouse University Hospital, 31059 Toulouse, France
- INSERM UMR 1295, Centre d’Epidémiologie et de Recherche en Santé des Populations, 31000 Toulouse, France
| | - Nicolas Delcourt
- Poison Control Centre, Toulouse University Hospital, 31059 Toulouse, France
- INSERM UMR 1214, Toulouse NeuroImaging Center, 31024 Toulouse, France
- Correspondence: ; Tel.: +33-(0)-567691640
| |
|
81
|
Way GP, Natoli T, Adeboye A, Litichevskiy L, Yang A, Lu X, Caicedo JC, Cimini BA, Karhohs K, Logan DJ, Rohban MH, Kost-Alimova M, Hartland K, Bornholdt M, Chandrasekaran SN, Haghighi M, Weisbart E, Singh S, Subramanian A, Carpenter AE. Morphology and gene expression profiling provide complementary information for mapping cell state. Cell Syst 2022; 13:911-923.e9. [PMID: 36395727 PMCID: PMC10246468 DOI: 10.1016/j.cels.2022.10.001] [Citation(s) in RCA: 56] [Impact Index Per Article: 18.7] [Received: 11/18/2021] [Revised: 05/12/2022] [Accepted: 09/28/2022] [Indexed: 01/26/2023]
Abstract
Morphological and gene expression profiling can cost-effectively capture thousands of features in thousands of samples across perturbations by disease, mutation, or drug treatments, but it is unclear to what extent the two modalities capture overlapping versus complementary information. Here, using both the L1000 and Cell Painting assays to profile gene expression and cell morphology, respectively, we perturb human A549 lung cancer cells with 1,327 small molecules from the Drug Repurposing Hub across six doses, providing a data resource including dose-response data from both assays. The two assays capture both shared and complementary information for mapping cell state. Cell Painting profiles from compound perturbations are more reproducible and show more diversity but measure fewer distinct groups of features. Applying unsupervised and supervised methods to predict compound mechanisms of action (MOAs) and gene targets, we find that the two assays not only provide a partially shared but also a complementary view of drug mechanisms. Given the numerous applications of profiling in biology, our analyses provide guidance for planning experiments that profile cells for detecting distinct cell types, disease phenotypes, and response to chemical or genetic perturbations.
Affiliation(s)
- Gregory P Way
- Imaging Platform, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO 80045, USA
| | - Ted Natoli
- Cancer Program, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Adeniyi Adeboye
- Imaging Platform, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Lev Litichevskiy
- Cancer Program, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Andrew Yang
- Cancer Program, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Xiaodong Lu
- Cancer Program, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Juan C Caicedo
- Imaging Platform, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Beth A Cimini
- Imaging Platform, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Kyle Karhohs
- Imaging Platform, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - David J Logan
- Imaging Platform, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Mohammad H Rohban
- Imaging Platform, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Maria Kost-Alimova
- Center for the Development of Therapeutics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Kate Hartland
- Center for the Development of Therapeutics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Michael Bornholdt
- Imaging Platform, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | | | - Marzieh Haghighi
- Imaging Platform, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Erin Weisbart
- Imaging Platform, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Shantanu Singh
- Imaging Platform, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Aravind Subramanian
- Cancer Program, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.
| | - Anne E Carpenter
- Imaging Platform, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.
| |
|
82
|
Hawe JS, Saha A, Waldenberger M, Kunze S, Wahl S, Müller-Nurasyid M, Prokisch H, Grallert H, Herder C, Peters A, Strauch K, Theis FJ, Gieger C, Chambers J, Battle A, Heinig M. Network reconstruction for trans acting genetic loci using multi-omics data and prior information. Genome Med 2022; 14:125. [PMID: 36344995 PMCID: PMC9641770 DOI: 10.1186/s13073-022-01124-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/15/2022] [Accepted: 10/11/2022] [Indexed: 11/09/2022]
Abstract
BACKGROUND: Molecular measurements of the genome, the transcriptome, and the epigenome, often termed multi-omics data, provide an in-depth view of biological systems, and their integration is crucial for gaining insights into complex regulatory processes. These data can be used to explain disease-related genetic variants by linking them to intermediate molecular traits (quantitative trait loci, QTL). Molecular networks regulating cellular processes leave footprints in QTL results as so-called trans-QTL hotspots. Reconstructing these networks is a complex endeavor, and use of biological prior information can improve network inference. However, previous efforts were limited in the types of priors used or have only been applied to model systems. In this study, we reconstruct the regulatory networks underlying trans-QTL hotspots using human cohort data and data-driven prior information. METHODS: We devised a new strategy to integrate QTL with human population-scale multi-omics data. State-of-the-art network inference methods including BDgraph and glasso were applied to these data. Comprehensive prior information to guide network inference was manually curated from large-scale biological databases. The inference approach was extensively benchmarked using simulated data and cross-cohort replication analyses. Best-performing methods were subsequently applied to real-world human cohort data. RESULTS: Our benchmarks showed that prior-based strategies outperform methods without prior information in simulated data and show better replication across datasets. Application of our approach to human cohort data highlighted two novel regulatory networks related to schizophrenia and lean body mass for which we generated novel functional hypotheses. CONCLUSIONS: We demonstrate that existing biological knowledge can improve the integrative analysis of networks underlying trans associations and generate novel hypotheses about regulatory mechanisms.
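For readers unfamiliar with glasso, the estimator it refers to can be written compactly. The form below is one common statement of the graphical lasso with edge-specific penalties, which is a natural way for prior information to enter; how the authors actually map their curated priors to penalties is not described in this abstract, so the weights λ_ij should be read as an assumption for illustration.

```latex
% S: sample covariance of the molecular traits; \Theta: precision matrix.
% A nonzero \Theta_{ij} corresponds to an edge between traits i and j.
\hat{\Theta} = \arg\max_{\Theta \succ 0}
  \; \log\det\Theta \;-\; \operatorname{tr}(S\,\Theta)
  \;-\; \sum_{i \neq j} \lambda_{ij}\,\lvert \Theta_{ij} \rvert
```

Setting a smaller λ_ij for pairs supported by prior knowledge makes the corresponding edge cheaper to include, while a uniform λ_ij = λ recovers the prior-free graphical lasso.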
Affiliation(s)
- Johann S. Hawe
- Institute of Computational Biology, German Research Center for Environmental Health, HelmholtzZentrum München, Neuherberg, Germany
- German Heart Centre Munich, Department of Cardiology, Technical University Munich, Munich, Germany
- Department of Informatics, Technical University of Munich, Garching, Germany
| | - Ashis Saha
- Department of Computer Science, Johns Hopkins University, Baltimore, MD USA
| | - Melanie Waldenberger
- Research Unit of Molecular Epidemiology, German Research Center for Environmental Health, HelmholtzZentrum München, Neuherberg, Germany
| | - Sonja Kunze
- Research Unit of Molecular Epidemiology, German Research Center for Environmental Health, HelmholtzZentrum München, Neuherberg, Germany
| | - Simone Wahl
- Research Unit of Molecular Epidemiology, German Research Center for Environmental Health, HelmholtzZentrum München, Neuherberg, Germany
| | - Martina Müller-Nurasyid
- Institute of Genetic Epidemiology, German Research Center for Environmental Health, HelmholtzZentrum München, Neuherberg, Germany
- IBE, Faculty of Medicine, LMU Munich, 81377 Munich, Germany
- Institute of Medical Biostatistics, Epidemiology and Informatics (IMBEI), University Medical Center, Johannes Gutenberg University, Mainz, Germany
- Department of Internal Medicine I (Cardiology), Hospital of the Ludwig-Maximilians-University (LMU) Munich, Munich, Germany
| | - Holger Prokisch
- Institute of Human Genetics, School of Medicine, Technische Universität München, Munich, Germany
| | - Harald Grallert
- Research Unit of Molecular Epidemiology, German Research Center for Environmental Health, HelmholtzZentrum München, Neuherberg, Germany
- Institute of Epidemiology, German Research Center for Environmental Health, HelmholtzZentrum München, Neuherberg, Germany
- German Center for Diabetes Research (DZD), Neuherberg, Germany
| | - Christian Herder
- German Center for Diabetes Research (DZD), Neuherberg, Germany
- Institute for Clinical Diabetology, German Diabetes Center, Leibniz Center for Diabetes Research at Heinrich Heine University, Düsseldorf, Germany
- Division of Endocrinology and Diabetology, Medical Faculty, Heinrich Heine University, Düsseldorf, Germany
| | - Annette Peters
- Institute of Epidemiology, German Research Center for Environmental Health, HelmholtzZentrum München, Neuherberg, Germany
| | - Konstantin Strauch
- Institute of Genetic Epidemiology, German Research Center for Environmental Health, HelmholtzZentrum München, Neuherberg, Germany
- Institute of Medical Biostatistics, Epidemiology and Informatics (IMBEI), University Medical Center, Johannes Gutenberg University, Mainz, Germany
- Chair of Genetic Epidemiology, IBE, Faculty of Medicine, LMU Munich, Munich, Germany
| | - Fabian J. Theis
- Department of Informatics, Technical University of Munich, Garching, Germany
- Department of Mathematics, Technical University of Munich, Garching, Germany
| | - Christian Gieger
- Research Unit of Molecular Epidemiology, German Research Center for Environmental Health, HelmholtzZentrum München, Neuherberg, Germany
- Institute of Epidemiology, German Research Center for Environmental Health, HelmholtzZentrum München, Neuherberg, Germany
- German Center for Diabetes Research (DZD), Neuherberg, Germany
| | - John Chambers
- Department of Epidemiology and Biostatistics, MRC-PHE Centre for Environment and Health, School of Public Health, Imperial College London, London, UK
- Lee Kong Chian School of Medicine, Nanyang Technological University, 308232 Singapore, Singapore
| | - Alexis Battle
- Department of Computer Science, Johns Hopkins University, Baltimore, MD USA
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD USA
| | - Matthias Heinig
- Institute of Computational Biology, German Research Center for Environmental Health, HelmholtzZentrum München, Neuherberg, Germany
- Department of Informatics, Technical University of Munich, Garching, Germany
- Munich Heart Association, Partner Site Munich, DZHK (German Centre for Cardiovascular Research), 10785 Berlin, Germany
| |
|
83
|
Zhanpeng H, Jiekang W. A Multiview Clustering Method With Low-Rank and Sparsity Constraints for Cancer Subtyping. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022; 19:3213-3223. [PMID: 34705654 DOI: 10.1109/tcbb.2021.3122917] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 06/13/2023]
Abstract
Multiomics data clustering is one of the major challenges in the field of precision medicine. Integration of multiomics data for cancer subtyping can improve the understanding of cancer and reveal systems-level insights. How to integrate multiomics data for accurate cancer subtyping is an interesting and challenging research problem. To capture the global and the local structure of omics data, a novel framework for integrating multiomics data is proposed for cancer subtyping. Multiview clustering with low-rank and sparsity constraints (MVCLRS) can measure the local similarities of samples in each omics data type and obtain global consensus structures by integrating the multiomics data. The main insight provided by MVCLRS is that low-rank sparse subspace clustering for the construction of an affinity matrix can best capture the local similarities in omics data. Extensive testing is conducted on 10 real-world cancer datasets with multiomics data from The Cancer Genome Atlas. Compared with 10 state-of-the-art multiomics clustering algorithms, MVCLRS performs better in the 10 cancer datasets by providing its clustering results with at least one enriched clinical label in nine of ten cancer subtypes, the most of any method.
|
84
|
Li H, Chiang AWT, Lewis NE. Artificial intelligence in the analysis of glycosylation data. Biotechnol Adv 2022; 60:108008. [PMID: 35738510 PMCID: PMC11157671 DOI: 10.1016/j.biotechadv.2022.108008] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 01/08/2022] [Revised: 06/15/2022] [Accepted: 06/16/2022] [Indexed: 11/18/2022]
Abstract
Glycans are complex, yet ubiquitous across biological systems. They are involved in diverse essential organismal functions. Aberrant glycosylation may lead to disease development, such as cancer, autoimmune diseases, and inflammatory diseases. Glycans, both normal and aberrant, are synthesized using extensive glycosylation machinery, and understanding this machinery can provide invaluable insights for diagnosis, prognosis, and treatment of various diseases. Increasing amounts of glycomics data are being generated thanks to advances in glycoanalytics technologies, but to maximize the value of such data, innovations are needed for analyzing and interpreting large-scale glycomics data. Artificial intelligence (AI) provides a powerful analysis toolbox in many scientific fields, and here we review state-of-the-art AI approaches on glycosylation analysis. We further discuss how models can be analyzed to gain mechanistic insights into glycosylation machinery and how the machinery shapes glycans under different scenarios. Finally, we propose how to leverage the gained knowledge for developing predictive AI-based models of glycosylation. Thus, guiding future research of AI-based glycosylation model development will provide valuable insights into glycosylation and glycan machinery.
Affiliation(s)
- Haining Li
- Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA
| | - Austin W T Chiang
- Department of Pediatrics, University of California, San Diego, La Jolla, CA 92093, USA.
| | - Nathan E Lewis
- Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA; Department of Pediatrics, University of California, San Diego, La Jolla, CA 92093, USA.
| |
|
85
|
Chen S, Zang Y, Xu B, Lu B, Ma R, Miao P, Chen B. An Unsupervised Deep Learning-Based Model Using Multiomics Data to Predict Prognosis of Patients with Stomach Adenocarcinoma. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2022; 2022:5844846. [PMID: 36339684 PMCID: PMC9633210 DOI: 10.1155/2022/5844846] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/19/2022] [Revised: 09/25/2022] [Accepted: 10/08/2022] [Indexed: 09/08/2023]
Abstract
METHODS Patients (363 in total) with stomach adenocarcinoma from The Cancer Genome Atlas (TCGA) cohort were included. An autoencoder was constructed to integrate the RNA sequencing, miRNA sequencing, and methylation data. The features of the bottleneck layer were used to perform the k-means clustering algorithm to obtain different subgroups for evaluating the prognosis-related risk of stomach adenocarcinoma. The model's robustness was verified using a 10-fold cross-validation (CV). Survival was analyzed by the Kaplan-Meier method. Univariate and multivariate Cox regression was used to estimate hazard risk. The model was validated in three independent cohorts with different endpoints. RESULTS The patients were divided into low-risk and high-risk groups according to the k-means clustering algorithm. The high-risk group had a significantly higher risk of poor survival (log-rank P value = 2.80e-06; adjusted hazard ratio = 2.386, 95% confidence interval: 1.607-3.543), a concordance index (C-index) of 0.714, and a Brier score of 0.184. The model performed well both in the 10-fold CV procedure and three independent cohorts from the Gene Expression Omnibus (GEO) repository. CONCLUSIONS A robust and generalizable model based on the autoencoder was proposed to integrate multiomics data and predict the prognosis of patients with stomach adenocarcinoma. The model demonstrates better performance than two alternative approaches on prognosis prediction. The results might provide the grounds for further exploring the potential biomarkers to predict the prognosis of patients with stomach adenocarcinoma.
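The concordance index reported in this abstract can be illustrated with a minimal sketch. It assumes Harrell's pairwise definition for right-censored data; the abstract does not say which estimator the authors used, and the function name and toy values below are hypothetical.

```python
def concordance_index(times, events, risks):
    """Harrell-style C-index for right-censored survival data.

    times  -- observed follow-up time per patient
    events -- 1 if the event (death) was observed, 0 if censored
    risks  -- predicted risk score (higher means worse prognosis)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        # A pair is only comparable if the earlier time is an observed event.
        if not events[i]:
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0   # earlier event has the higher risk
                elif risks[i] == risks[j]:
                    concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy cohort (illustrative values only, not from the study).
times = [5, 8, 12, 20, 30]
events = [1, 1, 0, 1, 0]
risks = [0.9, 0.7, 0.8, 0.3, 0.1]
c_index = concordance_index(times, events, risks)
print(c_index)  # 0.875: 7 of the 8 comparable pairs are ordered correctly
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which gives a scale for reading the reported 0.714.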
Affiliation(s)
- Sizhen Chen
- Department of Epidemiology and Biostatistics, School of Public Health, Southeast University, Nanjing 210009, China
| | - Yiteng Zang
- Department of Epidemiology and Biostatistics, School of Public Health, Southeast University, Nanjing 210009, China
| | - Biyun Xu
- Department of Biostatistics, Nanjing Drum Tower Hospital, The Affiliated Hospital of Nanjing University Medical School, Nanjing 210008, China
| | - Beier Lu
- Department of Epidemiology and Biostatistics, School of Public Health, Southeast University, Nanjing 210009, China
| | - Rongji Ma
- Department of Epidemiology and Biostatistics, School of Public Health, Southeast University, Nanjing 210009, China
| | - Pengcheng Miao
- Department of Epidemiology and Biostatistics, School of Public Health, Southeast University, Nanjing 210009, China
| | - Bingwei Chen
- Department of Epidemiology and Biostatistics, School of Public Health, Southeast University, Nanjing 210009, China
| |
|
86
|
de Novais FJ, Yu H, Cesar ASM, Momen M, Poleti MD, Petry B, Mourão GB, Regitano LCDA, Morota G, Coutinho LL. Multi-omic data integration for the study of production, carcass, and meat quality traits in Nellore cattle. Front Genet 2022; 13:948240. [PMID: 36338989 PMCID: PMC9634488 DOI: 10.3389/fgene.2022.948240] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/19/2022] [Accepted: 10/06/2022] [Indexed: 11/18/2022] Open
Abstract
Data integration using hierarchical analysis based on the central dogma or common pathway enrichment analysis may not reveal non-obvious relationships among omic data. Here, we applied factor analysis (FA) and Bayesian network (BN) modeling to integrate different omic data and complex traits by latent variables (production, carcass, and meat quality traits). A total of 14 latent variables were identified: five for phenotype, three for miRNA, four for protein, and two for mRNA data. Pearson correlation coefficients showed negative correlations between latent variables miRNA 1 (mirna1) and miRNA 2 (mirna2) (-0.47), ribeye area (REA) and protein 4 (prot4) (-0.33), REA and protein 2 (prot2) (-0.3), carcass and prot4 (-0.31), carcass and prot2 (-0.28), and backfat thickness (BFT) and miRNA 3 (mirna3) (-0.25). Positive correlations were observed among the four protein factors (0.45-0.83): between meat quality and fat content (0.71), fat content and carcass (0.74), fat content and REA (0.76), and REA and carcass (0.99). BN presented arcs from the carcass, meat quality, prot2, and prot4 latent variables to REA; from meat quality, REA, mirna2, and gene expression mRNA1 to fat content; from protein 1 (prot1) and mirna2 to protein 5 (prot5); and from prot5 and carcass to prot2. The relations of protein latent variables suggest new hypotheses about the impact of these proteins on REA. The network also showed relationships among miRNAs and nebulin proteins. REA seems to be the central node in the network, influencing carcass, prot2, prot4, mRNA1, and meat quality, suggesting that REA is a good indicator of meat quality. The connection among miRNA latent variables, BFT, and fat content relates to the influence of miRNAs on lipid metabolism. The relationship between mirna1 and prot5 composed of isoforms of nebulin needs further investigation. The FA identified latent variables, decreasing the dimensionality and complexity of the data. 
The BN was capable of generating interrelationships among latent variables from different types of data, allowing the integration of omics and complex traits and identifying conditional independencies. Our framework based on FA and BN is capable of generating new hypotheses for molecular research, by integrating different types of data and exploring non-obvious relationships.
Affiliation(s)
- Francisco José de Novais
- Department of Animal Science, Luiz de Queiroz College of Agriculture, University of São Paulo, Piracicaba, Brazil
| | - Haipeng Yu
- Department of Animal and Poultry Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA, United States
| | - Aline Silva Mello Cesar
- Department of Agri-Food Industry, Food and Nutrition, University of São Paulo, Piracicaba, Brazil
| | - Mehdi Momen
- Department of Animal and Poultry Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA, United States
| | - Mirele Daiana Poleti
- Department of Veterinary Medicine, School of Animal Science and Food Engineering, University of Sao Paulo, Pirassununga, Brazil
| | - Bruna Petry
- Department of Animal Science, Luiz de Queiroz College of Agriculture, University of São Paulo, Piracicaba, Brazil
| | - Gerson Barreto Mourão
- Department of Animal Science, Luiz de Queiroz College of Agriculture, University of São Paulo, Piracicaba, Brazil
| | | | - Gota Morota
- Department of Animal and Poultry Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA, United States
| | - Luiz Lehmann Coutinho
- Department of Animal Science, Luiz de Queiroz College of Agriculture, University of São Paulo, Piracicaba, Brazil
| |
|
87
|
Samtal C, El Jaddaoui I, Hamdi S, Bouguenouch L, Ouldim K, Nejjari C, Ghazal H, Bekkari H. Review of prostate cancer genomic studies in Africa. Front Genet 2022; 13:911101. [PMID: 36303548 PMCID: PMC9593051 DOI: 10.3389/fgene.2022.911101] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/04/2022] [Accepted: 09/28/2022] [Indexed: 09/07/2024] Open
Abstract
Prostate cancer (PCa) is the second most commonly diagnosed cancer in men worldwide and one of the most frequent cancers in men in Africa. The heterogeneity of this cancer fosters the need to identify potential genetic risk factors/biomarkers. Omics variations may significantly contribute to early diagnosis and personalized treatment. However, there are few genomic studies of this disease in African populations. This review sheds light on the status of genomics research on PCa in Africa and outlines the common variants identified thus far. The allele frequencies of the most significant SNPs in Afro-native, Afro-descendant, and European populations were compared. We discuss how these few but promising data will aid in understanding, better diagnosing, and precisely treating this cancer, and we highlight the need for further collaborative research on the genomics of PCa on the African continent.
Affiliation(s)
- Chaimae Samtal
- Laboratory of Biotechnology, Environment, Agri-food and Health, Faculty of Sciences Dhar El Mahraz–Sidi Mohammed Ben Abdellah University, Fez, Morocco
| | - Islam El Jaddaoui
- Laboratory of Human Pathologies Biology, Department of Biology, Faculty of Sciences, and Genomic Center of Human Pathologies, Faculty of Medicine and Pharmacy, University Mohammed V, Rabat, Morocco
| | - Salsabil Hamdi
- Laboratory of Environmental Health, Institut Pasteur Maroc, Casablanca, Morocco
| | - Laila Bouguenouch
- Faculty of Medicine, Pharmacy and Dentistry‒Sidi Mohammed Ben Abdellah University, University Hospital Hassan II, Fez, Morocco
| | - Karim Ouldim
- Faculty of Medicine, Pharmacy and Dentistry‒Sidi Mohammed Ben Abdellah University, University Hospital Hassan II, Fez, Morocco
| | - Chakib Nejjari
- Department of Medicine, School of Medicine, Mohammed VI University of Health Sciences, Casablanca, Morocco
- School of Medicine and Pharmacy, Fes, Morocco
| | - Hassan Ghazal
- Laboratory of Biotechnology, Environment, Agri-food and Health, Faculty of Sciences Dhar El Mahraz–Sidi Mohammed Ben Abdellah University, Fez, Morocco
- Laboratory of Genomics and Bioinformatics, School of Pharmacy, Mohammed VI University of Health Sciences, Casablanca, Morocco
- National Center for Scientific and Technical Research, Rabat, Morocco
| | - Hicham Bekkari
- Laboratory of Biotechnology, Environment, Agri-food and Health, Faculty of Sciences Dhar El Mahraz–Sidi Mohammed Ben Abdellah University, Fez, Morocco
| |
|
88
|
Li Y, Mansmann U, Du S, Hornung R. Benchmark study of feature selection strategies for multi-omics data. BMC Bioinformatics 2022; 23:412. [PMID: 36199022 PMCID: PMC9533501 DOI: 10.1186/s12859-022-04962-x] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 03/22/2022] [Accepted: 09/21/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND In the last few years, multi-omics data, that is, datasets containing different types of high-dimensional molecular variables for the same samples, have become increasingly available. To date, several comparison studies focused on feature selection methods for omics data, but to our knowledge, none compared these methods for the special case of multi-omics data. Given that these data have specific structures that differentiate them from single-omics data, it is unclear whether different feature selection strategies may be optimal for such data. In this paper, using 15 cancer multi-omics datasets we compared four filter methods, two embedded methods, and two wrapper methods with respect to their performance in the prediction of a binary outcome in several situations that may affect the prediction results. As classifiers, we used support vector machines and random forests. The methods were compared using repeated fivefold cross-validation. The accuracy, the AUC, and the Brier score served as performance metrics. RESULTS The results suggested that, first, the chosen number of selected features affects the predictive performance for many feature selection methods but not all. Second, whether the features were selected by data type or from all data types concurrently did not considerably affect the predictive performance, but for some methods, concurrent selection took more time. Third, regardless of which performance measure was considered, the feature selection methods mRMR, the permutation importance of random forests, and the Lasso tended to outperform the other considered methods. Here, mRMR and the permutation importance of random forests already delivered strong predictive performance when considering only a few selected features. Finally, the wrapper methods were computationally much more expensive than the filter and embedded methods. 
CONCLUSIONS We recommend the permutation importance of random forests and the filter method mRMR for feature selection using multi-omics data, where, however, mRMR is considerably more computationally costly.
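The distinction the abstract draws between selecting features by data type and selecting from all data types concurrently can be sketched as follows. This is a minimal illustration under stated assumptions: the mean-difference score is a stand-in filter and is not one of the eight benchmarked methods, and the block names, function names, and toy values are hypothetical.

```python
from statistics import mean


def mean_difference_scores(X, y):
    """Score each column by |mean(class 1) - mean(class 0)| (stand-in filter)."""
    scores = []
    for k in range(len(X[0])):
        pos = [row[k] for row, label in zip(X, y) if label == 1]
        neg = [row[k] for row, label in zip(X, y) if label == 0]
        scores.append(abs(mean(pos) - mean(neg)))
    return scores


def select_concurrently(scores, n_select):
    """Pick the top features across all data types at once."""
    ranked = sorted(range(len(scores)), key=lambda k: scores[k], reverse=True)
    return sorted(ranked[:n_select])


def select_by_data_type(scores, blocks, n_per_block):
    """Pick the top features separately within each data type."""
    selected = []
    for columns in blocks.values():
        ranked = sorted(columns, key=lambda k: scores[k], reverse=True)
        selected.extend(ranked[:n_per_block])
    return sorted(selected)


# Toy data: columns 0-1 mimic one omics block, columns 2-3 another.
X = [
    [1.0, 5.0, 0.2, 0.1],
    [1.2, 5.1, 0.3, 0.1],
    [0.9, 4.9, 0.1, 0.2],
    [3.0, 5.5, 0.4, 0.1],
    [3.1, 5.6, 0.2, 0.2],
    [2.9, 5.4, 0.3, 0.1],
]
y = [0, 0, 0, 1, 1, 1]
blocks = {"mrna": [0, 1], "methylation": [2, 3]}

scores = mean_difference_scores(X, y)
concurrent = select_concurrently(scores, n_select=2)
per_type = select_by_data_type(scores, blocks, n_per_block=1)
print(concurrent)  # [0, 1]: both picks come from the first block
print(per_type)    # [0, 2]: one pick is forced from each block
```

Concurrent selection can leave a data type unrepresented, whereas per-type selection guarantees coverage of every block; the benchmark reports that this choice did not considerably affect predictive performance.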
Affiliation(s)
- Yingxia Li
- Institute for Medical Information Processing, Biometry and Epidemiology, University of Munich, Marchioninistr. 15, 81377, Munich, Germany.
| | - Ulrich Mansmann
- Institute for Medical Information Processing, Biometry and Epidemiology, University of Munich, Marchioninistr. 15, 81377, Munich, Germany
| | - Shangming Du
- Institute for Medical Information Processing, Biometry and Epidemiology, University of Munich, Marchioninistr. 15, 81377, Munich, Germany
| | - Roman Hornung
- Institute for Medical Information Processing, Biometry and Epidemiology, University of Munich, Marchioninistr. 15, 81377, Munich, Germany
| |
|
89
|
Katsonis P, Wilhelm K, Williams A, Lichtarge O. Genome interpretation using in silico predictors of variant impact. Hum Genet 2022; 141:1549-1577. [PMID: 35488922 PMCID: PMC9055222 DOI: 10.1007/s00439-022-02457-6] [Citation(s) in RCA: 41] [Impact Index Per Article: 13.7] [Received: 08/02/2021] [Accepted: 04/17/2022] [Indexed: 02/06/2023]
Abstract
Estimating the effects of variants found in disease driver genes opens the door to personalized therapeutic opportunities. Clinical associations and laboratory experiments can only characterize a tiny fraction of all the available variants, leaving the majority as variants of unknown significance (VUS). In silico methods bridge this gap by providing instant estimates on a large scale, most often based on the numerous genetic differences between species. Despite concerns that these methods may lack reliability in individual subjects, their numerous practical applications over cohorts suggest they are already helpful and have a role to play in genome interpretation when used at the proper scale and context. In this review, we aim to gain insights into the training and validation of these variant effect predicting methods and illustrate representative types of experimental and clinical applications. Objective performance assessments using various datasets that are not yet published indicate the strengths and limitations of each method. These show that cautious use of in silico variant impact predictors is essential for addressing genome interpretation challenges.
Affiliation(s)
- Panagiotis Katsonis
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA.
| | - Kevin Wilhelm
- Graduate School of Biomedical Sciences, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
| | - Amanda Williams
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
| | - Olivier Lichtarge
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA.
- Department of Biochemistry, Human Genetics and Molecular Biology, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA.
- Department of Pharmacology, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA.
- Computational and Integrative Biomedical Research Center, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA.
| |
|
90
|
Stanojevic S, Li Y, Ristivojevic A, Garmire LX. Computational Methods for Single-cell Multi-omics Integration and Alignment. GENOMICS, PROTEOMICS & BIOINFORMATICS 2022; 20:836-849. [PMID: 36581065 PMCID: PMC10025765 DOI: 10.1016/j.gpb.2022.11.013] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 01/11/2022] [Revised: 08/09/2022] [Accepted: 11/04/2022] [Indexed: 12/27/2022]
Abstract
Recently developed technologies to generate single-cell genomic data have made a revolutionary impact in the field of biology. Multi-omics assays offer even greater opportunities to understand cellular states and biological processes. The problem of integrating different omics data with very different dimensionality and statistical properties remains, however, quite challenging. A growing body of computational tools is being developed for this task, leveraging ideas ranging from machine translation to the theory of networks, and represents another frontier on the interface of biology and data science. Our goal in this review is to provide a comprehensive, up-to-date survey of computational techniques for the integration of single-cell multi-omics data, while making the concepts behind each algorithm approachable to a non-expert audience.
Affiliation(s)
- Stefan Stanojevic
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Yijun Li
- Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA
| | | | - Lana X Garmire
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA.
| |
|
91
|
Colombelli F, Kowalski TW, Recamonde-Mendoza M. A hybrid ensemble feature selection design for candidate biomarkers discovery from transcriptome profiles. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.109655] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/15/2022]
|
92
|
Liang M, An B, Chang T, Deng T, Du L, Li K, Cao S, Du Y, Xu L, Zhang L, Gao X, Li J, Gao H. Incorporating kernelized multi-omics data improves the accuracy of genomic prediction. J Anim Sci Biotechnol 2022; 13:103. [PMID: 36127743 PMCID: PMC9490992 DOI: 10.1186/s40104-022-00756-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2022] [Accepted: 07/08/2022] [Indexed: 11/18/2022] Open
Abstract
Background Genomic selection (GS) has revolutionized animal and plant breeding after the first implementation via early selection before measuring phenotypes. Besides genome, transcriptome and metabolome information are increasingly considered new sources for GS. Difficulties in building the model with multi-omics data for GS and the limit of specimen availability have both delayed the progress of investigating multi-omics. Results We utilized the Cosine kernel to map genomic and transcriptomic data as $n \times n$ symmetric matrix (G matrix and T matrix), combined with the best linear unbiased prediction (BLUP) for GS. Here, we defined five kernel-based prediction models: genomic BLUP (GBLUP), transcriptome-BLUP (TBLUP), multi-omics BLUP (MBLUP, $\boldsymbol{M} = \mathrm{ratio} \times \boldsymbol{G} + (1 - \mathrm{ratio}) \times \boldsymbol{T}$), multi-omics single-step BLUP (mssBLUP), and weighted multi-omics single-step BLUP (wmssBLUP) to integrate transcribed individuals and genotyped resource population. The predictive accuracy evaluations in four traits of the Chinese Simmental beef cattle population showed that (1) MBLUP was far preferred to GBLUP (ratio = 1.0), (2) the prediction accuracy of wmssBLUP and mssBLUP had 4.18% and 3.37% average improvement over GBLUP, (3) We also found the accuracy of wmssBLUP increased with the growing proportion of transcribed cattle in the whole resource population. Conclusions We concluded that the inclusion of transcriptome data in GS had the potential to improve accuracy. Moreover, wmssBLUP is accepted to be a promising alternative for the present situation in which plenty of individuals are genotyped when fewer are transcribed. Supplementary Information The online version contains supplementary material available at 10.1186/s40104-022-00756-6.
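The kernel construction behind MBLUP can be sketched in a few lines. This is a minimal illustration, assuming the cosine kernel is plain cosine similarity between sample vectors and that the blend is the element-wise weighted sum given in the abstract; the function names and toy values are hypothetical, and the BLUP fitting step itself is not shown.

```python
import math


def cosine_kernel(rows):
    """Return the n x n cosine-similarity matrix for n sample vectors."""
    norms = [math.sqrt(sum(v * v for v in row)) for row in rows]
    if any(norm == 0.0 for norm in norms):
        raise ValueError("cosine similarity is undefined for a zero vector")
    n = len(rows)
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            dot = sum(a * b for a, b in zip(rows[i], rows[j]))
            # Fill both triangles so the matrix is symmetric by construction.
            kernel[i][j] = kernel[j][i] = dot / (norms[i] * norms[j])
    return kernel


def blend_kernels(G, T, ratio):
    """Element-wise M = ratio * G + (1 - ratio) * T."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must lie in [0, 1]")
    return [
        [ratio * g + (1.0 - ratio) * t for g, t in zip(g_row, t_row)]
        for g_row, t_row in zip(G, T)
    ]


# Toy data for three animals (illustrative values only).
genotypes = [[0, 1, 2, 1], [1, 1, 0, 2], [2, 0, 1, 1]]
expression = [[5.1, 0.3, 2.2], [4.8, 0.9, 1.7], [0.5, 3.3, 2.9]]

G = cosine_kernel(genotypes)    # genomic relationship kernel
T = cosine_kernel(expression)   # transcriptomic relationship kernel
M = blend_kernels(G, T, ratio=0.5)
```

With ratio = 1.0 the blend reduces to G, which matches the abstract's description of GBLUP as the ratio = 1.0 case.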
Affiliation(s)
- Mang Liang
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100193, People's Republic of China
| | - Bingxing An
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100193, People's Republic of China
| | - Tianpeng Chang
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100193, People's Republic of China
| | - Tianyu Deng
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100193, People's Republic of China
| | - Lili Du
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100193, People's Republic of China
| | - Keanning Li
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100193, People's Republic of China
| | - Sheng Cao
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100193, People's Republic of China
| | - Yueying Du
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100193, People's Republic of China
| | - Lingyang Xu
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100193, People's Republic of China
| | - Lupei Zhang
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100193, People's Republic of China
| | - Xue Gao
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100193, People's Republic of China
| | - Junya Li
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100193, People's Republic of China
| | - Huijiang Gao
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100193, People's Republic of China.
| |
|
93
|
Hiort P, Hugo J, Zeinert J, Müller N, Kashyap S, Rajapakse JC, Azuaje F, Renard BY, Baum K. DrDimont: explainable drug response prediction from differential analysis of multi-omics networks. Bioinformatics 2022; 38:ii113-ii119. [PMID: 36124784 PMCID: PMC9486584 DOI: 10.1093/bioinformatics/btac477] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/25/2022] Open
Abstract
MOTIVATION: While it has been well established that drugs affect and help patients differently, personalized drug response predictions remain challenging. Solutions based on single omics measurements have been proposed, and networks provide means to incorporate molecular interactions into reasoning. However, how to integrate the wealth of information contained in multiple omics layers still poses a complex problem. RESULTS: We present DrDimont, Drug response prediction from Differential analysis of multi-omics networks. It allows for comparative conclusions between two conditions and translates them into differential drug response predictions. DrDimont focuses on molecular interactions. It establishes condition-specific networks from correlation within an omics layer that are then reduced and combined into heterogeneous, multi-omics molecular networks. A novel semi-local, path-based integration step ensures integrative conclusions. Differential predictions are derived from comparing the condition-specific integrated networks. DrDimont's predictions are explainable, i.e. molecular differences that are the source of high differential drug scores can be retrieved. We predict differential drug response in breast cancer using transcriptomics, proteomics, phosphosite and metabolomics measurements and contrast estrogen receptor positive and receptor negative patients. DrDimont performs better than drug prediction based on differential protein expression or PageRank when evaluating it on ground truth data from cancer cell lines. We find proteomic and phosphosite layers to carry most information for distinguishing drug response. AVAILABILITY AND IMPLEMENTATION: DrDimont is available on CRAN: https://cran.r-project.org/package=DrDimont. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Pauline Hiort
  - Hasso Plattner Institute, Digital Engineering Faculty, University of Potsdam, Potsdam 14482, Germany
- Julian Hugo
  - Hasso Plattner Institute, Digital Engineering Faculty, University of Potsdam, Potsdam 14482, Germany
- Justus Zeinert
  - Hasso Plattner Institute, Digital Engineering Faculty, University of Potsdam, Potsdam 14482, Germany
- Nataniel Müller
  - Hasso Plattner Institute, Digital Engineering Faculty, University of Potsdam, Potsdam 14482, Germany
- Spoorthi Kashyap
  - Hasso Plattner Institute, Digital Engineering Faculty, University of Potsdam, Potsdam 14482, Germany
- Jagath C Rajapakse
  - School of Computer Science and Engineering, Nanyang Technological University, Singapore 639798, Singapore
- Bernhard Y Renard
  - Hasso Plattner Institute, Digital Engineering Faculty, University of Potsdam, Potsdam 14482, Germany
94
Chiorean A, Farncombe KM, Delong S, Andric V, Ansar S, Chan C, Clark K, Danos AM, Gao Y, Giles RH, Goldenberg A, Jani P, Krysiak K, Kujan L, Macpherson S, Maher ER, McCoy LG, Salama Y, Saliba J, Sheta L, Griffith M, Griffith OL, Erdman L, Ramani A, Kim RH. Large scale genotype- and phenotype-driven machine learning in Von Hippel-Lindau disease. Hum Mutat 2022; 43:1268-1285. [PMID: 35475554 PMCID: PMC9356987 DOI: 10.1002/humu.24392] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/29/2021] [Revised: 03/29/2022] [Accepted: 04/25/2022] [Indexed: 12/30/2022]
Abstract
Von Hippel-Lindau (VHL) disease is a hereditary cancer syndrome where individuals are predisposed to tumor development in the brain, adrenal gland, kidney, and other organs. It is caused by pathogenic variants in the VHL tumor suppressor gene. Standardized disease information has been difficult to collect due to the rarity and diversity of VHL patients. Over 4100 unique articles published until October 2019 were screened for germline genotype-phenotype data. Patient data were translated into standardized descriptions using Human Genome Variation Society gene variant nomenclature and Human Phenotype Ontology terms and has been manually curated into an open-access knowledgebase called Clinical Interpretation of Variants in Cancer. In total, 634 unique VHL variants, 2882 patients, and 1991 families from 427 papers were captured. We identified relationship trends between phenotype and genotype data using classic statistical methods and spectral clustering unsupervised learning. Our analyses reveal earlier onset of pheochromocytoma/paraganglioma and retinal angiomas, phenotype co-occurrences and genotype-phenotype correlations including hotspots. It confirms existing VHL associations and can be used to identify new patterns and associations in VHL disease. Our database serves as an aggregate knowledge translation tool to facilitate sharing information about the pathogenicity of VHL variants.
Affiliation(s)
- Andreea Chiorean
  - Department of Medicine, Division of Medical Oncology, University Health Network, Toronto, Ontario, Canada
- Kirsten M. Farncombe
  - Toronto General Hospital Research Institute, University Health Network, Toronto, Ontario, Canada
- Sean Delong
  - Department of Medicine, Division of Medical Oncology, University Health Network, Toronto, Ontario, Canada
- Veronica Andric
  - Department of Medicine, Division of Medical Oncology, University Health Network, Toronto, Ontario, Canada
- Safa Ansar
  - Department of Medicine, Division of Medical Oncology, University Health Network, Toronto, Ontario, Canada
- Clarissa Chan
  - Department of Medicine, Division of Medical Oncology, University Health Network, Toronto, Ontario, Canada
- Kaitlin Clark
  - Department of Medicine, Division of Oncology, Washington University School of Medicine, Washington University, St. Louis, Missouri, USA
  - McDonnell Genome Institute, Washington University School of Medicine, St. Louis, Missouri, USA
- Arpad M. Danos
  - Department of Medicine, Division of Oncology, Washington University School of Medicine, Washington University, St. Louis, Missouri, USA
  - McDonnell Genome Institute, Washington University School of Medicine, St. Louis, Missouri, USA
- Yizhuo Gao
  - Department of Medicine, Division of Medical Oncology, University Health Network, Toronto, Ontario, Canada
- Rachel H. Giles
  - International Kidney Cancer Coalition, Duivendrecht-Amsterdam, Duivendrecht, The Netherlands
- Anna Goldenberg
  - Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
- Payal Jani
  - Department of Medicine, Division of Medical Oncology, University Health Network, Toronto, Ontario, Canada
- Kilannin Krysiak
  - Department of Medicine, Division of Oncology, Washington University School of Medicine, Washington University, St. Louis, Missouri, USA
  - McDonnell Genome Institute, Washington University School of Medicine, St. Louis, Missouri, USA
- Lynzey Kujan
  - Department of Medicine, Division of Oncology, Washington University School of Medicine, Washington University, St. Louis, Missouri, USA
  - McDonnell Genome Institute, Washington University School of Medicine, St. Louis, Missouri, USA
- Samantha Macpherson
  - Department of Medicine, Division of Medical Oncology, University Health Network, Toronto, Ontario, Canada
- Eamonn R. Maher
  - Department of Medical Genetics, University of Cambridge, Cambridge, UK
  - NIHR Cambridge Biomedical Research Centre, Cambridge Biomedical Campus, Cambridge, UK
- Liam G. McCoy
  - Department of Medicine, Division of Medical Oncology, University Health Network, Toronto, Ontario, Canada
- Yasser Salama
  - Department of Medicine, Division of Medical Oncology, University Health Network, Toronto, Ontario, Canada
- Jason Saliba
  - Department of Medicine, Division of Oncology, Washington University School of Medicine, Washington University, St. Louis, Missouri, USA
  - McDonnell Genome Institute, Washington University School of Medicine, St. Louis, Missouri, USA
- Lana Sheta
  - Department of Medicine, Division of Oncology, Washington University School of Medicine, Washington University, St. Louis, Missouri, USA
  - McDonnell Genome Institute, Washington University School of Medicine, St. Louis, Missouri, USA
- Malachi Griffith
  - Department of Medicine, Division of Oncology, Washington University School of Medicine, Washington University, St. Louis, Missouri, USA
  - McDonnell Genome Institute, Washington University School of Medicine, St. Louis, Missouri, USA
- Obi L. Griffith
  - Department of Medicine, Division of Oncology, Washington University School of Medicine, Washington University, St. Louis, Missouri, USA
  - McDonnell Genome Institute, Washington University School of Medicine, St. Louis, Missouri, USA
- Lauren Erdman
  - Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
- Arun Ramani
  - Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
- Raymond H. Kim
  - Division of Medical Oncology and Hematology, Princess Margaret Cancer Centre, University Health Network and Sinai Health System, Toronto, Ontario, Canada
  - Division of Clinical and Metabolic Genetics, The Hospital for Sick Children, Toronto, Ontario, Canada
  - Ontario Institute for Cancer Research, Toronto, Ontario, Canada
  - Department of Medicine, University of Toronto, Toronto, Ontario, Canada
95
Zhao C, Dong J, Deng L, Tan Y, Jiang W, Cai Z. Molecular network strategy in multi-omics and mass spectrometry imaging. Curr Opin Chem Biol 2022; 70:102199. [PMID: 36027696 DOI: 10.1016/j.cbpa.2022.102199] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 04/24/2022] [Revised: 06/01/2022] [Accepted: 07/10/2022] [Indexed: 11/30/2022]
Abstract
Human physiological activities and pathological changes arise from the coordinated interactions of multiple molecules. Mass spectrometry (MS)-based multi-omics and MS imaging (MSI)-based spatial omics are powerful methods used to investigate molecular information related to the phenotype of interest from homogenated or sliced samples, including the qualitative, relative quantitative and spatial distributions. Molecular network strategy provides efficient methods to help us understand and mine the biological patterns behind the phenotypic data. It illustrates and combines various relationships between molecules, and further performs the molecule identification and biological interpretation. Here, we describe the recent advances of network-based analysis and its applications for different biological processes, such as, obesity, central nervous system diseases, and environmental toxicology.
Affiliation(s)
- Chao Zhao
  - Bionic Sensing and Intelligence Center, Institute of Biomedical and Health Engineering, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Jiyang Dong
  - Department of Electronic Science, National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, China
- Lingli Deng
  - Department of Information Engineering, East China University of Technology, China
- Yawen Tan
  - Department of Breast and Thyroid Surgery, Shenzhen Second People's Hospital, Shenzhen, China
- Wei Jiang
  - Department of Radiation Oncology, National Cancer Center/National Clinical Research Center for Cancer, Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Shenzhen, China
- Zongwei Cai
  - State Key Laboratory of Environmental and Biological Analysis, Department of Chemistry, Hong Kong Baptist University, Hong Kong SAR, China.
96
Gardner L, Kostarelos K, Mallick P, Dive C, Hadjidemetriou M. Nano-omics: nanotechnology-based multidimensional harvesting of the blood-circulating cancerome. Nat Rev Clin Oncol 2022; 19:551-561. [PMID: 35739399 DOI: 10.1038/s41571-022-00645-x] [Citation(s) in RCA: 33] [Impact Index Per Article: 11.0] [Accepted: 05/10/2022] [Indexed: 02/08/2023]
Abstract
Over the past decade, the development of 'simple' blood tests that enable cancer screening, diagnosis or monitoring and facilitate the design of personalized therapies without the need for invasive tumour biopsy sampling has been a core ambition in cancer research. Data emerging from ongoing biomarker development efforts indicate that multiple markers, used individually or as part of a multimodal panel, are required to enhance the sensitivity and specificity of assays for early stage cancer detection. The discovery of cancer-associated molecular alterations that are reflected in blood at multiple dimensions (genome, epigenome, transcriptome, proteome and metabolome) and integration of the resultant multi-omics data have the potential to uncover novel biomarkers as well as to further elucidate the underlying molecular pathways. Herein, we review key advances in multi-omics liquid biopsy approaches and introduce the 'nano-omics' paradigm: the development and utilization of nanotechnology tools for the enrichment and subsequent omics analysis of the blood-circulating cancerome.
Affiliation(s)
- Lois Gardner
  - Nanomedicine Lab, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, UK
  - Cancer Research UK Manchester Institute Cancer Biomarker Centre, The University of Manchester, Manchester, UK
- Kostas Kostarelos
  - Nanomedicine Lab, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, UK
  - Catalan Institute of Nanoscience & Nanotechnology (ICN2), UAB Campus, Barcelona, Spain
- Parag Mallick
  - Canary Center at Stanford for Cancer Early Detection, Stanford University, California, USA
- Caroline Dive
  - Cancer Research UK Manchester Institute Cancer Biomarker Centre, The University of Manchester, Manchester, UK
- Marilena Hadjidemetriou
  - Nanomedicine Lab, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, UK.
97
Moon S, Hwang J, Lee H. SDGCCA: Supervised Deep Generalized Canonical Correlation Analysis for Multi-Omics Integration. J Comput Biol 2022; 29:892-907. [PMID: 35951002 PMCID: PMC9805883 DOI: 10.1089/cmb.2021.0598] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 01/13/2023] Open
Abstract
Integration of multi-omics data provides opportunities for revealing biological mechanisms related to certain phenotypes. We propose a novel method of multi-omics integration called supervised deep generalized canonical correlation analysis (SDGCCA) for modeling correlation structures between nonlinear multi-omics manifolds that aims at improving the classification of phenotypes and revealing the biomarkers related to phenotypes. SDGCCA addresses the limitations of other canonical correlation analysis (CCA)-based models (such as deep CCA, deep generalized CCA) by considering complex/nonlinear cross-data correlations between multiple (≥2) modalities. Although there are a few methods to learn nonlinear CCA projections for classifying phenotypes, they only consider two views. Methods extended to multiple views either do not perform classification or do not provide feature ranking. In contrast, SDGCCA is a nonlinear multi-view CCA projection method that performs classification and ranks features. When we applied SDGCCA in predicting patients with Alzheimer's disease (AD) and discrimination of early- and late-stage cancers, it outperformed other CCA-based and other supervised methods. In addition, we demonstrate that SDGCCA can be applied for feature selection to identify important multi-omics biomarkers. On applying AD data, SDGCCA identified clusters of genes in multi-omics data, well known to be associated with AD.
Affiliation(s)
- Sehwan Moon
  - School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology, Gwangju, South Korea
- Jeongyoung Hwang
  - Artificial Intelligence Graduate School, Gwangju Institute of Science and Technology, Gwangju, South Korea
- Hyunju Lee
  - School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology, Gwangju, South Korea
  - Artificial Intelligence Graduate School, Gwangju Institute of Science and Technology, Gwangju, South Korea
  - Address correspondence to: Dr. Hyunju Lee, School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology, Gwangju 61005, South Korea
98
Wang X, Wen Y. A penalized linear mixed model with generalized method of moments for prediction analysis on high-dimensional multi-omics data. Brief Bioinform 2022; 23:bbac193. [PMID: 35649346 PMCID: PMC9310531 DOI: 10.1093/bib/bbac193] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2022] [Revised: 03/18/2022] [Accepted: 04/27/2022] [Indexed: 11/13/2022] Open
Abstract
With the advances in high-throughput biotechnologies, high-dimensional multi-layer omics data become increasingly available. They can provide both confirmatory and complementary information to disease risk and thus have offered unprecedented opportunities for risk prediction studies. However, the high-dimensionality and complex inter/intra-relationships among multi-omics data have brought tremendous analytical challenges. Here we present a computationally efficient penalized linear mixed model with generalized method of moments estimator (MpLMMGMM) for the prediction analysis on multi-omics data. Our method extends the widely used linear mixed model proposed for genomic risk predictions to model multi-omics data, where kernel functions are used to capture various types of predictive effects from different layers of omics data and penalty terms are introduced to reduce the impact of noise. Compared with existing penalized linear mixed models, the proposed method adopts the generalized method of moments estimator and it is much more computationally efficient. Through extensive simulation studies and the analysis of positron emission tomography imaging outcomes, we have demonstrated that MpLMMGMM can simultaneously consider a large number of variables and efficiently select those that are predictive from the corresponding omics layers. It can capture both linear and nonlinear predictive effects and achieves better prediction performance than competing methods.
Affiliation(s)
- Xiaqiong Wang
  - Department of Statistics, University of Auckland, 38 Princes Street, 1010, Auckland, New Zealand
- Yalu Wen
  - Department of Statistics, University of Auckland, 38 Princes Street, 1010, Auckland, New Zealand
99
Schmidt ST, Akhave N, Knightly RE, Reuben A, Vokes N, Zhang J, Li J, Fujimoto J, Byers LA, Sanchez-Espiridion B, Diao L, Wang J, Federico L, Forget MA, McGrail DJ, Weissferdt A, Lin SY, Lee Y, Suzuki E, Kovacs JJ, Behrens C, Wistuba II, Futreal A, Vaporciyan A, Sepesi B, Heymach JV, Bernatchez C, Haymaker C, Cascone T, Zhang J, Bristow CA, Heffernan TP, Negrao MV, Gibbons DL. Shared Nearest Neighbors Approach and Interactive Browser for Network Analysis of a Comprehensive Non-Small-Cell Lung Cancer Data Set. JCO Clin Cancer Inform 2022; 6:e2200040. [PMID: 35944232 PMCID: PMC9470146 DOI: 10.1200/cci.22.00040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/16/2022] [Revised: 05/25/2022] [Accepted: 06/30/2022] [Indexed: 11/20/2022] Open
Abstract
PURPOSE: Advances in biological measurement technologies are enabling large-scale studies of patient cohorts across multiple omics platforms. Holistic analysis of these data can generate actionable insights for translational research and necessitate new approaches for data integration and mining. METHODS: We present a novel approach for integrating data across platforms on the basis of the shared nearest neighbors algorithm and use it to create a network of multiplatform data from the immunogenomic profiling of non-small-cell lung cancer project. RESULTS: Benchmarking demonstrates that the shared nearest neighbors-based network approach outperforms a traditional gene-gene network in capturing established interactions while providing new ones on the basis of the interplay between measurements from different platforms. When used to examine patient characteristics of interest, our approach provided signatures associated with and new leads related to recurrence and TP53 oncogenotype. CONCLUSION: The network developed offers an unprecedented, holistic view into immunogenomic profiling of non-small-cell lung cancer, which can be explored through the accompanying interactive browser that we built.
Affiliation(s)
- Stephanie T. Schmidt
  - TRACTION Platform, Division of Therapeutics Discovery, The University of Texas MD Anderson Cancer Center, Houston, TX
- Neal Akhave
  - Department of Thoracic/Head and Neck Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX
- Ryan E. Knightly
  - TRACTION Platform, Division of Therapeutics Discovery, The University of Texas MD Anderson Cancer Center, Houston, TX
- Alexandre Reuben
  - Department of Thoracic/Head and Neck Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX
- Natalie Vokes
  - Department of Thoracic/Head and Neck Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX
- Jianhua Zhang
  - Department of Genomic Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX
- Jun Li
  - Department of Genomic Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX
- Junya Fujimoto
  - Department of Translational Molecular Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX
- Lauren A. Byers
  - Department of Thoracic/Head and Neck Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX
- Lixia Diao
  - Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX
- Jing Wang
  - Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX
- Lorenzo Federico
  - Department of Melanoma Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX
- Marie-Andree Forget
  - Department of Melanoma Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX
- Daniel J. McGrail
  - Department of Systems Biology, The University of Texas MD Anderson Cancer Center, Houston, TX
- Annikka Weissferdt
  - Department of Thoracic and Cardiovascular Surgery, The University of Texas MD Anderson Cancer Center, Houston, TX
- Shiaw-Yih Lin
  - Department of Systems Biology, The University of Texas MD Anderson Cancer Center, Houston, TX
- Younghee Lee
  - Department of Translational Molecular Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX
- Erika Suzuki
  - TRACTION Platform, Division of Therapeutics Discovery, The University of Texas MD Anderson Cancer Center, Houston, TX
- Jeffrey J. Kovacs
  - TRACTION Platform, Division of Therapeutics Discovery, The University of Texas MD Anderson Cancer Center, Houston, TX
- Carmen Behrens
  - Department of Thoracic/Head and Neck Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX
- Ignacio I. Wistuba
  - Department of Translational Molecular Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX
- Andrew Futreal
  - Department of Genomic Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX
- Ara Vaporciyan
  - Department of Thoracic and Cardiovascular Surgery, The University of Texas MD Anderson Cancer Center, Houston, TX
- Boris Sepesi
  - Department of Thoracic and Cardiovascular Surgery, The University of Texas MD Anderson Cancer Center, Houston, TX
- John V. Heymach
  - Department of Thoracic/Head and Neck Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX
- Chantale Bernatchez
  - Department of Translational Molecular Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX
- Cara Haymaker
  - Department of Translational Molecular Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX
- Tina Cascone
  - Department of Thoracic/Head and Neck Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX
- Jianjun Zhang
  - Department of Thoracic/Head and Neck Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX
- Christopher A. Bristow
  - TRACTION Platform, Division of Therapeutics Discovery, The University of Texas MD Anderson Cancer Center, Houston, TX
- Timothy P. Heffernan
  - TRACTION Platform, Division of Therapeutics Discovery, The University of Texas MD Anderson Cancer Center, Houston, TX
- Marcelo V. Negrao
  - Department of Thoracic/Head and Neck Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX
- Don L. Gibbons
  - Department of Thoracic/Head and Neck Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX
  - Department of Molecular and Cellular Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX
100
Li X, Xiang J, Wu FX, Li M. A Dual Ranking Algorithm Based on the Multiplex Network for Heterogeneous Complex Disease Analysis. IEEE/ACM Trans Comput Biol Bioinform 2022; 19:1993-2002. [PMID: 33577455 DOI: 10.1109/tcbb.2021.3059046] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
Identifying biomarkers of heterogeneous complex diseases has always been one of the focuses in medical research. In previous studies, the powerful network propagation methods have been applied to finding marker genes related to specific diseases, but existing methods are mostly based on a single network, which may be greatly affected by the incompleteness of the network and the ignorance of a large amount of information about physical and functional interactions between biological components. Other methods that directly integrate multiple types of interactions into an aggregate network have the risks that different types of data may conflict with each other and the characteristics and topologies of each individual network are lost. Meanwhile, biomarkers used in clinical trials should have the characteristics of small quantity and strong discriminate ability. In this study, we developed a multiplex network-based dual ranking framework (DualRank) for heterogeneous complex disease analysis. We applied the proposed method to heterogeneous complex diseases for diagnosis, prognosis, and classification. The results showed that DualRank outperformed competing methods and could identify biomarkers with the small quantity, great prediction performance (average AUC = 0.818) and biological interpretability.