1
Morton JT, Jin DM, Mills RH, Shao Y, Rahman G, McDonald D, Zhu Q, Balaban M, Jiang Y, Cantrell K, Gonzalez A, Carmel J, Frankiensztajn LM, Martin-Brevet S, Berding K, Needham BD, Zurita MF, David M, Averina OV, Kovtun AS, Noto A, Mussap M, Wang M, Frank DN, Li E, Zhou W, Fanos V, Danilenko VN, Wall DP, Cárdenas P, Baldeón ME, Jacquemont S, Koren O, Elliott E, Xavier RJ, Mazmanian SK, Knight R, Gilbert JA, Donovan SM, Lawley TD, Carpenter B, Bonneau R, Taroncher-Oldenburg G. Multi-level analysis of the gut-brain axis shows autism spectrum disorder-associated molecular and microbial profiles. Nat Neurosci 2023:10.1038/s41593-023-01361-0. [PMID: 37365313 DOI: 10.1038/s41593-023-01361-0] [Citation(s) in RCA: 25] [Impact Index Per Article: 25.0] [Received: 11/11/2022] [Accepted: 05/13/2023] [Indexed: 06/28/2023]
Abstract
Autism spectrum disorder (ASD) is a neurodevelopmental disorder characterized by heterogeneous cognitive, behavioral and communication impairments. Disruption of the gut-brain axis (GBA) has been implicated in ASD although with limited reproducibility across studies. In this study, we developed a Bayesian differential ranking algorithm to identify ASD-associated molecular and taxa profiles across 10 cross-sectional microbiome datasets and 15 other datasets, including dietary patterns, metabolomics, cytokine profiles and human brain gene expression profiles. We found a functional architecture along the GBA that correlates with heterogeneity of ASD phenotypes, and it is characterized by ASD-associated amino acid, carbohydrate and lipid profiles predominantly encoded by microbial species in the genera Prevotella, Bifidobacterium, Desulfovibrio and Bacteroides and correlates with brain gene expression changes, restrictive dietary patterns and pro-inflammatory cytokine profiles. The functional architecture revealed in age-matched and sex-matched cohorts is not present in sibling-matched cohorts. We also show a strong association between temporal changes in microbiome composition and ASD phenotypes. In summary, we propose a framework to leverage multi-omic datasets from well-defined cohorts and investigate how the GBA influences ASD.
Affiliation(s)
- James T Morton
  - Center for Computational Biology, Flatiron Institute, Simons Foundation, New York, NY, USA
  - Biostatistics & Bioinformatics Branch, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, MD, USA
- Dong-Min Jin
  - Center for Genomics and Systems Biology, Department of Biology, New York University, New York, NY, USA
- Yan Shao
  - Host-Microbiota Interactions Laboratory, Wellcome Sanger Institute, Hinxton, UK
- Gibraan Rahman
  - Bioinformatics and Systems Biology Program, University of California, San Diego, La Jolla, CA, USA
  - Department of Pediatrics, School of Medicine, University of California, San Diego, La Jolla, CA, USA
- Daniel McDonald
  - Department of Pediatrics, School of Medicine, University of California, San Diego, La Jolla, CA, USA
- Qiyun Zhu
  - School of Life Sciences, Arizona State University, Tempe, AZ, USA
  - Biodesign Center for Fundamental and Applied Microbiomics, Arizona State University, Tempe, AZ, USA
- Metin Balaban
  - Bioinformatics and Systems Biology Program, University of California, San Diego, La Jolla, CA, USA
- Yueyu Jiang
  - Department of Electrical and Computer Engineering, University of California, San Diego, La Jolla, CA, USA
- Kalen Cantrell
  - Department of Pediatrics, School of Medicine, University of California, San Diego, La Jolla, CA, USA
  - Department of Computer Science and Engineering, Jacobs School of Engineering, University of California, San Diego, La Jolla, CA, USA
- Antonio Gonzalez
  - Department of Pediatrics, School of Medicine, University of California, San Diego, La Jolla, CA, USA
- Julie Carmel
  - Azrieli Faculty of Medicine, Bar Ilan University, Safed, Israel
- Sandra Martin-Brevet
  - Laboratory for Research in Neuroimaging, Centre for Research in Neurosciences, Department of Clinical Neurosciences, Centre Hospitalier Universitaire Vaudois, University of Lausanne, Lausanne, Switzerland
- Kirsten Berding
  - Division of Nutritional Sciences, University of Illinois, Urbana, IL, USA
- Brittany D Needham
  - Stark Neurosciences Research Institute, Indiana University School of Medicine, Indianapolis, IN, USA
  - Department of Anatomy, Cell Biology and Physiology, Indiana University School of Medicine, Indianapolis, IN, USA
- María Fernanda Zurita
  - Microbiology Institute and Health Science College, Universidad San Francisco de Quito, Quito, Ecuador
- Maude David
  - Departments of Microbiology & Pharmaceutical Sciences, Oregon State University, Corvallis, OR, USA
- Olga V Averina
  - Vavilov Institute of General Genetics Russian Academy of Sciences, Moscow, Russia
- Alexey S Kovtun
  - Vavilov Institute of General Genetics Russian Academy of Sciences, Moscow, Russia
  - Skolkovo Institute of Science and Technology, Skolkovo, Russia
- Antonio Noto
  - Department of Biomedical Sciences, School of Medicine, University of Cagliari, Cagliari, Italy
- Michele Mussap
  - Laboratory Medicine, Department of Surgical Sciences, School of Medicine, University of Cagliari, Cagliari, Italy
- Mingbang Wang
  - Shanghai Key Laboratory of Birth Defects, Division of Neonatology, Children's Hospital of Fudan University, National Center for Children's Health, Shanghai, China
  - Microbiome Therapy Center, South China Hospital, Health Science Center, Shenzhen University, Shenzhen, China
- Daniel N Frank
  - Department of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Ellen Li
  - Department of Medicine, Division of Gastroenterology and Hepatology, Stony Brook University, Stony Brook, NY, USA
- Wenhao Zhou
  - Shanghai Key Laboratory of Birth Defects, Division of Neonatology, Children's Hospital of Fudan University, National Center for Children's Health, Shanghai, China
- Vassilios Fanos
  - Neonatal Intensive Care Unit and Neonatal Pathology, Department of Surgical Sciences, School of Medicine, University of Cagliari, Cagliari, Italy
- Valery N Danilenko
  - Vavilov Institute of General Genetics Russian Academy of Sciences, Moscow, Russia
- Dennis P Wall
  - Pediatrics (Systems Medicine), Biomedical Data Science, and Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA, USA
- Paúl Cárdenas
  - Institute of Microbiology, COCIBA, Universidad San Francisco de Quito, Quito, Ecuador
- Manuel E Baldeón
  - Facultad de Ciencias Médicas, de la Salud y la Vida, Universidad Internacional del Ecuador, Quito, Ecuador
- Sébastien Jacquemont
  - Sainte Justine Hospital Research Center, Montréal, QC, Canada
  - Department of Pediatrics, Université de Montréal, Montréal, QC, Canada
- Omry Koren
  - Azrieli Faculty of Medicine, Bar Ilan University, Safed, Israel
- Evan Elliott
  - Azrieli Faculty of Medicine, Bar Ilan University, Safed, Israel
  - The Leslie and Susan Gonda Multidisciplinary Brain Research Center, Bar Ilan University, Ramat Gan, Israel
- Ramnik J Xavier
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Department of Molecular Biology, Massachusetts General Hospital, Boston, MA, USA
  - Center for the Study of Inflammatory Bowel Disease, Massachusetts General Hospital, Boston, MA, USA
- Sarkis K Mazmanian
  - Division of Biology & Biological Engineering, California Institute of Technology, Pasadena, CA, USA
- Rob Knight
  - Department of Pediatrics, School of Medicine, University of California, San Diego, La Jolla, CA, USA
  - Department of Computer Science and Engineering, Jacobs School of Engineering, University of California, San Diego, La Jolla, CA, USA
  - Department of Bioengineering, University of California, San Diego, La Jolla, California, USA
  - Center for Microbiome Innovation, University of California, San Diego, La Jolla, California, USA
- Jack A Gilbert
  - Department of Pediatrics, School of Medicine, University of California, San Diego, La Jolla, CA, USA
  - Center for Microbiome Innovation, University of California, San Diego, La Jolla, California, USA
  - Scripps Institution of Oceanography, University of California, San Diego, La Jolla, CA, USA
- Sharon M Donovan
  - Division of Nutritional Sciences, University of Illinois, Urbana, IL, USA
- Trevor D Lawley
  - Host-Microbiota Interactions Laboratory, Wellcome Sanger Institute, Hinxton, UK
- Bob Carpenter
  - Center for Computational Biology, Flatiron Institute, Simons Foundation, New York, NY, USA
- Richard Bonneau
  - Center for Computational Biology, Flatiron Institute, Simons Foundation, New York, NY, USA
  - Center for Genomics and Systems Biology, Department of Biology, New York University, New York, NY, USA
  - Prescient Design, a Genentech Accelerator, New York, NY, USA
- Gaspar Taroncher-Oldenburg
  - Gaspar Taroncher Consulting, Philadelphia, PA, USA
  - Simons Foundation Autism Research Initiative, Simons Foundation, New York, NY, USA
2
Iosef C, Knauer MJ, Nicholson M, Van Nynatten LR, Cepinskas G, Draghici S, Han VKM, Fraser DD. Plasma proteome of Long-COVID patients indicates HIF-mediated vasculo-proliferative disease with impact on brain and heart function. J Transl Med 2023; 21:377. [PMID: 37301958 PMCID: PMC10257382 DOI: 10.1186/s12967-023-04149-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 01/05/2023] [Accepted: 04/25/2023] [Indexed: 06/12/2023] Open
Abstract
AIMS: Long-COVID occurs after SARS-CoV-2 infection and results in diverse, prolonged symptoms. The present study aimed to unveil potential mechanisms, and to inform prognosis and treatment. METHODS: Plasma proteome from Long-COVID outpatients was analyzed in comparison to matched acutely ill COVID-19 (mild and severe) inpatients and healthy control subjects. The expression of 3072 protein biomarkers was determined with proximity extension assays and then deconvoluted with multiple bioinformatics tools into both cell types and signaling mechanisms, as well as organ specificity. RESULTS: Compared to age- and sex-matched acutely ill COVID-19 inpatients and healthy control subjects, Long-COVID outpatients showed natural killer cell redistribution with a dominant resting phenotype, as opposed to active, and neutrophils that formed extracellular traps. This potential resetting of cell phenotypes was reflected in prospective vascular events mediated by both angiopoietin-1 (ANGPT1) and vascular-endothelial growth factor-A (VEGFA). Several markers (ANGPT1, VEGFA, CCR7, CD56, citrullinated histone 3, elastase) were validated by serological methods in additional patient cohorts. Signaling of transforming growth factor-β1 with probable connections to elevated EP/p300 suggested vascular inflammation and tumor necrosis factor-α driven pathways. In addition, a vascular proliferative state associated with hypoxia inducible factor 1 pathway suggested progression from acute COVID-19 to Long-COVID. The vasculo-proliferative process predicted in Long-COVID might contribute to changes in the organ-specific proteome reflective of neurologic and cardiometabolic dysfunction. CONCLUSIONS: Taken together, our findings point to a vasculo-proliferative process in Long-COVID that is likely initiated by prior hypoxia (localized or systemic) and/or stimulatory factors (e.g., cytokines, chemokines, growth factors, angiotensin). Analyses of the plasma proteome, used as a surrogate for cellular signaling, unveiled potential organ-specific prognostic biomarkers and therapeutic targets.
Affiliation(s)
- Cristiana Iosef
  - Children's Health Research Institute, Victoria Research Laboratories, 800 Commissioners Road East, London, ON, N6C 2V5, Canada
- Michael J Knauer
  - Department of Pathology and Laboratory Medicine, London, ON, N6A 5C1, Canada
- Michael Nicholson
  - Department of Medicine, Western University, London, ON, N6A 5C1, Canada
- Gediminas Cepinskas
  - Lawson Health Research Institute, London, ON, N6C 2R5, Canada
  - Department of Medical Biophysics, Western University, London, ON, N6A 5C1, Canada
- Sorin Draghici
  - Department of Computer Science College of Engineering, Wayne State University, Ann Arbor, MI, 48202, USA
  - Advaita Bioinformatics, Ann Arbor, 48105-2552, USA
  - National Science Foundation, Alexandria, VA, 22314, USA
- Victor K M Han
  - Children's Health Research Institute, Victoria Research Laboratories, 800 Commissioners Road East, London, ON, N6C 2V5, Canada
  - Department of Pediatrics, Western University, London, ON, N6A 5C1, Canada
- Douglas D Fraser
  - Children's Health Research Institute, Victoria Research Laboratories, 800 Commissioners Road East, London, ON, N6C 2V5, Canada
  - Lawson Health Research Institute, London, ON, N6C 2R5, Canada
  - Department of Pediatrics, Western University, London, ON, N6A 5C1, Canada
  - Department of Physiology & Pharmacology, Western University, London, ON, N6A 5C1, Canada
  - Department of Clinical Neurological Sciences, Western University, London, ON, N6A 5C1, Canada
3
Agamah FE, Bayjanov JR, Niehues A, Njoku KF, Skelton M, Mazandu GK, Ederveen THA, Mulder N, Chimusa ER, 't Hoen PAC. Computational approaches for network-based integrative multi-omics analysis. Front Mol Biosci 2022; 9:967205. [PMID: 36452456 PMCID: PMC9703081 DOI: 10.3389/fmolb.2022.967205] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 06/12/2022] [Accepted: 10/20/2022] [Indexed: 08/27/2023] Open
Abstract
Advances in omics technologies allow for holistic studies into biological systems. These studies rely on integrative data analysis techniques to obtain a comprehensive view of the dynamics of cellular processes, and molecular mechanisms. Network-based integrative approaches have revolutionized multi-omics analysis by providing the framework to represent interactions between multiple different omics-layers in a graph, which may faithfully reflect the molecular wiring in a cell. Here we review network-based multi-omics/multi-modal integrative analytical approaches. We classify these approaches according to the type of omics data supported, the methods and/or algorithms implemented, their node and/or edge weighting components, and their ability to identify key nodes and subnetworks. We show how these approaches can be used to identify biomarkers, disease subtypes, crosstalk, causality, and molecular drivers of physiological and pathological mechanisms. We provide insight into the most appropriate methods and tools for research questions as showcased around the aetiology and treatment of COVID-19 that can be informed by multi-omics data integration. We conclude with an overview of challenges associated with multi-omics network-based analysis, such as reproducibility, heterogeneity, (biological) interpretability of the results, and we highlight some future directions for network-based integration.
Affiliation(s)
- Francis E. Agamah
  - Division of Human Genetics, Department of Pathology, Institute of Infectious Disease and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa
  - Computational Biology Division, Department of Integrative Biomedical Sciences, Institute of Infectious Disease and Molecular Medicine, CIDRI-Africa Wellcome Trust Centre, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa
- Jumamurat R. Bayjanov
  - Center for Molecular and Biomolecular Informatics (CMBI), Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, Netherlands
- Anna Niehues
  - Center for Molecular and Biomolecular Informatics (CMBI), Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, Netherlands
- Kelechi F. Njoku
  - Division of Human Genetics, Department of Pathology, Institute of Infectious Disease and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa
- Michelle Skelton
  - Computational Biology Division, Department of Integrative Biomedical Sciences, Institute of Infectious Disease and Molecular Medicine, CIDRI-Africa Wellcome Trust Centre, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa
- Gaston K. Mazandu
  - Division of Human Genetics, Department of Pathology, Institute of Infectious Disease and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa
  - Computational Biology Division, Department of Integrative Biomedical Sciences, Institute of Infectious Disease and Molecular Medicine, CIDRI-Africa Wellcome Trust Centre, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa
  - African Institute for Mathematical Sciences, Cape Town, South Africa
- Thomas H. A. Ederveen
  - Center for Molecular and Biomolecular Informatics (CMBI), Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, Netherlands
- Nicola Mulder
  - Computational Biology Division, Department of Integrative Biomedical Sciences, Institute of Infectious Disease and Molecular Medicine, CIDRI-Africa Wellcome Trust Centre, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa
- Emile R. Chimusa
  - Department of Applied Sciences, Faculty of Health and Life Sciences, Northumbria University, Newcastle, United Kingdom
- Peter A. C. 't Hoen
  - Center for Molecular and Biomolecular Informatics (CMBI), Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, Netherlands
4
Shi L, Cao J, Lei X, Shi Y, Wu L. Multi-omics data identified TP53 and LRP1B as key regulatory gene related to immune phenotypes via EPCAM in HCC. Cancer Med 2022; 11:2145-2158. [PMID: 35150083 PMCID: PMC9119357 DOI: 10.1002/cam4.4594] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/19/2021] [Revised: 12/18/2021] [Accepted: 12/21/2021] [Indexed: 12/16/2022] Open
Abstract
Background: Many studies have shown that the prognosis of hepatocellular carcinoma (HCC) is significantly associated with the expression of TP53 and LRP1B. However, the potential influence of the two genes on the malignant progression of HCC remains to be elucidated. Methods: Based on a correlation analysis between immune cells and the expression levels of TP53 and LRP1B, we filtered the immune cells to perform unsupervised clustering analysis. Integrated multi-omic data analysis identified genetic and epigenetic alterations. In addition, pathway analysis was used to explore the potential function of the differentially expressed mRNAs. Using the differentially expressed genes, we established an interaction network to find the hub gene. Least absolute shrinkage and selection operator (LASSO) regression analysis was used to build a prognosis model. Results: The unsupervised clustering analysis showed that cluster A1 had the highest immune cell levels and cluster B2 the lowest. Multi-omics data analysis identified significant differences in somatic mutations, copy number variations, and DNA methylation levels between cluster A1 and cluster B2. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis found that the upregulated mRNAs in cluster A1 were mainly concentrated in T cell activation, external side of plasma membrane, receptor ligand activity, and cytokine-cytokine receptor interaction. Importantly, EPCAM was identified as a critical node in the lncRNAs-miRNAs-mRNAs regulatory network correlated with the immune phenotypes. In addition, based on differentially expressed genes between cluster A1 and cluster B2, the prognostic model established by LASSO could predict the overall survival (OS) of HCC accurately. Conclusions: The results indicated that TP53 and LRP1B acted as the key genes in regulating the immune phenotypes of HCC via EPCAM.
Affiliation(s)
- Liang Shi
  - Department of Clinical Laboratory Medicine, The Eighth Affiliated Hospital, Sun Yat-sen University, Shenzhen, China
  - Translational Medicine Laboratory, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
- Jie Cao
  - Translational Medicine Laboratory, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
- Xin Lei
  - Translational Medicine Laboratory, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
- Yifen Shi
  - Department of Hematology, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
- Lili Wu
  - Department of Clinical Blood Transfusion, The Eighth Affiliated Hospital, Sun Yat-sen University, Shenzhen, China
  - Department of Clinical Laboratory, The Central Hospital of Wenzhou, Wenzhou, China
5
Cervantes-Gracia K, Chahwan R, Husi H. Integrative OMICS Data-Driven Procedure Using a Derivatized Meta-Analysis Approach. Front Genet 2022; 13:828786. [PMID: 35186042 PMCID: PMC8855827 DOI: 10.3389/fgene.2022.828786] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/03/2021] [Accepted: 01/12/2022] [Indexed: 12/24/2022] Open
Abstract
The wealth of high-throughput data has opened up new opportunities to analyze and describe biological processes at higher resolution, ultimately leading to a significant acceleration of scientific output using high-throughput data from the different omics layers and the generation of databases to store and report raw datasets. The great variability among the techniques and the heterogeneous methodologies used to produce this data have placed meta-analysis methods as one of the approaches of choice to correlate the resultant large-scale datasets from different research groups. Through multi-study meta-analyses, it is possible to generate results with greater statistical power compared to individual analyses. Gene signatures, biomarkers and pathways that provide new insights of a phenotype of interest have been identified by the analysis of large-scale datasets in several fields of science. However, despite all the efforts, a standardized regulation to report large-scale data and to identify the molecular targets and signaling networks is still lacking. Integrative analyses have also been introduced as complementation and augmentation for meta-analysis methodologies to generate novel hypotheses. Currently, there is no universal method established and the different methods available follow different purposes. Herein we describe a new unifying, scalable and straightforward methodology to meta-analyze different omics outputs, but also to integrate the significant outcomes into novel pathways describing biological processes of interest. The significance of using proper molecular identifiers is highlighted as well as the potential to further correlate molecules from different regulatory levels. To show the methodology’s potential, a set of transcriptomic datasets are meta-analyzed as an example.
Affiliation(s)
- Richard Chahwan
  - Institute of Experimental Immunology, University of Zurich, Zurich, Switzerland
- Holger Husi
  - Institute of Cardiovascular and Medical Sciences, University of Glasgow, Glasgow, United Kingdom
  - Division of Biomedical Sciences, Centre for Health Science, University of the Highlands and Islands, Inverness, United Kingdom
- *Correspondence: Richard Chahwan; Holger Husi
6
John Cremin C, Dash S, Huang X. Big Data: Historic Advances and Emerging Trends in Biomedical Research. Current Research in Biotechnology 2022. [DOI: 10.1016/j.crbiot.2022.02.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022] Open
7
Overhoff B, Falls Z, Mangione W, Samudrala R. A Deep-Learning Proteomic-Scale Approach for Drug Design. Pharmaceuticals (Basel) 2021; 14:1277. [PMID: 34959678 PMCID: PMC8709297 DOI: 10.3390/ph14121277] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/24/2021] [Revised: 11/27/2021] [Accepted: 11/29/2021] [Indexed: 12/26/2022] Open
Abstract
Computational approaches have accelerated novel therapeutic discovery in recent decades. The Computational Analysis of Novel Drug Opportunities (CANDO) platform for shotgun multitarget therapeutic discovery, repurposing, and design aims to improve their efficacy and safety by employing a holistic approach that computes interaction signatures between every drug/compound and a large library of non-redundant protein structures corresponding to the human proteome fold space. These signatures are compared and analyzed to determine if a given drug/compound is efficacious and safe for a given indication/disease. In this study, we used a deep learning-based autoencoder to first reduce the dimensionality of CANDO-computed drug-proteome interaction signatures. We then employed a reduced conditional variational autoencoder to generate novel drug-like compounds when given a target encoded "objective" signature. Using this approach, we designed compounds to recreate the interaction signatures for twenty approved and experimental drugs and showed that 16/20 designed compounds were predicted to be significantly (p-value ≤ 0.05) more behaviorally similar relative to all corresponding controls, and 20/20 were predicted to be more behaviorally similar relative to a random control. We further observed that redesigns of objectives developed via rational drug design performed significantly better than those derived from natural sources (p-value ≤ 0.05), suggesting that the model learned an abstraction of rational drug design. We also show that the designed compounds are structurally diverse and synthetically feasible when compared to their respective objective drugs despite consistently high predicted behavioral similarity. Finally, we generated new designs that enhanced thirteen drugs/compounds associated with non-small cell lung cancer and anti-aging properties using their predicted proteomic interaction signatures. This study represents a significant step forward in automating holistic therapeutic design with machine learning, enabling the rapid generation of novel, effective, and safe drug leads for any indication.
Affiliation(s)
- Ram Samudrala
  - Department of Biomedical Informatics, Jacobs School of Medicine and Biomedical Sciences, University at Buffalo, Buffalo, NY 14203, USA; (B.O.); (Z.F.); (W.M.)
8
Chen Z, Liu X, Liu F, Zhang G, Tu H, Lin W, Lin H. Identification of 4-methylation driven genes based prognostic signature in thyroid cancer: an integrative analysis based on the methylmix algorithm. Aging (Albany NY) 2021; 13:20164-20178. [PMID: 34456184 PMCID: PMC8436924 DOI: 10.18632/aging.203338] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2021] [Accepted: 07/01/2021] [Indexed: 12/09/2022]
Abstract
Thyroid cancer (TC) is known for its high rate of persistence and recurrence. We aimed to develop a prognostic signature to monitor and assess the survival of TC patients. mRNA expression and methylation data were downloaded from the TCGA database. Then, the R package methylmix was applied to construct a mixed model to identify methylation-driven genes (MDGs) according to their methylation levels. Furthermore, an MDG-based prognostic signature and predictive nomogram were constructed using univariate and multivariate Cox regression analyses. In total, 62 methylation-driven genes, mainly enriched in substrate-dependent cell migration and cellular response to mechanical stimulus, among others, were found in TC tissues. Aldolase C (AldoC), C14orf62, dishevelled 1 (DVL1), and protein tyrosine phosphatase receptor type C (PTPRC) were identified as significantly related to patients' survival and may serve as independent prognostic biomarkers for TC. Additionally, a prognostic methylation signature and a novel predictive nomogram were established based on the methylation levels of the 4 MDGs. In this study, we developed a 4-MDG-based prognostic model that might be a potential predictor of the survival rate of TC patients, and these findings might provide novel insight for accurate monitoring and prognosis assessment.
Affiliation(s)
- Zhiwei Chen
  - Department of Pathology, The Affiliated Hospital of Putian University, Putian 351100, Fujian Province, China
- Xiaoli Liu
  - Department of Pathology, The Affiliated Hospital of Putian University, Putian 351100, Fujian Province, China
- Fangfang Liu
  - Department of Pathology, The Affiliated Hospital of Putian University, Putian 351100, Fujian Province, China
- Guolie Zhang
  - Department of Thyroid Surgery, The Affiliated Hospital of Putian University, Putian 351100, Fujian Province, China
- Haijian Tu
  - Clinical Laboratory, The Affiliated Hospital of Putian University, Putian 351100, Fujian Province, China
- Wei Lin
  - Department of Gastrointestinal Surgery, The Affiliated Hospital of Putian University, Putian 351100, Fujian Province, China
- Haifeng Lin
  - Department of Gastroenterology, The Affiliated Hospital of Putian University, Putian 351100, Fujian Province, China
9
Nguyen H, Tran D, Tran B, Pehlivan B, Nguyen T. A comprehensive survey of regulatory network inference methods using single cell RNA sequencing data. Brief Bioinform 2021; 22:bbaa190. [PMID: 34020546 PMCID: PMC8138892 DOI: 10.1093/bib/bbaa190] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Received: 10/07/2019] [Revised: 06/19/2020] [Accepted: 07/24/2020] [Indexed: 12/13/2022] Open
Abstract
A gene regulatory network is a complicated set of interactions between genetic materials, which dictates how cells develop in living organisms and react to their surrounding environment. Robust comprehension of these interactions would help explain how cells function as well as predict their reactions to external factors. This knowledge can benefit both developmental biology and clinical research such as drug development or epidemiology research. Recently, the rapid advance of single-cell sequencing technologies, which pushed the limit of transcriptomic profiling to the individual cell level, has opened up an entirely new area for regulatory network research. To exploit this new abundant source of data and take advantage of data at single-cell resolution, a number of computational methods have been proposed to uncover the interactions hidden by the averaging process in standard bulk sequencing. In this article, we review 15 such network inference methods developed for single-cell data. We discuss their underlying assumptions, inference techniques, usability, and pros and cons. In an extensive analysis using simulation, we also assess the methods' performance, sensitivity to dropout and time complexity. The main objective of this survey is to assist not only life scientists in selecting suitable methods for their data and analysis purposes but also computational scientists in developing new methods by highlighting outstanding challenges in the field that remain to be addressed in future development.
Affiliation(s)
- Hung Nguyen
- Department of Computer Science and Engineering, University of Nevada, Reno, NV 89557
| | - Duc Tran
- Department of Computer Science and Engineering, University of Nevada, Reno, NV 89557
| | - Bang Tran
- Department of Computer Science and Engineering, University of Nevada, Reno, NV 89557
| | - Bahadir Pehlivan
- Department of Computer Science and Engineering, University of Nevada, Reno, NV 89557
| | - Tin Nguyen
- Department of Computer Science and Engineering, University of Nevada, Reno, NV 89557
| |
|
10
|
A New Era of Neuro-Oncology Research Pioneered by Multi-Omics Analysis and Machine Learning. Biomolecules 2021; 11:biom11040565. [PMID: 33921457 PMCID: PMC8070530 DOI: 10.3390/biom11040565] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 03/11/2021] [Revised: 04/02/2021] [Accepted: 04/07/2021] [Indexed: 02/06/2023] Open
Abstract
Although the incidence of central nervous system (CNS) cancers is not high, these cancers significantly reduce a patient’s quality of life and result in high mortality rates. A low incidence also means a low number of cases, which in turn means a low amount of information. To compensate, researchers have tried to increase the amount of information available from a single test using high-throughput technologies. This approach, referred to as single-omics analysis, has only been partially successful as one type of data may not be able to appropriately describe all the characteristics of a tumor. It is presently unclear what type of data can describe a particular clinical situation. One way to solve this problem is to use multi-omics data. When using many types of data, a selected data type or a combination of them may effectively resolve a clinical question. Hence, we conducted a comprehensive survey of papers in the field of neuro-oncology that used multi-omics data for analysis and found that most of the papers utilized machine learning techniques. This finding suggests that machine learning techniques are useful in multi-omics analysis. In this review, we discuss the current status of multi-omics analysis in the field of neuro-oncology and the importance of using machine learning techniques.
|
11
|
Planell N, Lagani V, Sebastian-Leon P, van der Kloet F, Ewing E, Karathanasis N, Urdangarin A, Arozarena I, Jagodic M, Tsamardinos I, Tarazona S, Conesa A, Tegner J, Gomez-Cabrero D. STATegra: Multi-Omics Data Integration - A Conceptual Scheme With a Bioinformatics Pipeline. Front Genet 2021; 12:620453. [PMID: 33747045 PMCID: PMC7970106 DOI: 10.3389/fgene.2021.620453] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 10/22/2020] [Accepted: 01/20/2021] [Indexed: 12/13/2022] Open
Abstract
Technologies for profiling samples using different omics platforms have been at the forefront since the human genome project. Large-scale multi-omics data hold the promise of deciphering different regulatory layers. Yet, while there is a myriad of bioinformatics tools, each multi-omics analysis appears to start from scratch with an arbitrary decision over which tools to use and how to combine them. Therefore, there is an unmet need to conceptualize how to integrate such data and to implement and validate pipelines in different cases. We have designed a conceptual framework (STATegra), intended to be as generic as possible for multi-omics analysis, combining available multi-omic analysis tools (machine learning component analysis, non-parametric data combination, and a multi-omics exploratory analysis) in a step-wise manner. While we have previously combined those integrative tools in several studies, here we provide a systematic description of the STATegra framework and its validation using two The Cancer Genome Atlas (TCGA) case studies. For both the Glioblastoma and the Skin Cutaneous Melanoma (SKCM) cases, we demonstrate an enhanced capacity of the framework (beyond that of the individual tools) to identify features and pathways compared to single-omics analysis. Such an integrative multi-omics analysis framework for identifying features and components facilitates the discovery of new biology. Finally, we provide several options for applying the STATegra framework when parametric assumptions are fulfilled and for the case when not all the samples are profiled for all omics. The STATegra framework is built using several tools, which are being integrated step-by-step as OpenSource in the STATegRa Bioconductor package.
Affiliation(s)
- Nuria Planell
- Translational Bioinformatics Unit, Navarrabiomed, Complejo Hospitalario de Navarra (CHN), Universidad Pública de Navarra (UPNA), IdiSNA, Pamplona, Spain
| | - Vincenzo Lagani
- Institute of Chemical Biology, Ilia State University, Tbilisi, Georgia
- Gnosis Data Analysis P.C., Heraklion, Greece
| | - Patricia Sebastian-Leon
- Department of Genomic and Systems Reproductive Medicine, IVI-RMA (Instituto Valenciano de Infertilidad – Reproductive Medicine Associates) IVI Foundation, Valencia, Spain
| | - Frans van der Kloet
- Swammerdam Institute for Life Sciences, University of Amsterdam, Amsterdam, Netherlands
| | - Ewoud Ewing
- Department of Clinical Neuroscience, Karolinska Institutet, Center for Molecular Medicine, Karolinska University Hospital, Stockholm, Sweden
| | - Nestoras Karathanasis
- Institute of Computer Science, Foundation for Research and Technology-Hellas, Heraklion, Greece
- Computational Medicine Center, Thomas Jefferson University, Philadelphia, PA, United States
| | - Arantxa Urdangarin
- Translational Bioinformatics Unit, Navarrabiomed, Complejo Hospitalario de Navarra (CHN), Universidad Pública de Navarra (UPNA), IdiSNA, Pamplona, Spain
| | - Imanol Arozarena
- Cancer Signalling Unit, Navarrabiomed, Complejo Hospitalario de Navarra (CHN), Universidad Pública de Navarra (UPNA), Health Research Institute of Navarre (IdiSNA), Pamplona, Spain
| | - Maja Jagodic
- Department of Clinical Neuroscience, Karolinska Institutet, Center for Molecular Medicine, Karolinska University Hospital, Stockholm, Sweden
| | - Ioannis Tsamardinos
- Gnosis Data Analysis P.C., Heraklion, Greece
- Computer Science Department, University of Crete, Heraklion, Greece
| | - Sonia Tarazona
- Department of Applied Statistics, Operations Research and Quality, Universitat Politècnica de València, València, Spain
| | - Ana Conesa
- Microbiology and Cell Science, Institute for Food and Agricultural Sciences, University of Florida, Gainesville, FL, United States
- Genetics Institute, University of Florida, Gainesville, FL, United States
| | - Jesper Tegner
- Biological and Environmental Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Unit of Computational Medicine, Department of Medicine, Center for Molecular Medicine, Karolinska Institutet, Karolinska University Hospital, Stockholm, Sweden
- Science for Life Laboratory, Solna, Sweden
| | - David Gomez-Cabrero
- Translational Bioinformatics Unit, Navarrabiomed, Complejo Hospitalario de Navarra (CHN), Universidad Pública de Navarra (UPNA), IdiSNA, Pamplona, Spain
- Biological and Environmental Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Unit of Computational Medicine, Department of Medicine, Center for Molecular Medicine, Karolinska Institutet, Karolinska University Hospital, Stockholm, Sweden
- Mucosal & Salivary Biology Division, King’s College London Dental Institute, London, United Kingdom
| |
|
12
|
Park M, Kim D, Moon K, Park T. Integrative Analysis of Multi-Omics Data Based on Blockwise Sparse Principal Components. Int J Mol Sci 2020; 21:E8202. [PMID: 33147797 PMCID: PMC7663540 DOI: 10.3390/ijms21218202] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 09/22/2020] [Revised: 10/27/2020] [Accepted: 10/31/2020] [Indexed: 01/14/2023] Open
Abstract
The recent development of high-throughput technology has allowed us to accumulate vast amounts of multi-omics data. Because even single omics data have a large number of variables, integrated analysis of multi-omics data suffers from problems such as computational instability and variable redundancy. Most multi-omics data analyses apply single supervised analysis, repeatedly, for dimensional reduction and variable selection. However, these approaches cannot avoid the problems of redundancy and collinearity of variables. In this study, we propose a novel approach using blockwise component analysis. This would solve the limitations of current methods by applying variable clustering and sparse principal component (sPC) analysis. Our approach consists of two stages. The first stage identifies homogeneous variable blocks, and then extracts sPCs, for each omics dataset. The second stage merges sPCs from each omics dataset, and then constructs a prediction model. We also propose a graphical method showing the results of sparse PCA and model fitting, simultaneously. We applied the proposed methodology to glioblastoma multiforme data from The Cancer Genome Atlas. The comparison with other existing approaches showed that our proposed methodology is more easily interpretable than other approaches, and has comparable predictive power, with a much smaller number of variables.
Affiliation(s)
- Mira Park
- Department of Preventive Medicine, Eulji University, Daejeon 34824, Korea
| | - Doyoen Kim
- Department of Statistics, Korea University, Seoul 02841, Korea
| | - Kwanyoung Moon
- Department of Statistics, Korea University, Seoul 02841, Korea
| | - Taesung Park
- Department of Statistics, Seoul National University, Seoul 08826, Korea
| |
|
13
|
Mukerjee S, Gonzalez-Reymundez A, Lunt SY, Vazquez AI. DNA Methylation and Gene Expression with Clinical Covariates Explain Variation in Aggressiveness and Survival of Pancreatic Cancer Patients. Cancer Invest 2020; 38:502-506. [PMID: 32935594 DOI: 10.1080/07357907.2020.1812079] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 02/07/2023]
Abstract
Pancreatic cancer (PC) is associated with a high mortality rate. We explored the interindividual variation of cancer outcomes attributable to DNA methylation, gene expression, and clinical factors among PC patients. We aimed to determine whether we could differentiate subjects with greater nodal involvement, higher cancer staging, and subsequent survival. We modeled every response variable as a function of a linear predictor involving the effects of clinical variables, methylation, and gene expression in a Bayesian framework. Our results highlight the overall importance of widespread alterations in methylation and gene expression patterns associated with survival, nodal metastasis, and staging.
Affiliation(s)
- Shyamali Mukerjee
- Department of Epidemiology and Biostatistics, Michigan State University, East Lansing, Michigan, USA
| | - Agustin Gonzalez-Reymundez
- Department of Epidemiology and Biostatistics, Michigan State University, East Lansing, Michigan, USA
- Institute for Quantitative Health Science and Engineering, Michigan State University, East Lansing, Michigan, USA
| | - Sophia Y Lunt
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan, USA
- Department of Chemical Engineering and Materials Science, Michigan State University, East Lansing, Michigan, USA
| | - Ana I Vazquez
- Department of Epidemiology and Biostatistics, Michigan State University, East Lansing, Michigan, USA
- Institute for Quantitative Health Science and Engineering, Michigan State University, East Lansing, Michigan, USA
| |
|
14
|
Shafi A, Nguyen T, Peyvandipour A, Draghici S. GSMA: an approach to identify robust global and test Gene Signatures using Meta-Analysis. Bioinformatics 2019; 36:487-495. [PMID: 31329248 PMCID: PMC7869776 DOI: 10.1093/bioinformatics/btz561] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 03/14/2018] [Revised: 12/10/2018] [Accepted: 07/16/2019] [Indexed: 01/31/2023] Open
Abstract
MOTIVATION: Recent advances in biomedical research have made massive amounts of transcriptomic data available in public repositories from different sources. Due to the heterogeneity present in the individual experiments, identifying reproducible biomarkers for a given disease from multiple independent studies has become a major challenge. The widely used meta-analysis approaches, such as Fisher's method, Stouffer's method, minP and maxP, have at least two major limitations: (i) they are sensitive to outliers, and (ii) they perform only one statistical test for each individual study, and hence do not fully utilize the potential sample size to gain statistical power. RESULTS: Here, we propose a gene-level meta-analysis framework that overcomes these limitations and identifies a gene signature that is reliable and reproducible across multiple independent studies of a given disease. The approach provides a comprehensive global signature that can be used to understand the underlying biological phenomena, and a smaller test signature that can be used to classify future samples of a given disease. We demonstrate the utility of the framework by constructing disease signatures for influenza and Alzheimer's disease using nine datasets including 1108 individuals. These signatures are then validated on 12 independent datasets including 912 individuals. The results indicate that the proposed approach performs better than the majority of the existing meta-analysis approaches in terms of both sensitivity and specificity. The proposed signatures could be further used in diagnosis, prognosis and identification of therapeutic targets. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
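For orientation, the four classical combination rules named in the abstract above (Fisher's method, Stouffer's method, minP and maxP) can be sketched as follows. This is a minimal illustration of the baseline methods only, not the GSMA framework itself; the function names are hypothetical, and the sketch assumes independent studies with p-values strictly between 0 and 1.

```python
import math
from statistics import NormalDist


def fisher_combine(pvalues):
    """Fisher's method: X = -2 * sum(ln p_i) follows chi-square with 2k df under the null."""
    k = len(pvalues)
    half_stat = -sum(math.log(p) for p in pvalues)  # equals X / 2
    # Closed-form chi-square survival function for even degrees of freedom (2k):
    # sf(X) = exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


def stouffer_combine(pvalues):
    """Stouffer's method: Z = sum(z_i) / sqrt(k), with z_i = Phi^{-1}(1 - p_i)."""
    std_normal = NormalDist()
    z_scores = [std_normal.inv_cdf(1.0 - p) for p in pvalues]
    z = sum(z_scores) / math.sqrt(len(z_scores))
    return 1.0 - std_normal.cdf(z)


def min_p_combine(pvalues):
    """minP (Tippett): the smallest p-value, corrected for taking a minimum of k."""
    return 1.0 - (1.0 - min(pvalues)) ** len(pvalues)


def max_p_combine(pvalues):
    """maxP (Wilkinson): the largest p-value raised to the power k."""
    return max(pvalues) ** len(pvalues)


if __name__ == "__main__":
    # Hypothetical per-study p-values for one gene, with one outlying study.
    study_pvalues = [0.01, 0.03, 0.04, 0.60]
    for rule in (fisher_combine, stouffer_combine, min_p_combine, max_p_combine):
        print(rule.__name__, rule(study_pvalues))
```

Each rule consumes a single p-value per study, which reflects limitation (ii) in the abstract: the per-study sample sizes play no further role once the p-values are computed.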
Affiliation(s)
- Adib Shafi
- Department of Computer Science, Wayne State University, Detroit, MI 48202, USA
| | - Tin Nguyen
- Department of Computer Science and Engineering, University of Nevada, Reno, NV 89557, USA
| | - Azam Peyvandipour
- Department of Computer Science, Wayne State University, Detroit, MI 48202, USA
| | | |
|
15
|
Nguyen H, Shrestha S, Tran D, Shafi A, Draghici S, Nguyen T. A Comprehensive Survey of Tools and Software for Active Subnetwork Identification. Front Genet 2019; 10:155. [PMID: 30891064 PMCID: PMC6411791 DOI: 10.3389/fgene.2019.00155] [Citation(s) in RCA: 37] [Impact Index Per Article: 7.4] [Received: 09/25/2018] [Accepted: 02/13/2019] [Indexed: 12/13/2022] Open
Abstract
A recent focus of computational biology has been to integrate the complementary information available in molecular profiles as well as in multiple network databases in order to identify connected regions that show significant changes under different conditions. This allows for capturing dynamic and condition-specific mechanisms of the underlying phenomena and disease stages. Here we review 22 such integrative approaches for active module identification published over the last decade. This article only focuses on tools that are currently available for use and are well-maintained. We compare these methods focusing on their primary features, integrative abilities, network structures, mathematical models, and implementations. We also provide real-world scenarios in which these methods have been successfully applied, as well as highlight outstanding challenges in the field that remain to be addressed. The main objective of this review is to help potential users and researchers to choose the best method that is suitable for their data and analysis purpose.
Affiliation(s)
- Hung Nguyen
- Department of Computer Science and Engineering, University of Nevada, Reno, NV, United States
| | - Sangam Shrestha
- Department of Computer Science and Engineering, University of Nevada, Reno, NV, United States
| | - Duc Tran
- Department of Computer Science and Engineering, University of Nevada, Reno, NV, United States
| | - Adib Shafi
- Department of Computer Science, Wayne State University, Detroit, MI, United States
| | - Sorin Draghici
- Department of Computer Science, Wayne State University, Detroit, MI, United States
- Department of Obstetrics and Gynecology, Wayne State University, Detroit, MI, United States
| | - Tin Nguyen
- Department of Computer Science and Engineering, University of Nevada, Reno, NV, United States
| |
|