1
|
Ogunleye A, Piyawajanusorn C, Ghislat G, Ballester PJ. Large-Scale Machine Learning Analysis Reveals DNA Methylation and Gene Expression Response Signatures for Gemcitabine-Treated Pancreatic Cancer. HEALTH DATA SCIENCE 2024; 4:0108. [PMID: 38486621 PMCID: PMC10904073 DOI: 10.34133/hds.0108] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2023] [Accepted: 12/08/2023] [Indexed: 03/17/2024]
Abstract
Background: Gemcitabine is a first-line chemotherapy for pancreatic adenocarcinoma (PAAD), but many PAAD patients do not respond to gemcitabine-containing treatments. Predicting such nonresponders would permit the prompt administration of more promising treatments while sparing those patients gemcitabine's life-threatening side effects. Unfortunately, the few existing predictors of PAAD patient response to this drug are weak, and none of them yet exploits the power of machine learning (ML). Methods: Here, we applied ML to predict the response of PAAD patients to gemcitabine from the molecular profiles of their tumors. Specifically, we collected diverse molecular profiles of PAAD patient tumors along with the corresponding clinical data (gemcitabine responses and clinical features) from the Genomic Data Commons resource. We systematically combined 8 tumor profiles with 16 classification algorithms and evaluated each of the resulting 128 ML models by multiple 10-fold cross-validations. Results: Only 7 of these 128 models were predictive, which underlines the importance of carrying out such a large-scale analysis to avoid missing the most predictive models. The most predictive models were random forest using 4 selected mRNAs [0.44 Matthews correlation coefficient (MCC), 0.785 receiver operating characteristic-area under the curve (ROC-AUC)] and XGBoost combining 12 DNA methylation probes (0.32 MCC, 0.697 ROC-AUC). By contrast, the hENT1 marker performed much worse, at random level (practically 0 MCC, 0.5 ROC-AUC). Despite not being trained to predict prognosis (overall and progression-free survival), these ML models were also able to anticipate this patient outcome. Conclusions: We release these promising ML models so that they can be evaluated prospectively on other gemcitabine-treated PAAD patients.
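As an illustration of the evaluation described in this abstract, the following is a minimal Python sketch, not the authors' code. It computes the Matthews correlation coefficient from binary labels and enumerates an 8 x 16 grid of model combinations. The function name `mcc`, the placeholder profile and algorithm names, and the toy labels are all assumptions made for this example; the abstract reports only the counts and the resulting scores.

```python
import math
from itertools import product


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (1 = responder, 0 = nonresponder)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0 when a row or column of the confusion matrix is empty.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Hypothetical placeholder names; the abstract only gives the counts (8 and 16).
profiles = [f"profile_{i}" for i in range(1, 9)]
algorithms = [f"algorithm_{j}" for j in range(1, 17)]
model_grid = list(product(profiles, algorithms))  # every profile paired with every algorithm

# Toy labels, invented for illustration only.
toy_true = [1, 1, 0, 0]
toy_pred = [1, 0, 0, 0]
print(len(model_grid), round(mcc(toy_true, toy_pred), 3))
```

MCC ranges from -1 to 1, and a value near 0 corresponds to the random-level performance the abstract reports for the hENT1 marker.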
Affiliation(s)
- Adeolu Ogunleye
- Department of Organismal Biology, Uppsala University, Uppsala, Sweden
- Ghita Ghislat
- Department of Life Sciences, Imperial College London, London, UK
|
2
|
Raslan MA, Raslan SA, Shehata EM, Mahmoud AS, Sabri NA. Advances in the Applications of Bioinformatics and Chemoinformatics. Pharmaceuticals (Basel) 2023; 16:1050. [PMID: 37513961 PMCID: PMC10384252 DOI: 10.3390/ph16071050] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 04/26/2023] [Revised: 07/19/2023] [Accepted: 07/20/2023] [Indexed: 07/30/2023]
Abstract
Chemoinformatics integrates the principles of physical chemistry with computer-based and information science methodologies, commonly referred to as "in silico techniques", to address a wide range of descriptive and prescriptive chemistry issues, including applications to biology, drug discovery, and related molecular areas. The incorporation of machine learning has been considered of high importance in the field of drug design, enabling the extraction of chemical data from enormous compound databases to develop drugs endowed with significant biological features. The present review discusses the field of chemoinformatics and proposes the use of virtual chemical libraries in virtual screening methods to increase the probability of discovering novel hit chemicals. The virtual libraries address the need to increase the quality of the compounds as well as to discover promising ones. In addition, various applications of bioinformatics in disease classification, diagnosis, and identification of multidrug-resistant organisms are discussed. The use of ensemble models and brute-force feature selection methodology has resulted in high accuracy rates for heart disease and COVID-19 diagnosis, and the role of special formulations for targeting meningitis and Alzheimer's disease is also covered. Finally, the review presents the correlation between genomic variations and disease states such as obesity and chronic progressive external ophthalmoplegia, the investigation of the antibacterial activity of pyrazole and benzimidazole-based compounds against resistant microorganisms, and applications of chemoinformatics to the prediction of drug properties and toxicity.
Affiliation(s)
- Amr S Mahmoud
- Department of Obstetrics and Gynecology, Faculty of Medicine, Ain Shams University, Cairo P.O. Box 11566, Egypt
- Nagwa A Sabri
- Department of Clinical Pharmacy, Faculty of Pharmacy, Ain Shams University, Cairo P.O. Box 11566, Egypt
|
3
|
Moreira JD, Gower AC, Xue L, Alekseyev Y, Smith KK, Choi SH, Ayalon N, Farb MG, Tenan K, LeClerc A, Levy D, Benjamin EJ, Lenburg ME, Mitchell RN, Padera RF, Fetterman JL, Gopal DM. Systematic dissection, preservation, and multiomics in whole human and bovine hearts. Cardiovasc Pathol 2023; 63:107495. [PMID: 36334690 PMCID: PMC10031913 DOI: 10.1016/j.carpath.2022.107495] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2022] [Revised: 09/22/2022] [Accepted: 10/27/2022] [Indexed: 11/07/2022]
Abstract
OBJECTIVES We sought to develop a rigorous, systematic protocol for the dissection and preservation of human hearts for biobanking that expands previous success in postmortem transcriptomics to multiomics from paired tissue. BACKGROUND Existing cardiac biobanks consist largely of biopsy tissue or explanted hearts in select diseases and are insufficient for correlating whole organ phenotype with clinical data. METHODS We demonstrate optimal conditions for multiomics interrogation (ribonucleic acid (RNA) sequencing, untargeted metabolomics) in hearts by evaluating the effect of technical variables (storage solution, temperature) and simulated postmortem interval (PMI) on RNA and metabolite stability. We used bovine (n=3) and human (n=2) hearts fixed in PAXgene or snap-frozen with liquid nitrogen. RESULTS Using a paired Wald test, only two of the genes assessed were differentially expressed between left ventricular samples from bovine hearts stored in PAXgene at 0 and 12 hours PMI (FDR q<0.05). We obtained similar findings in human left ventricular samples, suggesting stability of RNA transcripts at PMIs up to 12 hours. Different library preparation methods (mRNA poly-A capture vs. rRNA depletion) resulted in similar quality metrics with both library preparations achieving >95% of reads properly aligning to the reference genomes across all PMIs for bovine and human hearts. PMI had no effect on RNA Integrity Number or quantity of RNA recovered at the time points evaluated. Of the metabolites identified (855 total) using untargeted metabolomics of human left ventricular tissue, 503 metabolites remained stable across PMIs (0, 4, 8, 12 hours). Most metabolic pathways retained several stable metabolites. CONCLUSIONS Our data demonstrate a technically rigorous, reproducible protocol that will enhance cardiac biobanking practices and facilitate novel insights into human CVD. CONDENSED ABSTRACT Cardiovascular disease (CVD) is the leading cause of mortality worldwide. 
Current biobanking practices insufficiently capture both the diverse array of phenotypes present in CVDs and the spatial heterogeneity across cardiac tissue sites. We have developed a rigorous and systematic protocol for the dissection and preservation of human cardiac biospecimens to enhance the availability of whole organ tissue for multiple applications. When combined with longitudinal clinical phenotyping, our protocol will enable multiomics in hearts to deepen our understanding of CVDs.
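The "FDR q<0.05" threshold in this abstract refers to p-values adjusted for multiple testing. The abstract does not say which adjustment procedure was used; the sketch below assumes the widely used Benjamini-Hochberg procedure purely for illustration. The function name `benjamini_hochberg` and the toy p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted q-values, in the same order as the input."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


# Toy p-values, invented for illustration only (not from the study).
toy_p = [0.01, 0.04, 0.03, 0.005]
toy_q = benjamini_hochberg(toy_p)
significant = [i for i, q in enumerate(toy_q) if q < 0.05]
print(toy_q, significant)
```

A gene would be called differentially expressed at this threshold when its q-value, rather than its raw p-value, falls below 0.05.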
Affiliation(s)
- Jesse D Moreira
- Evans Department of Medicine and The Whitaker Cardiovascular Institute, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, USA
- Adam C Gower
- Department of Medicine, Section of Computational Biomedicine, and Clinical and Translational Science Institute, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, USA
- Liying Xue
- Evans Department of Medicine and The Whitaker Cardiovascular Institute, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, USA
- Yuriy Alekseyev
- Department of Pathology and Laboratory Medicine, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, USA
- Karan K Smith
- Evans Department of Medicine and The Whitaker Cardiovascular Institute, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, USA
- Seung H Choi
- Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Nir Ayalon
- Cardiovascular Medicine Section, Department of Medicine, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, USA
- Melissa G Farb
- Evans Department of Medicine and The Whitaker Cardiovascular Institute, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, USA
- Kenneth Tenan
- BU Microarray and Sequencing Resource, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, USA
- Ashley LeClerc
- BU Microarray and Sequencing Resource, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, USA
- Daniel Levy
- Population Sciences Branch, Division of Intramural Research, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA; Department of Medicine, Preventive Medicine & Epidemiology Section, Boston University Chobanian & Avedisian School of Medicine, Boston University and the National Heart, Lung and Blood Institute's Framingham Heart Study, Framingham, MA, USA
- Emelia J Benjamin
- Department of Medicine, Preventive Medicine & Epidemiology Section, Boston University Chobanian & Avedisian School of Medicine, Boston University and the National Heart, Lung and Blood Institute's Framingham Heart Study, Framingham, MA, USA; Section of Cardiovascular Medicine, Boston Medical Center/Boston University Chobanian & Avedisian School of Medicine and Department of Epidemiology Boston University School of Public Health, Boston, MA, USA
- Marc E Lenburg
- Department of Medicine, Section of Computational Biomedicine, and Clinical and Translational Science Institute, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, USA
- Richard N Mitchell
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Robert F Padera
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Jessica L Fetterman
- Evans Department of Medicine and The Whitaker Cardiovascular Institute, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, USA
- Deepa M Gopal
- Evans Department of Medicine and The Whitaker Cardiovascular Institute, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, USA; Cardiovascular Medicine Section, Department of Medicine, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, USA
|
4
|
Identification and Functional Analysis of Individual-Specific Subpathways in Lung Adenocarcinoma. Genes (Basel) 2022; 13:genes13071122. [PMID: 35885905 PMCID: PMC9315518 DOI: 10.3390/genes13071122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2022] [Revised: 06/14/2022] [Accepted: 06/15/2022] [Indexed: 11/26/2022]
Abstract
Small molecular networks within complex pathways are defined as subpathways. The identification of patient-specific subpathways can reveal the etiology of cancer and guide the development of personalized therapeutic strategies. The dysfunction of subpathways has been associated with the occurrence and development of cancer. Here, we propose a strategy to identify aberrant subpathways at the individual level by calculating the edge score and using the Gene Set Enrichment Analysis (GSEA) method. This provides a novel approach to subpathway analysis. We applied this method to the expression data of a lung adenocarcinoma (LUAD) dataset from The Cancer Genome Atlas (TCGA) database. We validated the effectiveness of this method in identifying LUAD-relevant subpathways and demonstrated its reliability using an independent Gene Expression Omnibus dataset (GEO). Additionally, survival analysis was applied to illustrate the clinical application value of the genes and edges in subpathways that were associated with the prognosis of patients and cancer immunity, which could be potential biomarkers. With these analyses, we show that our method could help uncover subpathways underlying lung adenocarcinoma.
|
5
|
Wu C, Zhao Y, Zhang Y, Yang Y, Su W, Yang Y, Sun L, Zhang F, Yu J, Wang Y, Guo P, Zhu B, Wu S. Gut microbiota specifically mediates the anti-hypercholesterolemic effect of berberine (BBR) and facilitates to predict BBR's cholesterol-decreasing efficacy in patients. J Adv Res 2022; 37:197-208. [PMID: 35499044 PMCID: PMC9039652 DOI: 10.1016/j.jare.2021.07.011] [Citation(s) in RCA: 45] [Impact Index Per Article: 15.0] [Received: 04/25/2021] [Revised: 07/01/2021] [Accepted: 07/28/2021] [Indexed: 12/25/2022]
Abstract
INTRODUCTION Gut microbiota has been implicated in the pharmacological activities of many natural products. The clinical application of berberine (BBR), an effective hypolipidemic agent, is greatly impeded by obvious inter-individual response variation. To date, little evidence exists on the causality between gut microbes and its therapeutic effects, or on the link between bacterial alterations and the inter-individual response variation. OBJECTIVES This study aims to confirm the causal role of the gut microbiota in BBR's anti-hyperlipidemic effect and to identify key bacteria that can predict its effectiveness. METHODS The correlation between gut microbiota and BBR's inter-individual response variation was studied in hyperlipidemic patients. The causal role of gut microbes in BBR's anti-hyperlipidemic effects was subsequently assessed by altered administration routes, co-treatment with antibiotics, fecal microbiota transplantation, and metagenomic analysis. RESULTS A three-month clinical study showed that BBR effectively decreased serum lipids but displayed an obvious response variation. The cholesterol-lowering but not the triglyceride-decreasing effect of BBR was closely related to its modulation of the gut microbiota. Interestingly, the baseline levels of Alistipes and Blautia could accurately predict its anti-hypercholesterolemic efficiency in the subsequent treatment. Causality experiments in mice further confirmed that the gut microbiome is both necessary and sufficient to mediate the lipid-lowering effect of BBR. The absence of Blautia substantially abolished BBR's cholesterol-decreasing efficacy. CONCLUSION The gut microbiota is necessary and sufficient for BBR's hyperlipidemia-ameliorating effect. The baseline composition of gut microbes can be an effective predictor of its pharmacotherapeutic efficacy, providing a novel way to achieve personalized therapy.
Key Words
- AMPK, AMP-activated protein kinase
- Alistipes
- BBR, berberine
- Berberine (BBR)
- Blautia
- Gut microbiota
- H&E, Hematoxylin and Eosin
- HFD, high-fat diet
- Hypercholesterolemia
- Hyperlipidemia
- InsR, insulin receptor
- Inter-individual response variation
- LDL-c, low-density lipoprotein cholesterol
- LDLR, low-density lipoprotein receptors
- NPS, the non-responsive subjects
- PS, the responsive subjects
- RF analysis, Random forest analysis
- ROC, receiver operating characteristic
- SCFAs, short-chain fatty acids
- TC, total cholesterol
- TG, triglycerides
Affiliation(s)
- Chongming Wu
- Pharmacology and Toxicology Research Center, Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100193, China
- Ying Zhao
- Dongzhimen Hospital, Beijing University of Chinese Medicine, Beijing 100700, China
- Yingying Zhang
- Dongzhimen Hospital, Beijing University of Chinese Medicine, Beijing 100700, China
- Yanan Yang
- Pharmacology and Toxicology Research Center, Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100193, China
- Wenquan Su
- Dongzhimen Hospital, Beijing University of Chinese Medicine, Beijing 100700, China
- Yuanyuan Yang
- Dongzhimen Hospital, Beijing University of Chinese Medicine, Beijing 100700, China
- Le Sun
- Pharmacology and Toxicology Research Center, Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100193, China
- Fang Zhang
- Pharmacology and Toxicology Research Center, Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100193, China
- Jiaqi Yu
- Pharmacology and Toxicology Research Center, Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100193, China
- Yaoxian Wang
- Dongzhimen Hospital, Beijing University of Chinese Medicine, Beijing 100700, China
- Peng Guo
- Pharmacology and Toxicology Research Center, Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100193, China
- Baoli Zhu
- CAS Key Laboratory of Pathogenic Microbiology and Immunology, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China
- Shengxian Wu
- Dongzhimen Hospital, Beijing University of Chinese Medicine, Beijing 100700, China
|
6
|
Panja S, Rahem S, Chu CJ, Mitrofanova A. Big Data to Knowledge: Application of Machine Learning to Predictive Modeling of Therapeutic Response in Cancer. Curr Genomics 2021; 22:244-266. [PMID: 35273457 PMCID: PMC8822229 DOI: 10.2174/1389202921999201224110101] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 07/02/2020] [Revised: 09/16/2020] [Accepted: 09/30/2020] [Indexed: 11/22/2022]
Abstract
Background In recent years, the availability of high-throughput technologies, the establishment of large molecular patient data repositories, and advances in computing power and storage have allowed elucidation of complex mechanisms implicated in therapeutic response in cancer patients. The breadth and depth of such data, alongside experimental noise and missing values, require a sophisticated human-machine interaction that allows effective learning from complex data and accurate forecasting of future outcomes, ideally embedded in the core of machine learning design. Objective In this review, we discuss machine learning techniques used for modeling treatment response in cancer, including Random Forests, support vector machines, neural networks, and linear and logistic regression. We give an overview of their mathematical foundations and discuss their limitations and alternative approaches in light of their application to therapeutic response modeling in cancer. Conclusion We hypothesize that the increase in the number of patient profiles and potential temporal monitoring of patient data will establish even more complex techniques, such as deep learning and causal analysis, as central players in therapeutic response modeling.
Affiliation(s)
- Antonina Mitrofanova
- Address correspondence to this author at the Department of Health Informatics, Rutgers School of Health Professions, Rutgers Biomedical and Health Sciences, Newark, NJ 07107, USA
|
7
|
Jung HD, Sung YJ, Kim HU. Omics and Computational Modeling Approaches for the Effective Treatment of Drug-Resistant Cancer Cells. Front Genet 2021; 12:742902. [PMID: 34691155 PMCID: PMC8527086 DOI: 10.3389/fgene.2021.742902] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 07/16/2021] [Accepted: 09/20/2021] [Indexed: 02/05/2023]
Abstract
Chemotherapy is a mainstream cancer treatment, but it faces the constant challenge of drug resistance, which leads to poor prognosis. To better understand and effectively treat drug-resistant cancer cells, omics approaches have been widely applied in various forms. A notable use of omics data beyond routine data mining is in computational modeling, which allows generating useful predictions, such as drug responses and prognostic biomarkers. In particular, an increasing volume of omics data has facilitated the development of machine learning models. In this mini review, we highlight recent studies on the use of multi-omics data for studying drug-resistant cancer cells. We put a particular focus on studies that use computational models to characterize drug-resistant cancer cells, and to predict biomarkers and/or drug responses. Computational models covered in this mini review include network-based models, machine learning models and genome-scale metabolic models. We also provide perspectives on future research opportunities for combating drug-resistant cancer cells.
Affiliation(s)
- Hae Deok Jung
- Department of Chemical and Biomolecular Engineering (BK21 four), Korea Advanced Institute of Science and Technology (KAIST), Daejeon, South Korea
- Yoo Jin Sung
- Department of Chemical and Biomolecular Engineering (BK21 four), Korea Advanced Institute of Science and Technology (KAIST), Daejeon, South Korea
- Hyun Uk Kim
- Department of Chemical and Biomolecular Engineering (BK21 four), Korea Advanced Institute of Science and Technology (KAIST), Daejeon, South Korea
- KAIST Institute for Artificial Intelligence, KAIST, Daejeon, South Korea
- BioProcess Engineering Research Center and BioInformatics Research Center, KAIST, Daejeon, South Korea
|
8
|
Yoon SJ, Lee CB, Chae SU, Jo SJ, Bae SK. The Comprehensive "Omics" Approach from Metabolomics to Advanced Omics for Development of Immune Checkpoint Inhibitors: Potential Strategies for Next Generation of Cancer Immunotherapy. Int J Mol Sci 2021; 22:6932. [PMID: 34203237 PMCID: PMC8268114 DOI: 10.3390/ijms22136932] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 06/09/2021] [Revised: 06/24/2021] [Accepted: 06/25/2021] [Indexed: 12/11/2022]
Abstract
In the past decade, immunotherapies have emerged as an effective way to treat cancer. Among several categories of immunotherapies, immune checkpoint inhibitors (ICIs) are the most well-known and widely used options for cancer treatment. Although several studies are ongoing, this treatment option has yet to be developed into a precise application in the clinical setting. Recently, omics as a high-throughput technique for understanding the genome, transcriptome, proteome, and metabolome has revolutionized medical research and led to integrative interpretation to advance our understanding of biological systems. Advanced omics techniques, such as multi-omics and single-cell omics, as well as typical omics approaches, have been adopted to investigate various cancer immunotherapies. In this review, we highlight metabolomic studies regarding the development of ICIs, covering the discovery of targets or mechanisms of action, the assessment of clinical outcomes, including drug response and resistance, and the proposal of biomarkers. Furthermore, we discuss the genomics, proteomics, and advanced omics studies providing insights and comprehensive or novel approaches for ICI development. The overview of ICI studies suggests potential strategies for the development of other cancer immunotherapies using omics techniques in future studies.
Affiliation(s)
- Soo Kyung Bae
- College of Pharmacy and Integrated Research Institute of Pharmaceutical Sciences, The Catholic University of Korea, 43 Jibong-ro, Wonmi-gu, Bucheon 14662, Korea; (S.J.Y.); (C.B.L.); (S.U.C.); (S.J.J.)
|
9
|
Auslander N, Gussow AB, Koonin EV. Incorporating Machine Learning into Established Bioinformatics Frameworks. Int J Mol Sci 2021; 22:2903. [PMID: 33809353 PMCID: PMC8000113 DOI: 10.3390/ijms22062903] [Citation(s) in RCA: 44] [Impact Index Per Article: 11.0] [Received: 02/15/2021] [Revised: 03/08/2021] [Accepted: 03/10/2021] [Indexed: 12/23/2022]
Abstract
The exponential growth of biomedical data in recent years has urged the application of numerous machine learning techniques to address emerging problems in biology and clinical research. By enabling the automatic feature extraction, selection, and generation of predictive models, these methods can be used to efficiently study complex biological systems. Machine learning techniques are frequently integrated with bioinformatic methods, as well as curated databases and biological networks, to enhance training and validation, identify the best interpretable features, and enable feature and model investigation. Here, we review recently developed methods that incorporate machine learning within the same framework with techniques from molecular evolution, protein structure analysis, systems biology, and disease genomics. We outline the challenges posed for machine learning, and, in particular, deep learning in biomedicine, and suggest unique opportunities for machine learning techniques integrated with established bioinformatics approaches to overcome some of these challenges.
Affiliation(s)
- Eugene V. Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
|
10
|
Oh M, Park S, Lee S, Lee D, Lim S, Jeong D, Jo K, Jung I, Kim S. DRIM: A Web-Based System for Investigating Drug Response at the Molecular Level by Condition-Specific Multi-Omics Data Integration. Front Genet 2020; 11:564792. [PMID: 33281870 PMCID: PMC7689278 DOI: 10.3389/fgene.2020.564792] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 05/22/2020] [Accepted: 08/14/2020] [Indexed: 12/11/2022]
Abstract
Pharmacogenomics is the study of how genes affect a person's response to drugs. Thus, understanding the effect of a drug at the molecular level can be helpful in both drug discovery and personalized medicine. Over the years, transcriptome data upon drug treatment have been collected, and several databases have compiled pre-treatment cancer cell multi-omics data with drug sensitivity measures (IC50, AUC) or time-series transcriptomic data after drug treatment. However, analyzing transcriptome data upon drug treatment is challenging since more than 20,000 genes interact in complex ways. In addition, due to the difficulty of both time-series analysis and multi-omics integration, current methods can hardly analyze databases with different data characteristics. One effective way is to interpret transcriptome data in terms of well-characterized biological pathways. Another way is to leverage state-of-the-art methods for multi-omics data integration. In this paper, we developed Drug Response analysis Integrating Multi-omics and time-series data (DRIM), an integrative multi-omics and time-series data analysis framework that identifies perturbed sub-pathways and regulation mechanisms upon drug treatment. The system takes as input either a drug name and cell line identification numbers or the user's own control/treated time-series gene expression data. Then, analysis of multi-omics data upon drug treatment is performed from two perspectives. For the multi-omics perspective analysis, IC50-related multi-omics potential mediator genes are determined by embedding multi-omics data into a gene-centric vector space using a tensor decomposition method and an autoencoder deep learning model. Then, perturbed pathway analysis of potential mediator genes is performed. For the time-series perspective analysis, time-varying perturbed sub-pathways upon drug treatment are constructed. Additionally, a network involving transcription factors (TFs), multi-omics potential mediator genes, and perturbed sub-pathways is constructed, and paths to perturbed pathways from TFs are determined by an influence maximization method. To demonstrate the utility of our system, we provide analysis results of sub-pathway regulatory mechanisms in breast cancer cell lines of different drug sensitivity. DRIM is available at: http://biohealth.snu.ac.kr/software/DRIM/.
Affiliation(s)
- Minsik Oh
- Department of Computer Science and Engineering, Seoul National University, Seoul, South Korea
- Sungjoon Park
- Department of Computer Science and Engineering, Seoul National University, Seoul, South Korea
- Sangseon Lee
- Bioinformatics Institute, Seoul National University, Seoul, South Korea
- Dohoon Lee
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, South Korea
- Sangsoo Lim
- Bioinformatics Institute, Seoul National University, Seoul, South Korea
- Dabin Jeong
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, South Korea
- Kyuri Jo
- Department of Computer Engineering, Chungbuk National University, Cheongju, South Korea
- Inuk Jung
- Department of Computer Science and Engineering, Kyungpook National University, Daegu, South Korea
- Sun Kim
- Bioinformatics Institute, Seoul National University, Seoul, South Korea
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, South Korea
- Department of Computer Science and Engineering, Institute of Engineering Research, Seoul National University, Seoul, South Korea
|
11
|
Maust J, Leopold J, Bugrim A. Network Entropy Reveals that Cancer Resistance to MEK Inhibitors Is Driven by the Resilience of Proliferative Signaling. STUDIES IN COMPUTATIONAL INTELLIGENCE 2020:751-761. [DOI: 10.1007/978-3-030-36683-4_60] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/03/2025]
|