1
Cao F, Yang Y, Guo C, Zhang H, Yu Q, Guo J. Advancements in artificial intelligence for atopic dermatitis: diagnosis, treatment, and patient management. Ann Med 2025; 57:2484665. [PMID: 40200717 PMCID: PMC11983576 DOI: 10.1080/07853890.2025.2484665] [Received: 08/09/2024] [Revised: 03/05/2025] [Accepted: 03/16/2025] [Indexed: 04/10/2025]
Abstract
Atopic dermatitis (AD) is a common and complex skin disease that significantly affects the quality of life of patients. The latest advances in artificial intelligence (AI) technology have introduced new methods for diagnosing, treating, and managing AD. AI has various innovative applications in the diagnosis and treatment of atopic dermatitis, with particular emphasis on its significant benefits in medical diagnosis, treatment monitoring, and patient care. AI algorithms, especially those that use deep learning techniques, demonstrate strong performance in recognizing skin images and effectively distinguishing different types of skin lesions, including common AD manifestations. In addition, artificial intelligence has also shown promise in creating personalized treatment plans, simplifying drug development processes, and managing clinical trials. Despite challenges in data privacy and model transparency, the potential of artificial intelligence in advancing AD care is enormous, bringing the future to precision medicine and improving patient outcomes. This manuscript provides a comprehensive review of the application of AI in the process of AD disease for the first time, aiming to play a key role in the advancement of AI in skin health care and further enhance the clinical diagnosis and treatment of AD.
Affiliation(s)
- Fang Cao
  - Chengdu University of Traditional Chinese Medicine, Chengdu, China
- Yujie Yang
  - Sinopharm Chongqing Southwest Aluminum Hospital, Beijing, China
- Cui Guo
  - Chengdu University of Traditional Chinese Medicine, Chengdu, China
- Hui Zhang
  - Chengdu University of Traditional Chinese Medicine, Chengdu, China
- Qianying Yu
  - Hospital of Chengdu University of Traditional Chinese Medicine, Chengdu, China
- Jing Guo
  - Hospital of Chengdu University of Traditional Chinese Medicine, Chengdu, China
2
Fuellen G, Palmer D, Fruijtier C, Avelar RA. In-silico Evaluation of Aging-Related Interventions Using Omics Data and Predictive Modeling. Ageing Res Rev 2025:102777. [PMID: 40414362 DOI: 10.1016/j.arr.2025.102777] [Received: 01/25/2025] [Revised: 03/12/2025] [Accepted: 05/16/2025] [Indexed: 05/27/2025]
Abstract
A major challenge in aging research is identifying interventions that can improve lifespan and health and minimize toxicity. Clinical studies cannot consider decades-long follow-up periods, and therefore, in-silico evaluations using omics-based surrogate biomarkers are emerging as key tools. However, many current approaches train predictive models on observational data, rather than on intervention data, which can lead to biased conclusions. Yet, the first classifiers for lifespan extension by compounds are now available, learned on intervention data. Here, we review evaluation methodologies and we prioritize training on intervention data whenever available, highlight the importance of safety and toxicity assessments, discuss the role of standardized benchmarks, and present a range of feature processing and predictive modeling approaches. We consider linear and non-linear methods, and automated machine learning workflows. We conclude by emphasizing the need for explainable and reproducible strategies, the integration of safety metrics, and the careful validation of predictors based on interventional benchmarks.
Affiliation(s)
- Georg Fuellen
  - Institute for Biostatistics and Informatics in Medicine and Ageing Research, Rostock University Medical Center, Rostock, Germany; UCD Conway Institute of Biomolecular and Biomedical Research, School of Medicine, University College Dublin, Dublin, Ireland
- Daniel Palmer
  - Institute for Biostatistics and Informatics in Medicine and Ageing Research, Rostock University Medical Center, Rostock, Germany
- Claudia Fruijtier
  - European Registered Toxicologist at Cats Consultants, Dietmannsried, Germany
- Roberto A Avelar
  - Institute for Biostatistics and Informatics in Medicine and Ageing Research, Rostock University Medical Center, Rostock, Germany
3
Gao S, Yu T, Rasheed A, Wang J, Crossa J, Hearne S, Li H. Fast-forwarding plant breeding with deep learning-based genomic prediction. J Integr Plant Biol 2025. [PMID: 40226955 DOI: 10.1111/jipb.13914] [Received: 01/14/2025] [Accepted: 03/18/2025] [Indexed: 04/15/2025]
Abstract
Deep learning-based genomic prediction (DL-based GP) has shown promising performance compared to traditional GP methods in plant breeding, particularly in handling large, complex multi-omics data sets. However, the effective development and widespread adoption of DL-based GP still face substantial challenges, including the need for large, high-quality data sets, inconsistencies in performance benchmarking, and the integration of environmental factors. Here, we summarize the key obstacles impeding the development of DL-based GP models and propose future developing directions, such as modular approaches, data augmentation, and advanced attention mechanisms.
Affiliation(s)
- Shang Gao
  - State Key Laboratory of Crop Gene Resources and Breeding, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences (CAAS), CIMMYT-China office, Beijing, 100081, China
  - Nanfan Research Institute, Chinese Academy of Agricultural Sciences (CAAS), Sanya, 572024, Hainan, China
- Tingxi Yu
  - State Key Laboratory of Crop Gene Resources and Breeding, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences (CAAS), CIMMYT-China office, Beijing, 100081, China
  - Nanfan Research Institute, Chinese Academy of Agricultural Sciences (CAAS), Sanya, 572024, Hainan, China
- Awais Rasheed
  - Department of Plant Sciences, Quaid-i-Azam University, Islamabad, 45320, Pakistan
- Jiankang Wang
  - State Key Laboratory of Crop Gene Resources and Breeding, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences (CAAS), CIMMYT-China office, Beijing, 100081, China
  - Nanfan Research Institute, Chinese Academy of Agricultural Sciences (CAAS), Sanya, 572024, Hainan, China
- Jose Crossa
  - International Maize and Wheat Improvement Center (CIMMYT), Apdo. Postal 6-641, Texcoco, D.F. 06600, Mexico
- Sarah Hearne
  - International Maize and Wheat Improvement Center (CIMMYT), Apdo. Postal 6-641, Texcoco, D.F. 06600, Mexico
- Huihui Li
  - State Key Laboratory of Crop Gene Resources and Breeding, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences (CAAS), CIMMYT-China office, Beijing, 100081, China
  - Nanfan Research Institute, Chinese Academy of Agricultural Sciences (CAAS), Sanya, 572024, Hainan, China
4
Wu Z, Sun Y, Zhao X, Liu Z, Zhou W, Niu Y. Phenotype prediction in plants is improved by integrating large-scale transcriptomic datasets. NAR Genom Bioinform 2024; 6:lqae184. [PMID: 39735343 PMCID: PMC11672113 DOI: 10.1093/nargab/lqae184] [Received: 07/30/2024] [Revised: 11/05/2024] [Accepted: 12/19/2024] [Indexed: 12/31/2024]
Abstract
Research on the dynamic expression of genes in plants is important for understanding different biological processes. We used the large amounts of transcriptomic data from various plant sample sources that are publicly available to investigate whether the expression levels of a subset of highly variable genes (HVGs) can be used to accurately identify the phenotypes of plants. Using maize (Zea mays L.) as an example, we built machine learning (ML) models to predict phenotypes using a gene expression dataset of 21 612 bulk RNA sequencing samples. We showed that the ML models achieved excellent prediction accuracy using only the HVGs to identify different phenotypes, including tissue types, developmental stages, cultivars and stress conditions. By ML models, several important functional genes were found to be associated with different phenotypes. We performed a similar analysis in rice (Oryza sativa L.) and found that the ML models could be generalized across species. However, the models trained from maize did not perform well in rice, probably because of the expression divergence of the conserved HVGs between the two species. Overall, our results provide an ML framework for phenotype prediction using gene expression profiles, which may contribute to precision management of crops in agricultural practices.
Affiliation(s)
- Zefeng Wu
  - State Key Laboratory of Aridland Crop Science, Gansu Agricultural University, No. 1 Yingmen Village, Anning District, Lanzhou 730070, Gansu Province, China
- Yali Sun
  - State Key Laboratory of Aridland Crop Science, Gansu Agricultural University, No. 1 Yingmen Village, Anning District, Lanzhou 730070, Gansu Province, China
- Xiaoqiang Zhao
  - State Key Laboratory of Aridland Crop Science, Gansu Agricultural University, No. 1 Yingmen Village, Anning District, Lanzhou 730070, Gansu Province, China
- Zigang Liu
  - State Key Laboratory of Aridland Crop Science, Gansu Agricultural University, No. 1 Yingmen Village, Anning District, Lanzhou 730070, Gansu Province, China
- Wenqi Zhou
  - Crop Research Institute, Gansu Academy of Agricultural Sciences, No. 1, New Village, Anning District, Lanzhou 730070, Gansu Province, China
- Yining Niu
  - State Key Laboratory of Aridland Crop Science, Gansu Agricultural University, No. 1 Yingmen Village, Anning District, Lanzhou 730070, Gansu Province, China
5
Kundu P, Beura S, Mondal S, Das AK, Ghosh A. Machine learning for the advancement of genome-scale metabolic modeling. Biotechnol Adv 2024; 74:108400. [PMID: 38944218 DOI: 10.1016/j.biotechadv.2024.108400] [Received: 10/25/2023] [Revised: 05/13/2024] [Accepted: 06/23/2024] [Indexed: 07/01/2024]
Abstract
Constraint-based modeling (CBM) has evolved as the core systems biology tool to map the interrelations between genotype, phenotype, and external environment. The recent advancement of high-throughput experimental approaches and multi-omics strategies has generated a plethora of new and precise information from wide-ranging biological domains. On the other hand, the continuously growing field of machine learning (ML) and its specialized branch of deep learning (DL) provide essential computational architectures for decoding complex and heterogeneous biological data. In recent years, both multi-omics and ML have assisted in the escalation of CBM. Condition-specific omics data, such as transcriptomics and proteomics, helped contextualize the model prediction while analyzing a particular phenotypic signature. At the same time, the advanced ML tools have eased the model reconstruction and analysis to increase the accuracy and prediction power. However, the development of these multi-disciplinary methodological frameworks mainly occurs independently, which limits the concatenation of biological knowledge from different domains. Hence, we have reviewed the potential of integrating multi-disciplinary tools and strategies from various fields, such as synthetic biology, CBM, omics, and ML, to explore the biochemical phenomenon beyond the conventional biological dogma. How the integrative knowledge of these intersected domains has improved bioengineering and biomedical applications has also been highlighted. We categorically explained the conventional genome-scale metabolic model (GEM) reconstruction tools and their improvement strategies through ML paradigms. Further, the crucial role of ML and DL in omics data restructuring for GEM development has also been briefly discussed. Finally, the case-study-based assessment of the state-of-the-art method for improving biomedical and metabolic engineering strategies has been elaborated. 
Therefore, this review demonstrates how integrating experimental and in silico strategies can help map the ever-expanding knowledge of biological systems driven by condition-specific cellular information. This multiview approach will elevate the application of ML-based CBM in the biomedical and bioengineering fields for the betterment of society and the environment.
Affiliation(s)
- Pritam Kundu
  - School of Energy Science and Engineering, Indian Institute of Technology Kharagpur, West Bengal 721302, India
- Satyajit Beura
  - Department of Bioscience and Biotechnology, Indian Institute of Technology, Kharagpur, West Bengal 721302, India
- Suman Mondal
  - P.K. Sinha Centre for Bioenergy and Renewables, Indian Institute of Technology Kharagpur, West Bengal 721302, India
- Amit Kumar Das
  - Department of Bioscience and Biotechnology, Indian Institute of Technology, Kharagpur, West Bengal 721302, India
- Amit Ghosh
  - School of Energy Science and Engineering, Indian Institute of Technology Kharagpur, West Bengal 721302, India; P.K. Sinha Centre for Bioenergy and Renewables, Indian Institute of Technology Kharagpur, West Bengal 721302, India
6
Khatibi SMH, Ali J. Harnessing the power of machine learning for crop improvement and sustainable production. Front Plant Sci 2024; 15:1417912. [PMID: 39188546 PMCID: PMC11346375 DOI: 10.3389/fpls.2024.1417912] [Received: 04/15/2024] [Accepted: 07/15/2024] [Indexed: 08/28/2024]
Abstract
Crop improvement and production domains encounter large amounts of expanding data with multi-layer complexity that forces researchers to use machine-learning approaches to establish predictive and informative models to understand the sophisticated mechanisms underlying these processes. All machine-learning approaches aim to fit models to target data; nevertheless, it should be noted that a wide range of specialized methods might initially appear confusing. The principal objective of this study is to offer researchers an explicit introduction to some of the essential machine-learning approaches and their applications, comprising the most modern and utilized methods that have gained widespread adoption in crop improvement or similar domains. This article explicitly explains how different machine-learning methods could be applied for given agricultural data, highlights newly emerging techniques for machine-learning users, and lays out technical strategies for agri/crop research practitioners and researchers.
Affiliation(s)
- Jauhar Ali
  - Rice Breeding Platform, International Rice Research Institute, Los Baños, Laguna, Philippines
7
Liu X, Shi J, Jiao Y, An J, Tian J, Yang Y, Zhuo L. Integrated multi-omics with machine learning to uncover the intricacies of kidney disease. Brief Bioinform 2024; 25:bbae364. [PMID: 39082652 PMCID: PMC11289682 DOI: 10.1093/bib/bbae364] [Received: 03/22/2024] [Revised: 06/20/2024] [Accepted: 07/17/2024] [Indexed: 08/03/2024]
Abstract
The development of omics technologies has driven a profound expansion in the scale of biological data and the increased complexity in internal dimensions, prompting the utilization of machine learning (ML) as a powerful toolkit for extracting knowledge and understanding underlying biological patterns. Kidney disease represents one of the major growing global health threats with intricate pathogenic mechanisms and a lack of precise molecular pathology-based therapeutic modalities. Accordingly, there is a need for advanced high-throughput approaches to capture implicit molecular features and complement current experiments and statistics. This review aims to delineate strategies for integrating multi-omics data with appropriate ML methods, highlighting key clinical translational scenarios, including predicting disease progression risks to improve medical decision-making, comprehensively understanding disease molecular mechanisms, and practical applications of image recognition in renal digital pathology. Examining the benefits and challenges of current integration efforts is expected to shed light on the complexity of kidney disease and advance clinical practice.
Affiliation(s)
- Li Zhuo
  - Corresponding author. Department of Nephrology, China-Japan Friendship Hospital, Beijing 100029, China; China-Japan Friendship Clinic Medical College, Beijing University of Chinese Medicine, 100029 Beijing, China.
8
Gross B, Dauvin A, Cabeli V, Kmetzsch V, El Khoury J, Dissez G, Ouardini K, Grouard S, Davi A, Loeb R, Esposito C, Hulot L, Ghermi R, Blum M, Darhi Y, Durand EY, Romagnoni A. Robust evaluation of deep learning-based representation methods for survival and gene essentiality prediction on bulk RNA-seq data. Sci Rep 2024; 14:17064. [PMID: 39048590 PMCID: PMC11269749 DOI: 10.1038/s41598-024-67023-8] [Received: 04/15/2024] [Accepted: 07/08/2024] [Indexed: 07/27/2024]
Abstract
Deep learning (DL) has shown potential to provide powerful representations of bulk RNA-seq data in cancer research. However, there is no consensus regarding the impact of design choices of DL approaches on the performance of the learned representation, including the model architecture, the training methodology and the various hyperparameters. To address this problem, we evaluate the performance of various design choices of DL representation learning methods using TCGA and DepMap pan-cancer datasets and assess their predictive power for survival and gene essentiality predictions. We demonstrate that baseline methods achieve comparable or superior performance compared to more complex models on survival prediction tasks. DL representation methods, however, are the most efficient to predict the gene essentiality of cell lines. We show that auto-encoders (AE) are consistently improved by techniques such as masking and multi-head training. Our results suggest that the impact of DL representations and of pretraining are highly task- and architecture-dependent, highlighting the need for adopting rigorous evaluation guidelines. These guidelines for robust evaluation are implemented in a pipeline made available to the research community.
9
Zhang W, Maeser D, Lee A, Huang Y, Gruener RF, Abdelbar IG, Jena S, Patel AG, Huang RS. Integration of Pan-Cancer Cell Line and Single-Cell Transcriptomic Profiles Enables Inference of Therapeutic Vulnerabilities in Heterogeneous Tumors. Cancer Res 2024; 84:2021-2033. [PMID: 38581448 PMCID: PMC11178452 DOI: 10.1158/0008-5472.can-23-3005] [Received: 09/29/2023] [Revised: 10/18/2023] [Accepted: 04/01/2024] [Indexed: 04/08/2024]
Abstract
Single-cell RNA sequencing (scRNA-seq) greatly advanced the understanding of intratumoral heterogeneity by identifying distinct cancer cell subpopulations. However, translating biological differences into treatment strategies is challenging due to a lack of tools to facilitate efficient drug discovery that tackles heterogeneous tumors. Developing such approaches requires accurate prediction of drug response at the single-cell level to offer therapeutic options to specific cell subpopulations. Here, we developed a transparent computational framework (nicknamed scIDUC) to predict therapeutic efficacies on an individual cell basis by integrating single-cell transcriptomic profiles with large, data-rich pan-cancer cell line screening data sets. This method achieved high accuracy in separating cells into their correct cellular drug response statuses. In three distinct prospective tests covering different diseases (rhabdomyosarcoma, pancreatic ductal adenocarcinoma, and castration-resistant prostate cancer), the predicted results using scIDUC were accurate and mirrored biological expectations. In the first two tests, the framework identified drugs for cell subpopulations that were resistant to standard-of-care (SOC) therapies due to intrinsic resistance or tumor microenvironmental effects, and the results showed high consistency with experimental findings from the original studies. In the third test using newly generated SOC therapy-resistant cell lines, scIDUC identified efficacious drugs for the resistant line, and the predictions were validated with in vitro experiments. Together, this study demonstrates the potential of scIDUC to quickly translate scRNA-seq data into drug responses for individual cells, displaying the potential as a tool to improve the treatment of heterogenous tumors. 
Significance: A versatile method that infers cell-level drug response in scRNA-seq data facilitates the development of therapeutic strategies to target heterogeneous subpopulations within a tumor and address issues such as treatment failure and resistance.
Affiliation(s)
- Weijie Zhang
  - Bioinformatics and Computational Biology, University of Minnesota, Minneapolis, MN 55455
  - Department of Experimental and Clinical Pharmacology, College of Pharmacy, University of Minnesota, Minneapolis, MN 55455
- Danielle Maeser
  - Bioinformatics and Computational Biology, University of Minnesota, Minneapolis, MN 55455
  - Department of Experimental and Clinical Pharmacology, College of Pharmacy, University of Minnesota, Minneapolis, MN 55455
- Adam Lee
  - Department of Experimental and Clinical Pharmacology, College of Pharmacy, University of Minnesota, Minneapolis, MN 55455
- Yingbo Huang
  - Department of Experimental and Clinical Pharmacology, College of Pharmacy, University of Minnesota, Minneapolis, MN 55455
- Robert F. Gruener
  - Department of Experimental and Clinical Pharmacology, College of Pharmacy, University of Minnesota, Minneapolis, MN 55455
- Israa G. Abdelbar
  - Department of Experimental and Clinical Pharmacology, College of Pharmacy, University of Minnesota, Minneapolis, MN 55455
  - Clinical Pharmacy Practice Department, The British University in Egypt, El Sherouk, 11837, Egypt
- Sampreeti Jena
  - Department of Experimental and Clinical Pharmacology, College of Pharmacy, University of Minnesota, Minneapolis, MN 55455
- Anand G. Patel
  - Department of Oncology, St. Jude Children’s Research Hospital, Memphis, TN 38105
  - Department of Developmental Neurobiology, St. Jude Children’s Research Hospital, Memphis, TN 38105
- R. Stephanie Huang
  - Bioinformatics and Computational Biology, University of Minnesota, Minneapolis, MN 55455
  - Department of Experimental and Clinical Pharmacology, College of Pharmacy, University of Minnesota, Minneapolis, MN 55455
10
Li Pomi F, Papa V, Borgia F, Vaccaro M, Pioggia G, Gangemi S. Artificial Intelligence: A Snapshot of Its Application in Chronic Inflammatory and Autoimmune Skin Diseases. Life (Basel) 2024; 14:516. [PMID: 38672786 PMCID: PMC11051135 DOI: 10.3390/life14040516] [Received: 03/29/2024] [Revised: 04/10/2024] [Accepted: 04/16/2024] [Indexed: 04/28/2024]
Abstract
Immuno-correlated dermatological pathologies refer to skin disorders that are closely associated with immune system dysfunction or abnormal immune responses. Advancements in the field of artificial intelligence (AI) have shown promise in enhancing the diagnosis, management, and assessment of immuno-correlated dermatological pathologies. This intersection of dermatology and immunology plays a pivotal role in comprehending and addressing complex skin disorders with immune system involvement. The paper explores the knowledge known so far and the evolution and achievements of AI in diagnosis; discusses segmentation and the classification of medical images; and reviews existing challenges in immunological-related skin diseases. From our review, the role of AI has emerged, especially in the analysis of images for both diagnostic and severity assessment purposes. Furthermore, the possibility of predicting patients' response to therapies is emerging, in order to create tailored therapies.
Affiliation(s)
- Federica Li Pomi
  - Department of Precision Medicine in Medical, Surgical and Critical Care (Me.Pre.C.C.), University of Palermo, 90127 Palermo, Italy
- Vincenzo Papa
  - Department of Clinical and Experimental Medicine, School and Operative Unit of Allergy and Clinical Immunology, University of Messina, 98125 Messina, Italy
- Francesco Borgia
  - Department of Clinical and Experimental Medicine, Section of Dermatology, University of Messina, 98125 Messina, Italy
- Mario Vaccaro
  - Department of Clinical and Experimental Medicine, Section of Dermatology, University of Messina, 98125 Messina, Italy
- Giovanni Pioggia
  - Institute for Biomedical Research and Innovation (IRIB), National Research Council of Italy (CNR), 98164 Messina, Italy
- Sebastiano Gangemi
  - Department of Clinical and Experimental Medicine, School and Operative Unit of Allergy and Clinical Immunology, University of Messina, 98125 Messina, Italy
11
Wacker EM, Uellendahl-Werth F, Bej S, Wolkenhauer O, Vesterhus M, Lieb W, Franke A, Karlsen TH, Folseraas T, Ellinghaus D. Whole blood RNA sequencing identifies transcriptional differences between primary sclerosing cholangitis and ulcerative colitis. JHEP Rep 2024; 6:100988. [PMID: 38304234 PMCID: PMC10832281 DOI: 10.1016/j.jhepr.2023.100988] [Received: 05/10/2023] [Revised: 11/10/2023] [Accepted: 12/06/2023] [Indexed: 02/03/2024]
Abstract
Background & Aims: Genetic and microbiome studies across patients with primary sclerosing cholangitis (PSC) and ulcerative colitis (UC) have indicated that UC in PSC is a separate disease entity to primary UC, but expression studies for PSC are lacking.
Methods: We conducted whole blood RNA sequencing experiments for 495 patients with UC, 220 patients with PSC (including 177 with UC), and 320 healthy controls from Germany and Norway. Differential expression analyses, gene ontology and coexpression analyses and random forest machine learning were performed to identify genes, ontologies and transcriptional features that discriminate diagnoses.
Results: The blood transcriptome in UC and PSC is dominated by neutrophil activation genes (e.g. S100A12). In UC, but not in PSC (neither PSC alone nor patients with an additional diagnosis of UC [PSC/UC]), ribosomal, mitochondrial, and energy metabolism genes are upregulated in conjunction with antibody transcript expression (MZB1, IGJ). In PSC, there is an increase in modules related to apoptosis and expression of genes of interferon-I-related ontologies. Random forest analysis could poorly discriminate PSC alone from PSC/UC (AUROC 0.56), but could discriminate PSC, UC, and controls with high accuracy (AUROC UC vs. controls 0.95, PSC vs. controls 0.88, UC vs. PSC 0.986). The main coexpression modules relevant for distinguishing PSC, UC, and controls are enriched in neutrophil degranulation and antibody production genes.
Conclusions: Supported by machine learning results, PSC and UC appear to be separate entities on a molecular level, while PSC/UC and PSC are indistinguishable.
Impact and implications: Clinical and genetic studies suggest that the colitis-like symptoms in primary sclerosing cholangitis (PSC) represent a different disease entity from primary ulcerative colitis (UC). The present study supports this assumption with transcriptomic data from whole blood and describes notable differences in gene expression between primary UC and PSC, providing insights into the still unclear pathophysiology of both diseases. These findings are of interest to scientists seeking to decipher the molecular pathophysiology of both diseases and provide evidence that a redefinition of the PSC-UC phenotype should be considered. The study practically supports future molecular research by providing a large transcriptomic whole blood reference cohort.
Affiliation(s)
- Eike Matthias Wacker
  - Institute of Clinical Molecular Biology, Christian-Albrechts-University of Kiel, Kiel, Germany
- Saptarshi Bej
  - Department of Systems Biology and Bioinformatics, University of Rostock, Rostock, Germany
  - Indian Institute of Science Education and Research, Thiruvananthapuram, India
- Olaf Wolkenhauer
  - Department of Systems Biology and Bioinformatics, University of Rostock, Rostock, Germany
  - Leibniz-Institute for Food Systems Biology at the Technical University Munich, Munich, Germany
  - Stellenbosch Institute for Advanced Study (STIAS), Wallenberg Research Centre at Stellenbosch University, Stellenbosch, South Africa
- Mette Vesterhus
  - Norwegian PSC Research Center, Department of Transplantation Medicine, Division of Surgery, Inflammatory Diseases and Transplantation, Oslo University Hospital Rikshospitalet and University of Oslo, Oslo, Norway
  - Department of Medicine, Haraldsplass Deaconess Hospital, Bergen, Norway
  - Department of Clinical Science, University of Bergen, Bergen, Norway
- Wolfgang Lieb
  - Institute of Epidemiology, Christian-Albrechts-University of Kiel, Kiel, Germany
- Andre Franke
  - Institute of Clinical Molecular Biology, Christian-Albrechts-University of Kiel, Kiel, Germany
- Tom Hemming Karlsen
  - Research Institute for Internal Medicine, Division of Surgery, Inflammatory Diseases and Transplantation, Oslo University Hospital Rikshospitalet and University of Oslo, Oslo, Norway
- Trine Folseraas
  - Research Institute for Internal Medicine, Division of Surgery, Inflammatory Diseases and Transplantation, Oslo University Hospital Rikshospitalet and University of Oslo, Oslo, Norway
- David Ellinghaus
  - Institute of Clinical Molecular Biology, Christian-Albrechts-University of Kiel, Kiel, Germany
12
Brouard C, Mourad R, Vialaneix N. Should we really use graph neural networks for transcriptomic prediction? Brief Bioinform 2024; 25:bbae027. [PMID: 38349060 PMCID: PMC10939369 DOI: 10.1093/bib/bbae027] [Received: 09/27/2023] [Revised: 12/20/2023] [Accepted: 01/17/2024] [Indexed: 02/15/2024]
Abstract
The recent development of deep learning methods has undoubtedly led to great improvement in various machine learning tasks, especially in prediction tasks. These methods have also been adapted to answer various problems in bioinformatics, including automatic genome annotation, artificial genome generation or phenotype prediction. In particular, a specific type of deep learning method, called graph neural network (GNN), has repeatedly been reported as a good candidate to predict phenotypes from gene expression because of its ability to embed information on gene regulation or co-expression through the use of a gene network. However, to date, no complete and reproducible benchmark has been performed to analyze the trade-off between cost and benefit of this approach compared to more standard (and simpler) machine learning methods. In this article, we provide such a benchmark, based on clear and comparable policies to evaluate the different methods on several datasets. Our conclusion is that GNN rarely provides a real improvement in prediction performance, especially when compared to the computation effort required by the methods. Our findings on a limited but controlled simulated dataset show that this could be explained by the limited quality or predictive power of the input biological gene network itself.
Affiliation(s)
- Céline Brouard
- Université Fédérale de Toulouse, INRAE, MIAT, 31326 Castanet-Tolosan, France
| | - Raphaël Mourad
- Université Fédérale de Toulouse, INRAE, MIAT, 31326 Castanet-Tolosan, France
- Université Paul Sabatier, 31062 Toulouse, France
| | - Nathalie Vialaneix
- Université Fédérale de Toulouse, INRAE, MIAT, 31326 Castanet-Tolosan, France
| |
|
13
|
Georgouli K, Yeom JS, Blake RC, Navid A. Multi-scale models of whole cells: progress and challenges. Front Cell Dev Biol 2023; 11:1260507. [PMID: 38020904 PMCID: PMC10661945 DOI: 10.3389/fcell.2023.1260507] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2023] [Accepted: 10/19/2023] [Indexed: 12/01/2023] Open
Abstract
Whole-cell modeling is "the ultimate goal" of computational systems biology and "a grand challenge for 21st century" (Tomita, Trends in Biotechnology, 2001, 19(6), 205-10). These complex, highly detailed models account for the activity of every molecule in a cell and serve as comprehensive knowledgebases for the modeled system. Their scope and utility far surpass those of other systems models. In fact, whole-cell models (WCMs) are an amalgam of several types of "system" models. The models are simulated using a hybrid modeling method where the appropriate mathematical methods for each biological process are used to simulate their behavior. Given the complexity of the models, the process of developing and curating these models is labor-intensive and to date only a handful of these models have been developed. While whole-cell models provide valuable and novel biological insights, and to date have identified some novel biological phenomena, their most important contribution has been to highlight the discrepancy between available data and observations that are used for the parametrization and validation of complex biological models. Another realization has been that current whole-cell modeling simulators are slow and to run models that mimic more complex (e.g., multi-cellular) biosystems, those need to be executed in an accelerated fashion on high-performance computing platforms. In this manuscript, we review the progress of whole-cell modeling to date and discuss some of the ways that they can be improved.
Affiliation(s)
- Konstantia Georgouli
- Biosciences and Biotechnology Division, Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, CA, United States
| | - Jae-Seung Yeom
- Center for Applied Scientific Computing, Computing Directorate, Lawrence Livermore National Laboratory, Livermore, CA, United States
| | - Robert C. Blake
- Center for Applied Scientific Computing, Computing Directorate, Lawrence Livermore National Laboratory, Livermore, CA, United States
| | - Ali Navid
- Biosciences and Biotechnology Division, Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, CA, United States
| |
|
14
|
Zhang W, Maeser D, Lee A, Huang Y, Gruener RF, Abdelbar IG, Jena S, Patel AG, Huang RS. Inferring therapeutic vulnerability within tumors through integration of pan-cancer cell line and single-cell transcriptomic profiles. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.10.29.564598. [PMID: 37961545 PMCID: PMC10634928 DOI: 10.1101/2023.10.29.564598] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2023]
Abstract
Single-cell RNA sequencing greatly advanced our understanding of intratumoral heterogeneity through identifying tumor subpopulations with distinct biologies. However, translating biological differences into treatment strategies is challenging, as we still lack tools to facilitate efficient drug discovery that tackles heterogeneous tumors. One key component of such approaches tackles accurate prediction of drug response at the single-cell level to offer therapeutic options to specific cell subpopulations. Here, we present a transparent computational framework (nicknamed scIDUC) to predict therapeutic efficacies on an individual-cell basis by integrating single-cell transcriptomic profiles with large, data-rich pan-cancer cell line screening datasets. Our method achieves high accuracy, with predicted sensitivities easily able to separate cells into their true cellular drug resistance status as measured by effect size (Cohen's d > 1.0). More importantly, we examine our method's utility with three distinct prospective tests covering different diseases (rhabdomyosarcoma, pancreatic ductal adenocarcinoma, and castration-resistant prostate cancer), and in each our predicted results are accurate and mirrored biological expectations. In the first two, we identified drugs for cell subpopulations that are resistant to standard-of-care (SOC) therapies due to intrinsic resistance or effects of tumor microenvironments. Our results showed high consistency with experimental findings from the original studies. In the third test, we generated SOC therapy resistant cell lines, used scIDUC to identify efficacious drugs for the resistant line, and validated the predictions with in-vitro experiments. Together, scIDUC quickly translates scRNA-seq data into drug response for individual cells, displaying the potential as a first-line tool for nuanced and heterogeneity-aware drug discovery.
Affiliation(s)
- Weijie Zhang
- Bioinformatics and Computational Biology, University of Minnesota, Minneapolis, MN 55455
- Department of Experimental and Clinical Pharmacology, University of Minnesota, Minneapolis, MN 55455
| | - Danielle Maeser
- Bioinformatics and Computational Biology, University of Minnesota, Minneapolis, MN 55455
- Department of Experimental and Clinical Pharmacology, University of Minnesota, Minneapolis, MN 55455
| | - Adam Lee
- Department of Experimental and Clinical Pharmacology, University of Minnesota, Minneapolis, MN 55455
| | - Yingbo Huang
- Department of Experimental and Clinical Pharmacology, University of Minnesota, Minneapolis, MN 55455
| | - Robert F Gruener
- Department of Experimental and Clinical Pharmacology, University of Minnesota, Minneapolis, MN 55455
| | - Israa G Abdelbar
- Department of Experimental and Clinical Pharmacology, University of Minnesota, Minneapolis, MN 55455
- Clinical Pharmacy Practice Department, The British University in Egypt, El Sherouk, 11837, Egypt
| | - Sampreeti Jena
- Department of Experimental and Clinical Pharmacology, University of Minnesota, Minneapolis, MN 55455
| | - Anand G Patel
- Department of Oncology, St. Jude Children's Research Hospital, Memphis, TN 38105
- Department of Developmental Neurobiology, St. Jude Children's Research Hospital, Memphis, TN 38105
| | - R Stephanie Huang
- Bioinformatics and Computational Biology, University of Minnesota, Minneapolis, MN 55455
- Department of Experimental and Clinical Pharmacology, University of Minnesota, Minneapolis, MN 55455
| |
|
15
|
Zhang W, Lee AM, Jena S, Huang Y, Ho Y, Tietz KT, Miller CR, Su MC, Mentzer J, Ling AL, Li Y, Dehm SM, Huang RS. Computational drug discovery for castration-resistant prostate cancers through in vitro drug response modeling. Proc Natl Acad Sci U S A 2023; 120:e2218522120. [PMID: 37068243 PMCID: PMC10151558 DOI: 10.1073/pnas.2218522120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2022] [Accepted: 03/17/2023] [Indexed: 04/19/2023] Open
Abstract
Prostate cancer (PC) is the most frequently diagnosed malignancy and a leading cause of cancer deaths in US men. Many PC cases metastasize and develop resistance to systemic hormonal therapy, a stage known as castration-resistant prostate cancer (CRPC). Therefore, there is an urgent need to develop effective therapeutic strategies for CRPC. Traditional drug discovery pipelines require significant time and capital input, which highlights a need for novel methods to evaluate the repositioning potential of existing drugs. Here, we present a computational framework to predict drug sensitivities of clinical CRPC tumors to various existing compounds and identify treatment options with high potential for clinical impact. We applied this method to a CRPC patient cohort and nominated drugs to combat resistance to hormonal therapies including abiraterone and enzalutamide. The utility of this method was demonstrated by nomination of multiple drugs that are currently undergoing clinical trials for CRPC. Additionally, this method identified the tetracycline derivative COL-3, for which we validated higher efficacy in an isogenic cell line model of enzalutamide-resistant vs. enzalutamide-sensitive CRPC. In enzalutamide-resistant CRPC cells, COL-3 displayed higher activity for inhibiting cell growth and migration, and for inducing G1-phase cell cycle arrest and apoptosis. Collectively, these findings demonstrate the utility of a computational framework for independent validation of drugs being tested in CRPC clinical trials, and for nominating drugs with enhanced biological activity in models of enzalutamide-resistant CRPC. The efficiency of this method relative to traditional drug development approaches indicates a high potential for accelerating drug development for CRPC.
Affiliation(s)
- Weijie Zhang
- Bioinformatics and Computational Biology, University of Minnesota, Minneapolis, MN 55455
- The Department of Experimental and Clinical Pharmacology, College of Pharmacy, University of Minnesota, Minneapolis, MN 55455
| | - Adam M. Lee
- The Department of Experimental and Clinical Pharmacology, College of Pharmacy, University of Minnesota, Minneapolis, MN 55455
| | - Sampreeti Jena
- The Department of Experimental and Clinical Pharmacology, College of Pharmacy, University of Minnesota, Minneapolis, MN 55455
| | - Yingbo Huang
- The Department of Experimental and Clinical Pharmacology, College of Pharmacy, University of Minnesota, Minneapolis, MN 55455
| | - Yeung Ho
- Department of Laboratory Medicine and Pathology, The University of Minnesota Medical School, Minneapolis, MN 55455
| | - Kiel T. Tietz
- Department of Laboratory Medicine and Pathology, The University of Minnesota Medical School, Minneapolis, MN 55455
| | - Conor R. Miller
- Department of Laboratory Medicine and Pathology, The University of Minnesota Medical School, Minneapolis, MN 55455
| | - Mei-Chi Su
- The Department of Experimental and Clinical Pharmacology, College of Pharmacy, University of Minnesota, Minneapolis, MN 55455
| | - Joshua Mentzer
- The Department of Experimental and Clinical Pharmacology, College of Pharmacy, University of Minnesota, Minneapolis, MN 55455
| | - Alexander L. Ling
- The Department of Experimental and Clinical Pharmacology, College of Pharmacy, University of Minnesota, Minneapolis, MN 55455
| | - Yingming Li
- Department of Laboratory Medicine and Pathology, The University of Minnesota Medical School, Minneapolis, MN 55455
| | - Scott M. Dehm
- Department of Laboratory Medicine and Pathology, The University of Minnesota Medical School, Minneapolis, MN 55455
| | - R. Stephanie Huang
- Bioinformatics and Computational Biology, University of Minnesota, Minneapolis, MN 55455
- The Department of Experimental and Clinical Pharmacology, College of Pharmacy, University of Minnesota, Minneapolis, MN 55455
| |
|
16
|
Tsakiroglou M, Evans A, Pirmohamed M. Leveraging transcriptomics for precision diagnosis: Lessons learned from cancer and sepsis. Front Genet 2023; 14:1100352. [PMID: 36968610 PMCID: PMC10036914 DOI: 10.3389/fgene.2023.1100352] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 11/16/2022] [Accepted: 02/20/2023] [Indexed: 03/12/2023] Open
Abstract
Diagnostics require precision and predictive ability to be clinically useful. Integration of multi-omic with clinical data is crucial to our understanding of disease pathogenesis and diagnosis. However, interpretation of overwhelming amounts of information at the individual level requires sophisticated computational tools for extraction of clinically meaningful outputs. Moreover, evolution of technical and analytical methods often outpaces standardisation strategies. RNA is the most dynamic component of all -omics technologies carrying an abundance of regulatory information that is least harnessed for use in clinical diagnostics. Gene expression-based tests capture genetic and non-genetic heterogeneity and have been implemented in certain diseases. For example patients with early breast cancer are spared toxic unnecessary treatments with scores based on the expression of a set of genes (e.g., Oncotype DX). The ability of transcriptomics to portray the transcriptional status at a moment in time has also been used in diagnosis of dynamic diseases such as sepsis. Gene expression profiles identify endotypes in sepsis patients with prognostic value and a potential to discriminate between viral and bacterial infection. The application of transcriptomics for patient stratification in clinical environments and clinical trials thus holds promise. In this review, we discuss the current clinical application in the fields of cancer and infection. We use these paradigms to highlight the impediments in identifying useful diagnostic and prognostic biomarkers and propose approaches to overcome them and aid efforts towards clinical implementation.
Affiliation(s)
- Maria Tsakiroglou
- Department of Pharmacology and Therapeutics, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool, United Kingdom
- *Correspondence: Maria Tsakiroglou
| | - Anthony Evans
- Computational Biology Facility, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool, United Kingdom
| | - Munir Pirmohamed
- Department of Pharmacology and Therapeutics, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool, United Kingdom
| |
|
17
|
Heil BJ, Crawford J, Greene CS. The effect of non-linear signal in classification problems using gene expression. PLoS Comput Biol 2023; 19:e1010984. [PMID: 36972227 PMCID: PMC10079219 DOI: 10.1371/journal.pcbi.1010984] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/07/2022] [Revised: 04/06/2023] [Accepted: 02/28/2023] [Indexed: 03/29/2023] Open
Abstract
Those building predictive models from transcriptomic data are faced with two conflicting perspectives. The first, based on the inherent high dimensionality of biological systems, supposes that complex non-linear models such as neural networks will better match complex biological systems. The second, imagining that complex systems will still be well predicted by simple dividing lines, prefers linear models that are easier to interpret. We compare multi-layer neural networks and logistic regression across multiple prediction tasks on GTEx and Recount3 datasets and find evidence in favor of both possibilities. We verified the presence of non-linear signal when predicting tissue and metadata sex labels from expression data by removing the predictive linear signal with Limma, and showed the removal ablated the performance of linear methods but not non-linear ones. However, we also found that the presence of non-linear signal was not necessarily sufficient for neural networks to outperform logistic regression. Our results demonstrate that while multi-layer neural networks may be useful for making predictions from gene expression data, including a linear baseline model is critical because while biological systems are high-dimensional, effective dividing lines for predictive models may not be.
Affiliation(s)
- Benjamin J. Heil
- Genomics and Computational Biology Graduate Group, Perelman School of Medicine, University of Pennsylvania, Pennsylvania, United States of America
| | - Jake Crawford
- Genomics and Computational Biology Graduate Group, Perelman School of Medicine, University of Pennsylvania, Pennsylvania, United States of America
| | - Casey S. Greene
- Department of Pharmacology, University of Colorado School of Medicine, Colorado, United States of America
- Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Colorado, United States of America
| |
|
18
|
Price BA, Marron JS, Mose LE, Perou CM, Parker JS. Translating transcriptomic findings from cancer model systems to humans through joint dimension reduction. Commun Biol 2023; 6:179. [PMID: 36797360 PMCID: PMC9935626 DOI: 10.1038/s42003-023-04529-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/12/2022] [Accepted: 01/25/2023] [Indexed: 02/18/2023] Open
Abstract
Model systems are an essential resource in cancer research. They simulate effects that we can infer into humans, but come at a risk of inaccurately representing human biology. This inaccuracy can lead to inconclusive experiments or misleading results, urging the need for an improved process for translating model system findings into human-relevant data. We present a process for applying joint dimension reduction (jDR) to horizontally integrate gene expression data across model systems and human tumor cohorts. We then use this approach to combine human TCGA gene expression data with data from human cancer cell lines and mouse model tumors. By identifying the aspects of genomic variation joint-acting across cohorts, we demonstrate how predictive modeling and clinical biomarkers from model systems can be improved.
Affiliation(s)
- Brandon A Price
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - J S Marron
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Lisle E Mose
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Charles M Perou
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Joel S Parker
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA.
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA.
| |
|
19
|
Li T, Kou G, Peng Y. A New Representation Learning Approach for Credit Data Analysis. Inf Sci (N Y) 2023. [DOI: 10.1016/j.ins.2023.01.068] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/15/2023]
|
20
|
Buckler AJ, Marlevi D, Skenteris NT, Lengquist M, Kronqvist M, Matic L, Hedin U. In silico model of atherosclerosis with individual patient calibration to enable precision medicine for cardiovascular disease. Comput Biol Med 2023; 152:106364. [PMID: 36525832 DOI: 10.1016/j.compbiomed.2022.106364] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/28/2022] [Revised: 11/01/2022] [Accepted: 11/25/2022] [Indexed: 12/03/2022]
Abstract
OBJECTIVE Guidance for preventing myocardial infarction and ischemic stroke by tailoring treatment for individual patients with atherosclerosis is an unmet need. Such development may be possible with computational modeling. Given the multifactorial biology of atherosclerosis, modeling must be based on complete biological networks that capture protein-protein interactions estimated to drive disease progression. Here, we aimed to develop a clinically relevant scale model of atherosclerosis, calibrate it with individual patient data, and use it to simulate optimized pharmacotherapy for individual patients. APPROACH AND RESULTS The study used a uniquely constituted plaque proteomic dataset to create a comprehensive systems biology disease model for simulating individualized responses to pharmacotherapy. Plaque tissue was collected from 18 patients with 6735 proteins at two locations per patient. 113 pathways were identified and included in the systems biology model of endothelial cells, vascular smooth muscle cells, macrophages, lymphocytes, and the integrated intima, altogether spanning 4411 proteins, demonstrating a range of 39-96% plaque instability. After calibrating the systems biology models for individual patients, we simulated intensive lipid-lowering, anti-inflammatory, and anti-diabetic drugs. We also simulated a combination therapy. Drug response was evaluated as the degree of change in plaque stability, where an improvement was defined as a reduction of plaque instability. In patients with initially unstable lesions, simulated responses varied from high (20%, on combination therapy) to marginal improvement, whereas patients with initially stable plaques showed generally less improvement. 
CONCLUSION In this pilot study, proteomics-based system biology modeling was shown to simulate drug response based on atherosclerotic plaque instability with a power of 90%, providing a potential strategy for improved personalized management of patients with cardiovascular disease.
Affiliation(s)
- Andrew J Buckler
- Department of Molecular Medicine and Surgery, Karolinska Institutet, Stockholm, Sweden; Elucid Bioimaging Inc., Boston, MA, USA
| | - David Marlevi
- Department of Molecular Medicine and Surgery, Karolinska Institutet, Stockholm, Sweden
| | - Nikolaos T Skenteris
- Department of Molecular Medicine and Surgery, Karolinska Institutet, Stockholm, Sweden
| | - Mariette Lengquist
- Department of Molecular Medicine and Surgery, Karolinska Institutet, Stockholm, Sweden
| | - Malin Kronqvist
- Department of Molecular Medicine and Surgery, Karolinska Institutet, Stockholm, Sweden
| | - Ljubica Matic
- Department of Molecular Medicine and Surgery, Karolinska Institutet, Stockholm, Sweden
| | - Ulf Hedin
- Department of Molecular Medicine and Surgery, Karolinska Institutet, Stockholm, Sweden.
| |
|
21
|
Abrar S, Samad MD. Perturbation of deep autoencoder weights for model compression and classification of tabular data. Neural Netw 2022; 156:160-169. [PMID: 36270199 PMCID: PMC9669225 DOI: 10.1016/j.neunet.2022.09.020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2022] [Revised: 07/18/2022] [Accepted: 09/19/2022] [Indexed: 11/16/2022]
Abstract
Fully connected deep neural networks (DNN) often include redundant weights leading to overfitting and high memory requirements. Additionally, in tabular data classification, DNNs are challenged by the often superior performance of traditional machine learning models. This paper proposes periodic perturbations (prune and regrow) of DNN weights, especially at the self-supervised pre-training stage of deep autoencoders. The proposed weight perturbation strategy outperforms dropout learning or weight regularization (L1 or L2) for four out of six tabular data sets in downstream classification tasks. Unlike dropout learning, the proposed weight perturbation routine additionally achieves 15% to 40% sparsity across six tabular data sets, resulting in compressed pretrained models. The proposed pretrained model compression improves the accuracy of downstream classification, unlike traditional weight pruning methods that trade off performance for model compression. Our experiments reveal that a pretrained deep autoencoder with weight perturbation can outperform traditional machine learning in tabular data classification, whereas baseline fully-connected DNNs yield the worst classification accuracy. However, traditional machine learning models are superior to any deep model when a tabular data set contains uncorrelated variables. Therefore, the performance of deep models with tabular data is contingent on the types and statistics of constituent variables.
Affiliation(s)
- Sakib Abrar
- Department of Computer Science, Tennessee State University, Nashville, TN 37209, United States
| | - Manar D Samad
- Department of Computer Science, Tennessee State University, Nashville, TN 37209, United States.
| |
|
22
|
Qin R, Mahal LK, Bojar D. Deep learning explains the biology of branched glycans from single-cell sequencing data. iScience 2022; 25:105163. [PMID: 36217547 PMCID: PMC9547197 DOI: 10.1016/j.isci.2022.105163] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/03/2022] [Revised: 09/06/2022] [Accepted: 09/16/2022] [Indexed: 11/03/2022] Open
Abstract
Glycosylation is ubiquitous and often dysregulated in disease. However, the regulation and functional significance of various types of glycosylation at cellular levels is hard to unravel experimentally. Multi-omics, single-cell measurements such as SUGAR-seq, which quantifies transcriptomes and cell surface glycans, facilitate addressing this issue. Using SUGAR-seq data, we pioneered a deep learning model to predict the glycan phenotypes of cells (mouse T lymphocytes) from transcripts, with the example of predicting β1,6GlcNAc-branching across T cell subtypes (test set F1 score: 0.9351). Model interpretation via SHAP (SHapley Additive exPlanations) identified highly predictive genes, in part known to impact (i) branched glycan levels and (ii) the biology of branched glycans. These genes included physiologically relevant low-abundance genes that were not captured by conventional differential expression analysis. Our work shows that interpretable deep learning models are promising for uncovering novel functions and regulatory mechanisms of glycans from integrated transcriptomic and glycomic datasets.
Affiliation(s)
- Rui Qin
- Department of Chemistry, University of Alberta, Edmonton, AB T6G 2G2, Canada
| | - Lara K. Mahal
- Department of Chemistry, University of Alberta, Edmonton, AB T6G 2G2, Canada
| | - Daniel Bojar
- Department of Chemistry and Molecular Biology, University of Gothenburg, 405 30 Gothenburg, Sweden
- Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, 405 30 Gothenburg, Sweden
| |
|
23
|
Desaire H, Go EP, Hua D. Advances, obstacles, and opportunities for machine learning in proteomics. CELL REPORTS. PHYSICAL SCIENCE 2022; 3:101069. [PMID: 36381226 PMCID: PMC9648337 DOI: 10.1016/j.xcrp.2022.101069] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 06/16/2023]
Abstract
The fields of proteomics and machine learning are both large disciplines, each producing well over 5,000 publications per year. However, studies combining both fields are still relatively rare, with only about 2% of recent proteomics papers including machine learning. This review, which focuses on the intersection of the fields, is intended to inspire proteomics researchers to develop skills and knowledge in the application of machine learning. A brief tutorial introduction to machine learning is provided, and research advances that rely on both fields, particularly as they relate to proteomics tools development and biomarker discovery, are highlighted. Key knowledge gaps and opportunities for scientific advancement are also enumerated.
Affiliation(s)
- Heather Desaire
- Department of Chemistry, University of Kansas, Lawrence, KS 66045, USA
| | - Eden P. Go
- Department of Chemistry, University of Kansas, Lawrence, KS 66045, USA
| | - David Hua
- Department of Chemistry, University of Kansas, Lawrence, KS 66045, USA
| |
|
24
|
Keshavarz-Rahaghi F, Pleasance E, Kolisnik T, Jones SJM. A p53 transcriptional signature in primary and metastatic cancers derived using machine learning. Front Genet 2022; 13:987238. [PMID: 36134028 PMCID: PMC9483853 DOI: 10.3389/fgene.2022.987238] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/05/2022] [Accepted: 08/01/2022] [Indexed: 11/13/2022] Open
Abstract
The tumor suppressor gene, TP53, has the highest rate of mutation among all genes in human cancer. This transcription factor plays an essential role in the regulation of many cellular processes. Mutations in TP53 result in loss of wild-type p53 function in a dominant negative manner. Although TP53 is a well-studied gene, the transcriptome modifications caused by the mutations in this gene have not yet been explored in a pan-cancer study using both primary and metastatic samples. In this work, we used a random forest model to stratify tumor samples based on TP53 mutational status and detected a p53 transcriptional signature. We hypothesize that the existence of this transcriptional signature is due to the loss of wild-type p53 function and is universal across primary and metastatic tumors as well as different tumor types. Additionally, we showed that the algorithm successfully detected this signature in samples with apparent silent mutations that affect correct mRNA splicing. Furthermore, we observed that most of the highly ranked genes contributing to the classification extracted from the random forest have known associations with p53 within the literature. We suggest that other genes found in this list including GPSM2, OR4N2, CTSL2, SPERT, and RPE65 protein coding genes have yet undiscovered linkages to p53 function. Our analysis of time on different therapies also revealed that this signature is more effective than the recorded TP53 status in detecting patients who can benefit from platinum therapies and taxanes. Our findings delineate a p53 transcriptional signature, expand the knowledge of p53 biology and further identify genes important in p53 related pathways.
Affiliation(s)
- Faeze Keshavarz-Rahaghi
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- Department of Bioinformatics, University of British Columbia, Vancouver, BC, Canada
| | - Erin Pleasance
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
| | - Tyler Kolisnik
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- School of Natural and Computational Sciences, Massey University, Auckland, New Zealand
| | - Steven J. M. Jones
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- Department of Medical Genetics, University of British Columbia, Vancouver, BC, Canada
- Department of Molecular Biology and Biochemistry, Simon Fraser University, Vancouver, BC, Canada
- *Correspondence: Steven J. M. Jones,
|
25
|
Hanczar B, Bourgeais V, Zehraoui F. Assessment of deep learning and transfer learning for cancer prediction based on gene expression data. BMC Bioinformatics 2022; 23:262. [PMID: 35786378 PMCID: PMC9250744 DOI: 10.1186/s12859-022-04807-7] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 02/11/2022] [Accepted: 06/15/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Machine learning is now a standard tool for cancer prediction based on gene expression data. However, deep learning is still new for this task, and there is no clear consensus about its performance and utility. Few experimental works have evaluated deep neural networks and compared them with state-of-the-art machine learning. Moreover, their conclusions are not consistent. RESULTS: We extensively evaluate the deep learning approach on 22 cancer prediction tasks based on gene expression data. We measure the impact of the main hyper-parameters and compare the performance of neural networks with the state of the art. We also investigate the effectiveness of several transfer learning schemes in different experimental setups. CONCLUSION: Based on our experiments, we provide several recommendations to optimize the construction and training of a neural network model. We show that neural networks outperform the state-of-the-art methods only for very large training set sizes. For a small training set, we show that transfer learning is possible and may strongly improve the model performance in some cases.
Affiliation(s)
- Blaise Hanczar
- IBISC, Université Paris-Saclay (Univ. Evry), 23 boulevard de France, 91034, Evry, France.
| | - Victoria Bourgeais
- IBISC, Université Paris-Saclay (Univ. Evry), 23 boulevard de France, 91034, Evry, France
| | - Farida Zehraoui
- IBISC, Université Paris-Saclay (Univ. Evry), 23 boulevard de France, 91034, Evry, France
|
26
|
Lee BD, Gitter A, Greene CS, Raschka S, Maguire F, Titus AJ, Kessler MD, Lee AJ, Chevrette MG, Stewart PA, Britto-Borges T, Cofer EM, Yu KH, Carmona JJ, Fertig EJ, Kalinin AA, Signal B, Lengerich BJ, Triche TJ, Boca SM. Ten quick tips for deep learning in biology. PLoS Comput Biol 2022; 18:e1009803. [PMID: 35324884 PMCID: PMC8946751 DOI: 10.1371/journal.pcbi.1009803] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Indexed: 12/05/2022] Open
Affiliation(s)
- Benjamin D. Lee
- In-Q-Tel Labs, Arlington, Virginia, United States of America
- School of Engineering and Applied Sciences, Harvard University, Cambridge, Massachusetts, United States of America
- Department of Genetics, Harvard Medical School, Boston, Massachusetts, United States of America
| | - Anthony Gitter
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
- Morgridge Institute for Research, Madison, Wisconsin, United States of America
| | - Casey S. Greene
- Department of Systems Pharmacology and Translational Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora, Colorado, United States of America
- Center for Health AI, University of Colorado School of Medicine, Aurora, Colorado, United States of America
| | - Sebastian Raschka
- Department of Statistics, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
| | - Finlay Maguire
- Faculty of Computer Science, Dalhousie University, Halifax, Nova Scotia, Canada
| | - Alexander J. Titus
- University of New Hampshire, Manchester, New Hampshire, United States of America
- Bioeconomy.XYZ, Manchester, New Hampshire, United States of America
| | - Michael D. Kessler
- Department of Oncology, Johns Hopkins University, Baltimore, Maryland, United States of America
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, Maryland, United States of America
| | - Alexandra J. Lee
- Department of Systems Pharmacology and Translational Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Genomics and Computational Biology Graduate Program, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
| | - Marc G. Chevrette
- Wisconsin Institute for Discovery and Department of Plant Pathology, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
| | - Paul Allen Stewart
- Department of Biostatistics and Bioinformatics, Moffitt Cancer Center, Tampa, Florida, United States of America
| | - Thiago Britto-Borges
- Section of Bioinformatics and Systems Cardiology, Klaus Tschira Institute for Integrative Computational Cardiology, University Hospital Heidelberg, Heidelberg, Germany
- Department of Internal Medicine III (Cardiology, Angiology, and Pneumology), University Hospital Heidelberg, Heidelberg, Germany
| | - Evan M. Cofer
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey, United States of America
- Graduate Program in Quantitative and Computational Biology, Princeton University, Princeton, New Jersey, United States of America
| | - Kun-Hsing Yu
- Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, United States of America
- Department of Pathology, Brigham and Women’s Hospital, Boston, Massachusetts, United States of America
| | - Juan Jose Carmona
- Philips Healthcare, Cambridge, Massachusetts, United States of America
| | - Elana J. Fertig
- Department of Oncology, Johns Hopkins University, Baltimore, Maryland, United States of America
- Department of Biomedical Engineering, Department of Applied Mathematics and Statistics, Convergence Institute, Johns Hopkins University, Baltimore, Maryland, United States of America
| | - Alexandr A. Kalinin
- Medical Big Data Group, Shenzhen Research Institute of Big Data, Shenzhen, China
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, United States of America
| | - Brandon Signal
- School of Medicine, College of Health and Medicine, University of Tasmania, Hobart, Australia
| | - Benjamin J. Lengerich
- Computer Science Department, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
| | - Timothy J. Triche
- Center for Epigenetics, Van Andel Research Institute, Grand Rapids, Michigan, United States of America
- Department of Pediatrics, College of Human Medicine, Michigan State University, East Lansing, Michigan, United States of America
- Department of Translational Genomics, Keck School of Medicine, University of Southern California, Los Angeles, California, United States of America
| | - Simina M. Boca
- Innovation Center for Biomedical Informatics, Georgetown University Medical Center, District of Columbia, United States of America
- Department of Oncology, Georgetown University Medical Center, Washington, DC, United States of America
- Department of Biostatistics, Bioinformatics and Biomathematics, Georgetown University Medical Center, Washington, DC, United States of America
- Cancer Prevention and Control Program, Lombardi Comprehensive Cancer Center, Washington, DC, United States of America
|
27
|
Maudsley S, Leysen H, van Gastel J, Martin B. Systems Pharmacology: Enabling Multidimensional Therapeutics. Comprehensive Pharmacology 2022:725-769. [DOI: 10.1016/b978-0-12-820472-6.00017-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/03/2025]
|
28
|
Greener JG, Kandathil SM, Moffat L, Jones DT. A guide to machine learning for biologists. Nat Rev Mol Cell Biol 2022; 23:40-55. [PMID: 34518686 DOI: 10.1038/s41580-021-00407-0] [Citation(s) in RCA: 790] [Impact Index Per Article: 263.3] [Accepted: 07/23/2021] [Indexed: 02/08/2023]
Abstract
The expanding scale and inherent complexity of biological data have encouraged a growing use of machine learning in biology to build informative and predictive models of the underlying biological processes. All machine learning techniques fit models to data; however, the specific methods are quite varied and can at first glance seem bewildering. In this Review, we aim to provide readers with a gentle introduction to a few key machine learning techniques, including the most recently developed and widely used techniques involving deep neural networks. We describe how different techniques may be suited to specific types of biological data, and also discuss some best practices and points to consider when one is embarking on experiments involving machine learning. Some emerging directions in machine learning methodology are also discussed.
Affiliation(s)
- Joe G Greener
- Department of Computer Science, University College London, London, UK
| | - Shaun M Kandathil
- Department of Computer Science, University College London, London, UK
| | - Lewis Moffat
- Department of Computer Science, University College London, London, UK
| | - David T Jones
- Department of Computer Science, University College London, London, UK.
|
29
|
Shen WX, Liu Y, Chen Y, Zeng X, Tan Y, Jiang YY, Chen Y. AggMapNet: enhanced and explainable low-sample omics deep learning with feature-aggregated multi-channel networks. Nucleic Acids Res 2022; 50:e45. [PMID: 35100418 PMCID: PMC9071488 DOI: 10.1093/nar/gkac010] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/03/2021] [Revised: 12/01/2021] [Accepted: 01/06/2022] [Indexed: 11/20/2022] Open
Abstract
Omics-based biomedical learning frequently relies on data of high-dimensions (up to thousands) and low-sample sizes (dozens to hundreds), which challenges efficient deep learning (DL) algorithms, particularly for low-sample omics investigations. Here, an unsupervised novel feature aggregation tool AggMap was developed to Aggregate and Map omics features into multi-channel 2D spatial-correlated image-like feature maps (Fmaps) based on their intrinsic correlations. AggMap exhibits strong feature reconstruction capabilities on a randomized benchmark dataset, outperforming existing methods. With AggMap multi-channel Fmaps as inputs, newly-developed multi-channel DL AggMapNet models outperformed the state-of-the-art machine learning models on 18 low-sample omics benchmark tasks. AggMapNet exhibited better robustness in learning noisy data and disease classification. The AggMapNet explainable module Simply-explainer identified key metabolites and proteins for COVID-19 detections and severity predictions. The unsupervised AggMap algorithm of good feature restructuring abilities combined with supervised explainable AggMapNet architecture establish a pipeline for enhanced learning and interpretability of low-sample omics data.
Affiliation(s)
- Wan Xiang Shen
- The State Key Laboratory of Chemical Oncogenomics, Key Laboratory of Chemical Biology, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, P.R. China
- Bioinformatics and Drug Design Group, Department of Pharmacy, and Center for Computational Science and Engineering, National University of Singapore 117543, Singapore
| | - Yu Liu
- Institute for Health Innovation & Technology, National University of Singapore 117543, Singapore
- Department of Biomedical Engineering, Faculty of Engineering, National University of Singapore 117543, Singapore
| | - Yan Chen
- The State Key Laboratory of Chemical Oncogenomics, Key Laboratory of Chemical Biology, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, P.R. China
| | - Xian Zeng
- Department of Biological Medicines & Shanghai Engineering Research Center of Immunotherapeutics, School of Pharmacy, Fudan University, Shanghai 201203, P.R. China
| | - Ying Tan
- The State Key Laboratory of Chemical Oncogenomics, Key Laboratory of Chemical Biology, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, P.R. China
- Shenzhen Kivita Innovative Drug Discovery Institute, Shenzhen 518110, P.R. China
| | - Yu Yang Jiang
- Correspondence may also be addressed to Yu Yang Jiang. Tel: +86 755 2603635;
| | - Yu Zong Chen
- To whom correspondence should be addressed. Tel: +86 755 26032094;
|
30
|
Leysen H, Walter D, Christiaenssen B, Vandoren R, Harputluoğlu İ, Van Loon N, Maudsley S. GPCRs Are Optimal Regulators of Complex Biological Systems and Orchestrate the Interface between Health and Disease. Int J Mol Sci 2021; 22:13387. [PMID: 34948182 PMCID: PMC8708147 DOI: 10.3390/ijms222413387] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 11/24/2021] [Revised: 12/08/2021] [Accepted: 12/09/2021] [Indexed: 02/06/2023] Open
Abstract
GPCRs arguably represent the most effective current therapeutic targets for a plethora of diseases. GPCRs also play a pivotal role in regulating the physiological balance between healthy and pathological conditions; thus, their importance in systems biology should not be underestimated. The molecular diversity of GPCR signaling systems is likely to be closely associated with disease-associated changes in organismal tissue complexity and compartmentalization, thus enabling a nuanced GPCR-based capacity to interdict multiple disease pathomechanisms at a systemic level. GPCRs have long been considered controllers of communication between tissues and cells. This communication involves the ligand-mediated control of cell surface receptors that then direct their stimuli to impact cell physiology. Given the tremendous success of GPCRs as therapeutic targets, considerable focus has been placed on the ability of these therapeutics to modulate diseases by acting at cell surface receptors. In the past decade, however, attention has focused upon how stable multiprotein GPCR superstructures, termed receptorsomes, both at the cell surface membrane and in the intracellular domain, dictate and condition long-term GPCR activities associated with the regulation of protein expression patterns, cellular stress responses and DNA integrity management. The ability of these receptorsomes (often in the absence of typical cell surface ligands) to control complex cellular activities implicates them as key controllers of the functional balance between health and disease. A greater understanding of this function of GPCRs is likely to significantly augment our ability to further employ these proteins in a multitude of diseases.
Affiliation(s)
- Hanne Leysen
- Receptor Biology Lab, University of Antwerp, 2610 Wilrijk, Belgium; (H.L.); (D.W.); (B.C.); (R.V.); (İ.H.); (N.V.L.)
| | - Deborah Walter
- Receptor Biology Lab, University of Antwerp, 2610 Wilrijk, Belgium; (H.L.); (D.W.); (B.C.); (R.V.); (İ.H.); (N.V.L.)
| | - Bregje Christiaenssen
- Receptor Biology Lab, University of Antwerp, 2610 Wilrijk, Belgium; (H.L.); (D.W.); (B.C.); (R.V.); (İ.H.); (N.V.L.)
| | - Romi Vandoren
- Receptor Biology Lab, University of Antwerp, 2610 Wilrijk, Belgium; (H.L.); (D.W.); (B.C.); (R.V.); (İ.H.); (N.V.L.)
| | - İrem Harputluoğlu
- Receptor Biology Lab, University of Antwerp, 2610 Wilrijk, Belgium; (H.L.); (D.W.); (B.C.); (R.V.); (İ.H.); (N.V.L.)
- Department of Chemistry, Middle East Technical University, Çankaya, Ankara 06800, Turkey
| | - Nore Van Loon
- Receptor Biology Lab, University of Antwerp, 2610 Wilrijk, Belgium; (H.L.); (D.W.); (B.C.); (R.V.); (İ.H.); (N.V.L.)
| | - Stuart Maudsley
- Receptor Biology Lab, University of Antwerp, 2610 Wilrijk, Belgium; (H.L.); (D.W.); (B.C.); (R.V.); (İ.H.); (N.V.L.)
- Correspondence:
|
31
|
Mourragui SMC, Loog M, Vis DJ, Moore K, Manjon AG, van de Wiel MA, Reinders MJT, Wessels LFA. Predicting patient response with models trained on cell lines and patient-derived xenografts by nonlinear transfer learning. Proc Natl Acad Sci U S A 2021; 118:e2106682118. [PMID: 34873056 PMCID: PMC8670522 DOI: 10.1073/pnas.2106682118] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Accepted: 10/18/2021] [Indexed: 12/13/2022] Open
Abstract
Preclinical models have been the workhorse of cancer research, producing massive amounts of drug response data. Unfortunately, translating response biomarkers derived from these datasets to human tumors has proven to be particularly challenging. To address this challenge, we developed TRANSACT, a computational framework that builds a consensus space to capture biological processes common to preclinical models and human tumors and exploits this space to construct drug response predictors that robustly transfer from preclinical models to human tumors. TRANSACT performs favorably compared to four competing approaches, including two deep learning approaches, on a set of 23 drug prediction challenges on The Cancer Genome Atlas and 226 metastatic tumors from the Hartwig Medical Foundation. We demonstrate that response predictions deliver a robust performance for a number of therapies of high clinical importance: platinum-based chemotherapies, gemcitabine, and paclitaxel. In contrast to other approaches, we demonstrate the interpretability of the TRANSACT predictors by correctly identifying known biomarkers of targeted therapies, and we propose potential mechanisms that mediate the resistance to two chemotherapeutic agents.
Affiliation(s)
- Soufiane M C Mourragui
- Division of Molecular Carcinogenesis, Oncode Institute, The Netherlands Cancer Institute, 1066 CX Amsterdam, The Netherlands
- Department of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, 2628 XE Delft, The Netherlands
| | - Marco Loog
- Department of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, 2628 XE Delft, The Netherlands
- Department of Computer Science, University of Copenhagen, 2100 Copenhagen, Denmark
| | - Daniel J Vis
- Division of Molecular Carcinogenesis, Oncode Institute, The Netherlands Cancer Institute, 1066 CX Amsterdam, The Netherlands
| | - Kat Moore
- Division of Molecular Carcinogenesis, Oncode Institute, The Netherlands Cancer Institute, 1066 CX Amsterdam, The Netherlands
| | - Anna G Manjon
- Division of Cell Biology, Oncode Institute, The Netherlands Cancer Institute, 1066 CX Amsterdam, The Netherlands
| | - Mark A van de Wiel
- Epidemiology and Biostatistics, Amsterdam University Medical Center, 1105 AZ Amsterdam, The Netherlands
- Medical Research Council Biostatistics Unit, Cambridge University, Cambridge CB2 0SR, United Kingdom
| | - Marcel J T Reinders
- Department of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, 2628 XE Delft, The Netherlands;
- Leiden Computational Biology Center, Leiden University Medical Center, 2333 ZC Leiden, The Netherlands
| | - Lodewyk F A Wessels
- Division of Molecular Carcinogenesis, Oncode Institute, The Netherlands Cancer Institute, 1066 CX Amsterdam, The Netherlands;
- Department of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, 2628 XE Delft, The Netherlands
|
32
|
Clinical impact and quality of randomized controlled trials involving interventions evaluating artificial intelligence prediction tools: a systematic review. NPJ Digit Med 2021; 4:154. [PMID: 34711955 PMCID: PMC8553754 DOI: 10.1038/s41746-021-00524-2] [Citation(s) in RCA: 56] [Impact Index Per Article: 14.0] [Received: 04/07/2021] [Accepted: 09/30/2021] [Indexed: 12/23/2022] Open
Abstract
Evidence of the impact of traditional statistical (TS) and artificial intelligence (AI) tool interventions in clinical practice is limited. This study aimed to investigate the clinical impact and quality of randomized controlled trials (RCTs) involving interventions evaluating TS, machine learning (ML), and deep learning (DL) prediction tools. A systematic review on PubMed was conducted to identify RCTs involving TS/ML/DL tool interventions in the past decade. A total of 65 RCTs from 26,082 records were included. A majority of them had model development studies, and generally good performance was achieved. The function of TS and ML tools in the RCTs mainly included assistive treatment decisions, assistive diagnosis, and risk stratification, but DL trials were only conducted for assistive diagnosis. Nearly two-fifths of the trial interventions showed no clinical benefit compared to standard care. Though DL and ML interventions achieved higher rates of positive results than TS in the RCTs, in trials with low risk of bias (17/65) the advantage of DL over TS was reduced while the advantage of ML over TS disappeared. Applications of DL are not yet widespread in medicine. DL can be expected to address more complex clinical problems than ML and TS tools in the future. Therefore, rigorous studies are required before the clinical application of these tools.
|
33
|
Golriz Khatami S, Mubeen S, Bharadhwaj VS, Kodamullil AT, Hofmann-Apitius M, Domingo-Fernández D. Using predictive machine learning models for drug response simulation by calibrating patient-specific pathway signatures. NPJ Syst Biol Appl 2021; 7:40. [PMID: 34707117 PMCID: PMC8551267 DOI: 10.1038/s41540-021-00199-1] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 04/02/2021] [Accepted: 09/21/2021] [Indexed: 11/21/2022] Open
Abstract
The utility of pathway signatures lies in their capability to determine whether a specific pathway or biological process is dysregulated in a given patient. These signatures have been widely used in machine learning (ML) methods for a variety of applications including precision medicine, drug repurposing, and drug discovery. In this work, we leverage highly predictive ML models for drug response simulation in individual patients by calibrating the pathway activity scores of disease samples. Using these ML models and an intuitive scoring algorithm to modify the signatures of patients, we evaluate whether a given sample that was formerly classified as diseased could be predicted as normal following drug treatment simulation. We then use this technique as a proxy for the identification of potential drug candidates. Furthermore, we demonstrate the ability of our methodology to successfully identify approved and clinically investigated drugs for four different cancers, outperforming six comparable state-of-the-art methods. We also show how this approach can deconvolute a drug's mechanism of action and propose combination therapies. Taken together, our methodology could be promising to support clinical decision-making in personalized medicine by simulating a drug's effect on a given patient.
Affiliation(s)
- Sepehr Golriz Khatami
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, Sankt Augustin, 53757, Germany.
- Bonn-Aachen International Center for Information Technology (B-IT), University of Bonn, 53115, Bonn, Germany.
| | - Sarah Mubeen
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, Sankt Augustin, 53757, Germany
- Bonn-Aachen International Center for Information Technology (B-IT), University of Bonn, 53115, Bonn, Germany
- Fraunhofer Center for Machine Learning, Sankt Augustin, Germany
| | - Vinay Srinivas Bharadhwaj
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, Sankt Augustin, 53757, Germany
- Bonn-Aachen International Center for Information Technology (B-IT), University of Bonn, 53115, Bonn, Germany
| | - Alpha Tom Kodamullil
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, Sankt Augustin, 53757, Germany
| | - Martin Hofmann-Apitius
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, Sankt Augustin, 53757, Germany
- Bonn-Aachen International Center for Information Technology (B-IT), University of Bonn, 53115, Bonn, Germany
| | - Daniel Domingo-Fernández
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, Sankt Augustin, 53757, Germany.
- Fraunhofer Center for Machine Learning, Sankt Augustin, Germany.
- Enveda Biosciences, Boulder, CO, 80301, USA.
|
34
|
Improved prediction of smoking status via isoform-aware RNA-seq deep learning models. PLoS Comput Biol 2021; 17:e1009433. [PMID: 34634029 PMCID: PMC8530282 DOI: 10.1371/journal.pcbi.1009433] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/11/2020] [Revised: 10/21/2021] [Accepted: 09/08/2021] [Indexed: 11/19/2022] Open
Abstract
Most predictive models based on gene expression data do not leverage information related to gene splicing, despite the fact that splicing is a fundamental feature of eukaryotic gene expression. Cigarette smoking is an important environmental risk factor for many diseases, and it has profound effects on gene expression. Using smoking status as a prediction target, we developed deep neural network predictive models using gene, exon, and isoform level quantifications from RNA sequencing data in 2,557 subjects in the COPDGene Study. We observed that models using exon and isoform quantifications clearly outperformed gene-level models when using data from 5 genes from a previously published prediction model. Whereas the test set performance of the previously published model was 0.82 in the original publication, our exon-based models including an exon-to-isoform mapping layer achieved a test set AUC (area under the receiver operating characteristic) of 0.88, which improved to an AUC of 0.94 using exon quantifications from a larger set of genes. Isoform variability is an important source of latent information in RNA-seq data that can be used to improve clinical prediction models. Predictive models based on gene expression are already a part of medical decision making for selected situations such as early breast cancer treatment. Most of these models are based on measures that do not capture critical aspects of gene splicing, but with RNA sequencing it is possible to capture some of these aspects of alternative splicing and use them to improve clinical predictions. Building on previous models to predict cigarette smoking status, we show that measures of alternative splicing significantly improve the accuracy of these predictive models.
|
35
|
Wartmann H, Heins S, Kloiber K, Bonn S. Bias-invariant RNA-sequencing metadata annotation. Gigascience 2021; 10:giab064. [PMID: 34553213 PMCID: PMC8559615 DOI: 10.1093/gigascience/giab064] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 12/16/2020] [Revised: 06/11/2021] [Accepted: 09/01/2021] [Indexed: 01/14/2023] Open
Abstract
BACKGROUND: Recent technological advances have resulted in an unprecedented increase in publicly available biomedical data, yet the reuse of the data is often precluded by experimental bias and a lack of annotation depth and consistency. Missing annotations make it impossible for researchers to find datasets specific to their needs. FINDINGS: Here, we investigate RNA-sequencing metadata prediction based on gene expression values. We present a deep-learning-based domain adaptation algorithm for the automatic annotation of RNA-sequencing metadata. We show, in multiple experiments, that our model is better at integrating heterogeneous training data compared with existing linear regression-based approaches, resulting in improved tissue type classification. By using a model architecture similar to Siamese networks, the algorithm can learn biases from datasets with few samples. CONCLUSION: Using our novel domain adaptation approach, we achieved metadata annotation accuracies up to 15.7% better than a previously published method. Using the best model, we provide a list of >10,000 novel tissue and sex label annotations for 8,495 unique SRA samples. Our approach has the potential to revive idle datasets by automated annotation, making them more searchable.
Affiliation(s)
- Hannes Wartmann
- Institute of Medical Systems Biology, Center for Biomedical AI, University Medical Center Hamburg-Eppendorf, 20251 Hamburg, Germany
| | - Sven Heins
- Institute of Medical Systems Biology, Center for Biomedical AI, University Medical Center Hamburg-Eppendorf, 20251 Hamburg, Germany
| | - Karin Kloiber
- Institute of Medical Systems Biology, Center for Biomedical AI, University Medical Center Hamburg-Eppendorf, 20251 Hamburg, Germany
| | - Stefan Bonn
- Institute of Medical Systems Biology, Center for Biomedical AI, University Medical Center Hamburg-Eppendorf, 20251 Hamburg, Germany
|
36
|
Machine Learning Modeling from Omics Data as Prospective Tool for Improvement of Inflammatory Bowel Disease Diagnosis and Clinical Classifications. Genes (Basel) 2021; 12:1438. [PMID: 34573420 PMCID: PMC8466305 DOI: 10.3390/genes12091438] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 06/15/2021] [Revised: 08/21/2021] [Accepted: 09/14/2021] [Indexed: 12/14/2022] Open
Abstract
Research on inflammatory bowel disease (IBD) has identified numerous molecular players involved in disease development. Even so, the understanding of IBD is incomplete, and disease treatment is still far from precision medicine. Reliable diagnostic and prognostic biomarkers in IBD are limited, which may compromise therapeutic outcomes. High-throughput technologies and artificial intelligence have emerged as powerful tools in the search for unrevealed molecular patterns that could give important insights into IBD pathogenesis and help to address unmet clinical needs. Machine learning, a subtype of artificial intelligence, uses complex mathematical algorithms to learn from existing data in order to predict future outcomes. The scientific community has been increasingly employing machine learning for the prediction of IBD outcomes from comprehensive patient data: clinical records and genomic, transcriptomic, proteomic, metagenomic, and other IBD-relevant omics data. This review aims to present the fundamental principles behind machine learning modeling and its current application in IBD research, with a focus on studies that explored genomic and transcriptomic data. We describe different strategies used for dealing with omics data and outline the best-performing methods. Before being translated into clinical settings, the developed machine learning models should be tested in independent prospective studies as well as randomized controlled trials.

37
Srinivas Bharadhwaj V, Ali M, Birkenbihl C, Mubeen S, Lehmann J, Hofmann-Apitius M, Tapley Hoyt C, Domingo-Fernández D. CLEP: A Hybrid Data- and Knowledge-Driven Framework for Generating Patient Representations. Bioinformatics 2021; 37:3311-3318. [PMID: 33964127 PMCID: PMC8504642 DOI: 10.1093/bioinformatics/btab340] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 11/05/2020] [Revised: 03/29/2021] [Accepted: 05/03/2021] [Indexed: 12/29/2022] Open
Abstract
Summary: As machine learning and artificial intelligence increasingly attain a larger number of applications in the biomedical domain, at their core, their utility depends on the data used to train them. Due to the complexity and high dimensionality of biomedical data, there is a need for approaches that combine prior knowledge around known biological interactions with patient data. Here, we present CLinical Embedding of Patients (CLEP), a novel approach that generates new patient representations by leveraging both prior knowledge and patient-level data. First, given a patient-level dataset and a knowledge graph containing relations across features that can be mapped to the dataset, CLEP incorporates patients into the knowledge graph as new nodes connected to their most characteristic features. Next, CLEP employs knowledge graph embedding models to generate new patient representations that can ultimately be used for a variety of downstream tasks, ranging from clustering to classification. We demonstrate how using new patient representations generated by CLEP significantly improves performance in classifying between patients and healthy controls for a variety of machine learning models, as compared to the use of the original transcriptomics data. Furthermore, we also show how incorporating patients into a knowledge graph can foster the interpretation and identification of biological features characteristic of a specific disease or patient subgroup. Finally, we released CLEP as an open source Python package together with examples and documentation.

Availability and implementation: CLEP is available to the bioinformatics community as an open source Python package at https://github.com/hybrid-kg/clep under the Apache 2.0 License.

Supplementary information: Supplementary data are available at Bioinformatics online.
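The first step described in this abstract, adding patients to a knowledge graph as nodes linked to their most characteristic features, can be illustrated with a minimal sketch. This is not the CLEP package's API. The function name `patient_edges`, the relation labels `up` and `down`, and the z-score cutoff against a control group are all assumptions chosen for illustration, and the embedding step is omitted.

```python
from statistics import mean, pstdev

def patient_edges(expr, controls, z_cut=2.0):
    """Return (patient, relation, feature) triples linking each patient to
    features that deviate strongly from the control distribution.

    Hypothetical helper: the z-score rule and the "up"/"down" labels are
    illustrative assumptions, not the thresholds used by CLEP itself.
    """
    features = sorted({f for row in controls.values() for f in row})
    # Per-feature mean and (population) standard deviation over controls.
    stats = {}
    for f in features:
        vals = [row[f] for row in controls.values()]
        stats[f] = (mean(vals), pstdev(vals))

    triples = []
    for patient, row in sorted(expr.items()):
        for f in features:
            mu, sd = stats[f]
            if sd == 0:
                continue  # feature is constant in controls; no z-score
            z = (row[f] - mu) / sd
            if z >= z_cut:
                triples.append((patient, "up", f))
            elif z <= -z_cut:
                triples.append((patient, "down", f))
    return triples

# Toy prior-knowledge graph and toy expression values (made up).
kg = [("GENE_A", "interacts_with", "GENE_B")]
controls = {
    "c1": {"GENE_A": 1.0, "GENE_B": 5.0},
    "c2": {"GENE_A": 3.0, "GENE_B": 5.0},
}
patients = {"p1": {"GENE_A": 6.0, "GENE_B": 5.0}}

# Patient nodes are appended to the graph as additional triples.
augmented = kg + patient_edges(patients, controls)
```

The augmented triple list is the kind of input a knowledge graph embedding model could then consume to produce one vector per patient node.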
Affiliation(s)
- Vinay Srinivas Bharadhwaj
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, Sankt Augustin 53757, Germany; Bonn-Aachen International Center for Information Technology (B-IT), University of Bonn, 53115 Bonn, Germany
- Mehdi Ali
- Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn 53113, Germany; Fraunhofer Institute for Intelligent Analysis and Information Systems (IAIS), Dresden and Sankt Augustin, Germany
- Colin Birkenbihl
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, Sankt Augustin 53757, Germany; Bonn-Aachen International Center for Information Technology (B-IT), University of Bonn, 53115 Bonn, Germany
- Sarah Mubeen
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, Sankt Augustin 53757, Germany; Bonn-Aachen International Center for Information Technology (B-IT), University of Bonn, 53115 Bonn, Germany; Fraunhofer Center for Machine Learning, Germany
- Jens Lehmann
- Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn 53113, Germany; Fraunhofer Institute for Intelligent Analysis and Information Systems (IAIS), Dresden and Sankt Augustin, Germany
- Martin Hofmann-Apitius
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, Sankt Augustin 53757, Germany; Bonn-Aachen International Center for Information Technology (B-IT), University of Bonn, 53115 Bonn, Germany
- Charles Tapley Hoyt
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, Sankt Augustin 53757, Germany; Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn 53113, Germany; Fraunhofer Center for Machine Learning, Germany
- Daniel Domingo-Fernández
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, Sankt Augustin 53757, Germany; Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn 53113, Germany; Fraunhofer Center for Machine Learning, Germany

38
Del Giudice M, Peirone S, Perrone S, Priante F, Varese F, Tirtei E, Fagioli F, Cereda M. Artificial Intelligence in Bulk and Single-Cell RNA-Sequencing Data to Foster Precision Oncology. Int J Mol Sci 2021; 22:ijms22094563. [PMID: 33925407 PMCID: PMC8123853 DOI: 10.3390/ijms22094563] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 03/20/2021] [Revised: 04/21/2021] [Accepted: 04/23/2021] [Indexed: 02/01/2023] Open
Abstract
Artificial intelligence, or the discipline of developing computational algorithms able to perform tasks that require human intelligence, offers the opportunity to improve our idea and delivery of precision medicine. Here, we provide an overview of artificial intelligence approaches for the analysis of large-scale RNA-sequencing datasets in cancer. We present the major solutions to disentangle inter- and intra-tumor heterogeneity of transcriptome profiles for an effective improvement of patient management. We outline the contributions of learning algorithms to the needs of cancer genomics, from identifying rare cancer subtypes to personalizing therapeutic treatments.
Affiliation(s)
- Marco Del Giudice
- Cancer Genomics and Bioinformatics Unit, IIGM—Italian Institute for Genomic Medicine, c/o IRCCS, Str. Prov.le 142, km 3.95, 10060 Candiolo, TO, Italy
- Candiolo Cancer Institute, FPO—IRCCS, Str. Prov.le 142, km 3.95, 10060 Candiolo, TO, Italy
- Serena Peirone
- Cancer Genomics and Bioinformatics Unit, IIGM—Italian Institute for Genomic Medicine, c/o IRCCS, Str. Prov.le 142, km 3.95, 10060 Candiolo, TO, Italy
- Department of Physics and INFN, Università degli Studi di Torino, via P.Giuria 1, 10125 Turin, Italy
- Sarah Perrone
- Cancer Genomics and Bioinformatics Unit, IIGM—Italian Institute for Genomic Medicine, c/o IRCCS, Str. Prov.le 142, km 3.95, 10060 Candiolo, TO, Italy
- Department of Physics, Università degli Studi di Torino, via P.Giuria 1, 10125 Turin, Italy
- Francesca Priante
- Cancer Genomics and Bioinformatics Unit, IIGM—Italian Institute for Genomic Medicine, c/o IRCCS, Str. Prov.le 142, km 3.95, 10060 Candiolo, TO, Italy
- Department of Physics, Università degli Studi di Torino, via P.Giuria 1, 10125 Turin, Italy
- Fabiola Varese
- Cancer Genomics and Bioinformatics Unit, IIGM—Italian Institute for Genomic Medicine, c/o IRCCS, Str. Prov.le 142, km 3.95, 10060 Candiolo, TO, Italy
- Department of Life Science and System Biology, Università degli Studi di Torino, via Accademia Albertina 13, 10123 Turin, Italy
- Elisa Tirtei
- Paediatric Onco-Haematology Division, Regina Margherita Children’s Hospital, City of Health and Science of Turin, 10126 Turin, Italy
- Franca Fagioli
- Paediatric Onco-Haematology Division, Regina Margherita Children’s Hospital, City of Health and Science of Turin, 10126 Turin, Italy
- Department of Public Health and Paediatric Sciences, University of Torino, 10124 Turin, Italy
- Matteo Cereda
- Cancer Genomics and Bioinformatics Unit, IIGM—Italian Institute for Genomic Medicine, c/o IRCCS, Str. Prov.le 142, km 3.95, 10060 Candiolo, TO, Italy
- Candiolo Cancer Institute, FPO—IRCCS, Str. Prov.le 142, km 3.95, 10060 Candiolo, TO, Italy
- Correspondence: Tel.: +39-011-993-3969

39
Ning L, Huixin H. Topic Evolution Analysis for Omics Data Integration in Cancers. Front Cell Dev Biol 2021; 9:631011. [PMID: 33898421 PMCID: PMC8058380 DOI: 10.3389/fcell.2021.631011] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 11/19/2020] [Accepted: 02/04/2021] [Indexed: 12/02/2022] Open
Abstract
One of the vital challenges in cancer is that efficient biomarkers for monitoring its formation and development are limited. Omics data integration plays a crucial role in the mining of biomarkers in the human condition. As the link between omics studies on biomarker discovery and cancer deepens, defining the principal technologies applied in the field is a must, not only for the current period but also for the future. We utilize topic modeling to extract topics (or themes) as a probabilistic distribution of latent topics from the dataset. To predict the future trend of related cases, we utilize the Prophet neural network to perform a prediction correction model for existing topics. A total of 2,318 pieces of literature (from 2006 to 2020) were retrieved from MEDLINE with the query on “omics” and “cancer.” Our study found 20 topics covering current research types. The topic extraction results indicate that, with the rapid development of omics data integration research, multi-omics analysis (Topic 11) and genomics of colorectal cancer (Topic 10) have had more studies reported over the last 15 years. From the topic prediction view, research findings in multi-omics data processing and novel biomarker discovery for cancer prediction (Topics 2, 3, 10, 11) will be heavily focused on in the future. From the topic visualization and evolution trends, metabolomics of breast cancer (Topic 9), pharmacogenomics (Topic 15), genome-guided therapy regimens (Topic 16), and microRNA target genes (Topic 17) could develop more rapidly in the study of cancer treatment effect and recurrence prediction.
Affiliation(s)
- Li Ning
- Business School of Huaqiao University, Quan Zhou, China
- He Huixin
- Management Science and Engineering Department, Management School, Xiamen University, Xiamen, China

40
Yuan Q, Zhang S, Li J, Xiao J, Li X, Yang J, Lu D, Wang Y. Comprehensive analysis of core genes and key pathways in Parkinson's disease. Am J Transl Res 2020; 12:5630-5639. [PMID: 33042444 PMCID: PMC7540129] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2020] [Accepted: 07/25/2020] [Indexed: 06/11/2023]
Abstract
Parkinson's disease (PD) is a neurodegenerative disease that occurs mostly in middle-aged and older adults. Its main pathological feature is the progressive death of substantia nigra dopaminergic neurons. As the world's population ages, the number of PD patients is increasing. In this study, we explored the relationship between PD and the cell cycle. We collected two independent PD transcriptomic datasets, GSE54536 and GSE6613, from the Gene Expression Omnibus (GEO) database. Gene set enrichment analysis (GSEA) was used to identify dysregulated pathways in PD samples. Gene expression was verified by qPCR in PD patients. Nineteen pathways were negatively enriched in both the GSE54536 and GSE6613 datasets. Seven of these 19 pathways were cell cycle-related pathways, including the M/G1 transition, S phase, G1/S transition, mitotic G1-G1/S phases, CDT1 association with the CDC6 ORC origin complex, cell cycle checkpoints and synthesis of DNA. Next, we found that eight genes (PSMA4, PSMB1, PSMC5, PSMD11, MCM4, RPA1, POLE, and PSME4) were mainly enriched in the GSE54536 and GSE6613 datasets. In GSE54536, PSMA4, PSMB1, PSMC5, and PSME4 could significantly predict the occurrence of PD, whereas, in GSE6613, RPA1 and PSME4 could significantly predict the occurrence of PD. Only PSME4 showed significant results in both datasets. Finally, we assessed blood samples from PD patients and controls. Compared with the control samples, the PD samples had lower mRNA levels of PSME4. In summary, these findings can significantly enhance our understanding of the causes and potential molecular mechanisms of PD; the cell cycle signaling pathways and PSME4 may be therapeutic targets for PD.
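To make "negatively enriched" concrete, the sketch below computes a simplified, unweighted running-sum enrichment score of the kind that underlies GSEA. It is an illustration only: the function name `enrichment_score` and the toy gene labels are hypothetical, the standard GSEA statistic additionally weights hits by their ranking metric and assesses significance by permutation, and nothing here reproduces the study's actual analysis.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score (simplified GSEA-style).

    Walk down a ranked gene list, stepping up by 1/n_hit for genes in the
    set and down by 1/n_miss otherwise; return the running value with the
    largest absolute deviation from zero.
    """
    hits = [g in gene_set for g in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must cover some but not all ranked genes")

    running, best = 0.0, 0.0
    for is_hit in hits:
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best

# Toy ranking from most up-regulated to most down-regulated (made up).
ranked = ["g1", "g2", "g3", "g4"]
bottom_set = {"g3", "g4"}   # members cluster at the bottom of the ranking
top_set = {"g1", "g2"}      # members cluster at the top of the ranking

es_bottom = enrichment_score(ranked, bottom_set)
es_top = enrichment_score(ranked, top_set)
```

A set whose members sit at the bottom of the ranking yields a negative score, which corresponds to the negative enrichment reported for the cell cycle pathways.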
Affiliation(s)
- Qian Yuan
- Department of Neurology, Wuhan Wuchang Hospital, Wuchang Hospital Affiliated to Wuhan University of Science and Technology, Wuhan 430063, China
- Department of Neurology, The Second Affiliated Hospital of Zhengzhou University, Zhengzhou 450014, China
- Simiao Zhang
- Department of Neurology, The Second Affiliated Hospital of Zhengzhou University, Zhengzhou 450014, China
- Jingna Li
- Department of Neurology, The Second Affiliated Hospital of Zhengzhou University, Zhengzhou 450014, China
- Jianhao Xiao
- Department of Neurology, Shanghai Pudong Hospital, Fudan University Pudong Medical Center, Shanghai 201399, China
- Xiaodong Li
- Department of Neurology, Zhengzhou Central Hospital, Zhengzhou 450014, China
- Jingmin Yang
- NHC Key Laboratory of Birth Defects and Reproductive Health, Chongqing Population and Family Planning Science and Technology Research Institute, Chongqing 400020, China
- Daru Lu
- NHC Key Laboratory of Birth Defects and Reproductive Health, Chongqing Population and Family Planning Science and Technology Research Institute, Chongqing 400020, China
- Yunliang Wang
- Department of Neurology, The Second Affiliated Hospital of Zhengzhou University, Zhengzhou 450014, China
- Department of Neurology, The 960th Hospital of Chinese PLA, Zibo 255300, China