1
Yamagiwa H, Hashimoto R, Arakane K, Murakami K, Soeda S, Oyama M, Zhu Y, Okada M, Shimodaira H. Predicting drug-gene relations via analogy tasks with word embeddings. Sci Rep 2025; 15:17240. [PMID: 40383732 PMCID: PMC12086191 DOI: 10.1038/s41598-025-01418-z] [Received: 07/27/2024] [Accepted: 05/06/2025] [Indexed: 05/20/2025]
Abstract
Natural language processing is utilized in a wide range of fields, where words in text are typically transformed into feature vectors called embeddings. BioConceptVec is a specific example of embeddings tailored for biology, trained on approximately 30 million PubMed abstracts using models such as skip-gram. Generally, word embeddings are known to solve analogy tasks through simple vector arithmetic. For example, subtracting the vector for man from that of king and then adding the vector for woman yields a point that lies closer to queen in the embedding space. In this study, we demonstrate that BioConceptVec embeddings, along with our own embeddings trained on PubMed abstracts, contain information about drug-gene relations and can predict target genes from a given drug through analogy computations. We also show that categorizing drugs and genes using biological pathways improves performance. Furthermore, we illustrate that vectors derived from known relations in the past can predict unknown future relations in datasets divided by year. Despite the simplicity of implementing analogy tasks as vector additions, our approach demonstrated performance comparable to that of large language models such as GPT-4 in predicting drug-gene relations.
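The king/queen analogy mentioned in the abstract can be illustrated with a minimal sketch. The three-dimensional vectors, the vocabulary, and the helper names `cosine` and `solve_analogy` below are hypothetical and chosen only for illustration; they are not taken from BioConceptVec or from the authors' code, and real embeddings have hundreds of dimensions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def solve_analogy(a, b, c, embeddings):
    """Answer "a is to b as c is to ?" by finding the word nearest to b - a + c."""
    target = [
        vb - va + vc
        for va, vb, vc in zip(embeddings[a], embeddings[b], embeddings[c])
    ]
    # The query words themselves are excluded from the candidates.
    candidates = {w: v for w, v in embeddings.items() if w not in (a, b, c)}
    return max(candidates, key=lambda w: cosine(target, candidates[w]))


# Toy vectors (assumption): axis 0 = "person", axis 1 = "female", axis 2 = "royal".
toy_embeddings = {
    "man": [1.0, 0.0, 0.0],
    "woman": [1.0, 1.0, 0.0],
    "king": [1.0, 0.0, 1.0],
    "queen": [1.0, 1.0, 1.0],
    "apple": [0.0, 0.5, -1.0],
}

print(solve_analogy("man", "king", "woman", toy_embeddings))  # prints "queen"
```

Replacing the word vectors with drug and gene vectors gives the general shape of the analogy computation the abstract describes, although the exact procedure used in the paper is not specified here.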
Affiliation(s)
- Kiwamu Arakane
  - Institute for Protein Research, Osaka University, Osaka, Japan
- Ken Murakami
  - Research Institute of Molecular Pathology, Vienna BioCenter, Vienna, Austria
- Shou Soeda
  - Institute for Protein Research, Osaka University, Osaka, Japan
- Momose Oyama
  - Kyoto University, Kyoto, Japan
  - RIKEN, Tokyo, Japan
- Mariko Okada
  - Institute for Protein Research, Osaka University, Osaka, Japan
2
Yoon JW, Kim KM, Cho S, Cho MJ, Park S, Hwang D, Kim HR, Park SH, Cho JH, Jeong H, Choi JM. Th1-poised naive CD4 T cell subpopulation reflects anti-tumor immunity and autoimmune disease. Nat Commun 2025; 16:1962. [PMID: 40000667 PMCID: PMC11861895 DOI: 10.1038/s41467-025-57237-3] [Received: 08/05/2024] [Accepted: 02/13/2025] [Indexed: 02/27/2025]
Abstract
Naive CD4 T cells are traditionally viewed as a quiescent, homogeneous, resting population, but emerging evidence reveals their heterogeneity, which can be crucial for understanding disease contexts and therapeutic outcomes. In this study, we identify distinct subpopulations within both murine and human naive CD4 T cells by single-cell RNA sequencing (scRNA-seq), particularly focusing on a subpopulation that expresses super-high levels of interleukin-7 receptor (IL-7Rsup-hi), along with CD97, IL-18R, and Ly6C. This subpopulation, absent in the thymus and peripherally induced, exhibits type 1 helper T cell (Th1)-poised characteristics and contributes to the inhibition of cancer progression in B16F10 tumor-bearing mice. In humans, this IL-7Rsup-hi subpopulation expressing CD97 correlates with the responsiveness to anti-PD-1 therapy in cancer patients and the disease state of multiple sclerosis. By elucidating the heterogeneity of naive CD4 T cells and identifying a Th1-poised subpopulation capable of robust type 1 responses, we highlight the importance of this heterogeneity in inflammatory conditions for defining disease states and predicting drug responsiveness.
Affiliation(s)
- Jae-Won Yoon
  - Department of Life Science, College of Natural Sciences, Hanyang University, Seoul, 04763, Republic of Korea
- Kyung Min Kim
  - Department of Biological Sciences, Seoul National University, Seoul, Korea
- Sookyung Cho
  - Department of Life Science, College of Natural Sciences, Hanyang University, Seoul, 04763, Republic of Korea
- Min-Ji Cho
  - Department of Life Science, College of Natural Sciences, Hanyang University, Seoul, 04763, Republic of Korea
- Seonjun Park
  - Department of Biological Sciences, Ulsan National Institute of Science & Technology (UNIST), Ulsan, Republic of Korea
- Daehee Hwang
  - Department of Biological Sciences, Seoul National University, Seoul, Korea
- Hye Ryun Kim
  - Division of Medical Oncology, Department of Internal Medicine, Yonsei Cancer Center, Yonsei University College of Medicine, Seoul, 03722, Republic of Korea
- Sung Ho Park
  - Department of Biological Sciences, Ulsan National Institute of Science & Technology (UNIST), Ulsan, Republic of Korea
- Jae-Ho Cho
  - Medical Research Center for Combinatorial Tumor Immunotherapy, Department of Microbiology and Immunology, Chonnam National University Medical School, Hwasun, 58128, Korea
- Hyobin Jeong
  - Department of Systems Biology, College of Life Science and Biotechnology, Yonsei University, Seoul, 03722, South Korea
- Je-Min Choi
  - Department of Life Science, College of Natural Sciences, Hanyang University, Seoul, 04763, Republic of Korea
  - Hanyang Institute of Bioscience and Biotechnology, Hanyang University, Seoul, Republic of Korea
  - Research Institute for Natural Sciences, Hanyang University, Seoul, Republic of Korea
  - Research Institute for Convergence of Basic Sciences, Hanyang University, Seoul, 04763, Republic of Korea
3
Zhu J, Zhang K, Chen Y, Ge X, Wu J, Xu P, Yao J. Progress of single-cell RNA sequencing combined with spatial transcriptomics in tumour microenvironment and treatment of pancreatic cancer. J Transl Med 2024; 22:563. [PMID: 38867230 PMCID: PMC11167806 DOI: 10.1186/s12967-024-05307-3] [Received: 11/27/2023] [Accepted: 05/16/2024] [Indexed: 06/14/2024]
Abstract
In recent years, single-cell analyses have revealed the heterogeneity of the tumour microenvironment (TME) at the genomic, transcriptomic, and proteomic levels, further improving our understanding of the mechanisms of tumour development. Single-cell RNA sequencing (scRNA-seq) technologies allow analysis of the transcriptome at the single-cell level and have unprecedented potential for exploring the characteristics involved in tumour development and progression. These techniques allow analysis of transcript sequences at higher resolution, thereby increasing our understanding of the diversity of cells found in the tumour microenvironment and how these cells interact in complex tumour tissue. Although scRNA-seq has emerged as an important tool for studying the tumour microenvironment in recent years, it cannot be used to analyse spatial information for cells. In this regard, spatial transcriptomics (ST) approaches allow researchers to understand the functions of individual cells in complex multicellular organisms by identifying their physical location in tissue sections. In research on tumour heterogeneity in particular, ST is an excellent complementary approach to scRNA-seq, constituting a new method for further exploration of tumour heterogeneity, and it can also provide unprecedented insight into the development of treatments for pancreatic cancer (PC). In this review, research progress on the tumour microenvironment and treatment of pancreatic cancer is explained on the basis of scRNA-seq and ST analysis methods.
Affiliation(s)
- Jie Zhu
  - Department of Hepatobiliary and Pancreatic Surgery, Northern Jiangsu People's Hospital Affiliated Yangzhou University, Jiangsu Province, China
- Ke Zhang
  - Dalian Medical University, Dalian, China
- Yuan Chen
  - Department of Hepatobiliary and Pancreatic Surgery, Northern Jiangsu People's Hospital Affiliated Yangzhou University, Jiangsu Province, China
- Xinyu Ge
  - Dalian Medical University, Dalian, China
- Junqing Wu
  - Department of Hepatobiliary and Pancreatic Surgery, Northern Jiangsu People's Hospital Affiliated Yangzhou University, Jiangsu Province, China
- Peng Xu
  - Department of Hepatobiliary and Pancreatic Surgery, Northern Jiangsu People's Hospital Affiliated Yangzhou University, Jiangsu Province, China
- Jie Yao
  - Department of Hepatobiliary and Pancreatic Surgery, Northern Jiangsu People's Hospital Affiliated Yangzhou University, Jiangsu Province, China
4
Vural-Ozdeniz M, Calisir K, Acar R, Yavuz A, Ozgur MM, Dalgıc E, Konu O. CAP-RNAseq: an integrated pipeline for functional annotation and prioritization of co-expression clusters. Brief Bioinform 2024; 25:bbad536. [PMID: 38279653 PMCID: PMC10818169 DOI: 10.1093/bib/bbad536] [Received: 10/02/2023] [Revised: 12/04/2023] [Accepted: 12/21/2023] [Indexed: 01/28/2024]
Abstract
Cluster analysis is one of the most widely used exploratory methods for visualization and grouping of gene expression patterns across multiple samples or treatment groups. Although several existing online tools can annotate clusters with functional terms, there is no all-in-one webserver to effectively prioritize genes/clusters using gene essentiality as well as congruency of mRNA-protein expression. Hence, we developed CAP-RNAseq, which enables (1) upload and clustering of bulk RNA-seq data followed by identification, annotation and network visualization of all or selected clusters; and (2) prioritization using DepMap gene essentiality and/or dependency scores as well as the degree of correlation between mRNA and protein levels of genes within an expression cluster. In addition, CAP-RNAseq has an integrated primer design tool for the prioritized genes. Herein, we showed, using comparisons with existing tools and multiple case studies, that CAP-RNAseq can uniquely aid in the discovery of co-expression clusters enriched with essential genes and in the prioritization of novel biomarker genes that exhibit high correlations between their mRNA and protein expression levels. CAP-RNAseq is applicable to RNA-seq data from different contexts, including cancer, and is available at http://konulabapps.bilkent.edu.tr:3838/CAPRNAseq/; the docker image is downloadable from https://hub.docker.com/r/konulab/caprnaseq.
Affiliation(s)
- Kubra Calisir
  - Department of Molecular Biology and Genetics, Bilkent University, Ankara, Türkiye
- Rana Acar
  - Department of Molecular Biology and Genetics, Bilkent University, Ankara, Türkiye
- Aysenur Yavuz
  - Department of Molecular Biology and Genetics, Bilkent University, Ankara, Türkiye
- Mustafa M Ozgur
  - Department of Molecular Biology and Genetics, Bilkent University, Ankara, Türkiye
- Ertugrul Dalgıc
  - Department of Medical Biology, School of Medicine, Zonguldak Bülent Ecevit University, Zonguldak, Türkiye
- Ozlen Konu
  - Department of Neuroscience, Bilkent University, Ankara, Türkiye
  - Department of Molecular Biology and Genetics, Bilkent University, Ankara, Türkiye
5
Hegarty C, Neto N, Cahill P, Floudas A. Computational approaches in rheumatic diseases - Deciphering complex spatio-temporal cell interactions. Comput Struct Biotechnol J 2023; 21:4009-4020. [PMID: 37649712 PMCID: PMC10462794 DOI: 10.1016/j.csbj.2023.08.005] [Received: 04/04/2023] [Revised: 08/04/2023] [Accepted: 08/04/2023] [Indexed: 09/01/2023]
Abstract
Inflammatory arthritides, including rheumatoid arthritis (RA) and psoriatic arthritis (PsA), are clinically and immunologically heterogeneous diseases with no identified cure. Chronic inflammation of the synovial tissue leads to loss of joint function that severely impacts the patient's quality of life, eventually resulting in disability and life-threatening comorbidities. The pathogenesis of synovial inflammation is the consequence of compounded immune and stromal cell interactions influenced by genetic and environmental factors. Deciphering the complexity of the synovial cellular landscape has accelerated primarily due to the utilisation of bulk and single-cell RNA sequencing. In particular, the capacity to generate cell-cell interaction networks could reveal evidence of previously unappreciated processes leading to disease. However, there is currently a lack of universal nomenclature, a result of varied experimental and technological approaches, which complicates the study of synovial inflammation. While spatial transcriptomic analysis that combines anatomical information with transcriptomic data of synovial tissue biopsies promises to provide more insights into disease pathogenesis, in vitro functional assays with single-cell resolution will be required to validate current bioinformatic applications. To provide a comprehensive approach and translate experimental data to clinical practice, a combination of clinical and molecular data with machine learning has the potential to enhance patient stratification and identify individuals at risk of arthritis who would benefit from early therapeutic intervention. This review aims to provide a comprehensive understanding of the effect of computational approaches in deciphering synovial inflammation pathogenesis and to discuss the impact that further experimental and novel computational tools may have on therapeutic target identification and drug development.
Affiliation(s)
- Ciara Hegarty
  - Translational Immunology lab, School of Biotechnology, Dublin City University, Dublin, Ireland
- Nuno Neto
  - Trinity Centre for Biomedical Engineering, Trinity College Dublin, Ireland
- Paul Cahill
  - Vascular Biology lab, School of Biotechnology, Dublin City University, Dublin, Ireland
- Achilleas Floudas
  - Translational Immunology lab, School of Biotechnology, Dublin City University, Dublin, Ireland