1
Doost PA, Moghadam SS, Khezri E, Basem A, Trik M. A new intrusion detection method using ensemble classification and feature selection. Sci Rep 2025; 15:13642. [PMID: 40254667 PMCID: PMC12009975 DOI: 10.1038/s41598-025-98604-w] [Received: 10/07/2024] [Accepted: 04/14/2025] [Indexed: 04/22/2025] Open
Abstract
Intrusion Detection Systems (IDS) play a crucial role in ensuring network security by identifying and mitigating cyber threats. This study introduces a hybrid intrusion detection approach that integrates Convolutional Neural Networks (CNNs) for feature extraction and the Random Forest (RF) algorithm for classification. The proposed method enhances detection accuracy by leveraging CNNs to automatically extract relevant network features, reducing data dimensionality and noise. Subsequently, the RF classifier processes these optimized features to achieve robust and precise intrusion classification. To evaluate the effectiveness of the approach, experiments were conducted on the KDD99 and UNSW-NB15 datasets. The results demonstrate that the proposed model achieves an accuracy of 97% and a precision of over 98%, outperforming traditional machine learning-based IDS solutions. These findings highlight the potential of the proposed hybrid framework as a scalable and efficient cybersecurity solution for real-world network environments.
Affiliation(s)
- Pooyan Azizi Doost
  - Khuzestan Electric Power Distribution Company, Shahid Monsefi Ave, Ahvaz, Amanieh, Iran.
- Edris Khezri
  - Department of Computer Engineering, Boukan Branch, Islamic Azad University, Boukan, Iran.
- Ali Basem
  - Faculty of Engineering, Warith Al Anbiyaa University, Karbala, 56001, Iraq
- Mohammad Trik
  - Department of Computer Engineering, Boukan Branch, Islamic Azad University, Boukan, Iran
2
Lam T, Quach HT, Hall L, Abou Chakra M, Wong AP. A multidisciplinary approach towards modeling of a virtual human lung. NPJ Syst Biol Appl 2025; 11:38. [PMID: 40251169 PMCID: PMC12008392 DOI: 10.1038/s41540-025-00517-x] [Received: 12/27/2024] [Accepted: 04/08/2025] [Indexed: 04/20/2025] Open
Abstract
Integrating biological data with in silico modeling offers the transformative potential to develop virtual human models, or "digital twins." These models hold immense promise for deepening our understanding of diseases and uncovering new therapeutic strategies. This approach is especially valuable for diseases lacking reliable models. Here we review current efforts to model human lung development, highlighting the role of interdisciplinary collaboration and key advances toward a digital lung twin.
Affiliation(s)
- Timothy Lam
  - Program in Developmental, Stem cell and Cancer Biology, Hospital for Sick Children, PGCRL 16-9420, Toronto, ON, Canada
- Henry T Quach
  - Program in Developmental, Stem cell and Cancer Biology, Hospital for Sick Children, PGCRL 16-9420, Toronto, ON, Canada
  - Department of Laboratory Medicine & Pathobiology, University of Toronto, Toronto, ON, Canada
- Lauren Hall
  - Program in Developmental, Stem cell and Cancer Biology, Hospital for Sick Children, PGCRL 16-9420, Toronto, ON, Canada
  - Department of Laboratory Medicine & Pathobiology, University of Toronto, Toronto, ON, Canada
- Maria Abou Chakra
  - Donnelly Centre for Cellular and Biomedical Research, University of Toronto, Toronto, ON, Canada
- Amy P Wong
  - Program in Developmental, Stem cell and Cancer Biology, Hospital for Sick Children, PGCRL 16-9420, Toronto, ON, Canada.
  - Department of Laboratory Medicine & Pathobiology, University of Toronto, Toronto, ON, Canada.
3
Tan C, Yuan Z, Xu F, Xie D. Optimized Feature Selection and Deep Neural Networks to Improve Heart Disease Prediction. Journal of Imaging Informatics in Medicine 2025:10.1007/s10278-025-01435-4. [PMID: 40240654 DOI: 10.1007/s10278-025-01435-4] [Received: 12/14/2024] [Revised: 01/23/2025] [Accepted: 01/30/2025] [Indexed: 04/18/2025]
Abstract
Heart disease remains a significant health threat due to its high mortality rate and increasing prevalence. Early prediction using basic physical markers from routine exams is crucial for timely diagnosis and intervention. However, manual analysis of large datasets can be labor-intensive and error-prone. Our goal is to rapidly and reliably anticipate cardiac disease using a variety of body signs. This research presents a unique model for heart disease prediction. We provide a system for predicting cardiac disease that combines a deep convolutional neural network with a feature selection technique based on LinearSVC. This integrated feature selection method selects a subset of characteristics that are strongly linked with heart disease. We feed these features into the deep convolutional neural network that we constructed. In addition, to improve the speed of the predictor and avoid vanishing or exploding gradients, the network's hyperparameters were tuned using the random search algorithm. The proposed method was evaluated using the UCI and MIT datasets. The predictor is evaluated using a number of indicators, such as accuracy, recall, precision, and F1 score. The results demonstrate that our model attains accuracy rates of 98.16%, 98.2%, 95.38%, and 97.84% in the UCI dataset, with an average MCC score of 90%. These results affirm the efficacy and reliability of the proposed technique to predict heart disease.
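The indicators named in this abstract (accuracy, recall, precision, F1 score, and MCC) can all be derived from a binary confusion matrix. The following is a minimal sketch of those standard definitions, not the authors' evaluation code; the function name `binary_metrics` and the example counts are hypothetical.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Compute standard binary-classification metrics from confusion counts.

    tp/fp/fn/tn are true positives, false positives, false negatives and
    true negatives. Ratios with a zero denominator are reported as 0.0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    # Matthews correlation coefficient: ranges from -1 to +1.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Hypothetical counts, chosen only to illustrate the calculation.
    print(binary_metrics(tp=45, fp=5, fn=5, tn=45))
```

Because MCC uses all four cells of the confusion matrix, it can be noticeably lower than accuracy on the same predictions, which is one reason it is often reported alongside the other indicators.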
Affiliation(s)
- Changming Tan
  - Department of Cardiovascular Surgery, The Second Xiangya Hospital of Central South University, No139 Renmin Road, Changsha, Hunan Province, 410011, People's Republic of China.
- Zhaoshun Yuan
  - Department of Cardiovascular Surgery, The Second Xiangya Hospital of Central South University, No139 Renmin Road, Changsha, Hunan Province, 410011, People's Republic of China
- Feng Xu
  - Department of Endocrinology and Metabolism, The Second Xiangya Hospital of Central South University, No139 Renmin Road, Changsha, Hunan Province, 410011, People's Republic of China
- Dang Xie
  - Weimu (Shanghai) Medical Technology Ltd. No, 4188, Canghai Road, Lingang New Area, Shanghai Free Trade Zone, Shanghai, 201306, China
4
Hsu CY, Buñay Guaman JS, Ved A, Yadav A, Ezhilarasan G, Rameshbabu A, Alkhayyat A, Aulakh D, Choudhury S, Sunori SK, Ranjbar F. Prediction of methane hydrate equilibrium in saline water solutions based on support vector machine and decision tree techniques. Sci Rep 2025; 15:11723. [PMID: 40188155 PMCID: PMC11972364 DOI: 10.1038/s41598-025-95969-w] [Received: 02/10/2025] [Accepted: 03/25/2025] [Indexed: 04/07/2025] Open
Abstract
The formation of clathrate hydrates offers a powerful approach for separating gaseous substances, desalinating seawater, and energy storage at low temperatures. On the other hand, this phenomenon may lead to practical challenges, including the blockage of pipelines, in some industries. Consequently, accurately predicting the equilibrium conditions for clathrate hydrate formation is crucial. This study was undertaken to design reliable models capable of predicting the equilibrium state of methane hydrates in saline water solutions. A comprehensive collection of measured data, consisting of 1051 samples, was assembled from published sources. The prepared databank encompassed the hydrate formation temperature of methane (HFTM) in the presence of 26 different saline water solutions. Machine learning modeling was undertaken through the implementation of Decision Tree (DT) and Support Vector Machine (SVM) approaches. While both models had excellent performance, the latter achieved higher accuracy in estimating the HFTM, with a mean absolute percentage error (MAPE) of 0.26% and a standard deviation (SD) of 0.78% in the validation process. Furthermore, more than 90% of the values predicted by the novel models fell within the ±1% error bound. It was found that the intelligent models also favorably describe the physical variations of HFTM with operational factors. An examination using the Williams plot confirmed the reliability of the gathered data and the suggested estimation techniques. Ultimately, the order of significance of the factors governing the HFTM was clarified using a sensitivity analysis.
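The two error measures quoted in this abstract, MAPE and the share of predictions inside a ±1% error bound, follow directly from paired measured and predicted values. The sketch below shows the usual definitions; the function names and the temperature values (in kelvin) are hypothetical and are not taken from the study's databank.

```python
def mape(measured, predicted):
    """Mean absolute percentage error, in percent.

    Assumes every measured value is nonzero (true for temperatures in kelvin).
    """
    errors = [abs((m - p) / m) for m, p in zip(measured, predicted)]
    return 100.0 * sum(errors) / len(errors)


def fraction_within(measured, predicted, bound_pct=1.0):
    """Fraction of predictions whose relative error is within +/- bound_pct percent."""
    hits = sum(
        1
        for m, p in zip(measured, predicted)
        if abs((m - p) / m) * 100.0 <= bound_pct
    )
    return hits / len(measured)


if __name__ == "__main__":
    # Hypothetical hydrate formation temperatures in kelvin.
    measured = [280.0, 285.0, 290.0, 275.0]
    predicted = [280.7, 284.0, 290.0, 279.0]
    print(f"MAPE = {mape(measured, predicted):.3f}%")
    print(f"within 1% = {fraction_within(measured, predicted):.2f}")
```

Percentage errors depend on the temperature scale: the same absolute deviation gives a much smaller percentage in kelvin than in degrees Celsius, so the unit should be stated whenever MAPE is reported.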
Affiliation(s)
- Chou-Yi Hsu
  - Thunderbird School of Global Management, Arizona State University Tempe Campus, Phoenix, AZ, 85004, USA
- Amit Ved
  - Department of Electrical Engineering, Faculty of Engineering & Technology, Marwadi University Research Center, Marwadi University, Rajkot, 360003, Gujarat, India
- Anupam Yadav
  - Department of Computer engineering and Application, GLA University, Mathura, 281406, India
- G Ezhilarasan
  - Department of Electrical and Electronics Engineering, School of Engineering and Technology, JAIN (Deemed to be University), Bangalore, Karnataka, India
- A Rameshbabu
  - Department of Electrical and Electronics Engineering, Sathyabama Institute of Science and Technology, Chennai, Tamil Nadu, India
- Ahmad Alkhayyat
  - Department of computers Techniques engineering, College of technical engineering, The Islamic University, Najaf, Iraq
  - Department of computers Techniques engineering, College of technical engineering, The Islamic University of Al Diwaniyah, Al Diwaniyah, Iraq
- Damanjeet Aulakh
  - Centre for Research Impact & Outcome, Chitkara University Institute of Engineering and Technology, Chitkara University, Rajpura, 140401, Punjab, India
- Satish Choudhury
  - Department of Electrical & Electronics Engineering, Siksha 'O' Anusandhan (Deemed to be University), Bhubaneswar, 751030, Odisha, India
- S K Sunori
  - Graphic Era Hill University, Bhimtal, Uttarakhand, India
  - Graphic Era Deemed to be University, Dehradun, 248002, Uttarakhand, India
- Fereydoon Ranjbar
  - Department of Chemistry, Islamic Azad University, Najafabad Branch, Isfehan, Iran.
5
Guo J, Liao J, Chen Y, Wen L, Cheng S. New Machine Learning Method for Medical Image and Microarray Data Analysis for Heart Disease Classification. Journal of Imaging Informatics in Medicine 2025:10.1007/s10278-025-01492-9. [PMID: 40169470 DOI: 10.1007/s10278-025-01492-9] [Received: 02/02/2025] [Revised: 03/09/2025] [Accepted: 03/19/2025] [Indexed: 04/03/2025]
Abstract
Microarray technology has become a vital tool in cardiovascular research, enabling the simultaneous analysis of thousands of gene expressions. This capability provides a robust foundation for heart disease classification and biomarker discovery. However, the high dimensionality, noise, and sparsity of microarray data present significant challenges for effective analysis. Gene selection, which aims to identify the most relevant subset of genes, is a crucial preprocessing step for improving classification accuracy, reducing computational complexity, and enhancing biological interpretability. Traditional gene selection methods often fall short in capturing complex, nonlinear interactions among genes, limiting their effectiveness in heart disease classification tasks. In this study, we propose a novel framework that leverages deep neural networks (DNNs) for optimizing gene selection and heart disease classification using microarray data. DNNs, known for their ability to model complex, nonlinear patterns, are integrated with feature selection techniques to address the challenges of high-dimensional data. The proposed method, DeepGeneNet (DGN), combines gene selection and DNN-based classification into a unified framework, ensuring robust performance and meaningful insights into the underlying biological mechanisms. Additionally, the framework incorporates hyperparameter optimization and innovative U-Net segmentation techniques to further enhance computational performance and classification accuracy. These optimizations enable DGN to deliver robust and scalable results, outperforming traditional methods in both predictive accuracy and interpretability. Experimental results demonstrate that the proposed approach significantly improves heart disease classification accuracy compared to other methods. 
By focusing on the interplay between gene selection and deep learning, this work advances the field of cardiovascular genomics, providing a scalable and interpretable framework for future applications.
Affiliation(s)
- Jinglan Guo
  - Department of Medical Laboratory, Affiliated Hospital of Southwest Medical University, Lu Zhou, 646000, Si Chuan, China
- Jue Liao
  - School of Basic Medical Sciences of Southwest Medical University, Lu Zhou, 646000, Si Chuan, China
- Yuanlian Chen
  - Family Planning Service Center, Jiangyang District Maternal and Child Health Hospital, Lu Zhou, 646000, Sichuan, China
- Lisha Wen
  - Family Planning Service Center, Jiangyang District Maternal and Child Health Hospital, Lu Zhou, 646000, Sichuan, China
- Song Cheng
  - Department of Medical Laboratory, Affiliated Hospital of Southwest Medical University, Lu Zhou, 646000, Si Chuan, China.
6
Li S, Hua H, Chen S. Graph neural networks for single-cell omics data: a review of approaches and applications. Brief Bioinform 2025; 26:bbaf109. [PMID: 40091193 PMCID: PMC11911123 DOI: 10.1093/bib/bbaf109] [Received: 12/04/2024] [Revised: 02/09/2025] [Accepted: 02/25/2025] [Indexed: 03/19/2025] Open
Abstract
Rapid advancement of sequencing technologies now allows for the utilization of precise signals at single-cell resolution in various omics studies. However, the massive volume, ultra-high dimensionality, and high sparsity nature of single-cell data have introduced substantial difficulties to traditional computational methods. The intricate non-Euclidean networks of intracellular and intercellular signaling molecules within single-cell datasets, coupled with the complex, multimodal structures arising from multi-omics joint analysis, pose significant challenges to conventional deep learning operations reliant on Euclidean geometries. Graph neural networks (GNNs) have extended deep learning to non-Euclidean data, allowing cells and their features in single-cell datasets to be modeled as nodes within a graph structure. GNNs have been successfully applied across a broad range of tasks in single-cell data analysis. In this survey, we systematically review 107 successful applications of GNNs and their six variants in various single-cell omics tasks. We begin by outlining the fundamental principles of GNNs and their six variants, followed by a systematic review of GNN-based models applied in single-cell epigenomics, transcriptomics, spatial transcriptomics, proteomics, and multi-omics. In each section dedicated to a specific omics type, we have summarized the publicly available single-cell datasets commonly utilized in the articles reviewed in that section, totaling 77 datasets. Finally, we summarize the potential shortcomings of current research and explore directions for future studies. We anticipate that this review will serve as a guiding resource for researchers to deepen the application of GNNs in single-cell omics.
Affiliation(s)
- Sijie Li
  - School of Mathematical Sciences and The Key Laboratory of Pure Mathematics and Combinatorics, Ministry of Education (LPMC), Nankai University, No. 94 Weijin Road, Nankai District, Tianjin 300071, China
- Heyang Hua
  - School of Mathematical Sciences and The Key Laboratory of Pure Mathematics and Combinatorics, Ministry of Education (LPMC), Nankai University, No. 94 Weijin Road, Nankai District, Tianjin 300071, China
- Shengquan Chen
  - School of Mathematical Sciences and The Key Laboratory of Pure Mathematics and Combinatorics, Ministry of Education (LPMC), Nankai University, No. 94 Weijin Road, Nankai District, Tianjin 300071, China
7
Yu W, Lin Z, Lan M, Ou-Yang L. GCLink: a graph contrastive link prediction framework for gene regulatory network inference. Bioinformatics 2025; 41:btaf074. [PMID: 39960893 PMCID: PMC11881698 DOI: 10.1093/bioinformatics/btaf074] [Received: 07/23/2024] [Revised: 01/10/2025] [Accepted: 02/13/2025] [Indexed: 03/06/2025] Open
Abstract
MOTIVATION: Gene regulatory networks (GRNs) unveil the intricate interactions among genes, pivotal in elucidating the complex biological processes within cells. The advent of single-cell RNA-sequencing (scRNA-seq) enables the inference of GRNs at single-cell resolution. However, the majority of current supervised network inference methods concentrate on predicting pairwise gene regulatory interactions, thus failing to fully exploit correlations among all genes and exhibiting limited generalization performance. RESULTS: To address these issues, we propose a graph contrastive link prediction (GCLink) model to infer potential gene regulatory interactions from scRNA-seq data. Based on known gene regulatory interactions and scRNA-seq data, GCLink introduces a graph contrastive learning strategy to aggregate the feature and neighborhood information of genes to learn their representations. This approach reduces the dependence of our model on sample size and enhances its ability to predict potential gene regulatory interactions. Extensive experiments on real scRNA-seq datasets demonstrate that GCLink outperforms other state-of-the-art methods in most cases. Furthermore, by pretraining GCLink on a source cell line with abundant known regulatory interactions and fine-tuning it on a target cell line with a limited number of known interactions, our GCLink model exhibits good performance in GRN inference, demonstrating its effectiveness in inferring GRNs from datasets with limited known interactions. AVAILABILITY AND IMPLEMENTATION: The source code and data are available at https://github.com/Yoyiming/GCLink.
Affiliation(s)
- Weiming Yu
  - Guangdong Provincial Key Laboratory of Intelligent Information Processing and Shenzhen Key Laboratory of Media Security, College of Electronics and Information Engineering, Shenzhen University, Shenzhen 518060, China
- Zerun Lin
  - Guangdong Provincial Key Laboratory of Intelligent Information Processing and Shenzhen Key Laboratory of Media Security, College of Electronics and Information Engineering, Shenzhen University, Shenzhen 518060, China
- Miaofang Lan
  - Guangdong Provincial Key Laboratory of Intelligent Information Processing and Shenzhen Key Laboratory of Media Security, College of Electronics and Information Engineering, Shenzhen University, Shenzhen 518060, China
- Le Ou-Yang
  - Guangdong Laboratory of Machine Perception and Intelligent Computing, Faculty of Engineering, Shenzhen MSU-BIT University, Shenzhen 518116, China
8
Ru Y, Gruninger M, Dou Y. Robust self supervised symmetric nonnegative matrix factorization to the graph clustering. Sci Rep 2025; 15:7350. [PMID: 40025198 PMCID: PMC11873141 DOI: 10.1038/s41598-025-92564-x] [Received: 12/02/2024] [Accepted: 02/28/2025] [Indexed: 03/04/2025] Open
Abstract
Graph clustering is a fundamental task in network analysis, aimed at uncovering meaningful groups of nodes based on structural and attribute-based similarities. Traditional Nonnegative Matrix Factorization (NMF) methods have shown promise in clustering tasks by providing low-dimensional representations of data. However, most existing NMF-based approaches are highly sensitive to noise and outliers, leading to suboptimal performance in real-world scenarios. Additionally, these methods often struggle to capture the underlying nonlinear structures of complex networks, which can significantly impact clustering accuracy. To address these limitations, this paper introduces Robust Self-Supervised Symmetric NMF (R3SNMF) to improve graph clustering. The proposed algorithm leverages a robust principal component model to handle noise and outliers effectively. By incorporating a self-supervised learning mechanism, R3SNMF iteratively refines the clustering process, enhancing the quality of the learned representations and increasing resilience to data imperfections. The symmetric factorization ensures the preservation of network structures, while the self-supervised approach allows the model to adaptively improve its clustering performance over successive iterations. In addition, R3SNMF integrates a graph-boosting method to improve how relationships within the network are represented. Extensive experimental evaluations on various real-world graph datasets demonstrate that R3SNMF outperforms state-of-the-art clustering methods in terms of both accuracy and robustness.
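For orientation, the symmetric factorization mentioned in this abstract is usually built on the standard symmetric NMF objective, written below for a nonnegative similarity (or adjacency) matrix $A$ over $n$ nodes and $k$ clusters. This is the textbook formulation only; the robust and self-supervised terms of R3SNMF are not given in the abstract and are not reproduced here, and the symbols $A$, $H$, $n$, and $k$ are notation assumed for this sketch.

```latex
\min_{H \in \mathbb{R}^{n \times k},\; H \ge 0} \; \left\lVert A - H H^{\top} \right\rVert_F^{2}
```

Each row of $H$ can be read as a soft cluster membership for one node, so a hard clustering is commonly obtained by assigning node $i$ to $\arg\max_j H_{ij}$.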
Affiliation(s)
- Yi Ru
  - Department of Mechanical and Industrial Engineering, University of Toronto, Toronto, M5S 3G8, Canada.
- Michael Gruninger
  - Department of Mechanical and Industrial Engineering, University of Toronto, Toronto, M5S 3G8, Canada
- YangLiu Dou
  - Department of Computer Vision Technology (VIS), Baidu Inc, Shenzhen, 518000, China
9
Wang C, Liu ZP. Diffusion-based generation of gene regulatory networks from scRNA-seq data with DigNet. Genome Res 2025; 35:340-354. [PMID: 39694856 PMCID: PMC11874984 DOI: 10.1101/gr.279551.124] [Received: 05/06/2024] [Accepted: 12/10/2024] [Indexed: 12/20/2024]
Abstract
A gene regulatory network (GRN) intricately encodes the interconnectedness of identities and functionalities of genes within cells, ultimately shaping cellular specificity. Despite decades of endeavors, reverse engineering of GRNs from gene expression profiling data remains a profound challenge, particularly when it comes to reconstructing cell-specific GRNs that are tailored to precise cellular and genetic contexts. Here, we propose a discrete diffusion generation model, called DigNet, capable of generating corresponding GRNs from high-throughput single-cell RNA sequencing (scRNA-seq) data. DigNet embeds the network generation process into a multistep recovery procedure with Markov properties. Each intermediate step has a specific model to recover a portion of the gene regulatory architectures. It thus can ensure compatibility between global network structures and regulatory modules through the unique multistep diffusion procedure. Furthermore, through iMetacell integration and non-Euclidean discrete space modeling, DigNet is robust to the presence of noise in scRNA-seq data and the sparsity of GRNs. Benchmark evaluation results against more than a dozen state-of-the-art network inference methods demonstrate that DigNet achieves superior performance across various single-cell GRN reconstruction experiments. Furthermore, DigNet provides unique insights into the immune response in breast cancer, derived from differential gene regulation identified in T cells. As an open-source software, DigNet offers a powerful and effective tool for generating cell-specific GRNs from scRNA-seq data.
Affiliation(s)
- Chuanyuan Wang
  - Department of Biomedical Engineering, School of Control Science and Engineering, Shandong University, Jinan, Shandong 250061, China
- Zhi-Ping Liu
  - Department of Biomedical Engineering, School of Control Science and Engineering, Shandong University, Jinan, Shandong 250061, China
10
Liu J, Ma L, Ju F, Zhao C, Yu L. SpaCcLink: exploring downstream signaling regulations with graph attention network for systematic inference of spatial cell-cell communication. BMC Biol 2025; 23:44. [PMID: 39939849 PMCID: PMC11823213 DOI: 10.1186/s12915-025-02141-x] [Received: 04/22/2024] [Accepted: 01/23/2025] [Indexed: 02/14/2025] Open
Abstract
BACKGROUND: Cellular communication is vital for the proper functioning of multicellular organisms. A comprehensive analysis of cellular communication demands the consideration not only of the binding between ligands and receptors but also of a series of downstream signal transduction reactions within cells. Thanks to the advancements in spatial transcriptomics technology, we are now able to better decipher the process of cellular communication within the cellular microenvironment. Nevertheless, the majority of existing spatial cell-cell communication algorithms fail to take into account the downstream signals within cells. RESULTS: In this study, we put forward SpaCcLink, a cell-cell communication analysis method that takes into account the downstream influence of individual receptors within cells and systematically investigates the spatial patterns of communication as well as downstream signal networks. Analyses conducted on real datasets derived from humans and mice have demonstrated that SpaCcLink can help in identifying more relevant ligands and receptors, thereby enabling us to systematically decode the downstream genes and signaling pathways that are influenced by cell-cell communication. Comparisons with other methods suggest that SpaCcLink can identify downstream genes that are more closely associated with biological processes and can also discover reliable ligand-receptor relationships. CONCLUSIONS: By means of SpaCcLink, a more profound and all-encompassing comprehension of the mechanisms underlying cellular communication can be achieved, which in turn promotes and deepens our understanding of the intricate complexity within organisms.
Affiliation(s)
- Jingtao Liu
  - School of Computer Science and Technology, Xidian University, Xi'an, 710071, Shaanxi, China
- Litian Ma
  - School of Computer Science and Technology, Xidian University, Xi'an, 710071, Shaanxi, China
- Fen Ju
  - Department of Rehabilitation Medicine, Xijing Hospital, Fourth Military Medical University, Xi'an, 710032, China
- Chenguang Zhao
  - Department of Rehabilitation Medicine, Xijing Hospital, Fourth Military Medical University, Xi'an, 710032, China.
- Liang Yu
  - School of Computer Science and Technology, Xidian University, Xi'an, 710071, Shaanxi, China.
11
Sun Y, Gao J. HGATLink: single-cell gene regulatory network inference via the fusion of heterogeneous graph attention networks and transformer. BMC Bioinformatics 2025; 26:49. [PMID: 39934680 PMCID: PMC11817978 DOI: 10.1186/s12859-025-06071-x] [Received: 08/15/2024] [Accepted: 01/29/2025] [Indexed: 02/13/2025] Open
Abstract
BACKGROUND: Gene regulatory networks (GRNs) involve complex regulatory relationships between genes and play important roles in the study of various biological systems and diseases. The introduction of single-cell RNA sequencing (scRNA-seq) technology has allowed gene regulation studies to be carried out on specific cell types, providing the opportunity to accurately infer gene regulatory networks. However, the sparsity and noise of single-cell sequencing data pose challenges for gene regulatory network inference, and although many inference methods have been proposed, they often fail to eliminate transitive interactions or do not address multilevel relationships and nonlinear features in the graph data well. RESULTS: To address these limitations, we propose a gene regulatory network inference framework named HGATLink. HGATLink combines a heterogeneous graph attention network and a simplified transformer to capture complex interactions between genes effectively in low-dimensional space via matrix decomposition techniques, which not only enhances the ability to model complex heterogeneous graph structures and alleviates transitive interactions, but also effectively captures the long-range dependencies between genes to ensure more accurate prediction. CONCLUSIONS: Compared with 10 state-of-the-art GRN inference methods on 14 scRNA-seq datasets under two metrics, AUROC and AUPRC, HGATLink shows good stability and accuracy in gene regulatory network inference tasks.
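AUROC, one of the two metrics named in this abstract, has a simple rank interpretation: it is the probability that a randomly chosen true regulatory edge receives a higher score than a randomly chosen non-edge, with ties counted as one half. The sketch below computes it by direct pairwise comparison; the function name `auroc` and the example labels and scores are hypothetical, and the quadratic loop is meant for illustration rather than for large benchmarks.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: iterable of 0/1 ground-truth values (1 = true regulatory edge).
    scores: iterable of predicted scores, higher meaning more likely an edge.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0  # positive ranked above negative
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical candidate edges with predicted scores.
    labels = [1, 0, 1, 0]
    scores = [0.9, 0.8, 0.4, 0.1]
    print(auroc(labels, scores))
```

GRN benchmarks are typically highly imbalanced, with far more non-edges than true edges, which is why AUPRC is commonly reported next to AUROC.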
Affiliation(s)
- Yao Sun
  - Department of Computer Science and Technology, College of Computer and Information Engineering, Inner Mongolia Agricultural University, Hohhot, 010011, Inner Mongolia, China
  - Inner Mongolia Autonomous Region Key Laboratory of Big Data Research, Hohhot, 010018, Inner Mongolia, China
- Jing Gao
  - Department of Computer Science and Technology, College of Computer and Information Engineering, Inner Mongolia Agricultural University, Hohhot, 010011, Inner Mongolia, China.
  - Inner Mongolia Autonomous Region Key Laboratory of Big Data Research, Hohhot, 010018, Inner Mongolia, China.
12
Ma L, Liu J, Sun W, Zhao C, Yu L. scMFG: a single-cell multi-omics integration method based on feature grouping. BMC Genomics 2025; 26:132. [PMID: 39934664 PMCID: PMC11817349 DOI: 10.1186/s12864-025-11319-0] [Received: 08/20/2024] [Accepted: 02/03/2025] [Indexed: 02/13/2025] Open
Abstract
BACKGROUND: Recent advancements in methodologies and technologies have enabled the simultaneous measurement of multiple omics data, which provides a comprehensive understanding of cellular heterogeneity. However, existing methods have limitations in accurately identifying cell types while maintaining model interpretability, especially in the presence of noise. METHODS: We propose a novel method called scMFG, which leverages feature grouping and group integration techniques for the integration of single-cell multi-omics data. scMFG organizes features with similar characteristics within each omics layer through feature grouping. Furthermore, scMFG ensures a consistent feature grouping approach across different omics layers, promoting comparability of diverse data types. Additionally, scMFG incorporates a matrix factorization-based approach to ensure that the integrated results remain interpretable. RESULTS: We comprehensively evaluated scMFG's performance on four complex real-world datasets generated using diverse sequencing technologies, highlighting its robustness in accurately identifying cell types. Notably, scMFG exhibited superior performance in deciphering cellular heterogeneity at a finer resolution compared to existing methods when applied to simulated datasets. Furthermore, our method proved highly effective in identifying rare cell types, showcasing its robust performance and suitability for detecting low-abundance cellular populations. The interpretability of scMFG was successfully validated through its specific association of outputs with specific cell types or states observed in the neonatal mouse cerebral cortices dataset. Moreover, we demonstrated that scMFG is capable of identifying cell developmental trajectories even in datasets with batch effects. CONCLUSIONS: Our work presents a robust framework for the analysis of single-cell multi-omics data, advancing our understanding of cellular heterogeneity in a comprehensive and interpretable manner.
Affiliation(s)
- Litian Ma
- School of Computer Science and Technology, Xidian University, Xi'an, Shaanxi, 710071, China
| | - Jingtao Liu
- School of Computer Science and Technology, Xidian University, Xi'an, Shaanxi, 710071, China
| | - Wei Sun
- Department of Rehabilitation Medicine, Xijing Hospital, Fourth Military Medical University, Xi'an, 710032, China
| | - Chenguang Zhao
- Department of Rehabilitation Medicine, Xijing Hospital, Fourth Military Medical University, Xi'an, 710032, China.
| | - Liang Yu
- School of Computer Science and Technology, Xidian University, Xi'an, Shaanxi, 710071, China.
| |
|
13
|
Yang M, Xiao J, Zhong Q. Modeling of parameters affecting the removal of chromium using polysulfone/graphene oxide membrane via response surface methodology. Environmental Monitoring and Assessment 2025; 197:180. [PMID: 39841272 DOI: 10.1007/s10661-025-13616-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2024] [Accepted: 01/03/2025] [Indexed: 01/23/2025]
Abstract
In this study, an efficient membrane composed of polysulfone and graphene oxide was developed and evaluated for its efficacy in chromium adsorption. Characterization of the synthesized membrane involved scanning electron microscopy (SEM), transmission electron microscopy (TEM), thermogravimetric analysis (TGA) and Fourier-transform infrared spectroscopy (FTIR) to assess its structural properties. Subsequently, the membrane's performance in removing chromium from aqueous solutions was examined with respect to key operational parameters. Response surface methodology (RSM) based on central composite design (CCD) was employed to optimize these parameters. Among them, pH had the most significant effect (F-value = 184.25) on the amount of chromium removed by the membrane process. The interaction between pH and contact time was the most significant among all interactions, with an F-value of 40.99. Moreover, the high R² (97.58%) and adjusted R² (95.41%) indicate that the model explains the variance effectively with minimal overfitting, supporting its predictive capability. Under optimized conditions (pH 5, initial concentration of 30 mg/L, and contact time of 40 min), the polysulfone/graphene oxide membrane exhibited a removal efficiency of 81.1%. This study highlights the potential of polysulfone/graphene oxide membranes for separating chromium from aqueous media, suggesting a promising avenue for future research on heavy metal pollution.
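The relationship between the R² and adjusted R² values quoted in the abstract can be made concrete with the standard adjustment formula, adjusted R² = 1 − (1 − R²)(n − 1)/(n − p − 1). The sketch below is only illustrative: the abstract does not state the number of experimental runs or model terms, so `n_runs=20` and `n_terms=9` (a common three-factor central composite design with a full quadratic model) are assumptions, and `adjusted_r2` is a hypothetical helper name.

```python
def adjusted_r2(r2, n_runs, n_terms):
    """Adjusted R^2 for a model with n_terms predictors (intercept excluded)."""
    dof = n_runs - n_terms - 1  # residual degrees of freedom
    if dof <= 0:
        raise ValueError("need more runs than model terms plus the intercept")
    return 1.0 - (1.0 - r2) * (n_runs - 1) / dof

# R^2 comes from the abstract; n_runs and n_terms are ASSUMED, not reported there.
r2_reported = 0.9758
adj = adjusted_r2(r2_reported, n_runs=20, n_terms=9)
print(f"adjusted R^2 under the assumed design: {adj:.4f}")
```

Under these assumed values the result is close to the reported adjusted R², which is why a small gap between the two statistics is usually read as a sign of limited overfitting.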
Affiliation(s)
- Minge Yang
- School of Metallurgy and Environment, Central South University, Changsha, 410083, Hunan, China
- Department of Science and Technology, Hunan Automotive Engineering Vocational University, Zhuzhou, 412001, Hunan, China
| | - Jin Xiao
- School of Metallurgy and Environment, Central South University, Changsha, 410083, Hunan, China
| | - Qifan Zhong
- School of Metallurgy and Environment, Central South University, Changsha, 410083, Hunan, China.
| |
|
14
|
Cao G, Chen D. Unveiling Long Non-coding RNA Networks from Single-Cell Omics Data Through Artificial Intelligence. Methods Mol Biol 2025; 2883:257-279. [PMID: 39702712 DOI: 10.1007/978-1-0716-4290-0_11] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/21/2024]
Abstract
Single-cell omics technologies have revolutionized the study of long non-coding RNAs (lncRNAs), offering unprecedented resolution in elucidating their expression dynamics, cell-type specificity, and associated gene regulatory networks (GRNs). Concurrently, the integration of artificial intelligence (AI) methodologies has significantly advanced our understanding of lncRNA functions and their implications in disease pathogenesis. This chapter discusses the progress in single-cell omics data analysis, emphasizing its pivotal role in unraveling the molecular mechanisms underlying cellular heterogeneity and the associated regulatory networks involving lncRNAs. Additionally, we provide a summary of single-cell omics resources and AI models for constructing single-cell gene regulatory networks (scGRNs). Finally, we explore the challenges and prospects of studying scGRNs in the context of lncRNA biology.
Affiliation(s)
- Guangshuo Cao
- State Key Laboratory of Pharmaceutical Biotechnology, School of Life Sciences, Nanjing University, Nanjing, China
| | - Dijun Chen
- State Key Laboratory of Pharmaceutical Biotechnology, School of Life Sciences, Nanjing University, Nanjing, China.
| |
|
15
|
Weng G, Martin P, Kim H, Won KJ. Integrating Prior Knowledge Using Transformer for Gene Regulatory Network Inference. Advanced Science (Weinheim, Baden-Wurttemberg, Germany) 2025; 12:e2409990. [PMID: 39605181 PMCID: PMC11744656 DOI: 10.1002/advs.202409990] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2024] [Revised: 10/23/2024] [Indexed: 11/29/2024]
Abstract
Gene regulatory network (GRN) inference, a process of reconstructing gene regulatory rules from experimental data, has the potential to discover new regulatory rules. However, existing methods often struggle to generalize across diverse cell types and account for unseen regulators. This work presents GRNPT, a novel Transformer-based framework that integrates large language model (LLM) embeddings from publicly accessible biological data and a temporal convolutional network (TCN) autoencoder to capture regulatory patterns from single-cell RNA sequencing (scRNA-seq) trajectories. GRNPT significantly outperforms both supervised and unsupervised methods in inferring GRNs, particularly when training data is limited. Notably, GRNPT exhibits exceptional generalizability, accurately predicting regulatory relationships in previously unseen cell types and even for unseen regulators. By combining the ability of LLMs to distill biological knowledge from text with deep learning methods that capture complex patterns in gene expression data, GRNPT overcomes the limitations of traditional GRN inference methods and enables a more accurate and comprehensive understanding of gene regulatory dynamics.
Affiliation(s)
- Guangzheng Weng
- Biotech Research and Innovation Centre (BRIC), University of Copenhagen, Ole Maaløes Vej 5, Copenhagen 2200, Denmark
| | - Patrick Martin
- Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA 90069, USA
| | - Hyobin Kim
- Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA 90069, USA
| | - Kyoung Jae Won
- Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA 90069, USA
| |
|
16
|
Cui W, Long Q, Liu W, Fang C, Wang X, Wang P, Zhou Y. Hierarchical Graph Transformer With Contrastive Learning for Gene Regulatory Network Inference. IEEE J Biomed Health Inform 2025; 29:690-699. [PMID: 39401117 DOI: 10.1109/jbhi.2024.3476490] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/05/2025]
Abstract
Gene regulatory networks (GRNs) are crucial for understanding gene regulation and cellular processes. Inferring GRNs helps uncover regulatory pathways, shedding light on the regulation and development of cellular processes. With the rise of high-throughput sequencing and advancements in computational technology, computational models have emerged as cost-effective alternatives to traditional experimental studies. Moreover, the surge in ChIP-seq data for TF-DNA binding has catalyzed the development of graph neural network (GNN)-based methods, greatly advancing GRN inference capabilities. However, most existing GNN-based methods cannot capture long-distance structural semantic correlations arising from transitive interactions. In this paper, we introduce a novel GNN-based model named Hierarchical Graph Transformer with Contrastive Learning for GRN (HGTCGRN) inference. HGTCGRN captures structural semantics using a hierarchical graph Transformer, which introduces a series of gene family nodes representing gene functions as virtual nodes to interact with nodes in the GRNs. These semantic-aware virtual-node embeddings are aggregated to produce node representations with varying emphasis. Additionally, we leverage gene ontology information to construct gene interaction networks for contrastive learning optimization of GRNs. Experimental results demonstrate that HGTCGRN achieves superior performance in GRN inference.
|
17
|
Wu CY, Xu ZX, Li N, Qi DY, Hao ZH, Wu HY, Gao R, Jin YT. Accurately identifying positive and negative regulation of apoptosis using fusion features and machine learning methods. Comput Biol Chem 2024; 113:108207. [PMID: 39265463 DOI: 10.1016/j.compbiolchem.2024.108207] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2024] [Revised: 08/20/2024] [Accepted: 09/06/2024] [Indexed: 09/14/2024]
Abstract
Apoptotic proteins play a crucial role in the apoptosis process, ensuring a balance between cell proliferation and death. Thus, further elucidating the regulatory mechanisms of apoptosis will enhance our understanding of their functions. However, the development of computational methods to accurately identify positive and negative regulation of apoptosis remains a significant challenge. This work proposes a machine learning model based on multi-feature fusion to identify positive and negative regulators of apoptosis. Initially, we constructed a reliable benchmark dataset containing 200 positive-regulation and 241 negative-regulation apoptosis proteins. Subsequently, we developed a classifier that combines the support vector machine (SVM) with pseudo composition of k-spaced amino acid pairs (PseCKSAAP), composition transition distribution (CTD), dipeptide deviation from expected mean (DDE), and PSSM-composition to identify these proteins. Analysis of variance (ANOVA) was employed to select the optimized features that yield the maximum prediction performance. Evaluating the proposed model on independent data yielded an accuracy of 0.781 with an AUROC of 0.837, demonstrating the model's capabilities.
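To give a feel for one of the sequence descriptors named in the abstract, the sketch below computes the plain composition of k-spaced amino acid pairs (CKSAAP) for a single gap size. It is not the pseudo variant (PseCKSAAP) used by the authors, and it is not their code; the function name `cksaap`, the toy sequence, and the choice to skip non-standard residues are assumptions made for illustration.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def cksaap(sequence, k):
    """Frequencies of residue pairs with exactly k residues between them.

    Returns a dict mapping each of the 400 ordered pairs (e.g. "MV")
    to its count divided by the number of pair positions in the sequence.
    """
    n_windows = len(sequence) - k - 1
    if n_windows <= 0:
        raise ValueError("sequence is too short for this gap size")
    counts = {"".join(p): 0 for p in product(AMINO_ACIDS, repeat=2)}
    for i in range(n_windows):
        pair = sequence[i] + sequence[i + k + 1]
        if pair in counts:  # pairs with non-standard residues are skipped
            counts[pair] += 1
    return {pair: c / n_windows for pair, c in counts.items()}

# Hypothetical peptide, used only to exercise the function.
features = cksaap("MKVLAAGIVK", k=1)
print(len(features), features["MV"])
```

In practice such vectors are typically computed for several gap sizes and concatenated with other descriptors before feature selection and classification.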
Affiliation(s)
- Cheng-Yan Wu
- Key Laboratory of Magnetism and Magnetic Materials at Universities of Inner Mongolia Autonomous Region, Baotou Teacher's College, Baotou 014010, China.
| | - Zhi-Xue Xu
- Key Laboratory of Magnetism and Magnetic Materials at Universities of Inner Mongolia Autonomous Region, Baotou Teacher's College, Baotou 014010, China.
| | - Nan Li
- Key Laboratory of Magnetism and Magnetic Materials at Universities of Inner Mongolia Autonomous Region, Baotou Teacher's College, Baotou 014010, China.
| | - Dan-Yang Qi
- Key Laboratory of Magnetism and Magnetic Materials at Universities of Inner Mongolia Autonomous Region, Baotou Teacher's College, Baotou 014010, China.
| | - Zhi-Hong Hao
- Key Laboratory of Magnetism and Magnetic Materials at Universities of Inner Mongolia Autonomous Region, Baotou Teacher's College, Baotou 014010, China.
| | - Hong-Ye Wu
- Key Laboratory of Magnetism and Magnetic Materials at Universities of Inner Mongolia Autonomous Region, Baotou Teacher's College, Baotou 014010, China.
| | - Ru Gao
- The People's Hospital of Wenjiang, Chengdu, Sichuan 611130, China.
| | - Yan-Ting Jin
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China.
| |
|
18
|
Yu S, Liu L, Wang H, Yan S, Zheng S, Ning J, Luo R, Fu X, Deng X. AtML: An Arabidopsis thaliana root cell identity recognition tool for medicinal ingredient accumulation. Methods 2024; 231:61-69. [PMID: 39293728 DOI: 10.1016/j.ymeth.2024.09.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2024] [Revised: 08/05/2024] [Accepted: 09/12/2024] [Indexed: 09/20/2024] Open
Abstract
Arabidopsis thaliana synthesizes various medicinal compounds, and serves as a model plant for medicinal plant research. Single-cell transcriptomics technologies are essential for understanding the developmental trajectory of plant roots, facilitating the analysis of synthesis and accumulation patterns of medicinal compounds in different cell subpopulations. Although methods for interpreting single-cell transcriptomics data are rapidly advancing in Arabidopsis, challenges remain in precisely annotating cell identity due to the lack of marker genes for certain cell types. In this work, we trained a machine learning system, AtML, using sequencing datasets from six cell subpopulations, comprising a total of 6000 cells, to predict Arabidopsis root cell stages and identify biomarkers through complete model interpretability. Performance testing using an external dataset revealed that AtML achieved 96.50% accuracy and 96.51% recall. Through the interpretability provided by AtML, our model identified 160 important marker genes, contributing to the understanding of cell type annotations. In conclusion, we trained AtML to efficiently identify Arabidopsis root cell stages, providing a new tool for elucidating the mechanisms of medicinal compound accumulation in Arabidopsis roots.
Affiliation(s)
- Shicong Yu
- State Key Laboratory of Crop Gene Exploration and Utilization in Southwest China, Rice Research Institute, Sichuan Agricultural University, Chengdu 611130, China
| | - Lijia Liu
- Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing 100081, China
| | - Hao Wang
- Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing 100081, China
| | - Shen Yan
- Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing 100081, China
| | - Shuqin Zheng
- State Key Laboratory of Crop Gene Exploration and Utilization in Southwest China, Rice Research Institute, Sichuan Agricultural University, Chengdu 611130, China
| | - Jing Ning
- State Key Laboratory of Crop Gene Exploration and Utilization in Southwest China, Rice Research Institute, Sichuan Agricultural University, Chengdu 611130, China
| | - Ruxian Luo
- State Key Laboratory of Crop Gene Exploration and Utilization in Southwest China, Rice Research Institute, Sichuan Agricultural University, Chengdu 611130, China
| | - Xiangzheng Fu
- Research Institute of Hunan University in Chongqing, Chongqing 401120, China.
| | - Xiaoshu Deng
- State Key Laboratory of Crop Gene Exploration and Utilization in Southwest China, Rice Research Institute, Sichuan Agricultural University, Chengdu 611130, China; Chongqing Academy of Chinese Materia Medica, Chongqing 400065, China.
| |
|
19
|
Sun Y, Pan Z, Wang Z, Wang H, Wei L, Cui F, Zou Q, Zhang Z. Single-cell transcriptome analysis reveals immune microenvironment changes and insights into the transition from DCIS to IDC with associated prognostic genes. J Transl Med 2024; 22:894. [PMID: 39363164 PMCID: PMC11448450 DOI: 10.1186/s12967-024-05706-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2024] [Accepted: 09/25/2024] [Indexed: 10/05/2024] Open
Abstract
BACKGROUND Ductal carcinoma in situ (DCIS) of the breast is an early stage of breast cancer, and preventing its progression to invasive ductal carcinoma (IDC) is crucial for the early detection and treatment of breast cancer. Although single-cell transcriptome analysis technology has been widely used in breast cancer research, the biological mechanisms underlying the transition from DCIS to IDC remain poorly understood. RESULTS We identified eight cell types through cell annotation, finding significant differences in T cell proportions between DCIS and IDC. Using this as a basis, we performed pseudotime analysis on T cell subpopulations, revealing that differentially expressed genes primarily regulate immune cell migration and modulation. By intersecting WGCNA results of T cells highly correlated with the subtypes and the differentially expressed genes, we identified six key genes: FGFBP2, GNLY, KLRD1, TYROBP, PRF1, and NKG7. Excluding PRF1, the other five genes were significantly associated with overall survival in breast cancer, highlighting their potential as prognostic biomarkers. CONCLUSIONS We identified immune cells that may play a role in the progression from DCIS to IDC and uncovered five key genes that can serve as prognostic markers for breast cancer. These findings provide insights into the mechanisms underlying the transition from DCIS to IDC, offering valuable perspectives for future research. Additionally, our results contribute to a better understanding of the biological processes involved in breast cancer progression.
MESH Headings
- Humans
- Single-Cell Analysis
- Female
- Tumor Microenvironment/genetics
- Tumor Microenvironment/immunology
- Gene Expression Profiling
- Prognosis
- Carcinoma, Intraductal, Noninfiltrating/genetics
- Carcinoma, Intraductal, Noninfiltrating/immunology
- Carcinoma, Intraductal, Noninfiltrating/pathology
- Breast Neoplasms/genetics
- Breast Neoplasms/immunology
- Breast Neoplasms/pathology
- Gene Expression Regulation, Neoplastic
- Carcinoma, Ductal, Breast/genetics
- Carcinoma, Ductal, Breast/pathology
- Carcinoma, Ductal, Breast/immunology
- Disease Progression
- Transcriptome/genetics
- Single-Cell Gene Expression Analysis
Affiliation(s)
- Yidi Sun
- School of Computer Science and Technology, Hainan University, Haikou, 570228, China
| | - Zhuoyu Pan
- International Business School, Hainan University, Haikou, 570228, China
| | - Ziyi Wang
- School of Computer Science and Technology, Hainan University, Haikou, 570228, China
| | - Haofei Wang
- School of Computer Science and Technology, Hainan University, Haikou, 570228, China
| | - Leyi Wei
- Centre for Artificial Intelligence driven Drug Discovery, Faculty of Applied Science, Macao Polytechnic University, Macao SAR, China
- School of Informatics, Xiamen University, Xiamen, China
| | - Feifei Cui
- School of Computer Science and Technology, Hainan University, Haikou, 570228, China.
| | - Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, 610054, China.
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, 324000, China.
| | - Zilong Zhang
- School of Computer Science and Technology, Hainan University, Haikou, 570228, China.
| |
|
20
|
Lodi MK, Chernikov A, Ghosh P. COFFEE: consensus single cell-type specific inference for gene regulatory networks. Brief Bioinform 2024; 25:bbae457. [PMID: 39311699 PMCID: PMC11418232 DOI: 10.1093/bib/bbae457] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/03/2024] [Revised: 07/22/2024] [Accepted: 09/02/2024] [Indexed: 09/26/2024] Open
Abstract
The inference of gene regulatory networks (GRNs) is crucial to understanding the regulatory mechanisms that govern biological processes. GRNs may be represented as edges in a graph, and hence they have been inferred computationally for scRNA-seq data. A wisdom-of-crowds approach that integrates edges from several GRNs into one composite GRN has demonstrated improved performance when compared with individual algorithm implementations on bulk RNA-seq and microarray data. In an effort to extend this approach to scRNA-seq data, we present COFFEE (COnsensus single cell-type speciFic inFerence for gEnE regulatory networks), a Borda voting-based consensus algorithm that integrates information from 10 established GRN inference methods. We conclude that COFFEE has improved performance across synthetic, curated, and experimental datasets when compared with baseline methods. Additionally, we show that a modified version of COFFEE can be leveraged to improve performance on newer cell-type specific GRN inference methods. Overall, our results demonstrate that consensus-based methods with pertinent modifications continue to be valuable for GRN inference at the single-cell level. While COFFEE is benchmarked on 10 algorithms, it is a flexible strategy that can incorporate any set of GRN inference algorithms according to user preference. A Python implementation of COFFEE may be found on GitHub: https://github.com/lodimk2/coffee.
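As a rough illustration of the Borda voting idea mentioned in the abstract, the sketch below merges several ranked edge lists into one consensus ranking. It is not the COFFEE implementation from the linked repository; the function name `borda_consensus`, the toy edge lists, and the scoring convention (with n candidate edges, the top-ranked edge of a method earns n points, the next n - 1, and unlisted edges earn 0) are assumptions made for illustration.

```python
from collections import defaultdict

def borda_consensus(rankings):
    """Combine ranked edge lists with a simple Borda count.

    rankings: list of lists; each inner list holds (regulator, target)
    edges ordered from most to least confident by one method.
    Returns all proposed edges sorted by total Borda score, highest first.
    """
    # Candidate set: every edge proposed by at least one method.
    candidates = {edge for ranking in rankings for edge in ranking}
    n = len(candidates)
    scores = defaultdict(int)
    for ranking in rankings:
        for position, edge in enumerate(ranking):
            # Top-ranked edge earns n points, the next n - 1, and so on.
            scores[edge] += n - position
    # Ties are broken alphabetically so the output is deterministic.
    return sorted(candidates, key=lambda e: (-scores[e], e))

# Hypothetical outputs of three inference methods on a toy network.
method_a = [("TF1", "G1"), ("TF1", "G2"), ("TF2", "G1")]
method_b = [("TF1", "G2"), ("TF1", "G1"), ("TF2", "G3")]
method_c = [("TF1", "G1"), ("TF2", "G3"), ("TF1", "G2")]

consensus = borda_consensus([method_a, method_b, method_c])
print(consensus)
```

Rank-based voting of this kind avoids comparing raw confidence scores across methods, which are generally on different scales.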
Affiliation(s)
- Musaddiq K Lodi
- Integrative Life Sciences, Virginia Commonwealth University, 1000 W Cary St, Richmond, VA 23284, United States
| | - Anna Chernikov
- Center for Biological Data Science, Virginia Commonwealth University, 1015 Floyd Ave, Richmond, VA 23284, United States
| | - Preetam Ghosh
- Department of Computer Science, Virginia Commonwealth University, 401 W Main St, Richmond, VA 23284, United States
| |
|
21
|
Yuan L, Zhao L, Jiang Y, Shen Z, Zhang Q, Zhang M, Zheng CH, Huang DS. scMGATGRN: a multiview graph attention network-based method for inferring gene regulatory networks from single-cell transcriptomic data. Brief Bioinform 2024; 25:bbae526. [PMID: 39417321 PMCID: PMC11484520 DOI: 10.1093/bib/bbae526] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/20/2024] [Revised: 07/09/2024] [Accepted: 10/03/2024] [Indexed: 10/19/2024] Open
Abstract
The gene regulatory network (GRN) plays a vital role in understanding the structure and dynamics of cellular systems, revealing complex regulatory relationships, and exploring disease mechanisms. Recently, deep learning (DL)-based methods have been proposed to infer GRNs from single-cell transcriptomic data and have achieved impressive performance. However, these methods do not fully utilize graph topological information and high-order neighbor information from multiple receptive fields. To overcome these limitations, we propose a novel model based on a multiview graph attention network, namely scMGATGRN, to infer GRNs. scMGATGRN consists mainly of a graph attention network (GAT), a multiview model, and a view-level attention mechanism. The GAT extracts essential features of the gene regulatory network. The multiview model simultaneously utilizes local feature information and high-order neighbor feature information of nodes in the gene regulatory network. The view-level attention mechanism dynamically adjusts the relative importance of node embedding representations and efficiently aggregates node embedding representations from two views. To verify the effectiveness of scMGATGRN, we compared its performance with 10 methods (five shallow learning algorithms and five state-of-the-art DL-based methods) on seven benchmark single-cell RNA sequencing (scRNA-seq) datasets from five cell lines (two in human and three in mouse) with four different kinds of ground-truth networks. The experimental results not only show that scMGATGRN outperforms competing methods but also demonstrate the potential of this model in inferring GRNs. The code and data of scMGATGRN are made freely available on GitHub (https://github.com/nathanyl/scMGATGRN).
Affiliation(s)
- Lin Yuan
- Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center, Qilu University of Technology (Shandong Academy of Sciences), 3501 Daxue Road, 250353, Shandong, China
- Shandong Engineering Research Center of Big Data Applied Technology, Faculty of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), 3501 Daxue Road, 250353, Shandong, China
- Shandong Provincial Key Laboratory of Industrial Network and Information System Security, Shandong Fundamental Research Center for Computer Science, 3501 Daxue Road, 250353, Shandong, China
| | - Ling Zhao
- Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center, Qilu University of Technology (Shandong Academy of Sciences), 3501 Daxue Road, 250353, Shandong, China
- Shandong Engineering Research Center of Big Data Applied Technology, Faculty of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), 3501 Daxue Road, 250353, Shandong, China
- Shandong Provincial Key Laboratory of Industrial Network and Information System Security, Shandong Fundamental Research Center for Computer Science, 3501 Daxue Road, 250353, Shandong, China
| | - Yufeng Jiang
- Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center, Qilu University of Technology (Shandong Academy of Sciences), 3501 Daxue Road, 250353, Shandong, China
- Shandong Engineering Research Center of Big Data Applied Technology, Faculty of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), 3501 Daxue Road, 250353, Shandong, China
- Shandong Provincial Key Laboratory of Industrial Network and Information System Security, Shandong Fundamental Research Center for Computer Science, 3501 Daxue Road, 250353, Shandong, China
| | - Zhen Shen
- School of Computer and Software, Nanyang Institute of Technology, 80 Changjiang Road, 473004, Henan, China
| | - Qinhu Zhang
- Ningbo Institute of Digital Twin, Eastern Institute of Technology, 568 Tongxin Road, 315201, Zhejiang, China
| | - Ming Zhang
- Department of Pediatrics, Zhongshan Hospital Xiamen University, 201 Hubinnan Road, 361004, Fujian, China
| | - Chun-Hou Zheng
- Key Lab of Intelligent Computing and Signal Processing of Ministry of Education, School of Artificial Intelligence, Anhui University, 111 Jiulong Road, 230601, Anhui, China
| | - De-Shuang Huang
- Ningbo Institute of Digital Twin, Eastern Institute of Technology, 568 Tongxin Road, 315201, Zhejiang, China
- Institute for Regenerative Medicine, Medical Innovation Center and State Key Laboratory of Cardiology, Shanghai East Hospital, School of Life Sciences and Technology, Tongji University, 1239 Siping Road, 200123, Shanghai, China
| |
|
22
|
Johnston KG, Grieco SF, Nie Q, Theis FJ, Xu X. Small data methods in omics: the power of one. Nat Methods 2024; 21:1597-1602. [PMID: 39174710 PMCID: PMC12067744 DOI: 10.1038/s41592-024-02390-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/19/2023] [Accepted: 07/24/2024] [Indexed: 08/24/2024]
Abstract
Over the last decade, biology has begun utilizing 'big data' approaches, resulting in large, comprehensive atlases in modalities ranging from transcriptomics to neural connectomics. However, these approaches must be complemented and integrated with 'small data' approaches to efficiently utilize data from individual labs. Integration of smaller datasets with major reference atlases is critical to provide context to individual experiments, and approaches toward integration of large and small data have been a major focus in many fields in recent years. Here we discuss progress in integration of small data with consortium-sized atlases across multiple modalities, and its potential applications. We then examine promising future directions for utilizing the power of small data to maximize the information garnered from small-scale experiments. We envision that, in the near future, international consortia comprising many laboratories will work together to collaboratively build reference atlases and foundation models using small data methods.
Affiliation(s)
- Kevin G Johnston
- Department of Mathematics, University of California, Irvine, Irvine, CA, USA
- Department of Anatomy and Neurobiology, School of Medicine, University of California, Irvine, Irvine, CA, USA
| | - Steven F Grieco
- Department of Anatomy and Neurobiology, School of Medicine, University of California, Irvine, Irvine, CA, USA
- Center for Neural Circuit Mapping, University of California, Irvine, Irvine, CA, USA
| | - Qing Nie
- Department of Mathematics, University of California, Irvine, Irvine, CA, USA.
- Department of Developmental and Cell Biology, University of California, Irvine, Irvine, CA, USA.
| | - Fabian J Theis
- Helmholtz Center Munich-German Research Center for Environmental Health, Institute of Computational Biology, Neuherberg, Germany.
- School of Life Sciences Weihenstephan, Technical University of Munich, Munich, Germany.
- Department of Mathematics, Technical University of Munich, Munich, Germany.
| | - Xiangmin Xu
- Department of Anatomy and Neurobiology, School of Medicine, University of California, Irvine, Irvine, CA, USA.
- Center for Neural Circuit Mapping, University of California, Irvine, Irvine, CA, USA.
| |
|
23
|
Xiang L, Rao J, Yuan J, Xie T, Yan H. Single-Cell RNA-Sequencing: Opening New Horizons for Breast Cancer Research. Int J Mol Sci 2024; 25:9482. [PMID: 39273429 PMCID: PMC11395021 DOI: 10.3390/ijms25179482] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2024] [Revised: 08/25/2024] [Accepted: 08/29/2024] [Indexed: 09/15/2024] Open
Abstract
Breast cancer is the most prevalent malignant tumor among women and is highly heterogeneous. Traditional techniques frequently struggle to capture the intricacy and variety of cellular states and interactions within breast cancer. As global precision medicine rapidly advances, single-cell RNA sequencing (scRNA-seq) has become a highly effective technique, revolutionizing breast cancer research by offering unprecedented insights into the cellular heterogeneity and complexity of the disease. This technology facilitates the analysis of gene expression profiles at the single-cell level, uncovering diverse cell types and states within the tumor microenvironment. By dissecting the cellular composition and transcriptional signatures of breast cancer cells, scRNA-seq provides new perspectives for understanding the mechanisms behind tumor therapy, drug resistance and metastasis in breast cancer. In this review, we summarize the working principle and workflow of scRNA-seq and emphasize its major applications and discoveries in breast cancer research, highlighting its impact on our comprehension of breast cancer biology and its potential for guiding personalized treatment strategies.
Affiliation(s)
- Lingyan Xiang
- Department of Pathology, Renmin Hospital of Wuhan University, Wuhan 430060, China
| | - Jie Rao
- Department of Pathology, Renmin Hospital of Wuhan University, Wuhan 430060, China
| | - Jingping Yuan
- Department of Pathology, Renmin Hospital of Wuhan University, Wuhan 430060, China
| | - Ting Xie
- Department of Pathology, Renmin Hospital of Wuhan University, Wuhan 430060, China
| | - Honglin Yan
- Department of Pathology, Renmin Hospital of Wuhan University, Wuhan 430060, China
|
24
|
Cervantes-Pérez SA, Zogli P, Amini S, Thibivilliers S, Tennant S, Hossain MS, Xu H, Meyer I, Nooka A, Ma P, Yao Q, Naldrett MJ, Farmer A, Martin O, Bhattacharya S, Kläver J, Libault M. Single-cell transcriptome atlases of soybean root and mature nodule reveal new regulatory programs that control the nodulation process. Plant Communications 2024; 5:100984. [PMID: 38845198 PMCID: PMC11369782 DOI: 10.1016/j.xplc.2024.100984] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/23/2024] [Revised: 05/21/2024] [Accepted: 06/03/2024] [Indexed: 07/14/2024]
Abstract
The soybean root system is complex. In addition to being composed of various cell types, the soybean root system includes the primary root, the lateral roots, and the nodule, an organ in which mutualistic symbiosis with N-fixing rhizobia occurs. A mature soybean root nodule is characterized by a central infection zone where atmospheric nitrogen is fixed and assimilated by the symbiont, resulting from the close cooperation between the plant cell and the bacteria. To date, the transcriptome of individual cells isolated from developing soybean nodules has been established, but the transcriptomic signatures of cells from the mature soybean nodule have not yet been characterized. Using single-nucleus RNA-seq and Molecular Cartography technologies, we precisely characterized the transcriptomic signature of soybean root and mature nodule cell types and revealed the co-existence of different sub-populations of B. diazoefficiens-infected cells in the mature soybean nodule, including those actively involved in nitrogen fixation and those engaged in senescence. Mining of the single-cell-resolution nodule transcriptome atlas and the associated gene co-expression network confirmed the role of known nodulation-related genes and identified new genes that control the nodulation process. For instance, we functionally characterized the role of GmFWL3, a plasma membrane microdomain-associated protein that controls rhizobial infection. Our study reveals the unique cellular complexity of the mature soybean nodule and helps redefine the concept of cell types when considering the infection zone of the soybean nodule.
Affiliation(s)
- Prince Zogli
- Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE 68503, USA
- Sahand Amini
- Division of Plant Science and Technology, College of Agriculture, Food, and Natural Resources, University of Missouri-Columbia, Columbia, MO 65211, USA; Interdisciplinary Plant Group of Missouri-Columbia, Columbia, MO 65211, USA
- Sandra Thibivilliers
- Division of Plant Science and Technology, College of Agriculture, Food, and Natural Resources, University of Missouri-Columbia, Columbia, MO 65211, USA; Interdisciplinary Plant Group of Missouri-Columbia, Columbia, MO 65211, USA
- Sutton Tennant
- Division of Plant Science and Technology, College of Agriculture, Food, and Natural Resources, University of Missouri-Columbia, Columbia, MO 65211, USA; Interdisciplinary Plant Group of Missouri-Columbia, Columbia, MO 65211, USA
- Md Sabbir Hossain
- Division of Plant Science and Technology, College of Agriculture, Food, and Natural Resources, University of Missouri-Columbia, Columbia, MO 65211, USA; Interdisciplinary Plant Group of Missouri-Columbia, Columbia, MO 65211, USA
- Hengping Xu
- Division of Plant Science and Technology, College of Agriculture, Food, and Natural Resources, University of Missouri-Columbia, Columbia, MO 65211, USA; Interdisciplinary Plant Group of Missouri-Columbia, Columbia, MO 65211, USA
- Ian Meyer
- Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE 68503, USA
- Akash Nooka
- Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE 68503, USA
- Pengchong Ma
- School of Computing, University of Nebraska-Lincoln, Lincoln, NE 68503, USA
- Qiuming Yao
- School of Computing, University of Nebraska-Lincoln, Lincoln, NE 68503, USA
- Michael J Naldrett
- Proteomics and Metabolomics Facility, Center for Biotechnology, University of Nebraska-Lincoln, Lincoln, NE 68588, USA
- Andrew Farmer
- National Center for Genome Resources, Santa Fe, NM 87505, USA
- Olivier Martin
- INRAE, Université Paris-Saclay, Institut des Sciences des Plantes de Paris Saclay, IPS2, Batiment 630 Plateau du Moulon, Rue Noetzlin, 91192 Gif sur Yvette Cedex, France
- Marc Libault
- Division of Plant Science and Technology, College of Agriculture, Food, and Natural Resources, University of Missouri-Columbia, Columbia, MO 65211, USA; Interdisciplinary Plant Group of Missouri-Columbia, Columbia, MO 65211, USA.
|
25
|
Moeckel C, Mouratidis I, Chantzi N, Uzun Y, Georgakopoulos-Soares I. Advances in computational and experimental approaches for deciphering transcriptional regulatory networks: Understanding the roles of cis-regulatory elements is essential, and recent research utilizing MPRAs, STARR-seq, CRISPR-Cas9, and machine learning has yielded valuable insights. Bioessays 2024; 46:e2300210. [PMID: 38715516 PMCID: PMC11444527 DOI: 10.1002/bies.202300210] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Revised: 04/22/2024] [Accepted: 04/23/2024] [Indexed: 05/16/2024]
Abstract
Understanding the influence of cis-regulatory elements on gene regulation poses numerous challenges given complexities stemming from variations in transcription factor (TF) binding, chromatin accessibility, structural constraints, and cell-type differences. This review discusses the role of gene regulatory networks in enhancing understanding of transcriptional regulation and covers construction methods ranging from expression-based approaches to supervised machine learning. Additionally, key experimental methods, including MPRAs and CRISPR-Cas9-based screening, which have significantly contributed to understanding TF binding preferences and cis-regulatory element functions, are explored. Lastly, the potential of machine learning and artificial intelligence to unravel cis-regulatory logic is analyzed. These computational advances have far-reaching implications for precision medicine, therapeutic target discovery, and the study of genetic variations in health and disease.
Affiliation(s)
- Camille Moeckel
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Ioannis Mouratidis
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA, USA
- Nikol Chantzi
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Yasin Uzun
- Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA, USA
- Department of Pediatrics, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Ilias Georgakopoulos-Soares
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA, USA
|
26
|
Si Y, Zou J, Gao Y, Chuai G, Liu Q, Chen L. Foundation models in molecular biology. Biophysics Reports 2024; 10:135-151. [PMID: 39027316 PMCID: PMC11252241 DOI: 10.52601/bpr.2024.240006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2024] [Accepted: 03/04/2024] [Indexed: 07/20/2024] Open
Abstract
Determining correlations between molecules at various levels is an important topic in molecular biology. Large language models have demonstrated a remarkable ability to capture correlations from large amounts of data in natural language processing as well as image generation. Because the correlations they capture can also be applied to a wide range of specific tasks, large language models are also referred to as foundation models. The massive amount of data that exists in molecular biology provides an excellent basis for the development of foundation models, and the recent emergence of such models has pushed the entire field forward. We summarize the foundation models developed based on RNA sequence data, DNA sequence data, protein sequence data, single-cell transcriptome data, and spatial transcriptome data, respectively, and further discuss the research directions for the development of foundation models in molecular biology.
Affiliation(s)
- Yunda Si
- Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Hangzhou 310024, China
- Jiawei Zou
- Shanghai Institute of Biochemistry and Cell Biology, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Shanghai 200031, China
- Yicheng Gao
- Translational Medical Center for Stem Cell Therapy and Institute for Regenerative Medicine, Shanghai East Hospital, Frontier Science Center for Stem Cell Research, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
- Shanghai Research Institute for Intelligent Autonomous Systems, Shanghai 201804, China
- Guohui Chuai
- Translational Medical Center for Stem Cell Therapy and Institute for Regenerative Medicine, Shanghai East Hospital, Frontier Science Center for Stem Cell Research, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
- Shanghai Research Institute for Intelligent Autonomous Systems, Shanghai 201804, China
- Qi Liu
- Translational Medical Center for Stem Cell Therapy and Institute for Regenerative Medicine, Shanghai East Hospital, Frontier Science Center for Stem Cell Research, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
- Shanghai Research Institute for Intelligent Autonomous Systems, Shanghai 201804, China
- Luonan Chen
- Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Hangzhou 310024, China
- Shanghai Institute of Biochemistry and Cell Biology, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Shanghai 200031, China
|
27
|
Gan Y, Yu J, Xu G, Yan C, Zou G. Inferring gene regulatory networks from single-cell transcriptomics based on graph embedding. Bioinformatics 2024; 40:btae291. [PMID: 38810116 PMCID: PMC11142726 DOI: 10.1093/bioinformatics/btae291] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2024] [Revised: 03/06/2024] [Accepted: 05/28/2024] [Indexed: 05/31/2024] Open
Abstract
MOTIVATION: Gene regulatory networks (GRNs) encode gene regulation in living organisms, and have become a critical tool to understand complex biological processes. However, due to the dynamic and complex nature of gene regulation, inferring GRNs from scRNA-seq data is still a challenging task. Existing computational methods usually focus on the close connections between genes, and ignore the global structure and distal regulatory relationships. RESULTS: In this study, we develop a supervised deep learning framework, IGEGRNS, to infer GRNs from scRNA-seq data based on graph embedding. In the framework, contextual information of genes is captured by GraphSAGE, which aggregates gene features and neighborhood structures to generate low-dimensional embeddings for genes. Then, the k most influential nodes in the whole graph are filtered through Top-k pooling. Finally, potential regulatory relationships between genes are predicted by stacking CNNs. Compared with nine competing supervised and unsupervised methods, our method achieves better performance on six time-series scRNA-seq datasets. AVAILABILITY AND IMPLEMENTATION: Our method IGEGRNS is implemented in Python using the PyTorch machine learning library, and it is freely available at https://github.com/DHUDBlab/IGEGRNS.
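The first two stages described in this abstract (neighborhood aggregation, then keeping the k highest-scoring nodes) can be illustrated with a toy sketch. This is not the IGEGRNS code: the function names `sage_mean_layer` and `top_k_pool`, the gene labels, and the fixed projection vector are all hypothetical, and the learned weight matrices that a real GraphSAGE layer or Top-k pooling layer would train are left out.

```python
import math


def sage_mean_layer(features, neighbors):
    """One GraphSAGE-style step with a mean aggregator and no learned weights.

    Each node's new vector is its own vector concatenated with the mean of
    its neighbours' vectors (zeros if the node has no neighbours).
    """
    out = {}
    for node, own in features.items():
        nbrs = neighbors.get(node, [])
        if nbrs:
            mean = [sum(features[n][d] for n in nbrs) / len(nbrs)
                    for d in range(len(own))]
        else:
            mean = [0.0] * len(own)
        out[node] = list(own) + mean
    return out


def top_k_pool(embeddings, projection, k):
    """Keep the k nodes with the largest projection score.

    The score is the dot product with `projection` divided by its norm;
    kept vectors are gated by tanh(score). In a trained model the
    projection vector would be learned; here it is a fixed assumption.
    """
    norm = math.sqrt(sum(p * p for p in projection))
    scores = {node: sum(v * p for v, p in zip(vec, projection)) / norm
              for node, vec in embeddings.items()}
    kept = sorted(scores, key=lambda n: (-scores[n], n))[:k]
    return {n: [math.tanh(scores[n]) * v for v in embeddings[n]] for n in kept}


# Toy graph with three hypothetical genes and 2-dimensional features.
features = {"g1": [1.0, 0.0], "g2": [0.0, 1.0], "g3": [1.0, 1.0]}
neighbors = {"g1": ["g2", "g3"], "g2": ["g1"], "g3": ["g1"]}

embeddings = sage_mean_layer(features, neighbors)
pooled = top_k_pool(embeddings, projection=[1.0, 1.0, 1.0, 1.0], k=2)
print(list(pooled))
```

In the published method, the pooled embeddings are then passed to stacked CNNs to score candidate regulatory pairs; that stage is not sketched here.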
Affiliation(s)
- Yanglan Gan
- School of Computer Science and Technology, Donghua University, Shanghai 201620, China
- Jiacheng Yu
- School of Computer Science and Technology, Donghua University, Shanghai 201620, China
- Guangwei Xu
- School of Computer Science and Technology, Donghua University, Shanghai 201620, China
- Cairong Yan
- School of Computer Science and Technology, Donghua University, Shanghai 201620, China
- Guobing Zou
- School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China
|
28
|
Sun N, Shao H, Zhang Y, Ci B, Yao H, Bai B, Tan T. Establishing a 3D culture system for early organogenesis of monkey embryos ex vivo and single-cell transcriptome analysis of cultured embryos. STAR Protoc 2024; 5:102835. [PMID: 38224493 PMCID: PMC10826423 DOI: 10.1016/j.xpro.2023.102835] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2023] [Revised: 11/30/2023] [Accepted: 12/27/2023] [Indexed: 01/17/2024] Open
Abstract
Creating in vitro culture platforms for monkey embryos is crucial for understanding the initial 4 weeks of early primate embryogenesis. Here, we present a protocol to culture cynomolgus monkey embryos in vitro for 25 days post-fertilization and to delineate the key developmental events of gastrulation and early organogenesis. We describe steps for culturing with a 3D system, immunofluorescence analysis, single-cell RNA sequencing, and bioinformatic analysis. For complete details on the use and execution of this protocol, please refer to Gong et al. (2023).
Affiliation(s)
- Nianqin Sun
- State Key Laboratory of Primate Biomedical Research, Kunming University of Science and Technology, Kunming, Yunnan 650500, China; Institute of Primate Translational Medicine, Kunming University of Science and Technology, Kunming, Yunnan 650500, China; Yunnan Key Laboratory of Primate Biomedical Research, Kunming, Yunnan 650500, China.
- Honglian Shao
- State Key Laboratory of Primate Biomedical Research, Kunming University of Science and Technology, Kunming, Yunnan 650500, China; Institute of Primate Translational Medicine, Kunming University of Science and Technology, Kunming, Yunnan 650500, China; Yunnan Key Laboratory of Primate Biomedical Research, Kunming, Yunnan 650500, China
- Youyue Zhang
- State Key Laboratory of Primate Biomedical Research, Kunming University of Science and Technology, Kunming, Yunnan 650500, China; Institute of Primate Translational Medicine, Kunming University of Science and Technology, Kunming, Yunnan 650500, China; Yunnan Key Laboratory of Primate Biomedical Research, Kunming, Yunnan 650500, China
- Baiquan Ci
- State Key Laboratory of Primate Biomedical Research, Kunming University of Science and Technology, Kunming, Yunnan 650500, China; Institute of Primate Translational Medicine, Kunming University of Science and Technology, Kunming, Yunnan 650500, China; Yunnan Key Laboratory of Primate Biomedical Research, Kunming, Yunnan 650500, China
- Hui Yao
- State Key Laboratory of Primate Biomedical Research, Kunming University of Science and Technology, Kunming, Yunnan 650500, China; Institute of Primate Translational Medicine, Kunming University of Science and Technology, Kunming, Yunnan 650500, China; Yunnan Key Laboratory of Primate Biomedical Research, Kunming, Yunnan 650500, China
- Bing Bai
- State Key Laboratory of Primate Biomedical Research, Kunming University of Science and Technology, Kunming, Yunnan 650500, China; Institute of Primate Translational Medicine, Kunming University of Science and Technology, Kunming, Yunnan 650500, China; Yunnan Key Laboratory of Primate Biomedical Research, Kunming, Yunnan 650500, China.
- Tao Tan
- State Key Laboratory of Primate Biomedical Research, Kunming University of Science and Technology, Kunming, Yunnan 650500, China; Institute of Primate Translational Medicine, Kunming University of Science and Technology, Kunming, Yunnan 650500, China; Yunnan Key Laboratory of Primate Biomedical Research, Kunming, Yunnan 650500, China.
|
29
|
Cui X, Lin Q, Chen M, Wang Y, Wang Y, Wang Y, Tao J, Yin H, Zhao T. Long-read sequencing unveils novel somatic variants and methylation patterns in the genetic information system of early lung cancer. Comput Biol Med 2024; 171:108174. [PMID: 38442557 DOI: 10.1016/j.compbiomed.2024.108174] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/07/2024] [Revised: 01/25/2024] [Accepted: 02/18/2024] [Indexed: 03/07/2024]
Abstract
Lung cancer poses a global health challenge, necessitating advanced diagnostics for improved outcomes. Intensive efforts are ongoing to pinpoint early detection biomarkers, such as genomic variations and DNA methylation, to elevate diagnostic precision. We conducted long-read sequencing on cancerous and adjacent non-cancerous tissues from a patient with lung adenocarcinoma. We identified somatic structural variations (SVs) specific to lung cancer by integrating data from various SV calling methods and differentially methylated regions (DMRs) that were distinct between these two tissue samples, revealing a unique methylation pattern associated with lung cancer. This study discovered over 40,000 somatic SVs and over 180,000 DMRs linked to lung cancer. We identified approximately 700 genes of significant relevance through comprehensive analysis, including genes intricately associated with many lung cancers, such as NOTCH1, SMOC2, CSMD2, and others. Furthermore, we observed that somatic SVs and DMRs were substantially enriched in several pathways, such as axon guidance signaling pathways, which suggests a comprehensive multi-omics impact on lung cancer progression across various biological investigation levels. These datasets can potentially serve as biomarkers for early lung cancer detection and may hold significant value in clinical diagnosis and treatment applications.
Affiliation(s)
- Xinran Cui
- School of Computer Science and Technology, Harbin Institute of Technology, 92 West Da Zhi St, Harbin, Heilongjiang, 150000, China
- Qingyan Lin
- Department of Respiratory and Critical Care, Heilongjiang Provincial Hospital, 405 Gorokhovaya Street, Harbin, Heilongjiang, 150000, China
- Ming Chen
- Institute of Bioinformatics, Harbin Institute of Technology, 92 West Da Zhi St, Harbin, Heilongjiang, 150000, China
- Yidan Wang
- Department of Respiratory and Critical Care, Heilongjiang Provincial Hospital, 405 Gorokhovaya Street, Harbin, Heilongjiang, 150000, China
- Yiwen Wang
- Tanwei College, Tsinghua University, Shuangqing Road, Beijing, 100084, China
- Yadong Wang
- School of Computer Science and Technology, Harbin Institute of Technology, 92 West Da Zhi St, Harbin, Heilongjiang, 150000, China.
- Jiang Tao
- School of Computer Science and Technology, Harbin Institute of Technology, 92 West Da Zhi St, Harbin, Heilongjiang, 150000, China.
- Honglei Yin
- Department of Respiratory and Critical Care, Heilongjiang Provincial Hospital, 405 Gorokhovaya Street, Harbin, Heilongjiang, 150000, China.
- Tianyi Zhao
- School of Medicine, Harbin Institute of Technology, 92 West Da Zhi St, Harbin, Heilongjiang, 150000, China.
|
30
|
Peng L, Yang Y, Yang C, Li Z, Cheong N. HRGCNLDA: Forecasting of lncRNA-disease association based on hierarchical refinement graph convolutional neural network. Mathematical Biosciences and Engineering: MBE 2024; 21:4814-4834. [PMID: 38872515 DOI: 10.3934/mbe.2024212] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2024]
Abstract
Long non-coding RNA (lncRNA) is considered to be a crucial regulator involved in various human biological processes, including the regulation of tumor immune checkpoint proteins. It has great potential as both a cancer biomarker and a therapeutic target. Nevertheless, conventional biological experimental techniques are both resource-intensive and laborious, making it essential to develop an accurate and efficient computational method to facilitate the discovery of potential links between lncRNAs and diseases. In this study, we proposed HRGCNLDA, a computational approach utilizing hierarchical refinement of graph convolutional neural networks for forecasting potential lncRNA-disease associations. This approach effectively addresses the over-smoothing problem that arises from stacking multiple layers of graph convolutional neural networks. Specifically, HRGCNLDA enhances the layer representation during message propagation and node updates, thereby amplifying the contribution of hidden layers that resemble the ego layer while reducing discrepancies. The results of the experiments showed that HRGCNLDA achieved the highest AUC-ROC (area under the receiver operating characteristic curve, AUC for short) and AUC-PR (area under the precision versus recall curve, AUPR for short) values compared to other methods. Finally, to further demonstrate the reliability and efficacy of our approach, we performed case studies on three prevalent human diseases, namely, breast cancer, lung cancer and gastric cancer.
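For readers unfamiliar with the two evaluation metrics named in this abstract, the sketch below computes them on a tiny made-up example. It is a generic illustration and not the evaluation code of HRGCNLDA: the function names and the labels and scores are hypothetical, AUC is computed from its pairwise-ranking definition, and AUPR is approximated by average precision, which is one common estimator among several.

```python
def auc_roc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Average precision: mean of the precision at each positive's rank.

    Tied scores keep their input order, so ties are not handled specially.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    precisions = []
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precisions.append(hits / rank)
    if not precisions:
        raise ValueError("need at least one positive example")
    return sum(precisions) / len(precisions)


# Hypothetical predictions for four candidate lncRNA-disease pairs.
labels = [1, 0, 1, 0]          # 1 = known association, 0 = no known association
scores = [0.9, 0.8, 0.7, 0.1]  # model scores, higher means more likely

print(auc_roc(labels, scores))            # 3 of 4 pairs are ordered correctly
print(average_precision(labels, scores))  # mean of 1/1 and 2/3
```

AUPR is often reported alongside AUC in association prediction because known associations are typically far rarer than unknown ones, and precision-recall curves are more sensitive to that imbalance.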
Affiliation(s)
- Li Peng
- College of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan 411201, China
- Hunan Key Laboratory for Service Computing and Novel Software Technology, Hunan University of Science and Technology, Xiangtan 411201, China
- Yujie Yang
- College of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan 411201, China
- Cheng Yang
- College of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan 411201, China
- Zejun Li
- School of Computer Science and Engineering, Hunan Institute of Technology, Hengyang 421002, China
- Ngai Cheong
- Faculty of Applied Sciences, Macao Polytechnic University, Macau 999078, China
|
31
|
Lodi MK, Chernikov A, Ghosh P. COFFEE: Consensus Single Cell-Type Specific Inference for Gene Regulatory Networks. bioRxiv: the preprint server for biology 2024:2024.01.05.574445. [PMID: 38260386 PMCID: PMC10802453 DOI: 10.1101/2024.01.05.574445] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/24/2024]
Abstract
The inference of gene regulatory networks (GRNs) is crucial to understanding the regulatory mechanisms that govern biological processes. GRNs may be represented as edges in a graph, and hence have been inferred computationally for scRNA-seq data. A wisdom of crowds approach to integrate edges from several GRNs to create one composite GRN has demonstrated improved performance when compared to individual algorithm implementations on bulk RNA-seq and microarray data. In an effort to extend this approach to scRNA-seq data, we present COFFEE (COnsensus single cell-type speciFic inFerence for gEnE regulatory networks), a Borda voting based consensus algorithm that integrates information from 10 established GRN inference methods. We conclude that COFFEE has improved performance across synthetic, curated and experimental datasets when compared to baseline methods. Additionally, we show that a modified version of COFFEE can be leveraged to improve performance on newer cell-type specific GRN inference methods. Overall, our results demonstrate that consensus based methods with pertinent modifications continue to be valuable for GRN inference at the single cell level.
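The Borda voting idea mentioned in this abstract can be sketched in a few lines. This is a generic Borda count and not the COFFEE implementation: the function name `borda_consensus`, the gene labels, and the three example rankings are hypothetical, and the choice to give zero points to an edge that a method does not rank is an assumption made for the sketch.

```python
from collections import defaultdict


def borda_consensus(rankings):
    """Combine several ranked edge lists into one consensus ranking.

    `rankings` is a list of lists, each ordered from most to least
    confident. In a list of n edges, the edge at position i (0-based)
    earns n - 1 - i points; edges absent from a list earn nothing from it.
    Ties in total points are broken by the edge tuple for determinism.
    """
    points = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, edge in enumerate(ranking):
            points[edge] += n - 1 - position
    return sorted(points, key=lambda edge: (-points[edge], edge))


# Three hypothetical inference methods ranking the same candidate edges
# (regulator, target), best first.
method_1 = [("A", "B"), ("B", "C"), ("A", "C")]
method_2 = [("B", "C"), ("A", "B"), ("A", "C")]
method_3 = [("A", "B"), ("A", "C"), ("B", "C")]

consensus = borda_consensus([method_1, method_2, method_3])
print(consensus)  # A->B scores 5, B->C scores 3, A->C scores 1
```

Rank-based voting of this kind sidesteps the fact that different inference methods report edge weights on incomparable scales, since only the ordering within each method is used.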
Affiliation(s)
- Musaddiq K Lodi
- Integrative Life Sciences, Virginia Commonwealth University, Richmond, VA 23284
- Anna Chernikov
- Center for Biological Data Science, Virginia Commonwealth University, Richmond, VA 23284
- Preetam Ghosh
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284
|