1
Wu C, Xie X, Yang X, Du M, Lin H, Huang J. Applications of gene pair methods in clinical research: advancing precision medicine. Molecular Biomedicine 2025; 6:22. [PMID: 40202606 PMCID: PMC11982013 DOI: 10.1186/s43556-025-00263-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2024] [Revised: 03/18/2025] [Accepted: 03/21/2025] [Indexed: 04/10/2025]
Abstract
The rapid evolution of high-throughput sequencing technologies has revolutionized biomedical research, producing vast amounts of gene expression data that hold immense potential for biological discovery and clinical applications. Effectively mining these large-scale, high-dimensional data is crucial for facilitating disease detection, subtype differentiation, and understanding the molecular mechanisms underlying disease progression. However, the conventional paradigm of single-gene profiling, measuring absolute expression levels of individual genes, faces critical limitations in clinical implementation. These include vulnerability to batch effects and platform-dependent normalization requirements. In contrast, emerging approaches analyzing relative expression relationships between gene pairs demonstrate unique advantages. By focusing on binary comparisons of two genes' expression magnitudes, these methods inherently normalize experimental variations while capturing biologically stable interaction patterns. In this review, we systematically evaluate gene pair-based analytical frameworks. We classify eleven computational approaches into two fundamental categories: expression value-based methods quantifying differential expression patterns, and rank-based methods exploiting transcriptional ordering relationships. To bridge methodological development with practical implementation, we establish a reproducible analytical pipeline incorporating feature selection, classifier construction, and model evaluation modules using real-world benchmark datasets from pulmonary tuberculosis studies. These findings position gene pair analysis as a transformative paradigm for mining high-dimensional omics data, with direct implications for precision biomarker discovery and mechanistic studies of disease progression.
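The within-sample comparison idea described in this abstract can be illustrated with a minimal sketch. This is not the pipeline from the review: the gene names, expression values, and the `pair_features` helper are hypothetical, and the batch effect is modeled (as an assumption) by a strictly increasing transform applied to every gene in a sample.

```python
import math

def pair_features(expr, pairs):
    """For each (a, b) pair, return 1 if gene a is expressed above gene b, else 0."""
    return [1 if expr[a] > expr[b] else 0 for a, b in pairs]

# Hypothetical expression values for one sample (placeholder gene names).
sample = {"GENE_A": 120.0, "GENE_B": 45.0, "GENE_C": 300.0}
pairs = [("GENE_A", "GENE_B"), ("GENE_A", "GENE_C"), ("GENE_B", "GENE_C")]

# Assumed stand-in for a platform or batch effect: a strictly increasing
# transform (scaling followed by a log) applied to every gene in the sample.
shifted = {gene: math.log2(3.5 * value + 1.0) for gene, value in sample.items()}

original_bits = pair_features(sample, pairs)
shifted_bits = pair_features(shifted, pairs)

# The absolute values differ, but the within-sample ordering does not,
# so the binary pair features are identical.
print(original_bits, shifted_bits)
```

Because only the ordering of the two genes within a sample is used, any per-sample transform that preserves order leaves the features unchanged. Effects that alter genes differently within the same sample are outside what this sketch shows.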
Affiliation(s)
- Changchun Wu
  - The Clinical Hospital of Chengdu Brain Science Institute, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 611731, China
- Xueqin Xie
  - The Clinical Hospital of Chengdu Brain Science Institute, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 611731, China
- Xin Yang
  - The Clinical Hospital of Chengdu Brain Science Institute, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 611731, China
- Mengze Du
  - School of Healthcare Technology, Chengdu Neusoft University, Chengdu, 611844, China
- Hao Lin
  - The Clinical Hospital of Chengdu Brain Science Institute, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 611731, China
- Jian Huang
  - The Clinical Hospital of Chengdu Brain Science Institute, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 611731, China
2
Chu S, Jiang A, Chen L, Zhang X, Shen X, Zhou W, Ye S, Chen C, Zhang S, Zhang L, Chen Y, Miao Y, Wang W. Machine learning algorithms for predicting the risk of fracture in patients with diabetes in China. Heliyon 2023; 9:e18186. [PMID: 37501989 PMCID: PMC10368844 DOI: 10.1016/j.heliyon.2023.e18186] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/03/2023] [Revised: 07/11/2023] [Accepted: 07/11/2023] [Indexed: 07/29/2023]
Abstract
Background: Patients with diabetes are more prone to fractures than those without diabetes. In clinical practice, predicting fracture risk in patients with diabetes remains difficult because existing fracture prediction tools have limited availability and accessibility in this population. The purpose of this study was to develop and validate models using machine learning (ML) algorithms to achieve high predictive power for fracture in patients with diabetes in China. Methods: The clinical data of 775 hospitalized patients with diabetes were analyzed using Decision Tree (DT), Gradient Boosting Decision Tree (GBDT), Logistic Regression (LR), Random Forest (RF), Support Vector Machine (SVM), eXtreme Gradient Boosting (XGBoost), and Probabilistic Classification Vector Machines (PCVM) algorithms to construct risk prediction models for fractures. Risk factors for diabetes-related fracture were identified by feature selection algorithms. Results: The ML algorithms extracted the 17 most relevant factors from raw clinical data to maximize the accuracy of the prediction results, including bone mineral density, age, sex, weight, high-density lipoprotein cholesterol, height, duration of diabetes, total cholesterol, osteocalcin, N-terminal propeptide of type I, diastolic blood pressure, and body mass index. The seven ML models (LR, SVM, RF, DT, GBDT, XGBoost, and PCVM) had F1 scores of 0.75, 0.83, 0.84, 0.85, 0.87, 0.88, and 0.97, respectively. Conclusions: This study identified the 17 most relevant risk factors for diabetes-related fracture using ML algorithms. The PCVM model performed best in predicting fracture risk in the diabetic population. This work proposes a cheap, safe, and extensible ML algorithm for the precise assessment of risk factors for diabetes-related fracture.
Affiliation(s)
- Sijia Chu
  - Department of Endocrinology, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China
  - Graduate School, Wannan Medical College, Wuhu, China
- Aijun Jiang
  - Department of Endocrinology, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China
- Lyuzhou Chen
  - School of Data Science, University of Science and Technology of China, Hefei, China
- Xi Zhang
  - Department of Endocrinology, The People's Hospital of Chizhou, Chizhou, China
- Wan Zhou
  - Department of Endocrinology, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China
- Shandong Ye
  - Department of Endocrinology, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China
- Chao Chen
  - Department of Endocrinology, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China
- Shilu Zhang
  - Department of Endocrinology, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China
  - Graduate School, Wannan Medical College, Wuhu, China
- Li Zhang
  - Department of Endocrinology, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China
  - Graduate School, Wannan Medical College, Wuhu, China
- Yang Chen
  - Department of Endocrinology, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China
  - Graduate School, Anhui Medical University, Hefei, China
- Ya Miao
  - Institution of Advanced Technology, University of Science and Technology of China, Hefei, China
- Wei Wang
  - Department of Endocrinology, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China
5
Abstract
The Semantic Web emerged as an extension to the traditional Web, adding meaning (semantics) to a distributed Web of structured and linked information. At its core, the concept of ontology provides the means to semantically describe and structure information, and to expose it to software and human agents in a machine- and human-readable form. For software agents to be realized, it is crucial to develop powerful artificial intelligence and machine-learning techniques that can extract knowledge from information sources and represent it in the underlying ontology. This survey aims to provide insight into key aspects of ontology-based knowledge extraction from various sources such as text, databases, and human expertise, realized in the realm of feature selection. First, common classification and feature selection algorithms are presented. Then, selected approaches that use ontologies to represent features and perform feature selection and classification are described. The selected, representative approaches span diverse application domains, such as document classification, opinion mining, manufacturing, recommendation systems, urban management, and information security systems, and they demonstrate the feasibility and applicability of such methods. In addition to the criteria-based presentation of related works, this survey contributes a number of open issues and challenges related to this still-active research topic.