1
Patil AR, Schug J, Liu C, Lahori D, Descamps HC, Naji A, Kaestner KH, Faryabi RB, Vahedi G. Modeling type 1 diabetes progression using machine learning and single-cell transcriptomic measurements in human islets. Cell Rep Med 2024; 5:101535. [PMID: 38677282 PMCID: PMC11148720 DOI: 10.1016/j.xcrm.2024.101535] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2023] [Revised: 01/22/2024] [Accepted: 04/07/2024] [Indexed: 04/29/2024]
Abstract
Type 1 diabetes (T1D) is a chronic condition in which beta cells are destroyed by immune cells. Despite progress in immunotherapies that could delay T1D onset, early detection of autoimmunity remains challenging. Here, we evaluate the utility of machine learning for early prediction of T1D using single-cell analysis of islets. Using gradient-boosting algorithms, we model changes in gene expression of single cells from pancreatic tissues in T1D and non-diabetic organ donors. We assess if mathematical modeling could predict the likelihood of T1D development in non-diabetic autoantibody-positive donors. While most autoantibody-positive donors are predicted to be non-diabetic, select donors with unique gene signatures are classified as T1D. Our strategy also reveals a shared gene signature in distinct T1D-associated models across cell types, suggesting a common effect of the disease on transcriptional outputs of these cells. Our study establishes a precedent for using machine learning in early detection of T1D.
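To make the term "gradient boosting" concrete, the following is a minimal sketch and not the authors' pipeline. It assumes a single hypothetical expression feature per cell, depth-1 trees (stumps), and a squared-error loss on 0/1 labels; the function names, toy values, learning rate, and number of rounds are all illustrative assumptions.

```python
# Toy gradient boosting with decision stumps (all names and data are hypothetical).

def fit_stump(x, residuals):
    """Find the threshold split that best fits the residuals (squared error)."""
    best = None
    for t in sorted(set(x)):
        left = [r for xi, r in zip(x, residuals) if xi <= t]
        right = [r for xi, r in zip(x, residuals) if xi > t]
        if not left or not right:
            continue  # a split must leave samples on both sides
        lv = sum(left) / len(left)
        rv = sum(right) / len(right)
        err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lv, rv)
    return best[1:]


def boost(x, y, rounds=20, lr=0.5):
    """Fit stumps one after another, each on the residuals of the current model."""
    base = sum(y) / len(y)          # start from the mean label
    pred = [base] * len(y)
    stumps = []
    for _ in range(rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        t, lv, rv = fit_stump(x, residuals)
        stumps.append((t, lv, rv))
        pred = [pi + lr * (lv if xi <= t else rv) for xi, pi in zip(x, pred)]
    return base, lr, stumps


def predict(model, xi):
    """Sum the base value and the shrunken contribution of every stump."""
    base, lr, stumps = model
    return base + sum(lr * (lv if xi <= t else rv) for t, lv, rv in stumps)


# Hypothetical per-cell expression values and labels (0 = non-diabetic, 1 = T1D).
expression = [0.1, 0.3, 0.4, 2.0, 2.5, 3.1]
labels = [0, 0, 0, 1, 1, 1]
model = boost(expression, labels)
print(predict(model, 0.2), predict(model, 2.8))
```

In this toy setting a score above 0.5 would be read as "T1D-like"; a real analysis would use many genes per cell and a dedicated library rather than this hand-rolled loop.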
Affiliation(s)
- Abhijeet R Patil
  - Department of Genetics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA; Institute for Immunology and Immune Health, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA; Epigenetics Institute, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA; Institute for Diabetes, Obesity and Metabolism, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
- Jonathan Schug
  - Department of Genetics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA; Epigenetics Institute, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA; Institute for Diabetes, Obesity and Metabolism, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
- Chengyang Liu
  - Institute for Immunology and Immune Health, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA; Department of Surgery, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
- Deeksha Lahori
  - Department of Genetics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA; Epigenetics Institute, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA; Institute for Diabetes, Obesity and Metabolism, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
- Hélène C Descamps
  - Department of Genetics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA; Epigenetics Institute, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA; Institute for Diabetes, Obesity and Metabolism, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
- Ali Naji
  - Institute for Immunology and Immune Health, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA; Institute for Diabetes, Obesity and Metabolism, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA; Department of Surgery, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
- Klaus H Kaestner
  - Department of Genetics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA; Epigenetics Institute, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA; Institute for Diabetes, Obesity and Metabolism, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
- Robert B Faryabi
  - Institute for Immunology and Immune Health, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA; Epigenetics Institute, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA; Department of Pathology and Laboratory Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA; Abramson Family Cancer Research Institute, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
- Golnaz Vahedi
  - Department of Genetics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA; Institute for Immunology and Immune Health, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA; Epigenetics Institute, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA; Institute for Diabetes, Obesity and Metabolism, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA; Abramson Family Cancer Research Institute, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
2
He J, Li M, Qiu J, Pu X, Guo Y. HOPEXGB: A Consensual Model for Predicting miRNA/lncRNA-Disease Associations Using a Heterogeneous Disease-miRNA-lncRNA Information Network. J Chem Inf Model 2024; 64:2863-2877. [PMID: 37604142 DOI: 10.1021/acs.jcim.3c00856] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 08/23/2023]
Abstract
Predicting disease-related microRNAs (miRNAs) and long noncoding RNAs (lncRNAs) is crucial to find new biomarkers for the prevention, diagnosis, and treatment of complex human diseases. Computational predictions for miRNA/lncRNA-disease associations are of great practical significance, since traditional experimental detection is expensive and time-consuming. In this paper, we proposed a consensual machine-learning technique-based prediction approach to identify disease-related miRNAs and lncRNAs by high-order proximity preserved embedding (HOPE) and eXtreme Gradient Boosting (XGB), named HOPEXGB. By connecting lncRNA, miRNA, and disease nodes based on their correlations and relationships, we first created a heterogeneous disease-miRNA-lncRNA (DML) information network to achieve an effective fusion of information on similarities, correlations, and interactions among miRNAs, lncRNAs, and diseases. In addition, a more rational negative data set was generated based on the similarities of unknown associations with the known ones, so as to effectively reduce the false negative rate in the data set for model construction. By 10-fold cross-validation, HOPE shows better performance than other graph embedding methods. The final consensual HOPEXGB model yields robust performance with a mean prediction accuracy of 0.9569 and also demonstrates high sensitivity and specificity advantages compared to lncRNA/miRNA-specific predictions. Moreover, it is superior to other existing methods and gives promising performance on the external testing data, indicating that integrating the information on lncRNA-miRNA interactions and the similarities of lncRNAs/miRNAs is beneficial for improving the prediction performance of the model. Finally, case studies on lung, stomach, and breast cancers indicate that HOPEXGB could be a powerful tool for preclinical biomarker detection and bioexperiment preliminary screening for the diagnosis and prognosis of cancers. 
HOPEXGB is publicly available at https://github.com/airpamper/HOPEXGB.
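The 10-fold cross-validation mentioned in the abstract can be sketched as follows. This is only an illustration of how samples are partitioned into folds, not code from the HOPEXGB repository; the function name, the sample count of 25, and the fixed seed are assumptions made for the example.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists so every sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)        # shuffle once, reproducibly
    folds = [indices[i::k] for i in range(k)]   # k near-equal folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical data set of 25 association pairs.
splits = list(k_fold_indices(25, k=10))
for train, test in splits:
    # A model would be fitted on `train` and scored on `test` here.
    print(len(train), len(test))
```

Averaging the per-fold scores gives a mean figure of the kind reported above; the shuffle matters when the samples are stored in a meaningful order.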
Affiliation(s)
- Jian He
  - College of Chemistry, Sichuan University, Chengdu 610064, China
- Menglong Li
  - College of Chemistry, Sichuan University, Chengdu 610064, China
- Jiangguo Qiu
  - College of Chemistry, Sichuan University, Chengdu 610064, China
- Xuemei Pu
  - College of Chemistry, Sichuan University, Chengdu 610064, China
- Yanzhi Guo
  - College of Chemistry, Sichuan University, Chengdu 610064, China
3
Li R, Zhang W, Shi B, Ma L, Jiang F, Wang X, Li J. A common variant SNP rs1937810 in the MPP7 gene contributes to the susceptibility of breast cancer in the Chinese Han population. Mol Genet Genomic Med 2023; 11:e2198. [PMID: 37194388 PMCID: PMC10496085 DOI: 10.1002/mgg3.2198] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2023] [Revised: 04/23/2023] [Accepted: 05/04/2023] [Indexed: 05/18/2023] Open
Abstract
BACKGROUND: Breast cancer (BC) is a common cancer caused by both environmental and genetic factors. Previous evidence has linked the gene MAGUK P55 Scaffold Protein 7 (MPP7) to BC, but no research has evaluated the relationship between MPP7 genetic polymorphisms and BC susceptibility. We aimed to investigate the potential association of the MPP7 gene with susceptibility to BC in Han Chinese individuals.
METHODS: In total, 1390 patients with BC and 2480 controls were enrolled. For genotyping, 20 tag SNPs were chosen. The serum levels of MPP7 protein were measured in all subjects using an enzyme-linked immunosorbent assay. Genetic association analysis was performed in both genotypic and allelic modes, and the relationship between BC patients' clinical features and genotypes of relevant SNPs was examined. The functional implications of significant markers were also evaluated.
RESULTS: After Bonferroni correction, SNP rs1937810 was significantly associated with the risk of BC (p = 1.19 × 10⁻⁴). The odds of carrying the CC genotype were 49% higher in BC patients than in controls (odds ratio 1.49 [1.23-1.81]). Serum MPP7 protein levels were significantly higher in BC patients than in controls (p < 0.001). The CC genotype had the highest protein level, followed in decreasing order by the CT and TT genotypes (both p < 0.001).
CONCLUSIONS: Our results link SNP rs1937810 to BC susceptibility and to the clinical features of BC patients. This SNP was also shown to be significantly related to the serum level of MPP7 protein in both BC patients and controls.
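Two of the reported numbers can be sanity-checked with a short sketch. It assumes a family-wise alpha of 0.05 for the Bonferroni threshold and that the bracketed interval is a 95% Wald confidence interval; neither is stated in the abstract, and the function names are hypothetical.

```python
import math


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


def log_scale_midpoint_and_se(ci_low, ci_high, z_crit=1.96):
    """For a Wald interval, the OR is the geometric mean of the CI bounds.

    Returns that midpoint and the implied standard error of log(OR).
    """
    midpoint = math.sqrt(ci_low * ci_high)
    log_se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    return midpoint, log_se


# 20 tag SNPs were genotyped, so the assumed threshold is 0.05 / 20.
threshold = bonferroni_threshold(20)
passes = 1.19e-4 < threshold

# Reported odds ratio 1.49 with interval [1.23, 1.81].
midpoint, log_se = log_scale_midpoint_and_se(1.23, 1.81)
print(threshold, passes, round(midpoint, 2), round(log_se, 3))
```

Under these assumptions the reported p-value clears the threshold, and the geometric midpoint of the interval rounds to the reported odds ratio of 1.49, which is what a log-symmetric interval should give.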
Affiliation(s)
- Rong Li
  - Department of Radiotherapy, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China
- Wenpei Zhang
  - Key Laboratory of National Health Commission for Forensic Sciences, Xi'an Jiaotong University Health Science Center, Xi'an, China
- Bohui Shi
  - Department of Breast Surgery, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China
- Li Ma
  - Department of Oncology, The Second Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China
- Fanliu Jiang
  - Key Laboratory of National Health Commission for Forensic Sciences, Xi'an Jiaotong University Health Science Center, Xi'an, China
- Xiaochen Wang
  - Key Laboratory of National Health Commission for Forensic Sciences, Xi'an Jiaotong University Health Science Center, Xi'an, China
- Jieqiong Li
  - Department of Nursing, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China
4
Dhakal P, Tayara H, Chong KT. An ensemble of stacking classifiers for improved prediction of miRNA-mRNA interactions. Comput Biol Med 2023; 164:107242. [PMID: 37473564 DOI: 10.1016/j.compbiomed.2023.107242] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/01/2023] [Revised: 06/21/2023] [Accepted: 07/07/2023] [Indexed: 07/22/2023]
Abstract
MicroRNAs (miRNAs) are small non-coding RNA molecules that play a crucial role in regulating gene expression at the post-transcriptional level by binding to potential target sites of messenger RNAs (mRNAs), facilitated by the Argonaute family of proteins. Selecting conservative candidate target sites (CTS) is a challenging step, because most existing computational algorithms focus primarily on canonical site types, which is time-consuming and makes inefficient use of miRNA target site interactions. We developed a stacking classifier algorithm that addresses the CTS selection criteria using feature-encoding techniques that generate feature vectors, including k-mer nucleotide composition, dinucleotide composition, pseudo-nucleotide composition, and sequence order coupling. This stacking classifier algorithm surpassed previous state-of-the-art algorithms in predicting functional miRNA targets. We evaluated the performance of the proposed model on 10 independent test datasets and obtained an average accuracy of 79.77%, an improvement of 7.26% over previous models. This improvement shows that the proposed method has great potential for distinguishing highly functional miRNA targets and can serve as a valuable tool in biomedical and drug development research.
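Of the feature encodings listed, k-mer nucleotide composition is the simplest to illustrate. The sketch below is an assumption about how such a vector is commonly built (normalised counts over all possible k-mers) and is not taken from the paper; the function name and the example sequence are hypothetical.

```python
from itertools import product


def kmer_composition(seq, k, alphabet="ACGU"):
    """Return the normalised frequency of every possible k-mer, in a fixed order."""
    seq = seq.upper().replace("T", "U")      # accept DNA-style input as well
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:                   # skip windows with ambiguous bases
            counts[kmer] += 1
            total += 1
    return [counts[km] / total if total else 0.0 for km in kmers]


# Arbitrary illustrative sequence, not a real target site.
site = "AUGCAUGGCUA"
# 4 mononucleotide features followed by 16 dinucleotide features.
features = kmer_composition(site, 1) + kmer_composition(site, 2)
print(len(features))
```

With k = 2 the same function yields the dinucleotide composition, so the two encodings can share one implementation and be concatenated into a single feature vector for a classifier.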
Affiliation(s)
- Priyash Dhakal
  - Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju-si, 54896, Jeollabuk-do, Republic of Korea
- Hilal Tayara
  - School of International Engineering and Science, Jeonbuk National University, Jeonju-si, 54896, Jeollabuk-do, Republic of Korea
- Kil To Chong
  - Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju-si, 54896, Jeollabuk-do, Republic of Korea; Advanced Electronics and Information Research Center, Jeonbuk National University, Jeonju-si, 54896, Jeollabuk-do, Republic of Korea
6
Ma B, Zhang W, Wang X, Jiang H, Tang L, Yang W, Kang Q, Cao J. Polymorphisms in TRIB2 and CAPRIN2 Genes Contribute to the Susceptibility to High Myopia-Induced Cataract in Han Chinese Population. Med Sci Monit 2023; 29:e937702. [PMID: 36710479 PMCID: PMC9896844 DOI: 10.12659/msm.937702] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/11/2023] Open
Abstract
BACKGROUND: Myopia has been shown to be associated with many pathological complications including cataracts, and previous evidence supports that high myopia facilitates the formation of cataracts. However, no studies have identified a link between the genetic susceptibility of high myopia-induced cataracts (HMC) and the underlying genetic mechanisms. Our study aimed to determine how the TRIB2 and CAPRIN2 genes relate to the risk of HMC.
MATERIAL AND METHODS: In total, we recruited 3162 participants, including 1026 participants with high myopia and cataracts and 2136 controls with high myopia only. For genotyping, 22 tag single nucleotide polymorphisms (SNPs) in the TRIB2 and CAPRIN2 genes were chosen. Single-marker association analysis was carried out, and the functional effects of significant SNPs were evaluated.
RESULTS: Strong association signals were captured for SNP rs890069 (χ²=22.13, P=2.55×10⁻⁶) in TRIB2 and SNP rs17739338 (χ²=16.07, P=6.10×10⁻⁵) in CAPRIN2. In patients with high myopia, the C allele at SNP rs890069 was strongly linked to cataract risk (OR [95% CI]=1.36 [1.20-1.55]), and the T allele at SNP rs17739338 was significantly related to a lower risk of cataract (OR [95% CI]=0.54 [0.40-0.74]). In different types of human tissues, SNPs rs890069 and rs17739338 were significantly correlated with the levels of TRIB2 and CAPRIN2 gene expression.
CONCLUSIONS: Our study indicates that both the TRIB2 and CAPRIN2 genes confer susceptibility to cataract in patients of Han Chinese ancestry with high myopia. Further research is needed to fully understand the pathogenic mechanisms and genetic characteristics of cataract.
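The reported P values follow from the χ² statistics if each test has one degree of freedom, which is an assumption here (the abstract does not state it, though it is typical for an allelic test). A minimal sketch using only the standard library, with a hypothetical function name:

```python
import math


def chi2_sf_1df(chi2):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom.

    A 1-df chi-square variable is the square of a standard normal, so
    P(X > c) = P(|Z| > sqrt(c)) = erfc(sqrt(c / 2)).
    """
    return math.erfc(math.sqrt(chi2 / 2.0))


# (SNP, reported chi-square, reported P) taken from the abstract above.
reported = [
    ("rs890069", 22.13, 2.55e-6),
    ("rs17739338", 16.07, 6.10e-5),
]
for snp, chi2, p_reported in reported:
    p = chi2_sf_1df(chi2)
    print(snp, f"computed={p:.2e}", f"reported={p_reported:.2e}")
```

Both computed values agree with the reported ones to the precision given, which supports the one-degree-of-freedom reading.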
Affiliation(s)
- Bo Ma
  - Department of Ophthalmology, The First Affiliated Hospital of Xi’an Jiaotong University, Xi’an, Shaanxi, PR China
- Wenpei Zhang
  - Department of Forensic Medicine, School of Medicine and Forensics, Xi’an Jiaotong University, Xi’an, Shaanxi, PR China
- Xiaochen Wang
  - Department of Forensic Medicine, School of Medicine and Forensics, Xi’an Jiaotong University, Xi’an, Shaanxi, PR China
- Huili Jiang
  - Department of Ophthalmology, Xi’an Fourth Hospital, Xi’an, Shaanxi, PR China
- Li Tang
  - Department of Ophthalmology, Xi’an Fourth Hospital, Xi’an, Shaanxi, PR China
- Wen Yang
  - Department of Ophthalmology, Xi’an Fourth Hospital, Xi’an, Shaanxi, PR China
- Qianyan Kang
  - Department of Ophthalmology, The First Affiliated Hospital of Xi’an Jiaotong University, Xi’an, Shaanxi, PR China
- Juan Cao
  - Department of Ophthalmology, Xi’an Fourth Hospital, Xi’an, Shaanxi, PR China
7
Wang X, Xiao L, Wang Z, Zhi L, Li Q. Common variants in GNL3 gene contributed the susceptibility of hand osteoarthritis in Han Chinese population. Sci Rep 2022; 12:16110. [PMID: 36167888 PMCID: PMC9515075 DOI: 10.1038/s41598-022-20287-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/08/2022] [Accepted: 09/12/2022] [Indexed: 11/09/2022] Open
Abstract
Osteoarthritis (OA) is one of the most common degenerative joint diseases. The nucleolar GTP binding protein 3 (GNL3) gene encodes guanine nucleotide binding protein-like 3, which is involved in cell proliferation, differentiation, and cell cycle regulation. Our study aimed to examine the contribution of GNL3 gene polymorphisms to the risk of hand OA and its related clinical features. A total of 3387 participants, including 1160 patients with hand OA and 2227 controls, were recruited. Eleven SNPs in the GNL3 gene were selected for genotyping. Genetic association signals were examined using Plink. Relationships between significant SNPs and clinical features of hand OA were also explored. SNP rs11177 was strongly associated with susceptibility to hand OA (P = 4.32 × 10⁻⁵). The minor allele of rs11177 was associated with increased susceptibility to hand OA. In addition, significant associations were identified between genotypes of rs11177 and clinical features of hand OA patients, including K-L grade (P < 0.01) and categorized pain scores (P < 0.01). Significant eQTL signals for rs11177 on GNL3 in multiple types of human tissues were also identified in the GTEx database. Our results establish a link between the GNL3 gene and susceptibility to hand OA.
Affiliation(s)
- Xi Wang
  - Department of Knee Joint Surgery, Xi'an Honghui Hospital, Xi'an, Shaanxi, China
- Lin Xiao
  - Department of Knee Joint Surgery, Xi'an Honghui Hospital, Xi'an, Shaanxi, China
- Zhiyuan Wang
  - Department of Knee Joint Surgery, Xi'an Honghui Hospital, Xi'an, Shaanxi, China
- Liqiang Zhi
  - Department of Knee Joint Surgery, Xi'an Honghui Hospital, Xi'an, Shaanxi, China
- Qiang Li
  - Department of Hand Surgery, Xi'an Honghui Hospital, No. 555 Youyi East Road, Xi'an, 710054, Shaanxi, China