1
Modanwal S, Mishra A, Mishra N. An integrative analysis of GEO data to identify possible therapeutic biomarkers of prostate cancer and targeting potential protein through Zea mays phytochemicals by virtual screening approaches. J Biomol Struct Dyn 2025; 43:709-729. [PMID: 38217083 DOI: 10.1080/07391102.2023.2283163] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2023] [Accepted: 11/08/2023] [Indexed: 01/14/2024]
Abstract
Prostate cancer (PC) is a prevalent type of cancer among men. Delaying the treatment of patients with upgraded or upstaged cancer may lead to unmanageable circumstances. The aim of this study is to contribute to the discovery of biomarkers that are specific to PC and to identify drug candidates derived from plants. Information about the cancer is critical for clinicians making decisions about patient treatment in the era of precision medicine. Advances in genomics technology have opened up new possibilities for identifying genes that are associated with cancer, including PC. This study identifies novel differentially expressed genes for PC. Seven PC microarray datasets were selected from the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO). Differentially expressed genes (DEGs) were identified using thresholds of |logFC| ≥ 1 and an adjusted p-value < 0.05. The DEGs were further studied using several bioinformatics tools, including STRING, CytoHubba, SRplot, the Coremine Medical database, FunRich, GeneMANIA and cBioPortal. Six new potential biomarkers, GAGE2A, GAGE12G, GAGE2E, GAGE13, GAGE12F and CSAG1, were identified. These biomarkers are associated with biological processes (BPs) such as cell division and gene expression regulation, so these genes may have a crucial role in PC progression and may serve as potential biomarkers for PC. A total of 497 phytochemicals from corn plants were screened against the target protein, and LTS0176591 was found to be the best lead molecule, with a docking score of -6.31 kcal/mol. Further, molecular mechanics-generalized Born surface area (MM-GBSA), molecular dynamics simulation, principal component analysis (PCA), free energy landscape (FEL) and molecular mechanics-Poisson-Boltzmann surface area (MM-PBSA) analyses were carried out to validate the findings. Communicated by Ramaswamy H. Sarma.
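The DEG selection rule quoted in this abstract (|logFC| ≥ 1 and adjusted p-value < 0.05) can be illustrated with a minimal Python sketch. The record type `GeneStat`, the function `select_degs`, and the example gene names and values are hypothetical and are not taken from the cited study; only the two cutoffs come from the abstract.

```python
from typing import NamedTuple


class GeneStat(NamedTuple):
    """Per-gene summary statistics (hypothetical record type for illustration)."""
    gene: str
    log_fc: float   # log2 fold change between tumor and control
    adj_p: float    # multiple-testing adjusted p-value


def select_degs(stats, lfc_cutoff=1.0, p_cutoff=0.05):
    """Keep genes with |logFC| >= lfc_cutoff and adjusted p-value < p_cutoff."""
    return [
        s.gene
        for s in stats
        if abs(s.log_fc) >= lfc_cutoff and s.adj_p < p_cutoff
    ]


# Made-up values, used only to exercise the filter.
example = [
    GeneStat("GENE_A", 1.8, 0.001),   # passes both cutoffs
    GeneStat("GENE_B", -1.2, 0.03),   # down-regulated, passes both cutoffs
    GeneStat("GENE_C", 0.6, 0.0001),  # significant, but fold change too small
    GeneStat("GENE_D", 2.5, 0.20),    # large fold change, but not significant
]
degs = select_degs(example)
print(degs)
```

Taking the absolute value of the fold change keeps both up- and down-regulated genes, which is what the |logFC| notation in the abstract implies.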
Affiliation(s)
- Shristi Modanwal
  - Department of Applied Science, Indian Institute of Information Technology Allahabad, Prayagraj, India
- Ashutosh Mishra
  - Department of Applied Science, Indian Institute of Information Technology Allahabad, Prayagraj, India
- Nidhi Mishra
  - Department of Applied Science, Indian Institute of Information Technology Allahabad, Prayagraj, India
2
Yan J, Zeng Q, Wang X. RankCompV3: a differential expression analysis algorithm based on relative expression orderings and applications in single-cell RNA transcriptomics. BMC Bioinformatics 2024; 25:259. [PMID: 39112940 PMCID: PMC11304794 DOI: 10.1186/s12859-024-05889-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2023] [Accepted: 07/30/2024] [Indexed: 08/11/2024] Open
Abstract
BACKGROUND: Effective identification of differentially expressed genes (DEGs) has been challenging for single-cell RNA sequencing (scRNA-seq) profiles. Many existing algorithms have high false positive rates (FPRs) and often fail to identify weak biological signals. RESULTS: We present a novel method for identifying DEGs in scRNA-seq data called RankCompV3. It is based on the comparison of relative expression orderings (REOs) of gene pairs, which are determined by comparing the expression levels of a pair of genes in a set of single-cell profiles. For each gene of interest, the numbers of genes with consistently higher or lower expression levels are counted in each of the two groups being compared, and the result is tabulated in a 3 × 3 contingency table, which is tested by McCullagh's method to determine whether the gene is dysregulated. In both simulated and real scRNA-seq data, RankCompV3 tightly controlled the FPR and demonstrated high accuracy, outperforming 11 other common single-cell DEG detection algorithms. Analysis with either regular single-cell or synthetic pseudo-bulk profiles produced DEGs highly concordant with the ground truth. In addition, RankCompV3 demonstrates higher sensitivity to weak biological signals than other methods. The algorithm was implemented in Julia and can be called from R. The source code is available at https://github.com/pathint/RankCompV3.jl. CONCLUSIONS: The REO-based algorithm is a valuable tool for analyzing single-cell RNA profiles and identifying DEGs with high accuracy and sensitivity.
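The counting step described in this abstract can be sketched in Python as follows. This is a simplified illustration, not the RankCompV3 implementation: the "consistent" ordering is decided here by a plain fraction threshold (`min_frac`, an assumption), whereas the published method may use a statistical criterion, and McCullagh's test on the resulting table is not implemented. All names and the toy expression values are hypothetical.

```python
STATES = ("lower", "unstable", "higher")


def reo_state(expr_g, expr_h, min_frac=0.9):
    """Classify gene h relative to gene g within one group of profiles.

    Returns 'lower' or 'higher' when h is below/above g in at least
    min_frac of the profiles, otherwise 'unstable'.
    """
    n = len(expr_g)
    lower = sum(h < g for g, h in zip(expr_g, expr_h))
    higher = sum(h > g for g, h in zip(expr_g, expr_h))
    if lower / n >= min_frac:
        return "lower"
    if higher / n >= min_frac:
        return "higher"
    return "unstable"


def contingency_table(gene, group1, group2, min_frac=0.9):
    """Build the 3 x 3 table for `gene`.

    Rows index the REO state of each partner gene in group 1 and columns
    its state in group 2, both in the order given by STATES.
    """
    table = [[0] * 3 for _ in range(3)]
    for other in group1:
        if other == gene:
            continue
        s1 = reo_state(group1[gene], group1[other], min_frac)
        s2 = reo_state(group2[gene], group2[other], min_frac)
        table[STATES.index(s1)][STATES.index(s2)] += 1
    return table


# Toy data: each dict maps a gene name to its expression across four cells.
group1 = {"G": [5, 5, 5, 5], "A": [1, 2, 1, 2], "B": [9, 8, 9, 9], "C": [4, 6, 4, 6]}
group2 = {"G": [5, 5, 5, 5], "A": [7, 8, 7, 9], "B": [9, 9, 8, 9], "C": [1, 1, 2, 1]}
table = contingency_table("G", group1, group2)
print(table)
```

Off-diagonal counts record partner genes whose ordering relative to the gene of interest changes between the groups; a test on the asymmetry of the table is what would then flag the gene as up- or down-regulated.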
Affiliation(s)
- Jing Yan
  - Department of Bioinformatics, Fujian Key Laboratory of Medical Bioinformatics, School of Medical Technology and Engineering, Fujian Medical University, Fuzhou, 350122, China
- Qiuhong Zeng
  - Department of Bioinformatics, Fujian Key Laboratory of Medical Bioinformatics, School of Medical Technology and Engineering, Fujian Medical University, Fuzhou, 350122, China
- Xianlong Wang
  - Department of Bioinformatics, Fujian Key Laboratory of Medical Bioinformatics, School of Medical Technology and Engineering, Fujian Medical University, Fuzhou, 350122, China
  - The Second Affiliated Hospital, Fujian Medical University, Quanzhou, 362000, China
3
Tan J, Yu X. A pyroptosis-related lncRNA-based prognostic index for hepatocellular carcinoma by relative expression orderings. Transl Cancer Res 2024; 13:1406-1424. [PMID: 38617506 PMCID: PMC11009817 DOI: 10.21037/tcr-23-1804] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2023] [Accepted: 01/29/2024] [Indexed: 04/16/2024]
Abstract
Background: Hepatocellular carcinoma (HCC) is an invasive malignant tumor, and pyroptosis makes an important contribution to the pathology and progression of liver cancer. Many prognostic models have been proposed for HCC based on the quantitative expression levels of candidate genes, but these are unsuitable for clinical application because of their vulnerability to experimental batch effects. The aim of this study was to develop a novel pyroptosis-related long non-coding RNA (lncRNA)-based prognostic index (PLPI) for HCC based on relative expression orderings (REOs). Methods: First, the pyroptosis-related lncRNAs were identified through the Wilcoxon rank-sum test and gene co-expression analyses. Then, the novel prognostic model PLPI was constructed from pyroptosis-related lncRNA pairs, which were identified by multiple machine learning algorithms. Gene set enrichment, somatic mutation, and drug sensitivity analyses were conducted to measure the differences between high- and low-risk patients. Multiple immune analyses were used to explore the association between PLPI and the immunological microenvironment. Results: A novel prognostic model PLPI based on 10 pyroptosis-related lncRNA pairs was constructed and was shown to be an independent prognostic risk factor. The receiver operating characteristic (ROC) curves showed that the model had good prognostic ability in the training, testing, and external sets, respectively [5-year area under the curve (AUC) = 0.73, 5-year AUC = 0.81, 4-year AUC = 0.79]. The results of survival, somatic mutation, and immune analyses showed that the patients in the low-risk group had a better prognosis, lower rates of somatic mutation, and better immune cell infiltration. Personalized chemotherapeutic drugs were also identified for the patients with HCC. Conclusions: The novel PLPI not only predicted the prognosis of patients with HCC well but could also offer novel ideas and approaches for the therapeutic management of HCC.
Affiliation(s)
- Jinhua Tan
  - School of Sciences, Shanghai Institute of Technology, Shanghai, China
- Xiaoqing Yu
  - School of Sciences, Shanghai Institute of Technology, Shanghai, China
4
Deschildre J, Vandemoortele B, Loers JU, De Preter K, Vermeirssen V. Evaluation of single-sample network inference methods for precision oncology. NPJ Syst Biol Appl 2024; 10:18. [PMID: 38360881 PMCID: PMC10869342 DOI: 10.1038/s41540-024-00340-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2023] [Accepted: 01/17/2024] [Indexed: 02/17/2024] Open
Abstract
A major challenge in precision oncology is to detect targetable cancer vulnerabilities in individual patients. Modeling high-throughput omics data in biological networks allows identifying key molecules and processes of tumorigenesis. Traditionally, network inference methods rely on many samples to contain sufficient information for learning, resulting in aggregate networks. However, to implement patient-tailored approaches in precision oncology, we need to interpret omics data at the level of individual patients. Several single-sample network inference methods have been developed that infer biological networks for an individual sample from bulk RNA-seq data. However, only a limited comparison of these methods has been made and many methods rely on 'normal tissue' samples as reference, which are not always available. Here, we conducted an evaluation of the single-sample network inference methods SSN, LIONESS, SWEET, iENA, CSN and SSPGI using transcriptomic profiles of lung and brain cancer cell lines from the CCLE database. The methods constructed functional gene networks with distinct network characteristics. Hub gene analyses revealed different degrees of subtype-specificity across methods. Single-sample networks were able to distinguish between tumor subtypes, as exemplified by node strength clustering, enrichment of known subtype-specific driver genes among hubs and differential node strength. We also showed that single-sample networks correlated better to other omics data from the same cell line as compared to aggregate networks. We conclude that single-sample network inference methods can reflect sample-specific biology when 'normal tissue' samples are absent and we point out peculiarities of each method.
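To make the idea of a single-sample network concrete, the sketch below illustrates the leave-one-out interpolation used by LIONESS, one of the methods named in this abstract: the edge weight for sample q is N × (e_all − e_without_q) + e_without_q, where e_all is the edge in the aggregate network built from all N samples and e_without_q is the edge in the network built without sample q. Using Pearson correlation as the aggregate edge weight is an assumption made here for illustration; the abstract does not state which aggregate method the evaluation used, and the function names and toy values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def lioness_edge(x, y, q):
    """Single-sample edge weight between two genes for sample index q.

    x and y hold the expression of the two genes across all N samples.
    """
    n = len(x)
    e_all = pearson(x, y)                                  # aggregate network edge
    e_rest = pearson(x[:q] + x[q + 1:], y[:q] + y[q + 1:])  # edge without sample q
    return n * (e_all - e_rest) + e_rest


# Toy data: the last sample breaks an otherwise perfect co-expression.
gene_x = [1, 2, 3, 4, 5]
gene_y = [1, 2, 3, 4, 0]
edge_outlier = lioness_edge(gene_x, gene_y, 4)
print(edge_outlier)
```

In this toy case the aggregate correlation is 0 while the correlation without the last sample is 1, so the last sample receives a strongly negative edge; that per-sample deviation is the kind of signal an aggregate network averages away.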
Affiliation(s)
- Joke Deschildre
  - Lab for Computational Biology, Integromics and Gene Regulation (CBIGR), Cancer Research Institute Ghent (CRIG), Ghent, Belgium
  - Department of Biomedical Molecular Biology, Ghent University, Ghent, Belgium
  - Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
- Boris Vandemoortele
  - Lab for Computational Biology, Integromics and Gene Regulation (CBIGR), Cancer Research Institute Ghent (CRIG), Ghent, Belgium
  - Department of Biomedical Molecular Biology, Ghent University, Ghent, Belgium
  - Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
- Jens Uwe Loers
  - Lab for Computational Biology, Integromics and Gene Regulation (CBIGR), Cancer Research Institute Ghent (CRIG), Ghent, Belgium
  - Department of Biomedical Molecular Biology, Ghent University, Ghent, Belgium
  - Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
- Katleen De Preter
  - Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
  - Lab of Translational Onco-genomics and Bio-informatics, Center for Medical Biotechnology (VIB-UGent), Cancer Research Institute Ghent (CRIG), Ghent, Belgium
- Vanessa Vermeirssen
  - Lab for Computational Biology, Integromics and Gene Regulation (CBIGR), Cancer Research Institute Ghent (CRIG), Ghent, Belgium
  - Department of Biomedical Molecular Biology, Ghent University, Ghent, Belgium
  - Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
5
Liu Y, Lin Y, Yang W, Lin Y, Wu Y, Zhang Z, Lin N, Wang X, Tong M, Yu R. Application of individualized differential expression analysis in human cancer proteome. Brief Bioinform 2022; 23:6562685. [DOI: 10.1093/bib/bbac096] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2021] [Revised: 02/06/2022] [Accepted: 02/23/2022] [Indexed: 11/13/2022] Open
Abstract
Liquid chromatography–mass spectrometry-based quantitative proteomics can measure the expression of thousands of proteins from biological samples and has been increasingly applied in cancer research. Identifying differentially expressed proteins (DEPs) between tumors and normal controls is commonly used to investigate carcinogenesis mechanisms. While differential expression analysis (DEA) at an individual level is desired to identify patient-specific molecular defects for better patient stratification, most statistical DEP analysis methods only identify deregulated proteins at the population level. To date, robust individualized DEA algorithms have been proposed for ribonucleic acid data, but their performance on proteomics data is underexplored. Herein, we performed a systematic evaluation of five individualized DEA algorithms for proteins on cancer proteomic datasets from seven cancer types. Results show that the within-sample relative expression orderings (REOs) of protein pairs in normal tissues were highly stable, providing the basis for individualized DEA for proteins using REOs. Moreover, individualized DEA algorithms achieve higher precision in detecting sample-specific deregulated proteins than population-level methods. To facilitate the utilization of individualized DEA algorithms in proteomics for prognostic biomarker discovery and personalized medicine, we provide Individualized DEP Analysis (IDEPA-XMBD; XMBD: Xiamen Big Data, a biomedical open software initiative in the National Institute for Data Science in Health and Medicine, Xiamen University, China) (https://github.com/xmuyulab/IDEPA-XMBD), a user-friendly, open-source Python toolkit that integrates individualized DEA algorithms for DEP-associated deregulation pattern recognition.
Affiliation(s)
- Yachen Liu
  - School of Informatics, Xiamen University, Xiamen, Fujian 316000, China
  - National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, Fujian 316005, China
- Yalan Lin
  - School of Informatics, Xiamen University, Xiamen, Fujian 316000, China
- Wenxian Yang
  - Aginome Scientific, Xiamen, Fujian 316005, China
- Yuxiang Lin
  - National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, Fujian 316005, China
  - State Key Laboratory of Cellular Stress Biology, Innovation Center for Cell Signaling Network, School of Life Sciences, Xiamen University, Xiamen, Fujian 361102, China
- Yujuan Wu
  - School of Informatics, Xiamen University, Xiamen, Fujian 316000, China
- Zheyang Zhang
  - National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, Fujian 316005, China
  - State Key Laboratory of Cellular Stress Biology, Innovation Center for Cell Signaling Network, School of Life Sciences, Xiamen University, Xiamen, Fujian 361102, China
- Nuoqi Lin
  - State Key Laboratory of Cellular Stress Biology, Innovation Center for Cell Signaling Network, School of Life Sciences, Xiamen University, Xiamen, Fujian 361102, China
- Xianlong Wang
  - Department of Bioinformatics, School of Medical Technology and Engineering, Key Laboratory of Medical Bioinformatics, Key Laboratory of Ministry of Education for Gastrointestinal Cancer, Fujian Medical University, Fuzhou, Fujian 350122, China
- Mengsha Tong
  - National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, Fujian 316005, China
  - State Key Laboratory of Cellular Stress Biology, Innovation Center for Cell Signaling Network, School of Life Sciences, Xiamen University, Xiamen, Fujian 361102, China
- Rongshan Yu
  - School of Informatics, Xiamen University, Xiamen, Fujian 316000, China
  - National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, Fujian 316005, China
  - Aginome Scientific, Xiamen, Fujian 316005, China
6
An Improved Stacked Autoencoder for Metabolomic Data Classification. Computational Intelligence and Neuroscience 2021; 2021:1051172. [PMID: 34434226 PMCID: PMC8382558 DOI: 10.1155/2021/1051172] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 05/04/2021] [Revised: 06/28/2021] [Accepted: 07/28/2021] [Indexed: 12/24/2022]
Abstract
Naru3 (NR) is a traditional Mongolian medicine with high clinical efficacy and low incidence of side effects. Metabolomics is an approach that can facilitate the development of traditional drugs. However, metabolomic data have a high throughput, sparse, high-dimensional, and small sample nature, and their classification is challenging. Although deep learning methods have a wide range of applications, deep learning-based metabolomic studies have not been widely performed. We aimed to develop an improved stacked autoencoder (SAE) for metabolomic data classification. We established an NR-treated rheumatoid arthritis (RA) mouse model and classified the obtained metabolomic data using the Hessian-free SAE (HF-SAE) algorithm. During training, the unlabeled data were used for pretraining, and the labeled data were used for fine-tuning based on the HF algorithm for gradient descent optimization. The hybrid algorithm successfully classified the data. The results were compared with those of the support vector machine (SVM), k-nearest neighbor (KNN), and gradient descent SAE (GD-SAE) algorithms. A five-fold cross-validation was used to complete the classification experiment. In each fine-tuning process, the mean square error (MSE) and misclassification rates of the training and test data were recorded. We successfully established an NR animal model and an improved SAE for metabolomic data classification.
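The evaluation protocol mentioned in this abstract (five-fold cross-validation, with mean square error and misclassification rate recorded on training and test data) can be sketched in Python as below. The fold assignment scheme, the 0.5 decision threshold, and all function names are assumptions made for illustration; the abstract does not specify them, and no classifier (SAE or otherwise) is implemented here.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, test_indices) for each of k folds.

    Samples are assigned to folds round-robin by index (an assumed scheme).
    """
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test


def mse(y_true, y_prob):
    """Mean square error between binary labels and predicted probabilities."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)


def misclassification_rate(y_true, y_prob, threshold=0.5):
    """Fraction of samples whose thresholded prediction disagrees with the label."""
    wrong = sum((p >= threshold) != bool(t) for t, p in zip(y_true, y_prob))
    return wrong / len(y_true)


# Each sample lands in exactly one test fold across the five splits.
splits = list(k_fold_indices(10, k=5))
rate = misclassification_rate([1, 0, 1, 0], [0.9, 0.2, 0.4, 0.1])
print(len(splits), rate)
```

In a full experiment, a model would be trained on each `train` index set and both metrics computed on the corresponding `train` and `test` sets, giving five pairs of values to compare across algorithms.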