1
Cottrell S, Wang R, Wei GW. PLPCA: Persistent Laplacian-Enhanced PCA for Microarray Data Analysis. J Chem Inf Model 2024; 64:2405-2420. [PMID: 37738663 PMCID: PMC10999748 DOI: 10.1021/acs.jcim.3c01023] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Indexed: 09/24/2023]
Abstract
Over the years, Principal Component Analysis (PCA) has served as the baseline approach for dimensionality reduction in gene expression data analysis. Its primary objective is to identify a subset of disease-causing genes from a vast pool of thousands of genes. However, PCA possesses inherent limitations that hinder its interpretability, introduce class ambiguity, and fail to capture complex geometric structures in the data. Although these limitations have been partially addressed in the literature by incorporating various regularizers, such as graph Laplacian regularization, existing PCA-based methods still face challenges related to multiscale analysis and capturing higher-order interactions in the data. To address these challenges, we propose a novel approach called Persistent Laplacian-enhanced Principal Component Analysis (PLPCA). PLPCA amalgamates the advantages of earlier regularized PCA methods with persistent spectral graph theory, specifically persistent Laplacians derived from algebraic topology. In contrast to graph Laplacians, persistent Laplacians enable multiscale analysis through filtration and can incorporate higher-order simplicial complexes to capture higher-order interactions in the data. We evaluate and validate the performance of PLPCA using ten benchmark microarray data sets that exhibit a wide range of dimensions and data imbalance ratios. Our extensive studies over these data sets demonstrate that PLPCA provides up to a 12% improvement over the current state-of-the-art PCA models on five evaluation metrics for classification tasks after dimensionality reduction.
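As a rough illustration of the graph Laplacian regularization that the abstract names as the earlier baseline, the sketch below embeds samples by minimizing a PCA reconstruction error plus a Laplacian smoothness term. It is not PLPCA and does not use persistent Laplacians; the function names `knn_graph_laplacian` and `graph_laplacian_pca`, the binary kNN graph, and the parameters `alpha` and `k` are assumptions made for this example.

```python
import numpy as np

def knn_graph_laplacian(X, k=5):
    """Unnormalized Laplacian L = D - W of a symmetrized binary kNN graph
    built over the rows (samples) of X. The graph choice is an assumption."""
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T   # squared distances
    np.fill_diagonal(d2, np.inf)                     # exclude self-neighbors
    nearest = np.argsort(d2, axis=1)[:, :k]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0
    W = np.maximum(W, W.T)                           # symmetrize
    return np.diag(W.sum(axis=1)) - W

def graph_laplacian_pca(X, n_components=2, alpha=1.0, k=5):
    """Minimize ||Xc^T - U Q^T||_F^2 + alpha * tr(Q^T L Q) with Q^T Q = I.

    For fixed Q the optimal U is Xc^T Q, so Q consists of the eigenvectors
    of G = -Xc Xc^T + alpha * L with the smallest eigenvalues.
    """
    Xc = X - X.mean(axis=0)
    L = knn_graph_laplacian(Xc, k)
    G = -(Xc @ Xc.T) + alpha * L
    _, eigvecs = np.linalg.eigh(G)        # eigenvalues in ascending order
    Q = eigvecs[:, :n_components]         # sample embedding (n x components)
    U = Xc.T @ Q                          # feature loadings (p x components)
    return Q, U
```

With `alpha=0` the embedding spans the same subspace as ordinary PCA scores, which gives a quick sanity check on the regularized variant.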
Affiliation(s)
- Sean Cottrell
  - Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Rui Wang
  - Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Guo-Wei Wei
  - Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
  - Department of Electrical and Computer Engineering, Michigan State University, East Lansing, Michigan 48824, United States
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, United States

2
Zou Z, Robinson JI, Steinberg LK, Henderson JP. Uropathogenic Escherichia coli wield enterobactin-derived catabolites as siderophores. J Biol Chem 2024; 300:105554. [PMID: 38072063 PMCID: PMC10788543 DOI: 10.1016/j.jbc.2023.105554] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2023] [Revised: 11/20/2023] [Accepted: 12/05/2023] [Indexed: 12/19/2023] Open
Abstract
Uropathogenic Escherichia coli (UPEC) secrete multiple siderophore types to scavenge extracellular iron(III) ions during clinical urinary tract infections, despite the metabolic costs of biosynthesis. Here, we find the siderophore enterobactin (Ent) and its related products to be prominent components of the iron-responsive extracellular metabolome of a model UPEC strain. Using defined Ent biosynthesis and import mutants, we identify lower molecular weight dimeric exometabolites as products of incomplete siderophore catabolism, rather than prematurely released biosynthetic intermediates. In E. coli, iron acquisition from iron(III)-Ent complexes requires intracellular esterases that hydrolyze the siderophore. Although UPEC are equipped to consume the products of completely hydrolyzed Ent, we find that Ent and its derivatives may be incompletely hydrolyzed to yield products with retained siderophore activity. These results are consistent with catabolic inefficiency as a means to obtain more than one iron ion per siderophore molecule. This is compatible with an evolved UPEC strategy to maximize the nutritional returns from metabolic investments in siderophore biosynthesis.
Affiliation(s)
- Zongsen Zou
  - Center for Women's Infectious Diseases Research, Washington University School of Medicine, St Louis, Missouri, USA; Division of Infectious Diseases, Department of Internal Medicine, Washington University School of Medicine, St Louis, Missouri, USA
- John I Robinson
  - Center for Women's Infectious Diseases Research, Washington University School of Medicine, St Louis, Missouri, USA; Division of Infectious Diseases, Department of Internal Medicine, Washington University School of Medicine, St Louis, Missouri, USA
- Lindsey K Steinberg
  - Center for Women's Infectious Diseases Research, Washington University School of Medicine, St Louis, Missouri, USA; Division of Infectious Diseases, Department of Internal Medicine, Washington University School of Medicine, St Louis, Missouri, USA
- Jeffrey P Henderson
  - Center for Women's Infectious Diseases Research, Washington University School of Medicine, St Louis, Missouri, USA; Division of Infectious Diseases, Department of Internal Medicine, Washington University School of Medicine, St Louis, Missouri, USA

3
Li Z, Wang Y, Zhao Q, Zhang S, Meng D. A Tensor-Based Online RPCA Model for Compressive Background Subtraction. IEEE Transactions on Neural Networks and Learning Systems 2023; 34:10668-10682. [PMID: 35536805 DOI: 10.1109/tnnls.2022.3170789] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Background subtraction of videos has been a fundamental research topic in computer vision over the past decades. To reduce the computational burden and improve efficiency, background subtraction from online compressive measurements has recently attracted much attention. However, current methods still have limitations. First, they are all based on matrix modeling, which breaks the spatial structure within video frames. Second, they generally ignore the complex disturbance within the background, which weakens the low-rank assumption. To address these issues, we propose a tensor-based online compressive video reconstruction and background subtraction method, abbreviated as NIOTenRPCA, which explicitly models the background disturbance in different frames as nonidentical but correlated noise. By virtue of this modeling, the proposed method adapts well to complex video scenes and thus performs more robustly. Extensive experiments on a series of real-world video datasets demonstrate the effectiveness of the proposed method compared with existing state-of-the-art methods. The code is available at https://github.com/crystalzina/NIOTenRPCA.

4
Zou Z, Robinson JI, Steinberg LK, Henderson JP. Uropathogenic Escherichia coli wield enterobactin-derived catabolites as siderophores. bioRxiv: The Preprint Server for Biology 2023:2023.07.25.550588. [PMID: 37546885 PMCID: PMC10402112 DOI: 10.1101/2023.07.25.550588] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/08/2023]
Abstract
Uropathogenic E. coli (UPEC) secrete multiple siderophore types to scavenge extracellular iron(III) ions during clinical urinary tract infections, despite the metabolic costs of biosynthesis. Here we find the siderophore enterobactin and its related products to be prominent components of the iron-responsive extracellular metabolome of a model UPEC strain. Using defined enterobactin biosynthesis and import mutants, we identify lower molecular weight, dimeric exometabolites as products of incomplete siderophore catabolism, rather than prematurely released biosynthetic intermediates. In E. coli, iron acquisition from iron(III)-enterobactin complexes requires intracellular esterases that hydrolyze the siderophore. Although UPEC are equipped to consume the products of completely hydrolyzed enterobactin, we find that enterobactin and its derivatives may be incompletely hydrolyzed to yield products with retained siderophore activity. These results are consistent with catabolic inefficiency as a means to obtain more than one iron ion per siderophore molecule. This is compatible with an evolved UPEC strategy to maximize the nutritional returns from metabolic investments in siderophore biosynthesis.
Affiliation(s)
- Zongsen Zou
  - Center for Women’s Infectious Diseases Research, Washington University School of Medicine, St. Louis, Missouri, USA
  - Department of Internal Medicine, Division of Infectious Diseases, Washington University School of Medicine, St. Louis, Missouri, USA
- John I. Robinson
  - Center for Women’s Infectious Diseases Research, Washington University School of Medicine, St. Louis, Missouri, USA
  - Department of Internal Medicine, Division of Infectious Diseases, Washington University School of Medicine, St. Louis, Missouri, USA
- Lindsey K. Steinberg
  - Center for Women’s Infectious Diseases Research, Washington University School of Medicine, St. Louis, Missouri, USA
  - Department of Internal Medicine, Division of Infectious Diseases, Washington University School of Medicine, St. Louis, Missouri, USA
- Jeffrey P. Henderson
  - Center for Women’s Infectious Diseases Research, Washington University School of Medicine, St. Louis, Missouri, USA
  - Department of Internal Medicine, Division of Infectious Diseases, Washington University School of Medicine, St. Louis, Missouri, USA

5
Yu Y, Zhou G, Zheng N, Qiu Y, Xie S, Zhao Q. Graph-Regularized Non-Negative Tensor-Ring Decomposition for Multiway Representation Learning. IEEE Transactions on Cybernetics 2023; 53:3114-3127. [PMID: 35468067 DOI: 10.1109/tcyb.2022.3157133] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/14/2023]
Abstract
Tensor-ring (TR) decomposition is a powerful tool for exploiting the low-rank property of multiway data and has demonstrated great potential in a variety of important applications. In this article, non-negative TR (NTR) decomposition and graph-regularized NTR (GNTR) decomposition are proposed. The former equips TR decomposition with the ability to learn a parts-based representation by imposing non-negativity on the core tensors, and the latter additionally introduces graph regularization into the NTR model to capture manifold geometry information from tensor data. Both proposed models extend TR decomposition and can serve as powerful representation learning tools for non-negative multiway data. Optimization algorithms based on an accelerated proximal gradient are derived for NTR and GNTR. We also show empirically that the proposed methods provide more interpretable and physically meaningful representations. For example, they are able to extract parts-based components with meaningful color and line patterns from objects. Extensive experimental results demonstrate that the proposed methods perform better than state-of-the-art tensor-based methods in clustering and classification tasks.

6
Fu Y, Du Q, Cui T, Lu Y, Niu G. A pan-cancer analysis reveals role of clusterin (CLU) in carcinogenesis and prognosis of human tumors. Front Genet 2023; 13:1056184. [PMID: 36685863 PMCID: PMC9846084 DOI: 10.3389/fgene.2022.1056184] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 09/28/2022] [Accepted: 12/05/2022] [Indexed: 01/06/2023] Open
Abstract
Clusterin (CLU) is a chaperone-like protein that has been demonstrated to have a direct relationship with cancer occurrence, progression, or metastasis. Using three datasets of tongue squamous carcinoma from the Gene Expression Omnibus, we found that Clusterin was downregulated in tumor tissues. We further retrieved datasets from The Cancer Genome Atlas and Gene Expression Omnibus to thoroughly investigate the carcinogenic consequences of Clusterin. Our findings revealed that decreased Clusterin expression in malignancies was associated with a worse overall survival prognosis in individuals with multiple tumors. Clusterin gene deep deletions were found in almost all malignancies and were connected to prognosis in most cancer patients. The Clusterin DNA methylation level was dependent on tumor type. Clusterin expression was also linked to the invasion of cancer-associated CD8+ T-cells and fibroblasts in numerous cancer forms. Moreover, pathway enrichment analysis revealed that Clusterin primarily regulates biological processes such as cholesterol metabolism, phospholipid binding, and protein-lipid complex formation. Overall, our pan-cancer research suggests that Clusterin expression levels are linked to tumor carcinogenesis and prognosis, which contributes to understanding the probable mechanism of Clusterin in tumorigenesis as well as its clinical prognostic significance.
Affiliation(s)
- Yizhe Fu
  - Department of Oral and Maxillofacial Surgery, the First Affiliated Hospital of Nanchang University, Nanchang, China; Department of Stomatology, Beijing Integrated Traditional Chinese and Western Medicine Hospital, Beijing, China
- Qiao Du
  - Department of Stomatology, Beijing Integrated Traditional Chinese and Western Medicine Hospital, Beijing, China
- Tiehan Cui
  - Department of Oral and Maxillofacial Surgery, the First Affiliated Hospital of Nanchang University, Nanchang, China
- Yuying Lu
  - Department of Oral and Maxillofacial Surgery, the First Affiliated Hospital of Nanchang University, Nanchang, China; Department of Stomatology, Beijing Integrated Traditional Chinese and Western Medicine Hospital, Beijing, China
- Guangliang Niu
  - Department of Oral and Maxillofacial Surgery, the First Affiliated Hospital of Nanchang University, Nanchang, China; Department of Stomatology, Beijing Integrated Traditional Chinese and Western Medicine Hospital, Beijing, China; *Correspondence: Guangliang Niu,

7
He G, Wang H, Liu S, Zhang B. CSMVC: A Multiview Method for Multivariate Time-Series Clustering. IEEE Transactions on Cybernetics 2022; 52:13425-13437. [PMID: 34469322 DOI: 10.1109/tcyb.2021.3083592] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
Multivariate time-series (MTS) clustering is a fundamental technique in data mining with a wide range of real-world applications. To date, though some approaches have been developed, they suffer from various drawbacks, such as high computational cost or loss of information. Most existing approaches are single-view methods without considering the benefits of mutual-support multiple views. Moreover, due to its data structure, MTS data cannot be handled well by most multiview clustering methods. Toward this end, we propose a consistent and specific non-negative matrix factorization-based multiview clustering (CSMVC) method for MTS clustering. The proposed method constructs a multilayer graph to represent the original MTS data and generates multiple views with a subspace technique. The obtained multiview data are processed through a novel non-negative matrix factorization (NMF) method, which can explore the view-consistent and view-specific information simultaneously. Furthermore, an alternating optimization scheme is proposed to solve the corresponding optimization problem. We conduct extensive experiments on 13 benchmark datasets and the results demonstrate the superiority of our proposed method against other state-of-the-art algorithms under a wide range of evaluation metrics.

8
Xu Y, Cui X, Zhang L, Zhao T, Wang Y. Metastasis-related gene identification by compound constrained NMF and a semisupervised cluster approach using pancancer multiomics features. Comput Biol Med 2022; 151:106263. [PMID: 36371902 DOI: 10.1016/j.compbiomed.2022.106263] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2022] [Revised: 10/16/2022] [Accepted: 10/30/2022] [Indexed: 11/11/2022]
Abstract
In recent years, with the gradual increase in pancancer-related research, more attention has been given to the field of pancancer metastasis. However, the molecular mechanism of pancancer metastasis remains unclear, and identification methods for pancancer metastasis-related genes are still lacking. In view of this, we developed a novel pipeline to identify pancancer metastasis-related genes based on compound constrained nonnegative matrix factorization (CCNMF). To solve the above problems, the following modules were designed. A correntropy operator and feature similarity fusion (FSF) were first adopted to process the multiomics features of genes; thus, the influences caused by irrelevant biomolecular patterns, manifested as non-Gaussian noise, were minimized. CCNMF was then adopted to handle the above features with compound constraints consisting of a gene relation network and a "metastasis-related" gene set, which maximizes the biological interpretability of the metafeatures generated by NMF. Since a negative set of pancancer "metastasis-related" genes could hardly be obtained, semisupervised analyses were performed on the gene features acquired at each step of our pipeline to examine our method's effect. Of the 236 candidates identified by the above method, 83% were associated with the metastasis of one or more cancers, and 71.9% were identified as immune-related in pancancer in addition to the hallmark genes. Our study provides an effective and interpretable method for identifying metastasis-related as well as immune-related genes, and the method is successfully applied to TCGA pancancer data.
Affiliation(s)
- Yining Xu
  - Faculty of Computing, Harbin Institute of Technology, 92 Xidazhi Street, TIB #20, Harbin, 150000, Hei Long Jiang, China
- Xinran Cui
  - Faculty of Computing, Harbin Institute of Technology, 92 Xidazhi Street, TIB #20, Harbin, 150000, Hei Long Jiang, China
- Liyuan Zhang
  - Faculty of Computing, Harbin Institute of Technology, 92 Xidazhi Street, TIB #20, Harbin, 150000, Hei Long Jiang, China
- Tianyi Zhao
  - School of Medicine and Health, Harbin Institute of Technology, 92 Xidazhi Street, TIB #20, Harbin, 150000, Hei Long Jiang, China
- Yadong Wang
  - Faculty of Computing, Harbin Institute of Technology, 92 Xidazhi Street, TIB #20, Harbin, 150000, Hei Long Jiang, China

9
Wang J, Wang L, Nie F, Li X. A Novel Formulation of Trace Ratio Linear Discriminant Analysis. IEEE Transactions on Neural Networks and Learning Systems 2022; 33:5568-5578. [PMID: 33857000 DOI: 10.1109/tnnls.2021.3071030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
The linear discriminant analysis (LDA) method needs to be transformed into another form to acquire an approximate closed-form solution, which can introduce error between the approximate solution and the true value. Furthermore, the sensitivity of dimensionality reduction (DR) methods to subspace dimensionality cannot be eliminated. In this article, a new formulation of trace ratio LDA (TRLDA) is proposed, which yields an optimal solution of LDA. When solving for the projection matrix, our TRLDA method is transformed into a quadratic problem on the Stiefel manifold. In addition, we propose a new trace difference problem named optimal dimensionality linear discriminant analysis (ODLDA) to determine the optimal subspace dimension. The nonmonotonicity of ODLDA guarantees the existence of an optimal subspace dimensionality. Both approaches achieve efficient DR on several data sets.
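For orientation, the trace ratio criterion that this abstract builds on can be sketched with the classical iterative scheme: alternate between updating the ratio and taking the top eigenvectors of a trace difference matrix. This is the standard textbook iteration, not the TRLDA or ODLDA formulation proposed in the article; the helper names `scatter_matrices` and `trace_ratio` are assumptions for this example, and the within-class scatter is assumed to be positive definite.

```python
import numpy as np

def scatter_matrices(X, y):
    """Between-class (Sb) and within-class (Sw) scatter of the rows of X."""
    mean = X.mean(axis=0)
    d = X.shape[1]
    Sb = np.zeros((d, d))
    Sw = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        diff = (mc - mean)[:, None]
        Sb += Xc.shape[0] * (diff @ diff.T)
        Sw += (Xc - mc).T @ (Xc - mc)
    return Sb, Sw

def trace_ratio(Sb, Sw, dim, n_iter=50, tol=1e-10):
    """Maximize tr(W^T Sb W) / tr(W^T Sw W) over orthonormal W (d x dim).

    Each step fixes the current ratio lam and takes W as the top `dim`
    eigenvectors of Sb - lam * Sw; lam never decreases across steps.
    """
    d = Sb.shape[0]
    W = np.eye(d)[:, :dim]                      # arbitrary orthonormal start
    lam = np.trace(W.T @ Sb @ W) / np.trace(W.T @ Sw @ W)
    history = [lam]
    for _ in range(n_iter):
        _, vecs = np.linalg.eigh(Sb - lam * Sw)  # ascending eigenvalues
        W = vecs[:, -dim:]
        new_lam = np.trace(W.T @ Sb @ W) / np.trace(W.T @ Sw @ W)
        history.append(new_lam)
        converged = abs(new_lam - lam) < tol
        lam = new_lam
        if converged:
            break
    return W, lam, history
```

The returned `history` makes the monotone increase of the ratio easy to inspect; the subspace dimension `dim` is still chosen by hand here, which is the sensitivity the abstract says ODLDA targets.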

10
Quantitative Detection of Gastrointestinal Tumor Markers Using a Machine Learning Algorithm and Multicolor Quantum Dot Biosensor. Computational Intelligence and Neuroscience 2022; 2022:9022821. [PMID: 36093502 PMCID: PMC9458379 DOI: 10.1155/2022/9022821] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/04/2022] [Revised: 07/27/2022] [Accepted: 08/02/2022] [Indexed: 11/17/2022]
Abstract
This work explored the application value of gastrointestinal tumor markers, based on a gene feature selection model using the principal component analysis (PCA) algorithm and a multicolor quantum dot (QD) immunobiosensor, in the detection of gastrointestinal tumors. The neighborhood rough set algorithm was introduced to improve the PCA method, and the resulting tumor gene feature selection model (OPCA) was established to analyze its classification precision and accuracy. Four kinds of coupled biosensors were fabricated based on QDs, namely, 525 nm CdSe/ZnS QDs-carbohydrate antigen 125 (QDs525-CA125 McAb), 605 nm CdSe/ZnS QDs-cancer antigen 19-9 (QDs605-CA19-9 McAb), 645 nm CdSe/ZnS QDs-anticancer embryonic antigen (QDs645-CEA McAb), and 565 nm CdSe/ZnS QDs-anti-alpha-fetoprotein (QDs565-AFP McAb). The quantum dot-antibody conjugates were identified and quantified by fluorescence spectroscopy and ultraviolet absorption spectroscopy. The results showed that the classification precision of the OPCA model on the colon tumor and gastric cancer datasets was 99.52% and 99.03%, respectively, and the classification accuracy was 94.86% and 94.2%, respectively, which were significantly higher than those of other algorithms. The fluorescence values of AFP McAb, CEA McAb, CA19-9 McAb, and CA125 McAb reached their maximum when the conjugation concentrations were 25 µg/mL, 20 µg/mL, 30 µg/mL, and 30 µg/mL, respectively. The highest recovery rate of AFP was 98.51%, and its fluorescence intensity was 35.78 ± 2.99, which was significantly higher than that of other antigens (P < 0.001). In summary, the OPCA model based on the PCA algorithm can obtain smaller feature gene sets and improve the accuracy of sample classification. Intelligent immunobiosensors based on machine learning algorithms and QDs have potential application value in gastrointestinal gene feature selection and tumor marker detection, which provides a new idea for the clinical diagnosis of gastrointestinal tumors.

11
Gao YL, Wu MJ, Liu JX, Zheng CH, Wang J. Robust Principal Component Analysis Based On Hypergraph Regularization for Sample Clustering and Co-Characteristic Gene Selection. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:2420-2430. [PMID: 33690124 DOI: 10.1109/tcbb.2021.3065054] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
Extracting genes involved in cancer lesions from gene expression data is critical for cancer research and drug development. Feature selection methods have attracted much attention in the field of bioinformatics. Principal Component Analysis (PCA) is a widely used method for learning low-dimensional representations. Some variants of PCA have been proposed to improve the robustness and sparsity of the algorithm. However, the existing methods ignore the high-order relationships between data. In this paper, a new model named Robust Principal Component Analysis via Hypergraph Regularization (HRPCA) is proposed. In detail, HRPCA utilizes the L2,1-norm to reduce the effect of outliers and make the data sufficiently row-sparse. Hypergraph regularization is introduced to account for the complex relationships among data. Important information hidden in the data is mined, and this method ensures the accuracy of the resulting data relationship information. Extensive experiments on multi-view biological data demonstrate the feasibility and effectiveness of the proposed approach.
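To make the role of the L2,1-norm concrete, the sketch below computes it as the sum of row-wise Euclidean norms and applies its proximal operator, which zeroes whole rows with small norm and so produces row sparsity. The row-wise convention and the helper names `l21_norm` and `row_soft_threshold` are assumptions for illustration; the abstract does not state how HRPCA is optimized, so this is a generic building block rather than the article's solver.

```python
import numpy as np

def l21_norm(M):
    """L2,1-norm: sum over rows of each row's Euclidean (L2) norm."""
    return float(np.sum(np.linalg.norm(M, axis=1)))

def row_soft_threshold(M, tau):
    """Proximal operator of tau * ||.||_{2,1}.

    Shrinks every row's norm by tau and sets rows with norm <= tau to zero,
    which is what makes L2,1 penalties produce row-sparse matrices.
    """
    norms = np.linalg.norm(M, axis=1, keepdims=True)
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return M * scale

M = np.array([[3.0, 4.0],
              [0.0, 0.0],
              [0.3, 0.4]])
print(l21_norm(M))                 # row norms are 5.0, 0.0, 0.5
print(row_soft_threshold(M, 1.0))  # the small third row is zeroed entirely
```

Because each row contributes its unsquared norm, a single outlying row affects the penalty linearly rather than quadratically, which is the usual argument for its robustness to outliers.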

12
Arya N, Saha S. Generative Incomplete Multi-View Prognosis Predictor for Breast Cancer: GIMPP. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:2252-2263. [PMID: 34143737 DOI: 10.1109/tcbb.2021.3090458] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 06/12/2023]
Abstract
In today's digital world, we are equipped with modern computer-based data collection sources and feature extraction methods. This increases the availability of multi-view data and the corresponding research. Multi-view prediction models form a mainstream research direction in the healthcare and bioinformatics domain. While these models are designed with the assumption that there is no missing data for any view, in the real world, certain views of the data often do not have the same number of samples, resulting in an incomplete multi-view dataset. Studies performed over such datasets are termed incomplete multi-view clustering or prediction. Here, we develop a two-stage generative incomplete multi-view prediction model named GIMPP to address the missing view problem of breast cancer prognosis prediction by explicitly generating the missing data. The first stage incorporates multi-view encoder networks and a bi-modal attention scheme to learn common latent space representations by leveraging complementary knowledge between different views. The second stage generates missing view data using view-specific generative adversarial networks conditioned on the shared representations and encoded features given by other views. Experimental results on the TCGA-BRCA and METABRIC datasets prove the usefulness of the developed method over state-of-the-art methods.

13
Zhang LX, Yan H, Liu Y, Xu J, Song J, Yu DJ. Enhancing Characteristic Gene Selection and Tumor Classification by the Robust Laplacian Supervised Discriminative Sparse PCA. J Chem Inf Model 2022; 62:1794-1807. [PMID: 35353532 DOI: 10.1021/acs.jcim.1c01403] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 02/08/2023]
Abstract
Characteristic gene selection and tumor classification of gene expression data play major roles in genomic research. Due to the small sample size and high dimensionality of gene expression data, it is common practice to perform dimensionality reduction prior to the use of machine learning-based methods to analyze the expression data. In this context, classical principal component analysis (PCA) and its improved versions have been widely used. Recently, methods based on supervised discriminative sparse PCA have been developed to improve the performance of data dimensionality reduction. However, such methods still have limitations: most of them do not combine robustness to outliers and noise, label information, sparsity, and the capture of intrinsic geometrical structures in one objective function. To address this drawback, in this study, we propose a novel PCA-based method, known as the robust Laplacian supervised discriminative sparse PCA, termed RLSDSPCA, which enforces the L2,1 norm on the error function and incorporates the graph Laplacian into supervised discriminative sparse PCA. To evaluate the efficacy of the proposed RLSDSPCA, we applied it to the problems of characteristic gene selection and tumor classification using gene expression data. The results demonstrate that the proposed RLSDSPCA method, when used in combination with other related methods, can effectively identify new pathogenic genes associated with diseases. In addition, RLSDSPCA has also achieved the best performance compared with the state-of-the-art methods on tumor classification in terms of major performance metrics. The codes and data sets used in the study are freely available at http://csbio.njust.edu.cn/bioinf/rlsdspca/.
Affiliation(s)
- Lu-Xing Zhang
  - School of Computer Science and Engineering, Nanjing University of Science and Technology, 200 Xiaolingwei, Nanjing 210094, China
- He Yan
  - School of Computer Science and Engineering, Nanjing University of Science and Technology, 200 Xiaolingwei, Nanjing 210094, China
- Yan Liu
  - School of Computer Science and Engineering, Nanjing University of Science and Technology, 200 Xiaolingwei, Nanjing 210094, China
- Jian Xu
  - School of Computer Science and Engineering, Nanjing University of Science and Technology, 200 Xiaolingwei, Nanjing 210094, China
- Jiangning Song
  - Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, Victoria 3800, Australia; Monash Centre for Data Science, Faculty of Information Technology, Monash University, Melbourne, Victoria 3800, Australia
- Dong-Jun Yu
  - School of Computer Science and Engineering, Nanjing University of Science and Technology, 200 Xiaolingwei, Nanjing 210094, China

14
Wang CY, Gao YL, Liu JX, Kong XZ, Zheng CH. Single-Cell RNA Sequencing Data Clustering by Low-Rank Subspace Ensemble Framework. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:1154-1164. [PMID: 33026977 DOI: 10.1109/tcbb.2020.3029187] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 06/11/2023]
Abstract
The rapid development of single-cell RNA sequencing (scRNA-seq) technology reveals the gene expression status and gene structure of individual cells, reflecting the heterogeneity and diversity of cells. Traditional methods of scRNA-seq data analysis treat the data as lying in a single subspace, which hides structural information in other subspaces. In this paper, we propose a low-rank subspace ensemble clustering framework (LRSEC) to analyze scRNA-seq data. Assuming that the scRNA-seq data exist in multiple subspaces, the low-rank model is used to find the lowest rank representation of the data in the subspace. It is worth noting that the penalty factor of the low-rank kernel function is uncertain, and different penalty factors correspond to different low-rank structures. Moreover, a single clustering model struggles to find the cellular structure of all datasets. To strengthen the correlation between model solutions, we construct a new ensemble clustering framework, LRSEC, by using the low-rank model as the base learner. The LRSEC framework captures the global structure of data through low-rank subspaces and achieves better clustering performance than a single clustering model. We validate the performance of the LRSEC framework on seven small datasets and one large dataset and obtain satisfactory results.

15
Ye Q, Zhang X, Lin X. Drug-target interaction prediction via multiple classification strategies. BMC Bioinformatics 2022; 22:461. [PMID: 35057737 PMCID: PMC8772044 DOI: 10.1186/s12859-021-04366-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/01/2021] [Accepted: 09/08/2021] [Indexed: 12/24/2022] Open
Abstract
BACKGROUND Computational prediction of the interaction between drugs and protein targets is very important for new drug discovery, as the experimental determination of drug-target interaction (DTI) is expensive and time-consuming. However, different protein targets have very different numbers of interactions. Specifically, most interactions are concentrated on only a few targets. As a result, targets with larger numbers of interactions may have enough positive samples for predicting their interactions, but targets with smaller numbers of interactions may not. A single classification strategy may not be able to handle both cases at the same time. To overcome this problem, a drug-target interaction prediction method based on multiple classification strategies (MCSDTI) is proposed in this paper. In MCSDTI, targets are first divided into two parts according to their number of interactions: one part contains targets with smaller numbers of interactions (TWSNI) and the other contains targets with larger numbers of interactions (TWLNI). Different classification strategies are then designed for TWSNI and TWLNI to predict the interactions. Furthermore, TWSNI and TWLNI are evaluated independently, which avoids the problem that the result is mainly determined by targets with large numbers of interactions when all targets are evaluated together. RESULTS We propose a new drug-target interaction prediction method, MCSDTI, which uses multiple classification strategies. MCSDTI is tested on five DTI datasets: nuclear receptors (NR), ion channels (IC), G protein coupled receptors (GPCR), enzymes (E), and drug bank (DB). Experiments show that the AUCs of our method are respectively 3.31%, 1.27%, 2.02%, 2.02% and 1.04% higher than those of the second best methods on NR, IC, GPCR and E for TWLNI, and respectively 1.00%, 3.20% and 2.70% higher than those of the second best methods on NR, IC, and E for TWSNI. CONCLUSION MCSDTI is competitive with previous methods for all target parts on most datasets, which demonstrates that using different classification strategies for different target parts is an effective way to improve DTI prediction.
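The division of targets into TWSNI and TWLNI described in this abstract can be sketched roughly as follows. This is an illustrative sketch only: the function name `split_targets`, the `(drug, target)` pair representation, and the `threshold` parameter are assumptions, since the abstract does not say how the cutoff between the two groups is chosen.

```python
def split_targets(interactions, threshold):
    """Split targets by their number of known interactions.

    interactions: iterable of (drug, target) pairs (hypothetical format).
    threshold: hypothetical cutoff; targets with fewer interactions than
               this go to TWSNI, all others go to TWLNI.
    """
    # Count how many known interactions each target has.
    counts = {}
    for _drug, target in interactions:
        counts[target] = counts.get(target, 0) + 1

    # Targets with smaller numbers of interactions (TWSNI).
    twsni = sorted(t for t, c in counts.items() if c < threshold)
    # Targets with larger numbers of interactions (TWLNI).
    twlni = sorted(t for t, c in counts.items() if c >= threshold)
    return twsni, twlni


pairs = [("d1", "t1"), ("d2", "t1"), ("d3", "t1"), ("d1", "t2")]
small, large = split_targets(pairs, threshold=2)
```

Each group could then be passed to its own classifier and scored separately, which is the independent evaluation the abstract describes.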
Affiliation(s)
- Qing Ye
- Hubei Key Laboratory of Intelligent Information Processing and Real-Time Industrial System, School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan, China
- Xiaolong Zhang
- Hubei Key Laboratory of Intelligent Information Processing and Real-Time Industrial System, School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan, China
- Xiaoli Lin
- Hubei Key Laboratory of Intelligent Information Processing and Real-Time Industrial System, School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan, China
|
16
|
Liu Q. A truncated nuclear norm and graph-Laplacian regularized low-rank representation method for tumor clustering and gene selection. BMC Bioinformatics 2022; 22:436. [PMID: 35057728 PMCID: PMC8772046 DOI: 10.1186/s12859-021-04333-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2021] [Accepted: 08/23/2021] [Indexed: 12/24/2022] Open
Abstract
Background Clustering and feature selection play major roles in many fields. As a matrix factorization method, Low-Rank Representation (LRR) has attracted much attention in clustering and feature selection, but its performance sometimes degrades when the data samples are insufficient or contain a lot of noise. Results To address this drawback, a novel LRR model named TGLRR is proposed by integrating the truncated nuclear norm with the graph Laplacian. Unlike the nuclear norm, which minimizes all singular values, the truncated nuclear norm minimizes only some of the smallest singular values, which avoids the harmful shrinkage of the leading singular values. Finally, an efficient algorithm based on Linearized Alternating Direction with Adaptive Penalty is applied to solve the optimization problem. Conclusions The results show that the TGLRR method outperforms existing state-of-the-art methods in tumor clustering and gene selection on integrated gene expression data.
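As a rough illustration of the truncated nuclear norm mentioned in this abstract, the sketch below sums all singular values of a matrix except the `r` largest ones. The function name and the parameter `r` are assumptions made for illustration; the TGLRR model itself (its graph-Laplacian term and solver) is not reproduced here.

```python
import numpy as np


def truncated_nuclear_norm(matrix, r):
    """Sum of the singular values of `matrix`, excluding the r largest.

    With r = 0 this reduces to the ordinary nuclear norm.
    `r` is a hypothetical truncation parameter chosen by the user.
    """
    # np.linalg.svd returns singular values sorted in descending order.
    singular_values = np.linalg.svd(matrix, compute_uv=False)
    return float(singular_values[r:].sum())


M = np.diag([5.0, 3.0, 1.0])
full_norm = truncated_nuclear_norm(M, 0)   # ordinary nuclear norm
truncated = truncated_nuclear_norm(M, 1)   # leading singular value left out
```

Because the leading singular values are left out of the penalty, minimizing this quantity does not shrink them, which is the property the abstract highlights.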
|
17
|
Kong XZ, Song Y, Liu JX, Zheng CH, Yuan SS, Wang J, Dai LY. Joint Lp-Norm and L2,1-Norm Constrained Graph Laplacian PCA for Robust Tumor Sample Clustering and Gene Network Module Discovery. Front Genet 2021; 12:621317. [PMID: 33708239 PMCID: PMC7940841 DOI: 10.3389/fgene.2021.621317] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2020] [Accepted: 01/29/2021] [Indexed: 11/17/2022] Open
Abstract
Dimensionality reduction methods combined with different norm constraints play an important role in mining useful information from large-scale gene expression data. In this article, a novel method named Lp-norm and L2,1-norm constrained graph Laplacian principal component analysis (PL21GPCA), based on traditional principal component analysis (PCA), is proposed for robust tumor sample clustering and gene network module discovery. Three aspects are highlighted in the PL21GPCA method. First, to reduce the high sensitivity to outliers and noise, the non-convex proximal Lp-norm (0 < p < 1) constraint is applied to the loss function. Second, to enhance the sparsity of gene expression in cancer samples, the L2,1-norm constraint is applied to one of the regularization terms. Third, to retain the geometric structure of the data, we introduce a graph Laplacian regularization term into the PL21GPCA optimization model. Extensive experiments on five gene expression datasets, including one benchmark dataset, two single-cancer datasets from The Cancer Genome Atlas (TCGA), and two integrated datasets of multiple cancers from TCGA, are performed to validate the effectiveness of our method. The experimental results demonstrate that the PL21GPCA method performs better than many other methods in terms of tumor sample clustering. Additionally, this method is used to discover gene network modules in order to find key genes that may be associated with some cancers.
Affiliation(s)
- Jin-Xing Liu
- School of Computer Science, Qufu Normal University, Rizhao, China
- Chun-Hou Zheng
- School of Computer Science, Qufu Normal University, Rizhao, China
|
18
|
Muhammad K, Khan S, Ser JD, Albuquerque VHCD. Deep Learning for Multigrade Brain Tumor Classification in Smart Healthcare Systems: A Prospective Survey. IEEE Transactions on Neural Networks and Learning Systems 2021; 32:507-522. [PMID: 32603291 DOI: 10.1109/tnnls.2020.2995800] [Citation(s) in RCA: 86] [Impact Index Per Article: 21.5] [Indexed: 05/20/2023]
Abstract
Brain tumor is one of the most dangerous cancers in people of all ages, and its grade recognition is a challenging problem for radiologists in health monitoring and automated diagnosis. Recently, numerous methods based on deep learning have been presented in the literature for brain tumor classification (BTC) in order to assist radiologists in making a better diagnostic analysis. In this overview, we present an in-depth review of the surveys published so far and of recent deep learning-based methods for BTC. Our survey covers the main steps of deep learning-based BTC methods, including preprocessing, feature extraction, and classification, along with their achievements and limitations. We also investigate the state-of-the-art convolutional neural network models for BTC by performing extensive experiments using transfer learning with and without data augmentation. Furthermore, this overview describes available benchmark data sets used for the evaluation of BTC. Finally, this survey not only reviews the past literature on the topic but also builds on it to look into the future of this area, enumerating research directions that should be followed, especially for personalized and smart healthcare.
|
19
|
Xu D, Zhang J, Xu H, Zhang Y, Chen W, Gao R, Dehmer M. Multi-scale supervised clustering-based feature selection for tumor classification and identification of biomarkers and targets on genomic data. BMC Genomics 2020; 21:650. [PMID: 32962626 PMCID: PMC7510277 DOI: 10.1186/s12864-020-07038-3] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 03/18/2020] [Accepted: 08/30/2020] [Indexed: 12/19/2022] Open
Abstract
Background The small number of samples and the curse of dimensionality hamper the application of deep learning techniques to disease classification. Additionally, the performance of clustering-based feature selection algorithms is still far from satisfactory because they are limited to unsupervised learning methods. To enhance interpretability and overcome this problem, we developed a novel feature selection algorithm. Meanwhile, complex genomic data pose great challenges for the identification of biomarkers and therapeutic targets. Some current feature selection methods suffer from low sensitivity and specificity in this field. Results In this article, we designed a multi-scale clustering-based feature selection algorithm named MCBFS, which simultaneously performs feature selection and model learning for genomic data analysis. The experimental results demonstrated that MCBFS is robust and effective by comparing it with seven benchmark and six state-of-the-art supervised methods on eight data sets. The visualization results and the statistical test showed that MCBFS can capture the informative genes and improve the interpretability and visualization of tumor gene expression and single-cell sequencing data. Additionally, we developed a general framework named McbfsNW that uses gene expression data and protein interaction data to identify robust biomarkers and therapeutic targets for the diagnosis and therapy of diseases. The framework incorporates the MCBFS algorithm, a network recognition ensemble algorithm, and a feature selection wrapper. McbfsNW has been applied to lung adenocarcinoma (LUAD) data sets. The preliminary results demonstrated that the identified biomarkers attain better prediction results on the independent LUAD data set, and we also constructed a drug-target network which may be useful for LUAD therapy. Conclusions The proposed feature selection method is robust and effective for gene selection, classification, and visualization. The McbfsNW framework is practical and helpful for the identification of biomarkers and targets on genomic data. It is believed that the same methods and principles are extensible and applicable to other kinds of data sets.
Affiliation(s)
- Da Xu
- School of Mathematics and Statistics, Shandong University, Weihai, 264209, China
- Jialin Zhang
- School of Mathematics and Statistics, Shandong University, Weihai, 264209, China
- Hanxiao Xu
- School of Mathematics and Statistics, Shandong University, Weihai, 264209, China
- Yusen Zhang
- School of Mathematics and Statistics, Shandong University, Weihai, 264209, China
- Wei Chen
- School of Mathematics and Statistics, Shandong University, Weihai, 264209, China
- Rui Gao
- School of Control Science and Engineering, Shandong University, Jinan, 250061, China
- Matthias Dehmer
- Institute for Intelligent Production, Faculty for Management, University of Applied Sciences Upper Austria, Steyr Campus, Steyr, Austria; College of Computer and Control Engineering, Nankai University, Tianjin, 300071, China; Department of Mechatronics and Biomedical Computer Science, UMIT, Hall in Tyrol, Austria
|
20
|
Yin C, Chen Z. Developing Sustainable Classification of Diseases via Deep Learning and Semi-Supervised Learning. Healthcare (Basel) 2020; 8:E291. [PMID: 32846941 PMCID: PMC7551840 DOI: 10.3390/healthcare8030291] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 07/21/2020] [Revised: 08/19/2020] [Accepted: 08/20/2020] [Indexed: 01/07/2023] Open
Abstract
Disease classification based on machine learning has become a crucial research topic in the fields of genetics and molecular biology. Generally, disease classification involves supervised learning; i.e., it requires a large number of labelled samples to achieve good classification performance. However, in the majority of cases, labelled samples are hard to obtain, so the amount of training data is limited. Meanwhile, many unclassified (unlabelled) sequences have been deposited in public databases and may help the training procedure. Using such unlabelled samples alongside labelled ones is called semi-supervised learning and is very useful in many applications. Self-training can be implemented using high- to low-confidence samples to prevent noisy samples from affecting the robustness of semi-supervised learning in the training process. The deep forest method with the hyperparameter settings used in this paper can achieve excellent performance. Therefore, in this work, we propose a novel approach that combines a deep learning model with semi-supervised learning and self-training to improve the performance of disease classification, utilizing unlabelled samples to update a mechanism designed to increase the number of high-confidence pseudo-labelled samples. The experimental results show that our proposed model can achieve good performance in disease classification and disease-causing gene identification.
Affiliation(s)
- Chunwu Yin
- School of Information and Control Engineering, Xi’an University of Architecture and Technology, Xi’an 710055, China
- Zhanbo Chen
- School of Information and Statistics, Guangxi University of Finance and Economics, Nanning 530003, China
- Center of Guangxi Cooperative Innovation for Education Performance Assessment, Guangxi University of Finance and Economics, Nanning 530003, China
|
21
|
Wu MJ, Gao YL, Liu JX, Zheng CH, Wang J. Integrative Hypergraph Regularization Principal Component Analysis for Sample Clustering and Co-Expression Genes Network Analysis on Multi-Omics Data. IEEE J Biomed Health Inform 2020; 24:1823-1834. [DOI: 10.1109/jbhi.2019.2948456] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 11/08/2022]
|
22
|
PCA via joint graph Laplacian and sparse constraint: Identification of differentially expressed genes and sample clustering on gene expression data. BMC Bioinformatics 2019; 20:716. [PMID: 31888433 PMCID: PMC6936054 DOI: 10.1186/s12859-019-3229-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/10/2022] Open
Abstract
Background In recent years, identification of differentially expressed genes and sample clustering have become hot topics in bioinformatics. Principal Component Analysis (PCA) is a widely used method for gene expression data. However, it has two limitations. First, the geometric structure hidden in the data, e.g., the pair-wise distance between data points, is not explored, although this information can facilitate sample clustering. Second, the Principal Components (PCs) determined by PCA are dense, which makes them hard to interpret, whereas only a few genes are related to the cancer. Identifying a handful of differentially expressed genes and finding new cancer biomarkers is of great significance for the early diagnosis and treatment of cancer. Results In this study, a new method, gLSPCA, is proposed to integrate both the graph Laplacian and a sparse constraint into PCA. On the one hand, gLSPCA improves clustering accuracy by exploring the internal geometric structure of the data; on the other hand, it identifies differentially expressed genes by imposing a sparsity constraint on the PCs. Conclusions Experiments comparing gLSPCA with existing methods, including Z-SPCA, GPower, PathSPCA, SPCArt, and gLPCA, are performed on real datasets of both pancreatic cancer (PAAD) and head & neck squamous carcinoma (HNSC). The results demonstrate that gLSPCA is effective in identifying differentially expressed genes and in sample clustering. In addition, the applications of gLSPCA on these datasets provide several new clues for the exploration of causative factors of PAAD and HNSC.
|
23
|
Wang J, Lu CH, Liu JX, Dai LY, Kong XZ. Multi-cancer samples clustering via graph regularized low-rank representation method under sparse and symmetric constraints. BMC Bioinformatics 2019; 20:718. [PMID: 31888442 PMCID: PMC6936083 DOI: 10.1186/s12859-019-3231-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 12/24/2022] Open
Abstract
Background Identifying different types of cancer based on gene expression data has become a hotspot in bioinformatics research. Clustering gene expression data from multiple cancers into their own classes is a significant solution. However, the high dimensionality and small sample size of gene expression data, together with the noise in the data, make data mining and research difficult. Although there are many effective and feasible methods to deal with this problem, the possibility remains that these methods are flawed. Results In this paper, we propose the graph regularized low-rank representation under symmetric and sparse constraints (sgLRR) method, in which we introduce graph regularization based on manifold learning and symmetric sparse constraints into the traditional low-rank representation (LRR). In the sgLRR method, the symmetric and sparse constraints alleviate the effect of raw data noise on the low-rank representation. Further, the sgLRR method preserves the important intrinsic local geometrical structures of the raw data by introducing graph regularization. We apply this method to cluster multi-cancer samples based on gene expression data, which improves the clustering quality. First, the gene expression data are decomposed by the sgLRR method, yielding a lowest-rank representation matrix that is symmetric and sparse. Then, an affinity matrix is constructed and the multi-cancer samples are clustered using a spectral clustering algorithm, i.e., normalized cuts (Ncuts). Conclusions A series of comparative experiments demonstrate that the sgLRR method, based on low-rank representation, has a great advantage and remarkable performance in the clustering of multi-cancer samples.
Affiliation(s)
- Juan Wang
- School of Information Science and Engineering, Qufu Normal University, Rizhao, China
- Cong-Hai Lu
- School of Information Science and Engineering, Qufu Normal University, Rizhao, China
- Jin-Xing Liu
- School of Information Science and Engineering, Qufu Normal University, Rizhao, China
- Ling-Yun Dai
- School of Information Science and Engineering, Qufu Normal University, Rizhao, China
- Xiang-Zhen Kong
- School of Information Science and Engineering, Qufu Normal University, Rizhao, China
|
24
|
Wang CY, Liu JX, Yu N, Zheng CH. Sparse Graph Regularization Non-Negative Matrix Factorization Based on Huber Loss Model for Cancer Data Analysis. Front Genet 2019; 10:1054. [PMID: 31824556 PMCID: PMC6882287 DOI: 10.3389/fgene.2019.01054] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 07/20/2019] [Accepted: 10/01/2019] [Indexed: 12/02/2022] Open
Abstract
Non-negative matrix factorization (NMF) is a matrix decomposition method based on the square loss function. To exploit cancer information, the NMF method is often applied to cancer gene expression data to reduce dimensionality. Gene expression data usually contain some noise and outliers, while the original NMF loss function is very sensitive to non-Gaussian noise. To improve the robustness and clustering performance of the algorithm, we propose a sparse graph regularization NMF based on the Huber loss model for cancer data analysis (Huber-SGNMF). Huber loss is a function between the L1-norm and the L2-norm that can effectively handle non-Gaussian noise and outliers. To account for matrix sparsity and data geometry information, sparse penalty and graph regularization terms are introduced into the model to enhance matrix sparsity and capture the data manifold structure. Before the experiment, we first analyzed the robustness of Huber-SGNMF and other models. Experiments on The Cancer Genome Atlas (TCGA) data have shown that Huber-SGNMF performs better than other state-of-the-art methods in sample clustering and differentially expressed gene selection.
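The Huber loss, which this abstract describes as lying between the L1-norm and the L2-norm, is quadratic for small residuals and linear for large ones. The sketch below uses the textbook definition with a hypothetical threshold `delta`; the exact form and parameter values used in Huber-SGNMF are not given in the abstract, so this should not be read as the authors' implementation.

```python
def huber(residual, delta=1.0):
    """Textbook Huber loss for a single residual.

    Quadratic (L2-like) when |residual| <= delta, linear (L1-like) beyond it.
    `delta` is a hypothetical threshold chosen for illustration.
    """
    a = abs(residual)
    if a <= delta:
        return 0.5 * a * a
    return delta * (a - 0.5 * delta)


def total_huber(residuals, delta=1.0):
    """Sum of Huber losses over a collection of residuals."""
    return sum(huber(r, delta) for r in residuals)


small_residual = huber(0.5)    # falls in the quadratic regime
large_residual = huber(3.0)    # falls in the linear regime
```

An outlier with residual 3.0 contributes 2.5 here rather than the 4.5 it would contribute under a half-squared loss, which is why the loss is less sensitive to outliers.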
Affiliation(s)
- Chuan-Yuan Wang
- School of Information Science and Engineering, Qufu Normal University, Rizhao, China
- Jin-Xing Liu
- School of Information Science and Engineering, Qufu Normal University, Rizhao, China
- *Correspondence: Jin-Xing Liu,
- Na Yu
- School of Information Science and Engineering, Qufu Normal University, Rizhao, China
- Chun-Hou Zheng
- School of Software Engineering, Qufu Normal University, Qufu, China
|
25
|
A study on metaheuristics approaches for gene selection in microarray data: algorithms, applications and open challenges. Evolutionary Intelligence 2019. [DOI: 10.1007/s12065-019-00306-6] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Indexed: 10/25/2022]
|