1
Fan D, Liu Y, Liu Y. The Latest Advances in Microfluidic DLD Cell Sorting Technology: The Optimization of Channel Design. Biosensors 2025; 15:126. [PMID: 39997028 PMCID: PMC11853672 DOI: 10.3390/bios15020126] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2025] [Revised: 02/07/2025] [Accepted: 02/12/2025] [Indexed: 02/26/2025]
Abstract
Cell sorting plays a crucial role in both medical and biological research. As a key passive sorting technique in microfluidics, deterministic lateral displacement (DLD) has been widely applied to cell separation and sorting. This review summarizes the latest advances in the optimization of channel design for microfluidic DLD cell sorting. First, we provide an overview of the design elements of microfluidic DLD cell sorting channels, focusing on key factors that affect separation efficiency and accuracy, including channel geometry, fluid dynamics, and the interaction between cells and channel surfaces. Subsequently, we review recent innovations and progress in channel design for microfluidic DLD technology, exploring its applications in biomedical fields and its integration with machine learning. Additionally, we discuss the challenges currently faced in optimizing channel design for microfluidic DLD cell sorting. Finally, based on existing research, we summarize the field and offer perspectives on its further development.
Affiliation(s)
- Dan Fan
  - School of Engineering, Dali University, Dali 671003, China
- Yi Liu
  - School of Engineering, Dali University, Dali 671003, China
- Yaling Liu
  - Precision Medicine Translational Research Center, West China Hospital, Sichuan University, Chengdu 610041, China
  - Department of Bioengineering, Lehigh University, Bethlehem, PA 18015, USA
2
Hua W, Qi J, Zhou M, Han S, Xu X, Su J, Pan T, Wu D, Han Y. Overexpression of REC8 induces aberrant gamete meiotic division and contributes to AML pathogenesis - a multiplexed microarray analysis and mendelian randomization study. Ann Hematol 2024; 103:3563-3572. [PMID: 39012516 DOI: 10.1007/s00277-024-05882-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2024] [Accepted: 07/04/2024] [Indexed: 07/17/2024]
Abstract
Acute myeloid leukemia (AML) is a notably lethal disease, characterized by malignant clonal proliferation of hematopoietic stem cells in the bone marrow. This study seeks to unveil potential therapeutic targets for AML, using a combined approach of microarray analysis and Mendelian randomization (MR). We collected data samples from the Gene Expression Omnibus (GEO) database and extracted pQTL data from genome-wide association studies (GWAS) to identify overlapping genes between the DEGs and GWAS data. Gene enrichment and pathway annotation analyses were performed on these genes. Furthermore, we validated gene expression levels and assessed their clinical relevance. By taking the intersection of these gene sets, we obtained a list of co-expressed genes, including four upregulated genes (REC8, TPM2, ZMIZ1, CD82) and two downregulated genes (IFNAR1, TMCO3). MR analysis demonstrated that genetically predicted protein levels of CD82, REC8, ZMIZ1, and TPM2 were significantly associated with increased odds of AML, while IFNAR1 and TMCO3 showed a protective effect. Gene ontology and KEGG pathway analyses revealed significant enrichment in functions related to female gamete generation, meiosis, p53 signaling pathway, and cardiac muscle contraction. Differences in immune cell profiles were observed between AML survivors and those with poor prognosis, including lower levels of neutrophils and higher levels of follicular helper T cells in the latter group. This study identifies a causal relationship between gene expression and AML and highlights the potential role of REC8 in leukemogenesis, possibly through its impact on gametocyte meiotic abnormalities. The findings provide new insights into the prevention and treatment of leukemia.
Affiliation(s)
- Wenxi Hua
  - National Clinical Research Center for Hematologic Diseases, Jiangsu Institute of Hematology, The First Affiliated Hospital of Soochow University, Suzhou, China
  - Institute of Blood and Marrow Transplantation, Collaborative Innovation Center of Hematology, Soochow University, Suzhou, China
- Jiaqian Qi
  - National Clinical Research Center for Hematologic Diseases, Jiangsu Institute of Hematology, The First Affiliated Hospital of Soochow University, Suzhou, China
  - Institute of Blood and Marrow Transplantation, Collaborative Innovation Center of Hematology, Soochow University, Suzhou, China
  - Key Laboratory of Thrombosis and Hemostasis of Ministry of Health, Suzhou, China
- Meng Zhou
  - National Clinical Research Center for Hematologic Diseases, Jiangsu Institute of Hematology, The First Affiliated Hospital of Soochow University, Suzhou, China
  - Institute of Blood and Marrow Transplantation, Collaborative Innovation Center of Hematology, Soochow University, Suzhou, China
  - Key Laboratory of Thrombosis and Hemostasis of Ministry of Health, Suzhou, China
- Shiyu Han
  - National Clinical Research Center for Hematologic Diseases, Jiangsu Institute of Hematology, The First Affiliated Hospital of Soochow University, Suzhou, China
- Xiaoyan Xu
  - National Clinical Research Center for Hematologic Diseases, Jiangsu Institute of Hematology, The First Affiliated Hospital of Soochow University, Suzhou, China
  - Institute of Blood and Marrow Transplantation, Collaborative Innovation Center of Hematology, Soochow University, Suzhou, China
  - Key Laboratory of Thrombosis and Hemostasis of Ministry of Health, Suzhou, China
- Jinwen Su
  - National Clinical Research Center for Hematologic Diseases, Jiangsu Institute of Hematology, The First Affiliated Hospital of Soochow University, Suzhou, China
  - Institute of Blood and Marrow Transplantation, Collaborative Innovation Center of Hematology, Soochow University, Suzhou, China
- Tingting Pan
  - National Clinical Research Center for Hematologic Diseases, Jiangsu Institute of Hematology, The First Affiliated Hospital of Soochow University, Suzhou, China
  - Institute of Blood and Marrow Transplantation, Collaborative Innovation Center of Hematology, Soochow University, Suzhou, China
  - Key Laboratory of Thrombosis and Hemostasis of Ministry of Health, Suzhou, China
- Depei Wu
  - National Clinical Research Center for Hematologic Diseases, Jiangsu Institute of Hematology, The First Affiliated Hospital of Soochow University, Suzhou, China
  - Institute of Blood and Marrow Transplantation, Collaborative Innovation Center of Hematology, Soochow University, Suzhou, China
  - Key Laboratory of Thrombosis and Hemostasis of Ministry of Health, Suzhou, China
  - State Key Laboratory of Radiation Medicine and Protection, Soochow University, Suzhou, China
- Yue Han
  - National Clinical Research Center for Hematologic Diseases, Jiangsu Institute of Hematology, The First Affiliated Hospital of Soochow University, Suzhou, China
  - Institute of Blood and Marrow Transplantation, Collaborative Innovation Center of Hematology, Soochow University, Suzhou, China
  - Key Laboratory of Thrombosis and Hemostasis of Ministry of Health, Suzhou, China
  - State Key Laboratory of Radiation Medicine and Protection, Soochow University, Suzhou, China
3
Al-Azani S, Alkhnbashi OS, Ramadan E, Alfarraj M. Gene Expression-Based Cancer Classification for Handling the Class Imbalance Problem and Curse of Dimensionality. Int J Mol Sci 2024; 25:2102. [PMID: 38396779 PMCID: PMC10889442 DOI: 10.3390/ijms25042102] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2023] [Revised: 12/25/2023] [Accepted: 12/27/2023] [Indexed: 02/25/2024] Open
Abstract
Cancer is a leading cause of death globally. Most cancer cases are diagnosed only at a late stage because conventional methods are used, which reduces patients' chance of survival. Early detection followed by early diagnosis is therefore an important task in cancer research. Gene expression microarray technology has been applied to detect and diagnose most types of cancer in their early stages, with encouraging results. In this paper, we address the problem of classifying cancer based on gene expression while handling the class imbalance problem and the curse of dimensionality. Oversampling is used to overcome class imbalance by adding synthetic samples. The curse of dimensionality, another common issue in gene expression datasets, is addressed by applying chi-square and information gain feature selection techniques. After applying these techniques individually, we propose a method that selects the most significant genes by combining the two (CHiS and IG). We investigated the effect of these techniques individually and in combination. Four benchmark biomedical datasets (Leukemia-subtypes, Leukemia-ALLAML, Colon, and CuMiDa) were used. The experimental results reveal that the oversampling techniques improve the results in most cases. Additionally, the proposed feature selection technique outperforms the individual techniques in nearly all cases. This study also provides an empirical evaluation of several oversampling techniques combined with ensemble-based learning. The experimental results reveal that SVM-SMOTE, along with the random forests classifier, achieved the highest results, with a reported accuracy of 100%. The obtained results also surpass the findings in the existing literature.
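The oversampling idea mentioned in this abstract can be illustrated with a minimal sketch. It assumes a plain SMOTE-style interpolation between minority-class neighbours, not the SVM-SMOTE variant the authors report, and it leaves out the chi-square and information gain selection steps. The function name `smote_like_oversample`, its parameters, and the toy data are hypothetical.

```python
import random

def smote_like_oversample(X, y, minority_label, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples by interpolating between a
    minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    minority = [x for x, label in zip(X, y) if label == minority_label]
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest minority neighbours of the base sample (excluding itself).
        others = [m for m in minority if m is not base]
        others.sort(key=lambda m: sum((a - b) ** 2 for a, b in zip(base, m)))
        neighbour = rng.choice(others[:k])
        t = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + t * (b - a) for a, b in zip(base, neighbour)])
    # Return new lists so the caller's data is left untouched.
    return X + synthetic, y + [minority_label] * n_new

# Toy data (assumed): 3 minority samples (label 1) and 5 majority samples (label 0).
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3],
     [5.0, 5.1], [5.2, 4.9], [4.8, 5.0], [5.1, 5.3], [4.9, 4.7]]
y = [1, 1, 1, 0, 0, 0, 0, 0]
X_bal, y_bal = smote_like_oversample(X, y, minority_label=1, n_new=2)
```

Because each synthetic point is a convex combination of two real minority samples, it stays inside the region the minority class already occupies. In practice, oversampling would normally be applied to the training split only.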
Affiliation(s)
- Sadam Al-Azani
  - SDAIA-KFUPM Joint Research Center for Artificial Intelligence, King Fahd University of Petroleum and Minerals (KFUPM), Dhahran 31261, Saudi Arabia
- Omer S. Alkhnbashi
  - Information and Computer Science Department, King Fahd University of Petroleum and Minerals (KFUPM), Dhahran 31261, Saudi Arabia
- Emad Ramadan
  - Information and Computer Science Department, King Fahd University of Petroleum and Minerals (KFUPM), Dhahran 31261, Saudi Arabia
- Motaz Alfarraj
  - SDAIA-KFUPM Joint Research Center for Artificial Intelligence, King Fahd University of Petroleum and Minerals (KFUPM), Dhahran 31261, Saudi Arabia
  - Information and Computer Science Department, King Fahd University of Petroleum and Minerals (KFUPM), Dhahran 31261, Saudi Arabia
  - Electrical Engineering Department, King Fahd University of Petroleum and Minerals (KFUPM), Dhahran 31261, Saudi Arabia
4
Smith RN, Rosales IA, Tomaszewski KT, Mahowald GT, Araujo-Medina M, Acheampong E, Bruce A, Rios A, Otsuka T, Tsuji T, Hotta K, Colvin R. Utility of Banff Human Organ Transplant Gene Panel in Human Kidney Transplant Biopsies. Transplantation 2023; 107:1188-1199. [PMID: 36525551 PMCID: PMC10132999 DOI: 10.1097/tp.0000000000004389] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Indexed: 12/23/2022]
Abstract
BACKGROUND: Microarray transcript analysis of human renal transplantation biopsies has successfully identified the many patterns of graft rejection. To evaluate an alternative, this report tests whether gene expression from the Banff Human Organ Transplant (B-HOT) probe set panel, derived from validated microarrays, can identify the relevant allograft diagnoses directly from archival human renal transplant formalin-fixed paraffin-embedded biopsies. To test this hypothesis, principal components (PCs) of gene expression were used to identify allograft diagnoses, to classify diagnoses, and to determine whether the PC data were rich enough to identify diagnostic subtypes by clustering, all of which are needed if the B-HOT panel is to substitute for microarrays.
METHODS: RNA was isolated from routine, archival formalin-fixed paraffin-embedded tissue renal biopsy cores with both rejection and nonrejection diagnoses. The B-HOT panel expression of 770 genes was analyzed by PCs, which were then tested to determine their ability to identify diagnoses.
RESULTS: PCs of microarray gene sets identified the Banff categories of renal allograft diagnoses and modeled the aggregate diagnoses well, showing a correspondence with the pathologic diagnoses similar to that of microarrays. Clustering of the PCs identified diagnostic subtypes, including non-chronic antibody-mediated rejection with high endothelial expression. PCs of cell types and pathways identified new mechanistic patterns, including differential expression of B and plasma cells.
CONCLUSIONS: Using PCs of gene expression from the B-HOT panel confirms the utility of the panel for identifying allograft diagnoses, with performance similar to microarrays. The B-HOT panel will accelerate and expand transcript analysis and will be useful for longitudinal and outcome studies.
Affiliation(s)
- Rex N Smith
  - Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA
  - Center for Transplantation Sciences, Massachusetts General Hospital, Boston, MA
- Ivy A Rosales
  - Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA
  - Center for Transplantation Sciences, Massachusetts General Hospital, Boston, MA
- Kristen T Tomaszewski
  - Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA
  - Center for Transplantation Sciences, Massachusetts General Hospital, Boston, MA
- Grace T Mahowald
  - Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA
- Milagros Araujo-Medina
  - Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA
- Ellen Acheampong
  - Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA
- Amy Bruce
  - Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA
- Andrea Rios
  - Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA
- Takuya Otsuka
  - Department of Surgical Pathology, Hokkaido University Hospital, Sapporo, Japan
- Takahiro Tsuji
  - Department of Pathology, Sapporo City General Hospital, Sapporo, Japan
- Kiyohiko Hotta
  - Department of Urology, Hokkaido University Hospital, Sapporo, Japan
- Robert Colvin
  - Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA
  - Center for Transplantation Sciences, Massachusetts General Hospital, Boston, MA
5
Wu Y, Zhu D, Wang X. Tree enhanced deep adaptive network for cancer prediction with high dimension low sample size microarray data. Appl Soft Comput 2023. [DOI: 10.1016/j.asoc.2023.110078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/05/2023]
6
Yu K, Huang M, Chen S, Feng C, Li W. GSEnet: feature extraction of gene expression data and its application to Leukemia classification. Mathematical Biosciences and Engineering: MBE 2022; 19:4881-4891. [PMID: 35430845 DOI: 10.3934/mbe.2022228] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Gene expression data is highly dimensional. As disease-related genes account for only a tiny fraction, a deep learning model, namely GSEnet, is proposed to extract instructive features from gene expression data. This model consists of three modules, namely the pre-conv module, the SE-Resnet module, and the SE-conv module. Effectiveness of the proposed model on the performance improvement of 9 representative classifiers is evaluated. Seven evaluation metrics are used for this assessment on the GSE99095 dataset. Robustness and advantages of the proposed model compared with representative feature selection methods are also discussed. Results show superiority of the proposed model on the improvement of the classification precision and accuracy.
Affiliation(s)
- Kun Yu
  - College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, Liaoning 110819, China
  - Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Shenyang, Liaoning 110819, China
- Mingxu Huang
  - School of Computer Science and Engineering, Northeastern University, Shenyang, Liaoning 110819, China
- Shuaizheng Chen
  - School of Computer Science and Engineering, Northeastern University, Shenyang, Liaoning 110819, China
- Chaolu Feng
  - Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Shenyang, Liaoning 110819, China
  - School of Computer Science and Engineering, Northeastern University, Shenyang, Liaoning 110819, China
- Wei Li
  - Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Shenyang, Liaoning 110819, China
  - School of Computer Science and Engineering, Northeastern University, Shenyang, Liaoning 110819, China
7
Rather AA, Chachoo MA. Manifold learning based robust clustering of gene expression data for cancer subtyping. Informatics in Medicine Unlocked 2022. [DOI: 10.1016/j.imu.2022.100907] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 01/10/2023] Open
8
The Higher-Order of Adaptive Lasso and Elastic Net Methods for Classification on High Dimensional Data. Mathematics 2021. [DOI: 10.3390/math9101091] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 11/17/2022]
Abstract
The lasso and elastic net methods are popular techniques for parameter estimation and variable selection. The adaptive lasso and adaptive elastic net methods apply adaptive weights to the penalty function, based on the lasso and elastic net estimates. The adaptive weight is related to the power order of the estimator. Normally, these methods focus on estimating parameters in linear regression models in which both the dependent and independent variables are continuous. In this paper, we compare the lasso and elastic net methods with the higher-order adaptive lasso and adaptive elastic net methods for classification on high-dimensional data. Classification here means predicting a categorical dependent variable from the independent variables, using the logistic regression model. The dependent variable is binary, and the independent variables are continuous. Data are high-dimensional when the number of independent variables exceeds the sample size. In the simulation, the logistic regression has a binary dependent variable and 20, 30, 40, or 50 independent variables, with sample sizes smaller than the number of independent variables. The independent variables are generated from normal distributions with several variances, and the dependent variable is obtained from the probability given by the logit function, transformed to binary data. For the application to real data, the type of leukemia is the dependent variable and a subset of gene expression values are the independent variables. The methods are compared by the average percentage of prediction accuracy. The results show that the higher-order adaptive lasso method performs well under large dispersion, whereas the higher-order adaptive elastic net method performs better under small dispersion.
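For orientation, the adaptive penalties referred to in this abstract can be written out in a commonly used textbook form. This is a sketch, not necessarily the paper's exact notation: $\ell(\beta)$ is the logistic log-likelihood, $\tilde{\beta}$ an initial (lasso or elastic net) estimate, and $\gamma > 0$ the power order of the weights. All of these symbols are assumptions introduced here.

```latex
\ell(\beta) = \sum_{i=1}^{n} \Bigl[ y_i x_i^{\top}\beta - \log\bigl(1 + e^{x_i^{\top}\beta}\bigr) \Bigr],
\qquad
\hat{w}_j = \frac{1}{\lvert \tilde{\beta}_j \rvert^{\gamma}}

\hat{\beta}^{\text{adaptive lasso}}
  = \arg\min_{\beta} \Bigl\{ -\ell(\beta) + \lambda \sum_{j=1}^{p} \hat{w}_j \lvert \beta_j \rvert \Bigr\}

\hat{\beta}^{\text{adaptive elastic net}}
  = \arg\min_{\beta} \Bigl\{ -\ell(\beta) + \lambda_1 \sum_{j=1}^{p} \hat{w}_j \lvert \beta_j \rvert
    + \lambda_2 \sum_{j=1}^{p} \beta_j^{2} \Bigr\}
```

Some formulations rescale the elastic net solution by a constant factor, which is omitted here. A larger $\gamma$ penalises coefficients with small initial estimates more heavily, which is presumably what "higher-order" refers to in the abstract.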
9
A novel PCA-based approach for building on-board sensor classifiers for water contaminant detection. Pattern Recognit Lett 2020. [DOI: 10.1016/j.patrec.2020.05.015] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 11/18/2022]
10
Abstract
BACKGROUND: Pleural Mesothelioma (PM) is an unusual, aggressive tumor that rapidly develops into cancer in the pleura of the lungs. Pleural Mesothelioma is a common type of Mesothelioma that accounts for about 75% of all Mesothelioma diagnosed yearly in the U.S. Diagnosis of Mesothelioma takes several months and is expensive. Given the risk and constraints associated with PM diagnosis, early identification of this ailment is essential for patient health.
OBJECTIVE: In this study, we use artificial intelligence algorithms to recommend the best-fit model for early diagnosis and prognosis of Malignant Pleural Mesothelioma (MPM).
METHODS: We retrospectively retrieved patients' clinical data collected by Dicle University, Turkey and applied multilayered perceptron (MLP), voted perceptron (VP), Clojure classifier (CC), kernel logistic regression (KLR), stochastic gradient descent (SGD), adaptive boosting (AdaBoost), Hoeffding tree (VFDT), and primal estimated sub-gradient solver for support vector machine (s-Pegasos). We evaluated, compared, and tested the models using a paired t-test (corrected) at 0.05 significance based on their respective classification accuracy, f-measure, precision, recall, root mean squared error, receiver operating characteristic (ROC) curve, and precision-recall curve (PRC).
RESULTS: In phase 1, SGD, AdaBoost.M1, KLR, MLP, and VFDT generated optimal results with the highest possible performance measures. In phase 2, AdaBoost, with a classification accuracy of 71.29%, outperformed all other algorithms. C-reactive protein, platelet count, duration of symptoms, gender, and pleural protein were found to be the most relevant predictors that can prognosticate Mesothelioma.
CONCLUSION: This study confirms that data obtained from biopsy and imaging tests are strong predictors of Mesothelioma but are associated with a high cost; however, they can identify Mesothelioma with optimal accuracy.
11
Basavegowda HS, Dagnew G. Deep learning approach for microarray cancer data classification. CAAI Transactions on Intelligence Technology 2020. [DOI: 10.1049/trit.2019.0028] [Citation(s) in RCA: 110] [Impact Index Per Article: 22.0] [Indexed: 12/27/2022] Open
Affiliation(s)
- Hema Shekar Basavegowda
  - Department of Studies and Research in Computer Science, Mangalore University, Mangalore, Karnataka, India
- Guesh Dagnew
  - Department of Studies and Research in Computer Science, Mangalore University, Mangalore, Karnataka, India
12
Hare SR, Bratholm LA, Glowacki DR, Carpenter BK. Low dimensional representations along intrinsic reaction coordinates and molecular dynamics trajectories using interatomic distance matrices. Chem Sci 2019; 10:9954-9968. [PMID: 32055352 PMCID: PMC6991188 DOI: 10.1039/c9sc02742d] [Citation(s) in RCA: 37] [Impact Index Per Article: 6.2] [Received: 06/05/2019] [Accepted: 08/23/2019] [Indexed: 01/22/2023] Open
Abstract
Most chemical transformations (reactions or conformational changes) that are of interest to researchers have many degrees of freedom, usually too many to visualize without reducing the dimensionality of the system to include only the most important atomic motions. In this article, we describe a method of using Principal Component Analysis (PCA) for analyzing a series of molecular geometries (e.g., a reaction pathway or molecular dynamics trajectory) and determining the reduced dimensional space that captures the most structural variance in the fewest dimensions. The software written to carry out this method is called PathReducer, which permits (1) visualizing the geometries in a reduced dimensional space, (2) determining the axes that make up the reduced dimensional space, and (3) projecting the series of geometries into the low-dimensional space for visualization. We investigated two options to represent molecular structures within PathReducer: aligned Cartesian coordinates and matrices of interatomic distances. We found that interatomic distance matrices better captured non-linear motions in a smaller number of dimensions. To demonstrate the utility of PathReducer, we have carried out a number of applications where we have projected molecular dynamics trajectories into a reduced dimensional space defined by an intrinsic reaction coordinate. The visualizations provided by this analysis show that dynamic paths can differ greatly from the minimum energy pathway on a potential energy surface. Viewing intrinsic reaction coordinates and trajectories in this way provides a quick way to gather qualitative information about the pathways trajectories take relative to a minimum energy path. Given that the outputs from PCA are linear combinations of the input molecular structure coordinates (i.e., Cartesian coordinates or interatomic distances), they can be easily transferred to other types of calculations that require the definition of a reduced dimensional space (e.g., biased molecular dynamics simulations).
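The core idea described in this abstract, PCA over interatomic distances for a series of geometries, can be sketched in a few lines. This is an illustrative toy under stated assumptions, not the PathReducer implementation: it extracts only the leading component by power iteration, and the function names and the three-atom trajectory are hypothetical.

```python
import math
from itertools import combinations

def distance_features(geometry):
    """Flatten the upper triangle of the interatomic distance matrix."""
    return [math.dist(a, b) for a, b in combinations(geometry, 2)]

def first_principal_component(rows, iterations=200):
    """Return (axis, scores) for the leading principal component.

    Power iteration on the covariance of the mean-centred rows; the start
    vector of all ones is assumed not to be orthogonal to the leading axis.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    v = [1.0] * d
    for _ in range(iterations):
        # Apply Xc^T (Xc v) without forming the covariance matrix explicitly.
        scores = [sum(x * w for x, w in zip(row, v)) for row in centered]
        v = [sum(s * row[j] for s, row in zip(scores, centered)) for j in range(d)]
        norm = math.sqrt(sum(w * w for w in v))
        if norm == 0.0:
            raise ValueError("geometries have no variance along the start vector")
        v = [w / norm for w in v]
    scores = [sum(x * w for x, w in zip(row, v)) for row in centered]
    return v, scores

# Toy trajectory (assumed): atoms 0 and 1 are fixed, atom 2 moves away from atom 1.
trajectory = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0 + 0.2 * step, 0.0)]
    for step in range(6)
]
features = [distance_features(g) for g in trajectory]
axis, scores = first_principal_component(features)
```

Here the constant 0-1 distance receives zero weight on the axis, and the projected scores order the frames along the stretching motion. Distance features need no structural alignment, which is one practical reason to prefer them over Cartesian coordinates.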
Affiliation(s)
- Stephanie R Hare
  - University of Bristol School of Chemistry, Cantock's Close, Bristol, UK BS8 1TS
  - University of Bristol School of Mathematics, University Walk, Bristol, UK BS8 1TW
- Lars A Bratholm
  - University of Bristol School of Chemistry, Cantock's Close, Bristol, UK BS8 1TS
  - University of Bristol School of Mathematics, University Walk, Bristol, UK BS8 1TW
- David R Glowacki
  - University of Bristol School of Chemistry, Cantock's Close, Bristol, UK BS8 1TS
  - University of Bristol School of Computer Science, Merchant Venturers Building, Woodland Road, Bristol, UK BS8 1UB
- Barry K Carpenter
  - Cardiff University School of Chemistry, Main Building, Park Place, Cardiff, UK CF10 3AT
13
Jansi Rani M, Devaraj D. Two-Stage Hybrid Gene Selection Using Mutual Information and Genetic Algorithm for Cancer Data Classification. J Med Syst 2019; 43:235. [PMID: 31209677 DOI: 10.1007/s10916-019-1372-8] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 03/22/2019] [Accepted: 06/05/2019] [Indexed: 01/20/2023]
Abstract
Cancer is a deadly disease which requires a very complex and costly treatment. Microarray data classification plays an important role in cancer treatment. An efficient gene selection technique to select the more promising genes is necessary for cancer classification. Here, we propose a Two-stage MI-GA Gene Selection algorithm for selecting informative genes in cancer data classification. In the first stage, Mutual Information based gene selection is applied which selects only the genes that have high information related to the cancer. The genes which have high mutual information value are given as input to the second stage. The Genetic Algorithm based gene selection is applied in the second stage to identify and select the optimal set of genes required for accurate classification. For classification, Support Vector Machine (SVM) is used. The proposed MI-GA gene selection approach is applied to Colon, Lung and Ovarian cancer datasets and the results show that the proposed gene selection approach results in higher classification accuracy compared to the existing methods.
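The first stage described in this abstract, ranking genes by mutual information with the class label, can be sketched as follows. It assumes expression values have already been discretised (for example into high/low), and it omits the genetic algorithm stage and the SVM classifier. The function names and the toy data are hypothetical, not taken from the paper.

```python
import math
from collections import Counter

def mutual_information(feature, labels):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(labels)
    joint = Counter(zip(feature, labels))
    f_counts, l_counts = Counter(feature), Counter(labels)
    # Sum p(f, l) * log2( p(f, l) / (p(f) * p(l)) ) over observed pairs.
    return sum(
        (c / n) * math.log2(c * n / (f_counts[f] * l_counts[l]))
        for (f, l), c in joint.items()
    )

def top_genes_by_mi(expression, labels, n_keep):
    """Stage one: keep the n_keep genes with the highest MI with the labels."""
    scores = {gene: mutual_information(vals, labels)
              for gene, vals in expression.items()}
    return sorted(scores, key=scores.get, reverse=True)[:n_keep]

# Toy discretised expression (assumed): 1 = high, 0 = low, four samples.
expression = {
    "g1": [1, 1, 0, 0],  # perfectly tracks the label
    "g2": [1, 0, 1, 0],  # independent of the label
    "g3": [1, 1, 1, 0],  # partially informative
}
labels = [1, 1, 0, 0]
selected = top_genes_by_mi(expression, labels, n_keep=2)
```

The surviving genes would then form the search space for the second-stage optimiser, which keeps the more expensive wrapper search tractable.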
Affiliation(s)
- M Jansi Rani
  - School of Computing, Kalasalingam Academy of Research and Education, Krishnankoil, Virudhunagar, India
- D Devaraj
  - School of Electronics & Electrical Technology, Kalasalingam Academy of Research and Education, Krishnankoil, Virudhunagar, India
14
Ayyad SM, Saleh AI, Labib LM. Gene expression cancer classification using modified K-Nearest Neighbors technique. Biosystems 2019; 176:41-51. [PMID: 30611843 DOI: 10.1016/j.biosystems.2018.12.009] [Citation(s) in RCA: 42] [Impact Index Per Article: 7.0] [Received: 07/22/2018] [Revised: 12/17/2018] [Accepted: 12/31/2018] [Indexed: 12/18/2022]
Abstract
Gene expression microarray classification is a crucial research field, as it has been employed in cancer prediction and diagnosis systems. Gene expression data are composed of dozens of samples characterized by thousands of genes. Hence, accurate and effective classification of such samples is a challenge. Machine learning techniques have been broadly utilized to build substantial and precise classification models. This paper proposes a new classification technique for gene expression data, called modified k-nearest neighbor (MKNN). MKNN is applied in two scenarios, namely smallest modified KNN (SMKNN) and largest modified KNN (LMKNN). Both implementations are undertaken to enhance the performance of KNN. The key idea is to employ robust neighbors from the training data by using a new weighting strategy. Several experiments have been performed on six different gene expression datasets. Experiments have shown that MKNN, in both scenarios, outperforms traditional as well as recent techniques. MKNN has been compared against (i) KNN, (ii) weighted KNN, (iii) support vector machine (SVM), (iv) fuzzy support vector machine, and (v) brain emotional learning (BEL) in terms of classification accuracy, precision, and recall. In addition, results show that MKNN requires less testing time than both KNN and weighted KNN.
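As context for the comparison in this abstract, the sketch below shows the conventional inverse-distance weighted KNN baseline. It is not the authors' MKNN weighting strategy, which the abstract does not specify. The function name, the `eps` guard, and the toy samples are assumptions for illustration.

```python
import math
from collections import defaultdict

def weighted_knn_predict(train_X, train_y, query, k=3, eps=1e-12):
    """Classify `query` by inverse-distance-weighted voting among k neighbours."""
    neighbours = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )[:k]
    votes = defaultdict(float)
    for distance, label in neighbours:
        # Closer neighbours contribute larger votes; eps avoids division by zero.
        votes[label] += 1.0 / (distance + eps)
    return max(votes, key=votes.get)

# Toy two-gene expression profiles (assumed values).
train_X = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
train_y = ["normal", "normal", "tumor", "tumor", "tumor"]
near_normal = weighted_knn_predict(train_X, train_y, (0.5, 0.5))
near_tumor = weighted_knn_predict(train_X, train_y, (5.2, 5.4))
```

With thousands of genes per sample, distances become less discriminative, which is one motivation for alternative neighbour-weighting schemes such as the one the paper proposes.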
Affiliation(s)
- Sarah M Ayyad
  - Computers and Systems Department, Faculty of Engineering, Mansoura University, Mansoura, Egypt
- Ahmed I Saleh
  - Computers and Systems Department, Faculty of Engineering, Mansoura University, Mansoura, Egypt
- Labib M Labib
  - Computers and Systems Department, Faculty of Engineering, Mansoura University, Mansoura, Egypt
15
Gene Selection in Cancer Classification Using Sparse Logistic Regression with L1/2 Regularization. Applied Sciences-Basel 2018. [DOI: 10.3390/app8091569] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Indexed: 11/17/2022]
Abstract
In recent years, gene selection for cancer classification based on the expression of a small number of gene biomarkers has been the subject of much research in genetics and molecular biology. The successful identification of gene biomarkers will help in the classification of different types of cancer and improve the prediction accuracy. Recently, regularized logistic regression using $L_1$ regularization has been successfully applied in high-dimensional cancer classification to tackle both the estimation of gene coefficients and the simultaneous performance of gene selection. However, $L_1$ regularization yields biased gene selection and does not have the oracle property. To address these problems, we investigate $L_{1/2}$ regularized logistic regression for gene selection in cancer classification. Experimental results on three DNA microarray datasets demonstrate that our proposed method outperforms other commonly used sparse methods ($L_1$ and $L_{EN}$) in terms of classification performance.
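The three penalties compared in this abstract can be written side by side. This is a sketch in generic notation rather than the paper's own: $\ell(\beta)$ denotes the logistic log-likelihood over $n$ samples and $\lambda$, $\lambda_1$, $\lambda_2$ are tuning parameters, all of which are assumptions introduced here. The $1/n$ normalisation is a convention that varies between papers.

```latex
\hat{\beta} = \arg\min_{\beta} \Bigl\{ -\tfrac{1}{n}\,\ell(\beta) + P(\beta) \Bigr\},
\qquad
P(\beta) =
\begin{cases}
\lambda \sum_{j=1}^{p} \lvert \beta_j \rvert & (L_1) \\[4pt]
\lambda_1 \sum_{j=1}^{p} \lvert \beta_j \rvert + \lambda_2 \sum_{j=1}^{p} \beta_j^{2} & (L_{EN}) \\[4pt]
\lambda \sum_{j=1}^{p} \lvert \beta_j \rvert^{1/2} & (L_{1/2})
\end{cases}
```

The $L_{1/2}$ penalty is non-convex, so it generally needs a specialised solver. In exchange, it shrinks large coefficients less than $L_1$ does, which relates to the bias issue the abstract raises.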
|
16
|
|
17
|
Urda D, Luque-Baena RM, Franco L, Jerez JM, Sanchez-Marono N. Machine learning models to search relevant genetic signatures in clinical context. 2017 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN) 2017:1649-1656. [DOI: 10.1109/ijcnn.2017.7966049] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/04/2025]
|
18
|
Jia X, Zhu G, Han Q, Lu Z. The biological knowledge discovery by PCCF measure and PCA-F projection. PLoS One 2017; 12:e0175104. [PMID: 28399180 PMCID: PMC5388332 DOI: 10.1371/journal.pone.0175104] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 05/05/2016] [Accepted: 03/21/2017] [Indexed: 11/19/2022]
Abstract
In the process of biological knowledge discovery, PCA is commonly used to complement clustering analysis, but PCA typically gives poor visualizations for most gene expression data sets. Here, we propose a PCCF measure and use PCA-F to display clusters of PCCF, where PCCF and PCA-F are modeled from the modified cumulative probabilities of genes. From the analysis of simulated and experimental data sets, we demonstrate that PCCF is more appropriate and reliable for analyzing gene expression data than other commonly used distance or similarity measures, and that PCA-F is a good visualization technique for identifying clusters of PCCF. We target data sets in which the expression values of genes are collected at different time points.
Affiliation(s)
- Xingang Jia
- Department of Mathematics, Southeast University, Nanjing 210096, PR China
- State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing 210096, PR China
| | - Guanqun Zhu
- Department of Chemistry, Nanjing Agricultural University, Nanjing 210000, PR China
| | - Qiuhong Han
- Department of Mathematics, Nanjing Forestry University, Nanjing 210037, PR China
| | - Zuhong Lu
- State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing 210096, PR China
| |
|
19
|
Detecting Susceptibility to Breast Cancer with SNP-SNP Interaction Using BPSOHS and Emotional Neural Networks. BIOMED RESEARCH INTERNATIONAL 2017; 2016:5164347. [PMID: 27294121 PMCID: PMC4879248 DOI: 10.1155/2016/5164347] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 03/23/2016] [Revised: 04/18/2016] [Accepted: 04/20/2016] [Indexed: 02/08/2023]
Abstract
Studies of the association between diseases and informative single nucleotide polymorphisms (SNPs) have received great attention. However, most of them simply use the whole set of useful SNPs and fail to consider SNP-SNP interactions, even though these interactions have already been demonstrated in biological experiments. In this paper, we use a binary particle swarm optimization with hierarchical structure (BPSOHS) algorithm to improve the effectiveness of PSO for identifying SNP-SNP interactions. Furthermore, in order to use these SNP interactions in the susceptibility analysis, we propose an emotional neural network (ENN) that treats SNP interactions as emotional tendency. Unlike the normal architecture, and like the emotional brain, this architecture provides a specific path for the emotional value, by which the SNP interactions can be considered more quickly and directly. The ENN helps us use prior knowledge about the SNP interactions together with other influencing factors. Finally, the experimental results show that the proposed BPSOHS_ENN algorithm can detect informative SNP-SNP interactions and predict breast cancer risk with much higher accuracy than existing methods.
|
20
|
A winner-take-all approach to emotional neural networks with universal approximation property. Inf Sci (N Y) 2016. [DOI: 10.1016/j.ins.2016.01.055] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.1] [Indexed: 01/17/2023]
|
21
|
Mollaee M, Moattar MH. A novel feature extraction approach based on ensemble feature selection and modified discriminant independent component analysis for microarray data classification. Biocybern Biomed Eng 2016. [DOI: 10.1016/j.bbe.2016.05.001] [Citation(s) in RCA: 40] [Impact Index Per Article: 4.4] [Indexed: 11/16/2022]
|
22
|
Algamal ZY, Lee MH. Regularized logistic regression with adjusted adaptive elastic net for gene selection in high dimensional cancer classification. Comput Biol Med 2015; 67:136-45. [DOI: 10.1016/j.compbiomed.2015.10.008] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.9] [Received: 06/14/2015] [Revised: 10/07/2015] [Accepted: 10/08/2015] [Indexed: 10/22/2022]
|