Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Kong Y, Yu T. A graph-embedded deep feedforward network for disease outcome classification and feature selection using gene expression data. Bioinformatics 2018;34:3727-3737. [PMID: 29850911 PMCID: PMC6198851 DOI: 10.1093/bioinformatics/bty429] [Citation(s) in RCA: 66] [Impact Index Per Article: 9.4] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/24/2018] [Revised: 04/30/2018] [Accepted: 05/23/2018] [Indexed: 12/16/2022] Open

Cited by Other Article(s)

Zeng Y, Zhang Y, Xiao Z, Sui H. A multi-classification deep neural network for cancer type identification from high-dimension, small-sample and imbalanced gene microarray data. Sci Rep 2025;15:5239. [PMID: 39939378 PMCID: PMC11822135 DOI: 10.1038/s41598-025-89475-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/22/2024] [Accepted: 02/05/2025] [Indexed: 02/14/2025] Open

Abstract

Gene microarray technology provides an efficient way to diagnose cancer. However, microarray gene expression data face the challenges of high dimensionality, small sample size, and multi-class imbalance. The coupling of these challenges leads to inaccurate results when using traditional feature selection and classification algorithms. Owing to their fast learning speed and good classification performance, deep neural networks such as the generative adversarial network have proven to be among the best classification algorithms, especially in the bioinformatics domain. However, the generative adversarial network is limited to binary applications and is inefficient in processing high-dimensional sparse features. This paper proposes a multi-classification generative adversarial network model combined with feature bundling (MGAN-FB) to handle the coupling of high dimensionality, small sample size, and multi-class imbalance for gene microarray data classification at both the feature and algorithmic levels. At the feature level, a deep encoder structure combining a feature bundling (FB) mechanism and a squeeze and excite (SE) mechanism is designed for the generator, so that the sparsity, correlation and consequence of high-dimension features are all taken into consideration for adaptive feature extraction. It achieves effective dimensionality reduction without transitional information loss. At the algorithmic level, a softmax module coupled with a multi-classifier is introduced into the discriminator, and a new objective function is designed for the proposed MGAN-FB model, considering encode loss, reconstruction loss, discrimination loss and multi-classification loss. We extend the generative adversarial framework from binary classification to the multi-classification field. Experiments on eight open-source gene microarray datasets, evaluated by classification performance, running time and non-parametric tests, demonstrate that the proposed method has obvious advantages over the 7 other compared methods.

Li R, Yi H, Ma S. A Selective Review of Network Analysis Methods for Gene Expression Data. Methods Mol Biol 2025;2880:293-307. [PMID: 39900765 DOI: 10.1007/978-1-0716-4276-4_14] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/05/2025]

Khullar S, Huang X, Ramesh R, Svaren J, Wang D. NetREm: Network Regression Embeddings reveal cell-type transcription factor coordination for gene regulation. BIOINFORMATICS ADVANCES 2024;5:vbae206. [PMID: 40260118 PMCID: PMC12011367 DOI: 10.1093/bioadv/vbae206] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 06/05/2024] [Revised: 10/22/2024] [Accepted: 12/18/2024] [Indexed: 04/23/2025]

Sun C, Liu ZP. Discovering explainable biomarkers for breast cancer anti-PD1 response via network Shapley value analysis. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2024;257:108481. [PMID: 39488042 DOI: 10.1016/j.cmpb.2024.108481] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/14/2024] [Revised: 10/20/2024] [Accepted: 10/24/2024] [Indexed: 11/04/2024]

Abstract

BACKGROUND AND OBJECTIVE

Immunotherapy holds promise in enhancing pathological complete response rates in breast cancer, albeit confined to a select cohort of patients. Consequently, pinpointing factors predictive of treatment responsiveness is of paramount importance. Gene expression and regulation, inherently operating within intricate networks, constitute fundamental molecular machinery for cellular processes and often serve as robust biomarkers. Nevertheless, contemporary feature selection approaches grapple with two key challenges: opacity in modeling and scarcity in accounting for gene-gene interactions.

METHODS

To address these limitations, we devise a novel feature selection methodology grounded in cooperative game theory, harmoniously integrating with sophisticated machine learning models. This approach identifies interconnected gene regulatory network biomarker modules with priori genetic linkage architecture. Specifically, we leverage Shapley values on network to quantify feature importance, while strategically constraining their integration based on network expansion principles and nodal adjacency, thereby fostering enhanced interpretability in feature selection. We apply our methods to a publicly available single-cell RNA sequencing dataset of breast cancer immunotherapy responses, using the identified feature gene set as biomarkers. Functional enrichment analysis with independent validations further illustrates their effective predictive performance.

RESULTS

We demonstrate the sophistication and excellence of the proposed method in data with network structure. It unveiled a cohesive biomarker module encompassing 27 genes for immunotherapy response. Notably, this module proves adept at precisely predicting anti-PD1 therapeutic outcomes in breast cancer patients with classification accuracy of 0.905 and AUC value of 0.971, underscoring its unique capacity to illuminate gene functionalities.

CONCLUSION

The proposed method is effective for identifying network module biomarkers, and the detected anti-PD1 response biomarkers can enrich our understanding of the underlying physiological mechanisms of immunotherapy, which have a promising application for realizing precision medicine.

Liang H, Luo H, Sang Z, Jia M, Jiang X, Wang Z, Cong S, Yao X. GREMI: An Explainable Multi-Omics Integration Framework for Enhanced Disease Prediction and Module Identification. IEEE J Biomed Health Inform 2024;28:6983-6996. [PMID: 39110558 DOI: 10.1109/jbhi.2024.3439713] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 08/10/2024]

Jagadesh P, Khan AH, Priya BS, Asheeka A, Zoubir Z, Magbool HM, Alam S, Bakather OY. Artificial neural network, machine learning modelling of compressive strength of recycled coarse aggregate based self-compacting concrete. PLoS One 2024;19:e0303101. [PMID: 38739642 PMCID: PMC11090367 DOI: 10.1371/journal.pone.0303101] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/19/2024] [Accepted: 04/15/2024] [Indexed: 05/16/2024] Open

Abstract

This research study aims to understand the application of Artificial Neural Networks (ANNs) to forecast the Self-Compacting Recycled Coarse Aggregate Concrete (SCRCAC) compressive strength. From different literature, 602 available data sets from SCRCAC mix designs are collected, and the data are rearranged, reconstructed, trained and tested for the ANN model development. The models were established using seven input variables: the mass of cementitious content, water, natural coarse aggregate content, natural fine aggregate content, recycled coarse aggregate content, chemical admixture and mineral admixture used in the SCRCAC mix designs. Two normalization techniques are used for data normalization to visualize the data distribution. For each normalization technique, three transfer functions are used for modelling. In total, six different types of models were run in MATLAB and used to estimate the 28th day SCRCAC compressive strength. Normalization technique 2 performs better than 1 and TANSING is the best transfer function. The best k-fold cross-validation fold is k = 7. The coefficient of determination for predicted and actual compressive strength is 0.78 for training and 0.86 for testing. The impact of the number of neurons and layers on the model was examined. Inputs from standards are used to forecast the 28th day compressive strength. Apart from ANN, Machine Learning (ML) techniques like random forest, extra trees, extreme boosting and light gradient boosting techniques are adopted to predict the 28th day compressive strength of SCRCAC. Compared to ML, ANN prediction shows better results in terms of sensitivity analysis. The study was also extended to determine the 28th day compressive strength from experimental work and compare it with the 28th day compressive strength from the best ANN model. Standard and ANN mix designs have similar fresh and hardened properties. The average compressive strengths from the ANN model and the experimental results are 39.067 and 38.36 MPa, respectively, with a correlation coefficient of 1. It appears that ANN can validly predict the compressive strength of concrete.

Chereda H, Leha A, Beißbarth T. Stable feature selection utilizing Graph Convolutional Neural Network and Layer-wise Relevance Propagation for biomarker discovery in breast cancer. Artif Intell Med 2024;151:102840. [PMID: 38658129 DOI: 10.1016/j.artmed.2024.102840] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/15/2023] [Revised: 03/05/2024] [Accepted: 03/10/2024] [Indexed: 04/26/2024]

Abstract

High-throughput technologies are becoming increasingly important in discovering prognostic biomarkers and in identifying novel drug targets. With Mammaprint, Oncotype DX, and many other prognostic molecular signatures, breast cancer is one of the paradigmatic examples of the utility of high-throughput data to deliver prognostic biomarkers that can be represented in the form of a rather short gene list. Such gene lists can be obtained as a set of features (genes) that are important for the decisions of a Machine Learning (ML) method applied to high-dimensional gene expression data. Several studies have identified predictive gene lists for patient prognosis in breast cancer, but these lists are unstable and have only a few genes in common. Instability of feature selection impedes biological interpretability: genes that are relevant for cancer pathology should be members of any predictive gene list obtained for the same clinical type of patients. Stability and interpretability of selected features can be improved by including information on molecular networks in ML methods. Graph Convolutional Neural Network (GCNN) is a contemporary deep learning approach applicable to gene expression data structured by a prior knowledge molecular network. Layer-wise Relevance Propagation (LRP) and SHapley Additive exPlanations (SHAP) are methods to explain individual decisions of deep learning models. We used both GCNN+LRP and GCNN+SHAP techniques to construct feature sets by aggregating individual explanations. We suggest a methodology to systematically and quantitatively analyze the stability, the impact on the classification performance, and the interpretability of the selected feature sets. We used this methodology to compare GCNN+LRP to GCNN+SHAP and to more classical ML-based feature selection approaches. Utilizing a large breast cancer gene expression dataset, we show that, while feature selection with SHAP is useful in applications where selected features have to be impactful for classification performance, among all studied methods GCNN+LRP delivers the most stable (reproducible) and interpretable gene lists.

Nissar I, Alam S, Masood S, Kashif M. MOB-CBAM: A dual-channel attention-based deep learning generalizable model for breast cancer molecular subtypes prediction using mammograms. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2024;248:108121. [PMID: 38531147 DOI: 10.1016/j.cmpb.2024.108121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/23/2023] [Revised: 02/15/2024] [Accepted: 03/06/2024] [Indexed: 03/28/2024]

Abstract

BACKGROUND AND OBJECTIVE

Deep Learning models have emerged as a significant tool in generating efficient solutions for complex problems including cancer detection, as they can analyze large amounts of data with high efficiency and performance. Recent medical studies highlight the significance of molecular subtype detection in breast cancer, aiding the development of personalized treatment plans as different subtypes of cancer respond better to different therapies.

METHODS

In this work, we propose a novel lightweight dual-channel attention-based deep learning model MOB-CBAM that utilizes the backbone of MobileNet-V3 architecture with a Convolutional Block Attention Module to make highly accurate and precise predictions about breast cancer. We used the CMMD mammogram dataset to evaluate the proposed model in our study. Nine distinct data subsets were created from the original dataset to perform coarse and fine-grained predictions, enabling it to identify masses, calcifications, benign, malignant tumors and molecular subtypes of cancer, including Luminal A, Luminal B, HER-2 Positive, and Triple Negative. The pipeline incorporates several image pre-processing techniques, including filtering, enhancement, and normalization, for enhancing the model's generalization ability.

RESULTS

While identifying benign versus malignant tumors, i.e., coarse-grained classification, the MOB-CBAM model produced exceptional results with 99 % accuracy, precision, recall, and F1-score values of 0.99 and MCC of 0.98. In terms of fine-grained classification, the MOB-CBAM model has proven to be highly efficient in accurately identifying mass with (benign/malignant) and calcification with (benign/malignant) classification tasks with an impressive accuracy rate of 98 %. We have also cross-validated the efficiency of the proposed MOB-CBAM deep learning architecture on two datasets: MIAS and CBIS-DDSM. On the MIAS dataset, an accuracy of 97 % was reported for the task of classifying benign, malignant, and normal images, while on the CBIS-DDSM dataset, an accuracy of 98 % was achieved for the classification of mass with either benign or malignant, and calcification with benign and malignant tumors.

CONCLUSION

This study presents lightweight MOB-CBAM, a novel deep learning framework, to address breast cancer diagnosis and subtype prediction. The model's innovative incorporation of the CBAM enhances precise predictions. The extensive evaluation of the CMMD dataset and cross-validation on other datasets affirm the model's efficacy.

Yang B, Wang L, Bao W. Identify Diabetes-related Targets based on ForgeNet_GPC. Curr Comput Aided Drug Des 2024;20:1042-1054. [PMID: 38173214 DOI: 10.2174/0115734099258183230929173855] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/15/2023] [Revised: 08/06/2023] [Accepted: 08/15/2023] [Indexed: 01/05/2024]

Tian L, Yu T. An integrated deep learning framework for the interpretation of untargeted metabolomics data. Brief Bioinform 2023;24:bbad244. [PMID: 37369636 DOI: 10.1093/bib/bbad244] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/07/2023] [Revised: 06/02/2023] [Accepted: 06/12/2023] [Indexed: 06/29/2023] Open

Abstract

Untargeted metabolomics is gaining widespread applications. The key aspects of the data analysis include modeling complex activities of the metabolic network, selecting metabolites associated with clinical outcome and finding critical metabolic pathways to reveal biological mechanisms. One key roadblock in data analysis that is not well addressed is the problem of matching uncertainty between data features and known metabolites. Given the limitations of the experimental technology, the identities of data features cannot be directly revealed in the data. The predominant approach for mapping features to metabolites is to match the mass-to-charge ratio (m/z) of data features to those derived from theoretical values of known metabolites. The relationship between features and metabolites is not one-to-one since some metabolites share molecular composition, and various adduct ions can be derived from the same metabolite. This matching uncertainty causes unreliable metabolite selection and functional analysis results. Here we introduce an integrated deep learning framework for metabolomics data that takes matching uncertainty into consideration. The model is devised with a gradual sparsification neural network based on the known metabolic network and the annotation relationship between features and metabolites. This architecture characterizes metabolomics data and reflects the modular structure of biological systems. Three goals can be achieved simultaneously without requiring much complex inference and additional assumptions: (1) evaluate metabolite importance, (2) infer feature-metabolite matching likelihood and (3) select disease sub-networks. When applied to a COVID metabolomics dataset and an aging mouse brain dataset, our method found metabolic sub-networks that were easily interpretable.

Tian L, Wu W, Yu T. Graph Random Forest: A Graph Embedded Algorithm for Identifying Highly Connected Important Features. Biomolecules 2023;13:1153. [PMID: 37509188 PMCID: PMC10377046 DOI: 10.3390/biom13071153] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/14/2023] [Revised: 06/26/2023] [Accepted: 06/30/2023] [Indexed: 07/30/2023] Open

Lee S, Jung H, Park J, Ahn J. Accurate Prediction of Cancer Prognosis by Exploiting Patient-Specific Cancer Driver Genes. Int J Mol Sci 2023;24:ijms24076445. [PMID: 37047418 PMCID: PMC10095073 DOI: 10.3390/ijms24076445] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/22/2023] [Revised: 03/17/2023] [Accepted: 03/28/2023] [Indexed: 04/03/2023] Open

Alharbi F, Vakanski A. Machine Learning Methods for Cancer Classification Using Gene Expression Data: A Review. Bioengineering (Basel) 2023;10:bioengineering10020173. [PMID: 36829667 PMCID: PMC9952758 DOI: 10.3390/bioengineering10020173] [Citation(s) in RCA: 46] [Impact Index Per Article: 23.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/22/2022] [Revised: 01/24/2023] [Accepted: 01/26/2023] [Indexed: 01/31/2023] Open

Hou X, Hou J, Huang G. Bi-dimensional principal gene feature selection from big gene expression data. PLoS One 2022;17:e0278583. [PMID: 36477666 PMCID: PMC9728919 DOI: 10.1371/journal.pone.0278583] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/09/2022] [Accepted: 11/20/2022] [Indexed: 12/12/2022] Open

Wang C, Lye X, Kaalia R, Kumar P, Rajapakse JC. Deep learning and multi-omics approach to predict drug responses in cancer. BMC Bioinformatics 2022;22:632. [PMID: 36443676 PMCID: PMC9703655 DOI: 10.1186/s12859-022-04964-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/21/2022] [Accepted: 09/25/2022] [Indexed: 11/29/2022] Open

Abstract

BACKGROUND

Cancers are genetically heterogeneous, so anticancer drugs show varying degrees of effectiveness on patients due to their differing genetic profiles. Knowing patients' responses to numerous cancer drugs is needed for personalized cancer treatment. Using molecular profiles of cancer cell lines available from the Cancer Cell Line Encyclopedia (CCLE) and anticancer drug responses available in the Genomics of Drug Sensitivity in Cancer (GDSC), we build computational models to predict anticancer drug responses from molecular features.

RESULTS

We propose a novel deep neural network model that integrates multi-omics data available as gene expressions, copy number variations, gene mutations, reverse phase protein array expressions, and metabolomics expressions, in order to predict cellular responses to known anti-cancer drugs. We employ a novel graph embedding layer that incorporates interactome data as prior information for prediction. Moreover, we propose a novel attention layer that effectively combines different omics features, taking their interactions into account. The network outperformed feedforward neural networks and reported 0.90 for [Formula: see text] values for prediction of drug responses from cancer cell lines data available in CCLE and GDSC.

CONCLUSION

The outstanding results of our experiments demonstrate that the proposed method is capable of capturing the interactions of genes and proteins, and integrating multi-omics features effectively. Furthermore, both the results of ablation studies and the investigations of the attention layer imply that gene mutation has a greater influence on the prediction of drug responses than other omics data types. Therefore, we conclude that our approach not only predicts the anti-cancer drug response precisely but also provides insights into reaction mechanisms of cancer cell lines and drugs.

Sparse multi-label feature selection via dynamic graph manifold regularization. INT J MACH LEARN CYB 2022. [DOI: 10.1007/s13042-022-01679-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/10/2022]

Guo X, Han J, Song Y, Yin Z, Liu S, Shang X. Using expression quantitative trait loci data and graph-embedded neural networks to uncover genotype–phenotype interactions. Front Genet 2022;13:921775. [PMID: 36046233 PMCID: PMC9421127 DOI: 10.3389/fgene.2022.921775] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/16/2022] [Accepted: 07/04/2022] [Indexed: 11/13/2022] Open

Abstract

Motivation: A central goal of current biology is to establish a complete functional link between the genotype and phenotype, known as the so-called genotype–phenotype map. With the continuous development of high-throughput technology and the decline in sequencing costs, multi-omics analysis has become more widely employed. While this gives us new opportunities to uncover the correlation mechanisms between single-nucleotide polymorphism (SNP), genes, and phenotypes, multi-omics still faces certain challenges, specifically: 1) When the sample size is large enough, the number of omics types is often not large enough to meet the requirements of multi-omics analysis; 2) each omics’ internal correlations are often unclear, such as the correlation between genes in genomics; 3) when analyzing a large number of traits (p), the sample size (n) is often smaller than p, n << p, hindering the application of machine learning methods in the classification of disease outcomes.

Results: To solve these issues with multi-omics and build a robust classification model, we propose a graph-embedded deep neural network (G-EDNN) based on expression quantitative trait loci (eQTL) data, which achieves sparse connectivity between network layers to prevent overfitting. The correlation within each omics is also considered such that the model more closely resembles biological reality. To verify the capabilities of this method, we conducted experimental analysis using the GSE28127 and GSE95496 data sets from the Gene Expression Omnibus (GEO) database, tested various neural network architectures, and used prior data for feature selection and graph embedding. Results show that the proposed method could achieve a high classification accuracy and easy-to-interpret feature selection. This method represents an extended application of genotype–phenotype association analysis in deep learning networks.

Yang B, Bao W, Hong S. Alzheimer-Compound Identification Based on Data Fusion and forgeNet_SVM. Front Aging Neurosci 2022;14:931729. [PMID: 35959292 PMCID: PMC9357977 DOI: 10.3389/fnagi.2022.931729] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/29/2022] [Accepted: 05/24/2022] [Indexed: 11/17/2022] Open

Kumar R, Khatri A, Acharya V. Deep learning uncovers distinct behavior of rice network to pathogens response. iScience 2022;25:104546. [PMID: 35754717 PMCID: PMC9218438 DOI: 10.1016/j.isci.2022.104546] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/20/2022] [Revised: 05/06/2022] [Accepted: 06/02/2022] [Indexed: 12/15/2022] Open

Rezaee K, Jeon G, Khosravi MR, Attar HH, Sabzevari A. Deep learning‐based microarray cancer classification and ensemble gene selection approach. IET Syst Biol 2022;16:120-131. [PMID: 35790076 PMCID: PMC9290776 DOI: 10.1049/syb2.12044] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/31/2022] [Revised: 04/04/2022] [Accepted: 05/31/2022] [Indexed: 12/19/2022] Open

EGFAFS: A Novel Feature Selection Algorithm Based on Explosion Gravitation Field Algorithm. ENTROPY 2022;24:e24070873. [PMID: 35885095 PMCID: PMC9322764 DOI: 10.3390/e24070873] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 05/24/2022] [Revised: 06/15/2022] [Accepted: 06/22/2022] [Indexed: 02/04/2023]

Xing X, Yang F, Li H, Zhang J, Zhao Y, Gao M, Huang J, Yao J. Multi-level attention graph neural network based on co-expression gene modules for disease diagnosis and prognosis. Bioinformatics 2022;38:2178-2186. [PMID: 35157021 DOI: 10.1093/bioinformatics/btac088] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/14/2021] [Revised: 01/29/2022] [Accepted: 02/09/2022] [Indexed: 02/03/2023] Open

Tan K, Huang W, Liu X, Hu J, Dong S. A multi-modal fusion framework based on multi-task correlation learning for cancer prognosis prediction. Artif Intell Med 2022;126:102260. [DOI: 10.1016/j.artmed.2022.102260] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/11/2021] [Revised: 01/07/2022] [Accepted: 02/16/2022] [Indexed: 12/30/2022]

Jin Z, Kang J, Yu T. Feature selection and classification over the network with missing node observations. Stat Med 2022;41:1242-1262. [PMID: 34816464 PMCID: PMC9773124 DOI: 10.1002/sim.9267] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/16/2021] [Revised: 09/14/2021] [Accepted: 10/29/2021] [Indexed: 12/25/2022]

Zhang Y, Ma Y. Non-negative multi-label feature selection with dynamic graph constraints. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2021.107924] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/24/2023]

Li L, Liu ZP. A connected network-regularized logistic regression model for feature selection. APPL INTELL 2022. [DOI: 10.1007/s10489-021-02877-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]

Zhang Y, Ma Y, Yang X. Multi-label feature selection based on logistic regression and manifold learning. APPL INTELL 2022. [DOI: 10.1007/s10489-021-03008-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/02/2022]

Qiao C, Yang L, Shi Y, Fang H, Kang Y. Deep belief networks with self-adaptive sparsity. APPL INTELL 2022. [DOI: 10.1007/s10489-021-02361-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/21/2022]

Dynamic subspace dual-graph regularized multi-label feature selection. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2021.10.022] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]

Li C, Gao Z, Su B, Xu G, Lin X. Data analysis methods for defining biomarkers from omics data. Anal Bioanal Chem 2021;414:235-250. [PMID: 34951658 DOI: 10.1007/s00216-021-03813-7] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/15/2021] [Revised: 11/26/2021] [Accepted: 11/29/2021] [Indexed: 02/01/2023]

Yu K, Xie W, Wang L, Zhang S, Li W. Determination of biomarkers from microarray data using graph neural network and spectral clustering. Sci Rep 2021;11:23828. [PMID: 34903818 PMCID: PMC8668890 DOI: 10.1038/s41598-021-03316-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/15/2021] [Accepted: 12/02/2021] [Indexed: 11/26/2022] Open

Ma W, Su K, Wu H. Evaluation of some aspects in supervised cell type identification for single-cell RNA-seq: classifier, feature selection, and reference construction. Genome Biol 2021;22:264. [PMID: 34503564 PMCID: PMC8427961 DOI: 10.1186/s13059-021-02480-2] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/10/2021] [Accepted: 08/25/2021] [Indexed: 11/10/2022] Open

Nguyen ND, Jin T, Wang D. Varmole: a biologically drop-connect deep neural network model for prioritizing disease risk variants and genes. Bioinformatics 2021;37:1772-1775. [PMID: 33031552 PMCID: PMC8289382 DOI: 10.1093/bioinformatics/btaa866] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/02/2020] [Revised: 09/07/2020] [Accepted: 09/23/2020] [Indexed: 12/23/2022] Open

Tan K, Huang W, Liu X, Hu J, Dong S. A Hierarchical Graph Convolution Network for Representation Learning of Gene Expression Data. IEEE J Biomed Health Inform 2021;25:3219-3229. [PMID: 33449889 DOI: 10.1109/jbhi.2021.3052008] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]

Wang X, Dong Y, Zheng Y, Chen Y. Multiomics metabolic and epigenetics regulatory network in cancer: A systems biology perspective. J Genet Genomics 2021;48:520-530. [PMID: 34362682 DOI: 10.1016/j.jgg.2021.05.008] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/18/2021] [Revised: 05/07/2021] [Accepted: 05/11/2021] [Indexed: 12/21/2022]

Yang H, Zhuang Z, Pan W. A graph convolutional neural network for gene expression data analysis with multiple gene networks. Stat Med 2021;40:5547-5564. [PMID: 34258781 DOI: 10.1002/sim.9140] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/30/2020] [Revised: 04/07/2021] [Accepted: 06/21/2021] [Indexed: 02/01/2023]

Yang S, Zhu F, Ling X, Liu Q, Zhao P. Intelligent Health Care: Applications of Deep Learning in Computational Medicine. Front Genet 2021;12:607471. [PMID: 33912213 PMCID: PMC8075004 DOI: 10.3389/fgene.2021.607471] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/17/2020] [Accepted: 03/05/2021] [Indexed: 12/24/2022] Open

Feng J, Jiang L, Li S, Tang J, Wen L. Multi-Omics Data Fusion via a Joint Kernel Learning Model for Cancer Subtype Discovery and Essential Gene Identification. Front Genet 2021;12:647141. [PMID: 33747053 PMCID: PMC7969795 DOI: 10.3389/fgene.2021.647141] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/29/2020] [Accepted: 02/02/2021] [Indexed: 01/17/2023] Open

GVES: machine learning model for identification of prognostic genes with a small dataset. Sci Rep 2021;11:439. [PMID: 33431999 PMCID: PMC7801384 DOI: 10.1038/s41598-020-79889-5] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/11/2020] [Accepted: 12/08/2020] [Indexed: 12/16/2022] Open

Liu J, Su R, Zhang J, Wei L. Classification and gene selection of triple-negative breast cancer subtype embedding gene connectivity matrix in deep neural network. Brief Bioinform 2021;22:6067882. [PMID: 33415328 DOI: 10.1093/bib/bbaa395] [Citation(s) in RCA: 32] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/05/2020] [Revised: 11/16/2020] [Accepted: 12/01/2020] [Indexed: 12/13/2022] Open

Liu T, Huang J, Liao T, Pu R, Liu S, Peng Y. A Hybrid Deep Learning Model for Predicting Molecular Subtypes of Human Breast Cancer Using Multimodal Data. Ing Rech Biomed 2021. [DOI: 10.1016/j.irbm.2020.12.002] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/22/2022]

Lee S, Lim S, Lee T, Sung I, Kim S. Cancer subtype classification and modeling by pathway attention and propagation. Bioinformatics 2020;36:3818-3824. [PMID: 32207514 DOI: 10.1093/bioinformatics/btaa203] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/17/2019] [Revised: 01/13/2020] [Accepted: 03/19/2020] [Indexed: 01/04/2023] Open

Kong Y, Yu T. forgeNet: a graph deep neural network model using tree-based ensemble classifiers for feature graph construction. Bioinformatics 2020;36:3507-3515. [PMID: 32163118 DOI: 10.1093/bioinformatics/btaa164] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/27/2019] [Revised: 02/07/2020] [Accepted: 03/08/2020] [Indexed: 12/31/2022] Open

Gallins P, Saghapour E, Zhou YH. Exploring the Limits of Combined Image/'omics Analysis for Non-cancer Histological Phenotypes. Front Genet 2020;11:555886. [PMID: 33193632 PMCID: PMC7644963 DOI: 10.3389/fgene.2020.555886] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/26/2020] [Accepted: 09/09/2020] [Indexed: 11/13/2022] Open

Abstract

The last several years have witnessed an explosion of methods and applications for combining image data with 'omics data, and for prediction of clinical phenotypes. Much of this research has focused on cancer histology, for which genetic perturbations are large, and the signal to noise ratio is high. Related research on chronic, complex diseases is limited by tissue sample availability, lower genomic signal strength, and the less extreme and tissue-specific nature of intermediate histological phenotypes. Data from the GTEx Consortium provides a unique opportunity to investigate the connections among phenotypic histological variation, imaging data, and 'omics profiling, from multiple tissue-specific phenotypes at the sub-clinical level. Investigating histological designations in multiple tissues, we survey the evidence for genomic association and prediction of histology, and use the results to test the limits of prediction accuracy using machine learning methods applied to the imaging data, genomics data, and their combination. We find that expression data has accuracy similar or superior to that of our use of imaging data for pathology prediction, despite the fact that pathological determination is made from the images themselves. A variety of machine learning methods have similar performance, while network embedding methods offer at best limited improvements. These observations hold across a range of tissues and predictor types. The results support the use of genomic measurements for prediction, and the use of the same target tissue in which pathological phenotyping has been performed. Although this last finding is sensible, to our knowledge our study is the first to demonstrate this fact empirically. Even while prediction accuracy remains a challenge, the results show clear evidence of pathway and tissue-specific biology.

Xu D, Zhang J, Xu H, Zhang Y, Chen W, Gao R, Dehmer M. Multi-scale supervised clustering-based feature selection for tumor classification and identification of biomarkers and targets on genomic data. BMC Genomics 2020;21:650. [PMID: 32962626 PMCID: PMC7510277 DOI: 10.1186/s12864-020-07038-3] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/18/2020] [Accepted: 08/30/2020] [Indexed: 12/19/2022] Open

Abstract

Background

The small number of samples and the curse of dimensionality hamper the application of deep learning techniques to disease classification. Additionally, the performance of clustering-based feature selection algorithms is still far from satisfactory because they are limited to unsupervised learning methods. To enhance interpretability and overcome this problem, we developed a novel feature selection algorithm. Meanwhile, complex genomic data pose great challenges for the identification of biomarkers and therapeutic targets. Some current feature selection methods suffer from low sensitivity and specificity in this field.

Results

In this article, we designed a multi-scale clustering-based feature selection algorithm named MCBFS which simultaneously performs feature selection and model learning for genomic data analysis. The experimental results demonstrated that MCBFS is robust and effective by comparing it with seven benchmark and six state-of-the-art supervised methods on eight data sets. The visualization results and the statistical test showed that MCBFS can capture the informative genes and improve the interpretability and visualization of tumor gene expression and single-cell sequencing data. Additionally, we developed a general framework named McbfsNW using gene expression data and protein interaction data to identify robust biomarkers and therapeutic targets for diagnosis and therapy of diseases. The framework incorporates the MCBFS algorithm, network recognition ensemble algorithm and feature selection wrapper. McbfsNW has been applied to the lung adenocarcinoma (LUAD) data sets. The preliminary results demonstrated that the identified biomarkers attain higher prediction performance on the independent LUAD data set, and we also constructed a drug-target network that may be useful for LUAD therapy.

Conclusions

The proposed novel feature selection method is robust and effective for gene selection, classification, and visualization. The framework McbfsNW is practical and helpful for the identification of biomarkers and targets on genomic data. It is believed that the same methods and principles are extensible and applicable to other kinds of data sets.

Nakashima S, Nacher JC, Song J, Akutsu T. An Overview of Bioinformatics Methods for Analyzing Autism Spectrum Disorders. Curr Pharm Des 2020;25:4552-4559. [PMID: 31713477 DOI: 10.2174/1381612825666191111154837] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/27/2019] [Accepted: 11/07/2019] [Indexed: 02/06/2023]

Hu J, Li Y, Gao W, Zhang P. Robust multi-label feature selection with dual-graph regularization. Knowl Based Syst 2020. [DOI: 10.1016/j.knosys.2020.106126] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/06/2023]

A supervised machine learning-based methodology for analyzing dysregulation in splicing machinery: An application in cancer diagnosis. Artif Intell Med 2020;108:101950. [PMID: 32972670 DOI: 10.1016/j.artmed.2020.101950] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/15/2019] [Revised: 08/15/2020] [Accepted: 08/18/2020] [Indexed: 02/06/2023]

Li J, Ping Y, Li H, Li H, Liu Y, Liu B, Wang Y. Prognostic prediction of carcinoma by a differential-regulatory-network-embedded deep neural network. Comput Biol Chem 2020;88:107317. [PMID: 32622180 DOI: 10.1016/j.compbiolchem.2020.107317] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/28/2020] [Accepted: 06/21/2020] [Indexed: 02/04/2023]

Zhou X, Chai H, Zhao H, Luo CH, Yang Y. Imputing missing RNA-sequencing data from DNA methylation by using a transfer learning-based neural network. Gigascience 2020;9:giaa076. [PMID: 32649756 PMCID: PMC7350980 DOI: 10.1093/gigascience/giaa076] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/20/2019] [Revised: 04/23/2020] [Accepted: 06/24/2020] [Indexed: 12/13/2022] Open