1
Zhang H, Hu H, Zhou D, Zhang X, Cao B. Compact CNN module balancing between feature diversity and redundancy. Neural Netw 2025; 188:107456. [PMID: 40220561 DOI: 10.1016/j.neunet.2025.107456] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2024] [Revised: 03/29/2025] [Accepted: 03/31/2025] [Indexed: 04/14/2025]
Abstract
Feature diversity and redundancy play a crucial role in enhancing a model's performance, although their effect on network design remains underexplored. Herein, we introduce BDRConv, a compact convolutional neural network (CNN) module that balances feature diversity and redundancy, generating and retaining features with moderate redundancy and high diversity while reducing computational cost. Specifically, input features are divided into a main part and an expansion part. The main part extracts intrinsic and diverse features, while the expansion part enhances the extraction of diverse information. Experiments on the CIFAR10, ImageNet, and MS COCO datasets demonstrate that BDRConv-equipped networks outperform state-of-the-art methods in accuracy, with significantly fewer floating-point operations (FLOPs) and parameters. In addition, as a plug-and-play component, the BDRConv module can easily replace existing convolution modules, offering potential for broader CNN applications.
Affiliation(s)
- Huihuang Zhang
  - College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, 310023, China; Key Laboratory of Visual Media Intelligent Processing Technology of Zhejiang Province, Hangzhou, 310023, China
- Haigen Hu
  - College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, 310023, China; Key Laboratory of Visual Media Intelligent Processing Technology of Zhejiang Province, Hangzhou, 310023, China
- Deming Zhou
  - College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, 310023, China; Key Laboratory of Visual Media Intelligent Processing Technology of Zhejiang Province, Hangzhou, 310023, China
- Xiaoqin Zhang
  - College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, 310023, China; Key Laboratory of Visual Media Intelligent Processing Technology of Zhejiang Province, Hangzhou, 310023, China
- Bin Cao
  - College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, 310023, China; Key Laboratory of Visual Media Intelligent Processing Technology of Zhejiang Province, Hangzhou, 310023, China
2
Hajim WI, Zainudin S, Daud KM, Alheeti K. Golden eagle optimized CONV-LSTM and non-negativity-constrained autoencoder to support spatial and temporal features in cancer drug response prediction. PeerJ Comput Sci 2024; 10:e2520. [PMID: 39896419 PMCID: PMC11784781 DOI: 10.7717/peerj-cs.2520] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/19/2024] [Accepted: 10/25/2024] [Indexed: 02/04/2025]
Abstract
Advanced machine learning (ML) and deep learning (DL) methods have recently been applied to drug response prediction (DRP). These models use information from genomic profiles, such as extensive drug screening data and cell line data, to predict drug response. Comparatively, DL-based prediction approaches have provided better learning of such features. However, prior knowledge, such as pathway data, is sometimes discarded as irrelevant because drug response datasets are multidimensional and noisy. Optimized feature learning and extraction processes are suggested to handle this problem. First, the noise and class imbalance problems must be tackled to avoid low identification accuracy, long prediction times, and poor applicability. This article applies the Non-Negativity-Constrained Autoencoder (NNCAE) network to tackle these issues, enhance the adaptive search for the optimal sliding-window size, and ensure that deep network architectures are adept at learning the vital hidden features. The NNCAE is applied after the standard pre-processing procedures to handle the noise and class imbalance problems. The resulting class-balanced, denoised input features are used to train the proposed hybrid classifier. The classification model, Golden Eagle Optimization-based Convolutional Long Short-Term Memory (GEO-Conv-LSTM), integrates convolutional neural network (CNN) and LSTM models, with parameter tuning performed by the GEO algorithm. Evaluations are conducted on two large datasets from the Genomics of Drug Sensitivity in Cancer (GDSC) repository; the proposed NNCAE-GEO-Conv-LSTM approach achieves accuracies of 96.99% and 97.79%, respectively, with reduced processing time and error rate for the DRP problem.
Affiliation(s)
- Wesam Ibrahim Hajim
  - Department of Applied Geology, College of Sciences, University of Tikrit, Tikrit, Salah ad Din, Iraq
  - Center for Artificial Intelligence Technology, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Selangor, Malaysia
- Suhaila Zainudin
  - Center for Artificial Intelligence Technology, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Selangor, Malaysia
- Kauthar Mohd Daud
  - Center for Artificial Intelligence Technology, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Selangor, Malaysia
- Khattab Alheeti
  - Department of Computer Networking Systems, College of Computer Sciences and Information Technology, University of Anbar, Ramadi, Al Anbar, Iraq
3
Laison EKE, Hamza Ibrahim M, Boligarla S, Li J, Mahadevan R, Ng A, Muthuramalingam V, Lee WY, Yin Y, Nasri BR. Identifying Potential Lyme Disease Cases Using Self-Reported Worldwide Tweets: Deep Learning Modeling Approach Enhanced With Sentimental Words Through Emojis. J Med Internet Res 2023; 25:e47014. [PMID: 37843893 PMCID: PMC10616745 DOI: 10.2196/47014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2023] [Revised: 07/26/2023] [Accepted: 08/31/2023] [Indexed: 10/17/2023] Open
Abstract
BACKGROUND: Lyme disease is among the most reported tick-borne diseases worldwide, making it a major ongoing public health concern. An effective Lyme disease case reporting system depends on timely diagnosis and reporting by health care professionals, and accurate laboratory testing and interpretation for clinical diagnosis validation. A lack of these can lead to delayed diagnosis and treatment, which can exacerbate the severity of Lyme disease symptoms. Therefore, there is a need to improve the monitoring of Lyme disease by using other data sources, such as web-based data.
OBJECTIVE: We analyzed global Twitter data to understand its potential and limitations as a tool for Lyme disease surveillance. We propose a transformer-based classification system to identify potential Lyme disease cases using self-reported tweets.
METHODS: Our initial sample included 20,000 tweets collected worldwide from a database of over 1.3 million Lyme disease tweets. After preprocessing and geolocating tweets, tweets in a subset of the initial sample were manually labeled as potential Lyme disease cases or non-Lyme disease cases using carefully selected keywords. Emojis were converted to sentiment words, which then replaced the emojis in the tweets. This labeled tweet set was used for the training, validation, and performance testing of DistilBERT (distilled version of BERT [Bidirectional Encoder Representations from Transformers]), ALBERT (A Lite BERT), and BERTweet (BERT for English Tweets) classifiers.
RESULTS: The empirical results showed that BERTweet was the best classifier among all evaluated models (average F1-score of 89.3%, classification accuracy of 90.0%, and precision of 97.1%). However, for recall, term frequency-inverse document frequency and k-nearest neighbors performed better (93.2% and 82.6%, respectively). On using emojis to enrich the tweet embeddings, BERTweet had an increased recall (8% increase), DistilBERT had an increased F1-score of 93.8% (4% increase) and classification accuracy of 94.1% (4% increase), and ALBERT had an increased F1-score of 93.1% (5% increase) and classification accuracy of 93.9% (5% increase). The general awareness of Lyme disease was high in the United States, the United Kingdom, Australia, and Canada, with self-reported potential cases of Lyme disease from these countries accounting for around 50% (9939/20,000) of the collected English-language tweets, whereas Lyme disease-related tweets were rare in countries from Africa and Asia. The most reported Lyme disease-related symptoms in the data were rash, fatigue, fever, and arthritis, while symptoms such as lymphadenopathy, palpitations, swollen lymph nodes, neck stiffness, and arrhythmia were uncommon, in accordance with Lyme disease symptom frequency.
CONCLUSIONS: The study highlights the robustness of BERTweet and DistilBERT as classifiers for potential cases of Lyme disease from self-reported data. The results demonstrated that emojis are effective for enrichment, thereby improving the accuracy of tweet embeddings and the performance of classifiers. Specifically, emojis reflecting sadness, empathy, and encouragement can reduce false negatives.
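The metrics quoted in the results can be cross-checked with a little arithmetic. The sketch below is a back-of-envelope check, not part of the study: it uses the standard identity F1 = 2PR / (P + R) to estimate the recall implied by the reported BERTweet F1-score and precision, and it recomputes the 9939/20,000 share. Because the abstract reports an average F1-score, the identity need not hold exactly for the published figures, so the implied recall is only indicative. The function names are illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def recall_from_f1(f1: float, precision: float) -> float:
    """Solve F1 = 2PR / (P + R) for R, given F1 and P."""
    denom = 2 * precision - f1
    if denom <= 0:
        raise ValueError("F1 and precision are mutually inconsistent")
    return f1 * precision / denom


# Figures quoted in the abstract for BERTweet (as fractions).
implied_recall = recall_from_f1(f1=0.893, precision=0.971)

# Share of tweets from the four high-awareness countries.
share = 9939 / 20000

print(f"implied recall: {implied_recall:.3f}")
print(f"share of tweets: {share:.1%}")
```

The share works out to just under one half, which matches the abstract's "around 50%".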
Affiliation(s)
- Elda Kokoe Elolo Laison
  - Département de médecine sociale et préventive, École de Santé Publique de l'Université de Montréal, Université de Montréal, Montréal, QC, Canada
- Srikanth Boligarla
  - Harvard Extension School, Harvard University, Cambridge, MA, United States
- Jiaxin Li
  - Harvard Extension School, Harvard University, Cambridge, MA, United States
- Raja Mahadevan
  - Harvard Extension School, Harvard University, Cambridge, MA, United States
- Austen Ng
  - Harvard Extension School, Harvard University, Cambridge, MA, United States
- Wee Yi Lee
  - Harvard Extension School, Harvard University, Cambridge, MA, United States
- Yijun Yin
  - Harvard Extension School, Harvard University, Cambridge, MA, United States
- Bouchra R Nasri
  - Département de médecine sociale et préventive, École de Santé Publique de l'Université de Montréal, Université de Montréal, Montréal, QC, Canada
4
Zhang C, Hu Y, Gao L. Defining and identifying cell sub-crosstalk pairs for characterizing cell-cell communication patterns. Sci Rep 2023; 13:15746. [PMID: 37735248 PMCID: PMC10514069 DOI: 10.1038/s41598-023-42883-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2023] [Accepted: 09/15/2023] [Indexed: 09/23/2023] Open
Abstract
Current cell-cell communication analysis focuses on quantifying intercellular interactions at the cell type level. In the tissue microenvironment, one cell type can be divided into multiple cell subgroups that function differently and communicate with other cell types or subgroups via different ligand-receptor-mediated signaling pathways. Given two cell types, we define a cell sub-crosstalk pair (CSCP) as a combination of two cell subgroups with strong and similar intercellular crosstalk signals and identify CSCPs based on coupled non-negative matrix factorization. Using single-cell spatial transcriptomics data of mouse olfactory bulb and visual cortex, we find that cells of different types within CSCPs are significantly closer to each other in space than those in the whole single-cell spatial map. To demonstrate the utility of CSCPs, we apply 13 cell-cell communication analysis methods to sampled single-cell transcriptomics datasets at the CSCP level and reveal ligand-receptor interactions masked at the cell type level. Furthermore, by analyzing single-cell transcriptomics data from 29 breast cancer patients with different immunotherapy responses, we find that CSCPs are useful predictive features to discriminate patients responding to anti-PD-1 therapy from non-responders. Taken together, partitioning a cell type pair into CSCPs enables fine-grained characterization of cell-cell communication in tissue and tumor microenvironments.
Affiliation(s)
- Chenxing Zhang
  - School of Computer Science and Technology, Xidian University, Xi'an, 710071, China
- Yuxuan Hu
  - School of Computer Science and Technology, Xidian University, Xi'an, 710071, China
- Lin Gao
  - School of Computer Science and Technology, Xidian University, Xi'an, 710071, China
5
Shang M, Yuan Y, Luo X, Zhou M. An α-β-Divergence-Generalized Recommender for Highly Accurate Predictions of Missing User Preferences. IEEE Transactions on Cybernetics 2022; 52:8006-8018. [PMID: 33600329 DOI: 10.1109/tcyb.2020.3026425] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 06/12/2023]
Abstract
To quantify user-item preferences, a recommender system (RS) commonly adopts a high-dimensional and sparse (HiDS) matrix. Such a matrix can be represented by a non-negative latent factor analysis model relying on a single latent factor (LF)-dependent, non-negative, and multiplicative update algorithm. However, the representation ability of existing models is limited by their specialized learning objectives. To address this issue, this study proposes an α-β-divergence-generalized model that enjoys fast convergence. Its ideas are threefold: 1) generalizing its learning objective with α-β-divergence to achieve highly accurate representation of HiDS data; 2) incorporating a generalized momentum method into parameter learning for fast convergence; and 3) implementing self-adaptation of controllable hyperparameters for excellent practicability. Empirical studies on six HiDS matrices from real RSs demonstrate that, compared with state-of-the-art LF models, the proposed model achieves significant gains in accuracy and efficiency when estimating the large amount of missing data in an HiDS matrix.
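For readers unfamiliar with the α-β-divergence, the following is a minimal sketch of its general-case form as it commonly appears in the divergence literature, evaluated over the known (positive) entries of two vectors. It is an illustration only: the abstract does not give the paper's exact objective, and the function name and sample values are assumptions. The general case requires α, β, and α + β to be nonzero; the limiting cases are not handled here.

```python
def ab_divergence(p, q, alpha, beta):
    """General-case alpha-beta divergence between positive sequences p and q.

    D = -1/(alpha*beta) * sum(p^alpha * q^beta
                              - alpha/(alpha+beta) * p^(alpha+beta)
                              - beta/(alpha+beta)  * q^(alpha+beta))
    """
    if alpha == 0 or beta == 0 or alpha + beta == 0:
        raise ValueError("limiting cases (zero alpha, beta, or sum) not handled")
    s = alpha + beta
    total = 0.0
    for pi, qi in zip(p, q):
        total += (pi ** alpha) * (qi ** beta) \
            - (alpha / s) * pi ** s \
            - (beta / s) * qi ** s
    return -total / (alpha * beta)


# Hypothetical observed ratings and model predictions.
observed = [1.0, 2.0, 3.0]
predicted = [1.5, 2.0, 2.5]

# With alpha = beta = 1 the divergence reduces to half the squared error.
half_sq_err = 0.5 * sum((a - b) ** 2 for a, b in zip(observed, predicted))
print(ab_divergence(observed, predicted, 1.0, 1.0), half_sq_err)
```

Varying α and β moves the objective between familiar special cases, which is the sense in which the learning objective is "generalized".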
6
A Study of Hybrid Predictions Based on the Synthesized Health Indicator for Marine Systems and Their Equipment Failure. Applied Sciences-Basel 2022. [DOI: 10.3390/app12073329] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/05/2023]
Abstract
Ship mechanical system health prognosis is one of the major tasks of ship intelligent operation and maintenance (O&M). However, current failure prediction methods are aimed at single pieces of equipment, and system-level monitoring remains an underexplored area. To address this issue, an integration method based on a synthesized health indicator (SHI) and dynamic hybrid prediction is proposed. To accurately reflect the changes in system health conditions, a multi-state parameter fusion method based on dynamic kernel principal component analysis (DKPCA) and the stacked autoencoder (SAE) is presented, along with the construction of a system SHI. Because the system degradation process includes global degradation trends, local self-healing phenomena, and local interference, a dynamic hybrid prediction model is established after SHI decomposition. The proposed approach is applied to a ship fuel-oil system to demonstrate its effectiveness.
7
Wang T, Ng WWY, Pelillo M, Kwong S. LiSSA: Localized Stochastic Sensitive Autoencoders. IEEE Transactions on Cybernetics 2021; 51:2748-2760. [PMID: 31331899 DOI: 10.1109/tcyb.2019.2923756] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 06/10/2023]
Abstract
The training of an autoencoder (AE) focuses on the selection of connection weights via minimization of both the training error and a regularization term. However, the ultimate goal of AE training is to autoencode future unseen samples correctly (i.e., good generalization). Minimizing the training error with different regularization terms only indirectly minimizes the generalization error. Moreover, the trained model may not be robust to small perturbations of inputs, which may lead to poor generalization capability. In this paper, we propose a localized stochastic sensitive AE (LiSSA) to enhance the robustness of the AE with respect to input perturbations. With the local stochastic sensitivity regularization, LiSSA reduces sensitivity to unseen samples with small differences (perturbations) from training samples. Meanwhile, LiSSA preserves the local connectivity from the original input space to the representation space and thereby learns more robust features (intermediate representations) for unseen samples. A classifier using these learned features yields better generalization capability. Extensive experimental results on 36 benchmark datasets indicate that LiSSA significantly outperforms several classical and recent AE training methods on classification tasks.
8
Li Y, Sixou B, Peyrin F. A Review of the Deep Learning Methods for Medical Images Super Resolution Problems. Ing Rech Biomed 2021. [DOI: 10.1016/j.irbm.2020.08.004] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Indexed: 01/07/2023]
9
Ayinde BO, Inanc T, Zurada JM. Regularizing Deep Neural Networks by Enhancing Diversity in Feature Extraction. IEEE Transactions on Neural Networks and Learning Systems 2019; 30:2650-2661. [PMID: 30624232 DOI: 10.1109/tnnls.2018.2885972] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Indexed: 06/09/2023]
Abstract
This paper proposes a new and efficient technique to regularize neural networks in the context of deep learning using correlations among features. Previous studies have shown that oversized deep neural network models tend to produce many redundant features that are either shifted versions of one another or are very similar and show little or no variation, thus resulting in redundant filtering. We propose a way to address this problem and show that such redundancy can be avoided using regularization and an adaptive feature dropout mechanism. We show that regularizing both negatively and positively correlated features according to their differentiation and based on their relative cosine distances yields networks that extract dissimilar features, with less overfitting and better generalization. This concept is illustrated with a deep multilayer perceptron, convolutional neural network, sparse autoencoder, gated recurrent unit, and long short-term memory on MNIST digit recognition, CIFAR-10, ImageNet, and Stanford Natural Language Inference data sets.
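The idea of penalizing features by their cosine similarity can be illustrated with a small sketch. This is not the authors' regularizer: the abstract does not specify its form, so the code below simply sums the absolute cosine similarity of every feature pair that exceeds a cutoff, treating negatively and positively correlated pairs alike. The function names, the `threshold` parameter, and the example filters are all hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two nonzero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def redundancy_penalty(features, threshold=0.5):
    """Sum |cos| over all feature pairs whose |cos| exceeds the threshold.

    Taking the absolute value penalizes strongly negative correlations
    as well as strongly positive ones.
    """
    penalty = 0.0
    for i in range(len(features)):
        for j in range(i + 1, len(features)):
            sim = abs(cosine_similarity(features[i], features[j]))
            if sim > threshold:
                penalty += sim
    return penalty


# Three toy filters: the first two point in the same direction (redundant),
# the third is orthogonal to both (diverse).
filters = [[1.0, 0.0], [2.0, 0.0], [0.0, 1.0]]
print(redundancy_penalty(filters))
```

In a training loop, a term of this kind would be added to the task loss so that gradient descent pushes redundant filters apart.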
10
Lemke T, Peter C. EncoderMap: Dimensionality Reduction and Generation of Molecule Conformations. J Chem Theory Comput 2019; 15:1209-1215. [DOI: 10.1021/acs.jctc.8b00975] [Citation(s) in RCA: 45] [Impact Index Per Article: 7.5] [Indexed: 01/22/2023]
Affiliation(s)
- Tobias Lemke
  - Theoretical Chemistry, University of Konstanz, 78547 Konstanz, Germany
- Christine Peter
  - Theoretical Chemistry, University of Konstanz, 78547 Konstanz, Germany
11
Ogundijo OE, Wang X. SeqClone: sequential Monte Carlo based inference of tumor subclones. BMC Bioinformatics 2019; 20:6. [PMID: 30611189 PMCID: PMC6320595 DOI: 10.1186/s12859-018-2562-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 07/23/2018] [Accepted: 12/06/2018] [Indexed: 11/13/2022] Open
Abstract
BACKGROUND: Tumor samples are heterogeneous. They consist of varying cell populations or subclones, and each subclone is characterized by a distinct single nucleotide variant (SNV) profile. This explains the source of genetic heterogeneity observed in tumor sequencing data. To make a precise prognosis and design effective therapy for cancer, ascertaining the subclonal composition of a tumor is of great importance.
RESULTS: In this paper, we propose a state-space formulation of the feature allocation model. This model is interpreted as the blind deconvolution of the expected variant allele fractions (VAFs). VAFs are deconvolved into a binary matrix of genotypes and a matrix of genotype proportions in the samples. Specifically, we consider a sequential construction of the genotype matrix, which we model with an Indian buffet process (IBP). We describe an efficient sequential Monte Carlo (SMC) algorithm, SeqClone, that jointly estimates the genotypes of subclones and their proportions in the samples. When compared to other methods for resolving tumor heterogeneity, SeqClone provides comparable and sometimes better estimates of model parameters. By design, SeqClone conveniently handles any number of probed SNVs in the samples. In particular, we can analyze VAFs from newly probed SNVs to improve existing estimates, an attribute not present in existing solutions.
CONCLUSIONS: We show that the SMC algorithm for deconvolving VAFs from tumor sequencing data is a robust and promising alternative for explaining the observed genetic heterogeneity in tumor samples.
Electronic supplementary material: The online version of this article (10.1186/s12859-018-2562-y) contains supplementary material, which is available to authorized users.
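The deconvolution described in the results can be pictured through its forward direction: given a binary genotype matrix and a matrix of genotype proportions, the expected signal per SNV and sample is a proportion-weighted mix of the genotypes. The sketch below shows only that mixing step under simplifying assumptions. It omits any scaling constant the actual model may apply (for example, for heterozygosity) and does no inference at all; SeqClone's SMC estimation is not reproduced. The matrices `Z` and `W` and the function name are hypothetical.

```python
def mix_genotypes(genotypes, proportions):
    """Multiply a binary genotype matrix (SNVs x subclones) by a
    proportion matrix (subclones x samples), giving SNVs x samples."""
    n_clones = len(proportions)
    n_samples = len(proportions[0])
    return [
        [sum(row[c] * proportions[c][t] for c in range(n_clones))
         for t in range(n_samples)]
        for row in genotypes
    ]


# Hypothetical example: 3 SNVs, 2 subclones, 2 samples.
Z = [
    [1, 0],  # SNV 1 is carried by subclone A only
    [1, 1],  # SNV 2 is carried by both subclones
    [0, 1],  # SNV 3 is carried by subclone B only
]
W = [
    [0.6, 0.2],  # share of subclone A in samples 1 and 2
    [0.3, 0.5],  # share of subclone B in samples 1 and 2
]

mixed = mix_genotypes(Z, W)
for row in mixed:
    print([round(x, 2) for x in row])
```

Blind deconvolution runs this in reverse: only the mixed matrix is observed (with noise), and both factors must be estimated.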
Affiliation(s)
- Oyetunji E Ogundijo
  - Department of Electrical Engineering, Columbia University, New York, NY 10027, USA
- Xiaodong Wang
  - Department of Electrical Engineering, Columbia University, New York, NY 10027, USA
12
Alami N, En-nahnahi N, Ouatik SA, Meknassi M. Using Unsupervised Deep Learning for Automatic Summarization of Arabic Documents. Arabian Journal for Science and Engineering 2018. [DOI: 10.1007/s13369-018-3198-y] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Indexed: 11/30/2022]
13
Ogundijo OE, Wang X. Bayesian estimation of scaled mutation rate under the coalescent: a sequential Monte Carlo approach. BMC Bioinformatics 2017; 18:541. [PMID: 29216822 PMCID: PMC5721689 DOI: 10.1186/s12859-017-1948-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2017] [Accepted: 11/21/2017] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Samples of molecular sequence data of a locus obtained from random individuals in a population are often related by an unknown genealogy. More importantly, population genetics parameters are of significant interest, for instance the scaled population mutation rate $\Theta = 4N_e\mu$ for diploids or $\Theta = 2N_e\mu$ for haploids (where $N_e$ is the effective population size and $\mu$ is the mutation rate per site per generation), which explains some of the evolutionary history and past qualities of the population from which the samples are obtained.
RESULTS: In this paper, we present the evolution of sequence data in a Bayesian framework and the approximation of the posterior distributions of the unknown parameters of the model, which include $\Theta$, via sequential Monte Carlo (SMC) samplers for static models. Specifically, we approximate the posterior distributions of the unknown parameters with a set of weighted samples, i.e., the set of highly probable genealogies out of the infinite set of possible genealogies that describe the sampled sequences. The proposed SMC algorithm is evaluated on simulated DNA sequence datasets under different mutational models and on real biological sequences. In terms of the accuracy of the estimates, the proposed SMC method shows comparable and sometimes better performance than the state-of-the-art MCMC algorithms.
CONCLUSIONS: We showed that the SMC algorithm for static models is a promising alternative to the state-of-the-art approach for simulating from the posterior distributions of population genetics parameters.
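As a quick illustration of the quantity being estimated, the sketch below evaluates the scaled mutation rate from the two definitions given in the background. It shows only the deterministic formula; the paper's contribution is inferring Θ from sequence data, which this does not attempt. The function name and the numerical values are illustrative assumptions, not figures from the study.

```python
def scaled_mutation_rate(n_e: float, mu: float, ploidy: int = 2) -> float:
    """Theta = 4 * N_e * mu for diploids, 2 * N_e * mu for haploids.

    n_e: effective population size
    mu:  mutation rate per site per generation
    """
    if ploidy not in (1, 2):
        raise ValueError("ploidy must be 1 (haploid) or 2 (diploid)")
    return 2 * ploidy * n_e * mu


# Hypothetical values chosen only to show the factor-of-two difference.
theta_diploid = scaled_mutation_rate(n_e=10_000, mu=1e-8, ploidy=2)
theta_haploid = scaled_mutation_rate(n_e=10_000, mu=1e-8, ploidy=1)
print(theta_diploid, theta_haploid)
```

For the same effective population size and mutation rate, the diploid value is exactly twice the haploid one.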
Affiliation(s)
- Oyetunji E Ogundijo
  - Department of Electrical Engineering, Columbia University, New York, NY 10027, USA
- Xiaodong Wang
  - Department of Electrical Engineering, Columbia University, New York, NY 10027, USA