1
Moon HJ, Cho SB. Continual Learning by Contrastive Learning of Regularized Classes in Multivariate Gaussian Distributions. Int J Neural Syst 2025:2550025. [PMID: 40186335 DOI: 10.1142/s012906572550025x]
Abstract
Deep neural networks struggle with incremental updates due to catastrophic forgetting, where newly acquired knowledge interferes with previously learned knowledge. Continual learning (CL) methods aim to overcome this limitation by updating the model effectively without losing previous knowledge, but they struggle to maintain knowledge of previous tasks because the stored information overlaps. In this paper, we propose a CL method that preserves previous knowledge as multivariate Gaussian distributions by independently storing the model's outputs per class and continually reproducing them for future tasks. We enhance the discriminability between classes and ensure plasticity for future tasks by exploiting contrastive learning and representation regularization. The class-wise spatial means and covariances, distinguished in the latent space, are stored in memory, where the previous knowledge is effectively preserved and reproduced for incremental tasks. Extensive experiments on benchmark datasets such as CIFAR-10, CIFAR-100, and ImageNet-100 demonstrate that the proposed method achieves accuracies of 93.21%, 77.57%, and 78.15%, respectively, outperforming state-of-the-art CL methods by 2.34%p, 2.1%p, and 1.91%p. Additionally, it achieves the lowest mean forgetting rates across all datasets.
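For readers who want a concrete picture of the class-wise Gaussian memory idea described above, the following minimal Python/NumPy sketch (illustrative names and shapes, not the authors' implementation) stores a per-class mean and covariance of latent features and samples from them to replay earlier classes during later tasks:

```python
import numpy as np

class GaussianClassMemory:
    """Stores per-class latent statistics and replays synthetic features."""

    def __init__(self):
        self.stats = {}  # class label -> (mean vector, covariance matrix)

    def store(self, features, label):
        # features: (n_samples, latent_dim) latent vectors for one class
        mean = features.mean(axis=0)
        cov = np.cov(features, rowvar=False)
        self.stats[label] = (mean, cov)

    def replay(self, label, n_samples):
        # Draw synthetic latent vectors for a previously seen class
        mean, cov = self.stats[label]
        return np.random.multivariate_normal(mean, cov, size=n_samples)

# Usage: after finishing a task, store the statistics of each class's features;
# when training on the next task, mix replayed vectors into the batch.
memory = GaussianClassMemory()
old_feats = np.random.randn(200, 64)          # placeholder latent features
memory.store(old_feats, label=0)
replayed = memory.replay(label=0, n_samples=32)
print(replayed.shape)                          # (32, 64)
```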
Affiliation(s)
- Hyung-Jun Moon
- Department of Artificial Intelligence, Yonsei University, 50 Yonsei-ro, Sudaemoon-gu, Seoul 03722, South Korea
- Sung-Bae Cho
- Department of Computer Science, Yonsei University, 50 Yonsei-ro, Sudaemoon-gu, Seoul 03722, South Korea

2
Zhou X, Jiang Z, Zhou S, Ren Z, Zhang Y, Yu T, Liu Y. Frequency-Assisted Local Attention in Lower Layers of Visual Transformers. Int J Neural Syst 2025; 35:2550015. [PMID: 40016195 DOI: 10.1142/s0129065725500157]
Abstract
Since vision transformers excel at establishing global relationships between features, they play an important role in current vision tasks. However, the global attention mechanism restricts the capture of local features, making convolutional assistance necessary. This paper shows that, with a suitable initialization method, transformer-based models can attend to local information much as convolutional kernels do, without using convolutional blocks. Therefore, this paper proposes a novel hybrid multi-scale model called the Frequency-Assisted Local Attention Transformer (FALAT). FALAT introduces a Frequency-Assisted Window-based Positional Self-Attention (FWPSA) module that limits the attention distance of query tokens, enabling the capture of local content in the early stages. Information from value tokens in the frequency domain enhances information diversity during self-attention computation. Additionally, the traditional convolutional downsampling is replaced with a depth-wise separable convolution in the spatial-reduction attention module that handles long-distance content in the later stages. Experimental results demonstrate that FALAT-S achieves 83.0% accuracy on IN-1k with an input size of [Formula: see text], using 29.9M parameters and 5.6G FLOPs. The model outperforms Next-ViT-S by 0.9 APb/0.8 APm with Mask R-CNN [Formula: see text] on COCO and surpasses the recent FastViT-SA36 by 3.1% mIoU with FPN on ADE20k.
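The FWPSA module described above limits the attention distance of query tokens. A generic PyTorch sketch of such a distance-limited attention mask is shown below; it illustrates only the local-window restriction, not the paper's frequency-domain value enhancement, and all names and sizes are assumptions:

```python
import torch

def local_attention_mask(seq_len: int, window: int) -> torch.Tensor:
    """Boolean mask that only lets each query attend to keys
    within +/- `window` positions, mimicking a limited attention distance."""
    idx = torch.arange(seq_len)
    dist = (idx[None, :] - idx[:, None]).abs()
    return dist <= window  # True where attention is permitted

def masked_attention(q, k, v, window):
    # q, k, v: (batch, seq_len, dim)
    scores = q @ k.transpose(-2, -1) / (q.shape[-1] ** 0.5)
    mask = local_attention_mask(q.shape[1], window).to(scores.device)
    scores = scores.masked_fill(~mask, float("-inf"))
    return torch.softmax(scores, dim=-1) @ v

q = k = v = torch.randn(2, 16, 32)
out = masked_attention(q, k, v, window=3)
print(out.shape)  # torch.Size([2, 16, 32])
```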
Affiliation(s)
- Xin Zhou
- School of Mechanical Engineering and Automation, Northeastern University, Wenhua Road, Shen Yang, Liao Ning, P. R. China
- Zeyu Jiang
- School of Mechanical Engineering and Automation, Northeastern University, Wenhua Road, Shen Yang, Liao Ning, P. R. China
- Shihua Zhou
- School of Mechanical Engineering and Automation, Northeastern University, Wenhua Road, Shen Yang, Liao Ning, P. R. China
- Zhaohui Ren
- School of Mechanical Engineering and Automation, Northeastern University, Wenhua Road, Shen Yang, Liao Ning, P. R. China
- Yongchao Zhang
- School of Mechanical Engineering and Automation, Northeastern University, Wenhua Road, Shen Yang, Liao Ning, P. R. China
- Tianzhuang Yu
- School of Mechanical Engineering and Automation, Northeastern University, Wenhua Road, Shen Yang, Liao Ning, P. R. China
- Yulin Liu
- School of Mechanical Engineering and Automation, Northeastern University, Wenhua Road, Shen Yang, Liao Ning, P. R. China

3
Xue Y, Lin Y, Neri F. Architecture Knowledge Distillation for Evolutionary Generative Adversarial Network. Int J Neural Syst 2025; 35:2550013. [PMID: 39967019 DOI: 10.1142/s0129065725500133]
Abstract
Generative Adversarial Networks (GANs) are effective for image generation, but their unstable training limits broader applications. Additionally, neural architecture search (NAS) for GANs with one-shot models often leads to insufficient subnet training, where subnets inherit weights from a supernet without proper optimization, further degrading performance. To address both issues, we propose Architecture Knowledge Distillation for Evolutionary GAN (AKD-EGAN). AKD-EGAN operates in two stages. First, architecture knowledge distillation (AKD) is used during supernet training to efficiently optimize subnetworks and accelerate learning. Second, a multi-objective evolutionary algorithm (MOEA) searches for optimal subnet architectures, ensuring efficiency by considering multiple performance metrics. This approach, combined with a strategy for architecture inheritance, enhances GAN stability and image quality. Experiments show that AKD-EGAN surpasses state-of-the-art methods, achieving a Fréchet Inception Distance (FID) of 7.91 and an Inception Score (IS) of 8.97 on CIFAR-10, along with competitive results on STL-10 (FID: 20.32, IS: 10.06). Code and models will be available at https://github.com/njit-ly/AKD-EGAN.
Affiliation(s)
- Yu Xue
- School of Computer and Software, Nanjing University of Information Science and Technology, Nanjing 210044, P. R. China
- Yan Lin
- School of Computer and Software, Nanjing University of Information Science and Technology, Nanjing 210044, P. R. China
- Ferrante Neri
- NICE Research Group, School of Computer Science and Electronic Engineering, University of Surrey, Guildford, GU2 7XS, UK

4
Li X, Yan S, Wu Y, Dai C, Guo Y. A Novel State Space Model with Dynamic Graphic Neural Network for EEG Event Detection. Int J Neural Syst 2025; 35:2550008. [PMID: 39962836 DOI: 10.1142/s012906572550008x]
Abstract
Electroencephalography (EEG) is a widely used physiological signal for obtaining information about brain activity, and automatic detection of EEG events is of significant research value: it saves clinicians' time and improves detection efficiency and accuracy. However, current automatic detection studies face several challenges: large EEG data volumes require substantial time and space for data reading and model training; EEG's long-term dependencies test the temporal feature extraction capabilities of models; and the dynamic changes in brain activity and the non-Euclidean spatial structure between electrodes complicate the acquisition of spatial information. The proposed method uses range-EEG (rEEG) to extract time-frequency features from EEG, reducing data volume and resource consumption. Additionally, the next-generation state-space model Mamba is utilized as a temporal feature extractor to effectively capture the temporal information in EEG data. To address the limitations of state space models (SSMs) in spatial feature extraction, Mamba is combined with Dynamic Graph Neural Networks, creating an efficient model called DG-Mamba for EEG event detection. Testing on seizure detection and sleep stage classification tasks showed that the proposed method improved training speed by a factor of 10 and reduced memory usage to less than one-seventh of the original data while maintaining superior performance. On the TUSZ dataset, DG-Mamba achieved an AUROC of 0.931 for seizure detection, and in the sleep stage classification task the proposed model surpassed all baselines.
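Range-EEG (rEEG) compresses the raw signal by summarizing each short window with its peak-to-peak amplitude, which is how the data volume reduction described above is achieved. A minimal sketch of that reduction step follows; the window length and sampling rate are placeholders rather than the paper's settings:

```python
import numpy as np

def range_eeg(signal: np.ndarray, fs: int = 256, window_s: float = 2.0) -> np.ndarray:
    """Peak-to-peak amplitude of each non-overlapping window.

    signal: (n_channels, n_samples) raw EEG
    returns: (n_channels, n_windows) compressed representation
    """
    win = int(fs * window_s)
    n_windows = signal.shape[1] // win
    trimmed = signal[:, : n_windows * win]
    windows = trimmed.reshape(signal.shape[0], n_windows, win)
    return windows.max(axis=2) - windows.min(axis=2)

# Example: a 19-channel, 10-minute recording at 256 Hz shrinks from
# 19 x 153600 samples to 19 x 300 range values.
eeg = np.random.randn(19, 256 * 600)
print(range_eeg(eeg).shape)  # (19, 300)
```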
Affiliation(s)
- Xinying Li
- School of Information Science and Technology, Fudan University, Shanghai 200433, P. R. China
- Shengjie Yan
- School of Information Science and Technology, Fudan University, Shanghai 200433, P. R. China
- Yonglin Wu
- School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai 200241, P. R. China
- Chenyun Dai
- School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai 200241, P. R. China
- Yao Guo
- School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai 200241, P. R. China

5
Huang W, Tang Y, Wang S, Li J, Cheng K, Yan H. Unraveling the Differential Efficiency of Dorsal and Ventral Pathways in Visual Semantic Decoding. Int J Neural Syst 2025; 35:2550009. [PMID: 39789871 DOI: 10.1142/s0129065725500091]
Abstract
Visual semantic decoding aims to extract perceived semantic information from the visual responses of the human brain and convert it into interpretable semantic labels. Although significant progress has been made in semantic decoding across individual visual cortices, studies on the semantic decoding of the ventral and dorsal cortical visual pathways remain limited. This study proposed a graph neural network (GNN)-based semantic decoding model on a natural scene dataset (NSD) to investigate the decoding differences between the dorsal and ventral pathways in processing various parts of speech, including verbs, nouns, and adjectives. Our results indicate that decoding accuracies for verbs and nouns with motion attributes were significantly higher for the dorsal pathway than for the ventral pathway, and comparative analyses show that this superiority largely stemmed from higher-level visual cortices rather than lower-level ones. Furthermore, the two pathways appear to converge in their heightened sensitivity toward semantic content related to actions. These findings reveal unique visual neural mechanisms through which the dorsal and ventral cortical pathways segregate and converge when processing stimuli of different semantic categories.
Affiliation(s)
- Wei Huang
- Yangtze Delta Region Institute (Huzhou), University of Electronic Science and Technology of China, Huzhou 313001, P. R. China
- Ying Tang
- Yangtze Delta Region Institute (Huzhou), University of Electronic Science and Technology of China, Huzhou 313001, P. R. China
- Sizhuo Wang
- Yangtze Delta Region Institute (Huzhou), University of Electronic Science and Technology of China, Huzhou 313001, P. R. China
- Jingpeng Li
- Yangtze Delta Region Institute (Huzhou), University of Electronic Science and Technology of China, Huzhou 313001, P. R. China
- Kaiwen Cheng
- College of Language Intelligence, Sichuan International Studies University, Chongqing 400031, P. R. China
- Hongmei Yan
- Yangtze Delta Region Institute (Huzhou), University of Electronic Science and Technology of China, Huzhou 313001, P. R. China

6
Noneman KK, Mayo JP. Decoding Continuous Tracking Eye Movements from Cortical Spiking Activity. Int J Neural Syst 2025; 35:2450070. [PMID: 39545725 PMCID: PMC12049095 DOI: 10.1142/s0129065724500709]
Abstract
Eye movements are the primary way primates interact with the world. Understanding how the brain controls the eyes is therefore crucial for improving human health and designing visual rehabilitation devices. However, brain activity is challenging to decipher. Here, we leveraged machine learning algorithms to reconstruct tracking eye movements from high-resolution neuronal recordings. We found that continuous eye position could be decoded with high accuracy using spiking data from only a few dozen cortical neurons. We tested eight decoders and found that neural network models yielded the highest decoding accuracy. Simpler models performed well above chance with a substantial reduction in training time. We measured the impact of data quantity (e.g. number of neurons) and data format (e.g. bin width) on training time, inference time, and generalizability. Training models with more input data improved performance, as expected, but the format of the behavioral output was critical for emphasizing or omitting specific oculomotor events. Our results provide the first demonstration, to our knowledge, of continuously decoded eye movements across a large field of view. Our comprehensive investigation of predictive power and computational efficiency for common decoder architectures provides a much-needed foundation for future work on real-time gaze-tracking devices.
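As an illustration of the kind of decoding pipeline evaluated above (a generic sketch on simulated data, not the authors' recordings or models), continuous eye position can be regressed from binned spike counts with a simple linear decoder:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Simulated session: 5000 time bins, 40 neurons, 2D eye position (x, y)
n_bins, n_neurons = 5000, 40
true_weights = rng.normal(size=(n_neurons, 2))
spike_counts = rng.poisson(lam=2.0, size=(n_bins, n_neurons)).astype(float)
eye_position = spike_counts @ true_weights + rng.normal(scale=0.5, size=(n_bins, 2))

# Keep the chronological order of bins when splitting
X_train, X_test, y_train, y_test = train_test_split(
    spike_counts, eye_position, test_size=0.2, shuffle=False
)

decoder = Ridge(alpha=1.0)
decoder.fit(X_train, y_train)
print("R^2 on held-out bins:", decoder.score(X_test, y_test))
```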
Affiliation(s)
- Kendra K. Noneman
- Neuroscience Institute, Carnegie Mellon University, 4400 Fifth Avenue, Pittsburgh, PA 15213, USA
- J. Patrick Mayo
- Department of Ophthalmology, University of Pittsburgh, 1622 Locust Street, Pittsburgh, PA 15219, USA

7
Zhao L, Zou R, Jin L. Deep Learning Recognition of Paroxysmal Kinesigenic Dyskinesia Based on EEG Functional Connectivity. Int J Neural Syst 2025; 35:2550001. [PMID: 39560445 DOI: 10.1142/s0129065725500017]
Abstract
Paroxysmal kinesigenic dyskinesia (PKD) is a rare neurological disorder marked by transient involuntary movements triggered by sudden actions. Current diagnostic approaches, including genetic screening, face challenges in identifying secondary cases due to symptom overlap with other disorders. This study introduces a novel PKD recognition method utilizing a resting-state electroencephalogram (EEG) functional connectivity matrix and a deep learning architecture (AT-1CBL). Resting-state EEG data from 44 PKD patients and 44 healthy controls (HCs) were collected using a 128-channel EEG system. Functional connectivity matrices were computed and transformed into graph data to examine brain network property differences between PKD patients and controls through graph theory. Source localization was conducted to explore neural circuit differences in patients. The AT-1CBL model, integrating 1D-CNN and Bi-LSTM with attentional mechanisms, achieved a classification accuracy of 93.77% on phase lag index (PLI) features in the Theta band. Graph theoretic analysis revealed significant phase synchronization impairments in the Theta band of the functional brain network in PKD patients, particularly in the distribution of weak connections compared to HCs. Source localization analyses indicated greater differences in functional connectivity in sensorimotor regions and the frontal-limbic system in PKD patients, suggesting abnormalities in motor integration related to clinical symptoms. This study highlights the potential of deep learning models based on EEG functional connectivity for accurate and cost-effective PKD diagnosis, supporting the development of portable EEG devices for clinical monitoring and diagnosis. However, the limited dataset size may affect generalizability, and further exploration of multimodal data integration and advanced deep learning architectures is necessary to enhance the robustness of PKD diagnostic models.
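The connectivity feature used here, the phase lag index (PLI), can be computed from band-filtered signals via the analytic signal. A rough Python sketch follows; the channel count, sampling rate, and Theta band edges are placeholders rather than the study's 128-channel configuration:

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def phase_lag_index(eeg: np.ndarray, fs: float, band=(4.0, 8.0)) -> np.ndarray:
    """PLI connectivity matrix for (n_channels, n_samples) data in one band."""
    b, a = butter(4, [band[0] / (fs / 2), band[1] / (fs / 2)], btype="band")
    filtered = filtfilt(b, a, eeg, axis=1)
    phase = np.angle(hilbert(filtered, axis=1))
    n_ch = eeg.shape[0]
    pli = np.zeros((n_ch, n_ch))
    for i in range(n_ch):
        for j in range(i + 1, n_ch):
            diff = phase[i] - phase[j]
            # Asymmetry of the phase-difference distribution around zero
            pli[i, j] = pli[j, i] = abs(np.mean(np.sign(np.sin(diff))))
    return pli

eeg = np.random.randn(8, 2000)             # 8 channels, 10 s at 200 Hz
print(phase_lag_index(eeg, fs=200).shape)  # (8, 8)
```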
Affiliation(s)
- Liang Zhao
- School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai 200093, P. R. China
- Renling Zou
- School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai 200093, P. R. China
- Linpeng Jin
- School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai 200093, P. R. China

8
Wang K, Zhou R, Wang J, Neri F, Fu Y, Zhou S. A Cloud Detection Network Based on Adaptive Laplacian Coordination Enhanced Cross-Feature U-Net. Int J Neural Syst 2024:2550005. [PMID: 39673153 DOI: 10.1142/s0129065725500054]
Abstract
Cloud cover experiences rapid fluctuations, significantly impacting the irradiance reaching the ground and causing frequent variations in photovoltaic power output. Accurate detection of thin and fragmented clouds is crucial for reliable photovoltaic power generation forecasting. In this paper, we introduce a novel cloud detection method, termed Adaptive Laplacian Coordination Enhanced Cross-Feature U-Net (ALCU-Net). This method augments the traditional U-Net architecture with three innovative components: an Adaptive Feature Coordination (AFC) module, an Adaptive Laplacian Cross-Feature U-Net with a Multi-Grained Laplacian-Enhanced (MLE) feature module, and a Criss-Cross Feature Fused Detection (CCFE) module. The AFC module enhances spatial coherence and bridges semantic gaps across multi-channel images. The Adaptive Laplacian Cross-Feature U-Net integrates features from adjacent hierarchical levels, using the MLE module to refine cloud characteristics and edge details over time. The CCFE module, embedded in the U-Net decoder, leverages criss-cross features to improve detection accuracy. Experimental evaluations show that ALCU-Net consistently outperforms existing cloud detection methods, demonstrating superior accuracy in identifying both thick and thin clouds and in mapping fragmented cloud patches across various environments, including oceans, polar regions, and complex ocean-land mixtures.
Affiliation(s)
- Kaizheng Wang
- Faculty of Electric Power Engineering, Kunming University of Science and Technology, Kunming 650500, P.R. China
- Ruohan Zhou
- Faculty of Electric Power Engineering, Kunming University of Science and Technology, Kunming 650500, P.R. China
- Jian Wang
- Faculty of Electric Power Engineering, Kunming University of Science and Technology, Kunming 650500, P.R. China
- Ferrante Neri
- Nature Inspired Computing and Engineering Research Group, School of Computer Science and Electronic Engineering, University of Surrey, Guildford, Surrey GU2 7XH, UK
- Yitong Fu
- Faculty of Electric Power Engineering, Kunming University of Science and Technology, Kunming 650500, P.R. China
- Shunzhen Zhou
- Faculty of Electric Power Engineering, Kunming University of Science and Technology, Kunming 650500, P.R. China

9
Zheng X, Yang Y, Li D, Deng Y, Xie Y, Yi Z, Ma L, Xu L. Precise Localization for Anatomo-Physiological Hallmarks of the Cervical Spine by Using Neural Memory Ordinary Differential Equation. Int J Neural Syst 2024; 34:2450056. [PMID: 39049777 DOI: 10.1142/s0129065724500564]
Abstract
In the evaluation of cervical spine disorders, precise positioning of anatomo-physiological hallmarks is fundamental for calculating diverse measurement metrics. Although deep learning has achieved impressive results in keypoint localization, many limitations remain when it is applied to medical images. First, these methods often struggle with the inherent variability in cervical spine datasets arising from imaging factors. Second, predicting keypoints that cover only 4% of the entire X-ray image surface area poses a significant challenge. To tackle these issues, we propose a deep neural network architecture, NF-DEKR, specifically tailored for predicting keypoints in cervical spine physiological anatomy. Leveraging a neural memory ordinary differential equation, with its distinctive separation of memory and learning and its convergence to a single global attractor, our design effectively mitigates the inherent data variability. Simultaneously, we introduce a Multi-Resolution Focus module to preprocess feature maps before they enter the disentangled regression branch and the heatmap branch. By employing a differentiated strategy for feature maps of varying scales, this approach yields more accurate predictions of densely localized keypoints. We construct a medical dataset, SCUSpineXray, comprising X-ray images annotated by orthopedic specialists, and conduct similar experiments on the publicly available UWSpineCT dataset. Experimental results demonstrate that, compared to the baseline DEKR network, our proposed method improves average precision by 2% to 3%, with only a marginal increase in model parameters and floating-point operations (FLOPs). The code is available at https://github.com/Zhxyi/NF-DEKR.
Affiliation(s)
- Xi Zheng
- Machine Intelligence Laboratory, College of Computer Science, Sichuan University, No. 24 South Section 1, Yihuan Road, Chengdu 610065, P. R. China
- Yi Yang
- Department of Orthopedics, Orthopedic Research Institute, West China Hospital, Sichuan University, No. 37 Guo Xue Road, Chengdu 610041, P. R. China
- Dehan Li
- Machine Intelligence Laboratory, College of Computer Science, Sichuan University, No. 24 South Section 1, Yihuan Road, Chengdu 610065, P. R. China
- Yi Deng
- Department of Orthopedics, Orthopedic Research Institute, West China Hospital, Sichuan University, No. 37 Guo Xue Road, Chengdu 610041, P. R. China
- Yuexiong Xie
- Machine Intelligence Laboratory, College of Computer Science, Sichuan University, No. 24 South Section 1, Yihuan Road, Chengdu 610065, P. R. China
- Zhang Yi
- Machine Intelligence Laboratory, College of Computer Science, Sichuan University, No. 24 South Section 1, Yihuan Road, Chengdu 610065, P. R. China
- Litai Ma
- Department of Orthopedics, Orthopedic Research Institute, West China Hospital, Sichuan University, No. 37 Guo Xue Road, Chengdu 610041, P. R. China
- Lei Xu
- Machine Intelligence Laboratory, College of Computer Science, Sichuan University, No. 24 South Section 1, Yihuan Road, Chengdu 610065, P. R. China

10
Jiang P, Neri F, Xue Y, Maulik U. A Generalized Attention Mechanism to Enhance the Accuracy Performance of Neural Networks. Int J Neural Syst 2024; 34:2450063. [PMID: 39212940 DOI: 10.1142/s0129065724500631]
Abstract
In many modern machine learning (ML) models, attention mechanisms (AMs) play a crucial role in processing data and identifying significant parts of the inputs, whether these are text or images. This selective focus enables subsequent stages of the model to achieve improved classification performance. Traditionally, AMs are applied as a preprocessing substructure before a neural network, such as in encoder/decoder architectures. In this paper, we extend the application of AMs to intermediate stages of data propagation within ML models. Specifically, we propose a generalized attention mechanism (GAM), which can be integrated before each layer of a neural network for classification tasks. The proposed GAM allows the most relevant sections of the intermediate results to be identified at each layer/step of the ML architecture. Our experimental results demonstrate that incorporating the proposed GAM into various ML models consistently enhances their accuracy. This improvement is achieved with only a marginal increase in the number of parameters, which does not significantly affect the training time.
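A minimal sketch of the general idea of inserting an attention gate before every layer is given below; it uses a simple learned gating module as a stand-in and is not the paper's exact GAM formulation:

```python
import torch
import torch.nn as nn

class FeatureAttention(nn.Module):
    """Reweights the features entering a layer by learned importance scores."""

    def __init__(self, dim: int, hidden: int = 32):
        super().__init__()
        self.score = nn.Sequential(
            nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, dim), nn.Sigmoid()
        )

    def forward(self, x):                 # x: (batch, dim)
        return x * self.score(x)          # emphasize relevant components

class GatedMLP(nn.Module):
    """An MLP classifier with an attention gate before every layer."""

    def __init__(self, dims=(64, 128, 64, 10)):
        super().__init__()
        layers = []
        for d_in, d_out in zip(dims[:-1], dims[1:]):
            layers += [FeatureAttention(d_in), nn.Linear(d_in, d_out), nn.ReLU()]
        self.net = nn.Sequential(*layers[:-1])  # drop the trailing ReLU on logits

    def forward(self, x):
        return self.net(x)

model = GatedMLP()
print(model(torch.randn(8, 64)).shape)  # torch.Size([8, 10])
```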
Affiliation(s)
- Pengcheng Jiang
- School of Software, Nanjing University of Information Science and Technology, Nanjing 210044, P. R. China
- Ferrante Neri
- NICE Research Group, School of Computer Science and Electronic Engineering, University of Surrey, Guildford GU2 7XS, UK
- Yu Xue
- School of Software, Nanjing University of Information Science and Technology, Nanjing 210044, P. R. China
- Ujjwal Maulik
- Department of Computer Science and Engineering, Jadavpur University, Kolkata, India

11
Rafiei MH, Gauthier LV, Adeli H, Takabi D. Self-Supervised Learning for Near-Wild Cognitive Workload Estimation. J Med Syst 2024; 48:107. [PMID: 39576291 DOI: 10.1007/s10916-024-02122-7]
Abstract
Feedback on cognitive workload may reduce decision-making mistakes. Machine learning-based models can produce feedback from physiological data such as electroencephalography (EEG) and electrocardiography (ECG). Supervised machine learning requires large training data sets that are (1) relevant and decontaminated and (2) carefully labeled for accurate approximation, a costly and tedious procedure. Commercial over-the-counter devices are low-cost resolutions for the real-time collection of physiological modalities. However, they produce significant artifacts when employed outside of laboratory settings, compromising machine learning accuracies. Additionally, the physiological modalities that most successfully machine-approximate cognitive workload in everyday settings are unknown. To address these challenges, a first-ever hybrid implementation of feature selection and self-supervised machine learning techniques is introduced. This model is employed on data collected outside controlled laboratory settings to (1) identify relevant physiological modalities to machine approximate six levels of cognitive-physical workloads from a seven-modality repository and (2) postulate limited labeling experiments and machine approximate mental-physical workloads using self-supervised learning techniques.
Affiliation(s)
- Mohammad H Rafiei
- Whiting School of Engineering, Johns Hopkins University, 21218, Baltimore, MD, USA
- Lynne V Gauthier
- Department of Physical Therapy and Kinesiology, University of Massachusetts Lowell, 01854, Lowell, MA, USA
- Hojjat Adeli
- Departments of Biomedical Informatics and Neuroscience, The Ohio State University, 43210, Columbus, OH, USA
- Daniel Takabi
- School of Cybersecurity, Old Dominion University, 23529, Norfolk, VA, USA

12
Lanzino R, Avola D, Fontana F, Cinque L, Scarcello F, Foresti GL. SATEER: Subject-Aware Transformer for EEG-Based Emotion Recognition. Int J Neural Syst 2024:2550002. [PMID: 39560447 DOI: 10.1142/s0129065725500029]
Abstract
This study presents a Subject-Aware Transformer-based neural network designed for the Electroencephalogram (EEG) Emotion Recognition task (SATEER), which entails the analysis of EEG signals to classify and interpret human emotional states. SATEER processes the EEG waveforms by transforming them into Mel spectrograms, which can be seen as particular cases of images whose number of channels equals the number of electrodes used during the recording process; this type of data can thus be processed using a computer vision pipeline. Distinct from preceding approaches, this model addresses the variability in individual responses to identical stimuli by incorporating a User Embedder module. This module enables the association of individual profiles with their EEGs, thereby enhancing classification accuracy. The efficacy of the model was rigorously evaluated using four publicly available datasets, demonstrating superior performance over existing methods in all conducted benchmarks. For instance, on the AMIGOS dataset (A dataset for Multimodal research of affect, personality traits, and mood on Individuals and GrOupS), SATEER's accuracy exceeds 99.8% across all labels and shows an improvement of 0.47% over the state of the art. Furthermore, an exhaustive ablation study underscores the pivotal role of the User Embedder module and of each other component of the presented model in achieving these advancements.
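A rough sketch of the preprocessing described above, turning each EEG channel into a Mel spectrogram so that a recording becomes an image whose channel count equals the electrode count (sampling rate and transform parameters are assumptions, not the paper's settings):

```python
import torch
import torchaudio

sample_rate = 128                      # assumed EEG sampling rate
to_mel = torchaudio.transforms.MelSpectrogram(
    sample_rate=sample_rate, n_fft=128, hop_length=32, n_mels=32
)

# (n_electrodes, n_samples): a 32-channel, 60-second recording
eeg = torch.randn(32, sample_rate * 60)

# One spectrogram per electrode -> an "image" with 32 channels
mel = to_mel(eeg)                      # shape: (32, n_mels, n_frames)
log_mel = torch.log1p(mel)             # compress the dynamic range
print(log_mel.shape)
```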
Affiliation(s)
- Romeo Lanzino
- Department of Computer Science, Sapienza University of Rome, Via Salaria 113, Rome 00198, Italy
- Danilo Avola
- Department of Computer Science, Sapienza University of Rome, Via Salaria 113, Rome 00198, Italy
- Federico Fontana
- Department of Computer Science, Sapienza University of Rome, Via Salaria 113, Rome 00198, Italy
- Luigi Cinque
- Department of Computer Science, Sapienza University of Rome, Via Salaria 113, Rome 00198, Italy
- Francesco Scarcello
- Department of Computer Engineering, Modeling, Electronics, and Systems Engineering, University of Calabria, Via Pietro Bucci, Rende (CS) 87036, Italy
- Gian Luca Foresti
- Department of Mathematics, Computer Science and Physics, University of Udine, Via delle Scienze, Udine 33100, Italy

13
Ma C, Neri F, Gu L, Wang Z, Wang J, Qing A, Wang Y. Crowd Counting Using Meta-Test-Time Adaptation. Int J Neural Syst 2024; 34:2450061. [PMID: 39252679 DOI: 10.1142/s0129065724500618]
Abstract
Machine learning algorithms are commonly used for quickly and efficiently counting people in a crowd. Test-time adaptation methods for crowd counting adjust model parameters and employ additional data augmentation to better adapt the model to the specific conditions encountered during testing. The majority of current studies concentrate on unsupervised domain adaptation. These approaches commonly perform hundreds of epochs of training iterations, requiring a sizable amount of unannotated data from every new target domain in addition to annotated data from the source domain. Unlike these methods, we propose a meta-test-time adaptive crowd counting approach called CrowdTTA, which integrates the concept of test-time adaptation into the meta-learning framework and makes it easier for the counting model to adapt to unknown test distributions. To provide a reliable supervision signal at the pixel level, we introduce uncertainty by inserting a dropout layer into the counting model. The uncertainty is then used to generate valuable pseudo labels, serving as effective supervisory signals for adapting the model. In the context of meta-learning, one image can be regarded as one task for crowd counting. In each iteration, our approach is a dual-level optimization process. In the inner update, we employ a self-supervised consistency loss function to optimize the model so as to simulate the parameter update process that occurs during the test phase. In the outer update, we authentically update the parameters based on the image with ground truth, improving the model's performance and making the pseudo labels more accurate in the next iteration. At test time, the input image is used to adapt the model before that image is tested. In comparison with various supervised learning and domain adaptation methods, extensive experiments on diverse datasets showcase the general adaptive capability of our approach across datasets with varying crowd densities and scales.
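One concrete piece of this pipeline, estimating pixel-level uncertainty by keeping dropout active and averaging several stochastic forward passes to form pseudo labels, could be sketched as follows; the tiny counting network is a placeholder, not the authors' model:

```python
import torch
import torch.nn as nn

class TinyCounter(nn.Module):
    """Placeholder density-map regressor with a dropout layer."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Dropout2d(p=0.2),
            nn.Conv2d(16, 1, 1),
        )

    def forward(self, x):
        return self.net(x)

def mc_dropout_pseudo_label(model, image, n_passes=8):
    """Mean density map (pseudo label) and per-pixel std (uncertainty)."""
    model.train()                      # keep dropout stochastic
    with torch.no_grad():
        preds = torch.stack([model(image) for _ in range(n_passes)])
    return preds.mean(dim=0), preds.std(dim=0)

model = TinyCounter()
image = torch.randn(1, 3, 64, 64)
pseudo, uncertainty = mc_dropout_pseudo_label(model, image)
print(pseudo.shape, uncertainty.shape)  # both (1, 1, 64, 64)
```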
Affiliation(s)
- Chaoqun Ma
- School of Electrical Engineering, Southwest Jiaotong University, Chengdu 611756, P. R. China
- Ferrante Neri
- NICE Group, School of Computer Science and Electronic Engineering, University of Surrey, Guildford, Surrey GU2 7XH, UK
- Li Gu
- Department of Computer Science and Software Engineering, Concordia University, Montreal, QC H3H 2L9, Canada
- Ziqiang Wang
- Department of Computer Science and Software Engineering, Concordia University, Montreal, QC H3H 2L9, Canada
- Jian Wang
- Faculty of Electric Power Engineering, Kunming University of Science and Technology, Kunming 650500, P. R. China
- Anyong Qing
- School of Electrical Engineering, Southwest Jiaotong University, Chengdu 611756, P. R. China
- Yang Wang
- Department of Computer Science and Software Engineering, Concordia University, Montreal, QC H3H 2L9, Canada

14
Liu Y, Jiang Y, Liu J, Li J, Liu M, Nie W, Yuan Q. Efficient EEG Feature Learning Model Combining Random Convolutional Kernel with Wavelet Scattering for Seizure Detection. Int J Neural Syst 2024; 34:2450060. [PMID: 39252680 DOI: 10.1142/s0129065724500606]
Abstract
Automatic seizure detection has significant value in epilepsy diagnosis and treatment. Although a variety of deep learning models have been proposed to automatically learn electroencephalography (EEG) features for seizure detection, the generalization performance and computational burden of such deep models remain the bottleneck of practical application. In this study, a novel lightweight model based on the random convolutional kernel transform (ROCKET) is developed for EEG feature learning for seizure detection. Specifically, random convolutional kernels are embedded into the structure of a wavelet scattering network in place of the original wavelet transform convolutions. The significant EEG features are then selected from the scattering coefficients and convolutional outputs by analysis of variance (ANOVA) and minimum redundancy-maximum relevance (MRMR) methods. This model not only preserves the fast-training merits of ROCKET, but also provides insight into seizure detection by retaining only the helpful channels. The extreme gradient boosting (XGBoost) classifier was combined with this EEG feature learning model to build a comprehensive seizure detection system that achieved promising epoch-based results, with both sensitivity and specificity above 90% on the scalp and intracranial EEG databases. The experimental comparisons showed that the proposed method outperformed other state-of-the-art methods for cross-patient and patient-specific seizure detection.
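The random convolutional kernel transform at the heart of this model applies many fixed, randomly parameterized 1D kernels and keeps summary statistics of each response. A bare-bones NumPy sketch of that idea is shown below; it omits the dilation and padding randomization of full ROCKET and the wavelet scattering stage:

```python
import numpy as np

rng = np.random.default_rng(42)

def random_kernel_features(signal: np.ndarray, n_kernels: int = 100) -> np.ndarray:
    """Convolve a 1D signal with random kernels and keep two summary
    statistics per kernel: the max response and the fraction of positives."""
    features = []
    for _ in range(n_kernels):
        length = rng.choice([7, 9, 11])
        weights = rng.normal(size=length)
        bias = rng.uniform(-1, 1)
        response = np.convolve(signal, weights, mode="valid") + bias
        features.append(response.max())
        features.append((response > 0).mean())
    return np.array(features)

eeg_channel = np.random.randn(2560)          # 10 s of one EEG channel at 256 Hz
feats = random_kernel_features(eeg_channel)
print(feats.shape)                           # (200,) -> fed to a classifier such as XGBoost
```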
Affiliation(s)
- Yasheng Liu
- Shandong Province Key Laboratory of Medical Physics and Image Processing Technology, School of Physics and Electronics, Shandong Normal University, Jinan 250358, P. R. China
- Yonghui Jiang
- Shandong Province Key Laboratory of Medical Physics and Image Processing Technology, School of Physics and Electronics, Shandong Normal University, Jinan 250358, P. R. China
- Jie Liu
- Department of Pediatric Intensive Care Unit, Shandong Provincial Maternal and Child Health Care Hospital, Affiliated to Qingdao University, Jinan 250014, P. R. China
- Jie Li
- Shandong Province Key Laboratory of Medical Physics and Image Processing Technology, School of Physics and Electronics, Shandong Normal University, Jinan 250358, P. R. China
- Mingze Liu
- Shandong Province Key Laboratory of Medical Physics and Image Processing Technology, School of Physics and Electronics, Shandong Normal University, Jinan 250358, P. R. China
- Weiwei Nie
- The First Affiliated Hospital of Shandong First Medical University, Shandong First Medical University, Jinan 250014, P. R. China
- Qi Yuan
- Shandong Province Key Laboratory of Medical Physics and Image Processing Technology, School of Physics and Electronics, Shandong Normal University, Jinan 250358, P. R. China

15
Guang M, Yan C, Xu Y, Wang J, Jiang C. A Multichannel Convolutional Decoding Network for Graph Classification. IEEE Trans Neural Netw Learn Syst 2024; 35:13206-13216. [PMID: 37141052 DOI: 10.1109/tnnls.2023.3266243]
Abstract
Graph convolutional networks (GCNs) have shown superior performance on graph classification tasks, and their structure can be considered as an encoder-decoder pair. However, most existing methods lack a comprehensive consideration of global and local information in decoding, resulting in the loss of global information or the neglect of local information in large graphs. Moreover, the commonly used cross-entropy loss is essentially a global encoder-decoder loss, which cannot separately supervise the training states of the two components (encoder and decoder). We propose a multichannel convolutional decoding network (MCCD) to solve the above-mentioned problems. MCCD first adopts a multichannel GCN encoder, which generalizes better than a single-channel GCN encoder since different channels can extract graph information from different perspectives. We then propose a novel decoder with a global-to-local learning pattern to decode graph information, which can better extract global and local information. We also introduce a balanced regularization loss to supervise the training states of the encoder and decoder so that they are sufficiently trained. Experiments on standard datasets demonstrate the effectiveness of MCCD in terms of accuracy, runtime, and computational complexity.

16
Chen W, Yu Z, Yang C, Lu Y. Abnormal Behavior Recognition Based on 3D Dense Connections. Int J Neural Syst 2024; 34:2450049. [PMID: 39010725 DOI: 10.1142/s0129065724500497]
Abstract
Abnormal behavior recognition is an important technology used to detect and identify activities or events that deviate from normal behavior patterns. It has wide applications in various fields such as network security, financial fraud detection, and video surveillance. In recent years, deep convolutional networks (ConvNets) have been widely applied in abnormal behavior recognition algorithms and have achieved significant results. However, existing abnormal behavior detection algorithms mainly focus on improving accuracy and have not explored the real-time aspect of abnormal behavior recognition, which is crucial for quickly identifying abnormal behavior in public places and improving urban public safety. Therefore, this paper proposes an abnormal behavior recognition algorithm based on three-dimensional (3D) dense connections. The proposed algorithm uses a multi-instance learning strategy to classify various types of abnormal behaviors, and employs dense connection modules and soft-threshold attention mechanisms to reduce the model's parameter count and enhance network computational efficiency. Finally, redundant information in the sequence is reduced by attention allocation to mitigate its negative impact on recognition results. Experimental verification shows that our method achieves a recognition accuracy of 95.61% on the UCF-Crime dataset. Comparative experiments demonstrate that our model performs strongly in terms of both recognition accuracy and speed.
Affiliation(s)
- Wei Chen
- School of Electrical and Control Engineering, North China University of Technology, Beijing 100144, P. R. China
- Zhanhe Yu
- School of Information Science and Technology, North China University of Technology, Beijing 100144, P. R. China
- Chaochao Yang
- School of Electrical and Control Engineering, North China University of Technology, Beijing 100144, P. R. China
- Yuanyao Lu
- School of Information Science and Technology, North China University of Technology, Beijing 100144, P. R. China

17
Ren Q, Zhang L, Liu S, Liu JX, Shang J, Liu X. A Delayed Spiking Neural Membrane System for Adaptive Nearest Neighbor-Based Density Peak Clustering. Int J Neural Syst 2024:2450050. [PMID: 38973024 DOI: 10.1142/s0129065724500503]
Abstract
Although the density peak clustering (DPC) algorithm can effectively allocate samples and quickly identify noise points, it lacks adaptability and cannot take the local data structure into account. In addition, clustering algorithms generally suffer from high time complexity. Prior research suggests that clustering algorithms grounded in P systems can mitigate time complexity concerns. Within the realm of membrane systems (P systems), spiking neural P systems (SN P systems), inspired by biological nervous systems, are third-generation neural networks that possess intricate structures and offer substantial parallelism advantages. Thus, this study first improves DPC by introducing the maximum nearest neighbor distance and K-nearest neighbors (KNN). Moreover, a method based on delayed spiking neural P systems (DSN P systems) is proposed to improve the performance of the algorithm, yielding the DSNP-ANDPC algorithm. The effectiveness of DSNP-ANDPC was evaluated through comprehensive experiments on four synthetic datasets and 10 real-world datasets, where the proposed method outperformed the other comparison methods in most cases.
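For context, classical DPC ranks points by a local density and by the distance to the nearest denser point; the compact sketch below computes those two quantities on toy data and does not reproduce the adaptive KNN-based refinements or the DSN P system proposed in the paper:

```python
import numpy as np

def dpc_scores(points: np.ndarray, cutoff: float):
    """Return (rho, delta): local density and distance to the nearest denser point."""
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    rho = (dists < cutoff).sum(axis=1) - 1        # exclude the point itself
    delta = np.zeros(len(points))
    for i in range(len(points)):
        denser = np.where(rho > rho[i])[0]
        if len(denser) == 0:                      # the densest point
            delta[i] = dists[i].max()
        else:
            delta[i] = dists[i, denser].min()
    return rho, delta

pts = np.vstack([np.random.randn(50, 2), np.random.randn(50, 2) + 5])
rho, delta = dpc_scores(pts, cutoff=1.0)
# Cluster centers are points with both high rho and high delta.
print(rho.shape, delta.shape)  # (100,) (100,)
```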
Affiliation(s)
- Qianqian Ren
- School of Computer Science, Qufu Normal University, Rizhao 276826, P. R. China
- Lianlian Zhang
- School of Computer Science, Qufu Normal University, Rizhao 276826, P. R. China
- Shaoyi Liu
- School of Computer Science, Qufu Normal University, Rizhao 276826, P. R. China
- Jin-Xing Liu
- School of Computer Science, Qufu Normal University, Rizhao 276826, P. R. China
- Junliang Shang
- School of Computer Science, Qufu Normal University, Rizhao 276826, P. R. China
- Xiyu Liu
- Academy of Management Science, Business School, Shandong Normal University, Jinan 250300, P. R. China

18
Choi K, Choe Y, Park H. Reinforcement Learning May Demystify the Limited Human Motor Learning Efficacy Due to Visual-Proprioceptive Mismatch. Int J Neural Syst 2024; 34:2450037. [PMID: 38655914 DOI: 10.1142/s0129065724500370]
Abstract
Vision and proprioception have fundamental sensory mismatches in delivering locational information, and such mismatches are critical factors limiting the efficacy of motor learning. However, it is still not clear how and to what extent this mismatch limits motor learning outcomes. To further the understanding of the effect of sensory mismatch on motor learning outcomes, a reinforcement learning algorithm and the simplified biomechanical elbow joint model were employed to mimic the motor learning process in a computational environment. By applying a reinforcement learning algorithm to the motor learning of elbow joint flexion task, simulation results successfully explained how visual-proprioceptive mismatch limits motor learning outcomes in terms of motor control accuracy and task completion speed. The larger the perceived angular offset between the two sensory modalities, the lower the motor control accuracy. Also, the more similar the peak reward amplitude of the two sensory modalities, the lower the motor control accuracy. In addition, simulation results suggest that insufficient exploration rate limits task completion speed, and excessive exploration rate limits motor control accuracy. Such a speed-accuracy trade-off shows that a moderate exploration rate could serve as another important factor in motor learning.
Affiliation(s)
- Kyungrak Choi
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843, USA
- Yoonsuck Choe
- Department of Computer Science and Engineering, Texas A&M University, College Station, TX 77843, USA
- Hangue Park
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843, USA
- Department of Biomedical Engineering, Sungkyunkwan University, Suwon, South Korea
- Department of Intelligent Precision Healthcare Convergence, Sungkyunkwan University, Suwon, South Korea

19
Zhang L, Xu F, Neri F. An Asynchronous Spiking Neural Membrane System for Edge Detection. Int J Neural Syst 2024; 34:2450023. [PMID: 38490956 DOI: 10.1142/s0129065724500230]
Abstract
Spiking neural membrane systems (SN P systems) are a class of bio-inspired models inspired by the activities and connectivity of neurons. Extensive studies have been made on SN P systems with synchronization-based communication, while further efforts are needed for systems with rhythm-based communication. In this work, we design an asynchronous SN P system with resonant connections, where all the enabled neurons in the same group connected by resonant connections instantly produce spikes with the same rhythm. In the designed system, each of the three modules implements one of the three operations associated with the edge detection of digital images, and the modules collaborate with each other through the resonant connections. An algorithm called EDSNP is proposed to simulate the working of the designed asynchronous SN P system for edge detection. A quantitative analysis of EDSNP and related edge detection methods was conducted to evaluate its performance. EDSNP is superior to the compared methods on the test images in terms of the quantitative metrics of accuracy, error rate, mean square error, peak signal-to-noise ratio, and true positive rate. The results indicate the potential of temporal firing and proper neuronal connections in the SN P system to achieve good performance in edge detection.
Affiliation(s)
- Luping Zhang
- Jiangxi Engineering Technology Research Center of Nuclear, Geoscience Data Science and System, Jiangxi Engineering Laboratory on Radioactive Geoscience and Big Data Technology, School of Information Engineering, East China University of Technology, Nanchang 330013, Jiangxi, P. R. China
- Fei Xu
- Key Laboratory of Image Information Processing and Intelligent Control of Education Ministry of China, School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074, P. R. China
- Ferrante Neri
- NICE Research Group, School of Computer Science and Electronic Engineering, University of Surrey, Guildford, Surrey GU2 7XH, UK

20
Gill TS, Zaidi SSH, Shirazi MA. Attention-based deep convolutional neural network for classification of generalized and focal epileptic seizures. Epilepsy Behav 2024; 155:109732. [PMID: 38636140 DOI: 10.1016/j.yebeh.2024.109732]
Abstract
Epilepsy affects over 50 million people globally. Electroencephalography is critical for epilepsy diagnosis, but manual seizure classification is time-consuming and requires extensive expertise. This paper presents an automated multi-class seizure classification model using EEG signals from the Temple University Hospital Seizure Corpus ver. 1.5.2. Eleven features, including time-based correlation, time-based eigenvalues, power spectral density, frequency-based correlation, frequency-based eigenvalues, sample entropy, spectral entropy, logarithmic sum, standard deviation, absolute mean, and the ratio of Daubechies D4 wavelet-transformed coefficients, were extracted from 10-second sliding windows across channels. The model combines a multi-head self-attention mechanism with a deep convolutional neural network (CNN) to classify seven subtypes of generalized and focal epileptic seizures. The model achieved 0.921 weighted accuracy and 0.902 weighted F1 score in classifying focal onset non-motor, generalized onset non-motor, simple partial, complex partial, absence, tonic, and tonic-clonic seizures. In comparison, a CNN model without multi-head attention achieved 0.767 weighted accuracy. Ablation studies were conducted to validate the importance of the transformer encoders and attention. The promising classification results demonstrate the potential of deep learning for handling EEG complexity and improving epilepsy diagnosis. This seizure classification model could enable timely interventions when translated into clinical practice.
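As an illustration of the windowed feature extraction described above (not the paper's exact implementation), two of the listed features, band power from the power spectral density and spectral entropy, could be computed per 10-second window like this:

```python
import numpy as np
from scipy.signal import welch

def spectral_features(window: np.ndarray, fs: float):
    """Total band power and spectral entropy of one EEG window."""
    freqs, psd = welch(window, fs=fs, nperseg=min(len(window), 512))
    power = np.trapz(psd, freqs)
    p_norm = psd / psd.sum()
    spectral_entropy = -np.sum(p_norm * np.log2(p_norm + 1e-12))
    return power, spectral_entropy

fs = 250
signal = np.random.randn(fs * 60)          # one channel, 60 s
win = fs * 10                              # 10-second windows
features = [spectral_features(signal[start:start + win], fs)
            for start in range(0, len(signal) - win + 1, win)]
print(len(features), features[0])          # 6 windows, (power, entropy) each
```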
Affiliation(s)
- Taimur Shahzad Gill
- Department of Electronics and Power Engineering, National University of Sciences and Technology, Islamabad 44000, Pakistan
- Syed Sajjad Haider Zaidi
- Department of Electronics and Power Engineering, National University of Sciences and Technology, Islamabad 44000, Pakistan
- Muhammad Ayaz Shirazi
- Department of Electronics and Power Engineering, National University of Sciences and Technology, Islamabad 44000, Pakistan

21
Ermini I, Zandron C. Modular Spiking Neural Membrane Systems for Image Classification. Int J Neural Syst 2024; 34:2450021. [PMID: 38453666 DOI: 10.1142/s0129065724500217]
Abstract
A variant of membrane computing models called Spiking Neural P systems (SNP systems) closely mimics the structure and behavior of biological neurons. As third-generation neural networks, SNP systems have flexible architectures that allow the design of bio-inspired machine learning algorithms. This paper proposes Modular Spiking Neural P (MSNP) systems for image classification problems, a novel SNP system to be applied in scenarios where hundreds or even thousands of different classes are considered. A main issue in such situations is the structural complexity of the network. The MSNP systems devised in this work approach the general classification problem by dividing it into smaller parts, which are then handled by single entities of the network. As a benchmark, the Oxford Flowers 102 dataset is considered, consisting of more than 8000 pictures of flowers belonging to the 102 species commonly found in the UK. These classes sometimes show large within-class variation, may be very similar to one another, and different images of the same subject may differ considerably. The work describes the architecture of the MSNP system, based on modules each focusing on a specific class, their training phase, and the evaluation of the model in terms of both accuracy and energy consumption. Experimental results on image classification problems show that the model achieves good results, but that performance is strongly tied to image quality, mainly depending on the frequency of images, pronounced changes of pose, off-center images, and subjects that are largely not visible.
Affiliation(s)
- Iris Ermini
- Dipartimento di Informatica, Sistemistica e Comunicazione, Università degli Studi di Milano-Bicocca, Viale Sarca 336/14, Milano 20126, Italy
- Claudio Zandron
- Dipartimento di Informatica, Sistemistica e Comunicazione, Università degli Studi di Milano-Bicocca, Viale Sarca 336/14, Milano 20126, Italy

22
Avola D, Cinque L, Mambro AD, Fagioli A, Marini MR, Pannone D, Fanini B, Foresti GL. Spatio-Temporal Image-Based Encoded Atlases for EEG Emotion Recognition. Int J Neural Syst 2024; 34:2450024. [PMID: 38533631 DOI: 10.1142/s0129065724500242]
Abstract
Emotion recognition plays an essential role in human-human interaction since it is a key to understanding the emotional states and reactions of human beings when they are subject to events and engagements in everyday life. Moving towards human-computer interaction, the study of emotions becomes fundamental because it is at the basis of the design of advanced systems to support a broad spectrum of application areas, including forensic, rehabilitative, educational, and many others. An effective method for discriminating emotions is based on ElectroEncephaloGraphy (EEG) data analysis, which is used as input for classification systems. Collecting brain signals on several channels and for a wide range of emotions produces cumbersome datasets that are hard to manage, transmit, and use in varied applications. In this context, the paper introduces the Empátheia system, which explores a different EEG representation by encoding EEG signals into images prior to their classification. In particular, the proposed system extracts spatio-temporal image encodings, or atlases, from EEG data through the Processing and transfeR of Interaction States and Mappings through Image-based eNcoding (PRISMIN) framework, thus obtaining a compact representation of the input signals. The atlases are then classified through the Empátheia architecture, which comprises branches based on convolutional, recurrent, and transformer models designed and tuned to capture the spatial and temporal aspects of emotions. Extensive experiments were conducted on the Shanghai Jiao Tong University (SJTU) Emotion EEG Dataset (SEED) public dataset, where the proposed system significantly reduced its size while retaining high performance. The results obtained highlight the effectiveness of the proposed approach and suggest new avenues for data representation in emotion recognition from EEG signals.
Affiliation(s)
- Danilo Avola
- Department of Computer Science, Sapienza University of Rome, Via Salaria 113, Rome 00198, Italy
- Luigi Cinque
- Department of Computer Science, Sapienza University of Rome, Via Salaria 113, Rome 00198, Italy
- Angelo Di Mambro
- Department of Computer Science, Sapienza University of Rome, Via Salaria 113, Rome 00198, Italy
- Alessio Fagioli
- Department of Computer Science, Sapienza University of Rome, Via Salaria 113, Rome 00198, Italy
- Marco Raoul Marini
- Department of Computer Science, Sapienza University of Rome, Via Salaria 113, Rome 00198, Italy
- Daniele Pannone
- Department of Computer Science, Sapienza University of Rome, Via Salaria 113, Rome 00198, Italy
- Bruno Fanini
- Institute of Heritage Science, National Research Council, Area della Ricerca Roma 1, SP35d, 9, Montelibretti 00010, Italy
- Gian Luca Foresti
- Department of Computer Science, Mathematics and Physics, University of Udine, Via delle Scienze 206, Udine 33100, Italy

23
Madni HA, Umer RM, Foresti GL. Robust Federated Learning for Heterogeneous Model and Data. Int J Neural Syst 2024; 34:2450019. [PMID: 38414421 DOI: 10.1142/s0129065724500199]
Abstract
Data privacy and security are essential challenges in clinical settings, where each hospital has its own sensitive patient data. Thanks to recent advances in decentralized machine learning, namely Federated Learning (FL), each hospital can keep its own private data and learning models while collaborating with other trusted participating hospitals. Heterogeneous data and models among different hospitals raise major challenges in robust FL, such as gradient leakage, where participants can exploit model weights to infer data. Here, we propose a robust FL method that efficiently tackles data and model heterogeneity, training the model with knowledge distillation and a novel weighted client confidence score on hematological cytomorphology data in clinical settings. In the knowledge distillation, each participant learns from other participants through a weighted confidence score, so that knowledge from clean models is distributed rather than that of noisy clients possessing noisy data. Moreover, we use a symmetric loss to reduce the negative impact of data heterogeneity and label diversity by limiting the model's overfitting to noisy labels. In comparison to current approaches, our proposed method performs best, and this is the first demonstration of addressing both data and model heterogeneity in end-to-end FL, laying the foundation for robust FL in laboratories and clinical applications.
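The confidence-weighted distillation step can be sketched roughly as below: peer clients' soft labels are mixed with weights derived from per-client confidence scores before being distilled into the local model. The weighting rule, temperature, and function names are illustrative assumptions rather than the authors' exact formulation.

```python
# Sketch of confidence-weighted knowledge distillation between federated clients.
# Confidence scores, temperature, and the softmax weighting are illustrative
# assumptions; the paper's exact treatment of "clean" vs. "noisy" clients may differ.
import torch
import torch.nn.functional as F

def weighted_distillation_loss(student_logits, peer_logits_list, confidence_scores, T=2.0):
    """KL divergence between the local model and a confidence-weighted mixture of peer soft labels."""
    weights = torch.softmax(torch.tensor(confidence_scores, dtype=torch.float32), dim=0)
    soft_targets = torch.zeros_like(student_logits)
    for w, peer_logits in zip(weights, peer_logits_list):
        soft_targets = soft_targets + w * F.softmax(peer_logits / T, dim=-1)
    log_student = F.log_softmax(student_logits / T, dim=-1)
    return F.kl_div(log_student, soft_targets, reduction="batchmean") * (T * T)
```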
Collapse
Affiliation(s)
- Hussain Ahmad Madni
- Department of Mathematics, Computer Science and Physics (DMIF), University of Udine, Udine 33100, Italy
| | - Rao Muhammad Umer
- Institute of AI for Health, Helmholtz Zentrum München - German Research, Center for Environmental Health, Neuherberg 85764, Germany
| | - Gian Luca Foresti
- Department of Mathematics, Computer Science and Physics (DMIF), University of Udine, Udine 33100, Italy
| |
Collapse
|
24
|
Chen L, Leng L, Yang Z, Teoh ABJ. Enhanced Multitask Learning for Hash Code Generation of Palmprint Biometrics. Int J Neural Syst 2024; 34:2450020. [PMID: 38414422 DOI: 10.1142/s0129065724500205] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/29/2024]
Abstract
This paper presents a novel multitask learning framework for palmprint biometrics, which optimizes classification and hashing branches jointly. The classification branch within our framework facilitates the concurrent execution of three distinct tasks: identity recognition and classification of soft biometrics, encompassing gender and chirality. On the other hand, the hashing branch enables the generation of palmprint hash codes, optimizing for minimal storage as templates and efficient matching. The hashing branch derives the complementary information from these tasks by amalgamating knowledge acquired from the classification branch. This approach leads to superior overall performance compared to individual tasks in isolation. To enhance the effectiveness of multitask learning, two additional modules, an attention mechanism module and a customized gate control module, are introduced. These modules are vital in allocating higher weights to crucial channels and facilitating task-specific expert knowledge integration. Furthermore, an automatic weight adjustment module is incorporated to optimize the learning process further. This module fine-tunes the weights assigned to different tasks, improving performance. Integrating the three modules above has shown promising accuracies across various classification tasks and has notably improved authentication accuracy. The extensive experimental results validate the efficacy of our proposed framework.
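The automatic weight adjustment module can be illustrated with a common mechanism for learning task weights, homoscedastic uncertainty weighting; this is an assumed stand-in for illustration, not necessarily the adjustment rule used in the paper.

```python
# Illustrative automatic task-weighting module for multitask training (uncertainty
# weighting in the style of Kendall et al.); the paper's actual rule may differ.
import torch
import torch.nn as nn

class AutoWeightedLoss(nn.Module):
    def __init__(self, num_tasks):
        super().__init__()
        # one learnable log-variance per task (identity, gender, chirality, hashing, ...)
        self.log_vars = nn.Parameter(torch.zeros(num_tasks))

    def forward(self, task_losses):
        total = 0.0
        for i, loss in enumerate(task_losses):
            precision = torch.exp(-self.log_vars[i])   # tasks with high uncertainty get down-weighted
            total = total + precision * loss + self.log_vars[i]
        return total
```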
Collapse
Affiliation(s)
- Lin Chen
- Key Laboratory of Jiangxi Province for Image Processing and Pattern Recognition, Nanchang Hangkong University, Nanchang, Jiangxi, P. R. China
| | - Lu Leng
- Key Laboratory of Jiangxi Province for Image Processing and Pattern Recognition, Nanchang Hangkong University, Nanchang, Jiangxi, P. R. China
| | - Ziyuan Yang
- College of Computer Science, Sichuan University, Chengdu, Sichuan, P. R. China
| | - Andrew Beng Jin Teoh
- School of Electrical and Electronic Engineering, College of Engineering, Yonsei University Seoul, Republic of Korea
| |
Collapse
|
25
|
Niu H, Yi Z, He T. A Bidirectional Feedforward Neural Network Architecture Using the Discretized Neural Memory Ordinary Differential Equation. Int J Neural Syst 2024; 34:2450015. [PMID: 38318709 DOI: 10.1142/s0129065724500151] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/07/2024]
Abstract
Deep Feedforward Neural Networks (FNNs) with skip connections have revolutionized various image recognition tasks. In this paper, we propose a novel architecture called bidirectional FNN (BiFNN), which utilizes skip connections to aggregate features between its forward and backward paths. The BiFNN accepts any general FNN model as a plugin incorporated into its forward path, introducing only a few additional parameters in the cross-path connections. The backward path is implemented as a parameter-free layer, utilizing a discretized form of the neural memory Ordinary Differential Equation (nmODE), which is named [Formula: see text]-net. We provide a proof of convergence for the [Formula: see text]-net and analyze its initial value problem. Our proposed architecture is evaluated on diverse image recognition datasets, including Fashion-MNIST, SVHN, CIFAR-10, CIFAR-100, and Tiny-ImageNet. The results demonstrate that BiFNNs offer significant improvements compared to embedded models such as ConvMixer, ResNet, ResNeXt, and Vision Transformer. Furthermore, BiFNNs can be fine-tuned to achieve comparable performance with embedded models on Tiny-ImageNet and ImageNet-1K datasets by loading the same pretrained parameters.
Collapse
Affiliation(s)
- Hao Niu
- College of Computer Science, Sichuan University, Chengdu 610065, P. R. China
| | - Zhang Yi
- College of Computer Science, Sichuan University, Chengdu 610065, P. R. China
| | - Tao He
- College of Computer Science, Sichuan University, Chengdu 610065, P. R. China
| |
Collapse
|
26
|
Zhang C, Xue Y, Neri F, Cai X, Slowik A. Multi-Objective Self-Adaptive Particle Swarm Optimization for Large-Scale Feature Selection in Classification. Int J Neural Syst 2024; 34:2450014. [PMID: 38352979 DOI: 10.1142/s012906572450014x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/20/2024]
Abstract
Feature selection (FS) is recognized for its role in enhancing the performance of learning algorithms, especially for high-dimensional datasets. In recent times, FS has been framed as a multi-objective optimization problem, leading to the application of various multi-objective evolutionary algorithms (MOEAs) to address it. However, the solution space expands exponentially with the dataset's dimensionality. Simultaneously, the extensive search space often results in numerous local optimal solutions due to a large proportion of unrelated and redundant features [H. Adeli and H. S. Park, Fully automated design of super-high-rise building structures by a hybrid ai model on a massively parallel machine, AI Mag. 17 (1996) 87-93]. Consequently, existing MOEAs struggle with local optima stagnation, particularly in large-scale multi-objective FS problems (LSMOFSPs). Different LSMOFSPs generally exhibit unique characteristics, yet most existing MOEAs rely on a single candidate solution generation strategy (CSGS), which may be less efficient for diverse LSMOFSPs [H. S. Park and H. Adeli, Distributed neural dynamics algorithms for optimization of large steel structures, J. Struct. Eng. ASCE 123 (1997) 880-888; M. Aldwaik and H. Adeli, Advances in optimization of highrise building structures, Struct. Multidiscip. Optim. 50 (2014) 899-919; E. G. González, J. R. Villar, Q. Tan, J. Sedano and C. Chira, An efficient multi-robot path planning solution using a* and coevolutionary algorithms, Integr. Comput. Aided Eng. 30 (2022) 41-52]. Moreover, selecting an appropriate MOEA and determining its corresponding parameter values for a specified LSMOFSP is time-consuming. To address these challenges, a multi-objective self-adaptive particle swarm optimization (MOSaPSO) algorithm is proposed, combined with a rapid nondominated sorting approach. MOSaPSO employs a self-adaptive mechanism, along with five modified efficient CSGSs, to generate new solutions. Experiments were conducted on ten datasets, and the results demonstrate that the number of features is effectively reduced by MOSaPSO while lowering the classification error rate. Furthermore, superior performance is observed in comparison to its counterparts on both the training and test sets, with advantages becoming increasingly evident as the dimensionality increases.
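The rapid nondominated sorting step at the core of such multi-objective methods can be illustrated with a minimal fast nondominated sort over the two feature-selection objectives, number of selected features and classification error. This sketches only the sorting step, not the full MOSaPSO algorithm or its candidate solution generation strategies.

```python
# Minimal fast nondominated sorting over two feature-selection objectives
# (number of selected features, classification error); illustrates the sorting
# step only, not the swarm update or self-adaptive strategy selection.
import numpy as np

def dominates(a, b):
    """a dominates b if it is no worse in all objectives and strictly better in at least one."""
    return np.all(a <= b) and np.any(a < b)

def fast_nondominated_sort(objectives):
    n = len(objectives)
    dominated_by = [[] for _ in range(n)]
    domination_count = np.zeros(n, dtype=int)
    fronts = [[]]
    for p in range(n):
        for q in range(n):
            if p == q:
                continue
            if dominates(objectives[p], objectives[q]):
                dominated_by[p].append(q)
            elif dominates(objectives[q], objectives[p]):
                domination_count[p] += 1
        if domination_count[p] == 0:
            fronts[0].append(p)
    i = 0
    while fronts[i]:
        next_front = []
        for p in fronts[i]:
            for q in dominated_by[p]:
                domination_count[q] -= 1
                if domination_count[q] == 0:
                    next_front.append(q)
        fronts.append(next_front)
        i += 1
    return fronts[:-1]

# Example: each row is (n_features_selected, error_rate) for one particle.
objs = np.array([[10, 0.12], [25, 0.10], [10, 0.10], [40, 0.09]])
print(fast_nondominated_sort(objs))   # first front holds the nondominated particles
```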
Collapse
Affiliation(s)
- Chenyi Zhang
- School of Computer and Software, Nanjing University of Information Science and Technology, Nanjing 210044, P. R. China
| | - Yu Xue
- School of Computer and Software, Nanjing University of Information Science and Technology, Nanjing 210044, P. R. China
| | - Ferrante Neri
- NICE Research Group, School of Computer Science and Electronic Engineering, University of Surrey Guildford, GU2 7XS, UK
| | - Xu Cai
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, P. R. China
| | - Adam Slowik
- Department of Electronics and Computer Science, Koszalin University of Technology, Koszalin 75-453, Poland
| |
Collapse
|
27
|
Zhao R, Xie Z, Zhuang Y, L H Yu P. Automated Quality Evaluation of Large-Scale Benchmark Datasets for Vision-Language Tasks. Int J Neural Syst 2024; 34:2450009. [PMID: 38318751 DOI: 10.1142/s0129065724500096] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/07/2024]
Abstract
Large-scale benchmark datasets are crucial in advancing research within the computer science communities. They enable the development of more sophisticated AI models and serve as "golden" benchmarks for evaluating their performance. Thus, ensuring the quality of these datasets is of utmost importance for academic research and the progress of AI systems. For the emerging vision-language tasks, some datasets have been created and frequently used, such as Flickr30k, COCO, and NoCaps, which typically contain a large number of images paired with their ground-truth textual descriptions. In this paper, an automatic method is proposed to assess the quality of large-scale benchmark datasets designed for vision-language tasks. In particular, a new cross-modal matching model is developed, which is capable of automatically scoring the textual descriptions of visual images. Subsequently, this model is employed to evaluate the quality of vision-language datasets by automatically assigning a score to each 'ground-truth' description for every image. With a good agreement between manual and automated scoring results on the datasets, our findings reveal significant disparities in the quality of the ground-truth descriptions included in the benchmark datasets. Even more surprisingly, it is evident that a small portion of the descriptions are unsuitable for serving as reliable ground-truth references. These discoveries emphasize the need for careful utilization of these publicly accessible benchmark databases.
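As a rough stand-in for the cross-modal matching model described above, an off-the-shelf image-text model such as CLIP can assign a similarity score to each ground-truth caption of an image, so that unusually low scores flag suspect descriptions. The checkpoint and usage below are illustrative assumptions, not the paper's own scorer.

```python
# Rough stand-in for a cross-modal quality scorer: rate ground-truth captions
# against their image with an off-the-shelf CLIP model. Model name and the idea
# of flagging low-scoring captions are illustrative assumptions.
from PIL import Image
import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def score_captions(image_path, captions):
    image = Image.open(image_path).convert("RGB")
    inputs = processor(text=captions, images=image, return_tensors="pt", padding=True)
    with torch.no_grad():
        outputs = model(**inputs)
    # one similarity score per caption; unusually low scores flag suspect "ground truth"
    return outputs.logits_per_image.squeeze(0).tolist()

# scores = score_captions("example.jpg", ["a dog running on grass", "a red sports car"])
```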
Collapse
Affiliation(s)
- Ruibin Zhao
- Department of Mathematics and Information Technology, The Education University of Hong Kong, Hong Kong SAR, P. R. China
- School of Computer Science and Information Engineering, Chuzhou University, Chuzhou, P. R. China
| | - Zhiwei Xie
- Department of Mathematics and Information Technology, The Education University of Hong Kong, Hong Kong SAR, P. R. China
| | - Yipeng Zhuang
- Department of Mathematics and Information Technology, The Education University of Hong Kong, Hong Kong SAR, P. R. China
| | - Philip L H Yu
- Department of Mathematics and Information Technology, The Education University of Hong Kong, Hong Kong SAR, P. R. China
| |
Collapse
|
28
|
Zhu H, Xu Y, Wu Y, Shen N, Wang L, Chen C, Chen W. A Sequential End-to-End Neonatal Sleep Staging Model with Squeeze and Excitation Blocks and Sequential Multi-Scale Convolution Neural Networks. Int J Neural Syst 2024; 34:2450013. [PMID: 38369905 DOI: 10.1142/s0129065724500138] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/20/2024]
Abstract
Automatic sleep staging offers a quick and objective assessment for quantitatively interpreting sleep stages in neonates. However, most of the existing studies either do not encompass any temporal information, or simply apply neural networks to exploit temporal information at the expense of high computational overhead and modeling ambiguity. This limits the application of these methods to multiple scenarios. In this paper, a sequential end-to-end sleep staging model, SeqEESleepNet, which can process sequential epochs in parallel and trains quickly enough to adapt to different scenarios, is proposed. SeqEESleepNet consists of a sequence epoch generation (SEG) module, a sequential multi-scale convolution neural network (SMSCNN) and squeeze and excitation (SE) blocks. The SEG module expands independent epochs into sequential signals, enabling the model to learn the temporal information between sleep stages. SMSCNN is a multi-scale convolution neural network that can extract both multi-scale features and temporal information from the signal. Subsequently, the SE block that follows reassigns the weights of features through mapping and pooling. Experimental results show that on a clinical dataset, the proposed method outperforms the state-of-the-art approaches, achieving an overall accuracy, F1-score, and Kappa coefficient of 71.8%, 71.8%, and 0.684 on a three-class classification task with a single-channel EEG signal. Based on our overall results, we believe the proposed method could pave the way for convenient multi-scenario neonatal sleep staging methods.
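The squeeze and excitation block mentioned above follows the standard SE design: squeeze by global average pooling, excite with a small bottleneck network, and reweight the feature channels. A minimal 1D version (EEG epochs being 1D signals) is sketched below; the reduction ratio and tensor shapes are illustrative.

```python
# Minimal 1D squeeze-and-excitation (SE) block of the kind used to reweight channel
# features; reduction ratio and dimensions are illustrative assumptions.
import torch
import torch.nn as nn

class SEBlock1d(nn.Module):
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):                      # x: (batch, channels, time)
        squeezed = x.mean(dim=-1)              # squeeze: global average pooling over time
        weights = self.fc(squeezed)            # excitation: per-channel weights in (0, 1)
        return x * weights.unsqueeze(-1)       # reweight the feature channels

# y = SEBlock1d(64)(torch.randn(2, 64, 3000))
```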
Collapse
Affiliation(s)
- Hangyu Zhu
- Center for Intelligent Medical Electronics, School of Information Science and Technology, Fudan University, Shanghai 200433, P. R. China
| | - Yan Xu
- Department of Neurology, Children's Hospital of Fudan University, National Children's Medical Center, Shanghai, P. R. China
| | - Yonglin Wu
- Center for Intelligent Medical Electronics, School of Information Science and Technology, Fudan University, Shanghai 200433, P. R. China
| | - Ning Shen
- Center for Intelligent Medical Electronics, School of Information Science and Technology, Fudan University, Shanghai 200433, P. R. China
| | - Laishuan Wang
- Department of Neurology, Children's Hospital of Fudan University, National Children's Medical Center, Shanghai, P. R. China
| | - Chen Chen
- Human Phenome Institute, Fudan University, 825 Zhangheng Road, Shanghai 201203, P. R. China
| | - Wei Chen
- Center for Intelligent Medical Electronics, School of Information Science and Technology, Fudan University, Shanghai 200433, P. R. China
| |
Collapse
|
29
|
Mammone N, Ieracitano C, Spataro R, Guger C, Cho W, Morabito FC. A Few-Shot Transfer Learning Approach for Motion Intention Decoding from Electroencephalographic Signals. Int J Neural Syst 2024; 34:2350068. [PMID: 38073546 DOI: 10.1142/s0129065723500685] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/27/2024]
Abstract
In this study, a few-shot transfer learning approach was introduced to decode movement intention from electroencephalographic (EEG) signals, allowing new tasks to be recognized with minimal adaptation. To this end, a dataset of EEG signals recorded during the preparation of complex sub-movements was created from a publicly available data collection. The dataset was divided into two parts: the source domain dataset (including 5 classes) and the support (target domain) dataset (including 2 classes), with no overlap between the two datasets in terms of classes. The proposed methodology consists of projecting EEG signals into the space-frequency-time domain, processing such projections (rearranged in channels × frequency frames) by means of a custom EEG-based deep neural network (denoted as EEGframeNET5), and then adapting the system to recognize new tasks through a few-shot transfer learning approach. The proposed method achieved an average accuracy of 72.45 ± 4.19% in the 5-way classification of samples from the source domain dataset, outperforming comparable studies in the literature. In the second phase of the study, a few-shot transfer learning approach was proposed to adapt the neural system and make it able to recognize new tasks in the support dataset. The results demonstrated the system's ability to adapt and recognize new tasks with an average accuracy of 80 ± 0.12% in discriminating hand opening/closing preparation and outperforming reported results in the literature. This study suggests the effectiveness of EEG in capturing information related to the motor preparation of complex movements, potentially paving the way for BCI systems based on motion planning decoding. The proposed methodology could be straightforwardly extended to advanced EEG signal processing in other scenarios, such as motor imagery or neural disorder classification.
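The few-shot adaptation step can be sketched as freezing the pretrained feature extractor and retraining only a small classification head on the 2-class support set. Module names and the optimizer choice below are illustrative assumptions; the paper's EEGframeNET5 architecture is not reproduced.

```python
# Sketch of few-shot adaptation: freeze the backbone pretrained on the 5-class
# source domain and train only a new 2-class head on the support set. Names and
# hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn

def adapt_for_new_tasks(pretrained_backbone, feature_dim, n_new_classes=2):
    for p in pretrained_backbone.parameters():
        p.requires_grad = False                     # keep source-domain features fixed
    head = nn.Linear(feature_dim, n_new_classes)    # only this layer is trained on the few shots
    model = nn.Sequential(pretrained_backbone, nn.Flatten(), head)
    optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)
    return model, optimizer
```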
Collapse
Affiliation(s)
- Nadia Mammone
- DICEAM, University Mediterranea of Reggio Calabria Via Zehender, Loc. Feo di Vito, Reggio Calabria, 89122, Italy
| | - Cosimo Ieracitano
- DICEAM, University Mediterranea of Reggio Calabria Via Zehender, Loc. Feo di Vito, Reggio Calabria, 89122, Italy
| | - Rossella Spataro
- ALS Clinical Research Center, BiND, University of Palermo, Palermo, Italy
- Intensive Rehabilitation Unit, Villa delle Ginestre Hospital, Palermo, Italy
| | | | - Woosang Cho
- g.tec Medical Engineering GmbH, 4521, Schiedlberg, Austria
| | - Francesco Carlo Morabito
- DICEAM, University Mediterranea of Reggio Calabria Via Zehender, Loc. Feo di Vito, Reggio Calabria, 89122, Italy
| |
Collapse
|
30
|
Rafiei MH, Gauthier LV, Adeli H, Takabi D. Self-Supervised Learning for Electroencephalography. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:1457-1471. [PMID: 35867362 DOI: 10.1109/tnnls.2022.3190448] [Citation(s) in RCA: 48] [Impact Index Per Article: 48.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/15/2023]
Abstract
Decades of research have shown machine learning superiority in discovering highly nonlinear patterns embedded in electroencephalography (EEG) records compared with conventional statistical techniques. However, even the most advanced machine learning techniques require relatively large, labeled EEG repositories. EEG data collection and labeling are costly. Moreover, combining available datasets to achieve a large data volume is usually infeasible due to inconsistent experimental paradigms across trials. Self-supervised learning (SSL) solves these challenges because it enables learning from EEG records across trials with variable experimental paradigms, even when the trials explore different phenomena. It aggregates multiple EEG repositories to increase accuracy, reduce bias, and mitigate overfitting in machine learning training. In addition, SSL could be employed in situations where there is limited labeled training data, and manual labeling is costly. This article: 1) provides a brief introduction to SSL; 2) describes some SSL techniques employed in recent studies, including EEG; 3) proposes current and potential SSL techniques for future investigations in EEG studies; 4) discusses the cons and pros of different SSL techniques; and 5) proposes holistic implementation tips and potential future directions for EEG SSL practices.
Collapse
|
31
|
Nogay HS, Adeli H. Multiple Classification of Brain MRI Autism Spectrum Disorder by Age and Gender Using Deep Learning. J Med Syst 2024; 48:15. [PMID: 38252192 PMCID: PMC10803393 DOI: 10.1007/s10916-023-02032-0] [Citation(s) in RCA: 19] [Impact Index Per Article: 19.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/29/2023] [Accepted: 12/31/2023] [Indexed: 01/23/2024]
Abstract
The fact that a rapid and definitive diagnosis of autism cannot be made today, and that autism cannot be treated, provides an impetus to look into novel technological solutions. To contribute to the resolution of this problem, in this study two quadruple classifications and one octal classification were performed using a deep learning (DL) approach, considering age and gender factors. Gender was considered in one of the quadruple classifications and age groups in the other. In the octal classification, classes were created considering both gender and age groups. In addition to the diagnosis of ASD (Autism Spectrum Disorder), another goal of this study is to determine the contribution of gender and age factors to the diagnosis of ASD by making multiple classifications based on age and gender for the first time. Brain structural MRI (sMRI) scans of participants with ASD and TD (Typical Development) were pre-processed in a system originally designed for this purpose. Using the Canny Edge Detection (CED) algorithm, the sMRI image data were cropped in the pre-processing stage, and the dataset was enlarged fivefold with data augmentation (DA) techniques. The optimal convolutional neural network (CNN) models were developed using the grid search optimization (GSO) algorithm. The proposed DL prediction system was tested with the five-fold cross-validation technique. Three CNN models were designed for the system: a quadruple classification model taking gender into account (model 1), a quadruple classification model taking age into account (model 2), and an eightfold classification model taking both gender and age into account (model 3). The accuracy rates obtained for the three models are 80.94%, 85.42% and 67.94%, respectively. These accuracy rates were compared with pre-trained models using the transfer learning approach. As a result, it was revealed that age and gender factors were effective in the diagnosis of ASD with the system developed for ASD multiple classifications, and higher accuracy rates were achieved compared to pre-trained models.
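The Canny-edge-based cropping used in pre-processing can be sketched as follows: detect edges, take the bounding box of the edge pixels, and crop the slice to that box. Thresholds and the margin are illustrative assumptions.

```python
# Sketch of Canny-edge-based cropping: find edges, take their bounding box (plus a
# small margin) and crop the MRI slice to it. Thresholds and margin are illustrative;
# an 8-bit grayscale image is assumed.
import cv2
import numpy as np

def canny_crop(image, low=50, high=150, margin=5):
    gray = image if image.ndim == 2 else cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, low, high)
    ys, xs = np.nonzero(edges)
    if len(xs) == 0:                       # no edges found: return the image unchanged
        return image
    y0, y1 = max(ys.min() - margin, 0), min(ys.max() + margin, gray.shape[0])
    x0, x1 = max(xs.min() - margin, 0), min(xs.max() + margin, gray.shape[1])
    return image[y0:y1, x0:x1]
```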
Collapse
Affiliation(s)
- Hidir Selcuk Nogay
- Electrical and Energy Department, Bursa Uludag University, Bursa, Turkey
| | - Hojjat Adeli
- Departments of Biomedical Informatics and Neuroscience, College of Medicine, The Ohio State University Neurology, 370 W. 9th Avenue, Columbus, OH, 43210, USA.
| |
Collapse
|
32
|
Ganjali M, Mehridehnavi A, Rakhshani S, Khorasani A. Unsupervised Neural Manifold Alignment for Stable Decoding of Movement from Cortical Signals. Int J Neural Syst 2024; 34:2450006. [PMID: 38063378 DOI: 10.1142/s0129065724500060] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/28/2023]
Abstract
The stable decoding of movement parameters using neural activity is crucial for the success of brain-machine interfaces (BMIs). However, neural activity can be unstable over time, leading to changes in the parameters used for decoding movement, which can hinder accurate movement decoding. To tackle this issue, one approach is to transfer neural activity to a stable, low-dimensional manifold using dimensionality reduction techniques and align manifolds across sessions by maximizing correlations of the manifolds. However, the practical use of manifold stabilization techniques requires knowledge of the true subject intentions such as target direction or behavioral state. To overcome this limitation, an automatic unsupervised algorithm is proposed that determines movement target intention before manifold alignment in the presence of manifold rotation and scaling across sessions. This unsupervised algorithm is combined with a dimensionality reduction and alignment method to overcome decoder instabilities. The effectiveness of the BMI stabilizer method is demonstrated by decoding the two-dimensional (2D) hand velocity of two rhesus macaque monkeys during a center-out reaching movement task. The performance of the proposed method is evaluated using correlation coefficient and R-squared measures, demonstrating higher decoding performance compared to a state-of-the-art unsupervised BMI stabilizer. The results offer benefits for the automatic determination of movement intents in long-term BMI decoding. Overall, the proposed method offers a promising automatic solution for achieving stable and accurate movement decoding in BMI applications.
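A common way to align low-dimensional neural manifolds across sessions is dimensionality reduction followed by canonical correlation analysis; the sketch below illustrates only that alignment step, under the assumption of time-matched sessions, and not the paper's unsupervised estimation of movement target intention.

```python
# Illustration of cross-session manifold alignment with PCA + CCA; this sketches the
# reduction/alignment step only, assuming the two sessions are time-matched, and not
# the unsupervised target-intent estimation proposed in the paper.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cross_decomposition import CCA

def align_sessions(day0_activity, dayK_activity, n_dims=10):
    """Project both sessions onto low-d manifolds and rotate them into a common space."""
    m0 = PCA(n_components=n_dims).fit_transform(day0_activity)   # (time, n_dims)
    mk = PCA(n_components=n_dims).fit_transform(dayK_activity)
    cca = CCA(n_components=n_dims, max_iter=1000)
    m0_aligned, mk_aligned = cca.fit_transform(m0, mk)
    return m0_aligned, mk_aligned

# a0, ak = align_sessions(np.random.randn(500, 96), np.random.randn(500, 96))
```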
Collapse
Affiliation(s)
- Mohammadali Ganjali
- Department of Biomedical Engineering, Isfahan University of Medical Sciences, Isfahan, Iran
| | - Alireza Mehridehnavi
- Department of Biomedical Engineering, Isfahan University of Medical Sciences, Isfahan, Iran
| | - Sajed Rakhshani
- Medical Image and Signal Processing Research Center, Isfahan University of Medical Sciences, Isfahan, Iran
| | - Abed Khorasani
- Department of Neurology, Northwestern University, Chicago, IL, 60611, USA
- Neuroscience Research Center, Institute of Neuropharmacology, Kerman University of Medical Sciences, Kerman, Iran
| |
Collapse
|
33
|
Dózsa T, Deuschle F, Cornelis B, Kovács P. Variable Projection Support Vector Machines and Some Applications Using Adaptive Hermite Expansions. Int J Neural Syst 2024; 34:2450004. [PMID: 38073547 DOI: 10.1142/s0129065724500047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/28/2023]
Abstract
In this paper, we develop the so-called variable projection support vector machine (VP-SVM) algorithm, which is a generalization of the classical SVM. In fact, the VP block serves as an automatic feature extractor for the SVM, and the two are trained simultaneously. We consider the primal form of the arising optimization task and investigate the use of nonlinear kernels. We show that by choosing the so-called adaptive Hermite function system as the basis of the orthogonal projections in our classification scheme, several real-world signal processing problems can be successfully solved. In particular, we test the effectiveness of our method in two case studies corresponding to anomaly detection. First, we consider the detection of abnormal peaks in accelerometer data caused by sensor malfunction. Then, we show that the proposed classification algorithm can be used to detect abnormalities in ECG data. Our experiments show that the proposed method produces comparable results to the state-of-the-art while retaining desired properties of SVM classification such as a lightweight architecture and interpretability. We implement the proposed method on a microcontroller and demonstrate its ability to be used for real-time applications. To further minimize computational cost, discrete orthogonal adaptive Hermite functions are introduced for the first time.
Collapse
Affiliation(s)
- Tamás Dózsa
- Department of Numerical Analysis, HUN-REN Institute for Computer Science and Control, Eötvös Loránd University, Budapest H-1111, Hungary
| | - Federico Deuschle
- Siemens Digital Industries Software, 68 Interleuvenlaan KU Leuven, Department of Mechanical Engineering, Leuven B-3001, Belgium
| | - Bram Cornelis
- Siemens Digital Industries Software, 68 Interleuvenlaan KU Leuven, Department of Mechanical Engineering, Leuven B-3001, Belgium
| | - Péter Kovács
- Department of Numerical Analysis, Eötvös Loránd University, Pázmány Péter sétány 1/C Budapest 1117, Hungary
| |
Collapse
|
34
|
Leng J, Zhu J, Yan Y, Yu X, Liu M, Lou Y, Liu Y, Gao L, Sun Y, He T, Yang Q, Feng C, Wang D, Zhang Y, Xu Q, Xu F. Multilevel Laser-Induced Pain Measurement with Wasserstein Generative Adversarial Network - Gradient Penalty Model. Int J Neural Syst 2024; 34:2350067. [PMID: 38149912 DOI: 10.1142/s0129065723500673] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/28/2023]
Abstract
Pain is an experience of unpleasant sensations and emotions associated with actual or potential tissue damage. In the global context, billions of people are affected by pain disorders. There are particular challenges in the measurement and assessment of pain, and the commonly used pain measuring tools include traditional subjective scoring methods and biomarker-based measures. The main tools for biomarker-based analysis are electroencephalography (EEG), electrocardiography and functional magnetic resonance imaging. The EEG-based quantitative pain measurements are of immense value in clinical pain management and can provide objective assessments of pain intensity. The assessment of pain is now primarily limited to the identification of the presence or absence of pain, with less research on multilevel pain. A high-power laser stimulation pain experimental paradigm and a five-level pain classification method based on EEG data augmentation are presented. First, the EEG features are extracted using a modified S-transform, and the time-frequency information of the features is retained. Based on the pain recognition performance, the 20-40 Hz frequency band features are selected. Afterwards, the Wasserstein generative adversarial network with gradient penalty is used for feature data augmentation. It can be inferred from the good classification performance of features in the parietal region of the brain that the sensory function of the parietal lobe region is effectively activated during the occurrence of pain. Compared with the latest data augmentation methods and classification algorithms, the proposed method has significant advantages on the five-level pain dataset. This research provides new ways of thinking and research methods related to pain recognition, which is essential for the study of neural mechanisms and regulatory mechanisms of pain.
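The gradient penalty that distinguishes WGAN-GP from a plain Wasserstein GAN is a standard term and can be sketched independently of the EEG feature extractor; the penalty weight shown in the usage comment is the conventional default, not necessarily the paper's setting.

```python
# Standard WGAN-GP gradient-penalty term of the kind used when augmenting EEG
# features with a Wasserstein GAN; the critic network is assumed to exist elsewhere
# and the penalty weight of 10 is the conventional default, not the paper's setting.
import torch

def gradient_penalty(critic, real, fake, device="cpu"):
    alpha = torch.rand(real.size(0), *([1] * (real.dim() - 1)), device=device)
    interpolated = (alpha * real + (1 - alpha) * fake).requires_grad_(True)
    critic_out = critic(interpolated)
    grads = torch.autograd.grad(
        outputs=critic_out, inputs=interpolated,
        grad_outputs=torch.ones_like(critic_out),
        create_graph=True, retain_graph=True,
    )[0]
    grads = grads.view(grads.size(0), -1)
    return ((grads.norm(2, dim=1) - 1) ** 2).mean()

# critic_loss = -(critic(real).mean() - critic(fake).mean()) + 10.0 * gradient_penalty(critic, real, fake)
```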
Collapse
Affiliation(s)
- Jiancai Leng
- International School for Optoelectronic Engineering, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, P. R. China
| | - Jianqun Zhu
- International School for Optoelectronic Engineering, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, P. R. China
| | - Yihao Yan
- International School for Optoelectronic Engineering, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, P. R. China
| | - Xin Yu
- International School for Optoelectronic Engineering, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, P. R. China
| | - Ming Liu
- International School for Optoelectronic Engineering, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, P. R. China
| | - Yitai Lou
- International School for Optoelectronic Engineering, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, P. R. China
| | - Yanbing Liu
- International School for Optoelectronic Engineering, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, P. R. China
| | - Licai Gao
- International School for Optoelectronic Engineering, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, P. R. China
| | - Yuan Sun
- International School for Optoelectronic Engineering, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, P. R. China
| | - Tianzheng He
- International School for Optoelectronic Engineering, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, P. R. China
| | - Qingbo Yang
- School of Mathematics and Statistics, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, P. R. China
| | - Chao Feng
- International School for Optoelectronic Engineering, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, P. R. China
| | - Dezheng Wang
- Rehabilitation Center, Qilu Hospital of Shandong University, Jinan 250012, P. R. China
| | - Yang Zhang
- Rehabilitation Center, Qilu Hospital of Shandong University, Jinan 250012, P. R. China
| | - Qing Xu
- Shandong Institute of Scientific and Technical Information, Jinan 250101, P. R. China
| | - Fangzhou Xu
- International School for Optoelectronic Engineering, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, P. R. China
| |
Collapse
|
35
|
Villarrubia-Martin EA, Rodriguez-Benitez L, Jimenez-Linares L, Muñoz-Valero D, Liu J. A Hybrid Online Off-Policy Reinforcement Learning Agent Framework Supported by Transformers. Int J Neural Syst 2023; 33:2350065. [PMID: 37857407 DOI: 10.1142/s012906572350065x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/21/2023]
Abstract
Reinforcement learning (RL) is a powerful technique that allows agents to learn optimal decision-making policies through interactions with an environment. However, traditional RL algorithms suffer from several limitations such as the need for large amounts of data and long-term credit assignment, i.e. the problem of determining which actions actually produce a certain reward. Recently, Transformers have shown their capacity to address these constraints in this area of learning in an offline setting. This paper proposes a framework that uses Transformers to enhance the training of online off-policy RL agents and address the challenges described above through self-attention. The proposal introduces a hybrid agent with a mixed policy that combines an online off-policy agent with an offline Transformer agent using the Decision Transformer architecture. By sequentially exchanging the experience replay buffer between the agents, the agent's training efficiency is improved in the first iterations, and so is the training of Transformer-based RL agents in situations with limited data availability or unknown environments.
Collapse
Affiliation(s)
- Enrique Adrian Villarrubia-Martin
- Department of Technologies and Information Systems, Universidad de Castilla-La Mancha, Paseo de la Universidad 4, 13005 Ciudad Real, Spain
| | - Luis Rodriguez-Benitez
- Department of Technologies and Information Systems, Universidad de Castilla-La Mancha, Paseo de la Universidad 4, 13005 Ciudad Real, Spain
| | - Luis Jimenez-Linares
- Department of Technologies and Information Systems, Universidad de Castilla-La Mancha, Paseo de la Universidad 4, 13005 Ciudad Real, Spain
| | - David Muñoz-Valero
- Department of Technologies and Information Systems, Universidad de Castilla-La Mancha, Avenida Carlos III, s/n, 45004 Toledo, Spain
| | - Jun Liu
- School of Computing, University of Ulster, Northern Ireland, UK
| |
Collapse
|
36
|
Hu J, Yu C, Yi Z, Zhang H. Enhancing Robustness of Medical Image Segmentation Model with Neural Memory Ordinary Differential Equation. Int J Neural Syst 2023; 33:2350060. [PMID: 37743765 DOI: 10.1142/s0129065723500600] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 09/26/2023]
Abstract
Deep neural networks (DNNs) have emerged as a prominent model in medical image segmentation, achieving remarkable advancements in clinical practice. Despite the promising results reported in the literature, the effectiveness of DNNs necessitates substantial quantities of high-quality annotated training data. During experiments, we observe a significant decline in the performance of DNNs on the test set when there exists disruption in the labels of the training dataset, revealing inherent limitations in the robustness of DNNs. In this paper, we find that the neural memory ordinary differential equation (nmODE), a recently proposed model based on ordinary differential equations (ODEs), not only addresses the robustness limitation but also enhances performance when trained by the clean training dataset. However, it is acknowledged that the ODE-based model tends to be less computationally efficient compared to the conventional discrete models due to the multiple function evaluations required by the ODE solver. Recognizing the efficiency limitation of the ODE-based model, we propose a novel approach called the nmODE-based knowledge distillation (nmODE-KD). The proposed method aims to transfer knowledge from the continuous nmODE to a discrete layer, simultaneously enhancing the model's robustness and efficiency. The core concept of nmODE-KD revolves around enforcing the discrete layer to mimic the continuous nmODE by minimizing the KL divergence between them. Experimental results on 18 organs-at-risk segmentation tasks demonstrate that nmODE-KD exhibits improved robustness compared to ODE-based models while also mitigating the efficiency limitation.
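The core of nmODE-KD, forcing a discrete "student" layer to mimic the continuous nmODE "teacher" by minimizing a KL divergence, can be sketched with a generic distillation objective; the temperature and the weighting against the task loss are illustrative assumptions.

```python
# Generic knowledge-distillation objective of the kind described above: a KL term
# pushing the discrete "student" layer toward the continuous nmODE "teacher",
# combined with the usual task loss. Temperature and weighting are illustrative.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, alpha=0.5, T=2.0):
    kd = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    task = F.cross_entropy(student_logits, labels)   # the usual segmentation/classification loss
    return alpha * kd + (1 - alpha) * task
```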
Collapse
Affiliation(s)
- Junjie Hu
- Machine Intelligence Laboratory, College of Computer Science, Sichuan University, Chengdu 610065, P. R. China
| | - Chengrong Yu
- Machine Intelligence Laboratory, College of Computer Science, Sichuan University, Chengdu 610065, P. R. China
| | - Zhang Yi
- Machine Intelligence Laboratory, College of Computer Science, Sichuan University, Chengdu 610065, P. R. China
| | - Haixian Zhang
- Machine Intelligence Laboratory, College of Computer Science, Sichuan University, Chengdu 610065, P. R. China
| |
Collapse
|
37
|
Laport F, Dapena A, Castro PM, Iglesias DI, Vazquez-Araujo FJ. Eye State Detection Using Frequency Features from 1 or 2-Channel EEG. Int J Neural Syst 2023; 33:2350062. [PMID: 37822240 DOI: 10.1142/s0129065723500624] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/13/2023]
Abstract
Brain-computer interfaces (BCIs) establish a direct communication channel between the human brain and external devices. Among various methods, electroencephalography (EEG) stands out as the most popular choice for BCI design due to its non-invasiveness, ease of use, and cost-effectiveness. This paper aims to present and compare the accuracy and robustness of an EEG system employing one or two channels. We present both hardware and algorithms for the detection of open and closed eyes. Firstly, we utilize a low-cost hardware device to capture EEG activity from one or two channels. Next, we apply the discrete Fourier transform to analyze the signals in the frequency domain, extracting features from each channel. For classification, we test various well-known techniques, including Linear Discriminant Analysis (LDA), Support Vector Machine (SVM), Decision Tree (DT), or Logistic Regression (LR). To evaluate the system, we conduct experiments, acquiring signals associated with open and closed eyes, and compare the performance between one and two channels. The results demonstrate that employing a system with two channels and using SVM, DT, or LR classifiers enhances robustness compared to a single-channel setup and allows us to achieve an accuracy percentage greater than 95% for both eye states.
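The frequency-feature pipeline can be sketched as band powers computed from the DFT of each channel, fed to the classifiers compared in the paper. The specific bands below are assumptions (alpha power typically increases with closed eyes); the low-cost acquisition hardware is outside the scope of the sketch.

```python
# Sketch of the frequency-domain pipeline: DFT band-power features per EEG channel,
# fed to the classifiers compared in the paper. The chosen bands and sampling rate
# are illustrative assumptions.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

def band_power_features(epochs, fs=128, bands=((4, 8), (8, 13), (13, 30))):
    """epochs: (n_epochs, n_channels, n_samples) -> (n_epochs, n_channels * n_bands)."""
    spectrum = np.abs(np.fft.rfft(epochs, axis=-1)) ** 2
    freqs = np.fft.rfftfreq(epochs.shape[-1], d=1.0 / fs)
    feats = [spectrum[..., (freqs >= lo) & (freqs < hi)].mean(axis=-1) for lo, hi in bands]
    return np.stack(feats, axis=-1).reshape(epochs.shape[0], -1)

classifiers = {
    "LDA": LinearDiscriminantAnalysis(),
    "SVM": SVC(),
    "DT": DecisionTreeClassifier(),
    "LR": LogisticRegression(max_iter=1000),
}
# X_train, X_test = band_power_features(train_epochs), band_power_features(test_epochs)
# for name, clf in classifiers.items():
#     print(name, clf.fit(X_train, y_train).score(X_test, y_test))
```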
Collapse
Affiliation(s)
- Francisco Laport
- CITIC Research Centre & University of A Coruña, Campus de Elviña, s/n A Coruña, 15071, Spain
| | - Adriana Dapena
- CITIC Research Centre & University of A Coruña, Campus de Elviña, s/n A Coruña, 15071, Spain
| | - Paula M Castro
- CITIC Research Centre & University of A Coruña, Campus de Elviña, s/n A Coruña, 15071, Spain
| | - Daniel I Iglesias
- CITIC Research Centre & University of A Coruña, Campus de Elviña, s/n A Coruña, 15071, Spain
| | | |
Collapse
|
38
|
Teran-Pineda D, Thurnhofer-Hemsi K, Domínguez E. Human Gait Activity Recognition Using Multimodal Sensors. Int J Neural Syst 2023; 33:2350058. [PMID: 37779221 DOI: 10.1142/s0129065723500582] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/03/2023]
Abstract
Human activity recognition is an application of machine learning with the aim of identifying activities from raw activity data acquired by different sensors. In medicine, human gait is commonly analyzed by doctors to detect abnormalities and determine possible treatments for the patient. Monitoring the patient's activity is paramount in evaluating the treatment's evolution. This type of classification is still not precise enough, which may lead to unfavorable reactions and responses. A novel methodology that reduces the complexity of extracting features from multimodal sensors is proposed to improve human activity classification based on accelerometer data. A sliding window technique is used to demarcate the first dominant spectral amplitude, decreasing dimensionality and improving feature extraction. In this work, we compared several state-of-the-art machine learning classifiers evaluated on the HuGaDB dataset and validated on our dataset. Several configurations to reduce features and training time were analyzed using multimodal sensors: all-axis spectrum, single-axis spectrum, and sensor reduction.
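The sliding-window step that keeps the dominant spectral amplitude of each window can be sketched as below; window length, overlap, and sampling rate are illustrative assumptions.

```python
# Sketch of the sliding-window feature step: for each window of accelerometer data,
# keep only the dominant spectral amplitude and its frequency. Window length, step,
# and sampling rate are illustrative assumptions.
import numpy as np

def dominant_spectral_features(signal, fs=100, win=256, step=128):
    feats = []
    for start in range(0, len(signal) - win + 1, step):
        window = signal[start:start + win]
        spectrum = np.abs(np.fft.rfft(window - window.mean()))
        freqs = np.fft.rfftfreq(win, d=1.0 / fs)
        k = spectrum.argmax()
        feats.append((freqs[k], spectrum[k]))     # dominant frequency and its amplitude
    return np.array(feats)

# feats = dominant_spectral_features(acc_x)       # one (freq, amplitude) pair per window
```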
Collapse
Affiliation(s)
- Diego Teran-Pineda
- Department of Computer Languages and Computer Science, University of Málaga Bulevar Louis Pasteur, 35, 29071, Málaga, Spain
- Biomedical Research Institute of Málaga (IBIMA), C/ Doctor Miguel Díaz Recio, 28, 29010, Málaga, Spain
| | - Karl Thurnhofer-Hemsi
- Department of Computer Languages and Computer Science, University of Málaga Bulevar Louis Pasteur, 35, 29071, Málaga, Spain
- Biomedical Research Institute of Málaga (IBIMA), C/ Doctor Miguel Díaz Recio, 28, 29010, Málaga, Spain
| | - Enrique Domínguez
- Department of Computer Languages and Computer Science, University of Málaga Bulevar Louis Pasteur, 35, 29071, Málaga, Spain
- Biomedical Research Institute of Málaga (IBIMA), C/ Doctor Miguel Díaz Recio, 28, 29010, Málaga, Spain
| |
Collapse
|
39
|
Vernikos I, Spyrou E, Kostis IA, Mathe E, Mylonas P. A Deep Regression Approach for Human Activity Recognition Under Partial Occlusion. Int J Neural Syst 2023; 33:2350047. [PMID: 37602705 DOI: 10.1142/s0129065723500478] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 08/22/2023]
Abstract
In real-life scenarios, Human Activity Recognition (HAR) from video data is prone to occlusion of one or more body parts of the human subjects involved. Although it is common sense that the recognition of the majority of activities strongly depends on the motion of some body parts, which when occluded compromise the performance of recognition approaches, this problem is often underestimated in contemporary research works. Currently, training and evaluation is based on datasets that have been shot under laboratory (ideal) conditions, i.e. without any kind of occlusion. In this work, we propose an approach for HAR in the presence of partial occlusion, in cases wherein up to two body parts are involved. We assume that human motion is modeled using a set of 3D skeletal joints and also that occluded body parts remain occluded during the whole duration of the activity. We solve this problem using regression, performed by a novel deep Convolutional Recurrent Neural Network (CRNN). Specifically, given a partially occluded skeleton, we attempt to reconstruct the missing information regarding the motion of its occluded part(s). We evaluate our approach using four publicly available human motion datasets. Our experimental results indicate a significant increase of performance, when compared to baseline approaches, wherein networks that have been trained using only nonoccluded or both occluded and nonoccluded samples are evaluated using occluded samples. To the best of our knowledge, this is the first research work that formulates and copes with the problem of HAR under occlusion as a regression task.
Collapse
Affiliation(s)
- Ioannis Vernikos
- Department of Informatics and Telecommunications, University of Thessaly, 3rd Km Old National Road Lamia-Athens, Lamia 35132, Greece
| | - Evaggelos Spyrou
- Department of Informatics and Telecommunications, University of Thessaly, 3rd Km Old National Road Lamia-Athens, Lamia 35132, Greece
| | - Ioannis-Aris Kostis
- Department of Informatics and Telecommunications, University of Thessaly, 3rd Km Old National Road Lamia-Athens, Lamia 35132, Greece
| | - Eirini Mathe
- Department of Informatics, Ionian University, 7 Tsirigoti Square, Corfu 49100, Greece
| | - Phivos Mylonas
- Department of Informatics and Computer Engineering, University of West Attica, Egaleo Park, Agiou Spyridonos Street, 12243 Egaleo, Athens, Greece
| |
Collapse
|
40
|
Haralabopoulos G, Razis G, Anagnostopoulos I. A Modified Long Short-Term Memory Cell. Int J Neural Syst 2023; 33:2350039. [PMID: 37300815 DOI: 10.1142/s0129065723500399] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]
Abstract
Machine Learning (ML), among other things, facilitates Text Classification, the task of assigning classes to textual items. Classification performance in ML has been significantly improved due to recent developments, including the rise of Recurrent Neural Networks (RNNs), Long Short-Term Memory (LSTM), Gated Recurrent Units (GRUs), and Transformer Models. Internal memory states with dynamic temporal behavior can be found in these kinds of cells. This temporal behavior in the LSTM cell is stored in two different states: "Current" and "Hidden". In this work, we define a modification layer within the LSTM cell which allows us to perform additional state adjustments for either state, or even simultaneously alter both. We perform 17 state alterations. Out of these 17 single-state alteration experiments, 12 involve the Current state whereas five involve the Hidden one. These alterations are evaluated using seven datasets related to sentiment analysis, document classification, hate speech detection, and human-to-robot interaction. Our results showed that the highest performing alteration for Current and Hidden state can achieve an average F1 improvement of 0.5% and 0.3%, respectively. We also compare our modified cell performance to two Transformer models, where our modified LSTM cell is outperformed in classification metrics in 4/6 datasets, but improves upon the simple Transformer model and clearly has a better cost efficiency than both Transformer models.
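The modification layer can be sketched as a wrapper around a standard LSTM cell that adjusts the Current (cell) state, the Hidden state, or both after each step. The concrete alteration used here, a learned linear map, is only one of the many alterations the paper evaluates and serves to illustrate the mechanism.

```python
# Sketch of an LSTM cell with a modification layer applied to its states after each
# step. The specific alteration (a learned linear map) is an illustrative choice,
# not one of the paper's exact 17 alterations.
import torch
import torch.nn as nn

class ModifiedLSTMCell(nn.Module):
    def __init__(self, input_size, hidden_size, alter_current=True, alter_hidden=False):
        super().__init__()
        self.cell = nn.LSTMCell(input_size, hidden_size)
        self.alter_current = alter_current
        self.alter_hidden = alter_hidden
        self.mod = nn.Linear(hidden_size, hidden_size)   # the "modification layer"

    def forward(self, x, state):
        h, c = self.cell(x, state)
        if self.alter_current:                 # adjust the Current (cell) state
            c = self.mod(c)
        if self.alter_hidden:                  # adjust the Hidden state
            h = self.mod(h)
        return h, c

# cell = ModifiedLSTMCell(300, 128)
# h = c = torch.zeros(32, 128)
# for x_t in torch.randn(20, 32, 300):         # (seq_len, batch, embedding)
#     h, c = cell(x_t, (h, c))
```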
Collapse
Affiliation(s)
- Giannis Haralabopoulos
- Business Informatics Systems & Accounting Department, Henley Business School, University of Reading, Reading, UK
| | - Gerasimos Razis
- Department of Computer Science and Biomedical Informatics, School of Science, University of Thessaly, Lamia, Greece
| | - Ioannis Anagnostopoulos
- Department of Computer Science and Biomedical Informatics, School of Science, University of Thessaly, Lamia, Greece
| |
Collapse
|
41
|
Qin X, Niu Y, Zhou H, Li X, Jia W, Zheng Y. Driver Drowsiness EEG Detection Based on Tree Federated Learning and Interpretable Network. Int J Neural Syst 2023; 33:2350009. [PMID: 36655401 DOI: 10.1142/s0129065723500090] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]
Abstract
Accurate identification of driver's drowsiness state through Electroencephalogram (EEG) signals can effectively reduce traffic accidents, but EEG signals are usually stored in various clients in the form of small samples. This study attempts to construct an efficient and accurate privacy-preserving drowsiness monitoring system, and proposes a fusion model based on tree Federated Learning (FL) and Convolutional Neural Network (CNN), which can not only identify and explain the driver's drowsiness state, but also integrate the information of different clients under the premise of privacy protection. Each client uses CNN with the Global Average Pooling (GAP) layer and shares model parameters. The tree FL transforms communication relationships into a graph structure, and model parameters are transmitted in parallel along connected branches of the graph. Moreover, the Class Activation Mapping (CAM) is used to find distinctive EEG features for representing specific classes. On EEG data of 11 subjects, it is found that this method has higher average accuracy, F1-score and AUC than the traditional classification method, reaching 73.56%, 73.26% and 78.23%, respectively. Compared with the traditional FL algorithm, this method better protects the driver's privacy and improves communication efficiency.
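The class activation mapping step used for interpretation relies on the GAP layer: the weights of the final fully connected layer combine the last convolutional feature maps into a per-class activation map. A generic sketch is given below; tensor names and shapes are illustrative.

```python
# Generic class activation mapping (CAM) for a CNN ending in global average pooling
# followed by a single fully connected layer; shapes and names are illustrative.
import torch

def class_activation_map(feature_maps, fc_weight, class_idx):
    """feature_maps: (C, H, W) from the last conv layer; fc_weight: (n_classes, C)."""
    weights = fc_weight[class_idx]                               # (C,)
    cam = torch.einsum("c,chw->hw", weights, feature_maps)       # weighted sum of feature maps
    cam = torch.relu(cam)
    return cam / (cam.max() + 1e-8)                              # normalize to [0, 1]

# cam = class_activation_map(last_conv_features, model.fc.weight.detach(), predicted_class)
```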
Collapse
Affiliation(s)
- Xue Qin
- School of Information Science and Engineering, Shandong Normal University, Jinan, 250358, P. R. China
| | - Yi Niu
- School of Information Science and Engineering, Shandong Normal University, Jinan, 250358, P. R. China
| | - Huiyu Zhou
- School of Computing and Mathematical Sciences, University of Leicester, Leicester, LE1 7RH, UK
| | - Xiaojie Li
- School of Information Science and Engineering, Shandong Normal University, Jinan, 250358, P. R. China
| | - Weikuan Jia
- School of Information Science and Engineering, Shandong Normal University, Jinan, 250358, P. R. China
| | - Yuanjie Zheng
- School of Information Science and Engineering, Shandong Normal University, Jinan, 250358, P. R. China
| |
Collapse
|
42
|
Maisano R, Foresti GL. A Sentiment Analysis Anomaly Detection System for Cyber Intelligence. Int J Neural Syst 2023; 33:2350003. [PMID: 36585854 DOI: 10.1142/s012906572350003x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/01/2023]
Abstract
Considering the United Nations' 2030 goal of worldwide connection, Cyber Intelligence becomes the main area of the human dimension capable of inflicting changes in geopolitical dynamics. In cyberspace, the new battlefield is people's minds, targeted with new weapons such as the abuse of social media for information manipulation, deception by activists, and misinformation. In this paper, a Sentiment Analysis system with Anomaly Detection (SAAD) capability is proposed. The system, scalable and modular, uses an OSINT-Deep Learning approach to investigate social media sentiment in order to predict suspicious anomalous trends in Twitter posts. Anomaly detection is investigated with a new semi-supervised process that is able to detect potentially dangerous situations in critical areas. The main contributions of the paper are the system's suitability for different areas and domains, the anomaly detection procedure in a sentiment context, and a time-dependent confusion matrix to address model evaluation with an unbalanced dataset. Real experiments and tests were performed on the Sahel region. The anomalies detected in negative sentiment were checked by experts on the Sahel area, confirming genuine links between the model's results and real situations observable in the tweets.
Collapse
Affiliation(s)
- Roberta Maisano
- Computer Science Centre, University of Messina, Piazza Antonello, 2, 98122 Messina, Italy
| | - Gian Luca Foresti
- Department of Mathematics, Computer Science and Physics, University of Udine, Viale delle Scienze, 206, 33100 Udine, Italy
| |
Collapse
|
43
|
Kumoi G, Yagi H, Kobayashi M, Goto M, Hirasawa S. Performance Evaluation of Error-Correcting Output Coding Based on Noisy and Noiseless Binary Classifiers. Int J Neural Syst 2023; 33:2350004. [PMID: 36624957 DOI: 10.1142/s0129065723500041] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/11/2023]
Abstract
Error-correcting output coding (ECOC) is a method for constructing a multi-valued classifier using a combination of given binary classifiers. Based on the framework of coding theory, ECOC can estimate the correct category using the remaining binary classifiers even if the outputs of some binary classifiers are incorrect. The code word table representing the combination of these binary classifiers is important in ECOC. ECOC is known to perform well experimentally on real data. However, the complexity of the classification problem makes it difficult to analyze the classification performance in detail. For this reason, theoretical analysis of ECOC has not been conducted. In this study, if a binary classifier outputs the estimated posterior probability with errors, then this binary classifier is said to be noisy. In contrast, if a binary classifier outputs the true posterior probability, then this binary classifier is said to be noiseless. For a theoretical analysis of ECOC, we discuss the optimality of the code word table with noiseless binary classifiers and the error rate of one with noisy binary classifiers. This evaluation result shows that the Hamming distance of the code word table is an important indicator.
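The ECOC decoding rule underlying this analysis can be illustrated directly: each class is assigned a code word, and a sample is assigned to the class whose code word is closest in Hamming distance to the vector of binary classifier outputs. The code word table below is a toy example.

```python
# Illustration of ECOC decoding: the predicted class is the row of the code word
# table closest in Hamming distance to the binary classifiers' outputs. The table
# here is a toy example, not one analyzed in the paper.
import numpy as np

code_word_table = np.array([          # rows: classes, columns: binary classifiers
    [0, 0, 1, 1, 0],
    [1, 0, 0, 1, 1],
    [0, 1, 0, 0, 1],
    [1, 1, 1, 0, 0],
])

def ecoc_decode(binary_outputs, table=code_word_table):
    """binary_outputs: 0/1 decisions of the binary classifiers for one sample."""
    hamming = (table != np.asarray(binary_outputs)).sum(axis=1)
    return int(hamming.argmin())

print(ecoc_decode([1, 0, 0, 1, 0]))   # -> 1 (one bit away from class 1's code word)
```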
Collapse
Affiliation(s)
- Gendo Kumoi
- Center for Data Science, Waseda University, 1-6-1, Nishiwaseda, Shinjuku-ku, Tokyo 169-8050, Japan
| | - Hideki Yagi
- Department of Computer and Network Engineering, The University of Electro-Communications, 1-5-1, Chofugaoka, Chofu, Tokyo 182-8585, Japan
| | - Manabu Kobayashi
- Center for Data Science, Waseda University, 1-6-1, Nishiwaseda, Shinjuku-ku, Tokyo 169-8050, Japan
| | - Masayuki Goto
- School of Creative Science and Engineering, Waseda University, 3-4-1, Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
| | - Shigeichi Hirasawa
- Center for Data Science, Waseda University, 1-6-1, Nishiwaseda, Shinjuku-ku, Tokyo 169-8050, Japan
| |
Collapse
|
44
|
Fei N, Li R, Cui H, Hu Y. A Prediction Model for Normal Variation of Somatosensory Evoked Potential During Scoliosis Surgery. Int J Neural Syst 2023; 33:2350005. [PMID: 36581320 DOI: 10.1142/s0129065723500053] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]
Abstract
Somatosensory evoked potential (SEP) has been commonly used as intraoperative monitoring to detect the presence of neurological deficits during scoliosis surgery. However, SEP usually presents an enormous variation in response to patient-specific factors such as physiological parameters, leading to false warnings. This study proposes a prediction model to quantify SEP amplitude variation due to non-injury-related physiological changes of the patient undergoing scoliosis surgery. Based on a hybrid network of attention-based long-short-term memory (LSTM) and convolutional neural networks (CNNs), we develop a deep learning-based framework for predicting the SEP value in response to variation of physiological variables. The training and selection of model parameters were based on a 5-fold cross-validation scheme using mean square error (MSE) as the evaluation metric. The proposed model obtained an MSE of 0.027[Formula: see text][Formula: see text] on left cortical SEP, an MSE of 0.024[Formula: see text][Formula: see text] on left subcortical SEP, an MSE of 0.031[Formula: see text][Formula: see text] on right cortical SEP, and an MSE of 0.025[Formula: see text][Formula: see text] on right subcortical SEP based on the test set. The proposed model can quantify the effect of physiological parameters on the SEP amplitude in response to normal physiological variation during scoliosis surgery. The prediction of SEP amplitude provides a potential varying reference for intraoperative SEP monitoring.
Collapse
Affiliation(s)
- Ningbo Fei
- Department of Orthopaedics and Traumatology, The University of Hong Kong - Shenzhen Hospital, Shenzhen 518058, Guangdong, P. R. China; Department of Orthopaedics and Traumatology, The University of Hong Kong, Pokfulam, Hong Kong
| | - Rong Li
- Department of Orthopaedics and Traumatology, The University of Hong Kong - Shenzhen Hospital, Shenzhen 518058, Guangdong, P. R. China; Department of Orthopaedics and Traumatology, The University of Hong Kong, Pokfulam, Hong Kong
| | - Hongyan Cui
- Institute of Biomedical Engineering, Chinese Academy of Medical Sciences and Peking Union Medical College, Tianjin 300192, P. R. China
| | - Yong Hu
- Department of Orthopaedics and Traumatology, The University of Hong Kong - Shenzhen Hospital, Shenzhen 518058, Guangdong, P. R. China; Department of Orthopaedics and Traumatology, The University of Hong Kong, Pokfulam, Hong Kong
| |
Collapse
|
45
|
Selcuk Nogay H, Adeli H. Diagnostic of autism spectrum disorder based on structural brain MRI images using grid search optimization and convolutional neural networks. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2022.104234] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/14/2022]
|
46
|
Koutrintzes D, Spyrou E, Mathe E, Mylonas P. A Multimodal Fusion Approach for Human Activity Recognition. Int J Neural Syst 2023; 33:2350002. [PMID: 36573880 DOI: 10.1142/s0129065723500028] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/29/2022]
Abstract
The problem of human activity recognition (HAR), which consists of recognizing human motion and/or behavior within a given image or video sequence from raw sensor measurements, has been attracting increasing research effort and has several applications. In this paper, a multimodal approach to video-based HAR is proposed. It is based on 3D visual data collected with an RGB + depth camera, yielding both raw video and 3D skeletal sequences. These data are transformed into six different 2D image representations: four in the spectral domain and one pseudo-colored image, all five derived from the skeletal data, plus a "dynamic" image, an artificially created image that summarizes the RGB data of the whole video sequence in a visually comprehensible way. To classify a given activity video, all six 2D images are first extracted and six trained convolutional neural networks are used to extract visual features. These features are fused into a single feature vector and fed into a support vector machine for classification into human activities. For evaluation purposes, a challenging motion activity recognition dataset is used, and single-view, cross-view and cross-subject experiments are performed. Moreover, the proposed approach is compared with three other state-of-the-art methods, demonstrating superior performance in most experiments.
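The fusion step described above, concatenating per-modality CNN features into a single vector and classifying with a support vector machine, can be sketched as follows; the feature dimension, number of classes, and the randomly generated stand-in features are hypothetical placeholders for what the six trained CNNs would produce.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# Hypothetical stand-in: per-modality feature vectors that six CNNs would
# produce for each video (here random, with 512 features per modality).
n_videos, n_modalities, feat_dim = 200, 6, 512
rng = np.random.default_rng(0)
per_modality_feats = [rng.normal(size=(n_videos, feat_dim)) for _ in range(n_modalities)]
labels = rng.integers(0, 10, size=n_videos)          # 10 hypothetical activity classes

# Fusion by concatenation into a single feature vector per video,
# followed by an SVM classifier, mirroring the pipeline described above.
fused = np.concatenate(per_modality_feats, axis=1)   # (n_videos, 6 * 512)
clf = make_pipeline(StandardScaler(), SVC(kernel="linear"))
clf.fit(fused[:150], labels[:150])
print(clf.score(fused[150:], labels[150:]))          # held-out accuracy (random here)
```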
Collapse
Affiliation(s)
- Dimitrios Koutrintzes
- Institute of Informatics and Telecommunications, National Center for Scientific Research - "Demokritos", Athens, Greece
| | - Evaggelos Spyrou
- Department of Informatics and Telecommunication, University of Thessaly, Lamia, Greece
| | - Eirini Mathe
- Department of Informatics, Ionian University, Corfu, Greece
| | - Phivos Mylonas
- Department of Informatics, Ionian University, Corfu, Greece
| |
Collapse
|
47
|
Wang J, Ge X, Shi Y, Sun M, Gong Q, Wang H, Huang W. Dual-Modal Information Bottleneck Network for Seizure Detection. Int J Neural Syst 2023; 33:2250061. [PMID: 36599663 DOI: 10.1142/s0129065722500617] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/06/2023]
Abstract
In recent years, deep learning has shown very competitive performance in seizure detection. However, most current methods either convert electroencephalogram (EEG) signals into spectral images and employ 2D-CNNs, or split the one-dimensional (1D) EEG features into many segments and employ 1D-CNNs. These approaches are further constrained because they do not consider the temporal links between time-series segments or spectrogram images. We therefore propose a Dual-Modal Information Bottleneck (Dual-modal IB) network for EEG seizure detection. The network extracts EEG features from both the time-series and the spectrogram dimensions and passes information from the two modalities through the Dual-modal IB, requiring the model to gather and condense the most pertinent information in each modality and share only what is necessary. Specifically, we make full use of the information shared between the two modality representations to obtain key information for seizure detection and to remove irrelevant features between the two modalities. To capture intrinsic temporal dependencies, we further introduce a bidirectional long short-term memory (BiLSTM) into the Dual-modal IB model, which models the temporal relationships among the features extracted from each modality by the convolutional neural network (CNN). On the CHB-MIT dataset, the proposed framework achieves an average segment-based sensitivity of 97.42%, specificity of 99.32%, and accuracy of 98.29%, and an average event-based sensitivity of 96.02% with a false detection rate (FDR) of 0.70/h. We release our code at https://github.com/LLLL1021/Dual-modal-IB.
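For orientation, the following is a simplified sketch of a dual-branch model in this spirit: each modality is encoded by a small CNN and a BiLSTM, passed through a variational bottleneck regularized by a KL term, and the two latent codes are fused for classification. This is a generic variational-IB surrogate with assumed channel counts, shapes, and hyperparameters, not the paper's shared-information objective or released implementation (see the authors' repository above for that).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Branch(nn.Module):
    """One modality branch (illustrative): a small CNN over the input segment,
    a BiLSTM over the resulting feature sequence, and a stochastic bottleneck
    parameterized by a mean and log-variance (variational IB surrogate)."""
    def __init__(self, in_channels, latent=32):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv1d(in_channels, 16, 5, padding=2), nn.ReLU(),
            nn.Conv1d(16, 32, 5, padding=2), nn.ReLU(),
        )
        self.bilstm = nn.LSTM(32, 32, batch_first=True, bidirectional=True)
        self.mu = nn.Linear(64, latent)
        self.logvar = nn.Linear(64, latent)

    def forward(self, x):                      # x: (batch, channels, time)
        h = self.cnn(x).transpose(1, 2)        # (batch, time, 32)
        h, _ = self.bilstm(h)                  # (batch, time, 64)
        h = h.mean(dim=1)                      # temporal average pooling
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
        return z, kl

class DualModalIB(nn.Module):
    """Hypothetical dual-branch model: one branch for the raw EEG time series,
    one for the spectrogram (treated as a frequency-channel sequence), fused
    for seizure vs. non-seizure classification."""
    def __init__(self, eeg_channels=23, spec_bins=64, latent=32):
        super().__init__()
        self.time_branch = Branch(eeg_channels, latent)
        self.spec_branch = Branch(spec_bins, latent)
        self.classifier = nn.Linear(2 * latent, 2)

    def forward(self, eeg, spec, target, beta=1e-3):
        z1, kl1 = self.time_branch(eeg)
        z2, kl2 = self.spec_branch(spec)
        logits = self.classifier(torch.cat([z1, z2], dim=1))
        return F.cross_entropy(logits, target) + beta * (kl1 + kl2)

model = DualModalIB()
loss = model(torch.randn(8, 23, 256), torch.randn(8, 64, 256),
             torch.randint(0, 2, (8,)))
```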
Collapse
Affiliation(s)
- Jiale Wang
- School of Information Science and Engineering, Shandong Normal University, Jinan 250358, P. R. China
| | - Xinting Ge
- School of Information Science and Engineering, Shandong Normal University, Jinan 250358, P. R. China
| | - Yunfeng Shi
- School of Information Science and Engineering, Shandong Normal University, Jinan 250358, P. R. China
| | - Mengxue Sun
- School of Information Science and Engineering, Shandong Normal University, Jinan 250358, P. R. China
| | - Qingtao Gong
- Ulsan Ship and Ocean College, Ludong University, Yantai 264025, P. R. China
| | - Haipeng Wang
- Institute of Information Fusion, Naval Aviation University, Yantai 264001, P. R. China
| | - Wenhui Huang
- School of Information Science and Engineering, Shandong Normal University, Jinan 250358, P. R. China
| |
Collapse
|
48
|
Kasseropoulos DP, Koukaras P, Tjortjis C. Exploiting Textual Information for Fake News Detection. Int J Neural Syst 2022; 32:2250058. [PMID: 36328968 DOI: 10.1142/s0129065722500587] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/21/2025]
Abstract
"Fake news" refers to the deliberate dissemination of news with the purpose to deceive and mislead the public. This paper assesses the accuracy of several Machine Learning (ML) algorithms, using a style-based technique that relies on textual information extracted from news, such as part of speech counts. To expand the already proposed styled-based techniques, a new method of enhancing a linguistic feature set is proposed. It combines Named Entity Recognition (NER) with the Frequent Pattern (FP) Growth association rule mining algorithm, aiming to provide better insight into the papers' sentence level structure. Recursive feature elimination was used to identify a subset of the highest performing linguistic characteristics, which turned out to align with the literature. Using pre-trained word embeddings, document embeddings and weighted document embeddings were constructed using each word's TF-IDF value as the weight factor. The document embeddings were mixed with the linguistic features providing a variety of training/test feature sets. For each model, the best performing feature set was identified and fine-tuned regarding its hyper parameters to improve accuracy. ML algorithms' results were compared with two Neural Networks: Convolutional Neural Network (CNN) and Long-Short-Term Memory (LSTM). The results indicate that CNN outperformed all other methods in terms of accuracy, when companied with pre-trained word embeddings, yet SVM performs almost the same with a wider variety of input feature sets. Although style-based technique scores lower accuracy, it provides explainable results about the author's writing style decisions. Our work points out how new technologies and combinations of existing techniques can enhance the style-based approach capturing more information.
Collapse
Affiliation(s)
- Dimitrios Panagiotis Kasseropoulos
- The Data Mining and Analytics Research Group, School of Science and Technology, International Hellenic University, 14th km Thessaloniki - N. Moudania, 57001 Thermi, Thessaloniki, Greece
| | - Paraskevas Koukaras
- The Data Mining and Analytics Research Group, School of Science and Technology, International Hellenic University, 14th km Thessaloniki - N. Moudania, 57001 Thermi, Thessaloniki, Greece
| | - Christos Tjortjis
- The Data Mining and Analytics Research Group, School of Science and Technology, International Hellenic University, 14th km Thessaloniki - N. Moudania, 57001 Thermi, Thessaloniki, Greece
| |
Collapse
|
49
|
Wang K, Wang Y, Zhan B, Yang Y, Zu C, Wu X, Zhou J, Nie D, Zhou L. An Efficient Semi-Supervised Framework with Multi-Task and Curriculum Learning for Medical Image Segmentation. Int J Neural Syst 2022; 32:2250043. [DOI: 10.1142/s0129065722500435] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]
|
50
|
Li S, Tang Z, Jin N, Yang Q, Liu G, Liu T, Hu J, Liu S, Wang P, Hao J, Zhang Z, Zhang X, Li J, Wang X, Li Z, Wang Y, Yang B, Ma L. Uncovering Brain Differences in Preschoolers and Young Adolescents with Autism Spectrum Disorder using Deep Learning. Int J Neural Syst 2022; 32:2250044. [DOI: 10.1142/s0129065722500447] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]
|