1
Contreras RC, Viana MS, Bernardino VJS, Santos FLD, Toygar Ö, Guido RC. A multi-filter deep transfer learning framework for image-based autism spectrum disorder detection. Sci Rep 2025; 15:14253. [PMID: 40274878 PMCID: PMC12022319 DOI: 10.1038/s41598-025-97708-7] [Received: 01/21/2025] [Accepted: 04/07/2025] [Indexed: 04/26/2025] Open
Abstract
Autism Spectrum Disorder (ASD) affects approximately [Formula: see text] of the global population and is characterized by difficulties in social communication and repetitive or obsessive behaviors. Early detection of autism is crucial, as it allows therapeutic interventions to be initiated earlier, significantly increasing the effectiveness of treatments. However, diagnosing ASD remains a challenge, as it is traditionally carried out through methods that are often subjective and based on interviews and clinical observations. With the advancement of computer vision and pattern recognition techniques, new possibilities are emerging to automate and enhance the detection of characteristics associated with ASD, particularly in the analysis of facial features. In this context, image-based computational approaches must address challenges such as low data availability, variability in image acquisition conditions, and high-dimensional feature representations generated by deep learning models. This study proposes a novel framework that integrates data augmentation, multi-filtering routines, histogram equalization, and a two-stage dimensionality reduction process to enrich the representation in pre-trained and frozen deep learning neural network models applied to image pattern recognition. The framework design is guided by practical needs specific to ASD detection scenarios: data augmentation aims to compensate for limited dataset sizes; image enhancement routines improve robustness to noise and lighting variability while potentially highlighting facial traits associated with ASD; feature scaling standardizes representations prior to classification; and dimensionality reduction compresses high-dimensional deep features while preserving discriminative power. The use of frozen pre-trained networks allows for a lightweight, deterministic pipeline without the need for fine-tuning. 
Experiments are conducted using eight pre-trained models on a well-established benchmark facial dataset in the literature, comprising samples of autistic and non-autistic individuals. The results show that the proposed framework improves classification accuracy by up to [Formula: see text] points when compared to baseline models using pre-trained networks without any preprocessing strategies - as evidenced by the ResNet-50 architecture, which increased from [Formula: see text] to [Formula: see text]. Moreover, Transformer-based models, such as ViTSwin, reached up to [Formula: see text] accuracy, highlighting the robustness of the proposed approach. These improvements were observed consistently across different network architectures and datasets, under varying data augmentation, filtering, and dimensionality reduction configurations. A systematic ablation study further confirms the individual and collective benefits of each component in the pipeline, reinforcing the contribution of the integrated approach. These findings suggest that the framework is a promising tool for the automated detection of autism, offering an efficient improvement in traditional deep learning-based approaches to assist in early and more accurate diagnosis.
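The pipeline described in this abstract (augmentation, histogram equalization, frozen feature extraction, scaling, two-stage dimensionality reduction) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the images are synthetic, `frozen_features` is a hypothetical stand-in for a frozen pre-trained network, horizontal flipping is an assumed augmentation, and both reduction stages are plain PCA used only as placeholders because the abstract does not name the methods.

```python
import numpy as np


def equalize_histogram(img: np.ndarray) -> np.ndarray:
    """Histogram-equalize an 8-bit greyscale image via its cumulative histogram."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]            # count at the darkest level present
    denom = max(cdf[-1] - cdf_min, 1)    # avoid division by zero on flat images
    lut = np.clip(np.round((cdf - cdf_min) / denom * 255), 0, 255).astype(np.uint8)
    return lut[img]


def frozen_features(img: np.ndarray, proj: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for a frozen network: a fixed, never-updated projection."""
    return np.tanh(proj @ (img.ravel() / 255.0))


def standardize(X: np.ndarray) -> np.ndarray:
    """Zero-mean, unit-variance scaling per feature (constant features stay at zero)."""
    mu, sigma = X.mean(axis=0), X.std(axis=0)
    return (X - mu) / np.where(sigma > 0, sigma, 1.0)


def pca(X: np.ndarray, k: int) -> np.ndarray:
    """Project X onto its first k principal components using an SVD."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:k].T


rng = np.random.default_rng(0)
# Synthetic low-contrast "face" images (assumption: 16x16, 8-bit greyscale).
images = rng.integers(40, 120, size=(20, 16, 16), dtype=np.uint8)
proj = rng.standard_normal((64, 16 * 16)) / 16.0  # fixed weights, never trained

augmented = np.concatenate([images, images[:, :, ::-1]])  # add horizontal flips
features = np.stack([frozen_features(equalize_histogram(im), proj) for im in augmented])
scaled = standardize(features)
stage1 = pca(scaled, 16)   # first reduction stage (placeholder choice)
stage2 = pca(stage1, 4)    # second reduction stage (placeholder choice)
```

The reduced matrix `stage2` would then be passed to a classifier; because the feature extractor is never updated, the whole chain stays deterministic for a fixed seed.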
Affiliation(s)
- Rodrigo Colnago Contreras
- Department of Science and Technology, Institute of Science and Technology, Federal University of São Paulo (UNIFESP), São José dos Campos, SP, 12247-014, Brazil.
- Department of Computer Science and Statistics, Institute of Biosciences, Letters and Exact Sciences, São Paulo State University (UNESP), São José do Rio Preto, SP, 15054-000, Brazil.
- São Paulo State Technological College, Paula Souza State Center for Technological Education (CEETEPS), São José do Rio Preto, SP, 15043-020, Brazil.
- Victor José Souza Bernardino
- São Paulo State Technological College, Paula Souza State Center for Technological Education (CEETEPS), São José do Rio Preto, SP, 15043-020, Brazil
- Önsen Toygar
- Computer Engineering Department, Faculty of Engineering, Eastern Mediterranean University, 99628, Famagusta, North Cyprus, via Mersin 10, Turkey
- Rodrigo Capobianco Guido
- Department of Computer Science and Statistics, Institute of Biosciences, Letters and Exact Sciences, São Paulo State University (UNESP), São José do Rio Preto, SP, 15054-000, Brazil
2
P V, A P. Virtual Reality-Based Attention Prediction Model in Gaming for Autistic Children. Int J Dev Neurosci 2025; 85:e70000. [PMID: 39873320 DOI: 10.1002/jdn.70000] [Received: 10/05/2024] [Revised: 12/26/2024] [Accepted: 01/08/2025] [Indexed: 01/30/2025] Open
Abstract
Virtual reality (VR) has emerged as a successful therapeutic strategy across many areas of health care, including rehabilitation, the promotion of inpatients' emotional wellness, diagnostics, surgeon training and mental health therapy. This study develops a new model, VRAPMG, for children with ASD in a 3D gaming setting, in which face images are used to evaluate the children's attention. During data acquisition, the input face image is converted to greyscale. Preprocessing applies a median filter, followed by face detection with the Viola-Jones (VJ) algorithm. Features are then extracted using an improved active shape model (ASM), shape local binary texture (SLBT) and eye position localization. In the detection phase, DMO and Bi-GRU models are combined to form a hybrid classification model. An improved SLF is then applied to produce the detected output. Depending on the detected emotions, entropy-based attention prediction determines whether the children are attentive.
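The early steps named in this abstract (greyscale conversion, median filtering) and the entropy measure can be sketched as below. This is only an illustration under stated assumptions: the BT.601 luma weights, the 3x3 window, and the use of Shannon entropy over emotion probabilities are choices made here, and the abstract does not say how entropy is mapped to an attentive or inattentive decision, so no decision rule is shown.

```python
import numpy as np


def to_greyscale(rgb: np.ndarray) -> np.ndarray:
    """Convert an (H, W, 3) RGB image to greyscale with ITU-R BT.601 luma weights."""
    weights = np.array([0.299, 0.587, 0.114])
    return rgb[..., :3].astype(float) @ weights


def median_filter3(img: np.ndarray) -> np.ndarray:
    """3x3 median filter with edge replication; output has the same shape as img."""
    padded = np.pad(img, 1, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, (3, 3))
    return np.median(windows, axis=(-2, -1))


def shannon_entropy(probs) -> float:
    """Shannon entropy in bits of a (possibly unnormalized) probability vector."""
    p = np.asarray(probs, dtype=float)
    p = p / p.sum()
    p = p[p > 0]                      # 0 * log(0) is treated as 0
    return float(-(p * np.log2(p)).sum())


# Synthetic example: a flat grey frame with one "salt" noise pixel.
frame = np.full((8, 8, 3), 100, dtype=np.uint8)
frame[4, 4] = 255
grey = to_greyscale(frame)
clean = median_filter3(grey)          # the isolated outlier is removed

# Hypothetical emotion probabilities from a classifier.
confident = shannon_entropy([0.97, 0.01, 0.01, 0.01])
uncertain = shannon_entropy([0.25, 0.25, 0.25, 0.25])
```

A peaked emotion distribution yields low entropy and a uniform one yields the maximum (2 bits for four classes), which is the kind of signal an entropy-based predictor could threshold.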
Affiliation(s)
- Valarmathi P
- Department of Computer Science and Engineering, Vels Institute of Science & Technology & Advanced Studies, Chennai, Tamilnadu, India
- Packialatha A
- Department of Computer Science and Engineering, Vels Institute of Science & Technology & Advanced Studies, Chennai, Tamilnadu, India
3
Zaher M, Ghoneim AS, Abdelhamid L, Atia A. Fusing CNNs and attention-mechanisms to improve real-time indoor Human Activity Recognition for classifying home-based physical rehabilitation exercises. Comput Biol Med 2025; 184:109399. [PMID: 39591669 DOI: 10.1016/j.compbiomed.2024.109399] [Received: 05/14/2024] [Revised: 10/20/2024] [Accepted: 11/07/2024] [Indexed: 11/28/2024]
Abstract
Physical rehabilitation plays a critical role in enhancing health outcomes globally. However, the shortage of physiotherapists, particularly in developing countries where the ratio is approximately ten physiotherapists per million people, poses a significant challenge to effective rehabilitation services. The existing literature on rehabilitation often falls short in data representation and the employment of diverse modalities, limiting the potential for advanced therapeutic interventions. To address this gap, this study integrates Computer Vision and Human Activity Recognition (HAR) technologies to support home-based rehabilitation, exploring various modalities and proposing a framework for data representation. We introduce a novel framework that leverages both Continuous Wavelet Transform (CWT) and Mel-Frequency Cepstral Coefficients (MFCC) for skeletal data representation. CWT is particularly valuable for capturing the time-frequency characteristics of dynamic movements involved in rehabilitation exercises, enabling a comprehensive depiction of both temporal and spectral features. This dual capability is crucial for accurately modelling the complex and variable nature of rehabilitation exercises. In our analysis, we evaluate 20 CNN-based models and one Vision Transformer (ViT) model. Additionally, we propose 12 hybrid architectures that combine CNN-based models with ViT in bi-model and tri-model configurations. These models are rigorously tested on the UI-PRMD and KIMORE benchmark datasets using key evaluation metrics, including accuracy, precision, recall, and F1-score, with 5-fold cross-validation. Our evaluation also considers real-time performance, model size, and efficiency on low-power devices, emphasising practical applicability. 
The proposed fused tri-model architectures outperform both single-architectures and bi-model configurations, demonstrating robust performance across both datasets and making the fused models the preferred choice for rehabilitation tasks. Our proposed hybrid model, DenMobVit, consistently surpasses state-of-the-art methods, achieving accuracy improvements of 2.9% and 1.97% on the UI-PRMD and KIMORE datasets, respectively. These findings highlight the effectiveness of our approach in advancing rehabilitation technologies and bridging the gap in physiotherapy services.
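To make the CWT-based representation concrete, the sketch below turns a single joint trajectory into a scalogram (a scales-by-time image of the kind a CNN or ViT could consume). It is an assumption-laden illustration: the abstract does not name the mother wavelet, so a Ricker (Mexican hat) wavelet is used here, the trajectory is synthetic, and the MFCC branch is not shown.

```python
import numpy as np


def ricker(points: int, a: float) -> np.ndarray:
    """Ricker (Mexican hat) wavelet sampled at `points` positions with width `a`."""
    t = np.arange(points) - (points - 1) / 2.0
    amp = 2.0 / (np.sqrt(3.0 * a) * np.pi ** 0.25)
    return amp * (1.0 - (t / a) ** 2) * np.exp(-(t ** 2) / (2.0 * a ** 2))


def cwt_ricker(signal: np.ndarray, widths) -> np.ndarray:
    """Continuous wavelet transform by direct convolution, one row per width."""
    out = np.empty((len(widths), len(signal)))
    for i, a in enumerate(widths):
        n = min(10 * int(a), len(signal))   # wavelet support, capped at signal length
        out[i] = np.convolve(signal, ricker(n, a), mode="same")
    return out


# Synthetic joint coordinate: a slow repetitive movement plus a faster tremor.
t = np.linspace(0.0, 10.0, 300)
joint_y = np.sin(2 * np.pi * 0.5 * t) + 0.2 * np.sin(2 * np.pi * 4.0 * t)

widths = [1, 2, 4, 8, 16]
scalogram = cwt_ricker(joint_y, widths)     # shape: (len(widths), len(joint_y))
```

Stacking such scalograms across joints and coordinates is one plausible way to obtain image-like inputs; how the authors arrange them is not stated in the abstract.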
Affiliation(s)
- Moamen Zaher
- Faculty of Computer Science, October University for Modern Sciences and Arts (MSA), Egypt; Human-Computer Interaction (HCI-LAB), Faculty of Computing and Artificial Intelligence, Helwan University, Egypt.
- Amr S Ghoneim
- Computer Science Department, Faculty of Computing and Artificial Intelligence, Helwan University, Egypt.
- Laila Abdelhamid
- Information Systems Department, Faculty of Computing and Artificial Intelligence, Helwan University, Egypt.
- Ayman Atia
- Faculty of Computer Science, October University for Modern Sciences and Arts (MSA), Egypt; Human-Computer Interaction (HCI-LAB), Faculty of Computing and Artificial Intelligence, Helwan University, Egypt.
4
Kanwal A, Javed K, Ali S, Khan MA, Alsenan S, Alasiry A, Marzougui M, Rubab S. ALATT-network: automated LSTM-based framework for classification and monitoring of autism spectrum disorder therapy tasks. Signal, Image and Video Processing 2024; 18:9205-9221. [DOI: 10.1007/s11760-024-03540-3] [Received: 07/01/2024] [Revised: 08/18/2024] [Accepted: 08/23/2024] [Indexed: 01/12/2025]
5
Öztürk D, Aydoğan S, Kök İ, Akın Bülbül I, Özdemir S, Özdemir S, Akay D. Linguistic summarization of visual attention and developmental functioning of young children with autism spectrum disorder. Health Inf Sci Syst 2024; 12:39. [PMID: 39022602 PMCID: PMC11252111 DOI: 10.1007/s13755-024-00297-4] [Received: 12/29/2023] [Accepted: 07/06/2024] [Indexed: 07/20/2024] Open
Abstract
Diagnosing autism spectrum disorder (ASD) in children poses significant challenges due to its complex nature and impact on social communication development. While numerous data analytics techniques have been proposed for ASD evaluation, the process remains time-consuming and lacks clarity. Eye tracking (ET) data has emerged as a valuable resource for ASD risk assessment, yet existing literature predominantly focuses on predictive methods rather than descriptive techniques that offer human-friendly insights. Interpreting ET data and Bayley scales, a widely used assessment tool, is challenging in the ASD assessment of children, yet both must be clearly understood to support better analytic tasks in ASD screening. This study addresses this gap by employing linguistic summarization techniques to generate easily understandable summaries from raw ET data and Bayley scales. By integrating ET data and Bayley scores, the study aims to improve the differentiation of children with ASD from typically developing (TD) children. Notably, this research represents one of the pioneering efforts to linguistically summarize ET data alongside Bayley scales, presenting comparative results between children with ASD and TD children. Through linguistic summarization, this study facilitates the creation of simple, natural language statements, offering a first and unique approach to enhance ASD screening and contribute to our understanding of neurodevelopmental disorders.
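A linguistic summary of the kind described here is commonly expressed as a quantified statement such as "Most children in group G have a low value of attribute A", scored by a truth degree. The sketch below uses the classic fuzzy-quantifier formulation (truth = quantifier applied to the mean membership), which is an assumption and may differ from the authors' exact method. The membership breakpoints, the attribute (share of fixation time on social stimuli), and all data values are hypothetical.

```python
import numpy as np


def mu_low(x, full: float = 0.2, zero: float = 0.5) -> np.ndarray:
    """Membership in 'low': 1 at or below `full`, 0 at or above `zero`, linear between."""
    return np.clip((zero - np.asarray(x, dtype=float)) / (zero - full), 0.0, 1.0)


def mu_most(p: float) -> float:
    """Fuzzy quantifier 'most': 0 up to 30% of cases, 1 from 80%, linear between."""
    return float(np.clip((p - 0.3) / 0.5, 0.0, 1.0))


def truth_degree(values, summarizer, quantifier) -> float:
    """Truth of 'Q objects are S': quantifier applied to the mean membership in S."""
    return quantifier(float(np.mean(summarizer(values))))


# Hypothetical share of fixation time spent on social stimuli, per child.
asd_group = [0.10, 0.15, 0.25, 0.30, 0.60]
td_group = [0.55, 0.70, 0.45, 0.65, 0.80]

t_asd = truth_degree(asd_group, mu_low, mu_most)  # ≈ 0.8
t_td = truth_degree(td_group, mu_low, mu_most)    # 0.0

print(f"'Most children with ASD have low social fixation' holds to degree {t_asd:.2f}")
print(f"'Most TD children have low social fixation' holds to degree {t_td:.2f}")
```

Comparing truth degrees of the same sentence across groups is what makes such summaries readable by clinicians without inspecting the raw eye-tracking records.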
Affiliation(s)
- Demet Öztürk
- Department of Industrial Engineering, Gazi University, Ankara, Turkey
- Sena Aydoğan
- Department of Industrial Engineering, Gazi University, Ankara, Turkey
- İbrahim Kök
- Department of Computer Engineering, Pamukkale University, Denizli, Turkey
- Işık Akın Bülbül
- Department of Special Education, Gazi University, Ankara, Turkey
- Selda Özdemir
- Department of Special Education, Hacettepe University, Ankara, Turkey
- Suat Özdemir
- Department of Computer Engineering, Hacettepe University, Ankara, Turkey
- Diyar Akay
- Department of Industrial Engineering, Hacettepe University, Ankara, Turkey
6
Banos O, Comas-González Z, Medina J, Polo-Rodríguez A, Gil D, Peral J, Amador S, Villalonga C. Sensing technologies and machine learning methods for emotion recognition in autism: Systematic review. Int J Med Inform 2024; 187:105469. [PMID: 38723429 DOI: 10.1016/j.ijmedinf.2024.105469] [Received: 11/08/2023] [Revised: 04/05/2024] [Accepted: 04/28/2024] [Indexed: 05/23/2024]
Abstract
BACKGROUND Human Emotion Recognition (HER) has been a popular field of study in recent years. Despite the great progress made so far, relatively little attention has been paid to the use of HER in autism. People with autism are known to face problems with daily social communication and the prototypical interpretation of emotional responses, which are most frequently exerted via facial expressions. This poses significant practical challenges to the application of regular HER systems, which are normally developed for and by neurotypical people. OBJECTIVE This study reviews the literature on the use of HER systems in autism, particularly with respect to sensing technologies and machine learning methods, so as to identify existing barriers and possible future directions. METHODS We conducted a systematic review of articles published between January 2011 and June 2023 according to the 2020 PRISMA guidelines. Manuscripts were identified by searching the Web of Science and Scopus databases. Manuscripts were included when they related to emotion recognition, used sensors and machine learning techniques, and involved children, young people, or adults with autism. RESULTS The search yielded 346 articles. A total of 65 publications met the eligibility criteria and were included in the review. CONCLUSIONS Studies predominantly used facial expression techniques as the emotion recognition method. Consequently, video cameras were the most widely used devices across studies, although a growing trend in the use of physiological sensors has been observed recently. Happiness, sadness, anger, fear, disgust, and surprise were most frequently addressed. Classical supervised machine learning techniques were primarily used at the expense of unsupervised approaches or more recent deep learning models. Studies focused on autism in a broad sense but limited efforts have been directed towards more specific disorders of the spectrum. 
Privacy or security issues were seldom addressed, and if so, at a rather insufficient level of detail.
Affiliation(s)
- Oresti Banos
- Department of Computer Engineering, Automation and Robotics, University of Granada, Granada, Spain.
- Zhoe Comas-González
- Department of Computer Engineering, Automation and Robotics, University of Granada, Granada, Spain; Department of Computer Science and Electronics, Universidad de la Costa, Barranquilla, Colombia
- Javier Medina
- Department of Computer Engineering, Automation and Robotics, University of Granada, Granada, Spain
- Aurora Polo-Rodríguez
- Department of Computer Engineering, Automation and Robotics, University of Granada, Granada, Spain; Department of Computer Science, University of Jaén, Jaén, Spain
- David Gil
- Department of Computer Technology and Computation, University of Alicante, Alicante, Spain
- Jesús Peral
- Department of Software and Computing Systems, University of Alicante, Alicante, Spain.
- Sandra Amador
- Department of Computer Technology and Computation, University of Alicante, Alicante, Spain
- Claudia Villalonga
- Department of Computer Engineering, Automation and Robotics, University of Granada, Granada, Spain
7
Sha M, Alqahtani A, Alsubai S, Dutta AK. Modified Meta Heuristic BAT with ML Classifiers for Detection of Autism Spectrum Disorder. Biomolecules 2023; 14:48. [PMID: 38254648 PMCID: PMC10813510 DOI: 10.3390/biom14010048] [Received: 11/20/2023] [Revised: 12/22/2023] [Accepted: 12/27/2023] [Indexed: 01/24/2024] Open
Abstract
ASD (autism spectrum disorder) is a complex developmental and neurological disorder that impacts the social life of the affected person by disturbing their capability for interaction and communication. As it is a behavioural disorder, early treatment will improve the quality of life of ASD patients. Traditional screening is carried out with behavioural assessment through trained physicians, which is expensive and time-consuming. To resolve the issue, several conventional methods strive to achieve an effective ASD identification system, but are limited in their handling of large data sets, accuracy, and speed. Therefore, the proposed identification system employed the MBA (modified bat) algorithm based on ANN (artificial neural networks), modified ANN (modified artificial neural networks), DT (decision tree), and KNN (k-nearest neighbours) for the classification of ASD in children and adolescents. The BA (bat algorithm) is utilised for its automatic zooming capability, which improves the system's efficacy in finding solutions in the identification system. Although BA is effective in identification, it still has certain drawbacks, such as limited speed and accuracy and a tendency to fall into local extrema. Therefore, the proposed identification system modifies the BA optimisation with random perturbation of trends and optimal orientation. The dataset utilised in the model is the Q-chat-10 dataset, which contains data for four age groups: toddlers, children, adolescents, and adults. To analyse the quality of the dataset, dataset evaluation mechanisms, such as the Chi-Squared Statistic and p-value, are used. The evaluation indicates how the dataset relates to the proposed model. Further, the performance of the proposed detection system is examined with certain performance metrics to calculate its efficiency. 
The outcome revealed that the modified ANN classifier model attained an accuracy of 1.00, ensuring improved performance when compared with other state-of-the-art methods. Thus, the proposed model was intended to assist physicians and researchers in enhancing the diagnosis of ASD to improve the standard of life of ASD patients.
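For readers unfamiliar with the bat algorithm that this work modifies, the sketch below implements the standard, unmodified BA (frequency-tuned velocity updates, a local random walk around the best bat, and loudness and pulse-rate schedules) on a toy objective. It is not the authors' MBA: the "random perturbation of trends and optimal orientation" modification is not reproduced, and the parameter values, bounds, and the sphere objective are assumptions chosen for illustration.

```python
import numpy as np


def bat_algorithm(objective, dim, n_bats=20, n_iter=100, bounds=(-5.0, 5.0),
                  f_range=(0.0, 2.0), alpha=0.9, gamma=0.9, r0=0.5, seed=0):
    """Minimize `objective` with the standard bat algorithm; returns best point,
    best value, and the best-so-far value after each iteration."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    x = rng.uniform(lo, hi, size=(n_bats, dim))   # bat positions
    v = np.zeros((n_bats, dim))                   # bat velocities
    loudness = np.ones(n_bats)                    # A_i, decays as bats improve
    pulse = np.zeros(n_bats)                      # r_i, grows towards r0
    fitness = np.array([float(objective(p)) for p in x])
    best_idx = int(np.argmin(fitness))
    best_x, best_val = x[best_idx].copy(), float(fitness[best_idx])
    history = [best_val]

    for t in range(1, n_iter + 1):
        for i in range(n_bats):
            # Global move: random frequency scales the pull relative to the best bat.
            freq = f_range[0] + (f_range[1] - f_range[0]) * rng.random()
            v[i] += (x[i] - best_x) * freq
            cand = x[i] + v[i]
            # Local move: random walk around the best bat, scaled by mean loudness.
            if rng.random() > pulse[i]:
                cand = best_x + rng.uniform(-1.0, 1.0, dim) * loudness.mean()
            cand = np.clip(cand, lo, hi)
            cand_val = float(objective(cand))
            # Accept improvements with probability tied to the bat's loudness.
            if cand_val <= fitness[i] and rng.random() < loudness[i]:
                x[i], fitness[i] = cand, cand_val
                loudness[i] *= alpha
                pulse[i] = r0 * (1.0 - np.exp(-gamma * t))
            if cand_val < best_val:
                best_x, best_val = cand.copy(), cand_val
        history.append(best_val)
    return best_x, best_val, history


def sphere(p):
    """Toy objective with its minimum of 0 at the origin."""
    return float(np.sum(p ** 2))


best_x, best_val, history = bat_algorithm(sphere, dim=3)
```

In a classifier-tuning setting, `objective` would instead return a validation error for a candidate parameter vector, for example ANN weights or hyperparameters; the abstract does not specify which quantities the authors optimize.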
Affiliation(s)
- Mohemmed Sha
- Department of Software Engineering, College of Computer Engineering and Sciences, Prince Sattam Bin Abdulaziz University, Al-Kharj 16278, Saudi Arabia
- Abdullah Alqahtani
- Department of Software Engineering, College of Computer Engineering and Sciences, Prince Sattam Bin Abdulaziz University, Al-Kharj 16278, Saudi Arabia
- Shtwai Alsubai
- Department of Computer Science, College of Computer Engineering and Sciences, Prince Sattam Bin Abdulaziz University, Al-Kharj 16278, Saudi Arabia
- Ashit Kumar Dutta
- Department of Computer Science and Information Systems, College of Applied Sciences, Almaarefa University, Riyadh 11597, Saudi Arabia