1. Camarena F, Gonzalez-Mendoza M, Chang L. Knowledge Distillation in Video-Based Human Action Recognition: An Intuitive Approach to Efficient and Flexible Model Training. J Imaging 2024; 10:85. [PMID: 38667983] [PMCID: PMC11051277] [DOI: 10.3390/jimaging10040085]
Abstract
Training a model to recognize human actions in videos is computationally intensive. While modern strategies employ transfer learning to make the process more efficient, they still face challenges regarding flexibility and efficiency: existing solutions are limited in functionality and rely heavily on pretrained architectures, which can restrict their applicability to diverse scenarios. Our work explores knowledge distillation (KD) for enhancing the training of self-supervised video models in three aspects: improving classification accuracy, accelerating model convergence, and increasing model flexibility under regular and limited-data scenarios. We tested our method on the UCF101 dataset using different data proportions: 100%, 50%, 25%, and 2%. We found that using knowledge distillation to guide the model's training outperforms traditional training: classification accuracy is preserved while the time needed for the model to converge is reduced, both in standard settings and in a data-scarce environment. Additionally, knowledge distillation enables cross-architecture flexibility, allowing model customization for applications ranging from resource-limited to high-performance scenarios.
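The paper's exact distillation setup is not reproduced in this listing, but the core mechanism it relies on, training a student against a teacher's temperature-softened predictions alongside the hard labels, can be sketched as follows. All names, the temperature, and the mixing weight are illustrative, not taken from the paper:

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-softened softmax over the last axis."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """Weighted sum of a soft KL term against the teacher's softened
    distribution and a hard cross-entropy term against ground-truth labels."""
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    # KL(teacher || student), scaled by T^2 to keep gradient magnitudes comparable
    soft = np.sum(p_teacher * (np.log(p_teacher + 1e-12)
                               - np.log(p_student + 1e-12)), axis=-1) * T * T
    hard = -np.log(softmax(student_logits)[np.arange(len(labels)), labels] + 1e-12)
    return np.mean(alpha * soft + (1 - alpha) * hard)

# Toy example: 2 video clips, 3 action classes
teacher = np.array([[4.0, 1.0, 0.5], [0.2, 3.5, 0.1]])
student = np.array([[2.0, 1.5, 0.3], [0.5, 2.0, 0.4]])
loss = distillation_loss(student, teacher, labels=np.array([0, 1]))
print(loss)
```

The loss shrinks as the student's logits approach the teacher's, which is what lets the teacher guide convergence.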
Affiliation(s)
- Fernando Camarena
- School of Engineering and Science, Tecnologico de Monterrey, Nuevo León 64700, Mexico
2. Argade D, Khairnar V, Vora D, Patil S, Kotecha K, Alfarhood S. Multimodal Abstractive Summarization using bidirectional encoder representations from transformers with attention mechanism. Heliyon 2024; 10:e26162. [PMID: 38420442] [PMCID: PMC10900395] [DOI: 10.1016/j.heliyon.2024.e26162]
Abstract
In recent decades, abstractive text summarization using multimodal input has attracted many researchers because it can gather information from various sources to create a concise summary. However, existing multimodal summarization methods produce summaries only for short videos and give poor results on lengthy ones. To address these issues, this research presents Multimodal Abstractive Summarization using Bidirectional Encoder Representations from Transformers (MAS-BERT) with an attention mechanism. The purpose of video summarization is to speed up searching through a large collection of videos, so that users can quickly decide whether a video is relevant by reading its summary. Initially, the data is obtained from the publicly available How2 dataset. The textual data, embedded in the embedding layer, is encoded with a bidirectional Gated Recurrent Unit (Bi-GRU) encoder, while the audio and video features are encoded with a Long Short-Term Memory (LSTM) encoder. A BERT-based attention mechanism then combines the modalities, and finally a Bi-GRU-based decoder summarizes the fused multimodal representation. Experimental results show that the proposed MAS-BERT achieves a Rouge-1 score of 60.2, whereas the existing Decoder-only Multimodal Transformer (D-MmT) and the Factorized Multimodal Transformer based Decoder Only Language model (FLORAL) achieve 49.58 and 56.89, respectively. Our work provides users with better contextual information and a better experience, and could help video-sharing platforms retain customers by letting users find relevant videos from their summaries.
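The MAS-BERT architecture itself is not available here, but the fusion idea the abstract describes, letting one modality's encoded features attend over the others before decoding, reduces to scaled dot-product attention. A minimal sketch, with all dimensions and feature tensors as synthetic stand-ins:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Standard attention: softmax(Q K^T / sqrt(d)) V."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

rng = np.random.default_rng(0)
d = 8
text_feats = rng.normal(size=(5, d))   # stand-in for Bi-GRU outputs (5 tokens)
audio_feats = rng.normal(size=(3, d))  # stand-in for LSTM audio outputs
video_feats = rng.normal(size=(4, d))  # stand-in for LSTM video outputs

# Text queries attend over the concatenated audio/video memory, producing a
# fused per-token representation a decoder could summarize from.
memory = np.concatenate([audio_feats, video_feats], axis=0)
fused, weights = scaled_dot_product_attention(text_feats, memory, memory)
print(fused.shape, weights.shape)
```

Each row of `weights` sums to 1, i.e. each text token distributes its attention across the seven audio/video timesteps.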
Affiliation(s)
- Dakshata Argade
- Terna Engineering College, Nerul, Navi Mumbai, 400706, India
- Deepali Vora
- Symbiosis Institute of Technology, Pune Campus, Symbiosis International (Deemed University), Pune, 412115, India
- Shruti Patil
- Symbiosis Institute of Technology, Pune Campus, Symbiosis International (Deemed University), Pune, 412115, India
- Symbiosis Centre for Applied Artificial Intelligence (SCAAI), Symbiosis Institute of Technology Pune Campus, Symbiosis International (Deemed University) (SIU), Lavale, Pune, 412115, India
- Ketan Kotecha
- Symbiosis Centre for Applied Artificial Intelligence (SCAAI), Symbiosis Institute of Technology Pune Campus, Symbiosis International (Deemed University) (SIU), Lavale, Pune, 412115, India
- Sultan Alfarhood
- Department of Computer Science, College of Computer and Information Sciences, King Saud University, P.O.Box 51178, Riyadh, 11543, Saudi Arabia
3. Paramasivam K, Sindha MMR, Balakrishnan SB. KNN-Based Machine Learning Classifier Used on Deep Learned Spatial Motion Features for Human Action Recognition. Entropy (Basel) 2023; 25:844. [PMID: 37372188] [DOI: 10.3390/e25060844]
Abstract
Human action recognition (HAR) is an essential process in surveillance video analysis, used to understand people's behavior and ensure safety. Most existing methods for HAR use computationally heavy networks such as 3D CNNs and two-stream networks. To alleviate the challenges in implementing and training 3D deep learning networks, which have more parameters, a customized lightweight directed-acyclic-graph-based residual 2D CNN with fewer parameters was designed from scratch and named HARNet. A novel pipeline for constructing spatial motion data from raw video input is presented for the latent representation learning of human actions. The constructed input is fed to the network for simultaneous operation over spatial and motion information in a single stream, and the latent representation learned at the fully connected layer is extracted and fed to conventional machine learning classifiers for action recognition. The proposed work was empirically verified, and the experimental results were compared with those of existing methods. The results show that the proposed method outperforms state-of-the-art (SOTA) methods with improvements of 2.75% on UCF101, 10.94% on HMDB51, and 0.18% on the KTH dataset.
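HARNet itself is not available in this listing, but the final step the abstract describes, feeding fully-connected-layer features to a conventional k-NN classifier, reduces to the following. The feature vectors here are synthetic stand-ins for the learned latent representations:

```python
import numpy as np

def knn_predict(train_X, train_y, query, k=3):
    """Classify a query feature vector by majority vote among its
    k nearest training features (Euclidean distance)."""
    dists = np.linalg.norm(train_X - query, axis=1)
    nearest = np.argsort(dists)[:k]
    votes = np.bincount(train_y[nearest])
    return int(np.argmax(votes))

rng = np.random.default_rng(42)
# Synthetic "latent representations": two well-separated action clusters
walk = rng.normal(loc=0.0, scale=0.5, size=(20, 16))
jump = rng.normal(loc=3.0, scale=0.5, size=(20, 16))
X = np.vstack([walk, jump])
y = np.array([0] * 20 + [1] * 20)

query = rng.normal(loc=3.0, scale=0.5, size=16)  # feature near the "jump" cluster
print(knn_predict(X, y, query, k=5))
```

The point of the paper's design is that when the CNN produces well-separated latent features, even this simple classifier recovers the action label reliably.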
Affiliation(s)
- Kalaivani Paramasivam
- Department of Electronics and Communication Engineering, Government College of Engineering, Bodinayakanur 625582, Tamilnadu, India
- Mohamed Mansoor Roomi Sindha
- Department of Electronics and Communication Engineering, Thiagarajar College of Engineering, Madurai 625015, Tamilnadu, India
- Sathya Bama Balakrishnan
- Department of Electronics and Communication Engineering, Thiagarajar College of Engineering, Madurai 625015, Tamilnadu, India
4. Ottakath N, Al-Maadeed S. Vehicle Instance Segmentation Polygonal Dataset for a Private Surveillance System. Sensors (Basel) 2023; 23:3642. [PMID: 37050701] [PMCID: PMC10098633] [DOI: 10.3390/s23073642]
Abstract
Vehicle identification and re-identification are essential tools for traffic surveillance. However, with cameras at every street corner, there is a growing need for privacy-aware surveillance. Automated surveillance can be achieved through computer vision tasks such as vehicle segmentation, classification of the vehicle's make and model, and license plate detection. To obtain a unique representation of every vehicle on the road with only the region of interest extracted, instance segmentation is applied. With the frontal part of the vehicle segmented for privacy, the vehicle make is identified along with the license plate. To achieve this, a dataset was annotated with a polygonal bounding box of the frontal region and license plate localization. The state-of-the-art method Mask R-CNN was utilized to identify the best-performing model. Further, data augmentation using multiple techniques was evaluated for better generalization of the dataset. The results showed improved classification and a high mAP compared to previous approaches on the same dataset: a classification accuracy of 99.2% was obtained, and segmentation achieved a high mAP of 99.67%. Of the data augmentation approaches employed to balance and generalize the dataset, the mosaic-tiled approach produced the highest accuracy.
5. Atif O, Lee J, Park D, Chung Y. Behavior-Based Video Summarization System for Dog Health and Welfare Monitoring. Sensors (Basel) 2023; 23:2892. [PMID: 36991606] [PMCID: PMC10054391] [DOI: 10.3390/s23062892]
Abstract
The popularity of dogs has been increasing owing to factors such as the physical and mental health benefits associated with raising them. While owners care about their dogs' health and welfare, it is difficult for them to assess these, and frequent veterinary checkups represent a growing financial burden. In this study, we propose a behavior-based video summarization and visualization system for monitoring a dog's behavioral patterns to help assess its health and welfare. The system proceeds in four modules: (1) a video data collection and preprocessing module; (2) an object detection-based module for retrieving image sequences where the dog is alone and cropping them to reduce background noise; (3) a dog behavior recognition module using two-stream EfficientNetV2 to extract appearance and motion features from the cropped images and their respective optical flow, followed by a long short-term memory (LSTM) model to recognize the dog's behaviors; and (4) a summarization and visualization module to provide effective visual summaries of the dog's location and behavior information to help assess and understand its health and welfare. The experimental results show that the system achieved an average F1 score of 0.955 for behavior recognition, with an execution time allowing real-time processing, while the summarization and visualization results demonstrate how the system can help owners assess and understand their dog's health and welfare.
Affiliation(s)
- Othmane Atif
- Department of Computer and Information Science, Korea University, Sejong City 30019, Republic of Korea
- Jonguk Lee
- Department of Computer Convergence Software, Sejong Campus, Korea University, Sejong City 30019, Republic of Korea
- Daihee Park
- Department of Computer Convergence Software, Sejong Campus, Korea University, Sejong City 30019, Republic of Korea
- Yongwha Chung
- Department of Computer Convergence Software, Sejong Campus, Korea University, Sejong City 30019, Republic of Korea
6. Yue R, Tian Z, Du S. Action recognition based on RGB and skeleton data sets: A survey. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.09.071]
7. Robust appearance modeling for object detection and tracking: a survey of deep learning approaches. Prog Artif Intell 2022. [DOI: 10.1007/s13748-022-00290-6]
8. Wang F, Chen J, Xie Z, Ai Y, Zhang W. Local sharpness failure detection of camera module lens based on image blur assessment. Appl Intell 2022. [DOI: 10.1007/s10489-022-03948-9]
9. Arshad MH, Bilal M, Gani A. Human Activity Recognition: Review, Taxonomy and Open Challenges. Sensors (Basel) 2022; 22:6463. [PMID: 36080922] [PMCID: PMC9460866] [DOI: 10.3390/s22176463]
Abstract
Nowadays, Human Activity Recognition (HAR) is widely used in a variety of domains, and vision- and sensor-based data enable cutting-edge technologies to detect, recognize, and monitor human activities. Several reviews and surveys on HAR have already been published, but due to the constantly growing literature, their status needed to be updated. Hence, this review aims to provide insights on the state of the HAR literature published since 2018. The ninety-five articles reviewed in this study are classified to highlight application areas, data sources, techniques, and open research challenges in HAR. The majority of existing research appears to have concentrated on activities of daily living, followed by individual and group-based user activities. However, there is little literature on detecting real-time activities such as suspicious activity, surveillance, and healthcare. A major portion of existing studies used Closed-Circuit Television (CCTV) videos and mobile sensor data. Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM), and Support Vector Machines (SVM) are the most prominent techniques utilized for HAR in the literature reviewed. Lastly, the limitations and open challenges that need to be addressed are discussed.
Affiliation(s)
- Muhammad Haseeb Arshad
- Department of Computer Science, National University of Computer and Emerging Sciences, Chiniot-Faisalabad Campus, Chiniot 35400, Pakistan
- Muhammad Bilal
- Department of Software Engineering, National University of Computer and Emerging Sciences, Chiniot-Faisalabad Campus, Chiniot 35400, Pakistan
- Abdullah Gani
- Faculty of Computing and Informatics, University Malaysia Sabah, Kota Kinabalu 88400, Sabah, Malaysia
10. Zhou D, Chen G, Xu F. Application of Deep Learning Technology in Strength Training of Football Players and Field Line Detection of Football Robots. Front Neurorobot 2022; 16:867028. [PMID: 35845757] [PMCID: PMC9278879] [DOI: 10.3389/fnbot.2022.867028]
Abstract
The purpose of the study is to improve the performance of intelligent football training. Based on deep learning (DL), the training of football players and detection by football robots are analyzed. First, the research status of football player training and football robots is introduced, and the basic structure of the neuron model, convolutional neural networks (CNN), and the mainstream DL frameworks are expounded. Second, combined with the spatial stream network, a CNN-based action recognition system is constructed in the context of artificial intelligence (AI). Finally, for the football robot, a field line detection model based on a fully convolutional network (FCN) is proposed, and the applicability of the system is evaluated. The results demonstrate that the dual-stream network has the best recognition effect, reaching 92.8%. The recognition rate of the time-stream network is lower, with a maximum of 88%, and the spatial stream network has the lowest recognition rate at 86.5%. All four algorithms process the dataset more effectively than the ordinary video set. Among the comparison methods, the time-segmented dual-stream fusion network has the highest recognition rate, second only to the designed network; the basic dual-stream network reaches 88.6%, and the 3D CNN has the lowest rate at 86.2%. Under the intelligent training system, the recognition accuracies for jumping, kicking, grabbing, and starting actions are 97.6%, 94.5%, 92.5%, and 89.8%, respectively; the recognition accuracy for the passing action is 91.3%, and the maximum improvement rate from intelligent training is 25.7%. Both the pixel accuracy and the mean intersection over union (MIoU) of the improved field line detection model are increased by 5%. Intelligent training systems and field line detection by football robots are thus feasible. The research provides a reference for the development of AI in sports training.
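The field line detection results above are reported as pixel accuracy and mean intersection over union (MIoU). These metrics are standard and can be computed from predicted and ground-truth label maps as follows (the tiny masks are illustrative, not from the paper):

```python
import numpy as np

def pixel_accuracy(pred, gt):
    """Fraction of pixels whose predicted class matches the ground truth."""
    return float((pred == gt).mean())

def mean_iou(pred, gt, num_classes):
    """Per-class IoU = |pred ∩ gt| / |pred ∪ gt|, averaged over classes
    that appear in either map."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union > 0:
            ious.append(inter / union)
    return float(np.mean(ious))

# Toy 4x4 segmentation: class 1 = field line, class 0 = background
gt = np.array([[0, 0, 1, 1],
               [0, 0, 1, 1],
               [0, 0, 0, 0],
               [0, 0, 0, 0]])
pred = np.array([[0, 0, 1, 1],
                 [0, 0, 0, 1],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]])
print(pixel_accuracy(pred, gt), mean_iou(pred, gt, num_classes=2))
```

MIoU penalizes missed thin structures like field lines much more sharply than pixel accuracy does, which is why segmentation papers report both.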
Affiliation(s)
- Daliang Zhou
- School of PE, Nanjing Xiaozhuang University, Nanjing, China
- Gang Chen
- School of PE, Nanjing Xiaozhuang University, Nanjing, China
- Fei Xu
- School of Physical Education, Hangzhou Normal University, Hangzhou, China
11. The Design of the Lightweight Smart Home System and Interaction Experience of Products for Middle-Aged and Elderly Users in Smart Cities. Comput Intell Neurosci 2022; 2022:1279351. [PMID: 35755765] [PMCID: PMC9217567] [DOI: 10.1155/2022/1279351]
Abstract
The research aims to improve the comfort and safety of the smart home by adding a motion recognition algorithm to the smart home system. First, the research status of motion recognition is introduced. Second, based on the requirements of the smart home system, a smart home system is designed for middle-aged and elderly users. Its software includes an intelligent control subsystem, an intelligent monitoring subsystem, and an intelligent protection subsystem. Finally, to increase the security of the smart home, the intelligent monitoring subsystem is improved and an intelligent security subsystem based on a small-scale motion detection algorithm is proposed. The system uses three three-dimensional (3D) convolutional neural networks (CNNs) to extract three kinds of image features, so that the information in the video can be fully extracted. The performance of the proposed intelligent security subsystem is compared and analyzed. The results show that the accuracy of the system is 94.64% on the University of Central Florida (UCF101) dataset and 90.11% on the HMDB51 dataset, comparable to other advanced algorithms. Detecting, through motion recognition technology, whether dangers such as falls occur inside or outside the home has very important application significance for protecting people's safety, life, and health.
12. Xing Y, Zhu J, Li Y, Huang J, Song J. An improved spatial temporal graph convolutional network for robust skeleton-based action recognition. Appl Intell 2022. [DOI: 10.1007/s10489-022-03589-y]
13. Teng Y, Song C, Wu B. Toward jointly understanding social relationships and characters from videos. Appl Intell 2022. [DOI: 10.1007/s10489-021-02738-z]
14. Lee I, Kim D, Wee D, Lee S. An Efficient Human Instance-Guided Framework for Video Action Recognition. Sensors (Basel) 2021; 21:8309. [PMID: 34960404] [PMCID: PMC8709376] [DOI: 10.3390/s21248309]
Abstract
In recent years, human action recognition has been studied by many computer vision researchers. Recent studies have attempted to use two-stream networks based on appearance and motion features, but most of these approaches focused on clip-level video action recognition. In contrast to traditional methods, which generally use entire images, we propose a new human instance-level video action recognition framework. In this framework, we represent instance-level features using human boxes and keypoints, and our action region features are used as the inputs of the temporal action head network, which makes our framework more discriminative. We also propose novel temporal action head networks consisting of various modules that reflect various temporal dynamics well. In the experiments, the proposed models achieve performance comparable with state-of-the-art approaches on two challenging datasets. Furthermore, we evaluate the proposed features and networks to verify their effectiveness. Finally, we analyze the confusion matrix and visualize the recognized actions at the human instance level when several people are present.
Affiliation(s)
- Inwoong Lee
- Department of Electrical and Electronic Engineering, Yonsei University, Seoul 03722, Korea
- Clova AI Research, NAVER Corporation, Seongnam 13561, Korea
- Doyoung Kim
- Department of Electrical and Electronic Engineering, Yonsei University, Seoul 03722, Korea
- Dongyoon Wee
- Clova AI Research, NAVER Corporation, Seongnam 13561, Korea
- Sanghoon Lee
- Department of Electrical and Electronic Engineering, Yonsei University, Seoul 03722, Korea
- Department of Radiology, College of Medicine, Yonsei University, Seoul 03722, Korea
15. Al-Ali A, Elharrouss O, Qidwai U, Al-Maaddeed S. ANFIS-Net for automatic detection of COVID-19. Sci Rep 2021; 11:17318. [PMID: 34453082] [PMCID: PMC8397755] [DOI: 10.1038/s41598-021-96601-3]
Abstract
Infectious diseases are among the leading causes of mortality across the globe, and the latest, coronavirus (COVID-19), has become the most recent challenging issue. The extreme nature of this infectious virus and its ability to spread without control have made it mandatory to find an efficient auto-diagnosis system to assist people who work in contact with patients. As fuzzy logic is considered a powerful technique for modeling vagueness in medical practice, an Adaptive Neuro-Fuzzy Inference System (ANFIS) is proposed in this paper as the key component for automatic COVID-19 detection from chest X-ray images, based on characteristics derived by texture analysis using the gray level co-occurrence matrix (GLCM) technique. Unlike existing methods, especially deep learning-based approaches, the proposed ANFIS-based method can work on small datasets. The results showed promising accuracy: compared with other state-of-the-art techniques, the proposed method gives the same performance as deep learning models with complex architectures using many backbones.
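The feature-extraction step the abstract names, GLCM texture analysis, is well defined independently of the paper. A minimal sketch for one offset direction, with two classic Haralick-style statistics (contrast and energy); the tiny image and the gray-level count are illustrative:

```python
import numpy as np

def glcm(image, levels, dx=1, dy=0):
    """Normalized gray level co-occurrence matrix for pixel pairs
    at offset (dx, dy)."""
    M = np.zeros((levels, levels))
    h, w = image.shape
    for y in range(h - dy):
        for x in range(w - dx):
            M[image[y, x], image[y + dy, x + dx]] += 1
    return M / M.sum()

def glcm_features(P):
    """Contrast and energy computed over a normalized GLCM."""
    i, j = np.indices(P.shape)
    contrast = float(((i - j) ** 2 * P).sum())  # weighted by gray-level distance
    energy = float((P ** 2).sum())              # uniformity of the texture
    return contrast, energy

# Toy 4x4 image quantized to 4 gray levels
img = np.array([[0, 0, 1, 1],
                [0, 0, 1, 1],
                [0, 2, 2, 2],
                [2, 2, 3, 3]])
P = glcm(img, levels=4)
print(glcm_features(P))
```

In practice several offsets and more statistics are concatenated into a feature vector, which is what a classifier like ANFIS would then consume.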
Affiliation(s)
- Afnan Al-Ali
- Department of Computer Science and Engineering, Qatar University, Doha, Qatar.
- Omar Elharrouss
- Department of Computer Science and Engineering, Qatar University, Doha, Qatar
- Uvais Qidwai
- Department of Computer Science and Engineering, Qatar University, Doha, Qatar
- Somaya Al-Maaddeed
- Department of Computer Science and Engineering, Qatar University, Doha, Qatar
16. Applications, databases and open computer vision research from drone videos and images: a survey. Artif Intell Rev 2021. [DOI: 10.1007/s10462-020-09943-1]