1. Kong F, Wang X, Xiang J, Yang S, Wang X, Yue M, Zhang J, Zhao J, Han X, Dong Y, Zhu B, Wang F, Liu Y. Federated attention consistent learning models for prostate cancer diagnosis and Gleason grading. Comput Struct Biotechnol J 2024; 23:1439-1449. [PMID: 38623561; PMCID: PMC11016961; DOI: 10.1016/j.csbj.2024.03.028] [Received: 01/14/2024; Revised: 03/29/2024; Accepted: 03/29/2024]
Abstract
Artificial intelligence (AI) holds significant promise in transforming medical imaging, enhancing diagnostics, and refining treatment strategies. However, the reliance on extensive multicenter datasets for training AI models poses challenges due to privacy concerns. Federated learning provides a solution by facilitating collaborative model training across multiple centers without sharing raw data. This study introduces a federated attention-consistent learning (FACL) framework to address challenges associated with large-scale pathological images and data heterogeneity. FACL enhances model generalization by maximizing attention consistency between local clients and the server model. To ensure privacy and validate robustness, we incorporated differential privacy by introducing noise during parameter transfer. We assessed the effectiveness of FACL in cancer diagnosis and Gleason grading tasks using 19,461 whole-slide images of prostate cancer from multiple centers. In the diagnosis task, FACL achieved an area under the curve (AUC) of 0.9718, outperforming seven centers with an average AUC of 0.9499 when categories are relatively balanced. For the Gleason grading task, FACL attained a Kappa score of 0.8463, surpassing the average Kappa score of 0.7379 from six centers. In conclusion, FACL offers a robust, accurate, and cost-effective AI training model for prostate cancer pathology while maintaining effective data safeguards.
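The attention-consistency idea at the core of FACL can be sketched as a loss term that penalizes disagreement between a local client's attention map and the server model's attention map for the same slide. The cosine-distance form and map shapes below are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def attention_consistency_loss(client_attn, server_attn, eps=1e-8):
    """Cosine distance between flattened attention maps (assumed form).

    client_attn, server_attn: (H, W) attention maps from the local
    client model and the aggregated server model for the same input.
    Returns a scalar in [0, 2]; 0 means perfectly consistent maps.
    """
    a = client_attn.ravel()
    b = server_attn.ravel()
    cos = a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps)
    return 1.0 - cos

rng = np.random.default_rng(0)
attn = rng.random((16, 16))
identical = attention_consistency_loss(attn, attn)            # near 0
different = attention_consistency_loss(attn, rng.random((16, 16)))
```

Minimizing such a term during federated rounds would push clients toward attending to the same regions as the global model; the differential-privacy step the abstract mentions would additionally add noise to parameters before transfer.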
Affiliation(s)
- Fei Kong
- Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, China
- Xiyue Wang
- College of Biomedical Engineering, Sichuan University, Chengdu, 610065, China
- Sen Yang
- AI Lab, Tencent, Shenzhen, 518057, China
- Xinran Wang
- Department of Pathology, The Fourth Hospital of Hebei Medical University, Shijiazhuang, 050035, China
- Meng Yue
- Department of Pathology, The Fourth Hospital of Hebei Medical University, Shijiazhuang, 050035, China
- Jun Zhang
- AI Lab, Tencent, Shenzhen, 518057, China
- Junhan Zhao
- Massachusetts General Hospital, Boston, MA, 02114, United States
- Harvard T.H. Chan School of Public Health, Boston, MA, 02115, United States
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, 02115, United States
- Xiao Han
- AI Lab, Tencent, Shenzhen, 518057, China
- Yuhan Dong
- Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, China
- Biyue Zhu
- Department of Pharmacy, Children's Hospital of Chongqing Medical University, Chongqing, 400014, China
- Fang Wang
- Department of Pathology, The Affiliated Yantai Yuhuangding Hospital of Qingdao University, Yantai, 264000, China
- Yueping Liu
- Department of Pathology, The Fourth Hospital of Hebei Medical University, Shijiazhuang, 050035, China
2. Tian C, Xiao J, Zhang B, Zuo W, Zhang Y, Lin CW. A self-supervised network for image denoising and watermark removal. Neural Netw 2024; 174:106218. [PMID: 38518709; DOI: 10.1016/j.neunet.2024.106218] [Received: 04/04/2023; Revised: 10/18/2023; Accepted: 02/27/2024]
Abstract
In image watermark removal, popular methods depend on reference watermark-free images to remove watermarks in a supervised way. However, such reference images are difficult to obtain in the real world, and watermarked images often suffer from noise introduced when they are captured by digital devices. To resolve these issues, in this paper we present a self-supervised network for image denoising and watermark removal (SSNet). SSNet uses a parallel network trained in a self-supervised way to remove noise and watermarks. Specifically, each sub-network contains two sub-blocks. The upper sub-network uses its first sub-block to remove noise, following the noise-to-noise principle, and its second sub-block to remove watermarks according to the distributions of watermarks. To prevent the loss of important information, the lower sub-network simultaneously learns noise and watermarks in a self-supervised way. Moreover, the two sub-networks interact via attention to extract more complementary salient information. The proposed method does not depend on paired images to learn a blind denoising and watermark removal model, which is very meaningful for real applications. It is also more effective than popular image watermark removal methods on public datasets. Code is available at https://github.com/hellloxiaotian/SSNet.
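The noise-to-noise principle that the first sub-block relies on says that a regressor trained against noisy targets converges to the same mean as one trained against clean targets, so paired clean references are unnecessary. A toy demonstration of the principle (not the SSNet architecture):

```python
import numpy as np

rng = np.random.default_rng(42)
clean = 3.0                                        # true pixel value
noisy_targets = clean + rng.normal(0.0, 0.5, size=10000)

# The least-squares fit of a constant predictor to noisy targets is the
# sample mean, which approaches the clean value as samples accumulate,
# so training never needs to see a clean reference.
estimate = noisy_targets.mean()
```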
Affiliation(s)
- Chunwei Tian
- PAMI Research Group, University of Macau, 999078, Macao Special Administrative Region of China
- Jingyu Xiao
- School of Computer Science, Central South University, Changsha, 410083, China
- Bob Zhang
- PAMI Research Group, University of Macau, 999078, Macao Special Administrative Region of China
- Wangmeng Zuo
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin, 150001, China
- Yudong Zhang
- School of Computing and Mathematics, University of Leicester, Leicester, LE1 7RH, UK
- Chia-Wen Lin
- Department of Electrical Engineering and the Institute of Communications Engineering, National Tsing Hua University, Hsinchu 300, Taiwan
3. Li Q, Feng B, Tang X, Yu H, Song H. MuLAN: Multi-level attention-enhanced matching network for few-shot knowledge graph completion. Neural Netw 2024; 174:106222. [PMID: 38442490; DOI: 10.1016/j.neunet.2024.106222] [Received: 11/04/2023; Revised: 01/23/2024; Accepted: 02/27/2024]
Abstract
Recent years have witnessed increasing interest in the few-shot knowledge graph completion due to its potential to augment the coverage of few-shot relations in knowledge graphs. Existing methods often use the one-hop neighbors of the entity to enhance its embedding and match the query instance and support set at the instance level. However, such methods cannot handle inter-neighbor interaction, local entity matching and the varying significance of feature dimensions. To bridge this gap, we propose the Multi-Level Attention-enhanced matching Network (MuLAN) for few-shot knowledge graph completion. In MuLAN, a multi-head self-attention neighbor encoder is designed to capture the inter-neighbor interaction and learn the entity embeddings. Then, entity-level attention and instance-level attention are responsible for matching the query instance and support set from the local and global perspectives, respectively, while feature-level attention is utilized to calculate the weights of the feature dimensions. Furthermore, we design a consistency constraint to ensure the support instance embeddings are close to each other. Extensive experiments based on two well-known datasets (i.e., NELL-One and Wiki-One) demonstrate significant advantages of MuLAN over 11 state-of-the-art competitors. Compared to the best-performing baseline, MuLAN achieves 14.5% higher MRR and 13.3% higher Hits@K on average.
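The neighbor encoder's role, re-embedding an entity as an attention-weighted mixture of its one-hop neighbors, can be sketched in a single-head simplification (MuLAN's encoder is multi-head with learned projections; the parameter-free form below is an illustrative assumption):

```python
import numpy as np

def neighbor_attention(entity, neighbors):
    """Single-head simplification of an attention neighbor encoder.

    entity: (d,) embedding of the target entity, used as the query.
    neighbors: (n, d) embeddings of its one-hop neighbors (keys/values).
    Returns a (d,) neighbor-enhanced embedding.
    """
    d = entity.shape[0]
    scores = neighbors @ entity / np.sqrt(d)        # (n,) similarities
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                        # softmax over neighbors
    return weights @ neighbors                      # weighted sum of values

rng = np.random.default_rng(1)
e = rng.normal(size=8)
nbrs = rng.normal(size=(5, 8))
enhanced = neighbor_attention(e, nbrs)
```

The output is a convex combination of neighbor embeddings, so it stays inside their per-dimension range; entity-, instance-, and feature-level attention in MuLAN reuse the same softmax-weighting pattern at different granularities.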
Affiliation(s)
- Qianyu Li
- School of Software Engineering, South China University of Technology, Guangzhou, China
- Bozheng Feng
- School of Software Engineering, South China University of Technology, Guangzhou, China
- Xiaoli Tang
- School of Computer Science and Engineering, Nanyang Technological University, Singapore
- Han Yu
- School of Computer Science and Engineering, Nanyang Technological University, Singapore
- Hengjie Song
- School of Software Engineering, South China University of Technology, Guangzhou, China
4. Fu K, Li H, Shi X. CTF-former: A novel simplified multi-task learning strategy for simultaneous multivariate chaotic time series prediction. Neural Netw 2024; 174:106234. [PMID: 38521015; DOI: 10.1016/j.neunet.2024.106234] [Received: 08/17/2023; Revised: 02/22/2024; Accepted: 03/11/2024]
Abstract
Multivariate chaotic time series prediction is a challenging task, especially when multiple variables are predicted simultaneously. Multiple related prediction tasks typically require multiple models, but multiple models are difficult to keep synchronized, which makes immediate communication between predicted values challenging. Although multi-task learning can be applied to this problem, the principles for allocating shared and task-specific representations, and the layout options between them, are ambiguous. To address this issue, a novel simplified multi-task learning method is proposed for the precise simultaneous prediction of multiple chaotic time series. The proposed scheme consists of a cross-convolution operator designed to capture variable correlations and sequence correlations, and an attention module designed to capture the information embedded in the sequence structure. In the attention module, a non-linear transformation is implemented with convolution, whose local receptive field complements the global dependency modeling of the attention mechanism. In addition, an attention weight calculation is devised that accounts not only for the synergy of time- and frequency-domain features but also for the fusion of series and channel information. Notably, the scheme follows a purely simplified design principle of multi-task learning by reducing each task-specific network to a single neuron. The precision of the proposed solution and its potential for engineering applications were verified on the Lorenz system and a power-consumption dataset. Compared to the Gated Recurrent Unit, the mean absolute error of the proposed method was reduced by an average of 82.9% on the Lorenz system and 19.83% on power consumption.
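The stated design principle, shrinking each task-specific network to a single neuron on top of a shared representation, can be sketched as follows; the linear form, dimensions, and the three-task setup (e.g. the three Lorenz variables) are assumptions for illustration, not the CTF-former's exact layout:

```python
import numpy as np

rng = np.random.default_rng(7)

shared_repr = rng.normal(size=16)     # output of the shared trunk
n_tasks = 3                           # e.g. the 3 Lorenz state variables

# One neuron per task: a single weight vector and bias each, producing
# one scalar prediction per task from the same shared representation.
W = rng.normal(size=(n_tasks, 16))
b = rng.normal(size=n_tasks)

predictions = W @ shared_repr + b     # one prediction per task
```

Because all tasks read from one trunk in one forward pass, the predicted variables stay synchronized by construction, which is the motivation the abstract gives for the simplification.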
Affiliation(s)
- Ke Fu
- School of Mechanical Engineering & Automation, Northeastern University, Shenyang 110819, China
- He Li
- School of Mechanical Engineering & Automation, Northeastern University, Shenyang 110819, China
- Xiaotian Shi
- School of Mechanical Engineering & Automation, Northeastern University, Shenyang 110819, China
5. Deng R, Cui C, Remedios LW, Bao S, Womick RM, Chiron S, Li J, Roland JT, Lau KS, Liu Q, Wilson KT, Wang Y, Coburn LA, Landman BA, Huo Y. Cross-scale multi-instance learning for pathological image diagnosis. Med Image Anal 2024; 94:103124. [PMID: 38428271; PMCID: PMC11016375; DOI: 10.1016/j.media.2024.103124] [Received: 03/31/2023; Revised: 02/16/2024; Accepted: 02/26/2024]
Abstract
Analyzing high resolution whole slide images (WSIs) with regard to information across multiple scales poses a significant challenge in digital pathology. Multi-instance learning (MIL) is a common solution for working with high resolution images by classifying bags of objects (i.e. sets of smaller image patches). However, such processing is typically performed at a single scale (e.g., 20× magnification) of WSIs, disregarding the vital inter-scale information that is key to diagnoses by human pathologists. In this study, we propose a novel cross-scale MIL algorithm to explicitly aggregate inter-scale relationships into a single MIL network for pathological image diagnosis. The contribution of this paper is three-fold: (1) A novel cross-scale MIL (CS-MIL) algorithm that integrates the multi-scale information and the inter-scale relationships is proposed; (2) A toy dataset with scale-specific morphological features is created and released to examine and visualize differential cross-scale attention; (3) Superior performance on both in-house and public datasets is demonstrated by our simple cross-scale MIL strategy. The official implementation is publicly available at https://github.com/hrlblab/CS-MIL.
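The general shape of attention-based MIL pooling extended across scales can be sketched as below: patches are pooled into a per-scale bag embedding, then the scales themselves are pooled by a second attention step. Sharing one scoring vector across scales is a simplification of CS-MIL's learned cross-scale attention, so treat this as an assumed toy form:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def cross_scale_mil(patch_feats_by_scale, w_attn):
    """Attention-based MIL pooling with cross-scale aggregation (sketch).

    patch_feats_by_scale: list of (n_i, d) patch-feature arrays, one
    per magnification (e.g. 5x, 10x, 20x crops of the same region).
    w_attn: (d,) scoring vector shared across scales (an assumption).
    Returns a single (d,) bag embedding for slide-level classification.
    """
    scale_embs = []
    for feats in patch_feats_by_scale:
        a = softmax(feats @ w_attn)        # attention over patches
        scale_embs.append(a @ feats)       # (d,) per-scale bag embedding
    scale_embs = np.stack(scale_embs)      # (num_scales, d)
    s = softmax(scale_embs @ w_attn)       # attention over scales
    return s @ scale_embs

rng = np.random.default_rng(3)
feats = [rng.normal(size=(n, 8)) for n in (20, 50, 100)]
bag = cross_scale_mil(feats, rng.normal(size=8))
```

Inspecting the per-scale attention weights `s` is what enables the differential cross-scale attention visualizations the paper's toy dataset is designed to probe.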
Affiliation(s)
- Can Cui
- Vanderbilt University, Nashville, TN 37215, USA
- R Michael Womick
- The University of North Carolina at Chapel Hill, Chapel Hill, NC 27514, USA
- Sophie Chiron
- Vanderbilt University Medical Center, Nashville, TN 37232, USA
- Jia Li
- Vanderbilt University Medical Center, Nashville, TN 37232, USA
- Joseph T Roland
- Vanderbilt University Medical Center, Nashville, TN 37232, USA
- Ken S Lau
- Vanderbilt University, Nashville, TN 37215, USA
- Qi Liu
- Vanderbilt University Medical Center, Nashville, TN 37232, USA
- Keith T Wilson
- Vanderbilt University Medical Center, Nashville, TN 37232, USA; Veterans Affairs Tennessee Valley Healthcare System, Nashville, TN 37212, USA
- Yaohong Wang
- Vanderbilt University Medical Center, Nashville, TN 37232, USA
- Lori A Coburn
- Vanderbilt University Medical Center, Nashville, TN 37232, USA; Veterans Affairs Tennessee Valley Healthcare System, Nashville, TN 37212, USA
- Bennett A Landman
- Vanderbilt University, Nashville, TN 37215, USA; Vanderbilt University Medical Center, Nashville, TN 37232, USA
- Yuankai Huo
- Vanderbilt University, Nashville, TN 37215, USA
6. Bao LL, Zhang JS, Zhang CX. Spatial multi-attention conditional neural processes. Neural Netw 2024; 173:106201. [PMID: 38447305; DOI: 10.1016/j.neunet.2024.106201] [Received: 07/16/2023; Revised: 01/03/2024; Accepted: 02/20/2024]
Abstract
Spatial prediction tasks are challenging when observed samples are sparse and prediction samples are abundant. Gaussian processes (GPs) are commonly used in spatial prediction tasks and have the advantage of measuring the uncertainty of the interpolation result. However, as the sample size increases, GPs suffer from significant overhead. Standard neural networks (NNs) provide a powerful and scalable solution for modeling spatial data, but they often overfit small sample data. Based on conditional neural processes (CNPs), which combine the advantages of GPs and NNs, we propose a new framework called Spatial Multi-Attention Conditional Neural Processes (SMACNPs) for spatial small sample prediction tasks. SMACNPs are a modular model that can predict targets by employing different attention mechanisms to extract relevant information from different forms of sample data. The task representation is inferred by measuring the spatial correlation contained in different sample points and the relationship contained in attribute variables, respectively. The distribution of the target variable is predicted by GPs parameterized by NNs. SMACNPs allow us to obtain accurate predictions of the target value while quantifying the prediction uncertainty. Experiments on spatial prediction tasks on simulated and real-world datasets demonstrate that this framework flexibly incorporates spatial context and correlation into the model, achieving state-of-the-art results in spatial small sample prediction tasks in terms of both predictive performance and reliability. For example, on the California housing dataset, our method reduces MAE by 8% and MSE by 7% compared to the second-best method. In addition, a spatiotemporal prediction task to forecast traffic speed further confirms the effectiveness and generality of our method.
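The uncertainty quantification SMACNPs inherit from the CNP family amounts to emitting a predictive mean and variance per target point and scoring them with a Gaussian log-likelihood. A minimal sketch of that scoring objective (the standard CNP training loss, not the SMACNP network itself):

```python
import numpy as np

def gaussian_nll(y, mu, sigma):
    """Mean negative log-likelihood of targets y under N(mu, sigma^2),
    the usual training objective for CNP-style predictive distributions."""
    var = sigma ** 2
    return 0.5 * np.mean(np.log(2 * np.pi * var) + (y - mu) ** 2 / var)

y = np.array([1.0, 2.0, 3.0])
# Same prediction error, different claimed uncertainty:
confident_wrong = gaussian_nll(y, mu=y + 1.0, sigma=np.full(3, 0.1))
honest_uncertain = gaussian_nll(y, mu=y + 1.0, sigma=np.full(3, 1.0))
```

The objective punishes an overconfident wrong prediction far more than an honestly uncertain one, which is what makes the reported predictive reliability trainable rather than an afterthought.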
Affiliation(s)
- Li-Li Bao
- School of Mathematics and Statistics, Xi'an Jiaotong University, Xi'an Shaanxi, 710049, China
- Jiang-She Zhang
- School of Mathematics and Statistics, Xi'an Jiaotong University, Xi'an Shaanxi, 710049, China
- Chun-Xia Zhang
- School of Mathematics and Statistics, Xi'an Jiaotong University, Xi'an Shaanxi, 710049, China
7. Zhang S, He C, Wan Z, Shi N, Wang B, Liu X, Hou D. Diagnosis of pulmonary tuberculosis with 3D neural network based on multi-scale attention mechanism. Med Biol Eng Comput 2024; 62:1589-1600. [PMID: 38319503; DOI: 10.1007/s11517-024-03022-1] [Received: 08/18/2023; Accepted: 01/03/2024]
Abstract
This paper presents a novel multi-scale attention residual network (MAResNet) for diagnosing patients with pulmonary tuberculosis (PTB) from computed tomography (CT) images. First, a three-dimensional (3D) network structure is applied in MAResNet based on the continuity and correlation of nodal features across different slices of CT images. Second, MAResNet incorporates residual modules and the Convolutional Block Attention Module (CBAM) to reuse the shallow features of CT images and focus on key features, enhancing the feature distinguishability of images. In addition, multi-scale inputs increase the global receptive field of the network, extract the location information of PTB, and capture the local details of nodules, enhancing the expression of both high-level and low-level semantic information in the network. The proposed MAResNet shows excellent results, with an overall accuracy of 94% in PTB classification. MAResNet based on 3D CT images can assist doctors in making more accurate diagnoses of PTB and alleviate the burden of manual screening. In the experiments, Grad-CAM, an extension of the class activation mapping (CAM) technique, was employed to analyze the model's output; it can identify lesions in important parts of the lungs and make the model's decisions transparent.
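CBAM, the attention module MAResNet incorporates, refines a feature map in two sequential stages: a channel-attention stage built from global average- and max-pooled descriptors, then a spatial-attention stage built from channel-wise pooling. The sketch below keeps that two-stage structure but replaces CBAM's learned shared MLP and 7x7 convolution with parameter-free identities, so it is an assumed simplification:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cbam_sketch(feat):
    """Simplified CBAM: channel attention, then spatial attention.

    feat: (C, H, W) feature map. Real CBAM passes the pooled
    descriptors through a learned MLP / convolution; here they are
    combined directly to keep the sketch parameter-free.
    """
    # Channel attention from global average- and max-pooled descriptors.
    chan = sigmoid(feat.mean(axis=(1, 2)) + feat.max(axis=(1, 2)))  # (C,)
    feat = feat * chan[:, None, None]
    # Spatial attention from channel-wise average- and max-pooling.
    spat = sigmoid(feat.mean(axis=0) + feat.max(axis=0))            # (H, W)
    return feat * spat[None, :, :]

rng = np.random.default_rng(5)
x = rng.normal(size=(4, 6, 6))
y = cbam_sketch(x)
```

Both stages only rescale activations by gates in (0, 1), so the output keeps the input's shape while down-weighting uninformative channels and locations.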
Affiliation(s)
- Shidong Zhang
- Key Laboratory of Digital Medical Engineering of Hebei Province, College of Electronic and Information Engineering, Hebei University, Baoding, 071002, China
- Cong He
- Key Laboratory of Digital Medical Engineering of Hebei Province, College of Electronic and Information Engineering, Hebei University, Baoding, 071002, China
- Zhenzhen Wan
- Key Laboratory of Digital Medical Engineering of Hebei Province, College of Electronic and Information Engineering, Hebei University, Baoding, 071002, China
- Ning Shi
- Key Laboratory of Digital Medical Engineering of Hebei Province, College of Electronic and Information Engineering, Hebei University, Baoding, 071002, China
- Bing Wang
- Department of Radiology, Beijing Chest Hospital, Capital Medical University, Beijing, 101149, China
- Xiuling Liu
- Key Laboratory of Digital Medical Engineering of Hebei Province, College of Electronic and Information Engineering, Hebei University, Baoding, 071002, China
- Dailun Hou
- Department of Radiology, Beijing Chest Hospital, Capital Medical University, Beijing, 101149, China
8. Nissar I, Alam S, Masood S, Kashif M. MOB-CBAM: A dual-channel attention-based deep learning generalizable model for breast cancer molecular subtypes prediction using mammograms. Comput Methods Programs Biomed 2024; 248:108121. [PMID: 38531147; DOI: 10.1016/j.cmpb.2024.108121] [Received: 08/23/2023; Revised: 02/15/2024; Accepted: 03/06/2024]
Abstract
BACKGROUND AND OBJECTIVE: Deep learning models have emerged as a significant tool for generating efficient solutions to complex problems, including cancer detection, as they can analyze large amounts of data with high efficiency and performance. Recent medical studies highlight the significance of molecular subtype detection in breast cancer, as different subtypes respond better to different therapies, aiding the development of personalized treatment plans. METHODS: In this work, we propose MOB-CBAM, a novel lightweight dual-channel attention-based deep learning model that combines a MobileNet-V3 backbone with a Convolutional Block Attention Module to make highly accurate and precise predictions about breast cancer. We evaluated the proposed model on the CMMD mammogram dataset. Nine distinct data subsets were created from the original dataset to perform coarse- and fine-grained predictions, enabling the model to identify masses, calcifications, benign and malignant tumors, and molecular subtypes of cancer, including Luminal A, Luminal B, HER-2 Positive, and Triple Negative. The pipeline incorporates several image pre-processing techniques, including filtering, enhancement, and normalization, to enhance the model's generalization ability. RESULTS: In coarse-grained classification (benign versus malignant tumors), the MOB-CBAM model produced exceptional results, with 99% accuracy, precision, recall, and F1-score values of 0.99, and an MCC of 0.98. In fine-grained classification, the model proved highly effective on the mass (benign/malignant) and calcification (benign/malignant) classification tasks, with an impressive accuracy of 98%. We also cross-validated the efficiency of the proposed architecture on two further datasets: on MIAS, an accuracy of 97% was reported for classifying benign, malignant, and normal images, while on CBIS-DDSM, an accuracy of 98% was achieved for mass (benign/malignant) and calcification (benign/malignant) classification. CONCLUSION: This study presents MOB-CBAM, a lightweight novel deep learning framework for breast cancer diagnosis and subtype prediction. The model's innovative incorporation of the CBAM enhances the precision of predictions. The extensive evaluation on the CMMD dataset and cross-validation on other datasets affirm the model's efficacy.
Affiliation(s)
- Iqra Nissar
- Department of Computer Engineering, Jamia Millia Islamia (A Central University), New Delhi, 110025, India
- Shahzad Alam
- Department of Computer Engineering, Jamia Millia Islamia (A Central University), New Delhi, 110025, India
- Sarfaraz Masood
- Department of Computer Engineering, Jamia Millia Islamia (A Central University), New Delhi, 110025, India
- Mohammad Kashif
- Department of Computer Engineering, Jamia Millia Islamia (A Central University), New Delhi, 110025, India
9. Ren J, An N, Zhang Y, Wang D, Sun Z, Lin C, Cui W, Wang W, Zhou Y, Zhang W, Hu Q, Zhang P, Hu D, Wang D, Liu H. SUGAR: Spherical ultrafast graph attention framework for cortical surface registration. Med Image Anal 2024; 94:103122. [PMID: 38428270; DOI: 10.1016/j.media.2024.103122] [Received: 07/04/2023; Revised: 01/25/2024; Accepted: 02/22/2024]
Abstract
Cortical surface registration plays a crucial role in aligning cortical functional and anatomical features across individuals. However, conventional registration algorithms are computationally inefficient. Recently, learning-based registration algorithms have emerged as a promising solution, significantly improving processing efficiency. Nonetheless, there remains a gap in the development of a learning-based method that exceeds the state-of-the-art conventional methods simultaneously in computational efficiency, registration accuracy, and distortion control, despite the theoretically greater representational capabilities of deep learning approaches. To address the challenge, we present SUGAR, a unified unsupervised deep-learning framework for both rigid and non-rigid registration. SUGAR incorporates a U-Net-based spherical graph attention network and leverages the Euler angle representation for deformation. In addition to the similarity loss, we introduce fold and multiple distortion losses to preserve topology and minimize various types of distortions. Furthermore, we propose a data augmentation strategy specifically tailored for spherical surface registration to enhance the registration performance. Through extensive evaluation involving over 10,000 scans from 7 diverse datasets, we showed that our framework exhibits comparable or superior registration performance in accuracy, distortion, and test-retest reliability compared to conventional and learning-based methods. Additionally, SUGAR achieves remarkable sub-second processing times, offering a notable speed-up of approximately 12,000 times in registering 9,000 subjects from the UK Biobank dataset in just 32 min. This combination of high registration performance and accelerated processing time may greatly benefit large-scale neuroimaging studies.
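SUGAR parameterizes deformations of the spherical cortical mesh with Euler angles. For the rigid case, this reduces to applying a rotation matrix to vertex coordinates on the unit sphere; the Z-Y-Z convention below is one common choice, assumed here for illustration (the non-rigid case applies per-vertex angles):

```python
import numpy as np

def euler_rotation(alpha, beta, gamma):
    """Rotation matrix from Z-Y-Z Euler angles (radians)."""
    def rz(t):
        return np.array([[np.cos(t), -np.sin(t), 0.0],
                         [np.sin(t),  np.cos(t), 0.0],
                         [0.0, 0.0, 1.0]])
    def ry(t):
        return np.array([[np.cos(t), 0.0, np.sin(t)],
                         [0.0, 1.0, 0.0],
                         [-np.sin(t), 0.0, np.cos(t)]])
    return rz(alpha) @ ry(beta) @ rz(gamma)

rng = np.random.default_rng(2)
verts = rng.normal(size=(100, 3))
verts /= np.linalg.norm(verts, axis=1, keepdims=True)   # points on S^2
rotated = verts @ euler_rotation(0.1, 0.2, 0.3).T       # rigid alignment
```

Because rotations preserve vector norms, deformed vertices stay exactly on the sphere, which is one reason an angle-based parameterization suits spherical registration better than free 3D displacements.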
Affiliation(s)
- Ning An
- Changping Laboratory, Beijing, China
- Cong Lin
- Changping Laboratory, Beijing, China
- Weigang Cui
- School of Engineering Medicine, Beihang University, Beijing, China
- Ying Zhou
- Changping Laboratory, Beijing, China
- Wei Zhang
- Changping Laboratory, Beijing, China; Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
- Qingyu Hu
- Changping Laboratory, Beijing, China
- Dan Hu
- Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Charlestown, MA, USA
- Danhong Wang
- Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Charlestown, MA, USA
- Hesheng Liu
- Changping Laboratory, Beijing, China; Biomedical Pioneering Innovation Center (BIOPIC), Peking University, Beijing, China
10. Kim SH, Kim DY, Chun SW, Kim J, Woo J. Impartial feature selection using multi-agent reinforcement learning for adverse glycemic event prediction. Comput Biol Med 2024; 173:108257. [PMID: 38520922; DOI: 10.1016/j.compbiomed.2024.108257] [Received: 11/08/2023; Revised: 02/02/2024; Accepted: 03/06/2024]
Abstract
We developed an attention model to predict future adverse glycemic events 30 min in advance based on the observation of past glycemic values over a 35 min period. The proposed model effectively encodes insulin administration and meal intake time using Time2Vec (T2V) for glucose prediction. The proposed impartial feature selection algorithm is designed to distribute rewards proportionally according to agent contributions. Agent contributions are calculated by a step-by-step negation of updated agents. Thus, the proposed feature selection algorithm optimizes features from electronic medical records to improve performance. For evaluation, we collected continuous glucose monitoring data from 102 patients with type 2 diabetes admitted to Cheonan Hospital, Soonchunhyang University. Using our proposed model, we achieved F1-scores of 89.0%, 60.6%, and 89.8% for normoglycemia, hypoglycemia, and hyperglycemia, respectively.
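Time2Vec, used here to encode insulin-administration and meal times, maps a scalar time to a vector with one linear component and the rest periodic, letting the model represent both trends and recurring patterns. A minimal implementation of the published encoding (the dimension and the minutes-since-dose interpretation are illustrative):

```python
import numpy as np

def time2vec(tau, w, b):
    """Time2Vec encoding of a scalar time tau.

    t2v[0]  = w[0] * tau + b[0]            (linear, captures trend)
    t2v[i]  = sin(w[i] * tau + b[i]), i>0  (periodic, captures cycles)
    w, b: (k,) learnable frequency and phase parameters (random here).
    """
    v = w * tau + b
    v[1:] = np.sin(v[1:])
    return v

rng = np.random.default_rng(9)
w, b = rng.normal(size=8), rng.normal(size=8)
enc = time2vec(30.0, w, b)    # e.g. 30 minutes since an insulin dose
```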
Affiliation(s)
- Seo-Hee Kim
- Department of ICT Convergence, Soonchunhyang University, Asan, South Korea
- Dae-Yeon Kim
- Department of Internal Medicine, Soonchunhyang University Cheonan Hospital, Cheonan, South Korea
- Sung-Wan Chun
- Department of Internal Medicine, Soonchunhyang University Cheonan Hospital, Cheonan, South Korea
- Jaeyun Kim
- Department of AI and Big Data, Soonchunhyang University, Asan, South Korea
- Jiyoung Woo
- Department of AI and Big Data, Soonchunhyang University, Asan, South Korea
11. Gao M, Zhang D, Chen Y, Zhang Y, Wang Z, Wang X, Li S, Guo Y, Webb GI, Nguyen ATN, May L, Song J. GraphormerDTI: A graph transformer-based approach for drug-target interaction prediction. Comput Biol Med 2024; 173:108339. [PMID: 38547658; DOI: 10.1016/j.compbiomed.2024.108339] [Received: 11/19/2023; Revised: 03/05/2024; Accepted: 03/17/2024]
Abstract
The application of Artificial Intelligence (AI) to screen drug molecules with potential therapeutic effects has revolutionized the drug discovery process, with significantly lower economic cost and time consumption than the traditional drug discovery pipeline. With the great power of AI, it is possible to rapidly search the vast chemical space for potential drug-target interactions (DTIs) between candidate drug molecules and disease protein targets. However, only a small proportion of molecules have labelled DTIs, consequently limiting the performance of AI-based drug screening. To solve this problem, a machine learning-based approach with great ability to generalize DTI prediction across molecules is desirable. Many existing machine learning approaches for DTI identification fail to exploit the full information with respect to the topological structures of candidate molecules. To develop a better approach for DTI prediction, we propose GraphormerDTI, which employs the powerful Graph Transformer neural network to model molecular structures. GraphormerDTI embeds molecular graphs into vector-format representations through iterative Transformer-based message passing, which encodes molecules' structural characteristics by node centrality encoding, node spatial encoding and edge encoding. With a strong structural inductive bias, the proposed GraphormerDTI approach can effectively infer informative representations for out-of-sample molecules and, as such, is capable of predicting DTIs across molecules with exceptional performance. GraphormerDTI integrates the Graph Transformer neural network with a 1-dimensional Convolutional Neural Network (1D-CNN) to extract the drugs' and target proteins' representations and leverages an attention mechanism to model the interactions between them. To examine GraphormerDTI's performance for DTI prediction, we conduct experiments on three benchmark datasets, where GraphormerDTI outperforms five state-of-the-art baselines for out-of-molecule DTI prediction (GNN-CPI, GNN-PT, DeepEmbedding-DTI, MolTrans and HyperAttentionDTI) and is on a par with the best baseline for transductive DTI prediction. The source code and datasets are publicly accessible at https://github.com/mengmeng34/GraphormerDTI.
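As a rough illustration of the node spatial encoding idea behind Graphormer-style models, the sketch below biases self-attention logits by a table indexed on shortest-path distance between nodes. This is a minimal sketch, not the authors' implementation: the bias table is a stand-in for the learned encodings, and centrality and edge encodings are omitted.

```python
import numpy as np

def shortest_path_distances(adj):
    """All-pairs shortest-path distances (BFS) on an unweighted graph."""
    n = adj.shape[0]
    dist = np.full((n, n), np.inf)
    for s in range(n):
        dist[s, s] = 0
        frontier, d = [s], 0
        while frontier:
            d += 1
            nxt = []
            for u in frontier:
                for v in np.nonzero(adj[u])[0]:
                    if dist[s, v] == np.inf:
                        dist[s, v] = d
                        nxt.append(v)
            frontier = nxt
    return dist

def attention_with_spatial_bias(x, adj, bias_table):
    """Single-head self-attention whose logits are offset by a bias indexed
    by shortest-path distance (Graphormer-style spatial encoding).
    x: (N, D) node features; adj: (N, N) adjacency; bias_table: (K,)."""
    n, dim = x.shape
    spd = shortest_path_distances(adj)
    # clip distances into the table range; unreachable pairs use the last slot
    idx = np.clip(np.where(np.isinf(spd), len(bias_table) - 1, spd),
                  0, len(bias_table) - 1).astype(int)
    logits = x @ x.T / np.sqrt(dim) + bias_table[idx]
    w = np.exp(logits - logits.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ x
```

The spatial bias lets every node attend to every other node while still telling the model how far apart they sit in the molecular graph.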
Affiliation(s)
- Mengmeng Gao, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China
- Daokun Zhang, Department of Data Science and Artificial Intelligence, Faculty of Information Technology, Monash University, Melbourne, Australia
- Yi Chen, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China
- Yiwen Zhang, Climate, Air Quality Research Unit, School of Public Health and Preventive Medicine, Monash University, Melbourne, VIC, 3004, Australia
- Zhikang Wang, Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, Australia
- Xiaoyu Wang, Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, Australia
- Shanshan Li, Climate, Air Quality Research Unit, School of Public Health and Preventive Medicine, Monash University, Melbourne, VIC, 3004, Australia
- Yuming Guo, Climate, Air Quality Research Unit, School of Public Health and Preventive Medicine, Monash University, Melbourne, VIC, 3004, Australia
- Geoffrey I Webb, Department of Data Science and Artificial Intelligence, Faculty of Information Technology, Monash University, Melbourne, Australia
- Anh T N Nguyen, Drug Discovery Biology Theme, Monash Institute of Pharmaceutical Sciences, Monash University, Melbourne, Australia
- Lauren May, Drug Discovery Biology Theme, Monash Institute of Pharmaceutical Sciences, Monash University, Melbourne, Australia
- Jiangning Song, Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, Australia
12
Chen Y, Zhan W, Jiang Y, Zhu D, Xu X, Hao Z, Li J, Guo J. A feature refinement and adaptive generative adversarial network for thermal infrared image colorization. Neural Netw 2024; 173:106184. [PMID: 38387204] [DOI: 10.1016/j.neunet.2024.106184]
Abstract
Colorizing thermal infrared images poses a significant challenge, as current methods struggle with issues such as unrealistic color saturation and limited texture. To address these challenges, we propose the Feature Refinement and Adaptive Generative Adversarial Network (FRAGAN). Our approach enhances the detail, semantic, and contextual capabilities of image colorization through multi-level interactions that integrate the detailed information lost during the encoding stage with the semantic information from the decoding stage. Additionally, we introduce the Residual Feature Refinement Module (RFRM) to improve both the accuracy and generalization ability of the model, thereby elevating the quality of colorization results. The Feature Adaptation Module (FAM) is employed to mitigate sub-region information loss during downsampling. Furthermore, we introduce the Trinity Attention Module (TAM) to accurately capture the spatial and channel-wise interaction features of local semantic information. Extensive experimentation on the KAIST and FLIR datasets demonstrates the superiority of the proposed FRAGAN, surpassing both the performance metrics and visual quality of current state-of-the-art methods. The colorized images generated by FRAGAN exhibit enhanced clarity and realism. Our code and models are available at GitHub.
Affiliation(s)
- Yu Chen, Changchun University of Science and Technology National Demonstration Center for Experimental Electrical, Changchun, Jilin, 130022, China
- Weida Zhan, Changchun University of Science and Technology National Demonstration Center for Experimental Electrical, Changchun, Jilin, 130022, China
- Yichun Jiang, Changchun University of Science and Technology National Demonstration Center for Experimental Electrical, Changchun, Jilin, 130022, China
- Depeng Zhu, Changchun University of Science and Technology National Demonstration Center for Experimental Electrical, Changchun, Jilin, 130022, China
- Xiaoyu Xu, Changchun University of Science and Technology National Demonstration Center for Experimental Electrical, Changchun, Jilin, 130022, China
- Ziqiang Hao, Changchun University of Science and Technology National Demonstration Center for Experimental Electrical, Changchun, Jilin, 130022, China
- Jin Li, School of Instrumentation and Optoelectronic Engineering, Beihang University, Beijing, 100191, China
- Jinxin Guo, Changchun University of Science and Technology National Demonstration Center for Experimental Electrical, Changchun, Jilin, 130022, China
13
Ma W, Chen H, Zhang W, Huang H, Wu J, Peng X, Sun Q. DSYOLO-trash: An attention mechanism-integrated and object tracking algorithm for solid waste detection. Waste Manag 2024; 178:46-56. [PMID: 38377768] [DOI: 10.1016/j.wasman.2024.02.014]
Abstract
In a global context, the production of urban solid waste varies significantly with changes in living standards. This trend exhibits diversity across countries and regions, reflecting shifts in lifestyles as well as varying needs and challenges in waste management strategies. However, current standards of waste recycling are too complex for the general public to follow. In this study, we propose a model called DSYOLO-Trash to identify solid waste by integrating two attention mechanisms, the convolutional block attention module (CBAM) and Contextual Transformer Networks (CotNet), which significantly enhance its ability to mine channel-related and spatial attention features while optimizing the learning process. We apply the deep simple online and realtime tracking (DeepSORT) object tracking algorithm to solid waste detection for the first time in the literature to enable real-time identification and tracking of waste. We also develop a multi-label dataset of mixed solid waste, called MMTrash, to realistically simulate actual scenarios of waste classification. Our proposed DSYOLO-Trash delivered superior performance to classical detection algorithms on both the MMTrash and TrashNet datasets. Our system combines the improved you only look once (YOLO) algorithm with DeepSORT, using industrial cameras and PLC-controlled robotic arms to intelligently sort waste. This work constitutes an important contribution to intelligent waste management and the sustainable development of cities.
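The CBAM referred to above is the standard convolutional block attention design: channel attention from average- and max-pooled descriptors passed through a shared MLP, followed by spatial attention from pooled channel maps. A minimal NumPy sketch (the weight shapes and kernel size are illustrative assumptions; the paper's integration into the YOLO backbone is not reproduced):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(x, w1, w2):
    """CBAM channel attention: shared bottleneck MLP over average- and
    max-pooled channel descriptors, summed and squashed. x: (C, H, W)."""
    avg = x.mean(axis=(1, 2))
    mx = x.max(axis=(1, 2))
    att = sigmoid(w2 @ np.maximum(w1 @ avg, 0) + w2 @ np.maximum(w1 @ mx, 0))
    return x * att[:, None, None]

def spatial_attention(x, kernel):
    """CBAM spatial attention: channel-wise average and max maps convolved
    with a single k x k kernel (2 input channels) into one attention map."""
    desc = np.stack([x.mean(axis=0), x.max(axis=0)])   # (2, H, W)
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(desc, ((0, 0), (pad, pad), (pad, pad)))
    h, w = x.shape[1:]
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    return x * sigmoid(out)[None, :, :]

def cbam(x, w1, w2, kernel):
    """Apply channel attention then spatial attention, as in CBAM."""
    return spatial_attention(channel_attention(x, w1, w2), kernel)
```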
Affiliation(s)
- Wanqi Ma, School of Business, Jiangnan University, Wuxi 214122, PR China; Research Institute of National Security and Green Development, Jiangnan University, Wuxi 214122, PR China
- Hong Chen, School of Business, Jiangnan University, Wuxi 214122, PR China; Research Institute of National Security and Green Development, Jiangnan University, Wuxi 214122, PR China
- Wenkang Zhang, State Key Laboratory of Advanced Design and Manufacturing for Vehicle Body, College of Mechanical and Vehicle Engineering, Hunan University, Changsha 410082, PR China
- Han Huang, School of Economics and Management, China University of Mining and Technology, Xuzhou 221116, PR China
- Jian Wu, School of Business, Jiangnan University, Wuxi 214122, PR China; Research Institute of National Security and Green Development, Jiangnan University, Wuxi 214122, PR China
- Xu Peng, School of Business, Jiangnan University, Wuxi 214122, PR China; Research Institute of National Security and Green Development, Jiangnan University, Wuxi 214122, PR China
- Qingqing Sun, School of Economics and Management, China University of Mining and Technology, Xuzhou 221116, PR China
14
Liang W, Muhammad Rehan Afzal H, Qiao Y, Fan A, Wang F, Hu Y, Yang P. Estimation of electrical muscle activity during gait using inertial measurement units with convolution attention neural network and small-scale dataset. J Biomech 2024; 167:112093. [PMID: 38615480] [DOI: 10.1016/j.jbiomech.2024.112093]
Abstract
In general, muscle activity can be directly measured using electromyography (EMG) or calculated with musculoskeletal models. However, neither method is suitable for non-technical users or unstructured environments, so more portable and easy-to-use muscle activity estimation methods are desirable. Deep learning (DL) models combined with inertial measurement units (IMUs) have shown great potential for estimating muscle activity. However, clinical scenarios frequently offer only a very small amount of data, limiting the performance of DL models, while augmentation techniques that efficiently expand a small sample size for DL model training are rarely used. The primary aim of the present study was to develop a novel DL model to estimate the EMG envelope during gait from IMUs with high accuracy. A secondary aim was to develop a novel model-based data augmentation method to improve the performance of the estimation model with a small-scale dataset. To this end, a time convolutional network-based generative adversarial network, MuscleGAN, was proposed for data augmentation, and a subject-independent regression DL model was developed to estimate the EMG envelope. Results suggested that the proposed two-stage method has better generalization and estimation performance than commonly used existing methods. The Pearson correlation coefficient and normalized root-mean-square error derived from the proposed method reached up to 0.72 and 0.13, respectively, and MuscleGAN improved the estimation accuracy of the lower limb EMG envelope from 70% to 72%. Thus, even using only two IMUs and a very small-scale dataset, the proposed model is still capable of accurately estimating the lower limb EMG envelope, demonstrating considerable potential for application in clinical and daily life scenarios.
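The two evaluation metrics reported above can be computed as follows. This is a sketch: normalizing the RMSE by the range of the reference signal is an assumption, since the paper's exact normalization convention is not stated here.

```python
import numpy as np

def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between two 1-D signals."""
    yt = y_true - y_true.mean()
    yp = y_pred - y_pred.mean()
    return float((yt * yp).sum() / np.sqrt((yt ** 2).sum() * (yp ** 2).sum()))

def nrmse(y_true, y_pred):
    """Root-mean-square error normalized by the range of the reference signal
    (one common convention; others divide by the mean or std)."""
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    return float(rmse / (y_true.max() - y_true.min()))
```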
Affiliation(s)
- Wenqi Liang, Key Laboratory for Space Bioscience and Biotechnology, School of Life Sciences, Northwestern Polytechnical University, Xi'an, China
- Hafiz Muhammad Rehan Afzal, Key Laboratory for Space Bioscience and Biotechnology, School of Life Sciences, Northwestern Polytechnical University, Xi'an, China
- Yongyu Qiao, Key Laboratory for Space Bioscience and Biotechnology, School of Life Sciences, Northwestern Polytechnical University, Xi'an, China
- Ao Fan, Key Laboratory for Space Bioscience and Biotechnology, School of Life Sciences, Northwestern Polytechnical University, Xi'an, China
- Fanjie Wang, Key Laboratory for Space Bioscience and Biotechnology, School of Life Sciences, Northwestern Polytechnical University, Xi'an, China
- Yiwei Hu, Key Laboratory for Space Bioscience and Biotechnology, School of Life Sciences, Northwestern Polytechnical University, Xi'an, China
- Pengfei Yang, Key Laboratory for Space Bioscience and Biotechnology, School of Life Sciences, Northwestern Polytechnical University, Xi'an, China
15
Gao Y, Lv G, Xiao D, Han X, Sun T, Li Z. Research on steel surface defect classification method based on deep learning. Sci Rep 2024; 14:8254. [PMID: 38589514] [PMCID: PMC11001973] [DOI: 10.1038/s41598-024-58643-1] Open
Abstract
Surface defects on steel, arising from factors like steel composition and manufacturing techniques, pose significant challenges to industrial production. Efficient and precise detection of these defects is crucial for enhancing production efficiency and product quality. To meet these requirements, this paper tackles the detection task with the you only look once (YOLO) algorithm. In this study, we propose a novel approach for surface flaw identification based on YOLOv5, called YOLOv5-KBS, which integrates an attention mechanism and a weighted Bidirectional Feature Pyramid Network (BiFPN) into the YOLOv5 architecture. Our method addresses issues of background interference and defect size variability in images. Experimental results show that the YOLOv5-KBS model achieves a notable 4.2% increase in mean Average Precision (mAP) and reaches a detection speed of 70 Frames Per Second (FPS), outperforming the baseline model. These findings underscore the effectiveness and potential applications of our proposed method in industrial settings.
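The weighted BiFPN fuses feature maps with learnable non-negative per-input weights using fast normalized fusion. A minimal sketch of that fusion step alone (the resampling of feature maps to a common resolution and the surrounding convolutions are omitted):

```python
import numpy as np

def weighted_fusion(features, weights, eps=1e-4):
    """BiFPN-style fast normalized fusion: ReLU keeps each learnable weight
    non-negative, and normalization makes the fused map an (approximately)
    convex combination of the input feature maps."""
    w = np.maximum(np.asarray(weights, dtype=float), 0.0)
    w = w / (w.sum() + eps)
    return sum(wi * f for wi, f in zip(w, features))
```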
Affiliation(s)
- Yang Gao, The State Key Laboratory of Rolling and Automation, Northeastern University, Shenyang, 110819, China
- Gang Lv, Information Science and Engineering School, Northeastern University, Shenyang, 110819, China
- Dong Xiao, Information Science and Engineering School, Northeastern University, Shenyang, 110819, China
- Xize Han, Information Science and Engineering School, Northeastern University, Shenyang, 110819, China
- Tao Sun, The State Key Laboratory of Rolling and Automation, Northeastern University, Shenyang, 110819, China
- Zhenni Li, Information Science and Engineering School, Northeastern University, Shenyang, 110819, China
16
Zhu Q, Zhuang H, Zhao M, Xu S, Meng R. A study on expression recognition based on improved mobilenetV2 network. Sci Rep 2024; 14:8121. [PMID: 38582772] [PMCID: PMC10998880] [DOI: 10.1038/s41598-024-58736-x] Open
Abstract
This paper proposes an improved version of the MobileNetV2 neural network (I-MobileNetV2) in response to the large parameter counts of existing deep convolutional neural networks and the shortcomings of the lightweight MobileNetV2 in facial emotion recognition tasks, such as easy loss of feature information, poor real-time performance, and low accuracy. The network inherits MobileNetV2's depthwise separable convolutions, reducing computational load while maintaining a lightweight profile. It utilizes a reverse fusion mechanism to retain negative features, making information less likely to be lost. The SELU activation function replaces ReLU6 to avoid vanishing gradients. Meanwhile, to improve feature recognition capability, a channel attention mechanism (Squeeze-and-Excitation Networks, SE-Net) is integrated into the MobileNetV2 network. Experiments conducted on the facial expression datasets FER2013 and CK+ showed that the proposed network model achieved facial expression recognition accuracies of 68.62% and 95.96%, improving upon the MobileNetV2 model by 0.72% and 6.14% respectively, while the parameter count decreased by 83.8%. These results empirically verify the effectiveness of the improvements made to the network model.
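The SE-Net channel attention integrated here follows the standard squeeze-and-excitation design. A minimal NumPy sketch of one SE block (the weight shapes and reduction ratio are illustrative assumptions, not the paper's configuration):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def se_block(x, w_reduce, w_expand):
    """Squeeze-and-Excitation: global-average-pool each channel (squeeze),
    pass through a two-layer bottleneck MLP (excitation), and rescale the
    feature maps by the resulting per-channel gates. x: (C, H, W)."""
    squeezed = x.mean(axis=(1, 2))                  # (C,) channel descriptors
    hidden = np.maximum(w_reduce @ squeezed, 0.0)   # ReLU bottleneck, (C/r,)
    scale = sigmoid(w_expand @ hidden)              # per-channel gates in (0, 1)
    return x * scale[:, None, None]
```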
Affiliation(s)
- Qiming Zhu, College of Equipment Support and Management, Engineering University of PAP, Xi'an, 710086, China
- Hongwei Zhuang, College of Equipment Support and Management, Engineering University of PAP, Xi'an, 710086, China
- Mi Zhao, Basic Education, Engineering University of PAP, Xi'an, 710086, China
- Shuangchao Xu, College of Equipment Support and Management, Engineering University of PAP, Xi'an, 710086, China
- Rui Meng, College of Military Basic Education, Engineering University of PAP, Xi'an, 710086, China
17
Romero-Oraá R, Herrero-Tudela M, López MI, Hornero R, García M. Attention-based deep learning framework for automatic fundus image processing to aid in diabetic retinopathy grading. Comput Methods Programs Biomed 2024; 249:108160. [PMID: 38583290] [DOI: 10.1016/j.cmpb.2024.108160]
Abstract
BACKGROUND AND OBJECTIVE Early detection and grading of Diabetic Retinopathy (DR) is essential to determine an adequate treatment and prevent severe vision loss. However, the manual analysis of fundus images is time-consuming, and DR screening programs are challenged by the availability of human graders. Current automatic approaches for DR grading attempt the joint detection of all signs at the same time. However, the classification can be optimized if red lesions and bright lesions are processed independently, since the task is divided and simplified. Furthermore, clinicians would greatly benefit from explainable artificial intelligence (XAI) to support the automatic model predictions, especially when the type of lesion is specified. As a novelty, we propose an end-to-end deep learning framework for automatic DR grading (5 severity degrees) based on separating the attention of the dark structures from the bright structures of the retina. As the main contribution, this approach allowed us to generate independent interpretable attention maps for red lesions, such as microaneurysms and hemorrhages, and bright lesions, such as hard exudates, while using image-level labels only. METHODS Our approach is based on a novel attention mechanism which focuses separately on the dark and the bright structures of the retina by performing a previous image decomposition. This mechanism can be seen as an XAI approach which generates independent attention maps for red lesions and bright lesions. The framework includes an image quality assessment stage and deep learning-related techniques, such as data augmentation, transfer learning and fine-tuning. We used the Xception architecture as a feature extractor and the focal loss function to deal with data imbalance. RESULTS The Kaggle DR detection dataset was used for method development and validation. The proposed approach achieved 83.7% accuracy and a Quadratic Weighted Kappa of 0.78 when classifying DR among 5 severity degrees, outperforming several state-of-the-art approaches. Nevertheless, the main result of this work is the generated attention maps, which reveal the pathological regions on the image, distinguishing red lesions from bright lesions. These maps provide explainability for the model predictions. CONCLUSIONS Our results suggest that our framework is effective for automatically grading DR. The separate attention approach has proven useful for optimizing the classification. On top of that, the obtained attention maps facilitate visual interpretation for clinicians. Therefore, the proposed method could serve as a diagnostic aid for the early detection and grading of DR.
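The Quadratic Weighted Kappa reported above is a standard agreement measure for ordinal labels such as DR severity grades, and can be computed directly from a confusion matrix:

```python
import numpy as np

def quadratic_weighted_kappa(y_true, y_pred, n_classes):
    """Agreement between predicted and reference ordinal labels, penalizing
    each disagreement by the squared class distance. Returns 1 for perfect
    agreement and 0 for chance-level agreement."""
    o = np.zeros((n_classes, n_classes))          # observed confusion matrix
    for t, p in zip(y_true, y_pred):
        o[t, p] += 1
    i, j = np.indices((n_classes, n_classes))
    w = (i - j) ** 2 / (n_classes - 1) ** 2       # quadratic penalty weights
    # expected matrix under independence of the two raters' marginals
    e = np.outer(o.sum(axis=1), o.sum(axis=0)) / o.sum()
    return float(1.0 - (w * o).sum() / (w * e).sum())
```

This matches the convention implemented by, e.g., scikit-learn's `cohen_kappa_score` with `weights='quadratic'`.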
Affiliation(s)
- Roberto Romero-Oraá, Biomedical Engineering Group, University of Valladolid, Valladolid, 47011, Spain; Centro de Investigación Biomédica en Red en Bioingeniería, Biomateriales y Nanomedicina (CIBER-BBN), Spain
- María Herrero-Tudela, Biomedical Engineering Group, University of Valladolid, Valladolid, 47011, Spain
- María I López, Biomedical Engineering Group, University of Valladolid, Valladolid, 47011, Spain; Centro de Investigación Biomédica en Red en Bioingeniería, Biomateriales y Nanomedicina (CIBER-BBN), Spain
- Roberto Hornero, Biomedical Engineering Group, University of Valladolid, Valladolid, 47011, Spain; Centro de Investigación Biomédica en Red en Bioingeniería, Biomateriales y Nanomedicina (CIBER-BBN), Spain
- María García, Biomedical Engineering Group, University of Valladolid, Valladolid, 47011, Spain; Centro de Investigación Biomédica en Red en Bioingeniería, Biomateriales y Nanomedicina (CIBER-BBN), Spain
18
Li J, Ai L, Yao R. NVAM-Net: deep learning networks for reconstructing high-quality fiber orientation distributions. Neuroradiology 2024. [PMID: 38563964] [DOI: 10.1007/s00234-024-03341-y]
Abstract
PURPOSE Diffusion magnetic resonance imaging (dMRI) is a widely used non-invasive method for investigating brain anatomical structures. Conventional techniques for estimating fiber orientation distribution (FOD) from dMRI data often neglect voxel-level spatial relationships, leading to ambiguous associations between target voxels and their neighbors, which, in turn, adversely impacts FOD accuracy. This study aims to address this issue by introducing a novel neural network, the neighboring voxel attention mechanism network (NVAM-Net), designed to reconstruct high-quality FOD images. METHODS The NVAM-Net leverages a Transformer architecture and incorporates two innovative attention mechanisms: voxel attention and surface attention. These mechanisms are specifically designed to capture overlooked features among neighboring voxels. The processed features are subsequently passed through two fully connected layers, further enhancing FOD estimation accuracy by separately estimating spherical harmonics (SH) coefficients of varying orders. RESULTS The experimental findings, based on the Human Connectome Project (HCP) dataset, reveal that the reconstructed super-resolution FOD images achieve results comparable to those obtained through more advanced dMRI acquisition protocols. These results underscore the NVAM-Net's robust performance in reconstructing multi-shell multi-tissue constrained spherical deconvolution (MSMT-CSD). CONCLUSION In summary, this research underscores the NVAM-Net's advantages and practical feasibility in reconstructing high-quality FOD images. It provides a reliable reference point for clinical applications in the field of diffusion magnetic resonance imaging.
Affiliation(s)
- Jiahao Li, School of Computer Science, Shaanxi Normal University, Xi'an, 710119, China
- Lingmei Ai, School of Computer Science, Shaanxi Normal University, Xi'an, 710119, China
- Ruoxia Yao, School of Computer Science, Shaanxi Normal University, Xi'an, 710119, China
19
Bai X, Wei X, Wang Z, Zhang M. CONet: Crowd and occlusion-aware network for occluded human pose estimation. Neural Netw 2024; 172:106109. [PMID: 38232431] [DOI: 10.1016/j.neunet.2024.106109]
Abstract
Human pose estimation has numerous applications in motion recognition, virtual reality, human-computer interaction, and other related fields. However, multi-person pose estimation in crowded and occluded scenes is challenging. One major issue with current top-down human pose estimation approaches is that they are limited to predicting the pose of a single person, even when the bounding box contains multiple individuals. To address this problem, we propose a novel Crowd and Occlusion-aware Network (CONet) using a divide-and-conquer strategy. Our approach includes a Crowd and Occlusion-aware Head (COHead) which estimates the poses of both the occluder and the occluded person using two separate branches. We also use an attention mechanism to guide the branches toward differentiated learning, aiming to improve feature representation. Additionally, we propose a novel interference point loss to enhance the model's anti-interference ability. CONet is simple yet effective: it outperforms the previous state-of-the-art model by +1.6 AP, achieving 71.6 AP on CrowdPose and demonstrating its effectiveness in improving the accuracy of human pose estimation in crowded and occluded scenes. This achievement highlights the potential of our model in real-world applications where accurate human pose estimation is crucial, such as surveillance, sports analysis, and human-computer interaction.
Affiliation(s)
- Xiuxiu Bai, School of Software Engineering, Xi'an Jiaotong University, Xi'an 710049, China
- Xing Wei, School of Software Engineering, Xi'an Jiaotong University, Xi'an 710049, China
- Zengying Wang, School of Software Engineering, Xi'an Jiaotong University, Xi'an 710049, China
- Miao Zhang, School of Software Engineering, Xi'an Jiaotong University, Xi'an 710049, China
20
Serrão MKM, Costa MGF, Fujimoto LBM, Ogusku MM, Costa Filho CFF. Automatic bright-field smear microscopy for diagnosis of pulmonary tuberculosis. Comput Biol Med 2024; 172:108167. [PMID: 38461699] [DOI: 10.1016/j.compbiomed.2024.108167]
Abstract
In recent decades, many studies have been published on the use of automatic smear microscopy for diagnosing pulmonary tuberculosis (TB). Most of them deal with a preliminary step of the diagnosis, bacilli detection, whereas sputum smear microscopy for diagnosis of pulmonary TB comprises detecting and reporting the number of bacilli found in at least 100 microscopic fields, according to the five-level grading scale (negative, scanty, 1+, 2+ and 3+) endorsed by the World Health Organization (WHO). Pulmonary TB diagnosis in bright-field smear microscopy depends upon the attention of a trained and motivated technician, whereas automated TB diagnosis requires little or no interpretation by a technician. As far as we know, this work proposes the first automatic method for pulmonary TB diagnosis in bright-field smear microscopy according to the WHO recommendations. The proposed method comprises a semantic segmentation step, using a deep neural network, followed by color and shape filtering steps aimed at reducing the number of false positives (false bacilli). In semantic segmentation, different encoder configurations are evaluated, using depthwise separable convolution layers and a channel attention mechanism. The proposed method was evaluated on a large, robust, annotated image dataset designed for this purpose, consisting of 250 testing sets, 50 for each of the 5 TB diagnostic classes. The following performance metrics were obtained for automatic pulmonary TB diagnosis by smear microscopy: mean precision of 0.894, mean recall of 0.896, and mean F1-score of 0.895.
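A depthwise separable convolution, as used in the evaluated encoders, factorizes a standard convolution into a per-channel spatial filter followed by a 1x1 pointwise mix, sharply cutting the parameter count. A naive NumPy sketch (stride 1, 'same' padding, no bias; real implementations use optimized grouped convolutions):

```python
import numpy as np

def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k

def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depthwise (k x k per input channel) plus
    pointwise (1 x 1) convolution pair."""
    return c_in * k * k + c_in * c_out

def depthwise_separable_conv(x, dw_kernels, pw_weights):
    """Naive 'same'-padded depthwise conv followed by a 1x1 pointwise conv.
    x: (C_in, H, W); dw_kernels: (C_in, k, k); pw_weights: (C_out, C_in)."""
    c_in, h, w = x.shape
    k = dw_kernels.shape[-1]
    pad = k // 2
    padded = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    dw = np.zeros_like(x)
    for c in range(c_in):           # each channel filtered independently
        for i in range(h):
            for j in range(w):
                dw[c, i, j] = np.sum(padded[c, i:i + k, j:j + k] * dw_kernels[c])
    return np.einsum('oc,chw->ohw', pw_weights, dw)   # 1x1 channel mixing
```

For a 3x3 layer mapping 64 to 128 channels, the separable version needs 8,768 weights versus 73,728 for the standard convolution.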
21
Sun S, Mei Z, Li X, Tang T, Su Z, Wu Y. A label information fused medical image report generation framework. Artif Intell Med 2024; 150:102823. [PMID: 38553163] [DOI: 10.1016/j.artmed.2024.102823]
Abstract
Medical imaging is an important tool for clinical diagnosis. Nevertheless, preparing imaging diagnosis reports is very time-consuming and error-prone for physicians, so methods to generate medical imaging reports automatically are needed. Currently, the task of medical imaging report generation is challenging in at least two aspects: (1) medical images are very similar to each other, and the differences between normal and abnormal images, and between different abnormal images, are usually subtle; (2) unrelated or incorrect keywords describing abnormal findings in the generated reports lead to miscommunication. In this paper, we propose a medical image report generation framework composed of four modules: a Transformer encoder, a MIX-MLP multi-label classification network, a co-attention mechanism (CAM) based semantic and visual feature fusion, and a hierarchical LSTM decoder. The Transformer encoder learns long-range dependencies between images and labels, effectively extracts visual and semantic features of images, and establishes long-term dependencies between visual and semantic information to accurately extract abnormal features from images. The MIX-MLP multi-label classification network, the co-attention mechanism and the hierarchical LSTM network better identify abnormalities, achieving visual and textual alignment and multi-label diagnostic classification to facilitate report generation. The results of experiments performed on two widely used radiology report datasets, IU X-RAY and MIMIC-CXR, show that our proposed framework outperforms current report generation models in terms of both natural language generation metrics and clinical efficacy assessment metrics. The code of this work is available online at https://github.com/watersunhznu/LIFMRG.
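A minimal sketch of additive co-attention over visual regions conditioned on a semantic (label) embedding, in the spirit of a CAM-style fusion module. This is a generic formulation with assumed weight shapes, not the authors' exact mechanism:

```python
import numpy as np

def co_attention(visual, semantic, wv, ws, u):
    """Score each visual region against a semantic query, softmax-normalize,
    and return the attention-weighted visual context plus the weights.
    visual: (R, Dv) region features; semantic: (Ds,) label embedding;
    wv: (H, Dv), ws: (H, Ds), u: (H,) projection parameters."""
    scores = np.tanh(visual @ wv.T + (ws @ semantic)) @ u   # (R,) region scores
    a = np.exp(scores - scores.max())
    a /= a.sum()                                            # softmax over regions
    return a @ visual, a
```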
Affiliation(s)
- Shuifa Sun, School of Information Science and Technology, Hangzhou Normal University, Hangzhou, 311121, Zhejiang, China; Yichang Key Laboratory of Intelligent Medicine, Yichang, 443002, Hubei, China
- Zhoujunsen Mei, Yichang Key Laboratory of Intelligent Medicine, Yichang, 443002, Hubei, China; College of Computer and Information Technology, China Three Gorges University, Yichang, 443002, Hubei, China
- Xiaolong Li, Yichang Key Laboratory of Intelligent Medicine, Yichang, 443002, Hubei, China; College of Economics and Management, China Three Gorges University, Yichang, 443002, Hubei, China
- Tinglong Tang, Yichang Key Laboratory of Intelligent Medicine, Yichang, 443002, Hubei, China; College of Computer and Information Technology, China Three Gorges University, Yichang, 443002, Hubei, China
- Zhanglin Su, School of Information Science and Technology, Hangzhou Normal University, Hangzhou, 311121, Zhejiang, China
- Yirong Wu, Institute of Advanced Studies in Humanities and Social Sciences, Beijing Normal University, Zhuhai, 519087, Guangdong, China
Collapse
|
22
|
Cao Z, Wang K, Wen J, Li C, Wu Y, Wang X, Yu W. Fine-grained image classification on bats using VGG16-CBAM: a practical example with 7 horseshoe bats taxa (CHIROPTERA: Rhinolophidae: Rhinolophus) from Southern China. Front Zool 2024; 21:10. [PMID: 38561769 PMCID: PMC10983684 DOI: 10.1186/s12983-024-00531-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/24/2023] [Accepted: 03/18/2024] [Indexed: 04/04/2024] Open
Abstract
BACKGROUND Rapid identification and classification of bats are critical for practical applications. However, species identification of bats is typically a laborious and time-consuming manual task that depends on taxonomists and well-trained experts. Deep Convolutional Neural Networks (DCNNs) provide a practical approach to visual feature extraction and object classification, with potential application to bat classification. RESULTS In this study, we investigated the capability of deep learning models to classify 7 horseshoe bat taxa (CHIROPTERA: Rhinolophus) from Southern China. We constructed an image dataset of 879 front, oblique, and lateral targeted facial images of live individuals collected during surveys between 2012 and 2021. All images were taken using a standard photography protocol and settings aimed at enhancing the effectiveness of DCNN classification. The results demonstrated that our customized VGG16-CBAM model achieved up to 92.15% classification accuracy, outperforming other mainstream models. Furthermore, Grad-CAM visualization revealed that the model pays more attention to the taxonomically key regions in its decision-making process, and these regions are often preferred by bat taxonomists for the classification of horseshoe bats, corroborating the validity of our methods. CONCLUSION Our findings will inspire further research on image-based automatic classification of chiropteran species for early detection and potential application in taxonomy.
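CBAM, the attention module grafted onto VGG16 here, gates a feature map first per channel and then per spatial location. A minimal numpy sketch of its channel-attention branch follows (average- and max-pooled descriptors through a shared bottleneck MLP, sigmoid gate, rescale); the spatial branch and all weight shapes are omitted or assumed, so this is an illustration of the mechanism, not the paper's model.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cbam_channel_attention(x, w1, w2):
    """Channel-attention branch of CBAM for a feature map x of shape (C, H, W).
    w1: (C, C//r) reduction weights, w2: (C//r, C) expansion weights (shared MLP)."""
    avg = x.mean(axis=(1, 2))                     # (C,) global average pooling
    mx = x.max(axis=(1, 2))                       # (C,) global max pooling
    mlp = lambda v: np.maximum(v @ w1, 0.0) @ w2  # shared two-layer MLP with ReLU
    gates = sigmoid(mlp(avg) + mlp(mx))           # (C,) per-channel gates in (0, 1)
    return x * gates[:, None, None]               # rescale each channel

rng = np.random.default_rng(1)
x = rng.normal(size=(8, 5, 5))                    # toy feature map, C=8, r=4
y = cbam_channel_attention(x, rng.normal(size=(8, 2)), rng.normal(size=(2, 8)))
print(y.shape)  # (8, 5, 5)
```

Because the gates lie in (0, 1), the block can only attenuate channels, which is what makes the Grad-CAM maps concentrate on the regions the surviving channels respond to.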
Affiliation(s)
- Zhong Cao
- School of Electronics and Communication Engineering, Guangzhou University, Guangzhou, 510006, China
- Kunhui Wang
- School of Electronics and Communication Engineering, Guangzhou University, Guangzhou, 510006, China
- Jiawei Wen
- School of Electronics and Communication Engineering, Guangzhou University, Guangzhou, 510006, China
- Chuxian Li
- School of Electronics and Communication Engineering, Guangzhou University, Guangzhou, 510006, China
- Yi Wu
- School of Life Sciences, Guangzhou University, Guangzhou, 510006, China
- Xiaoyun Wang
- School of Life Sciences, Guangzhou University, Guangzhou, 510006, China
- Wenhua Yu
- School of Life Sciences, Guangzhou University, Guangzhou, 510006, China
23
Zhou W, Zheng F, Zhao Y, Pang Y, Yi J. MSDCNN: A multiscale dilated convolution neural network for fine-grained 3D shape classification. Neural Netw 2024; 172:106141. [PMID: 38301340 DOI: 10.1016/j.neunet.2024.106141] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/10/2023] [Revised: 01/17/2024] [Accepted: 01/21/2024] [Indexed: 02/03/2024]
Abstract
Multi-view deep neural networks have shown excellent performance on 3D shape classification tasks. However, global features aggregated from multi-view data often lack content information and spatial relationships, which makes it difficult to identify the small variances among subcategories of the same category. To solve this problem, this paper proposes a novel multiscale dilated convolution neural network, termed MSDCNN, for multi-view fine-grained 3D shape classification. First, a sequence of views is rendered from 12 viewpoints around the input 3D shape by the sequential view-capturing module. Then, the first 22 convolution layers of ResNeXt50 are employed to extract the semantic features of each view, and a global mixed feature map is obtained through an element-wise maximum operation over the 12 output feature maps. Furthermore, an attention dilated module (ADM), which combines four concatenated attention dilated blocks (ADBs), is designed to extract larger receptive-field features from the global mixed feature map and enhance the context information among the views. Specifically, each ADB consists of an attention mechanism module and a dilated convolution with a different dilation rate. In addition, a prediction module with label smoothing is proposed to classify the features; it contains a 3 × 3 convolution and adaptive average pooling. The performance of the method is validated experimentally on the ModelNet10, ModelNet40, and FG3D datasets. Experimental results demonstrate the effectiveness and superiority of the proposed MSDCNN framework for fine-grained 3D shape classification.
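The point of combining dilated convolutions at several rates, as the ADM does, is that the receptive field grows linearly with the dilation rate at no extra parameter cost. A 1-D numpy sketch makes this measurable with an impulse input (the 2-D case and the attention blocks are omitted; `adm_like_block` is an illustrative name, not the paper's API).

```python
import numpy as np

def dilated_conv1d(x, kernel, dilation):
    """'Same'-padded 1-D dilated convolution (cross-correlation form)."""
    k = len(kernel)
    span = dilation * (k - 1)                  # extra context one layer sees
    xp = np.pad(x, (span // 2, span - span // 2))
    return np.array([sum(kernel[j] * xp[i + j * dilation] for j in range(k))
                     for i in range(len(x))])

def adm_like_block(x, kernel, rates=(1, 2, 4, 8)):
    """Stack responses at several dilation rates, loosely mimicking how a
    multiscale dilated module gathers progressively larger receptive fields."""
    return np.stack([dilated_conv1d(x, kernel, r) for r in rates])

x = np.zeros(64)
x[32] = 1.0                                    # unit impulse probes the receptive field
out = adm_like_block(x, np.ones(3))
for r, row in zip((1, 2, 4, 8), out):
    idx = np.flatnonzero(row)
    print(r, idx.max() - idx.min())            # spread grows as 2 * dilation
```

Each size-3 kernel still touches only 3 samples, but at dilation 8 those samples span 17 input positions, which is how stacked rates enlarge context without pooling away detail.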
Affiliation(s)
- Wei Zhou
- College of Intelligent Technology and Engineering, Chongqing University of Science and Technology, Chongqing 401331, PR China
- Fujian Zheng
- College of Intelligent Technology and Engineering, Chongqing University of Science and Technology, Chongqing 401331, PR China; College of Optoelectronic Engineering, Chongqing University, Chongqing 400030, PR China
- Yiheng Zhao
- College of Intelligent Technology and Engineering, Chongqing University of Science and Technology, Chongqing 401331, PR China
- Yiran Pang
- Department of Computer & Electrical Engineering and Computer Science, Florida Atlantic University, FL 33431, United States of America
- Jun Yi
- College of Intelligent Technology and Engineering, Chongqing University of Science and Technology, Chongqing 401331, PR China
24
Zhang X, Ding T. Style classification of media painting images by integrating ResNet and attention mechanism. Heliyon 2024; 10:e27178. [PMID: 38496868 PMCID: PMC10944206 DOI: 10.1016/j.heliyon.2024.e27178] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/18/2023] [Revised: 02/24/2024] [Accepted: 02/26/2024] [Indexed: 03/19/2024] Open
Abstract
The progress of deep learning technology has made image classification an important application field. Image style classification is a complex task involving recognition of the whole picture, including both salient and detailed features. This study builds on the ResNet algorithm and improves its high-performing ResNet50 variant. In the model architecture, we introduce a blur-pooling operation and replace the traditional ReLU function with the CELU activation function. In addition, a triplet attention mechanism is integrated to further enhance model performance. A series of experiments shows that the improved ResNet50 model achieves a classification accuracy of 80.6% on a large-scale image dataset, 11.7% higher than the traditional ResNet50 model. In recognizing images of similar styles, the model incorporating triplet attention demonstrated higher average accuracy (74%) and recall (82%). These improvements provide a useful technical reference for image style classification.
Affiliation(s)
- Xinyun Zhang
- School of Arts and Creative Technologies (2013-2014), The University of York, York, YO10 5DD, United Kingdom
- Tao Ding
- University College London, 1-19 Torrington Place, Gower Street, London, WC1E 6BT, United Kingdom
25
Wei X, Wang Z. TCN-attention-HAR: human activity recognition based on attention mechanism time convolutional network. Sci Rep 2024; 14:7414. [PMID: 38548859 PMCID: PMC10978978 DOI: 10.1038/s41598-024-57912-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/24/2023] [Accepted: 03/22/2024] [Indexed: 04/01/2024] Open
Abstract
Wearable sensors are widely used in medical applications and human-computer interaction because of their portability and strong privacy protection. Human activity recognition based on sensor data plays a vital role in these fields, so it is important to improve recognition performance across different types of actions. Aiming at the problems of insufficient time-varying feature extraction and gradient explosion caused by too many network layers, a temporal convolutional network recognition model with an attention mechanism (TCN-Attention-HAR) is proposed. The model effectively recognizes and emphasizes key feature information. The ability of the TCN (temporal convolutional network) to extract temporal features is improved by using an appropriately sized receptive field. In addition, attention mechanisms assign higher weights to important information, enabling the model to learn and identify human activities more effectively. Performance on the public datasets WISDM, PAMAP2, and USC-HAD improves by 1.13%, 1.83%, and 0.51%, respectively, compared with other advanced models; these results clearly show that the proposed network model has excellent recognition performance. In the knowledge distillation experiment, the student model has only about 0.1% of the teacher model's parameters yet greatly improves its accuracy; on the WISDM dataset, its accuracy is 0.14% higher than the teacher model's.
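The building block behind a TCN's receptive-field control is the causal dilated convolution: the output at time t may read x[t], x[t-d], x[t-2d], … but never the future. A minimal numpy sketch (single layer, no residual connection or weight normalization, illustrative names):

```python
import numpy as np

def causal_dilated_conv(x, kernel, dilation):
    """Causal dilated 1-D convolution: output at t uses x[t], x[t-d], x[t-2d], ...
    Left-padding by dilation * (k - 1) keeps the layer causal."""
    k = len(kernel)
    pad = dilation * (k - 1)
    xp = np.pad(x, (pad, 0))
    return np.array([sum(kernel[j] * xp[t + pad - j * dilation] for j in range(k))
                     for t in range(len(x))])

# Stacking layers with dilations 1, 2, 4 grows the receptive field to
# 1 + (k - 1) * (1 + 2 + 4) = 13 past samples for k = 3.
x = np.arange(8, dtype=float)
y = causal_dilated_conv(x, np.array([1.0, 0.0, 0.0]), 2)
print(y)  # the j = 0 tap reads the current sample, so y equals x
```

Choosing the dilation schedule is exactly the "appropriately sized receptive field" knob the abstract refers to: deeper stacks with doubling rates cover long activity windows without pooling.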
Affiliation(s)
- Xiong Wei
- Wuhan Textile University, Wuhan, China
26
Wei L, Liu P, Ren H, Xiao D. Research on helmet wearing detection method based on deep learning. Sci Rep 2024; 14:7010. [PMID: 38528034 DOI: 10.1038/s41598-024-57433-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/12/2023] [Accepted: 03/18/2024] [Indexed: 03/27/2024] Open
Abstract
The vigorous development of the construction industry has also brought unprecedented safety risks. Wearing safety helmets at construction sites can effectively reduce casualties. This paper therefore proposes a deep learning-based approach for real-time detection of safety helmet usage among construction workers. After selecting the YOLOv5s network through experiments, we analyzed its training results and found its detection of small and occluded objects to be poor. Therefore, multiple attention mechanisms are used to improve the YOLOv5s network: the feature pyramid network is upgraded to a BiFPN bidirectional feature pyramid network, and the NMS post-processing step is replaced with Soft-NMS. On top of these improvements, the loss function is refined to speed up model convergence and improve detection speed. We propose a network model called BiFEL-YOLOv5s, which combines the BiFPN network and Focal-EIoU Loss to improve YOLOv5s. The average precision of the model increases by 0.9% and the recall rate by 2.8%, while detection speed does not decrease much, making it better suited for real-time safety helmet detection across various work scenarios.
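Soft-NMS, which this work swaps in for standard NMS, helps precisely in the occluded-worker case: instead of deleting detections that overlap the current best box, it decays their scores, so a partially hidden helmet behind another one can survive. A self-contained sketch of the Gaussian variant (thresholds and sigma are common defaults, not the paper's settings):

```python
import numpy as np

def iou(a, b):
    """IoU of two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS: decay the scores of boxes overlapping the current
    best detection by exp(-iou^2 / sigma) instead of discarding them."""
    boxes, scores = list(boxes), list(scores)
    keep = []
    while boxes:
        i = int(np.argmax(scores))
        best_box, best_score = boxes.pop(i), scores.pop(i)
        if best_score < score_thresh:
            break
        keep.append((best_box, best_score))
        scores = [s * np.exp(-iou(best_box, b) ** 2 / sigma)
                  for b, s in zip(boxes, scores)]
    return keep

dets = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
kept = soft_nms(dets, [0.9, 0.8, 0.7])
print(len(kept))  # 3: the overlapping second box survives with a decayed score
```

Hard NMS with a 0.5 IoU threshold would have dropped the second box entirely; here it is merely down-weighted.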
Affiliation(s)
- Lihong Wei
- School of Artificial Intelligence and Big Data, Hulunbeier University, Hailar, 021008, Inner Mongolia, China
- Panpan Liu
- Information Science and Engineering School, Northeastern University, Shenyang, 110004, China
- Haihui Ren
- Information Science and Engineering School, Northeastern University, Shenyang, 110004, China
- Dong Xiao
- Information Science and Engineering School, Northeastern University, Shenyang, 110004, China
- Liaoning Key Laboratory of Intelligent Diagnosis and Safety for Metallurgical Industry, Northeastern University, Shenyang, 110819, China
27
Wang L, Zhang X, Tian C, Chen S, Deng Y, Liao X, Wang Q, Si W. PlaqueNet: deep learning enabled coronary artery plaque segmentation from coronary computed tomography angiography. Vis Comput Ind Biomed Art 2024; 7:6. [PMID: 38514491 DOI: 10.1186/s42492-024-00157-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/25/2023] [Accepted: 03/03/2024] [Indexed: 03/23/2024] Open
Abstract
Cardiovascular disease, primarily caused by atherosclerotic plaque formation, is a significant health concern. Early detection of these plaques is crucial for targeted therapies and for reducing the risk of cardiovascular diseases. This study presents PlaqueNet, a solution for segmenting coronary artery plaques from coronary computed tomography angiography (CCTA) images. For feature extraction, an advanced residual net module is utilized, which integrates a depthwise residual optimization module into the network branches, enhancing feature extraction, avoiding information loss, and addressing gradient issues during training. To improve segmentation accuracy, a depthwise atrous spatial pyramid pooling module based on bicubic efficient channel attention (DASPP-BICECA) is introduced. The BICECA component amplifies local feature sensitivity, whereas the DASPP component expands the network's information-gathering scope, resulting in elevated segmentation accuracy. Additionally, BINet, a module for joint network loss evaluation, is proposed. It optimizes the segmentation model without affecting the segmentation results, and when combined with the DASPP-BICECA module, it enhances overall efficiency. The proposed CCTA segmentation algorithm outperformed the three comparative algorithms, achieving an intersection over union of 87.37%, Dice of 93.26%, accuracy of 93.12%, mean intersection over union of 93.68%, mean Dice of 96.63%, and mean pixel accuracy of 96.55%.
Affiliation(s)
- Linyuan Wang
- Department of Cardiovascular Surgery, the Affiliated Hospital of Shanxi Medical University, Shanxi Cardiovascular Hospital (Institute), Shanxi Clinical Medical Research Center for Cardiovascular Disease, Taiyuan, 030024, Shanxi, China
- Xiaofeng Zhang
- Department of Mechanical Engineering, Nantong University, Nantong, 226019, Jiangsu, China
- Congyu Tian
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, Guangdong, China
- Shu Chen
- Department of Cardiovascular Surgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, Hubei, China
- Yongzhi Deng
- Department of Cardiovascular Surgery, the Affiliated Hospital of Shanxi Medical University, Shanxi Cardiovascular Hospital (Institute), Shanxi Clinical Medical Research Center for Cardiovascular Disease, Taiyuan, 030024, Shanxi, China
- Xiangyun Liao
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, Guangdong, China
- Qiong Wang
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, Guangdong, China
- Weixin Si
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, Guangdong, China
28
Yuan X, Fu Z, Zhang B, Xie Z, Gan R. Research on lightweight algorithm for gangue detection based on improved Yolov5. Sci Rep 2024; 14:6707. [PMID: 38509164 PMCID: PMC10954748 DOI: 10.1038/s41598-024-57259-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/04/2023] [Accepted: 03/15/2024] [Indexed: 03/22/2024] Open
Abstract
To address the slow detection speed, large parameter counts, and heavy computation of deep learning-based gangue detection methods, we propose an improved gangue detection algorithm based on YOLOv5s. First, the lightweight EfficientViT network is used as the backbone to increase detection speed. Second, C3_Faster replaces the C3 part of the HEAD module, reducing model complexity. Third, the 20 × 20 feature map branch in the Neck region is removed, further reducing model complexity. Fourth, the CIoU loss function is replaced by the MPDIoU loss function. Finally, the SE attention mechanism is introduced so that the model attends to critical features, improving detection performance. Experimental results show that the improved coal gangue detection model is compressed in size by 77.8%, has 78.3% fewer parameters, and requires 77.8% less computation, while the number of frames is reduced by 30.6%, which can serve as a reference for intelligent coal gangue sorting.
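For context on the loss swap above: the baseline CIoU loss (Zheng et al., 2020) penalizes low overlap, center distance, and aspect-ratio mismatch. A self-contained numpy version of that baseline is sketched below so the three terms are visible; this is the standard formula as commonly stated, not code from the paper, and the MPDIoU replacement is not reproduced here.

```python
import numpy as np

def ciou_loss(a, b):
    """CIoU loss between boxes in (x1, y1, x2, y2) format:
    1 - IoU + normalized center distance + aspect-ratio consistency term."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    wa, ha = a[2] - a[0], a[3] - a[1]
    wb, hb = b[2] - b[0], b[3] - b[1]
    iou = inter / (wa * ha + wb * hb - inter + 1e-9)
    # squared center distance over squared diagonal of the enclosing box
    rho2 = ((a[0] + a[2] - b[0] - b[2]) ** 2 + (a[1] + a[3] - b[1] - b[3]) ** 2) / 4
    c2 = ((max(a[2], b[2]) - min(a[0], b[0])) ** 2
          + (max(a[3], b[3]) - min(a[1], b[1])) ** 2)
    v = (4 / np.pi ** 2) * (np.arctan(wb / hb) - np.arctan(wa / ha)) ** 2
    alpha = v / (1 - iou + v + 1e-9)              # trade-off weight
    return 1 - iou + rho2 / (c2 + 1e-9) + alpha * v

print(round(ciou_loss((0, 0, 10, 10), (0, 0, 10, 10)), 6))  # identical boxes -> 0.0
```

MPDIoU-style losses replace the distance and shape penalties with corner-point distances; the attraction in either case is a gradient signal even when boxes do not overlap (plain IoU loss is flat there).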
Affiliation(s)
- Xinpeng Yuan
- School of Coal Engineering, Shanxi Datong University, Datong, 037000, China
- Zhibo Fu
- School of Coal Engineering, Shanxi Datong University, Datong, 037000, China
- Bowen Zhang
- School of Coal Engineering, Shanxi Datong University, Datong, 037000, China
- Zhengkun Xie
- School of Coal Engineering, Shanxi Datong University, Datong, 037000, China
- Rui Gan
- School of Coal Engineering, Shanxi Datong University, Datong, 037000, China
29
Chen L, Zhu J. Water surface garbage detection based on lightweight YOLOv5. Sci Rep 2024; 14:6133. [PMID: 38480741 PMCID: PMC10937728 DOI: 10.1038/s41598-024-55051-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/30/2023] [Accepted: 02/20/2024] [Indexed: 03/17/2024] Open
Abstract
With the development of deep learning technology, researchers are paying increasing attention to how to efficiently salvage surface garbage. Since the 1980s, the growth of plastic products and the economy has led to the accumulation of large amounts of garbage in rivers. Because of the volume of garbage and the high risk of surface operations, manual garbage retrieval is highly inefficient. Among existing methods, using the YOLO algorithm to detect target objects is the most popular; compared with traditional detection algorithms, YOLO is not only more accurate but also more lightweight. This article presents a lightweight YOLOv5 water-surface garbage detection algorithm suitable for deployment on unmanned ships. The approach was validated on the Orca dataset: experimental results show that the detection speed of the improved YOLOv5 increases by 4.3%, mAP reaches 84.9%, precision reaches 88.7%, and the parameter count is only 12% of the original model's. Compared with the original algorithm, the improved algorithm is not only more accurate but, being lighter, can also be deployed on more hardware devices.
Affiliation(s)
- Luya Chen
- College of Engineering Science and Technology, Shanghai Ocean University, Shanghai, 21306, China
- Jianping Zhu
- School of Engineering, Shanghai Ocean University, Shanghai, 21306, China
30
Ma J, Zhao Z, Li T, Liu Y, Ma J, Zhang R. GraphsformerCPI: Graph Transformer for Compound-Protein Interaction Prediction. Interdiscip Sci 2024:10.1007/s12539-024-00609-y. [PMID: 38457109 DOI: 10.1007/s12539-024-00609-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/07/2023] [Revised: 01/01/2024] [Accepted: 01/08/2024] [Indexed: 03/09/2024]
Abstract
Accurately predicting compound-protein interactions (CPI) is a critical task in computer-aided drug design. In recent years, the exponential growth of compound activity and biomedical data has highlighted the need for efficient and interpretable prediction approaches. In this study, we propose GraphsformerCPI, an end-to-end deep learning framework that improves prediction performance and interpretability. GraphsformerCPI treats compounds and proteins as sequences of nodes with spatial structures, and leverages novel structure-enhanced self-attention mechanisms to integrate semantic and graph-structural features within molecules for deep molecular representations. To capture the vital associations between compound atoms and protein residues, we devise a dual-attention mechanism that effectively extracts relational features through cross-mapping. By extending the powerful learning capabilities of Transformers to spatial structures and extensively utilizing attention mechanisms, our model offers strong interpretability, a significant advantage over most black-box deep learning methods. To evaluate GraphsformerCPI, extensive experiments were conducted on benchmark datasets, including the human, C. elegans, Davis, and KIBA datasets. We explored the impact of model depth and dropout rate on performance and compared our model against state-of-the-art baselines. Our results demonstrate that GraphsformerCPI outperforms baseline models on classification datasets and achieves competitive performance on regression datasets. Specifically, on the human dataset, GraphsformerCPI achieves an average improvement of 1.6% in AUC, 0.5% in precision, and 5.3% in recall. On the KIBA dataset, the average improvements in concordance index (CI) and mean squared error (MSE) are 3.3% and 7.2%, respectively. Molecular docking shows that our model provides novel insights into intrinsic interactions and binding mechanisms. Our research holds practical significance for effectively predicting CPIs and binding affinities, identifying key atoms and residues, and enhancing model interpretability.
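A common way to make self-attention "structure-enhanced" in graph transformers of this kind is to bias or mask the attention scores with the molecular graph's adjacency. The numpy sketch below uses a hard mask restricted to bonded pairs plus self-loops; the paper's exact mechanism is not specified here, and a soft additive bias is an equally common design, so treat every name and shape as illustrative.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def structure_biased_attention(h, adj, neg=-1e9):
    """Self-attention over node features h (N, d) where the adjacency matrix
    confines attention to bonded atom pairs (self-loops kept)."""
    n, d = h.shape
    scores = h @ h.T / np.sqrt(d)            # raw dot-product scores
    mask = adj + np.eye(n)                   # structural constraint
    scores = np.where(mask > 0, scores, neg) # non-neighbors effectively removed
    return softmax(scores, axis=-1) @ h      # attention-weighted node update

h = np.eye(4)                                # 4 atoms, one-hot toy features
adj = np.array([[0, 1, 0, 0],                # a path graph 0-1-2-3
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
out = structure_biased_attention(h, adj)
print(out.shape)  # (4, 4); atom 0 places ~zero weight on the non-bonded atom 3
```

Real models add learned query/key/value projections and multiple heads; the structural term is what injects the graph topology the plain Transformer would otherwise ignore.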
Affiliation(s)
- Jun Ma
- School of Information Science and Engineering, Lanzhou University, Lanzhou, 730000, China
- School of Information Engineering, Lanzhou University of Finance and Economics, Lanzhou, 730020, China
- Zhili Zhao
- School of Information Science and Engineering, Lanzhou University, Lanzhou, 730000, China
- Tongfeng Li
- School of Information Science and Engineering, Lanzhou University, Lanzhou, 730000, China
- Computer College, Qinghai Normal University, Xi'ning, 810016, China
- Yunwu Liu
- School of Information Science and Engineering, Lanzhou University, Lanzhou, 730000, China
- Jun Ma
- School of Information Science and Engineering, Lanzhou University, Lanzhou, 730000, China
- Ruisheng Zhang
- School of Information Science and Engineering, Lanzhou University, Lanzhou, 730000, China
31
Yin Y, Tang Z, Weng H. Application of visual transformer in renal image analysis. Biomed Eng Online 2024; 23:27. [PMID: 38439100 PMCID: PMC10913284 DOI: 10.1186/s12938-024-01209-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/20/2023] [Accepted: 01/22/2024] [Indexed: 03/06/2024] Open
Abstract
The Deep Self-Attention Network (Transformer) is an encoder-decoder architecture that excels at establishing long-distance dependencies and was first applied in natural language processing. Because it complements the inductive bias of convolutional neural networks (CNNs), the Transformer has gradually been applied to medical image processing, including kidney image processing, and has become a hot research topic in recent years. To further explore new ideas and directions in renal image processing, this paper outlines the characteristics of the Transformer network model; summarizes the application of Transformer-based models in renal image segmentation, classification, detection, electronic medical records, and decision-making systems; compares them with CNN-based renal image processing algorithms; and analyzes the advantages and disadvantages of the technique in renal image processing. In addition, this paper gives an outlook on the development trend of Transformers in renal image processing, providing a valuable reference for further renal image analysis.
Affiliation(s)
- Yuwei Yin
- The College of Health Sciences and Engineering, University of Shanghai for Science and Technology, 516 Jungong Highway, Yangpu Area, Shanghai, 200093, China
- The College of Medical Technology, Shanghai University of Medicine & Health Sciences, 279 Zhouzhu Highway, Pudong New Area, Shanghai, 201318, China
- Zhixian Tang
- The College of Medical Technology, Shanghai University of Medicine & Health Sciences, 279 Zhouzhu Highway, Pudong New Area, Shanghai, 201318, China
- Huachun Weng
- The College of Health Sciences and Engineering, University of Shanghai for Science and Technology, 516 Jungong Highway, Yangpu Area, Shanghai, 200093, China
- The College of Medical Technology, Shanghai University of Medicine & Health Sciences, 279 Zhouzhu Highway, Pudong New Area, Shanghai, 201318, China
32
Yao X, Jiang X, Luo H, Liang H, Ye X, Wei Y, Cong S. MOCAT: multi-omics integration with auxiliary classifiers enhanced autoencoder. BioData Min 2024; 17:9. [PMID: 38444019 PMCID: PMC10916109 DOI: 10.1186/s13040-024-00360-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/21/2023] [Accepted: 02/29/2024] [Indexed: 03/07/2024] Open
Abstract
BACKGROUND Integrating multi-omics data is emerging as a critical approach to enhancing our understanding of complex diseases. Innovative computational methods capable of managing high-dimensional and heterogeneous datasets are required to unlock the full potential of such rich and diverse data. METHODS We propose a Multi-Omics integration framework with auxiliary Classifiers-enhanced AuToencoders (MOCAT) that comprehensively utilizes intra- and inter-omics information. Additionally, attention mechanisms with confidence learning are incorporated for enhanced feature representation and trustworthy prediction. RESULTS Extensive experiments were conducted on four benchmark datasets (BRCA, ROSMAP, LGG, and KIPAN) to evaluate the effectiveness of the proposed model. Our model significantly improved on most evaluation metrics and consistently surpassed state-of-the-art methods. Ablation studies showed that the auxiliary classifiers significantly boosted classification accuracy on the ROSMAP and LGG datasets. Moreover, the attention mechanisms and confidence evaluation block contributed to improvements in the predictive accuracy and generalizability of our model. CONCLUSIONS The proposed framework exhibits superior performance in disease classification and biomarker discovery, establishing itself as a robust and versatile tool for analyzing multi-layer biological data. This study highlights the significance of elaborately designed deep learning methodologies in dissecting complex disease phenotypes and improving the accuracy of disease predictions.
Affiliation(s)
- Xiaohui Yao
- Qingdao Innovation and Development Center, Harbin Engineering University, 1777 Sansha Rd, Qingdao, 266000, Shandong, China
- College of Intelligent Systems Science and Engineering, Harbin Engineering University, 145 Nantong St, Harbin, 150001, Heilongjiang, China
- Xiaohan Jiang
- Qingdao Innovation and Development Center, Harbin Engineering University, 1777 Sansha Rd, Qingdao, 266000, Shandong, China
- Haoran Luo
- Qingdao Innovation and Development Center, Harbin Engineering University, 1777 Sansha Rd, Qingdao, 266000, Shandong, China
- College of Intelligent Systems Science and Engineering, Harbin Engineering University, 145 Nantong St, Harbin, 150001, Heilongjiang, China
- Hong Liang
- College of Intelligent Systems Science and Engineering, Harbin Engineering University, 145 Nantong St, Harbin, 150001, Heilongjiang, China
- Xiufen Ye
- College of Intelligent Systems Science and Engineering, Harbin Engineering University, 145 Nantong St, Harbin, 150001, Heilongjiang, China
- Yanhui Wei
- College of Intelligent Systems Science and Engineering, Harbin Engineering University, 145 Nantong St, Harbin, 150001, Heilongjiang, China
- Shan Cong
- Qingdao Innovation and Development Center, Harbin Engineering University, 1777 Sansha Rd, Qingdao, 266000, Shandong, China
- College of Intelligent Systems Science and Engineering, Harbin Engineering University, 145 Nantong St, Harbin, 150001, Heilongjiang, China
33
Wang S, Qiao J, Feng S. Prediction of lncRNA and disease associations based on residual graph convolutional networks with attention mechanism. Sci Rep 2024; 14:5185. [PMID: 38431702 DOI: 10.1038/s41598-024-55957-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/29/2023] [Accepted: 02/29/2024] [Indexed: 03/05/2024] Open
Abstract
LncRNAs are non-coding RNAs with a length of more than 200 nucleotides. More and more evidence shows that lncRNAs are inextricably linked with diseases. To make up for the shortcomings of traditional methods, researchers began to collect relevant biological data in the database and used bioinformatics prediction tools to predict the associations between lncRNAs and diseases, which greatly improved the efficiency of the study. To improve the prediction accuracy of current methods, we propose a new lncRNA-disease associations prediction method with attention mechanism, called ResGCN-A. Firstly, we integrated lncRNA functional similarity, lncRNA Gaussian interaction profile kernel similarity, disease semantic similarity, and disease Gaussian interaction profile kernel similarity to obtain lncRNA comprehensive similarity and disease comprehensive similarity. Secondly, the residual graph convolutional network was used to extract the local features of lncRNAs and diseases. Thirdly, the new attention mechanism was used to assign the weight of the above features to further obtain the potential features of lncRNAs and diseases. Finally, the training set required by the Extra-Trees classifier was obtained by concatenating potential features, and the potential associations between lncRNAs and diseases were obtained by the trained Extra-Trees classifier. ResGCN-A combines the residual graph convolutional network with the attention mechanism to realize the local and global features fusion of lncRNA and diseases, which is beneficial to obtain more accurate features and improve the prediction accuracy. In the experiment, ResGCN-A was compared with five other methods through 5-fold cross-validation. The results show that the AUC value and AUPR value obtained by ResGCN-A are 0.9916 and 0.9951, which are superior to the other five methods. In addition, case studies and robustness evaluation have shown that ResGCN-A is an effective method for predicting lncRNA-disease associations. 
The source code for ResGCN-A will be available at https://github.com/Wangxiuxiun/ResGCN-A .
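The two building blocks named in the abstract, a residual graph-convolution layer and a softmax attention re-weighting, can be sketched as follows. This is an illustration only, not the authors' code; the symmetric adjacency normalization, the layer width, and the single fixed attention query are assumptions:

```python
import numpy as np

def normalize_adj(adj):
    """Symmetrically normalize an adjacency matrix with self-loops:
    A_hat = D^{-1/2} (A + I) D^{-1/2}."""
    a = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def res_gcn_layer(a_hat, h, w):
    """One residual graph-convolution layer: ReLU(A_hat @ H @ W) + H.
    The skip connection requires W to preserve the feature width."""
    return np.maximum(a_hat @ h @ w, 0.0) + h

def attention_fuse(h, q):
    """Re-weight node features by a softmax attention vector derived
    from a query q (fixed here for illustration)."""
    scores = h @ q
    weights = np.exp(scores - scores.max())
    weights = weights / weights.sum()
    return weights[:, None] * h
```

In the paper, the attended lncRNA and disease features are then concatenated and passed to an Extra-Trees classifier.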
Affiliation(s)
- Shengchang Wang
- School of Electronic and Information Engineering, Harbin Institute of Technology, Harbin, 150001, China
- Jiaqing Qiao
- School of Electronic and Information Engineering, Harbin Institute of Technology, Harbin, 150001, China
- Shou Feng
- College of Information and Communication Engineering, Harbin Engineering University, Harbin, 150001, China.

34
Zhang Y, Chen Z, Yang X. Light-M: An efficient lightweight medical image segmentation framework for resource-constrained IoMT. Comput Biol Med 2024; 170:108088. [PMID: 38320339 DOI: 10.1016/j.compbiomed.2024.108088] [Received: 09/20/2023] [Revised: 12/22/2023] [Accepted: 01/27/2024] [Indexed: 02/08/2024]
Abstract
The Internet of Medical Things (IoMT) is being incorporated into current healthcare systems. This technology aims to connect patients, IoMT devices, and hospitals over mobile networks, allowing for more secure, rapid, and convenient health monitoring and intelligent healthcare services. However, existing intelligent healthcare applications typically rely on large-scale AI models, while standard IoMT devices have significant resource constraints. To resolve this mismatch, in this paper we propose a Knowledge Distillation (KD)-based IoMT end-edge-cloud orchestrated architecture for medical image segmentation tasks, called Light-M, which aims to deploy a lightweight medical model on resource-constrained IoMT devices. Specifically, Light-M trains a large teacher model in the cloud server and performs computation on local nodes with a student model that imitates the teacher through knowledge distillation. Light-M contains two KD strategies for the medical image segmentation task: (1) active exploration and passive transfer (AEPT) and (2) self-attention-based inter-class feature variation (AIFV) distillation. AEPT encourages the student model to learn undiscovered knowledge/features of the teacher model without additional feature layers, aiming to explore new features and outperform the teacher. To improve the student's ability to distinguish between classes, the student learns the self-attention-based feature variation between classes. Since the proposed AEPT and AIFV appear only in the training process, our framework imposes no additional computation burden on the student model when the segmentation task is deployed. Extensive experiments on cardiac images and public real-scene datasets demonstrate that our approach, combining the two knowledge distillation strategies, improves the student model's learned representations and outperforms state-of-the-art methods.
Moreover, when deployed on the IoT device, the distilled student model takes only 29.6 ms for one sample at the inference step.
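AEPT and AIFV are specific to Light-M, but the underlying teacher-student transfer follows the standard knowledge-distillation recipe: the student matches the teacher's softened outputs while also fitting the hard labels. A minimal sketch, where the temperature and loss weighting are illustrative assumptions rather than the paper's settings:

```python
import numpy as np

def softmax(z, t=1.0):
    """Temperature-scaled softmax along the last axis."""
    z = z / t
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, t=4.0, alpha=0.7):
    """alpha * KL(teacher || student) at temperature t, scaled by t^2,
    plus (1 - alpha) * cross-entropy against the hard labels."""
    p_t = softmax(teacher_logits, t)
    p_s = softmax(student_logits, t)
    kl = (p_t * (np.log(p_t) - np.log(p_s))).sum(axis=-1).mean() * t * t
    p_hard = softmax(student_logits)
    ce = -np.log(p_hard[np.arange(len(labels)), labels]).mean()
    return alpha * kl + (1 - alpha) * ce
```

When the student's logits equal the teacher's, the KL term vanishes and only the hard-label term remains; this also illustrates why distillation adds no cost at inference time, since the deployed student is an ordinary network.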
Affiliation(s)
- Yifan Zhang
- Shenzhen University, 3688 Nanhai Ave., Shenzhen, 518060, Guangdong, China
- Zhuangzhuang Chen
- Shenzhen University, 3688 Nanhai Ave., Shenzhen, 518060, Guangdong, China
- Xuan Yang
- Shenzhen University, 3688 Nanhai Ave., Shenzhen, 518060, Guangdong, China.

35
Lin Y, Wang J, Liu Q, Zhang K, Liu M, Wang Y. CFANet: Context fusing attentional network for preoperative CT image segmentation in robotic surgery. Comput Biol Med 2024; 171:108115. [PMID: 38402837 DOI: 10.1016/j.compbiomed.2024.108115] [Received: 10/09/2023] [Revised: 01/30/2024] [Accepted: 02/04/2024] [Indexed: 02/27/2024]
Abstract
Accurate segmentation of CT images is crucial for clinical diagnosis and for preoperative evaluation in robotic surgery, but fuzzy boundaries and small targets make it challenging. In response, a novel 2D segmentation network named Context Fusing Attentional Network (CFANet) is proposed. CFANet incorporates three key modules to address these challenges: a pyramid fusing module (PFM), a parallel dilated convolution module (PDCM), and a scale attention module (SAM). Integrating these modules into an encoder-decoder structure enables effective utilization of multi-level and multi-scale features. Compared with advanced segmentation methods, the Dice score improved by 2.14% on a liver tumor dataset. This improvement is expected to benefit the preoperative evaluation of robotic surgery and to support clinical diagnosis, especially early tumor detection.
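The reported 2.14% improvement is in the Dice score, the standard overlap metric for segmentation. For reference, a minimal implementation for binary masks:

```python
import numpy as np

def dice_score(pred, target, eps=1e-7):
    """Dice similarity coefficient between two binary masks:
    2 * |P intersect T| / (|P| + |T|)."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)
```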
Affiliation(s)
- Yao Lin
- College of Electrical and Information Engineering, Hunan University, Changsha, 410082, China; National Engineering Research Center of Robot Visual Perception and Control Technology, Hunan University, Changsha, 410082, China
- Jiazheng Wang
- College of Electrical and Information Engineering, Hunan University, Changsha, 410082, China; National Engineering Research Center of Robot Visual Perception and Control Technology, Hunan University, Changsha, 410082, China.
- Qinghao Liu
- College of Electrical and Information Engineering, Hunan University, Changsha, 410082, China; National Engineering Research Center of Robot Visual Perception and Control Technology, Hunan University, Changsha, 410082, China
- Kang Zhang
- College of Electrical and Information Engineering, Hunan University, Changsha, 410082, China; National Engineering Research Center of Robot Visual Perception and Control Technology, Hunan University, Changsha, 410082, China
- Min Liu
- College of Electrical and Information Engineering, Hunan University, Changsha, 410082, China; National Engineering Research Center of Robot Visual Perception and Control Technology, Hunan University, Changsha, 410082, China; Research Institute of Hunan University in Chongqing, Chongqing, 401135, China.
- Yaonan Wang
- College of Electrical and Information Engineering, Hunan University, Changsha, 410082, China; National Engineering Research Center of Robot Visual Perception and Control Technology, Hunan University, Changsha, 410082, China

36
Zhou Y, Zheng Y, Tian Y, Bai Y, Cai N, Wang P. SCAN: sequence-based context-aware association network for hepatic vessel segmentation. Med Biol Eng Comput 2024; 62:817-827. [PMID: 38032458 DOI: 10.1007/s11517-023-02975-z] [Received: 03/29/2023] [Accepted: 11/22/2023] [Indexed: 12/01/2023]
Abstract
Accurate segmentation of the hepatic vessels is important for surgeons designing preoperative plans for liver surgery. In this paper, a sequence-based context-aware association network (SCAN) is designed for hepatic vessel segmentation, in which three schemes are incorporated to simultaneously extract the 2D features of hepatic vessels and capture the correlations between adjacent CT slices. Two of the schemes, a slice-level attention module and a graph association module, are designed to bridge feature gaps between the encoder and the decoder in the low- and high-dimensional spaces. A region-edge constrained loss, integrating cross-entropy loss, Dice loss, and an edge-constrained loss, is designed to optimize the proposed SCAN. Experimental results indicate that the proposed SCAN is superior to several existing deep learning frameworks, achieving 0.845 DSC, 0.856 precision, 0.866 sensitivity, and 0.861 F1-score.
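The region-edge constrained loss combines region terms (cross-entropy and Dice) with an edge term. A minimal sketch of such a composite loss; the equal weights and the crude finite-difference edge map are assumptions, not the paper's exact formulation:

```python
import numpy as np

def bce(p, t, eps=1e-7):
    """Pixel-wise binary cross-entropy."""
    p = np.clip(p, eps, 1 - eps)
    return -(t * np.log(p) + (1 - t) * np.log(1 - p)).mean()

def soft_dice_loss(p, t, eps=1e-7):
    """1 - soft Dice overlap between probabilities and a binary target."""
    inter = (p * t).sum()
    return 1.0 - (2.0 * inter + eps) / (p.sum() + t.sum() + eps)

def edge_map(mask):
    """Crude boundary map: pixels where the mask changes along x or y."""
    gx = np.abs(np.diff(mask, axis=0, prepend=mask[:1]))
    gy = np.abs(np.diff(mask, axis=1, prepend=mask[:, :1]))
    return np.clip(gx + gy, 0.0, 1.0)

def region_edge_loss(p, t, w=(1.0, 1.0, 1.0)):
    """Weighted sum of cross-entropy, soft Dice, and an edge-constrained
    term (cross-entropy restricted to boundary maps)."""
    return (w[0] * bce(p, t)
            + w[1] * soft_dice_loss(p, t)
            + w[2] * bce(edge_map(p), edge_map(t)))
```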
Affiliation(s)
- Yinghong Zhou
- School of Information Engineering, Guangdong University of Technology, Guangzhou, China
- Yu Zheng
- School of Information Engineering, Guangdong University of Technology, Guangzhou, China
- Yinfeng Tian
- School of Information Engineering, Guangdong University of Technology, Guangzhou, China
- Youfang Bai
- School of Information Engineering, Guangdong University of Technology, Guangzhou, China
- Nian Cai
- School of Information Engineering, Guangdong University of Technology, Guangzhou, China.
- Ping Wang
- Department of Hepatobiliary Surgery in the First Affiliated Hospital of Guangzhou Medical University, Guangzhou, China.

37
Huang Z, Xiao Q, Xiong T, Shi W, Yang Y, Li G. Predicting Drug-Protein Interactions through Branch-Chain Mining and multi-dimensional attention network. Comput Biol Med 2024; 171:108127. [PMID: 38350397 DOI: 10.1016/j.compbiomed.2024.108127] [Received: 10/24/2023] [Revised: 01/26/2024] [Accepted: 02/06/2024] [Indexed: 02/15/2024]
Abstract
Identifying drug-protein interactions (DPIs) is crucial in drug discovery and repurposing. Computational methods for precise DPI identification can shorten development timelines and reduce expenses compared with conventional experimental methods. Recently, deep learning techniques have been employed to predict DPIs, enhancing these processes. Nevertheless, many prior studies extract features from complete drug and protein entities, overlooking the theoretical foundation that pharmacological responses are often correlated with specific substructures, which can lead to poor predictive performance. Furthermore, some substructure-focused research confines its exploration to a single fragment category, such as functional groups. Addressing these constraints, we present an end-to-end framework termed BCMMDA for predicting DPIs. The framework considers various substructure types, including branch chains, common substructures, and specific fragments. We designed a feature learning module that combines our proposed multi-dimensional attention mechanism with convolutional neural networks (CNNs). Deep CNNs help capture the synergistic effects among these fragment sets, enabling the extraction of relevant drug and protein features. Meanwhile, the multi-dimensional attention mechanism refines the relationship between drug and protein features by assigning attention vectors to each drug compound and amino acid. This empowers the model to concentrate on pivotal substructures and elements, improving its ability to identify essential interactions in DPI prediction. We evaluated BCMMDA on four well-known benchmark datasets; it outperformed state-of-the-art baseline models, demonstrating significant performance improvement.
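The paper's multi-dimensional attention assigns attention vectors to each drug compound and amino acid. A plain dot-product cross-attention between drug-token and protein-residue feature matrices conveys the core idea; the shapes and the unscaled dot-product scoring are illustrative assumptions, not BCMMDA's exact mechanism:

```python
import numpy as np

def softmax(z):
    """Softmax along the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cross_attention(drug_feats, prot_feats):
    """Give each drug token an attention-weighted summary of the protein
    residues, and each residue a summary of the drug tokens."""
    scores = drug_feats @ prot_feats.T          # (n_drug, n_prot)
    drug_ctx = softmax(scores) @ prot_feats     # protein context per drug token
    prot_ctx = softmax(scores.T) @ drug_feats   # drug context per residue
    return drug_ctx, prot_ctx
```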
Affiliation(s)
- Zhuo Huang
- College of Information Science and Engineering, Hunan Normal University, Changsha, 410081, China
- Qiu Xiao
- College of Information Science and Engineering, Hunan Normal University, Changsha, 410081, China; MOE-LCSM, School of Mathematics and Statistics, Hunan Normal University, Changsha, 410081, China; College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, China.
- Tuo Xiong
- College of Information Science and Engineering, Hunan Normal University, Changsha, 410081, China
- Wanwan Shi
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, China
- Yide Yang
- Key Laboratory of Molecular Epidemiology of Hunan Province, School of Medicine, Hunan Normal University, Changsha, 410006, China.
- Guanghui Li
- School of Information Engineering, East China Jiaotong University, Nanchang, 330013, China.

38
Wei K, Kong W, Liu L, Wang J, Li B, Zhao B, Li Z, Zhu J, Yu G. CT synthesis from MR images using frequency attention conditional generative adversarial network. Comput Biol Med 2024; 170:107983. [PMID: 38286104 DOI: 10.1016/j.compbiomed.2024.107983] [Received: 04/24/2023] [Revised: 12/24/2023] [Accepted: 01/13/2024] [Indexed: 01/31/2024]
Abstract
Magnetic resonance (MR) image-guided radiotherapy is widely used in the treatment planning of malignant tumors, and MR-only radiotherapy, a representative of this technique, requires synthetic computed tomography (sCT) images for effective radiotherapy planning. Convolutional neural networks (CNNs) have shown remarkable performance in generating sCT images. However, CNN-based models tend to synthesize more low-frequency components, and the pixel-wise loss function usually used to optimize the model can result in blurred images. To address these problems, a frequency attention conditional generative adversarial network (FACGAN) is proposed in this paper. Specifically, a frequency cycle generative model (FCGM) is designed to enhance the inter-mapping between MR and CT and to extract richer tissue structure information. Additionally, a residual frequency channel attention (RFCA) module is proposed and incorporated into the generator to enhance its ability to perceive high-frequency image features. Finally, a high-frequency loss (HFL) and a cycle-consistency high-frequency loss (CHFL) are added to the objective function to optimize model training. The effectiveness of the proposed model is validated on pelvic and brain datasets and compared with state-of-the-art deep learning models. The results show that FACGAN produces higher-quality sCT images while retaining clearer and richer high-frequency texture information.
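A high-frequency loss of this kind penalizes discrepancies in high-frequency content between synthesized and real CT. A minimal sketch using an FFT high-pass filter and an L1 distance; the radial cutoff and the L1 choice are assumptions, not the paper's exact definition:

```python
import numpy as np

def high_pass(img, cutoff=0.25):
    """Zero out spatial frequencies below `cutoff` (fraction of Nyquist)
    in a centred 2D FFT, then transform back to image space."""
    f = np.fft.fftshift(np.fft.fft2(img))
    h, w = img.shape
    yy, xx = np.mgrid[:h, :w]
    r = np.sqrt(((yy - h / 2) / (h / 2)) ** 2 + ((xx - w / 2) / (w / 2)) ** 2)
    f[r < cutoff] = 0.0
    return np.real(np.fft.ifft2(np.fft.ifftshift(f)))

def high_frequency_loss(fake_ct, real_ct):
    """L1 distance between high-frequency components of the two images."""
    return np.abs(high_pass(fake_ct) - high_pass(real_ct)).mean()
```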
Affiliation(s)
- Kexin Wei
- Shandong Key Laboratory of Medical Physics and Image Processing, Shandong Institute of Industrial Technology for Health Sciences and Precision Medicine, School of Physics and Electronics, Shandong Normal University, Jinan, China
- Weipeng Kong
- Shandong Key Laboratory of Medical Physics and Image Processing, Shandong Institute of Industrial Technology for Health Sciences and Precision Medicine, School of Physics and Electronics, Shandong Normal University, Jinan, China
- Liheng Liu
- Department of Radiology, Zhongshan Hospital, Fudan University, Shanghai, China
- Jian Wang
- Department of Radiology, Central Hospital Affiliated to Shandong First Medical University, Jinan, China
- Baosheng Li
- Shandong Cancer Hospital and Institute, Shandong First Medical University and Shandong Academy of Medical Sciences, No.440, Jiyan Road, Jinan, 250117, Shandong Province, China
- Bo Zhao
- Shandong Key Laboratory of Medical Physics and Image Processing, Shandong Institute of Industrial Technology for Health Sciences and Precision Medicine, School of Physics and Electronics, Shandong Normal University, Jinan, China
- Zhenjiang Li
- Shandong Cancer Hospital and Institute, Shandong First Medical University and Shandong Academy of Medical Sciences, No.440, Jiyan Road, Jinan, 250117, Shandong Province, China
- Jian Zhu
- Shandong Cancer Hospital and Institute, Shandong First Medical University and Shandong Academy of Medical Sciences, No.440, Jiyan Road, Jinan, 250117, Shandong Province, China.
- Gang Yu
- Shandong Key Laboratory of Medical Physics and Image Processing, Shandong Institute of Industrial Technology for Health Sciences and Precision Medicine, School of Physics and Electronics, Shandong Normal University, Jinan, China.

39
Yang K, Song J, Liu M, Xue L, Liu S, Yin X, Liu K. TBACkp: HER2 expression status classification network focusing on intrinsic subenvironmental characteristics of breast cancer liver metastases. Comput Biol Med 2024; 170:108002. [PMID: 38277921 DOI: 10.1016/j.compbiomed.2024.108002] [Received: 08/28/2023] [Revised: 12/24/2023] [Accepted: 01/13/2024] [Indexed: 01/28/2024]
Abstract
The HER2 expression status of breast cancer liver metastases is a crucial indicator for the diagnosis, treatment, and prognosis assessment of patients. Typically, HER2 status is assessed through invasive procedures such as biopsy. However, this approach has drawbacks: tissue samples are difficult to obtain and examination periods are long. To address these limitations, we propose an AI-aided diagnostic model that enables rapid diagnosis. It diagnoses a patient's HER2 expression status on the basis of preprocessed images, namely the lesion region extracted from a CT image rather than an actual tissue sample. The model adopts a parallel structure comprising a Branch Block and a Trunk Block. The Branch Block extracts the gradient characteristics between tumor sub-environments, and the Trunk Block fuses the characteristics extracted by the Branch Block. The Branch Block contains a CNN with self-attention, combining the advantages of CNNs and self-attention to extract more detailed and comprehensive image features. The Trunk Block is designed to fuse the extracted feature information without affecting the transmission of the original image features. Conv-Attention, which uses a kernel dot product, computes the attention in the Trunk Block and provides the weights for the self-attention in the convolution-induced-bias calculation. Reflecting this structure and method, we refer to the model as TBACkp. The dataset comprises enhanced abdominal CT images of 151 patients with liver metastases from breast cancer, together with each patient's HER2 expression level. The experimental results are as follows: AUC 0.915, ACC 0.854, specificity 0.809, precision 0.863, recall 0.881, F1-score 0.872. These results demonstrate that the method can accurately assess HER2 expression status when compared with other advanced deep learning models.
Affiliation(s)
- Kun Yang
- College of Quality and Technical Supervision, Hebei University, Baoding, China; Hebei Technology Innovation Center for Lightweight of New Energy Vehicle Power System, Baoding, China; Scientific Research and Innovation Team of Hebei University, Baoding, China
- Jie Song
- College of Quality and Technical Supervision, Hebei University, Baoding, China; Hebei Technology Innovation Center for Lightweight of New Energy Vehicle Power System, Baoding, China; Scientific Research and Innovation Team of Hebei University, Baoding, China
- Meng Liu
- Department of Radiology, Affiliated Hospital of Hebei University, Baoding, China
- Linyan Xue
- College of Quality and Technical Supervision, Hebei University, Baoding, China; Hebei Technology Innovation Center for Lightweight of New Energy Vehicle Power System, Baoding, China; Scientific Research and Innovation Team of Hebei University, Baoding, China
- Shuang Liu
- College of Quality and Technical Supervision, Hebei University, Baoding, China; Hebei Technology Innovation Center for Lightweight of New Energy Vehicle Power System, Baoding, China; Scientific Research and Innovation Team of Hebei University, Baoding, China
- Xiaoping Yin
- Department of Radiology, Affiliated Hospital of Hebei University, Baoding, China; Hebei Key Laboratory of Precise Imaging of Inflammation Related Tumors, Hebei University, Baoding, China; The Outstanding Young Scientific Research and Innovation Team of Hebei University, Baoding, China.
- Kun Liu
- College of Quality and Technical Supervision, Hebei University, Baoding, China; Hebei Technology Innovation Center for Lightweight of New Energy Vehicle Power System, Baoding, China; Scientific Research and Innovation Team of Hebei University, Baoding, China.

40
Wen W, Zhang H, Wang Z, Gao X, Wu P, Lin J, Zeng N. Enhanced multi-label cardiology diagnosis with channel-wise recurrent fusion. Comput Biol Med 2024; 171:108210. [PMID: 38417383 DOI: 10.1016/j.compbiomed.2024.108210] [Received: 01/08/2024] [Revised: 02/08/2024] [Accepted: 02/25/2024] [Indexed: 03/01/2024]
Abstract
The timely detection of abnormal electrocardiogram (ECG) signals is vital for preventing heart disease. However, traditional automated cardiology diagnostic methods cannot simultaneously identify multiple diseases in a segment of ECG signals and do not consider the potential correlations between the 12-lead ECG signals. To address these issues, this paper presents a novel network architecture, the Branched Convolution and Channel Fusion Network (BCCF-Net), designed for multi-label ECG diagnosis to identify multiple diseases simultaneously. BCCF-Net incorporates the Channel-wise Recurrent Fusion (CRF) network, designed to enhance the exploration of potential correlations among the 12 leads. Furthermore, the squeeze-and-excitation (SE) attention mechanism maximizes the potential of the convolutional neural network (CNN). To efficiently capture complex spatial and temporal patterns across various scales, the multi-branch convolution (MBC) module has been developed. Through extensive experiments on two public datasets with seven subtasks, the efficacy and robustness of the proposed ECG multi-label classification framework have been comprehensively evaluated. The results demonstrate the superior performance of BCCF-Net compared with other state-of-the-art algorithms. The developed framework has practical application in clinical settings, allowing for the refined diagnosis of cardiac arrhythmias through ECG signal analysis.
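Of the components listed, the squeeze-and-excitation (SE) attention is a well-established mechanism: each channel is summarized by global average pooling, passed through a small bottleneck, and used to gate the channels. A minimal sketch for 1D (per-lead) feature maps, with illustrative bottleneck weights:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def squeeze_excite(x, w1, w2):
    """Squeeze-and-Excitation for x of shape (channels, length):
    global-average-pool each channel ('squeeze'), run a ReLU bottleneck
    ('excitation'), and rescale channels by the resulting gates."""
    s = x.mean(axis=1)              # squeeze: one scalar per channel
    z = np.maximum(w1 @ s, 0.0)     # bottleneck with ReLU
    gates = sigmoid(w2 @ z)         # per-channel gates in (0, 1)
    return x * gates[:, None]
```

Because the gates lie in (0, 1), SE can only attenuate channels relative to their input, steering the network toward the more informative leads.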
Affiliation(s)
- Weimin Wen
- School of Opto-Electronic and Communication Engineering, Xiamen University of Technology, Xiamen 361024, China
- Hongyi Zhang
- School of Opto-Electronic and Communication Engineering, Xiamen University of Technology, Xiamen 361024, China
- Zidong Wang
- Department of Computer Science, Brunel University London, Uxbridge UB8 3PH, UK.
- Xingen Gao
- School of Opto-Electronic and Communication Engineering, Xiamen University of Technology, Xiamen 361024, China
- Peishu Wu
- Department of Instrumental and Electrical Engineering, Xiamen University, Fujian 361005, China
- Juqiang Lin
- School of Opto-Electronic and Communication Engineering, Xiamen University of Technology, Xiamen 361024, China
- Nianyin Zeng
- Department of Instrumental and Electrical Engineering, Xiamen University, Fujian 361005, China.

41
Tang X, Luo L, Wang S. TSE-ARF: An adaptive prediction method of effectors across secretion system types. Anal Biochem 2024; 686:115407. [PMID: 38030053 DOI: 10.1016/j.ab.2023.115407] [Received: 09/02/2023] [Revised: 11/12/2023] [Accepted: 11/20/2023] [Indexed: 12/01/2023]
Abstract
Bacterial effector proteins are secreted by a variety of protein secretion systems and play an important role in the interaction between hosts and pathogenic bacteria. It is therefore important to find fast and inexpensive methods to discover bacterial effectors. In this study, we propose a multi-type secretion effector adaptive random forest (TSE-ARF) to adaptively identify secretion effectors across T1SE-T4SE and T6SE based only on protein sequences. First, we proposed two new feature descriptors that capture characteristic protein information and fused them with several universal features to form a 290-dimensional feature vector with good versatility. Then, the TSE-ARF model makes classification predictions, adapting its parameters to the different secretion-effector types by integrating the Shuffled Frog Leaping Algorithm with a random forest. The strong performance of TSE-ARF across different datasets and settings demonstrates considerable generalization ability, and the model was used to screen further candidate effectors across whole genomes. Source code is available at https://github.com/AIMOVE/TSE-ARF.
Affiliation(s)
- Xianjun Tang
- Department of Computer Science and Engineering, School of Information Science and Engineering, Yunnan University, Kunming, 650504, Yunnan, China
- Longfei Luo
- Department of Computer Science and Engineering, School of Information Science and Engineering, Yunnan University, Kunming, 650504, Yunnan, China
- Shunfang Wang
- Department of Computer Science and Engineering, School of Information Science and Engineering, Yunnan University, Kunming, 650504, Yunnan, China; Yunnan Key Laboratory of Intelligent Systems and Computing, Yunnan University, Kunming, Yunnan, China.

42
Wang Z, Yu L, Tian S, Huo X. CRMEFNet: A coupled refinement, multiscale exploration and fusion network for medical image segmentation. Comput Biol Med 2024; 171:108202. [PMID: 38402839 DOI: 10.1016/j.compbiomed.2024.108202] [Received: 07/09/2023] [Revised: 12/22/2023] [Accepted: 02/18/2024] [Indexed: 02/27/2024]
Abstract
Accurate segmentation of target areas in medical images, such as lesions, is essential for disease diagnosis and clinical analysis. In recent years, deep learning methods have been intensively researched and have generated significant progress in medical image segmentation tasks. However, most existing methods have limitations in modeling multilevel feature representations and in identifying complex textured pixels at contrasting boundaries. This paper proposes a novel coupled refinement and multiscale exploration and fusion network (CRMEFNet) for medical image segmentation, which explores the optimization and fusion of multiscale features to address the abovementioned limitations. CRMEFNet consists of three main innovations: a coupled refinement module (CRM), a multiscale exploration and fusion module (MEFM), and a cascaded progressive decoder (CPD). The CRM decouples features into low-frequency body features and high-frequency edge features and performs targeted optimization of both to enhance intraclass uniformity and interclass differentiation. The MEFM performs a two-stage exploration and fusion of multiscale features using our proposed multiscale aggregation attention mechanism, which explores the differentiated information within cross-level features and enhances the contextual connections between them to achieve adaptive feature fusion. Compared with existing complex decoders, the CPD decoder (consisting of the CRM and MEFM) can perform fine-grained pixel recognition while retaining complete semantic location information, with a simple design and excellent performance. Experimental results on five medical image segmentation tasks, ten datasets, and twelve comparison models demonstrate the state-of-the-art performance, interpretability, flexibility, and versatility of CRMEFNet.
Affiliation(s)
- Zhi Wang
- College of Software, Xinjiang University, Urumqi, 830000, China; Key Laboratory of Software Engineering Technology, Xinjiang University, Urumqi, 830000, China
- Long Yu
- College of Network Center, Xinjiang University, Urumqi, 830000, China; Signal and Signal Processing Laboratory, College of Information Science and Engineering, Xinjiang University, Urumqi, 830000, China.
- Shengwei Tian
- College of Software, Xinjiang University, Urumqi, 830000, China; Key Laboratory of Software Engineering Technology, Xinjiang University, Urumqi, 830000, China
- Xiangzuo Huo
- Key Laboratory of Software Engineering Technology, Xinjiang University, Urumqi, 830000, China; Signal and Signal Processing Laboratory, College of Information Science and Engineering, Xinjiang University, Urumqi, 830000, China

43
Farkhani S, Demnitz N, Boraxbekk CJ, Lundell H, Siebner HR, Petersen ET, Madsen KH. End-to-end volumetric segmentation of white matter hyperintensities using deep learning. Comput Methods Programs Biomed 2024; 245:108008. [PMID: 38290291 DOI: 10.1016/j.cmpb.2024.108008] [Received: 07/12/2023] [Revised: 12/08/2023] [Accepted: 01/03/2024] [Indexed: 02/01/2024]
Abstract
BACKGROUND AND OBJECTIVES Reliable detection of white matter hyperintensities (WMH) is crucial for studying the impact of diffuse white-matter pathology on brain health and for monitoring changes in WMH load over time. However, manual annotation of high-dimensional 3D neuroimages is laborious and prone to biases and errors in the annotation procedure. In this study, we evaluate the performance of deep learning (DL) segmentation tools and propose a novel volumetric segmentation model incorporating self-attention via a transformer-based architecture. Ultimately, we aim to evaluate the diverse factors that influence WMH segmentation, providing a comprehensive analysis of state-of-the-art algorithms in a broader context. METHODS We trained state-of-the-art DL algorithms, incorporating advanced attention mechanisms, on structural fluid-attenuated inversion recovery (FLAIR) image acquisitions. The anatomical MRI data used for model training were obtained from healthy individuals aged 62-70 years in the Live active Successful Aging (LISA) project. Given the potential sparsity of lesion volume among healthy aging individuals, we explored the impact of incorporating a weighted loss function and of ensemble models. To assess the generalizability of the studied DL models, we applied the trained algorithms to an independent subset of data from the MICCAI WMH challenge (MWSC), whose acquisition parameters differ markedly from those of the LISA dataset used for training. RESULTS DL approaches consistently exhibited commendable segmentation performance, reaching inter-rater agreement comparable to expert performance and ensuring high-quality segmentation outcomes. On the out-of-sample dataset, the ensemble models exhibited the most outstanding performance. CONCLUSIONS DL methods generally surpassed conventional approaches in our study. While all DL methods performed comparably, incorporating attention mechanisms could prove advantageous in future applications with wider availability of training data. As expected, our experiments indicate that ensemble-based models enable superior generalization in out-of-distribution settings. We believe that introducing DL methods into the WMH annotation workflow in healthy aging cohorts is promising, not only for reducing the annotation time required, but also for eventually improving accuracy and robustness by incorporating the automatic segmentations into the evaluation procedure.
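The weighted loss function mentioned for sparse lesion volumes can be illustrated with a binary cross-entropy that up-weights the rare lesion class; the weight value below is an arbitrary assumption, not the study's setting:

```python
import numpy as np

def weighted_bce(p, t, pos_weight=10.0, eps=1e-7):
    """Binary cross-entropy with an up-weighted positive (lesion) class,
    compensating for the sparsity of WMH voxels in healthy cohorts."""
    p = np.clip(p, eps, 1.0 - eps)
    return -(pos_weight * t * np.log(p) + (1.0 - t) * np.log(1.0 - p)).mean()
```

With pos_weight greater than 1, a missed lesion voxel costs more than an equally confident false positive, pushing the model away from the trivial all-background prediction.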
Affiliation(s)
- Sadaf Farkhani
- Danish Research Center for Magnetic Resonance, Center for Functional and Diagnostic Imaging and Research, Copenhagen University Hospital-Amager and Hvidovre, Kattegaard Alle 30, Hvidovre, Denmark.
- Naiara Demnitz
- Danish Research Center for Magnetic Resonance, Center for Functional and Diagnostic Imaging and Research, Copenhagen University Hospital-Amager and Hvidovre, Kattegaard Alle 30, Hvidovre, Denmark
- Carl-Johan Boraxbekk
- Danish Research Center for Magnetic Resonance, Center for Functional and Diagnostic Imaging and Research, Copenhagen University Hospital-Amager and Hvidovre, Kattegaard Alle 30, Hvidovre, Denmark; Institute for Clinical Medicine, Faculty of Medical and Health Sciences, University of Copenhagen, Denmark; Department of Neurology, Copenhagen University Hospital Bispebjerg and Frederiksberg, Copenhagen, Denmark; Institute of Sports Medicine Copenhagen (ISMC), Copenhagen University Hospital Bispebjerg and Frederiksberg, Copenhagen, Denmark
- Henrik Lundell
- Danish Research Center for Magnetic Resonance, Center for Functional and Diagnostic Imaging and Research, Copenhagen University Hospital-Amager and Hvidovre, Kattegaard Alle 30, Hvidovre, Denmark; Department of Health Technology, Technical University of Denmark, Lyngby, Denmark
- Hartwig Roman Siebner
- Danish Research Center for Magnetic Resonance, Center for Functional and Diagnostic Imaging and Research, Copenhagen University Hospital-Amager and Hvidovre, Kattegaard Alle 30, Hvidovre, Denmark; Institute for Clinical Medicine, Faculty of Medical and Health Sciences, University of Copenhagen, Denmark; Department of Neurology, Copenhagen University Hospital Bispebjerg and Frederiksberg, Copenhagen, Denmark
- Esben Thade Petersen
- Danish Research Center for Magnetic Resonance, Center for Functional and Diagnostic Imaging and Research, Copenhagen University Hospital-Amager and Hvidovre, Kattegaard Alle 30, Hvidovre, Denmark; Department of Health Technology, Technical University of Denmark, Lyngby, Denmark
- Kristoffer Hougaard Madsen
- Danish Research Center for Magnetic Resonance, Center for Functional and Diagnostic Imaging and Research, Copenhagen University Hospital-Amager and Hvidovre, Kattegaard Alle 30, Hvidovre, Denmark; Department of Applied Mathematics and Computer Science, Technical University of Denmark, Lyngby, Denmark
44
Qi X, Wang H, Ji Y, Li Y, Luo X, Nie R, Liang X. Daily natural gas load prediction method based on APSO optimization and Attention-BiLSTM. PeerJ Comput Sci 2024; 10:e1890. [PMID: 38435580 PMCID: PMC10909168 DOI: 10.7717/peerj-cs.1890] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/19/2023] [Accepted: 01/29/2024] [Indexed: 03/05/2024]
Abstract
As the economy continues to develop and technology advances, there is an increasing societal need for an environmentally friendly ecosystem. Consequently, natural gas, known for its minimal greenhouse gas emissions, has been widely adopted as a clean energy alternative. The accurate prediction of short-term natural gas demand poses a significant challenge within this context, as precise forecasts have important implications for gas dispatch and pipeline safety. The incorporation of intelligent algorithms into prediction methodologies has resulted in notable progress in recent times. Nevertheless, certain limitations persist, including a tendency to fall into local optima and inadequate search capability. To address the challenge of accurately predicting daily natural gas loads, we propose a novel methodology that integrates the adaptive particle swarm optimization algorithm, an attention mechanism, and bidirectional long short-term memory (BiLSTM) neural networks. The initial step involves utilizing the BiLSTM network to conduct bidirectional data learning. Following this, the attention mechanism is employed to calculate the weights of the hidden layer in the BiLSTM, with a specific focus on weight distribution. Lastly, the adaptive particle swarm optimization algorithm is utilized to comprehensively optimize the network structure, initial learning rate, and number of training rounds of the BiLSTM model, thereby enhancing its accuracy. The findings revealed that the combined model achieved a mean absolute percentage error (MAPE) of 0.90% and a coefficient of determination (R2) of 0.99. These results surpassed those of the other comparative models, demonstrating superior prediction accuracy as well as favorable generalization and prediction stability.
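The attention step described above, computing weights over the BiLSTM hidden layer and aggregating the timesteps, can be sketched as additive attention pooling. The numpy version below is a generic illustration under assumed shapes (5 timesteps, hidden size 8, random parameters), not the paper's implementation.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1D score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_pool(H, W, v):
    """Additive attention over hidden states H (T x d): score each timestep,
    softmax the scores into weights, return the weighted context vector."""
    scores = np.tanh(H @ W) @ v   # (T,) one scalar score per timestep
    alpha = softmax(scores)       # attention weights, sum to 1
    context = alpha @ H           # (d,) weighted sum of hidden states
    return context, alpha

rng = np.random.default_rng(0)
T, d = 5, 8                       # 5 timesteps, hidden size 8 (illustrative)
H = rng.standard_normal((T, d))   # stand-in for BiLSTM hidden states
W = rng.standard_normal((d, d))
v = rng.standard_normal(d)
context, alpha = attention_pool(H, W, v)
```

The weights `alpha` are what a method like the one above would then feed into (or learn jointly with) the downstream regression head; timesteps with larger scores dominate the context vector.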
Affiliation(s)
- Xinjing Qi
- College of Metrology and Measurement Engineering, China Jiliang University, Hangzhou, Zhejiang, China
- Huan Wang
- Ningbo China Resources Xingguang Gas Co Ltd, Ningbo, Zhejiang, China
- Yubo Ji
- Ningbo China Resources Xingguang Gas Co Ltd, Ningbo, Zhejiang, China
- Yuan Li
- Wuhan Gas & Heat and Design Institute Co Ltd, Wuhan, Hubei, China
- Xuguang Luo
- Wuhan Gas & Heat and Design Institute Co Ltd, Wuhan, Hubei, China
- Rongshan Nie
- College of Quality and Safety Engineering, China Jiliang University, Hangzhou, Zhejiang, China
- Xiaoyu Liang
- College of Metrology and Measurement Engineering, China Jiliang University, Hangzhou, Zhejiang, China
- College of Quality and Safety Engineering, China Jiliang University, Hangzhou, Zhejiang, China
45
Argade D, Khairnar V, Vora D, Patil S, Kotecha K, Alfarhood S. Multimodal Abstractive Summarization using bidirectional encoder representations from transformers with attention mechanism. Heliyon 2024; 10:e26162. [PMID: 38420442 PMCID: PMC10900395 DOI: 10.1016/j.heliyon.2024.e26162] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/08/2023] [Revised: 01/28/2024] [Accepted: 02/08/2024] [Indexed: 03/02/2024] Open
Abstract
In recent decades, abstractive text summarization using multimodal input has attracted many researchers due to its capability of gathering information from various sources to create a concise summary. However, existing multimodal summarization methodologies produce adequate summaries only for short videos and give poor results for lengthy videos. To address these issues, this research presents Multimodal Abstractive Summarization using Bidirectional Encoder Representations from Transformers (MAS-BERT) with an attention mechanism. The purpose of video summarization is to speed up searching through a large collection of videos, so that users can quickly decide whether a video is relevant by reading its summary. Initially, the data are obtained from the publicly available How2 dataset and encoded using a Bidirectional Gated Recurrent Unit (Bi-GRU) encoder and a Long Short Term Memory (LSTM) encoder: the textual data embedded in the embedding layer are encoded with the bidirectional GRU encoder, while the audio and video features are encoded with the LSTM encoder. After this, a BERT-based attention mechanism is used to combine the modalities, and finally a Bi-GRU-based decoder summarizes the combined multimodal representation. The experimental results show that the proposed MAS-BERT achieved a better Rouge-1 score of 60.2, whereas the existing Decoder-only Multimodal Transformer (D-MmT) and the Factorized Multimodal Transformer based Decoder Only Language model (FLORAL) achieved 49.58 and 56.89, respectively. Our work facilitates users by providing better contextual information and user experience, and would help video-sharing platforms retain customers by allowing users to search for relevant videos by reading their summaries.
Affiliation(s)
- Dakshata Argade
- Terna Engineering College, Nerul, Navi Mumbai, 400706, India
- Deepali Vora
- Symbiosis Institute of Technology, Pune Campus, Symbiosis International (Deemed University), Pune, 412115, India
- Shruti Patil
- Symbiosis Institute of Technology, Pune Campus, Symbiosis International (Deemed University), Pune, 412115, India
- Symbiosis Centre for Applied Artificial Intelligence (SCAAI), Symbiosis Institute of Technology Pune Campus, Symbiosis International (Deemed University) (SIU), Lavale, Pune, 412115, India
- Ketan Kotecha
- Symbiosis Centre for Applied Artificial Intelligence (SCAAI), Symbiosis Institute of Technology Pune Campus, Symbiosis International (Deemed University) (SIU), Lavale, Pune, 412115, India
- Sultan Alfarhood
- Department of Computer Science, College of Computer and Information Sciences, King Saud University, P.O.Box 51178, Riyadh, 11543, Saudi Arabia
46
Wei W, Zhang L, Yang K, Li J, Cui N, Han Y, Zhang N, Yang X, Tan H, Wang K. A lightweight network for traffic sign recognition based on multi-scale feature and attention mechanism. Heliyon 2024; 10:e26182. [PMID: 38420439 PMCID: PMC10900943 DOI: 10.1016/j.heliyon.2024.e26182] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/24/2023] [Revised: 01/29/2024] [Accepted: 02/08/2024] [Indexed: 03/02/2024] Open
Abstract
Traffic sign recognition is an important part of intelligent transportation systems. It uses computer vision and traffic sign recognition technology to detect and recognize traffic signs on the road automatically. In this paper, we propose a lightweight convolutional neural network model for traffic sign recognition called ConvNeSe. Firstly, the feature extraction module of the model is constructed using Depthwise Separable Convolution and Inverted Residuals structures. The model extracts multi-scale features with strong representation ability by optimizing the structure of the convolutional network and fusing features. Then, the model introduces the Squeeze and Excitation Block (SE Block) to increase attention to important features, capturing key information in traffic sign images. Finally, the model achieves an accuracy of 99.85% on the German Traffic Sign Recognition Benchmark (GTSRB). Ablation experiments further indicate that the model is robust.
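The Squeeze-and-Excitation mechanism this abstract refers to can be sketched in a few lines: pool each channel to a scalar ("squeeze"), pass the result through a small two-layer bottleneck ("excitation"), and rescale the channels. The numpy version below is a minimal illustration with random weights and an assumed reduction ratio of 2, not the paper's trained model.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def se_block(x, w1, w2):
    """Squeeze-and-Excitation: global-average-pool each channel ("squeeze"),
    run a two-layer ReLU/sigmoid bottleneck ("excitation"), rescale channels."""
    z = x.mean(axis=(1, 2))                    # squeeze: (C,)
    s = sigmoid(np.maximum(z @ w1, 0) @ w2)    # excitation: (C,) scales in (0, 1)
    return x * s[:, None, None]                # channel-wise reweighting

rng = np.random.default_rng(1)
C, H, W_, r = 8, 4, 4, 2                       # channels, height, width, reduction
x = rng.standard_normal((C, H, W_))            # stand-in feature map
w1 = rng.standard_normal((C, C // r))          # bottleneck down-projection
w2 = rng.standard_normal((C // r, C))          # bottleneck up-projection
y = se_block(x, w1, w2)
```

Because each scale lies in (0, 1), the block can only attenuate channels; the network learns to keep informative channels near 1 and suppress the rest, which is the "attention to important features" described above.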
Affiliation(s)
- Wei Wei
- Beijing Institute of Petrochemical Technology, Beijing, 102617, China
- Lili Zhang
- Beijing Institute of Petrochemical Technology, Beijing, 102617, China
- Kang Yang
- Beijing Institute of Petrochemical Technology, Beijing, 102617, China
- Jing Li
- Beijing Institute of Petrochemical Technology, Beijing, 102617, China
- Ning Cui
- Beijing Institute of Petrochemical Technology, Beijing, 102617, China
- Yucheng Han
- Beijing Institute of Petrochemical Technology, Beijing, 102617, China
- Ning Zhang
- Beijing Institute of Petrochemical Technology, Beijing, 102617, China
- Xudong Yang
- Beijing Institute of Petrochemical Technology, Beijing, 102617, China
- Hongxin Tan
- Science and Technology on Complex Aviation Systems Simulation Laboratory, Beijing, 100076, China
- Kai Wang
- Institute of National Defense Science and Technology Innovation, Academy of Military Sciences, Beijing, 100036, China
47
Jakkaladiki SP, Maly F. Integrating hybrid transfer learning with attention-enhanced deep learning models to improve breast cancer diagnosis. PeerJ Comput Sci 2024; 10:e1850. [PMID: 38435578 PMCID: PMC10909230 DOI: 10.7717/peerj-cs.1850] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/02/2023] [Accepted: 01/10/2024] [Indexed: 03/05/2024]
Abstract
Cancer, with its high fatality rate, instills fear in countless individuals worldwide. However, effective diagnosis and treatment can often lead to a successful cure. Computer-assisted diagnostics, especially in the context of deep learning, have become prominent methods for primary screening of various diseases, including cancer. Deep learning, an artificial intelligence technique that enables computers to reason like humans, has recently gained significant attention. This study focuses on training a deep neural network to predict breast cancer. With the advancements in medical imaging technologies such as X-ray, magnetic resonance imaging (MRI), and computed tomography (CT) scans, deep learning has become essential in analyzing and managing extensive image datasets. The objective of this research is to propose a deep-learning model for the identification and categorization of breast tumors. The system's performance was evaluated using the breast cancer identification (BreakHis) classification datasets from the Kaggle repository and the Wisconsin Breast Cancer Dataset (WBC) from the UCI repository. The study's findings demonstrated an impressive accuracy rate of 100%, surpassing other state-of-the-art approaches. The suggested model was thoroughly evaluated using F1-score, recall, precision, and accuracy metrics on the WBC dataset. Training, validation, and testing were conducted using pre-processed datasets, leading to remarkable results of 99.8% recall rate, 99.06% F1-score, and 100% accuracy rate on the BreakHis dataset. Similarly, on the WBC dataset, the model achieved a 99% accuracy rate, a 98.7% recall rate, and a 99.03% F1-score. These outcomes highlight the potential of deep learning models in accurately diagnosing breast cancer. Based on our research, it is evident that the proposed system outperforms existing approaches in this field.
Affiliation(s)
- Sudha Prathyusha Jakkaladiki
- Faculty of Informatics and Management, University of Hradec Králové, Hradec Kralove, Hradec Kralove, Czech Republic
- Filip Maly
- Faculty of Informatics and Management, University of Hradec Králové, Hradec Kralove, Hradec Kralove, Czech Republic
48
Chen Y, Li X, Lv N, He Z, Wu B. Automatic detection method for tobacco beetles combining multi-scale global residual feature pyramid network and dual-path deformable attention. Sci Rep 2024; 14:4862. [PMID: 38418868 PMCID: PMC10902385 DOI: 10.1038/s41598-024-55347-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/23/2023] [Accepted: 02/22/2024] [Indexed: 03/02/2024] Open
Abstract
To address the difficulty of identifying the tobacco beetle, a storage pest, in images with few object pixels and considerable noise, which therefore lack information and identifiable features, this paper proposes an automatic tobacco beetle monitoring method based on a Multi-scale Global residual Feature Pyramid Network and Dual-path Deformable Attention (MGrFPN-DDrGAM). Firstly, a Multi-scale Global residual Feature Pyramid Network (MGrFPN) is constructed to obtain rich high-level semantic features and more complete low-level feature information, reducing missed detections. Then, a Dual-path Deformable receptive field Guided Attention Module (DDrGAM) is designed to establish long-range channel dependence, guide the effective fusion of features, and improve the localization accuracy of tobacco beetles by fitting spatial geometric deformation features and capturing spatial information from feature maps at different scales, thereby enriching feature information in both the channel and spatial dimensions. Finally, to simulate real scenes, a multi-scene tobacco beetle dataset is created, comprising 28,080 images with manually labeled tobacco beetle objects. The experimental results show that, under the Faster R-CNN framework, the detection precision and recall of this method reach 91.4% and 98.4% at an intersection-over-union (IoU) threshold of 0.5. Compared with Faster R-CNN and FPN, the detection precision improves by 32.9% and 6.9%, respectively, at an IoU threshold of 0.7. The proposed method is superior to current mainstream methods.
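The precision and recall figures quoted at IoU 0.5 and 0.7 rest on box overlap and matching. The sketch below shows the standard IoU computation and a greedy one-to-one matching scheme; the boxes and threshold are toy values for illustration, not the paper's evaluation harness.

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def precision_recall(preds, gts, thr=0.5):
    """A prediction counts as a true positive if it overlaps a not-yet-matched
    ground-truth box with IoU >= thr (greedy best match per prediction)."""
    matched, tp = set(), 0
    for p in preds:
        best, best_iou = None, thr
        for i, g in enumerate(gts):
            if i not in matched and iou(p, g) >= best_iou:
                best, best_iou = i, iou(p, g)
        if best is not None:
            matched.add(best)
            tp += 1
    return tp / len(preds), tp / len(gts)

gts = [(0, 0, 10, 10), (20, 20, 30, 30)]
preds = [(1, 1, 10, 10), (50, 50, 60, 60)]   # one good box, one stray box
p, r = precision_recall(preds, gts)          # one TP, one FP, one missed GT
```

Raising `thr` from 0.5 to 0.7, as in the comparison above, demands tighter localization, which is why precision typically drops at the stricter threshold.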
Affiliation(s)
- Yuling Chen
- School of Information Engineering, Southwest University of Science and Technology, Mianyang, 621010, Sichuan, China
- Mianyang Teachers' College, Mianyang, 621000, Sichuan, China
- Xiaoxia Li
- School of Information Engineering, Southwest University of Science and Technology, Mianyang, 621010, Sichuan, China
- Robot Technology Used for Special Environment Key Laboratory of Sichuan Province, Mianyang, 621010, Sichuan, China
- Nianzu Lv
- Xinjiang Institute of Technology, Aksu, 13558, Xinjiang, China
- Zhenxiang He
- School of Information Engineering, Southwest University of Science and Technology, Mianyang, 621010, Sichuan, China
- Tianfu College of Southwest University of Finance and Economics, Mianyang, 621000, Sichuan, China
- Bin Wu
- School of Information Engineering, Southwest University of Science and Technology, Mianyang, 621010, Sichuan, China.
- Robot Technology Used for Special Environment Key Laboratory of Sichuan Province, Mianyang, 621010, Sichuan, China.
49
He X, Zhang H, Huang J, Zhao D, Li Y, Nie R, Liu X. [Research on fault diagnosis of patient monitor based on text mining]. Sheng Wu Yi Xue Gong Cheng Xue Za Zhi 2024; 41:168-176. [PMID: 38403618 PMCID: PMC10894744 DOI: 10.7507/1001-5515.202306017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Subscribe] [Scholar Register] [Indexed: 02/27/2024]
Abstract
The conventional fault diagnosis of patient monitors relies heavily on manual experience, resulting in low diagnostic efficiency and ineffective use of fault maintenance text data. To address these issues, this paper proposes an intelligent fault diagnosis method for patient monitors based on multi-feature text representation, an improved bidirectional gated recurrent unit (BiGRU), and an attention mechanism. Firstly, the fault text data were preprocessed, and word vectors containing multiple linguistic features were generated by a linguistically-motivated bidirectional encoder representation from Transformers. Then, bidirectional fault features were extracted and weighted by the improved BiGRU and the attention mechanism, respectively. Finally, a weighted loss function was used to reduce the impact of class imbalance on the model. To validate the effectiveness of the proposed method, this paper uses a patient monitor fault dataset for verification, achieving a macro F1 score of 91.11%. The results show that the model built in this study can automatically classify fault text, and may provide decision support for the intelligent fault diagnosis of patient monitors in the future.
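The weighted loss used above to counter class imbalance can be sketched as a class-weighted softmax cross-entropy: rare fault classes get larger weights so they contribute more to the mean loss. The class weights, logits, and labels below are hypothetical values for illustration only.

```python
import numpy as np

def weighted_cross_entropy(logits, labels, class_weights):
    """Softmax cross-entropy where each sample's loss is scaled by the
    weight of its true class, boosting the influence of rare classes."""
    z = logits - logits.max(axis=1, keepdims=True)          # stable log-softmax
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    w = class_weights[labels]                               # per-sample weight
    return -(w * log_probs[np.arange(len(labels)), labels]).mean()

# three hypothetical fault classes; class 2 is rare, so it gets a larger weight
weights = np.array([1.0, 1.0, 5.0])
logits = np.array([[2.0, 0.5, 0.1],   # sample of class 0, confidently correct
                   [0.2, 0.1, 0.3]])  # sample of the rare class 2, uncertain
labels = np.array([0, 2])
loss = weighted_cross_entropy(logits, labels, weights)
```

Compared with uniform weights, the weighted loss penalizes mistakes on the rare class more, nudging the model away from always predicting frequent fault types.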
Affiliation(s)
- Xiangfei He
- School of Automation, Chongqing University of Posts and Telecommunications, Chongqing 400065, P. R. China
- Hehua Zhang
- Department of Medical Engineering, Daping Hospital of Army Medical University, Chongqing 400042, P. R. China
- School of Biological Information, Chongqing University of Posts and Telecommunications, Chongqing 400065, P. R. China
- Jing Huang
- Department of Medical Engineering, Daping Hospital of Army Medical University, Chongqing 400042, P. R. China
- Dechun Zhao
- School of Biological Information, Chongqing University of Posts and Telecommunications, Chongqing 400065, P. R. China
- Yang Li
- Department of Medical Engineering, Daping Hospital of Army Medical University, Chongqing 400042, P. R. China
- Rui Nie
- Department of Medical Engineering, Daping Hospital of Army Medical University, Chongqing 400042, P. R. China
- Xianghua Liu
- Department of Medical Engineering, Daping Hospital of Army Medical University, Chongqing 400042, P. R. China
50
Wang X, Liu J. Vegetable disease detection using an improved YOLOv8 algorithm in the greenhouse plant environment. Sci Rep 2024; 14:4261. [PMID: 38383751 PMCID: PMC10881480 DOI: 10.1038/s41598-024-54540-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/29/2023] [Accepted: 02/14/2024] [Indexed: 02/23/2024] Open
Abstract
This study introduces YOLOv8n-vegetable, a model designed to address the imprecise detection of vegetable diseases in greenhouse environments by existing network models. The model incorporates several improvements and optimizations. Firstly, a novel C2fGhost module replaces part of the C2f modules with GhostConv, based on Ghost lightweight convolution, reducing the model's parameters and improving detection performance. Secondly, the Occlusion Perception Attention Module (OAM) is integrated into the Neck section to better preserve feature information after fusion, enhancing vegetable disease detection in greenhouse settings. To address the challenges of detecting small objects and the loss of semantic information across varying scales, an additional small-object detection layer is included; this layer improves the fusion of high-level and low-level semantic information, thereby enhancing overall detection accuracy. Finally, the HIoU boundary loss function is introduced, leading to improved convergence speed and regression accuracy. These improvement strategies were validated through experiments on a self-built vegetable disease detection dataset collected in a greenhouse environment. Multiple experimental comparisons demonstrated the model's effectiveness, achieving the objectives of improving detection speed while maintaining accuracy and real-time detection capability. According to the experimental findings, the enhanced model exhibited a 6.46% rise in mean average precision (mAP) over the original model on the self-built dataset under greenhouse conditions, while the parameter count and model size decreased by 0.16G and 0.21 MB, respectively. The proposed model demonstrates significant advancements over the original algorithm and exhibits strong competitiveness compared with other advanced object detection models.
The lightweight and fast detection of vegetable diseases offered by the proposed model presents promising applications in vegetable disease detection tasks.
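The parameter savings behind lightweight convolution modules like GhostConv come from factorizing the standard convolution. GhostConv itself uses a different construction (cheap linear operations on a reduced set of feature maps), but the flavor of the savings can be illustrated with the classic depthwise separable factorization; the channel sizes below are assumed for illustration.

```python
def conv_params(c_in, c_out, k):
    """Parameters of a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k

def depthwise_separable_params(c_in, c_out, k):
    """Depthwise stage (one k x k filter per input channel)
    followed by a 1 x 1 pointwise convolution."""
    return c_in * k * k + c_in * c_out

# e.g. a 64 -> 128 channel layer with 3 x 3 kernels (illustrative sizes)
std = conv_params(64, 128, 3)                  # 73,728 parameters
dws = depthwise_separable_params(64, 128, 3)   # 8,768 parameters
ratio = std / dws                              # roughly 8.4x fewer parameters
```

This order-of-magnitude reduction in per-layer parameters is what makes such modules attractive for lightweight, real-time detectors like the one described above.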
Affiliation(s)
- Xuewei Wang
- Shandong Provincial University Laboratory for Protected Horticulture, Weifang University of Science and Technology, Weifang, China
- Jun Liu
- Shandong Provincial University Laboratory for Protected Horticulture, Weifang University of Science and Technology, Weifang, China.