1. Zeng X, Liao T, Xu L, Wang Z. AERMNet: Attention-enhanced relational memory network for medical image report generation. Comput Methods Programs Biomed 2024;244:107979. PMID: 38113805. DOI: 10.1016/j.cmpb.2023.107979. [Received: 08/01/2022; Revised: 11/26/2023; Accepted: 12/12/2023]
Abstract
BACKGROUND AND OBJECTIVES: Automatic generation of medical image diagnostic reports can reduce doctors' workload and improve the efficiency and accuracy of diagnosis. However, most existing report generation models suffer from two problems: weak correlation between generated words and loss of contextual information during report generation. METHODS: To address these problems, we propose an Attention-Enhanced Relational Memory Network (AERMNet), in which a relational memory module is continuously updated with the words generated at previous time steps, strengthening the correlation between words in the generated report. A double LSTM with an interaction module reduces the loss of contextual information and makes full use of the extracted features, allowing AERMNet to generate more accurate disease descriptions in medical image reports. RESULTS: Experiments on four medical datasets, Fetal Heart (FH), Ultrasound, IU X-Ray, and MIMIC-CXR, show that the proposed method outperforms several previous models on language generation metrics (CIDEr +2.4% on FH, BLEU-1 +2.4% on Ultrasound, CIDEr +16.4% on IU X-Ray, BLEU-2 +9.7% on MIMIC-CXR). CONCLUSIONS: This work advances medical image report generation and broadens the prospects of computer-aided diagnosis. Our code is released at https://github.com/llttxx/AERMNET.
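The core mechanism this abstract describes, a memory matrix updated at each decoding step by the previously generated word, can be illustrated with a minimal self-attention update. This is a generic relational-memory sketch, not the authors' AERMNet implementation; all function names, shapes, and the gating scheme are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def memory_step(M, w, Wq, Wk, Wv, Wg):
    """One relational-memory update: the memory slots attend over themselves
    plus the previous word embedding, then a sigmoid gate blends the attended
    result with the old memory."""
    Z = np.vstack([M, w[None, :]])              # memory slots + new word
    q, k, v = M @ Wq, Z @ Wk, Z @ Wv
    att = softmax(q @ k.T / np.sqrt(M.shape[1])) @ v
    g = 1.0 / (1.0 + np.exp(-(M @ Wg)))         # per-slot input gate
    return g * att + (1.0 - g) * M

rng = np.random.default_rng(0)
d, slots = 16, 4
M = rng.standard_normal((slots, d))             # memory before this step
w = rng.standard_normal(d)                      # embedding of last word
Ws = [rng.standard_normal((d, d)) * 0.1 for _ in range(4)]
M_next = memory_step(M, w, *Ws)
print(M_next.shape)  # (4, 16)
```

Because the word embedding enters the keys and values, each generated word leaves a trace in the memory that conditions later decoding steps, which is the stated mechanism for strengthening word-to-word correlation.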
Affiliation(s)
- Xianhua Zeng
- College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
- Tianxing Liao
- College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
- Liming Xu
- College of Computer Science, China West Normal University, Nanchong, Sichuan 637000, China
- Zhiqiang Wang
- College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
2. Hui J, Lian G, Wu J, Ge S, Yang J. Proportional feature pyramid network based on weight fusion for lane detection. PeerJ Comput Sci 2024;10:e1824. PMID: 38435568. PMCID: PMC10909180. DOI: 10.7717/peerj-cs.1824. [Received: 10/19/2023; Accepted: 12/27/2023]
Abstract
Lane detection under extreme conditions is highly challenging: every crucial pixel must be captured to predict the complex topology of lane lines and to differentiate the various lane types. Existing methods predominantly rely on deep feature extraction networks with substantial parameter counts or on the fusion of multiple prediction modules, resulting in large models, embedding difficulties, and slow detection. This article proposes a Proportional Feature Pyramid Network (P-FPN) that fuses learned weights into the FPN for lane detection. To obtain more accurate detections, a cross refinement block is introduced into the P-FPN: it takes feature maps and anchors as inputs and gradually refines the anchors from high-level to low-level feature maps. In our method, high-level features are used to predict lanes coarsely, while locally detailed features improve localization accuracy. Extensive experiments on two widely used lane detection datasets, the Chinese Urban Scene Benchmark for Lane Detection (CULane) and the TuSimple Lane Detection Challenge (TuSimple), demonstrate that the proposed method achieves competitive results compared with several state-of-the-art approaches.
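The "proportional" weight-fusion idea, combining pyramid levels with normalized learned weights rather than plain summation, can be sketched as follows. This is a generic weighted-FPN fusion over toy arrays, not the paper's P-FPN; the nearest-neighbor upsampling and softmax normalization are assumptions for illustration.

```python
import numpy as np

def weighted_fpn_fusion(feats, weights):
    """Fuse pyramid levels by nearest-neighbor upsampling every map to the
    finest resolution and summing with softmax-normalized weights, so each
    level contributes in learned proportion."""
    w = np.exp(weights) / np.exp(weights).sum()   # proportional weights
    H, W = feats[0].shape[-2:]
    fused = np.zeros_like(feats[0], dtype=float)
    for wi, f in zip(w, feats):
        fh, fw = f.shape[-2:]
        up = f.repeat(H // fh, axis=-2).repeat(W // fw, axis=-1)
        fused += wi * up
    return fused

# toy pyramid: 3 levels, channels=8, resolutions 32/16/8, constant values 1/2/3
c, H = 8, 32
feats = [np.ones((c, H >> i, H >> i)) * (i + 1) for i in range(3)]
fused = weighted_fpn_fusion(feats, np.array([0.0, 0.0, 0.0]))  # equal weights
print(fused.shape)  # (8, 32, 32)
```

With equal logits the three constant maps average to 2.0 everywhere; training would instead learn the logits so that informative levels dominate the fusion.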
Affiliation(s)
- Jiapeng Hui
- Institute of Applied Artificial Intelligence of the Guangdong-Hong Kong-Macao Greater Bay Area, Shenzhen Polytechnic University, Shenzhen, Guangdong, China
- School of Computer and Software Engineering, University of Science and Technology Liaoning, Anshan, Liaoning, China
- Guoyun Lian
- Institute of Applied Artificial Intelligence of the Guangdong-Hong Kong-Macao Greater Bay Area, Shenzhen Polytechnic University, Shenzhen, Guangdong, China
- Jiansheng Wu
- School of Computer and Software Engineering, University of Science and Technology Liaoning, Anshan, Liaoning, China
- Shuting Ge
- Institute of Applied Artificial Intelligence of the Guangdong-Hong Kong-Macao Greater Bay Area, Shenzhen Polytechnic University, Shenzhen, Guangdong, China
- School of Computer and Software Engineering, University of Science and Technology Liaoning, Anshan, Liaoning, China
- Jinfeng Yang
- Institute of Applied Artificial Intelligence of the Guangdong-Hong Kong-Macao Greater Bay Area, Shenzhen Polytechnic University, Shenzhen, Guangdong, China
3. Knes AS, de Gruijter M, Zuidberg MC, de Poot CJ. CSI-CSI: Comparing several investigative approaches toward crime scene improvement. Sci Justice 2024;64:63-72. PMID: 38182314. DOI: 10.1016/j.scijus.2023.11.009. [Received: 06/01/2023; Revised: 11/20/2023; Accepted: 11/27/2023]
Abstract
Crime scene investigations are highly complex environments that require the CSI to engage in complex decision-making. CSIs must rely on personal experience, context information, and scientific knowledge of the fundamental principles of forensic science to both find and correctly interpret ambiguous traces and accurately reconstruct a scene. Differences in CSI decision-making can arise at multiple stages of a crime scene investigation. Given its crucial role in forensic investigation, CSI decision-making must be studied further to understand how such differences arise. This exploratory research project is a first step toward comparing how crime scene investigations of violent robberies are conducted, covering 25 crime scene investigators from nine countries across the world. Through a mock crime scene and semi-structured interviews, we observed that CSIs adopt a variety of investigative approaches. The results show that CSIs have different working strategies and make different decisions regarding the construction of relevant hypotheses, their search strategy, and the collection of traces. These differences may, among other factors, be due to the use of prior information, a CSI's knowledge and experience, and the perceived goal of the investigation. We suggest developing more practical guidelines to aid CSIs through a hypothetico-deductive reasoning process in which (a) CSIs are supported in the correct use of contextual information, (b) outside knowledge and expertise are integrated into this process, and (c) CSIs are guided in evaluating the utility of their traces.
Affiliation(s)
- Anna S Knes
- Institute for Interdisciplinary Studies, University of Amsterdam, 1012 WX Amsterdam, the Netherlands; Netherlands Forensic Institute, Laan van Ypenburg 6, 2497 GB The Hague, the Netherlands
- Madeleine de Gruijter
- Netherlands Forensic Institute, Laan van Ypenburg 6, 2497 GB The Hague, the Netherlands
- Matthijs C Zuidberg
- Netherlands Forensic Institute, Laan van Ypenburg 6, 2497 GB The Hague, the Netherlands
- Christianne J de Poot
- Forensic Science Department, Amsterdam University of Applied Sciences, Tafelbergweg 51, 1105 BD Amsterdam, the Netherlands; Department of Criminal Law and Criminology, Vrije Universiteit, De Boelelaan 1105, 1081 HV Amsterdam, the Netherlands; Police Academy, Arnhemseweg 348, 7337 AC Apeldoorn, the Netherlands
4. Zhou W, Bai W, Ji J, Yi Y, Zhang N, Cui W. Dual-path multi-scale context dense aggregation network for retinal vessel segmentation. Comput Biol Med 2023;164:107269. PMID: 37562323. DOI: 10.1016/j.compbiomed.2023.107269. [Received: 03/14/2023; Revised: 06/22/2023; Accepted: 07/16/2023]
Abstract
There has been steady progress in deep learning-based blood vessel segmentation. However, several challenges continue to limit it, including inadequate sample sizes, the neglect of contextual information, and the loss of microvascular detail. To address these limitations, we propose a dual-path deep learning framework for blood vessel segmentation. In our framework, fundus images are divided into concentric patches at different scales to alleviate overfitting. A Multi-scale Context Dense Aggregation Network (MCDAU-Net) is then proposed to accurately extract blood vessel boundaries from these patches. In MCDAU-Net, a Cascaded Dilated Spatial Pyramid Pooling (CDSPP) module is designed and incorporated into intermediate layers of the model, enlarging the receptive field and producing feature maps enriched with contextual information. To improve segmentation of low-contrast vessels, we propose an InceptionConv (IConv) module, which explores deeper semantic features and suppresses the propagation of non-vessel information. Furthermore, we design a Multi-scale Adaptive Feature Aggregation (MAFA) module to fuse multi-scale features by assigning adaptive weight coefficients to different feature maps through skip connections. Finally, to exploit complementary contextual information and enhance the continuity of microvascular structures, a fusion module combines the segmentation results obtained from patches of different sizes, yielding fine microvascular segmentation. We evaluated our approach on three widely used public datasets: DRIVE, CHASE-DB1, and STARE. The results show a notable advance over current state-of-the-art (SOTA) techniques, with mean Se and F1 scores increasing by 7.9% and 4.7%, respectively. The code is available at https://github.com/bai101315/MCDAU-Net.
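The receptive-field-enlargement idea behind a dilated spatial pyramid (several parallel dilation rates whose outputs are fused) can be demonstrated on a 1-D signal. This is a didactic sketch of dilated convolution pyramids in general, not the paper's CDSPP module; the kernel, rates, and sum-fusion are assumptions.

```python
import numpy as np

def dilated_conv1d(x, kernel, rate):
    """'Same'-padded 1-D convolution with dilation; the receptive field is
    rate * (len(kernel) - 1) + 1, so it grows with rate at no extra cost."""
    k = len(kernel)
    pad = rate * (k - 1) // 2
    xp = np.pad(x, pad)
    return np.array([sum(kernel[j] * xp[i + j * rate] for j in range(k))
                     for i in range(len(x))])

def dilated_pyramid(x, kernel, rates=(1, 2, 4)):
    """Parallel dilated branches fused by summation: small rates keep fine
    (vessel-like) detail, large rates add wide context."""
    return sum(dilated_conv1d(x, kernel, r) for r in rates)

x = np.arange(8, dtype=float)
out = dilated_pyramid(x, kernel=np.array([1.0, 1.0, 1.0]))
print(out.shape)  # (8,)
```

At position 4 the three branches see windows of width 3, 5, and 9 respectively (the rate-4 branch already reaches past the signal border), illustrating how stacking rates trades locality for context without changing the output resolution.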
Affiliation(s)
- Wei Zhou
- College of Computer Science, Shenyang Aerospace University, Shenyang, China
- Weiqi Bai
- College of Computer Science, Shenyang Aerospace University, Shenyang, China
- Jianhang Ji
- College of Computer Science, Shenyang Aerospace University, Shenyang, China
- Yugen Yi
- School of Software, Jiangxi Normal University, Nanchang, China
- Ningyi Zhang
- School of Software, Jiangxi Normal University, Nanchang, China
- Wei Cui
- Institute for Infocomm Research, Agency for Science, Technology and Research (A*STAR), Singapore
5. Hao Q, Wang C, Xiao Y, Lin H. IMGC-GNN: A multi-granularity coupled graph neural network recommendation method based on implicit relationships. Appl Intell 2022;53:14668-14689. PMID: 36340421. PMCID: PMC9628402. DOI: 10.1007/s10489-022-04215-7. [Accepted: 09/26/2022]
Abstract
In the application recommendation field, collaborative filtering (CF) is often considered one of the most effective approaches. As the basis of CF-based recommendation, representation learning must capture two types of factors: attribute factors revealed by independent individuals (e.g., user attributes, application types) and interaction factors contained in collaborative signals (e.g., interactions influenced by others). However, existing CF-based methods fail to learn these two factors separately, making it difficult to understand the deeper motivation behind user behavior and resulting in suboptimal performance. We therefore propose a multi-granularity coupled graph neural network recommendation method based on implicit relationships (IMGC-GNN). Specifically, we introduce contextual information (time and space) into user-application interactions and construct a three-layer coupled graph. A graph neural network then learns the attribute and interaction factors separately. For attribute representation learning, we decompose the coupled graph into three homogeneous graphs with users, applications, and contexts as nodes, and use multilayer aggregation operations to learn features between users, between contexts, and between applications. For interaction representation learning, we construct a homogeneous graph with user-context-application interactions as nodes and use node similarity and structural similarity to learn deep interaction features. Finally, using the learned representations, IMGC-GNN makes application recommendations to users in different contexts. To validate the proposed method, we conduct experiments on real-world interaction data from three cities and compare our model with seven baseline methods. The results show that our method performs best on top-k recommendation.
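The "multilayer aggregation operations" over a homogeneous graph mentioned above are, in the general GNN sense, repeated neighbor averaging followed by a learned transform. The sketch below shows one such layer on a toy graph; it is a generic mean-aggregation step, not the IMGC-GNN architecture, and all names and the toy graph are assumptions.

```python
import numpy as np

def aggregate(A, H, W):
    """One mean-aggregation message-passing layer: each node averages its
    neighbors' features (self-loop included), then applies a linear map
    followed by ReLU."""
    A_hat = A + np.eye(A.shape[0])                 # add self-loops
    deg = A_hat.sum(axis=1, keepdims=True)
    return np.maximum((A_hat / deg) @ H @ W, 0.0)

# toy homogeneous graph: 4 nodes on a ring
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
H = np.eye(4)                                      # one-hot node features
W = np.eye(4)                                      # identity map, for clarity
H1 = aggregate(A, H, W)
print(H1.shape)  # (4, 4)
```

Stacking several such layers (the "multilayer" part) lets each node's representation absorb information from multi-hop neighborhoods, which is how features "between users, between contexts, and between applications" would be mixed within each homogeneous graph.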
Affiliation(s)
- Qingbo Hao
- School of Computer Science and Engineering, Tianjin University of Technology, Binshui West Road, Tianjin 300191, China
- Tianjin Key Laboratory of Intelligence Computing and Novel Software Technology, Ministry of Education, Binshui West Road, Tianjin 300191, China
- Engineering Research Center of Learning-Based Intelligent System, Ministry of Education, Binshui West Road, Tianjin 300191, China
- Chundong Wang
- School of Computer Science and Engineering, Tianjin University of Technology, Binshui West Road, Tianjin 300191, China
- Tianjin Key Laboratory of Intelligence Computing and Novel Software Technology, Ministry of Education, Binshui West Road, Tianjin 300191, China
- Engineering Research Center of Learning-Based Intelligent System, Ministry of Education, Binshui West Road, Tianjin 300191, China
- Yingyuan Xiao
- School of Computer Science and Engineering, Tianjin University of Technology, Binshui West Road, Tianjin 300191, China
- Tianjin Key Laboratory of Intelligence Computing and Novel Software Technology, Ministry of Education, Binshui West Road, Tianjin 300191, China
- Engineering Research Center of Learning-Based Intelligent System, Ministry of Education, Binshui West Road, Tianjin 300191, China
- Hao Lin
- School of Computer Science and Engineering, Tianjin University of Technology, Binshui West Road, Tianjin 300191, China
- Tianjin Key Laboratory of Intelligence Computing and Novel Software Technology, Ministry of Education, Binshui West Road, Tianjin 300191, China
- Engineering Research Center of Learning-Based Intelligent System, Ministry of Education, Binshui West Road, Tianjin 300191, China
6. Li Y, Lian G, Zhang W, Ma G, Ren J, Yang J. Heterogeneous feature-aware Transformer-CNN coupling network for person re-identification. PeerJ Comput Sci 2022;8:e1098. PMID: 36262129. PMCID: PMC9575868. DOI: 10.7717/peerj-cs.1098. [Received: 07/05/2022; Accepted: 08/22/2022]
Abstract
Person re-identification plays an important role in the construction of smart cities. A reliable person re-identification system relieves users of the inefficient work of identifying a specific individual among enormous numbers of photos or videos captured by different surveillance devices. Most existing methods either focus on local discriminative features without global contextual information or on scattered global features while ignoring local features, resulting in ineffective attention to irregular pedestrian regions. In this article, a novel Transformer-CNN Coupling Network (TCCNet) is proposed to capture fluctuating body region features in a heterogeneous feature-aware manner. We employ two bridging modules, the Low-level Feature Coupling Module (LFCM) and the High-level Feature Coupling Module (HFCM), to improve the complementary characteristics of the hybrid network. This significantly helps distinguish foreground from background features, reducing the unfavorable impact of cluttered backgrounds on person re-identification. Furthermore, a duplicate loss over the two branches is employed to incorporate semantic information from the branches' differing preferences into the resulting person representation. Experiments on two large-scale person re-identification benchmarks demonstrate that the proposed TCCNet achieves competitive results compared with several state-of-the-art approaches. The mean Average Precision (mAP) and Rank-1 identification rate on the MSMT17 dataset reach 66.9% and 84.5%, respectively.
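A "bridging" or coupling module between a CNN branch and a Transformer branch typically exchanges features in both directions. The sketch below shows the simplest such exchange, each branch's features projected into the other's space and added; it is only a generic coupling pattern, not the paper's LFCM or HFCM, and all names, shapes, and the additive fusion are assumptions.

```python
import numpy as np

def coupling_module(f_cnn, f_trans, P_c2t, P_t2c):
    """Bidirectional feature exchange: the CNN branch receives projected
    global (Transformer) context, while the Transformer branch receives
    projected local (CNN) detail."""
    g_cnn = f_cnn + f_trans @ P_t2c
    g_trans = f_trans + f_cnn @ P_c2t
    return g_cnn, g_trans

rng = np.random.default_rng(1)
n, d_c, d_t = 6, 8, 8
f_cnn = rng.standard_normal((n, d_c))      # token-flattened CNN features
f_trans = rng.standard_normal((n, d_t))    # Transformer token features
P_c2t = rng.standard_normal((d_c, d_t)) * 0.1
P_t2c = rng.standard_normal((d_t, d_c)) * 0.1
g_cnn, g_trans = coupling_module(f_cnn, f_trans, P_c2t, P_t2c)
print(g_cnn.shape, g_trans.shape)  # (6, 8) (6, 8)
```

Placing one such exchange at a low level and one at a high level of the hybrid network is what lets local detail and global context complement each other before the final representation is formed.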
Affiliation(s)
- Yanchao Li
- School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan, Liaoning, China
- Institute of Applied Artificial Intelligence of the Guangdong-Hong Kong-Macao Greater Bay Area, Shenzhen Polytechnic, Shenzhen, Guangdong, China
- Guoyun Lian
- Institute of Applied Artificial Intelligence of the Guangdong-Hong Kong-Macao Greater Bay Area, Shenzhen Polytechnic, Shenzhen, Guangdong, China
- Wenyu Zhang
- School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan, Liaoning, China
- Guanglin Ma
- School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan, Liaoning, China
- Jin Ren
- Institute of Applied Artificial Intelligence of the Guangdong-Hong Kong-Macao Greater Bay Area, Shenzhen Polytechnic, Shenzhen, Guangdong, China
- Jinfeng Yang
- Institute of Applied Artificial Intelligence of the Guangdong-Hong Kong-Macao Greater Bay Area, Shenzhen Polytechnic, Shenzhen, Guangdong, China
7. Schwab D, Schienle A. Facial emotion processing in pediatric social anxiety disorder: Relevance of situational context. J Anxiety Disord 2017;50:40-46. PMID: 28551394. DOI: 10.1016/j.janxdis.2017.05.005. [Received: 06/24/2016; Revised: 03/13/2017; Accepted: 05/13/2017]
Abstract
Social anxiety disorder (SAD) typically begins in childhood. Previous research has demonstrated that adult patients respond with elevated late positivity (LP) to negative facial expressions. In the present study on pediatric SAD, we investigated responses to negative facial expressions and the role of social context information. Fifteen children with SAD and 15 non-anxious controls were first presented with images of negative facial expressions with masked backgrounds. Subsequently, the complete images, which included context information, were shown. The negative expressions resulted from either an emotion-relevant elicitor (e.g., social exclusion) or an emotion-irrelevant one (e.g., weight lifting). Relative to controls, the clinical group showed elevated parietal LP during face processing both with and without context information. The groups also differed in frontal LP depending on the type of context: in SAD patients, frontal LP was lower in emotion-relevant than in emotion-irrelevant contexts. We conclude that SAD patients direct more automatic attention toward negative facial expressions (parietal effect) and are less capable of integrating affective context information (frontal effect).
Affiliation(s)
- Daniela Schwab
- University of Graz, Institute of Psychology, Department of Clinical Psychology, Universitätsplatz 2/III, A-8010 Graz, Austria
- Anne Schienle
- University of Graz, Institute of Psychology, Department of Clinical Psychology, Universitätsplatz 2/III, A-8010 Graz, Austria