Zheng Z, Zhang Y, Liang E, Weng Z, Chai J, Li J. TRINet: Team Role Interaction Network for automatic radiology report generation. Comput Biol Med 2024; 183:109275. [PMID: 39503110 DOI: 10.1016/j.compbiomed.2024.109275]
[Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2024] [Revised: 10/10/2024] [Accepted: 10/13/2024] [Indexed: 11/20/2024]
Abstract
In recent years, the automatic generation of radiology reports as an auxiliary solution for expert diagnosis has garnered considerable attention from researchers. However, due to the complexity of medical image interpretation, current models suffer from both aleatoric and epistemic uncertainty, which makes the content of the generated medical reports unstable. To address these issues, we propose a Team Role Interaction Network (TRINet) for the automatic generation of radiology reports, composed of multiple Team-member role models and a Team-leader role model. Specifically, we propose a Cross-Modal Communication Mechanism (CMCM) among Team-members, which enables each Team-member to interact with the image features and textual features output by its adjacent Team-member role model through predefined grid-shaped parameter-plane queries, thereby facilitating cross-modal information exchange among Team-members. Additionally, the Team-leader role model employs a Multi-Modal Fusion Mechanism (MMFM), which performs sequence-to-sequence operations on the multidimensional outputs generated by the Team-member role models together with the original inputs, aggregating the knowledge of each Team-member to produce the final medical report. We evaluated TRINet on two public benchmark datasets (IU X-ray and MIMIC-CXR), demonstrating the effectiveness and state-of-the-art performance of our model. The BLEU-4 score on the MIMIC-CXR test set reached 0.144, 3.6 points higher than the previous best technique. Further analysis shows that TRINet can effectively leverage the complementary information between the modalities perceived by each Team-member role model, simulating the collaborative process of an expert team and thereby significantly improving the system's accuracy and robustness.