Li TZ, Xu K, Gao R, Tang Y, Lasko TA, Maldonado F, Sandler KL, Landman BA. Time-distance vision transformers in lung cancer diagnosis from longitudinal computed tomography. Proc SPIE Int Soc Opt Eng 2023;12464:1246412. PMID: 37465096; PMCID: PMC10353776; DOI: 10.1117/12.2653911.
Abstract
Features learned from single radiologic images are unable to provide information about whether and how much a lesion may be changing over time. Time-dependent features computed from repeated images can capture those changes and help identify malignant lesions by their temporal behavior. However, longitudinal medical imaging presents the unique challenge of sparse, irregular time intervals in data acquisition. While self-attention has been shown to be a versatile and efficient learning mechanism for time series and natural images, its potential for interpreting temporal distance between sparse, irregularly sampled spatial features has not been explored. In this work, we propose two interpretations of a time-distance vision transformer (ViT) by using (1) vector embeddings of continuous time and (2) a temporal emphasis model to scale self-attention weights. The two algorithms are evaluated based on benign versus malignant lung cancer discrimination of synthetic pulmonary nodules and lung screening computed tomography studies from the National Lung Screening Trial (NLST). Experiments evaluating the time-distance ViTs on synthetic nodules show a fundamental improvement in classifying irregularly sampled longitudinal images when compared to standard ViTs. In cross-validation on screening chest CTs from the NLST, our methods (0.785 and 0.786 AUC respectively) significantly outperform a cross-sectional approach (0.734 AUC) and match the discriminative performance of the leading longitudinal medical imaging algorithm (0.779 AUC) on benign versus malignant classification. This work represents the first self-attention-based framework for classifying longitudinal medical images. Our code is available at https://github.com/tom1193/time-distance-transformer.
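The abstract describes two mechanisms for handling irregular scan intervals: vector embeddings of continuous time, and a temporal emphasis model that scales self-attention weights by temporal distance. The paper itself does not spell out the functional forms here, so the following is only a minimal sketch under assumed choices: sinusoidal embeddings evaluated at real-valued timestamps, and an exponential-decay emphasis term exp(-|t_i - t_j| / tau). The names `time_embedding`, `time_distance_attention`, and the scale `tau` are illustrative, not taken from the authors' code.

```python
import numpy as np

def time_embedding(t, dim=16):
    """Sinusoidal embedding of a continuous timestamp t (e.g. days since
    the baseline scan), analogous to fixed transformer positional
    encodings but evaluated at real-valued, irregular times."""
    freqs = 1.0 / (10000 ** (np.arange(0, dim, 2) / dim))
    angles = t * freqs
    return np.concatenate([np.sin(angles), np.cos(angles)])

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def time_distance_attention(q, k, v, times, tau=180.0):
    """Self-attention whose weights are damped by a temporal emphasis
    term exp(-|t_i - t_j| / tau), so scans acquired close together in
    time attend to each other more strongly than distant ones."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)                    # (n, n) dot-product scores
    dist = np.abs(times[:, None] - times[None, :])   # pairwise |t_i - t_j|
    emphasis = np.exp(-dist / tau)                   # decays with time distance
    weights = softmax(scores) * emphasis
    weights = weights / weights.sum(axis=-1, keepdims=True)  # renormalise rows
    return weights @ v
```

With uniform attention scores, the emphasis term alone determines the weights: a scan 30 days away receives more weight than one 365 days away, which is the intended inductive bias for sparse longitudinal imaging.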
Affiliation(s)
- Thomas Z Li
  - Biomedical Engineering, Vanderbilt University, Nashville, TN, USA 37235
  - School of Medicine, Vanderbilt University, Nashville, TN, USA 37235
- Kaiwen Xu
  - Computer Science, Vanderbilt University, Nashville, TN, USA 37235
- Riqiang Gao
  - Computer Science, Vanderbilt University, Nashville, TN, USA 37235
- Yucheng Tang
  - Electrical and Computer Engineering, Vanderbilt University, Nashville, TN, USA 37235
- Thomas A Lasko
  - Computer Science, Vanderbilt University, Nashville, TN, USA 37235
  - Biomedical Informatics, Vanderbilt University, Nashville, TN, USA 37235
- Fabien Maldonado
  - Medicine, Vanderbilt University Medical Center, Nashville, TN, USA 37235
- Kim L Sandler
  - Radiology & Radiological Sciences, Vanderbilt University Medical Center, Nashville, TN, USA 37235
- Bennett A Landman
  - Biomedical Engineering, Vanderbilt University, Nashville, TN, USA 37235
  - Computer Science, Vanderbilt University, Nashville, TN, USA 37235
  - Electrical and Computer Engineering, Vanderbilt University, Nashville, TN, USA 37235
  - Radiology & Radiological Sciences, Vanderbilt University Medical Center, Nashville, TN, USA 37235