1
Hoffmann M, Hoopes A, Greve DN, Fischl B, Dalca AV. Anatomy-aware and acquisition-agnostic joint registration with SynthMorph. Imaging Neuroscience (Cambridge, Mass.) 2024; 2:1-33. PMID: 39015335; PMCID: PMC11247402; DOI: 10.1162/imag_a_00197.
Abstract
Affine image registration is a cornerstone of medical-image analysis. While classical algorithms can achieve excellent accuracy, they solve a time-consuming optimization for every image pair. Deep-learning (DL) methods learn a function that maps an image pair to an output transform. Evaluating the function is fast, but capturing large transforms can be challenging, and networks tend to struggle if a test-image characteristic shifts from the training domain, such as the resolution. Most affine methods are agnostic to the anatomy the user wishes to align, meaning the registration will be inaccurate if algorithms consider all structures in the image. We address these shortcomings with SynthMorph, a fast, symmetric, diffeomorphic, and easy-to-use DL tool for joint affine-deformable registration of any brain image without preprocessing. First, we leverage a strategy that trains networks with widely varying images synthesized from label maps, yielding robust performance across acquisition specifics unseen at training. Second, we optimize the spatial overlap of select anatomical labels. This enables networks to distinguish anatomy of interest from irrelevant structures, removing the need for preprocessing that excludes content which would impinge on anatomy-specific registration. Third, we combine the affine model with a deformable hypernetwork that lets users choose the optimal deformation-field regularity for their specific data, at registration time, in a fraction of the time required by classical methods. This framework is applicable to learning anatomy-aware, acquisition-agnostic registration of any anatomy with any architecture, as long as label maps are available for training. We analyze how competing architectures learn affine transforms and compare state-of-the-art registration tools across an extremely diverse set of neuroimaging data, aiming to truly capture the behavior of methods in the real world. SynthMorph demonstrates high accuracy and is available at https://w3id.org/synthmorph, as a single complete end-to-end solution for registration of brain magnetic resonance imaging (MRI) data.
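The anatomy-aware component boils down to a label-overlap objective: during training, the network is rewarded only for aligning the anatomical labels the user selects, so irrelevant structures cannot dominate the loss. A minimal PyTorch sketch of such a soft-Dice overlap loss follows; the function name and tensor layout are illustrative assumptions, not the actual SynthMorph implementation.

```python
import torch

def label_overlap_loss(moved_labels, fixed_labels, labels_of_interest, eps=1e-6):
    """Soft Dice over selected anatomical labels (illustrative sketch).

    moved_labels, fixed_labels: one-hot probability maps of shape
    (batch, n_labels, x, y, z); labels_of_interest selects the channels
    whose overlap should drive the registration.
    """
    dice_per_label = []
    for lbl in labels_of_interest:
        m, f = moved_labels[:, lbl], fixed_labels[:, lbl]
        inter = (m * f).sum(dim=(1, 2, 3))
        sizes = m.sum(dim=(1, 2, 3)) + f.sum(dim=(1, 2, 3))
        dice_per_label.append((2 * inter + eps) / (sizes + eps))
    # Minimize 1 - mean Dice: perfect overlap of the chosen labels gives 0.
    return 1 - torch.stack(dice_per_label).mean()
```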
Affiliation(s)
- Malte Hoffmann
- Athinoula A. Martinos Center for Biomedical Imaging, Charlestown, MA, United States
- Department of Radiology, Massachusetts General Hospital, Boston, MA, United States
- Department of Radiology, Harvard Medical School, Boston, MA, United States
- Andrew Hoopes
- Athinoula A. Martinos Center for Biomedical Imaging, Charlestown, MA, United States
- Department of Radiology, Massachusetts General Hospital, Boston, MA, United States
- Computer Science & Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, United States
- Douglas N. Greve
- Athinoula A. Martinos Center for Biomedical Imaging, Charlestown, MA, United States
- Department of Radiology, Massachusetts General Hospital, Boston, MA, United States
- Department of Radiology, Harvard Medical School, Boston, MA, United States
- Bruce Fischl
- Athinoula A. Martinos Center for Biomedical Imaging, Charlestown, MA, United States
- Department of Radiology, Massachusetts General Hospital, Boston, MA, United States
- Department of Radiology, Harvard Medical School, Boston, MA, United States
- Computer Science & Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, United States
- Adrian V. Dalca
- Athinoula A. Martinos Center for Biomedical Imaging, Charlestown, MA, United States
- Department of Radiology, Massachusetts General Hospital, Boston, MA, United States
- Department of Radiology, Harvard Medical School, Boston, MA, United States
- Computer Science & Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, United States
2
Wang AQ, Yu EM, Dalca AV, Sabuncu MR. A robust and interpretable deep learning framework for multi-modal registration via keypoints. Med Image Anal 2023; 90:102962. PMID: 37769550; PMCID: PMC10591968; DOI: 10.1016/j.media.2023.102962.
Abstract
We present KeyMorph, a deep learning-based image registration framework that relies on automatically detecting corresponding keypoints. State-of-the-art deep learning methods for registration are often not robust to large misalignments, are not interpretable, and do not incorporate the symmetries of the problem. In addition, most models produce only a single prediction at test time. Our core insight, which addresses these shortcomings, is that corresponding keypoints between images can be used to obtain the optimal transformation via a differentiable closed-form expression. We use this observation to drive end-to-end learning of keypoints tailored to the registration task, without knowledge of ground-truth keypoints. This framework not only leads to substantially more robust registration but also yields better interpretability, since the keypoints reveal which parts of the image drive the final alignment. Moreover, KeyMorph can be designed to be equivariant under image translations and/or symmetric with respect to the input image ordering. Finally, we show how multiple deformation fields, corresponding to different transformation variants, can be computed efficiently and in closed form at test time. We demonstrate the proposed framework on 3D affine and spline-based registration of multi-modal brain MRI scans. In particular, we show registration accuracy that surpasses current state-of-the-art methods, especially in the context of large displacements. Our code is available at https://github.com/alanqrwang/keymorph.
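The closed-form step at the core of this idea is an ordinary least-squares fit: given corresponding keypoints, the affine transform that best maps one set onto the other has a direct, differentiable solution, so gradients can flow back into the keypoint detector. A minimal PyTorch sketch; the shapes and names are assumptions, not the KeyMorph API.

```python
import torch

def closed_form_affine(src_pts, trg_pts):
    """Least-squares affine mapping src keypoints onto trg keypoints.

    src_pts, trg_pts: (N, 3) corresponding keypoint coordinates, N >= 4.
    Returns a (3, 4) matrix [A | t] such that trg ~= A @ src + t.
    """
    n = src_pts.shape[0]
    ones = torch.ones(n, 1, dtype=src_pts.dtype, device=src_pts.device)
    src_h = torch.cat([src_pts, ones], dim=1)          # homogeneous (N, 4)
    # Differentiable least-squares solve of src_h @ X ~= trg_pts.
    sol = torch.linalg.lstsq(src_h, trg_pts).solution  # (4, 3)
    return sol.T                                       # (3, 4)
```

Because the solve is cheap, it can be repeated at test time under different keypoint weightings or transformation families, which is how multiple output variants become inexpensive.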
Affiliation(s)
- Alan Q Wang
- School of Electrical and Computer Engineering, Cornell University and Cornell Tech, New York, NY 10044, USA; Department of Radiology, Weill Cornell Medical School, New York, NY 10065, USA.
- Evan M Yu
- Iterative Scopes, Cambridge, MA 02139, USA
- Adrian V Dalca
- Computer Science and Artificial Intelligence Lab at the Massachusetts Institute of Technology, Cambridge, MA 02139, USA; A.A. Martinos Center for Biomedical Imaging at the Massachusetts General Hospital, Charlestown, MA 02129, USA
- Mert R Sabuncu
- School of Electrical and Computer Engineering, Cornell University and Cornell Tech, New York, NY 10044, USA; Department of Radiology, Weill Cornell Medical School, New York, NY 10065, USA
3
Whole-Body Keypoint and Skeleton Augmented RGB Networks for Video Action Recognition. Applied Sciences (Basel) 2022. DOI: 10.3390/app12126215.
Abstract
Incorporating multi-modality data is an effective way to improve action-recognition performance. Based on this idea, we investigate a new data modality in which Whole-Body Keypoint and Skeleton (WKS) labels are used to capture refined body information. Rather than directly aggregating modalities, we leverage distillation to give an RGB network, fed only RGB clips, the feature-extraction ability of the WKS network. Inspired by the success of transformers for vision tasks, we design an architecture that combines three-dimensional (3D) convolutional neural networks (CNNs) and the Swin transformer to extract spatiotemporal features, resulting in strong performance. Furthermore, because the clips of a video are not equally discriminative, we also present a new method for aggregating clip-level classification results, further improving performance. The experimental results demonstrate that our framework achieves an accuracy of 93.4% with only RGB input on the UCF-101 dataset.
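The distillation step follows the familiar teacher-student pattern: the WKS-trained network provides soft targets that the RGB-only student matches alongside the usual hard-label loss. A generic PyTorch sketch of that objective; the temperature and weighting are assumptions, not the paper's exact formulation.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Blend soft-target distillation with ordinary cross-entropy."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),  # student log-probs
        F.softmax(teacher_logits / T, dim=1),      # teacher soft targets
        reduction="batchmean",
    ) * (T * T)  # rescale so gradients are comparable to the hard-label term
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```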
4
Chauvin L, Kumar K, Desrosiers C, Wells W, Toews M. Efficient Pairwise Neuroimage Analysis Using the Soft Jaccard Index and 3D Keypoint Sets. IEEE Transactions on Medical Imaging 2022; 41:836-845. PMID: 34699353; PMCID: PMC9022638; DOI: 10.1109/tmi.2021.3123252.
Abstract
We propose a novel pairwise distance measure between image keypoint sets, for the purpose of large-scale medical image indexing. Our measure generalizes the Jaccard index to account for soft set equivalence (SSE) between keypoint elements, via an adaptive kernel framework modeling uncertainty in keypoint appearance and geometry. A new kernel is proposed to quantify the variability of keypoint geometry in location and scale. Our distance measure may be estimated between O(N²) image pairs in O(N log N) operations via keypoint indexing. Experiments report the first results for the task of predicting family relationships from medical images, using 1010 T1-weighted MRI brain volumes of 434 families including monozygotic and dizygotic twins, siblings, and half-siblings sharing 100%-25% of their polymorphic genes. Soft set equivalence and the keypoint geometry kernel improve upon standard hard set equivalence (HSE) and appearance kernels alone in predicting family relationships. Monozygotic twin identification is near 100%, and three subjects with uncertain genotyping are automatically paired with their self-reported families, the first reported practical application of image-based family identification. Our distance measure can also be used to predict group categories; for example, sex is predicted with an AUC of 0.97. Software is provided for efficient fine-grained curation of large, generic image datasets.
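The hard Jaccard index counts only exact keypoint matches; the soft version lets partial matches contribute through a kernel. A schematic NumPy reading of the idea, where keypoint-to-keypoint soft equivalence is summarized in a precomputed matrix; the estimator below is illustrative, not the paper's exact formulation.

```python
import numpy as np

def soft_jaccard(K):
    """Soft Jaccard index between keypoint sets A and B.

    K: (|A|, |B|) matrix of soft equivalences in [0, 1], assumed to combine
    the appearance and geometry kernels described in the abstract.
    """
    # Soft intersection: each keypoint contributes its best soft match,
    # averaged over both directions so the measure is symmetric.
    inter = 0.5 * (K.max(axis=1).sum() + K.max(axis=0).sum())
    union = K.shape[0] + K.shape[1] - inter
    return inter / union
```

With a binary K and one-to-one matches, this reduces to the usual Jaccard index of the two keypoint sets.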
5
Grossiord E, Risser L, Kanoun S, Aziza R, Chiron H, Ysebaert L, Malgouyres F, Ken S. Semi-automatic segmentation of whole-body images in longitudinal studies. Biomed Phys Eng Express 2021; 7. DOI: 10.1088/2057-1976/abce16.
Abstract
We propose a semi-automatic segmentation pipeline designed for longitudinal studies of structures with large anatomical variability, where expert interaction is required for relevant segmentations. Our pipeline builds on the regularized Fast Marching (rFM) segmentation approach of Risser et al. (2018): it transports baseline multi-label FM seeds onto follow-up images, selects the relevant ones, and finally runs rFM. It yielded more robust and faster results than manual clinical segmentation. Our method was evaluated on 3D synthetic images and patients' whole-body MRI. It allowed robust and flexible handling of longitudinal organ deformations while considerably reducing manual intervention.
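The seed-transport step is the part that saves expert time: a deformation field obtained by registering the baseline image to each follow-up moves the baseline seeds into the new image, where fast marching is rerun. An illustrative NumPy/SciPy sketch; the displacement-field layout and names are assumptions, not the authors' code.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def transport_seeds(seeds, disp):
    """Move baseline seed coordinates into follow-up space.

    seeds: (N, 3) voxel coordinates of the baseline FM seeds.
    disp:  (3, X, Y, Z) dense displacement field taking baseline
           voxels to follow-up space (assumed layout).
    """
    coords = np.asarray(seeds, dtype=float).T  # (3, N)
    # Interpolate each displacement component at the seed locations.
    d = np.stack([map_coordinates(disp[a], coords, order=1) for a in range(3)])
    return (coords + d).T  # (N, 3) transported seeds
```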