1
Wu C, Andaloussi MA, Hormuth DA, Lima EABF, Lorenzo G, Stowers CE, Ravula S, Levac B, Dimakis AG, Tamir JI, Brock KK, Chung C, Yankeelov TE. A critical assessment of artificial intelligence in magnetic resonance imaging of cancer. npj Imaging 2025; 3:15. PMID: 40226507; PMCID: PMC11981920; DOI: 10.1038/s44303-025-00076-0
Abstract
Given the enormous output and pace of development of artificial intelligence (AI) methods in medical imaging, it can be challenging to identify the true success stories and determine the state of the art of the field. This report seeks to provide the magnetic resonance imaging (MRI) community with an initial guide to the major areas in which the methods of AI are contributing to MRI in oncology. After a general introduction to artificial intelligence, we discuss the successes and current limitations of AI in MRI when used for image acquisition, reconstruction, registration, and segmentation, as well as its utility for assisting in diagnostic and prognostic settings. Within each section, we attempt to present a balanced summary covering common techniques, the state of readiness, current clinical needs, and barriers to practical deployment in the clinic. We conclude by presenting areas in which new advances must be realized to address questions regarding generalizability, quality assurance and control, and uncertainty quantification when applying these methods to MRI of cancer, so that patient safety and practical utility are maintained.
Affiliation(s)
- Chengyue Wu
- Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, TX USA
- Department of Breast Imaging, The University of Texas MD Anderson Cancer Center, Houston, TX USA
- Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX USA
- Institute for Data Science in Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX USA
- Oden Institute for Computational Engineering and Sciences, The University of Texas at Austin, Austin, TX USA
- David A. Hormuth
- Oden Institute for Computational Engineering and Sciences, The University of Texas at Austin, Austin, TX USA
- Livestrong Cancer Institutes, The University of Texas at Austin, Austin, TX USA
| | - Ernesto A. B. F. Lima
- Oden Institute for Computational Engineering and Sciences, The University of Texas at Austin, Austin, TX USA
- Texas Advanced Computing Center, The University of Texas at Austin, Austin, TX USA
| | - Guillermo Lorenzo
- Oden Institute for Computational Engineering and Sciences, The University of Texas at Austin, Austin, TX USA
- Health Research Institute of Santiago de Compostela, Santiago de Compostela, Spain
| | - Casey E. Stowers
- Oden Institute for Computational Engineering and Sciences, The University of Texas at Austin, Austin, TX USA
| | - Sriram Ravula
- Chandra Family Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, TX USA
| | - Brett Levac
- Chandra Family Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, TX USA
| | - Alexandros G. Dimakis
- Chandra Family Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, TX USA
| | - Jonathan I. Tamir
- Oden Institute for Computational Engineering and Sciences, The University of Texas at Austin, Austin, TX USA
- Chandra Family Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, TX USA
- Department of Diagnostic Medicine, The University of Texas at Austin, Austin, TX USA
| | - Kristy K. Brock
- Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, TX USA
- Institute for Data Science in Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX USA
- Department of Radiation Physics, The University of Texas MD Anderson Cancer Center, Houston, TX USA
| | - Caroline Chung
- Institute for Data Science in Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX USA
- Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX USA
- Department of Neuroradiology, The University of Texas MD Anderson Cancer Center, Houston, TX USA
| | - Thomas E. Yankeelov
- Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, TX USA
- Oden Institute for Computational Engineering and Sciences, The University of Texas at Austin, Austin, TX USA
- Livestrong Cancer Institutes, The University of Texas at Austin, Austin, TX USA
- Department of Diagnostic Medicine, The University of Texas at Austin, Austin, TX USA
- Department of Biomedical Engineering, The University of Texas at Austin, Austin, TX USA
- Department of Oncology, The University of Texas at Austin, Austin, TX USA
2
Woollard G, Zhou W, Thiede EH, Lin C, Grigorieff N, Cossio P, Dao Duc K, Hanson SM. InstaMap: instant-NGP for cryo-EM density maps. Acta Crystallogr D Struct Biol 2025; 81:147-169. PMID: 40135651; PMCID: PMC11966239; DOI: 10.1107/s2059798325002025
Abstract
Despite the parallels between problems in computer vision and cryo-electron microscopy (cryo-EM), many state-of-the-art approaches from computer vision have yet to be adapted for cryo-EM. Within the computer-vision research community, neural implicits such as neural radiance fields (NeRFs) have enabled the detailed reconstruction of 3D objects from few images at different camera-viewing angles. While other neural implicits, specifically density fields, have been used to map conformational heterogeneity from noisy cryo-EM projection images, most approaches represent the volume with an implicit function in Fourier space, which has disadvantages compared with solving the problem in real space, complicating, for instance, masking, constraining physics or geometry, and assessing local resolution. In this work, we build on a recent development in neural implicits, a multi-resolution hash-encoding framework called instant-NGP, which we use to represent the scalar volume directly in real space and apply to the cryo-EM density-map reconstruction problem (InstaMap). We demonstrate that for both synthetic and real data, InstaMap for homogeneous reconstruction achieves higher resolution at shorter training stages than five other real-space representations. We propose a solution to noise overfitting, demonstrate that InstaMap is both lightweight and fast to train, implement masking from a user-provided input mask, and extend it to molecular-shape heterogeneity via bending space using a per-image vector field.
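For readers unfamiliar with instant-NGP, the core idea is a multiresolution hash encoding: a query coordinate is looked up, at several grid resolutions, in small hash tables of trainable feature vectors, and the trilinearly interpolated features are concatenated before a small network decodes them into a scalar density. The NumPy sketch below illustrates only the encoding step; the number of levels, table size, hash constants, and feature width are illustrative assumptions, not the settings used by InstaMap.

```python
import numpy as np

# Illustrative hyperparameters (not InstaMap's settings)
N_LEVELS = 4          # number of grid resolutions
BASE_RES = 8          # coarsest grid resolution
GROWTH = 2.0          # per-level resolution growth factor
TABLE_SIZE = 2 ** 14  # hash-table entries per level
N_FEATS = 2           # feature channels per entry
PRIMES = np.array([1, 2654435761, 805459861], dtype=np.uint64)

rng = np.random.default_rng(0)
tables = rng.normal(scale=1e-2, size=(N_LEVELS, TABLE_SIZE, N_FEATS))

def hash_corner(ijk):
    """Spatial hash of integer voxel corners (instant-NGP-style XOR hash)."""
    h = np.zeros(ijk.shape[:-1], dtype=np.uint64)
    for d in range(3):
        h ^= ijk[..., d].astype(np.uint64) * PRIMES[d]
    return (h % TABLE_SIZE).astype(np.int64)

def encode(xyz):
    """Map points in [0, 1]^3 to concatenated multiresolution hash features."""
    feats = []
    for level in range(N_LEVELS):
        res = int(BASE_RES * GROWTH ** level)
        pos = xyz * (res - 1)                  # continuous grid coordinates
        lo = np.floor(pos).astype(np.int64)    # lower corner of enclosing voxel
        frac = pos - lo                        # trilinear interpolation weights
        acc = np.zeros((xyz.shape[0], N_FEATS))
        for corner in range(8):                # accumulate over the 8 voxel corners
            offs = np.array([(corner >> d) & 1 for d in range(3)])
            w = np.prod(np.where(offs, frac, 1.0 - frac), axis=-1)
            acc += w[:, None] * tables[level, hash_corner(lo + offs)]
        feats.append(acc)
    return np.concatenate(feats, axis=-1)      # shape (n_points, N_LEVELS * N_FEATS)

pts = rng.uniform(size=(5, 3))
print(encode(pts).shape)                       # (5, 8) with these illustrative settings
```

In InstaMap such features are decoded into density values and trained against projection data; here the tables are random, so the output is meaningful only as an interface illustration.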
Affiliation(s)
- Geoffrey Woollard
- Center for Computational Biology, Flatiron Institute, New York, NY 10010, USA
- Center for Computational Mathematics, Flatiron Institute, New York, NY 10010, USA
- Department of Computer Science, University of British Columbia, Vancouver, British Columbia, Canada
- Wenda Zhou
- Center for Computational Mathematics, Flatiron Institute, New York, NY 10010, USA
- Erik H. Thiede
- Center for Computational Biology, Flatiron Institute, New York, NY 10010, USA
- Center for Computational Mathematics, Flatiron Institute, New York, NY 10010, USA
- Cornell University, Ithaca, New York, USA
- Chen Lin
- Center for Computational Biology, Flatiron Institute, New York, NY 10010, USA
- Center for Computational Mathematics, Flatiron Institute, New York, NY 10010, USA
- Nikolaus Grigorieff
- University of Massachusetts Chan Medical School, Worcester, Massachusetts, USA
- Pilar Cossio
- Center for Computational Biology, Flatiron Institute, New York, NY 10010, USA
- Center for Computational Mathematics, Flatiron Institute, New York, NY 10010, USA
- Khanh Dao Duc
- Department of Mathematics, University of British Columbia, Vancouver, British Columbia, Canada
- Sonya M. Hanson
- Center for Computational Biology, Flatiron Institute, New York, NY 10010, USA
- Center for Computational Mathematics, Flatiron Institute, New York, NY 10010, USA
3
Janjušević N, Khalilian-Gourtani A, Flinker A, Feng L, Wang Y. GroupCDL: Interpretable Denoising and Compressed Sensing MRI via Learned Group-Sparsity and Circulant Attention. IEEE Transactions on Computational Imaging 2025; 11:201-212. PMID: 40124211; PMCID: PMC11928013; DOI: 10.1109/tci.2025.3539021
Abstract
Nonlocal self-similarity within images has become an increasingly popular prior in deep-learning models. Despite their successful image restoration performance, such models remain largely uninterpretable due to their black-box construction. Our previous studies have shown that interpretable construction of a fully convolutional denoiser (CDLNet), with performance on par with state-of-the-art black-box counterparts, is achievable by unrolling a convolutional dictionary learning algorithm. In this manuscript, we seek an interpretable construction of a convolutional network with a nonlocal self-similarity prior that performs on par with black-box nonlocal models. We show that such an architecture can be effectively achieved by upgrading the ℓ1 sparsity prior (soft-thresholding) of CDLNet to an image-adaptive group-sparsity prior (group-thresholding). The proposed learned group-thresholding makes use of nonlocal attention to perform spatially varying soft-thresholding on the latent representation. To enable effective training and inference on large images with global artifacts, we propose a novel circulant-sparse attention. We achieve competitive natural-image denoising performance compared to black-box nonlocal DNNs and transformers. The interpretable construction of our network allows for a straightforward extension to Compressed Sensing MRI (CS-MRI), yielding state-of-the-art performance. Lastly, we show robustness to noise-level mismatches between training and inference for denoising and CS-MRI reconstruction.
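The difference between the two priors can be made concrete through their proximal operators: the ℓ1 prior corresponds to elementwise soft-thresholding, whereas a group-sparsity prior shrinks each coefficient according to a nonlocally aggregated magnitude, so that similar pixels are thresholded jointly. The NumPy sketch below uses a fixed, hand-made similarity matrix purely as an illustration; in GroupCDL the corresponding weights come from the learned circulant attention.

```python
import numpy as np

def soft_threshold(z, tau):
    """Elementwise soft-thresholding: the prox of the l1 norm (CDLNet-style prior)."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def group_threshold(z, tau, W):
    """Group-thresholding sketch: each coefficient is scaled according to an
    aggregated magnitude xi = W @ |z|, so coefficients at similar pixel sites
    are kept or suppressed together (image-adaptive group-sparsity prior)."""
    xi = W @ np.abs(z)                          # nonlocally aggregated magnitudes
    scale = np.maximum(1.0 - tau / np.maximum(xi, 1e-12), 0.0)
    return z * scale

rng = np.random.default_rng(0)
z = rng.normal(size=8)                          # latent coefficients at 8 pixel sites
W = np.full((8, 8), 1.0 / 8.0)                  # toy row-stochastic similarity weights

print(soft_threshold(z, 0.5))
print(group_threshold(z, 0.5, W))
```

With the uniform weights above every coefficient shares one aggregate magnitude; an image-adaptive, learned W is what makes the prior genuinely nonlocal in practice.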
Affiliation(s)
- Nikola Janjušević
- New York University Tandon School of Engineering, Electrical and Computer Engineering Department, Brooklyn, NY 11201, USA
- New York University Grossman School of Medicine, Radiology Department, New York, NY 10016, USA
- Adeen Flinker
- New York University Grossman School of Medicine, Neurology Department, New York, NY 10016, USA
- Li Feng
- New York University Grossman School of Medicine, Radiology Department, New York, NY 10016, USA
- Yao Wang
- New York University Tandon School of Engineering, Electrical and Computer Engineering Department, Brooklyn, NY 11201, USA
4
Chu J, Du C, Lin X, Zhang X, Wang L, Zhang Y, Wei H. Highly accelerated MRI via implicit neural representation guided posterior sampling of diffusion models. Med Image Anal 2025; 100:103398. PMID: 39608250; DOI: 10.1016/j.media.2024.103398
Abstract
Reconstructing high-fidelity magnetic resonance (MR) images from under-sampled k-space is a commonly used strategy to reduce scan time. The posterior sampling of diffusion models based on the real measurement data holds significant promise for improving reconstruction accuracy. However, traditional posterior sampling methods often lack effective data consistency guidance, leading to inaccurate and unstable reconstructions. Implicit neural representation (INR) has emerged as a powerful paradigm for solving inverse problems by modeling a signal's attributes as a continuous function of spatial coordinates. In this study, we present a novel posterior sampler for diffusion models using INR, named DiffINR. The INR-based component incorporates both the diffusion prior distribution and the MRI physical model to ensure high data fidelity. DiffINR demonstrates superior performance on in-distribution datasets with remarkable accuracy, even under high acceleration factors (up to R = 12 in single-channel reconstruction). Furthermore, DiffINR exhibits excellent generalizability across various tissue contrasts and anatomical structures with low uncertainty. Overall, DiffINR significantly improves MRI reconstruction in terms of accuracy, generalizability and stability, paving the way for further accelerating MRI acquisition. Notably, the proposed framework can be generalized to solve inverse problems in other medical imaging tasks.
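As background, posterior sampling for accelerated MRI typically interleaves the learned diffusion update with a data-consistency correction on the measured k-space, for example a gradient step on ||Ax - y||², where A composes the sampling mask with the Fourier transform. The NumPy fragment below sketches that generic correction for a single-coil Cartesian model; DiffINR replaces such a step with its INR-based module, so this is only meant to show where the MRI physical model enters the sampler.

```python
import numpy as np

def forward_op(x, mask):
    """Single-coil Cartesian MRI forward model: undersampled 2D FFT."""
    return mask * np.fft.fft2(x, norm="ortho")

def adjoint_op(k, mask):
    """Adjoint of the forward model: zero-filled inverse FFT."""
    return np.fft.ifft2(mask * k, norm="ortho")

def data_consistency_step(x, y, mask, step=1.0):
    """One gradient step on ||A x - y||^2, applied between diffusion updates."""
    residual = forward_op(x, mask) - y
    return x - step * adjoint_op(residual, mask)

rng = np.random.default_rng(0)
gt = rng.normal(size=(64, 64))                  # toy "ground-truth" image
mask = rng.uniform(size=(64, 64)) < 0.25        # roughly 4x undersampling mask
y = forward_op(gt, mask)                        # measured k-space
x = rng.normal(size=(64, 64))                   # current diffusion iterate
for _ in range(5):
    x = data_consistency_step(x, y, mask)
print(float(np.linalg.norm(forward_op(x, mask) - y)))  # k-space residual shrinks toward 0
```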
Affiliation(s)
- Jiayue Chu
- School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
- Chenhe Du
- School of Information Science and Technology, ShanghaiTech University, Shanghai, China
- Xiyue Lin
- School of Information Science and Technology, ShanghaiTech University, Shanghai, China
- Xiaoqun Zhang
- Institute of Natural Sciences and School of Mathematical Sciences and MOE-LSC and SJTU-GenSci Joint Laboratory, Shanghai Jiao Tong University, Shanghai, China
- Lihui Wang
- Key Laboratory of Intelligent Medical Image Analysis and Precise Diagnosis of Guizhou Province, School of Computer Science and Technology, Guizhou University, Guiyang, China
- Yuyao Zhang
- School of Information Science and Technology, ShanghaiTech University, Shanghai, China
- Hongjiang Wei
- School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China; National Engineering Research Center of Advanced Magnetic Resonance Technologies for Diagnosis and Therapy (NERC-AMRT), Shanghai Jiao Tong University, Shanghai, China.
5
Didonna A, Ramos Lopez D, Iaselli G, Amoroso N, Ferrara N, Pugliese GMI. Deep Convolutional Framelets for Dose Reconstruction in Boron Neutron Capture Therapy with Compton Camera Detector. Cancers (Basel) 2025; 17:130. PMID: 39796757; PMCID: PMC11719915; DOI: 10.3390/cancers17010130
Abstract
BACKGROUND Boron neutron capture therapy (BNCT) is an innovative binary form of radiation therapy with high selectivity towards cancer tissue, based on the neutron capture reaction 10B(n,α)7Li, which consists of exposing patients to neutron beams after administration of a boron compound that preferentially accumulates in cancer cells. The high linear energy transfer products of the ensuing reaction deposit their energy at the cell level, sparing normal tissue. Although progress in accelerator-based BNCT has led to renewed interest in this cancer treatment modality, in vivo dose monitoring during treatment remains infeasible and several approaches are under investigation. While Compton imaging presents various advantages over other imaging methods, it typically requires long reconstruction times, comparable with the BNCT treatment duration. METHODS This study aims to develop deep neural network models to estimate the dose distribution by using a simulated dataset of BNCT Compton camera images. The models aim to avoid the long iteration times associated with the maximum-likelihood expectation-maximization (MLEM) algorithm, enabling prompt dose reconstruction during treatment. The U-Net architecture and two variants based on the deep convolutional framelets framework have been used for noise and artifact reduction in few-iteration reconstructed images. RESULTS This approach has led to promising results in terms of reconstruction accuracy and processing time, with a reduction by a factor of about 6 with respect to classical iterative algorithms. CONCLUSIONS This is good reconstruction-time performance given typical BNCT treatment durations. Further enhancements may be achieved by optimizing the reconstruction of input images with different deep learning techniques.
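For context, the iterative baseline whose runtime the networks aim to bypass is MLEM, which multiplicatively updates the current estimate by the back-projected ratio of measured to predicted counts. A generic NumPy version for an arbitrary nonnegative system matrix is sketched below; the actual Compton-camera system matrix and imaging geometry are specific to the paper and are not modeled here.

```python
import numpy as np

def mlem(A, y, n_iter=50, eps=1e-12):
    """Maximum-likelihood expectation-maximization for y ~ Poisson(A @ x).

    A : (n_detector_bins, n_voxels) nonnegative system matrix
    y : (n_detector_bins,) measured counts
    """
    x = np.ones(A.shape[1])                 # nonnegative initial estimate
    sens = A.sum(axis=0) + eps              # sensitivity image A^T 1
    for _ in range(n_iter):
        y_hat = A @ x + eps                 # predicted counts under current estimate
        x *= (A.T @ (y / y_hat)) / sens     # multiplicative EM update
    return x

# Toy reconstruction problem with a random nonnegative system matrix
rng = np.random.default_rng(0)
A = rng.uniform(size=(200, 40))
x_true = rng.uniform(size=40)
y = rng.poisson(A @ x_true).astype(float)
x_rec = mlem(A, y)
print(round(float(np.corrcoef(x_true, x_rec)[0, 1]), 3))   # correlation with the truth
```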
Affiliation(s)
- Angelo Didonna
- Istituto Nazionale di Fisica Nucleare, Sezione di Bari, 70125 Bari, Italy (N.F.)
- Scuola di Specializzazione in Fisica Medica, Università degli Studi di Milano, 20133 Milan, Italy
- Dayron Ramos Lopez
- Istituto Nazionale di Fisica Nucleare, Sezione di Bari, 70125 Bari, Italy (N.F.)
- Dipartimento Interateneo di Fisica, Politecnico di Bari, 70125 Bari, Italy
- Giuseppe Iaselli
- Istituto Nazionale di Fisica Nucleare, Sezione di Bari, 70125 Bari, Italy (N.F.)
- Dipartimento Interateneo di Fisica, Politecnico di Bari, 70125 Bari, Italy
- Nicola Amoroso
- Istituto Nazionale di Fisica Nucleare, Sezione di Bari, 70125 Bari, Italy (N.F.)
- Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
- Nicola Ferrara
- Istituto Nazionale di Fisica Nucleare, Sezione di Bari, 70125 Bari, Italy (N.F.)
- Dipartimento Interateneo di Fisica, Politecnico di Bari, 70125 Bari, Italy
- Gabriella Maria Incoronata Pugliese
- Istituto Nazionale di Fisica Nucleare, Sezione di Bari, 70125 Bari, Italy (N.F.)
- Dipartimento Interateneo di Fisica, Politecnico di Bari, 70125 Bari, Italy
6
Cohen O, Kargar S, Woo S, Vargas A, Otazo R. DCE-Qnet: deep network quantification of dynamic contrast enhanced (DCE) MRI. MAGMA 2024; 37:1077-1090. PMID: 39112813; PMCID: PMC11996236; DOI: 10.1007/s10334-024-01189-0
Abstract
INTRODUCTION Quantification of dynamic contrast-enhanced (DCE)-MRI has the potential to provide valuable clinical information, but robust pharmacokinetic modeling remains a challenge for clinical adoption. METHODS A 7-layer neural network called DCE-Qnet was trained on simulated DCE-MRI signals derived from the Extended Tofts model with the Parker arterial input function. Network training incorporated B1 inhomogeneities to estimate perfusion (Ktrans, vp, ve), tissue T1 relaxation, proton density and bolus arrival time (BAT). The accuracy was tested in a digital phantom in comparison to conventional nonlinear least-squares (NLSQ) fitting. In vivo testing was conducted in ten healthy subjects. Regions of interest in the cervix and uterine myometrium were used to calculate the inter-subject variability. The clinical utility was demonstrated on a cervical cancer patient. Test-retest experiments were used to assess reproducibility of the parameter maps in the tumor. RESULTS The DCE-Qnet reconstruction outperformed NLSQ in the phantom. The coefficient of variation (CV) in the healthy cervix varied between 5 and 51% depending on the parameter. Parameter values in the tumor agreed with previous studies despite differences in methodology. The CV in the tumor varied between 1 and 47%. CONCLUSION The proposed approach provides comprehensive DCE-MRI quantification from a single acquisition. DCE-Qnet eliminates the need for a separate T1 scan or BAT processing, leading to a reduction of 10 min per scan and more accurate quantification.
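For reference, the signal model that DCE-Qnet learns to invert is the Extended Tofts equation, C_t(t) = v_p C_p(t) + Ktrans ∫_0^t C_p(τ) exp(-(Ktrans/v_e)(t - τ)) dτ, where C_p is the arterial input function. The NumPy sketch below evaluates this forward model with a toy biexponential stand-in for the Parker AIF; the parameter values and time grid are illustrative only, not those used to train the network.

```python
import numpy as np

def aif(t):
    """Toy biexponential arterial input function (a stand-in for the Parker AIF)."""
    return 5.0 * (np.exp(-0.2 * t) - np.exp(-2.0 * t)) * (t >= 0)

def extended_tofts(t, ktrans, ve, vp):
    """Tissue concentration under the Extended Tofts model (time in minutes)."""
    dt = t[1] - t[0]
    cp = aif(t)
    kernel = np.exp(-(ktrans / ve) * t)
    conv = np.convolve(cp, kernel)[: len(t)] * dt   # causal convolution term
    return vp * cp + ktrans * conv

t = np.arange(0.0, 5.0, 0.01)                            # 5-minute acquisition
ct = extended_tofts(t, ktrans=0.25, ve=0.4, vp=0.05)     # typical-order parameter values
print(round(float(ct.max()), 4))
```

Fitting this model voxel-wise is what the NLSQ baseline does; DCE-Qnet instead maps the measured signal directly to the parameters with a trained network.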
Affiliation(s)
- Ouri Cohen
- Department of Medical Physics, Memorial Sloan Kettering Cancer Center, 320 East 61st St, New York, NY 10025, USA.
- Soudabeh Kargar
- Department of Medical Physics, Memorial Sloan Kettering Cancer Center, 320 East 61st St, New York, NY 10025, USA
- Sungmin Woo
- Department of Radiology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Alberto Vargas
- Department of Radiology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Ricardo Otazo
- Department of Medical Physics, Memorial Sloan Kettering Cancer Center, 320 East 61st St, New York, NY 10025, USA
- Department of Radiology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
7
Alberti GS, Santacesaria M, Sciutto S. Continuous Generative Neural Networks: A Wavelet-Based Architecture in Function Spaces. Numerical Functional Analysis and Optimization 2024; 46:1-44. PMID: 39691281; PMCID: PMC11649217; DOI: 10.1080/01630563.2024.2422064
Abstract
In this work, we present and study Continuous Generative Neural Networks (CGNNs), namely, generative models in the continuous setting: the output of a CGNN belongs to an infinite-dimensional function space. The architecture is inspired by DCGAN, with one fully connected layer, several convolutional layers and nonlinear activation functions. In the continuous L² setting, the dimensions of the spaces of each layer are replaced by the scales of a multiresolution analysis of a compactly supported wavelet. We present conditions on the convolutional filters and on the nonlinearity that guarantee that a CGNN is injective. This theory finds applications to inverse problems, and allows for deriving Lipschitz stability estimates for (possibly nonlinear) infinite-dimensional inverse problems with unknowns belonging to the manifold generated by a CGNN. Several numerical simulations, including signal deblurring, illustrate and validate this approach.
Affiliation(s)
- Silvia Sciutto
- MaLGa Center, Department of Mathematics, University of Genoa, Genova, Italy
8
Debarnot V, Weiss P. Deep-blur: Blind identification and deblurring with convolutional neural networks. Biological Imaging 2024; 4:e13. PMID: 39776610; PMCID: PMC11704139; DOI: 10.1017/s2633903x24000096
Abstract
We propose a neural network architecture and a training procedure to estimate blurring operators and deblur images from a single degraded image. Our key assumption is that the forward operators can be parameterized by a low-dimensional vector. The models we consider include a description of the point spread function with Zernike polynomials in the pupil plane or product-convolution expansions, which incorporate space-varying operators. Numerical experiments show that the proposed method can accurately and robustly recover the blur parameters even for large noise levels. For a convolution model, the average signal-to-noise ratio of the recovered point spread function ranges from 13 dB in the noiseless regime to 8 dB in the high-noise regime. In comparison, the tested alternatives yield negative values. This operator estimate can then be used as an input for an unrolled neural network to deblur the image. Quantitative experiments on synthetic data demonstrate that this method outperforms other commonly used methods both perceptually and in terms of SSIM. The algorithm can process a 512 × 512 image in under a second on a consumer graphics card and does not require any human interaction once the operator parameterization has been set up.
Affiliation(s)
- Valentin Debarnot
- Department of Mathematics and Computer Science, University of Basel, Basel, Switzerland
- Pierre Weiss
- Institut de Recherche en Informatique de Toulouse (IRIT), CNRS & Université de Toulouse, Toulouse, France
- Centre de Biologie Intégrative (CBI), Laboratoire de biologie Moléculaire, Cellulaire et du Développement (MCD), CNRS & Université de Toulouse, Toulouse, France
9
Chen Y, Zhang H, Ma J, Cui TJ, del Hougne P, Li L. Semantic-Electromagnetic Inversion With Pretrained Multimodal Generative Model. Advanced Science 2024; 11:e2406793. PMID: 39246254; PMCID: PMC11558082; DOI: 10.1002/advs.202406793
Abstract
Across diverse domains of science and technology, electromagnetic (EM) inversion problems benefit from the ability to account for multimodal prior information to regularize their inherent ill-posedness. Indeed, besides priors that are formulated mathematically or learned from quantitative data, valuable prior information may be available in the form of text or images. Beyond handling semantic multimodality, it is also important to minimize the cost of adapting to a new physical measurement operator and to limit the requirements for costly labeled data. Here, these challenges are tackled with a frugal and multimodal semantic-EM inversion technique. The key ingredient is a multimodal generator of reconstruction results that can be pretrained, being agnostic to the physical measurement operator. The generator is fed by a multimodal foundation model encoding the multimodal semantic prior and a physical adapter encoding the measured data. For a new physical setting, only the lightweight physical adapter is retrained. The authors' architecture also enables a flexible iterative step-by-step solution to the inverse problem where each step can be semantically controlled. The feasibility and benefits of this methodology are demonstrated for three EM inverse problems: a canonical two-dimensional inverse-scattering problem in numerics, as well as three-dimensional and four-dimensional compressive microwave meta-imaging experiments.
Affiliation(s)
- Yanjin Chen
- State Key Laboratory of Advanced Optical Communication Systems and Networks, School of Electronics, Peking University, Beijing 100871, China
- Hongrui Zhang
- State Key Laboratory of Advanced Optical Communication Systems and Networks, School of Electronics, Peking University, Beijing 100871, China
- Jie Ma
- State Key Laboratory of Advanced Optical Communication Systems and Networks, School of Electronics, Peking University, Beijing 100871, China
- Tie Jun Cui
- State Key Laboratory of Millimeter Waves, Southeast University, Nanjing 210096, China
- Pazhou Laboratory (Huangpu), Guangzhou 510555, China
- Lianlin Li
- State Key Laboratory of Advanced Optical Communication Systems and Networks, School of Electronics, Peking University, Beijing 100871, China
- Pazhou Laboratory (Huangpu), Guangzhou 510555, China
10
Naughton N, Cahoon SM, Sutton BP, Georgiadis JG. Accelerated, Physics-Inspired Inference of Skeletal Muscle Microstructure From Diffusion-Weighted MRI. IEEE Transactions on Medical Imaging 2024; 43:3698-3709. PMID: 38709599; PMCID: PMC11650671; DOI: 10.1109/tmi.2024.3397790
Abstract
Muscle health is a critical component of overall health and quality of life. However, current measures of skeletal muscle health take limited account of microstructural variations within muscle, which play a crucial role in mediating muscle function. To address this, we present a physics-inspired, machine learning-based framework for the non-invasive estimation of microstructural organization in skeletal muscle from diffusion-weighted MRI (dMRI) in an uncertainty-aware manner. To reduce the computational expense associated with direct numerical simulations of dMRI physics, a polynomial meta-model is developed that accurately represents the input/output relationships of a high-fidelity numerical model. This meta-model is used to develop a Gaussian process (GP) model that provides voxel-wise estimates and confidence intervals of microstructure organization in skeletal muscle. Given noise-free data, the GP model accurately estimates microstructural parameters. In the presence of noise, the diameter, intracellular diffusion coefficient, and membrane permeability are accurately estimated with narrow confidence intervals, while volume fraction and extracellular diffusion coefficient are poorly estimated and exhibit wide confidence intervals. A reduced-acquisition GP model, consisting of one-third of the diffusion-encoding measurements, is shown to predict parameters with similar accuracy to the original model. The fiber diameter and volume fraction estimated by the reduced GP model are validated via histology, with both parameters accurately estimated, demonstrating the capability of the proposed framework as a promising non-invasive tool for assessing skeletal muscle health and function.
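The uncertainty-aware estimation described here amounts to fitting a Gaussian-process surrogate that maps dMRI-derived features to microstructural parameters and returns a predictive standard deviation alongside each estimate. The scikit-learn sketch below shows that generic pattern on a toy one-dimensional function; the polynomial meta-model, the actual dMRI features, and the muscle parameters of the paper are not reproduced here.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Toy training data: one simulated "signal feature" -> one "microstructural parameter"
rng = np.random.default_rng(0)
X_train = rng.uniform(0.0, 1.0, size=(40, 1))
y_train = np.sin(4.0 * X_train[:, 0]) + 0.05 * rng.normal(size=40)

# GP with a smooth kernel plus a noise term, as in typical surrogate modelling
kernel = 1.0 * RBF(length_scale=0.2) + WhiteKernel(noise_level=0.01)
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X_train, y_train)

# Voxel-wise style prediction: point estimate plus a confidence interval
X_test = np.linspace(0.0, 1.0, 5).reshape(-1, 1)
mean, std = gp.predict(X_test, return_std=True)
for m, s in zip(mean, std):
    print(f"estimate = {m:+.3f}, 95% CI half-width ~ {1.96 * s:.3f}")
```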
11
Choi K. Self-supervised learning for CT image denoising and reconstruction: a review. Biomed Eng Lett 2024; 14:1207-1220. PMID: 39465103; PMCID: PMC11502646; DOI: 10.1007/s13534-024-00424-w
Abstract
This article reviews self-supervised learning methods for CT image denoising and reconstruction. Currently, deep learning has become a dominant tool in medical imaging as well as computer vision. In particular, self-supervised learning approaches have attracted great attention as techniques for learning to denoise CT images without clean references or paired noisy references. After briefly reviewing the fundamentals of CT image denoising and reconstruction, we examine the progress of deep learning in CT image denoising and reconstruction. Finally, we focus on the theoretical and methodological evolution of self-supervised learning for image denoising and reconstruction.
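One representative idea in this family is blind-spot (Noise2Self-style) training: a random subset of pixels is hidden from the network input and the loss is computed only on those hidden pixels, so no clean reference image is ever needed. The PyTorch fragment below sketches that masked loss with a toy CNN and random data; it is a generic illustration of the principle rather than an implementation of any specific method covered by the review.

```python
import torch
import torch.nn as nn

# Tiny denoising CNN (illustrative only)
net = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, 3, padding=1),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

noisy = torch.randn(8, 1, 64, 64)            # stand-in for a batch of noisy CT slices

for step in range(5):
    # Blind-spot masking: hide a random subset of pixels from the network input
    mask = (torch.rand_like(noisy) < 0.05).float()
    pred = net(noisy * (1.0 - mask))
    # Self-supervised loss: predict the held-out noisy pixels from their neighbours
    loss = ((pred - noisy) ** 2 * mask).sum() / mask.sum().clamp(min=1.0)
    opt.zero_grad()
    loss.backward()
    opt.step()
    print(float(loss))
```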
Affiliation(s)
- Kihwan Choi
- Department of Applied Artificial Intelligence, Seoul National University of Science and Technology, 232 Gongneung-ro, Nowon-gu, Seoul, 01811 Republic of Korea
12
Mohammadi N, Goswami S, Kabir IE, Khan S, Feng F, McAleavey S, Doyley MM, Cetin M. Integrating Learning-Based Priors With Physics-Based Models in Ultrasound Elasticity Reconstruction. IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control 2024; 71:1406-1419. PMID: 38913531; DOI: 10.1109/tuffc.2024.3417905
Abstract
Ultrasound elastography images, which enable quantitative visualization of tissue stiffness, can be reconstructed by solving an inverse problem. Classical model-based methods are usually formulated in terms of constrained optimization problems. To stabilize the elasticity reconstructions, regularization techniques, such as the Tikhonov method, are used at the cost of promoting smoothness and blurriness in the reconstructed images. Thus, incorporating a suitable regularizer is essential for reducing elasticity reconstruction artifacts, while finding the most suitable one is challenging. In this work, we present a new statistical representation of the physical imaging model, which incorporates effective signal-dependent colored noise modeling. Moreover, we develop a learning-based integrated statistical framework, which combines a physical model with learning-based priors. We use a dataset of simulated phantoms with various elasticity distributions and geometric patterns to train a denoising regularizer as the learning-based prior. We use fixed-point approaches and variants of gradient descent for solving the integrated optimization task following learning-based plug-and-play (PnP) prior and regularization by denoising (RED) paradigms. Finally, we evaluate the performance of the proposed approaches in terms of relative mean square error (RMSE), obtaining nearly 20% improvement over classical model-based methods for both piecewise-smooth simulated phantoms and experimental phantoms, and 12% improvement for both spatially varying breast-mimicking simulated phantoms and an experimental breast phantom, demonstrating the potential clinical relevance of our work. Moreover, the qualitative comparisons of reconstructed images demonstrate the robust performance of the proposed methods even for complex elasticity structures that might be encountered in clinical settings.
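In the regularization-by-denoising (RED) paradigm referenced here, the reconstruction is driven by a gradient that combines the physical data-fidelity term with a denoiser residual, g(x) = A^T(Ax - y) + λ(x - D(x)). The NumPy sketch below uses a Gaussian smoother as a stand-in for the learned denoiser and a random dense matrix as a stand-in for the elasticity forward model; the step size, λ, and problem dimensions are illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def denoiser(x):
    """Stand-in denoiser D(x); the paper trains a CNN denoiser on simulated phantoms."""
    return gaussian_filter(x, sigma=1.0)

def red_reconstruction(A, y, shape, lam=0.1, step=0.2, n_iter=100):
    """RED-style iteration:  x <- x - step * (A^T(Ax - y) + lam * (x - D(x)))."""
    x = np.zeros(shape)
    for _ in range(n_iter):
        data_grad = (A.T @ (A @ x.ravel() - y)).reshape(shape)
        prior_grad = lam * (x - denoiser(x))
        x = x - step * (data_grad + prior_grad)
    return x

# Toy problem: noisy compressive measurements of a piecewise-constant "stiffness" map
rng = np.random.default_rng(0)
gt = np.zeros((32, 32))
gt[8:24, 8:24] = 1.0                                   # stiff inclusion
A = rng.normal(size=(600, gt.size)) / np.sqrt(gt.size)
y = A @ gt.ravel() + 0.01 * rng.normal(size=600)
rec = red_reconstruction(A, y, gt.shape)
print(round(float(np.linalg.norm(rec - gt) / np.linalg.norm(gt)), 3))   # relative error
```

A plug-and-play variant would instead apply the denoiser as a proximal step inside a splitting algorithm; both share the idea of using a denoiser in place of an explicit regularizer.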
13
Yiasemis G, Moriakov N, Sonke JJ, Teuwen J. vSHARP: Variable Splitting Half-quadratic ADMM algorithm for reconstruction of inverse-problems. Magn Reson Imaging 2024; 115:110266. PMID: 39461485; DOI: 10.1016/j.mri.2024.110266
Abstract
Medical Imaging (MI) tasks, such as accelerated parallel Magnetic Resonance Imaging (MRI), often involve reconstructing an image from noisy or incomplete measurements. This amounts to solving ill-posed inverse problems, where a satisfactory closed-form analytical solution is not available. Traditional methods such as Compressed Sensing (CS) in MRI reconstruction can be time-consuming or prone to obtaining low-fidelity images. Recently, a plethora of Deep Learning (DL) approaches have demonstrated superior performance in inverse-problem solving, surpassing conventional methods. In this study, we propose vSHARP (variable Splitting Half-quadratic ADMM algorithm for Reconstruction of inverse Problems), a novel DL-based method for solving ill-posed inverse problems arising in MI. vSHARP utilizes the Half-Quadratic Variable Splitting method and employs the Alternating Direction Method of Multipliers (ADMM) to unroll the optimization process. For data consistency, vSHARP unrolls a differentiable gradient descent process in the image domain, while a DL-based denoiser, such as a U-Net architecture, is applied to enhance image quality. vSHARP also employs a dilated-convolution DL-based model to predict the Lagrange multipliers for the ADMM initialization. We evaluate vSHARP on tasks of accelerated parallel MRI Reconstruction using two distinct datasets and on accelerated parallel dynamic MRI Reconstruction using another dataset. Our comparative analysis with state-of-the-art methods demonstrates the superior performance of vSHARP in these applications.
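The structure that vSHARP unrolls alternates between a data-consistency sub-problem solved by gradient descent in the image domain, a denoising sub-problem handled by a learned network, and a Lagrange-multiplier update. The NumPy sketch below mimics that structure with a Gaussian smoother in place of the U-Net denoiser and a small dense operator in place of the MRI sampling operator; the penalty weight, step size, and number of unrolled iterations are illustrative assumptions, not the paper's configuration.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def unrolled_admm(A, y, shape, rho=1.0, n_outer=10, n_inner=5, step=0.2):
    """Variable-splitting iteration of the kind vSHARP unrolls:
       x-step: gradient descent on ||A x - y||^2 + rho/2 ||x - z + u||^2
       z-step: denoiser applied to x + u (a U-Net in vSHARP, a smoother here)
       u-step: dual ascent u <- u + x - z
    """
    x = np.zeros(shape); z = np.zeros(shape); u = np.zeros(shape)
    for _ in range(n_outer):
        for _ in range(n_inner):               # unrolled data-consistency steps
            grad = (A.T @ (A @ x.ravel() - y)).reshape(shape) + rho * (x - z + u)
            x = x - step * grad
        z = gaussian_filter(x + u, sigma=1.0)  # denoising sub-problem
        u = u + x - z                          # Lagrange-multiplier update
    return x

rng = np.random.default_rng(0)
gt = np.zeros((32, 32))
gt[10:22, 10:22] = 1.0
A = rng.normal(size=(600, gt.size)) / np.sqrt(gt.size)
y = A @ gt.ravel() + 0.01 * rng.normal(size=600)
rec = unrolled_admm(A, y, gt.shape)
print(round(float(np.linalg.norm(rec - gt) / np.linalg.norm(gt)), 3))   # relative error
```

In vSHARP the denoiser and the initialization of the Lagrange multipliers are learned (the latter by a dilated-convolution network), and the whole unrolled scheme is trained end to end rather than fixed as above.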
Affiliation(s)
- George Yiasemis
- Department of Radiation Oncology, the Netherlands Cancer Institute, Plesmanlaan 121, Amsterdam 1066 CX, the Netherlands; University of Amsterdam, Science Park 904, Amsterdam 1098 XH, the Netherlands
- Nikita Moriakov
- Department of Radiation Oncology, the Netherlands Cancer Institute, Plesmanlaan 121, Amsterdam 1066 CX, the Netherlands; University of Amsterdam, Science Park 904, Amsterdam 1098 XH, the Netherlands
- Jan-Jakob Sonke
- Department of Radiation Oncology, the Netherlands Cancer Institute, Plesmanlaan 121, Amsterdam 1066 CX, the Netherlands; University of Amsterdam, Science Park 904, Amsterdam 1098 XH, the Netherlands
- Jonas Teuwen
- Department of Radiation Oncology, the Netherlands Cancer Institute, Plesmanlaan 121, Amsterdam 1066 CX, the Netherlands; University of Amsterdam, Science Park 904, Amsterdam 1098 XH, the Netherlands; Department of Medical Imaging, Radboud University Medical Center, Geert Grooteplein Zuid 10, Nijmegen 6525 GA, the Netherlands.
14
Hu Y, Gan W, Ying C, Wang T, Eldeniz C, Liu J, Chen Y, An H, Kamilov US. SPICER: Self-supervised learning for MRI with automatic coil sensitivity estimation and reconstruction. Magn Reson Med 2024; 92:1048-1063. PMID: 38725383; DOI: 10.1002/mrm.30121
Abstract
PURPOSE To introduce a novel deep model-based architecture (DMBA), SPICER, that uses pairs of noisy and undersampled k-space measurements of the same object to jointly train a model for MRI reconstruction and automatic coil sensitivity estimation. METHODS SPICER consists of two modules that simultaneously reconstruct accurate MR images and estimate high-quality coil sensitivity maps (CSMs). The first module, the CSM estimation module, uses a convolutional neural network (CNN) to estimate CSMs from the raw measurements. The second module, the DMBA-based MRI reconstruction module, forms reconstructed images from the input measurements and the estimated CSMs using both the physical measurement model and a learned CNN prior. With the benefit of our self-supervised learning strategy, SPICER can be efficiently trained without any fully sampled reference data. RESULTS We validate SPICER on both open-access datasets and experimentally collected data, showing that it can achieve state-of-the-art performance in highly accelerated data acquisition settings (up to 10×). Our results also highlight the importance of the different modules of SPICER (the DMBA, the CSM estimation, and the SPICER training loss) for the final performance of the method. Moreover, SPICER can estimate better CSMs than pre-estimation methods, especially when the ACS data are limited. CONCLUSION Despite being trained on noisy undersampled data, SPICER can reconstruct high-quality images and CSMs in highly undersampled settings, outperforming other self-supervised learning methods and matching the performance of the well-known E2E-VarNet trained on fully sampled ground-truth data.
Affiliation(s)
- Yuyang Hu
- Department of Electrical and Systems Engineering, Washington University in St. Louis, St. Louis, Missouri
- Weijie Gan
- Department of Computer Science and Engineering, Washington University in St. Louis, St. Louis, Missouri
- Chunwei Ying
- Mallinckrodt Institute of Radiology, Washington University in St. Louis, St. Louis, Missouri
- Tongyao Wang
- Department of Biomedical Engineering, Washington University in St. Louis, St. Louis, Missouri
- Cihat Eldeniz
- Mallinckrodt Institute of Radiology, Washington University in St. Louis, St. Louis, Missouri
- Jiaming Liu
- Department of Electrical and Systems Engineering, Washington University in St. Louis, St. Louis, Missouri
- Yasheng Chen
- Department of Neurology, Washington University in St. Louis, St. Louis, Missouri
- Hongyu An
- Department of Electrical and Systems Engineering, Washington University in St. Louis, St. Louis, Missouri
- Mallinckrodt Institute of Radiology, Washington University in St. Louis, St. Louis, Missouri
- Department of Biomedical Engineering, Washington University in St. Louis, St. Louis, Missouri
- Department of Neurology, Washington University in St. Louis, St. Louis, Missouri
- Ulugbek S Kamilov
- Department of Electrical and Systems Engineering, Washington University in St. Louis, St. Louis, Missouri
- Department of Computer Science and Engineering, Washington University in St. Louis, St. Louis, Missouri
15
Yasuhiko O, Takeuchi K. Bidirectional in-silico clearing approach for deep refractive-index tomography using a sparsely sampled transmission matrix. Biomedical Optics Express 2024; 15:5296-5313. PMID: 39296398; PMCID: PMC11407245; DOI: 10.1364/boe.524859
Abstract
Optical diffraction tomography (ODT) enables the label-free volumetric imaging of biological specimens by mapping their three-dimensional refractive index (RI) distribution. However, the depth of imaging achievable is restricted due to spatially inhomogeneous RI distributions that induce multiple scattering. In this study, we introduce a novel ODT technique named bidirectional in-silico clearing RI tomography. This method incorporates both forward and reversed in-silico clearing. For the reversed in-silico clearing, we have integrated an ODT reconstruction framework with a transmission matrix approach, which enables RI reconstruction and wave backpropagation from the illumination side without necessitating modifications to the conventional ODT setup. Furthermore, the framework employs a sparsely sampled transmission matrix, significantly reducing the requisite number of measurements and computational expenses. Employing this proposed technique, we successfully imaged a spheroid with a thickness of 263 µm, corresponding to 11.4 scattering mean free paths. This method was successfully applied to various biological specimens, including liver and colon spheroids, demonstrating consistent imaging performance across samples with varied morphologies.
Affiliation(s)
- Osamu Yasuhiko
- Central Research Laboratory, Hamamatsu Photonics K.K., 5000 Hirakuchi, Hamana-ku, Hamamatsu, Shizuoka 434-8601, Japan
- Kozo Takeuchi
- Central Research Laboratory, Hamamatsu Photonics K.K., 5000 Hirakuchi, Hamana-ku, Hamamatsu, Shizuoka 434-8601, Japan
16
Manekar R, Negrini E, Pham M, Jacobs D, Srivastava J, Osher SJ, Miao J. Low-Light Phase Retrieval With Implicit Generative Priors. IEEE Transactions on Image Processing 2024; 33:4728-4737. PMID: 39178091; DOI: 10.1109/tip.2024.3445739
Abstract
Phase retrieval (PR) is fundamentally important in scientific imaging and is crucial for nanoscale techniques like coherent diffractive imaging (CDI). Low radiation dose imaging is essential for applications involving radiation-sensitive samples. However, most PR methods struggle in low-dose scenarios due to high shot noise. Recent advancements in optical data acquisition setups, such as in-situ CDI, have shown promise for low-dose imaging, but they rely on a time series of measurements, making them unsuitable for single-image applications. Similarly, data-driven phase retrieval techniques are not easily adaptable to data-scarce situations. Zero-shot deep learning methods based on pre-trained and implicit generative priors have been effective in various imaging tasks but have shown limited success in PR. In this work, we propose low-dose deep image prior (LoDIP), which combines in-situ CDI with the power of implicit generative priors to address single-image low-dose phase retrieval. Quantitative evaluations demonstrate LoDIP's superior performance in this task and its applicability to real experimental scenarios.
17
Capozzoli A, Catapano I, Cinotti E, Curcio C, Esposito G, Gennarelli G, Liseno A, Ludeno G, Soldovieri F. A Learned-SVD Approach to the Electromagnetic Inverse Source Problem. Sensors (Basel) 2024; 24:4496. PMID: 39065893; PMCID: PMC11281023; DOI: 10.3390/s24144496
Abstract
We propose an artificial intelligence approach based on deep neural networks to tackle a canonical 2D scalar inverse source problem. The learned singular value decomposition (L-SVD) based on hybrid autoencoding is considered. We compare the reconstruction performance of L-SVD to the Truncated SVD (TSVD) regularized inversion, which is a canonical regularization scheme, to solve an ill-posed linear inverse problem. Numerical tests referring to far-field acquisitions show that L-SVD provides, with proper training on a well-organized dataset, superior performance in terms of reconstruction errors as compared to TSVD, allowing for the retrieval of faster spatial variations of the source. Indeed, L-SVD accommodates a priori information on the set of relevant unknown current distributions. Different from TSVD, which performs linear processing on a linear problem, L-SVD operates non-linearly on the data. A numerical analysis also underlines how the performance of the L-SVD degrades when the unknown source does not match the training dataset.
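Truncated SVD, the classical baseline that L-SVD is compared against, regularizes the inversion by retaining only the singular components whose singular values are large enough to withstand the measurement noise: x ≈ Σ_{i≤k} (u_i^T y / σ_i) v_i. The NumPy sketch below shows the truncation-level trade-off on a toy ill-conditioned operator; the operator, noise level, and truncation levels are illustrative, not the far-field model of the paper.

```python
import numpy as np

def tsvd_inverse(A, y, k):
    """Truncated-SVD regularized solution of A x = y using the k largest singular values."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    coeffs = (U.T @ y)[:k] / s[:k]         # spectral coefficients of the kept components
    return Vt[:k].T @ coeffs

rng = np.random.default_rng(0)
# Toy ill-posed operator with rapidly decaying singular values
U0, _ = np.linalg.qr(rng.normal(size=(80, 60)))
V0, _ = np.linalg.qr(rng.normal(size=(60, 60)))
A = U0 @ np.diag(np.logspace(0, -6, 60)) @ V0.T
x_true = rng.normal(size=60)
y = A @ x_true + 1e-3 * rng.normal(size=80)

for k in (5, 20, 60):                       # truncation level trades resolution vs noise
    x_k = tsvd_inverse(A, y, k)
    err = np.linalg.norm(x_k - x_true) / np.linalg.norm(x_true)
    print(f"k = {k:2d}: relative error {err:.3f}")
```

L-SVD replaces this fixed linear truncation with encoder/decoder networks trained on representative sources, which is what allows it to recover faster spatial variations when the unknown resembles the training set.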
Affiliation(s)
- Amedeo Capozzoli
- Dipartimento di Ingegneria Elettrica e delle Tecnologie dell’Informazione (DIETI), Università di Napoli Federico II, Via Claudio 21, I 80125 Napoli, Italy; (E.C.); (C.C.); (A.L.)
- Ilaria Catapano
- Consiglio Nazionale delle Ricerche, Istituto per il Rilevamento Elettromagnetico dell’Ambiente (IREA), Via Diocleziano 328, I 80124 Napoli, Italy; (I.C.); (G.E.); (G.G.); (G.L.); (F.S.)
- Eliana Cinotti
- Dipartimento di Ingegneria Elettrica e delle Tecnologie dell’Informazione (DIETI), Università di Napoli Federico II, Via Claudio 21, I 80125 Napoli, Italy; (E.C.); (C.C.); (A.L.)
- Consiglio Nazionale delle Ricerche, Istituto per il Rilevamento Elettromagnetico dell’Ambiente (IREA), Via Diocleziano 328, I 80124 Napoli, Italy; (I.C.); (G.E.); (G.G.); (G.L.); (F.S.)
- Claudio Curcio
- Dipartimento di Ingegneria Elettrica e delle Tecnologie dell’Informazione (DIETI), Università di Napoli Federico II, Via Claudio 21, I 80125 Napoli, Italy; (E.C.); (C.C.); (A.L.)
- Giuseppe Esposito
- Consiglio Nazionale delle Ricerche, Istituto per il Rilevamento Elettromagnetico dell’Ambiente (IREA), Via Diocleziano 328, I 80124 Napoli, Italy; (I.C.); (G.E.); (G.G.); (G.L.); (F.S.)
- Gianluca Gennarelli
- Consiglio Nazionale delle Ricerche, Istituto per il Rilevamento Elettromagnetico dell’Ambiente (IREA), Via Diocleziano 328, I 80124 Napoli, Italy; (I.C.); (G.E.); (G.G.); (G.L.); (F.S.)
- Angelo Liseno
- Dipartimento di Ingegneria Elettrica e delle Tecnologie dell’Informazione (DIETI), Università di Napoli Federico II, Via Claudio 21, I 80125 Napoli, Italy; (E.C.); (C.C.); (A.L.)
- Giovanni Ludeno
- Consiglio Nazionale delle Ricerche, Istituto per il Rilevamento Elettromagnetico dell’Ambiente (IREA), Via Diocleziano 328, I 80124 Napoli, Italy; (I.C.); (G.E.); (G.G.); (G.L.); (F.S.)
- Francesco Soldovieri
- Consiglio Nazionale delle Ricerche, Istituto per il Rilevamento Elettromagnetico dell’Ambiente (IREA), Via Diocleziano 328, I 80124 Napoli, Italy; (I.C.); (G.E.); (G.G.); (G.L.); (F.S.)
18
Wang Y, Cheng J. HiCDiff: single-cell Hi-C data denoising with diffusion models. Brief Bioinform 2024; 25:bbae279. PMID: 38856167; PMCID: PMC11163381; DOI: 10.1093/bib/bbae279
Abstract
The genome-wide single-cell chromosome conformation capture technique, i.e. single-cell Hi-C (ScHi-C), was recently developed to interrogate the conformation of the genome of individual cells. However, single-cell Hi-C data are much sparser than bulk Hi-C data of a population of cells, and noise in single-cell Hi-C makes it difficult to apply and analyze them in biological research. Here, we developed the first generative diffusion models (HiCDiff) to denoise single-cell Hi-C data in the form of chromosomal contact matrices. HiCDiff uses a deep residual network to remove the noise in the reverse process of diffusion and can be trained in both unsupervised and supervised learning modes. Benchmarked on several single-cell Hi-C test datasets, the diffusion models substantially remove the noise in single-cell Hi-C data. The unsupervised HiCDiff outperforms most supervised non-diffusion deep learning methods and achieves the performance comparable to the state-of-the-art supervised deep learning method in terms of multiple metrics, demonstrating that diffusion models are a useful approach to denoising single-cell Hi-C data. Moreover, its good performance holds on denoising bulk Hi-C data.
Affiliation(s)
- Yanli Wang
- Department of Electrical Engineering and Computer Science, NextGen Precision Health Institute, University of Missouri, Columbia, MO 65211, United States
- Jianlin Cheng
- Department of Electrical Engineering and Computer Science, NextGen Precision Health Institute, University of Missouri, Columbia, MO 65211, United States
19
Zhang H, Chen Y, Wang Z, Cui TJ, Del Hougne P, Li L. Semantic regularization of electromagnetic inverse problems. Nat Commun 2024; 15:3869. PMID: 38719933; PMCID: PMC11079068; DOI: 10.1038/s41467-024-48115-5
Abstract
Solving ill-posed inverse problems typically requires regularization based on prior knowledge. To date, only prior knowledge that is formulated mathematically (e.g., sparsity of the unknown) or implicitly learned from quantitative data can be used for regularization. Thereby, semantically formulated prior knowledge derived from human reasoning and recognition is excluded. Here, we introduce and demonstrate the concept of semantic regularization based on a pre-trained large language model to overcome this vexing limitation. We study the approach, first, numerically in a prototypical 2D inverse scattering problem, and, second, experimentally in 3D and 4D compressive microwave imaging problems based on programmable metasurfaces. We highlight that semantic regularization enables new forms of highly-sought privacy protection for applications like smart homes, touchless human-machine interaction and security screening: selected subjects in the scene can be concealed, or their actions and postures can be altered in the reconstruction by manipulating the semantic prior with suitable language-based control commands.
Affiliation(s)
- Hongrui Zhang
- State Key Laboratory of Advanced Optical Communication Systems and Networks, School of Electronics, Peking University, Beijing, 100871, China
- Yanjin Chen
- State Key Laboratory of Advanced Optical Communication Systems and Networks, School of Electronics, Peking University, Beijing, 100871, China
- Zhuo Wang
- State Key Laboratory of Advanced Optical Communication Systems and Networks, School of Electronics, Peking University, Beijing, 100871, China
- Tie Jun Cui
- State Key Laboratory of Millimeter Waves, Southeast University, Nanjing, 210096, China.
- Pazhou Laboratory (Huangpu), Guangzhou, Guangdong, 510555, China.
- Lianlin Li
- State Key Laboratory of Advanced Optical Communication Systems and Networks, School of Electronics, Peking University, Beijing, 100871, China.
- Pazhou Laboratory (Huangpu), Guangzhou, Guangdong, 510555, China.
20
Kofler A, Wald C, Kolbitsch C, V Tycowicz C, Ambellan F. Joint reconstruction and segmentation in undersampled 3D knee MRI combining shape knowledge and deep learning. Phys Med Biol 2024; 69:095022. PMID: 38527376; DOI: 10.1088/1361-6560/ad3797
Abstract
Objective. Task-adapted image reconstruction methods using end-to-end trainable neural networks (NNs) have been proposed to optimize reconstruction for subsequent processing tasks, such as segmentation. However, their training typically requires considerable hardware resources, and thus only relatively simple building blocks, e.g. U-Nets, are used, which, albeit powerful, do not integrate model-specific knowledge. Approach. In this work, we extend an end-to-end trainable task-adapted image reconstruction method for a clinically realistic reconstruction and segmentation problem of bone and cartilage in 3D knee MRI by incorporating statistical shape models (SSMs). The SSMs model the prior information and help to regularize the segmentation maps as a final post-processing step. We compare the proposed method to a simultaneous multitask learning approach for image reconstruction and segmentation (MTL) and to a complex SSMs-informed segmentation pipeline (SIS). Main results. Our experiments show that the combination of joint end-to-end training and SSMs to further regularize the segmentation maps obtained by MTL highly improves the results, especially in terms of mean and maximal surface errors. In particular, we achieve the segmentation quality of SIS and, at the same time, a substantial model reduction that yields a five-fold reduction in model parameters and a computational speedup of an order of magnitude. Significance. Remarkably, even for undersampling factors of up to R = 8, the obtained segmentation maps are of comparable quality to those obtained by SIS from ground-truth images.
Affiliation(s)
- A Kofler
- Physikalisch-Technische Bundesanstalt, Braunschweig and Berlin, Germany
- C Wald
- Department of Mathematics, Technical University of Berlin, Berlin, Germany
- C Kolbitsch
- Physikalisch-Technische Bundesanstalt, Braunschweig and Berlin, Germany
- C V Tycowicz
- Department of Visual and Data-Centric Computing, Zuse Institute Berlin, Berlin, Germany
- F Ambellan
- Department of Visual and Data-Centric Computing, Zuse Institute Berlin, Berlin, Germany
21
Park H, Park JH, Hwang J. An inversion problem for optical spectrum data via physics-guided machine learning. Sci Rep 2024; 14:9042. PMID: 38641702; PMCID: PMC11031606; DOI: 10.1038/s41598-024-59594-3
Abstract
We propose the regularized recurrent inference machine (rRIM), a novel machine-learning approach to solve the challenging problem of deriving the pairing glue function from measured optical spectra. The rRIM incorporates physical principles into both training and inference and affords noise robustness, flexibility with out-of-distribution data, and reduced data requirements. It effectively obtains reliable pairing glue functions from experimental optical spectra and yields promising solutions for similar inverse problems of the Fredholm integral equation of the first kind.
Affiliation(s)
- Hwiwoo Park
- Department of Physics, Sungkyunkwan University, Suwon, Gyeonggi-do, 16419, Republic of Korea
- Jun H Park
- School of Mechanical Engineering, Sungkyunkwan University, Suwon, Gyeonggi-do, 16419, Republic of Korea.
- Jungseek Hwang
- Department of Physics, Sungkyunkwan University, Suwon, Gyeonggi-do, 16419, Republic of Korea.
22
Kumar N, Krause L, Wondrak T, Eckert S, Eckert K, Gumhold S. Robust Reconstruction of the Void Fraction from Noisy Magnetic Flux Density Using Invertible Neural Networks. Sensors (Basel) 2024; 24:1213. PMID: 38400371; PMCID: PMC10893175; DOI: 10.3390/s24041213
Abstract
Electrolysis stands as a pivotal method for environmentally sustainable hydrogen production. However, the formation of gas bubbles during the electrolysis process poses significant challenges by impeding the electrochemical reactions, diminishing cell efficiency, and dramatically increasing energy consumption. Furthermore, the inherent difficulty in detecting these bubbles arises from the non-transparency of the wall of electrolysis cells. Additionally, these gas bubbles induce alterations in the conductivity of the electrolyte, leading to corresponding fluctuations in the magnetic flux density outside of the electrolysis cell, which can be measured by externally placed magnetic sensors. By solving the inverse problem of the Biot-Savart Law, we can estimate the conductivity distribution as well as the void fraction within the cell. In this work, we study different approaches to solve the inverse problem including Invertible Neural Networks (INNs) and Tikhonov regularization. Our experiments demonstrate that INNs are much more robust to solving the inverse problem than Tikhonov regularization when the level of noise in the magnetic flux density measurements is not known or changes over space and time.
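The Tikhonov baseline mentioned here solves the regularized normal equations x_λ = (A^T A + λI)^{-1} A^T b, and its practical weakness in this setting is that a single λ must be matched to a noise level that may be unknown or vary over space and time. The NumPy sketch below shows how the reconstruction error of a fixed λ degrades as the noise level changes, on a toy ill-conditioned operator standing in for the discretized Biot-Savart model.

```python
import numpy as np

def tikhonov(A, b, lam):
    """Tikhonov-regularized least squares: argmin_x ||A x - b||^2 + lam * ||x||^2."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ b)

rng = np.random.default_rng(0)
# Toy ill-conditioned forward operator (stand-in for the discretized Biot-Savart map)
U, _ = np.linalg.qr(rng.normal(size=(100, 50)))
V, _ = np.linalg.qr(rng.normal(size=(50, 50)))
A = U @ np.diag(np.logspace(0, -4, 50)) @ V.T
x_true = rng.normal(size=50)

lam = 1e-4                                            # fixed regularization weight
for sigma in (1e-4, 1e-3, 1e-2):                      # varying, "unknown" noise level
    b = A @ x_true + sigma * rng.normal(size=100)
    x_rec = tikhonov(A, b, lam)
    err = np.linalg.norm(x_rec - x_true) / np.linalg.norm(x_true)
    print(f"noise {sigma:.0e}: relative error {err:.3f}")
```

An invertible neural network avoids this per-noise-level tuning, which is the robustness advantage the study reports.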
Collapse
Affiliation(s)
- Nishant Kumar
- Institute of Software and Multimedia Technology, Technische Universität Dresden, 01187 Dresden, Germany;
| | - Lukas Krause
- Institute of Process Engineering and Environmental Technology, Technische Universität Dresden, 01069 Dresden, Germany; (L.K.); (K.E.)
- Institute of Fluid Dynamics, Helmholtz-Zentrum Dresden-Rossendorf, 01328 Dresden, Germany; (T.W.); (S.E.)
| | - Thomas Wondrak
- Institute of Fluid Dynamics, Helmholtz-Zentrum Dresden-Rossendorf, 01328 Dresden, Germany; (T.W.); (S.E.)
| | - Sven Eckert
- Institute of Fluid Dynamics, Helmholtz-Zentrum Dresden-Rossendorf, 01328 Dresden, Germany; (T.W.); (S.E.)
| | - Kerstin Eckert
- Institute of Process Engineering and Environmental Technology, Technische Universität Dresden, 01069 Dresden, Germany; (L.K.); (K.E.)
- Institute of Fluid Dynamics, Helmholtz-Zentrum Dresden-Rossendorf, 01328 Dresden, Germany; (T.W.); (S.E.)
| | - Stefan Gumhold
- Institute of Software and Multimedia Technology, Technische Universität Dresden, 01187 Dresden, Germany;
| |
Collapse
|
23
|
Sidky EY, Pan X. Report on the AAPM deep-learning spectral CT Grand Challenge. Med Phys 2024; 51:772-785. [PMID: 36938878 PMCID: PMC10509324 DOI: 10.1002/mp.16363] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/23/2022] [Revised: 02/24/2023] [Accepted: 02/26/2023] [Indexed: 03/21/2023] Open
Abstract
BACKGROUND This Special Report summarizes the 2022 AAPM Grand Challenge on Deep-Learning spectral Computed Tomography (DL-spectral CT) image reconstruction. PURPOSE The purpose of the challenge is to develop the most accurate image reconstruction algorithm possible for solving the inverse problem associated with a fast kilovolt-switching dual-energy CT scan using a three tissue-map decomposition. Participants could choose to use a deep-learning (DL), iterative, or hybrid approach. METHODS The challenge is based on a 2D breast CT simulation, where the simulated breast phantom consists of three tissue maps: adipose, fibroglandular, and calcification distributions. The phantom specification is stochastic so that multiple realizations can be generated for DL approaches. A dual-energy scan is simulated where the x-ray source potential of successive views alternates between 50 and 80 kilovolts (kV). A total of 512 views are generated, yielding 256 views for each source voltage. We generate 50 and 80 kV images by use of filtered back-projection (FBP) on negative-logarithm-processed transmission data. For participants who develop a DL approach, 1000 cases are available. Each case consists of the three 512 × 512 tissue maps, the 50- and 80-kV transmission data sets, and their corresponding FBP images. The goal of the DL network would then be to predict the material maps from either the transmission data, the FBP images, or a combination of the two. For participants developing a physics-based approach, all of the required modeling parameters are made available: geometry, spectra, and tissue attenuation curves. The provided information also allows for hybrid approaches where physics is exploited as well as information about the scanned object derived from the 1000 training cases. Final testing is performed by computation of root-mean-square error (RMSE) for predictions on the tissue maps from 100 new cases. RESULTS Test-phase submissions were received from 18 research groups. Of the 18 submissions, 17 were results obtained with algorithms that involved DL. Only the second-place team developed a physics-based image reconstruction algorithm. Both the winning and second-place teams had highly accurate results where the RMSE was nearly zero to single floating-point precision. Results from the top 10 also achieved a high degree of accuracy; as a result, this special report outlines the methodology developed by each of these groups. CONCLUSIONS The DL-spectral CT challenge successfully established a forum for developing image reconstruction algorithms that address an important inverse problem relevant for spectral CT.
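As an illustration of the kind of input provided to participants, the following is a minimal sketch of forming an FBP image from negative-log-processed transmission data. It uses a simple parallel-beam geometry via scikit-image and a stand-in phantom; the actual challenge geometry, spectra, and kV-switching scheme are not reproduced here.

```python
# Minimal sketch (illustrative assumptions): turning transmission data into an
# FBP image. A simple parallel-beam geometry via scikit-image (recent versions
# with the filter_name argument) stands in for the actual challenge geometry.
import numpy as np
from skimage.data import shepp_logan_phantom
from skimage.transform import radon, iradon, resize

# attenuation per pixel, scaled so exp(-line integral) stays in a sensible range
mu = 0.01 * resize(shepp_logan_phantom(), (256, 256))
theta = np.linspace(0.0, 180.0, 256, endpoint=False)   # 256 views for one "kV"

sino = radon(mu, theta=theta)                           # line integrals of attenuation
transmission = np.exp(-sino)                            # what a detector would record
transmission += 1e-4 * np.random.randn(*transmission.shape)  # measurement noise

sino_from_data = -np.log(np.clip(transmission, 1e-6, None))  # negative-log processing
fbp = iradon(sino_from_data, theta=theta, filter_name="ramp")
print("FBP image shape:", fbp.shape)
```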
Collapse
Affiliation(s)
- Emil Y Sidky
- Department of Radiology, The University of Chicago, Chicago, Illinois, USA
| | - Xiaochuan Pan
- Department of Radiology, The University of Chicago, Chicago, Illinois, USA
| |
Collapse
|
24
|
Xue F, Guo L, Bialkowski A, Abbosh A. Training Universal Deep-Learning Networks for Electromagnetic Medical Imaging Using a Large Database of Randomized Objects. SENSORS (BASEL, SWITZERLAND) 2023; 24:8. [PMID: 38202870 PMCID: PMC10780526 DOI: 10.3390/s24010008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/19/2023] [Revised: 12/18/2023] [Accepted: 12/18/2023] [Indexed: 01/12/2024]
Abstract
Deep learning has become a powerful tool for solving inverse problems in electromagnetic medical imaging. However, contemporary deep-learning-based approaches are susceptible to inaccuracies stemming from inadequate training datasets, primarily consisting of signals generated from simplified and homogeneous imaging scenarios. This paper introduces a novel methodology to construct an expansive and diverse database encompassing domains featuring randomly shaped structures with electrical properties representative of healthy and abnormal tissues. The core objective of this database is to enable the training of universal deep-learning techniques for permittivity profile reconstruction in complex electromagnetic medical imaging domains. The constructed database contains 25,000 unique objects created by superimposing from 6 to 24 randomly sized ellipses and polygons with varying electrical attributes. Introducing randomness in the database enhances training, allowing the neural network to achieve universality while reducing the risk of overfitting. The representative signals in the database are generated using an array of antennas that irradiate the imaging domain and capture scattered signals. A custom-designed U-net is trained by using those signals to generate the permittivity profile of the defined imaging domain. To assess the database and confirm the universality of the trained network, three distinct testing datasets with diverse objects are imaged using the designed U-net. Quantitative assessments of the generated images show promising results, with structural similarity scores consistently exceeding 0.84, normalized root mean square errors remaining below 14%, and peak signal-to-noise ratios exceeding 33 dB. These results demonstrate the practicality of the constructed database for training deep learning networks that have generalization capabilities in solving inverse problems in medical imaging without the need for additional physical assistant algorithms.
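To make the database construction concrete, the sketch below generates one random permittivity phantom by superimposing randomly sized and placed ellipses with random electrical properties, in the spirit of the 25,000-object database. The grid size, permittivity range, and use of ellipses only (no polygons) are illustrative assumptions.

```python
# Minimal sketch (assumptions: grid size, permittivity range, and ellipse-only
# shapes are illustrative, not the paper's exact database specification):
# generate a random phantom by superimposing 6..24 randomly sized, placed
# ellipses with random electrical properties.
import numpy as np

def random_phantom(n=128, rng=None):
    rng = rng or np.random.default_rng()
    yy, xx = np.mgrid[0:n, 0:n] / n
    eps = np.ones((n, n))                                 # background permittivity
    for _ in range(rng.integers(6, 25)):                  # 6..24 shapes, as in the paper
        cx, cy = rng.uniform(0.2, 0.8, size=2)            # ellipse center
        a, b = rng.uniform(0.03, 0.2, size=2)             # semi-axes
        value = rng.uniform(5.0, 60.0)                    # tissue-like permittivity (assumed range)
        mask = ((xx - cx) / a) ** 2 + ((yy - cy) / b) ** 2 <= 1.0
        eps[mask] = value                                 # later shapes overwrite earlier ones
    return eps

phantom = random_phantom()
print(phantom.min(), phantom.max())
```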
Collapse
Affiliation(s)
- Fei Xue
- School of Electrical Engineering and Computer Science, The University of Queensland, Brisbane 4072, Australia; (L.G.); (A.B.); (A.A.)
| | | | | | | |
Collapse
|
25
|
Pramanik A, Bhave S, Sajib S, Sharma SD, Jacob M. Adapting model-based deep learning to multiple acquisition conditions: Ada-MoDL. Magn Reson Med 2023; 90:2033-2051. [PMID: 37332189 PMCID: PMC10524947 DOI: 10.1002/mrm.29750] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/31/2022] [Revised: 05/21/2023] [Accepted: 05/22/2023] [Indexed: 06/20/2023]
Abstract
PURPOSE The aim of this work is to introduce a single model-based deep network that can provide high-quality reconstructions from undersampled parallel MRI data acquired with multiple sequences, acquisition settings, and field strengths. METHODS A single unrolled architecture, which offers good reconstructions for multiple acquisition settings, is introduced. The proposed scheme adapts the model to each setting by scaling the convolutional neural network (CNN) features and the regularization parameter with appropriate weights. The scaling weights and regularization parameter are derived using a multilayer perceptron model from conditional vectors, which represents the specific acquisition setting. The perceptron parameters and the CNN weights are jointly trained using data from multiple acquisition settings, including differences in field strengths, acceleration, and contrasts. The conditional network is validated using datasets acquired with different acquisition settings. RESULTS The comparison of the adaptive framework, which trains a single model using the data from all the settings, shows that it can offer consistently improved performance for each acquisition condition. The comparison of the proposed scheme with networks that are trained independently for each acquisition setting shows that it requires less training data per acquisition setting to offer good performance. CONCLUSION The Ada-MoDL framework enables the use of a single model-based unrolled network for multiple acquisition settings. In addition to eliminating the need to train and store multiple networks for different acquisition settings, this approach reduces the training data needed for each acquisition setting.
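The conditioning idea can be summarized in a few lines: a small multilayer perceptron maps the acquisition-setting vector to per-channel scales for the CNN features and to a regularization weight. The sketch below shows this mechanism in isolation; the layer sizes, activations, and conditioning vector are illustrative assumptions rather than the Ada-MoDL implementation.

```python
# Minimal sketch of the conditioning idea (layer sizes and the conditioning
# vector are illustrative assumptions, not the Ada-MoDL implementation): an MLP
# maps an acquisition-setting vector (field strength, acceleration, contrast)
# to per-channel feature scales and a regularization weight.
import torch
import torch.nn as nn

class ConditionalScaling(nn.Module):
    def __init__(self, cond_dim, n_channels, hidden=32):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(cond_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_channels + 1),            # channel scales + lambda
        )

    def forward(self, features, cond):
        out = self.mlp(cond)
        scales = torch.sigmoid(out[:, :-1])                # (batch, channels)
        lam = torch.nn.functional.softplus(out[:, -1])     # positive regularization weight
        scaled = features * scales[:, :, None, None]       # scale CNN feature maps
        return scaled, lam

cond = torch.tensor([[3.0, 4.0, 1.0]])                     # e.g., [field, acceleration, contrast id]
feats = torch.randn(1, 16, 64, 64)
scaled, lam = ConditionalScaling(cond_dim=3, n_channels=16)(feats, cond)
print(scaled.shape, float(lam))
```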
Collapse
Affiliation(s)
- Aniket Pramanik
- Department of Electrical and Computer Engineering, University of Iowa, Iowa, USA
| | - Sampada Bhave
- Canon Medical Research USA, Inc., Mayfield Village, Ohio, USA
| | - Saurav Sajib
- Canon Medical Research USA, Inc., Mayfield Village, Ohio, USA
| | - Samir D. Sharma
- Canon Medical Research USA, Inc., Mayfield Village, Ohio, USA
| | - Mathews Jacob
- Department of Electrical and Computer Engineering, University of Iowa, Iowa, USA
| |
Collapse
|
26
|
Farris S, Clapp R, Araya-Polo M. Learning-Based Seismic Velocity Inversion with Synthetic and Field Data. SENSORS (BASEL, SWITZERLAND) 2023; 23:8277. [PMID: 37837108 PMCID: PMC10574958 DOI: 10.3390/s23198277] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/24/2023] [Revised: 09/22/2023] [Accepted: 10/02/2023] [Indexed: 10/15/2023]
Abstract
Building accurate acoustic subsurface velocity models is essential for successful industrial exploration projects. Traditional inversion methods from field-recorded seismograms struggle in regions with complex geology. While deep learning (DL) presents a promising alternative, its robustness using field data in these complicated regions has not been sufficiently explored. In this study, we present a thorough analysis of DL's capability to harness labeled seismograms, whether field-recorded or synthetically generated, for accurate velocity model recovery in a challenging region of the Gulf of Mexico. Our evaluation centers on the impact of training data selection and data augmentation techniques on the DL model's ability to recover velocity profiles. Models trained on field data produced results superior to those trained on synthetic data, as measured by quantitative metrics like Mean Squared Error (MSE), Structural Similarity Index Measure (SSIM), and R2 (R-squared). They also yielded more geologically plausible predictions and sharper geophysical migration images. Conversely, models trained on synthetic data, while less precise, highlighted the potential utility of synthetic training data, especially when labeled field data are scarce. Our work shows that the efficacy of synthetic data-driven models largely depends on bridging the domain gap between training and test data through the use of advanced wave equation solvers and geologic priors. Our results underscore DL's potential to advance velocity model-building workflows in industrial settings using previously labeled field-recorded seismograms. They also highlight the indispensable role of earth scientists' domain expertise in curating synthetic data when field data are lacking.
Collapse
Affiliation(s)
- Stuart Farris
- Department of Geophysics, Stanford University, Stanford, CA 94305, USA;
| | - Robert Clapp
- Department of Geophysics, Stanford University, Stanford, CA 94305, USA;
| | | |
Collapse
|
27
|
Man C, Lau V, Su S, Zhao Y, Xiao L, Ding Y, Leung GK, Leong AT, Wu EX. Deep learning enabled fast 3D brain MRI at 0.055 tesla. SCIENCE ADVANCES 2023; 9:eadi9327. [PMID: 37738341 PMCID: PMC10516503 DOI: 10.1126/sciadv.adi9327] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/29/2023] [Accepted: 08/21/2023] [Indexed: 09/24/2023]
Abstract
In recent years, there has been an intensive development of portable ultralow-field magnetic resonance imaging (MRI) for low-cost, shielding-free, and point-of-care applications. However, its image quality is poor and scan times are long. We propose a fast acquisition and deep learning reconstruction framework to accelerate brain MRI at 0.055 tesla. The acquisition consists of a single-average three-dimensional (3D) encoding with 2D partial Fourier sampling, reducing the scan time of T1- and T2-weighted imaging protocols to 2.5 and 3.2 minutes, respectively. The 3D deep learning leverages the homogeneous brain anatomy available in high-field human brain data to enhance image quality, reduce artifacts and noise, and improve spatial resolution to a synthetic 1.5-mm isotropic resolution. Our method successfully overcomes the low-signal barrier, reconstructing fine anatomical structures that are reproducible within subjects and consistent across two protocols. It enables fast, high-quality whole-brain MRI at 0.055 tesla, with potential for widespread biomedical applications.
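As a rough illustration of the acquisition side, the sketch below builds a 2D partial Fourier sampling mask and forms the naive zero-filled reconstruction that a learned 3D network would then be asked to restore. The sampling fractions, grid size, and toy object are illustrative assumptions.

```python
# Minimal sketch (fractions and grid size are illustrative assumptions): a 2D
# partial Fourier sampling mask applied to k-space and a naive zero-filled
# reconstruction, i.e., the kind of undersampled input a learned reconstruction
# network would then restore.
import numpy as np

ny, nz = 128, 128
pf_y, pf_z = 0.7, 0.7                                     # assumed partial Fourier fractions

mask = np.zeros((ny, nz))
mask[: int(pf_y * ny), : int(pf_z * nz)] = 1.0            # keep an asymmetric corner of k-space

image = np.zeros((ny, nz)); image[40:90, 50:100] = 1.0    # toy "anatomy"
kspace = np.fft.fftshift(np.fft.fft2(image))
zero_filled = np.abs(np.fft.ifft2(np.fft.ifftshift(kspace * mask)))
print("retained samples:", int(mask.sum()), "of", ny * nz)
```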
Collapse
Affiliation(s)
- Christopher Man
- Laboratory of Biomedical Imaging and Signal Processing, The University of Hong Kong, Hong Kong SAR, People’s Republic of China
- Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong SAR, People’s Republic of China
| | - Vick Lau
- Laboratory of Biomedical Imaging and Signal Processing, The University of Hong Kong, Hong Kong SAR, People’s Republic of China
- Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong SAR, People’s Republic of China
| | - Shi Su
- Laboratory of Biomedical Imaging and Signal Processing, The University of Hong Kong, Hong Kong SAR, People’s Republic of China
- Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong SAR, People’s Republic of China
| | - Yujiao Zhao
- Laboratory of Biomedical Imaging and Signal Processing, The University of Hong Kong, Hong Kong SAR, People’s Republic of China
- Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong SAR, People’s Republic of China
| | - Linfang Xiao
- Laboratory of Biomedical Imaging and Signal Processing, The University of Hong Kong, Hong Kong SAR, People’s Republic of China
- Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong SAR, People’s Republic of China
| | - Ye Ding
- Laboratory of Biomedical Imaging and Signal Processing, The University of Hong Kong, Hong Kong SAR, People’s Republic of China
- Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong SAR, People’s Republic of China
| | - Gilberto K. K. Leung
- Department of Surgery, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, People’s Republic of China
| | - Alex T. L. Leong
- Laboratory of Biomedical Imaging and Signal Processing, The University of Hong Kong, Hong Kong SAR, People’s Republic of China
- Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong SAR, People’s Republic of China
| | - Ed X. Wu
- Laboratory of Biomedical Imaging and Signal Processing, The University of Hong Kong, Hong Kong SAR, People’s Republic of China
- Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong SAR, People’s Republic of China
| |
Collapse
|
28
|
Peng Y, Xiao Y, Chen W. High-fidelity and high-robustness free-space ghost transmission in complex media with coherent light source using physics-driven untrained neural network. OPTICS EXPRESS 2023; 31:30735-30749. [PMID: 37710611 DOI: 10.1364/oe.498073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/14/2023] [Accepted: 07/23/2023] [Indexed: 09/16/2023]
Abstract
It is well recognized that realizing high-fidelity and high-robustness ghost transmission through complex media in free space using a coherent light source is challenging. In this paper, we report a new method to realize high-fidelity and high-robustness ghost transmission through complex media by generating random amplitude-only patterns as 2D information carriers using a physics-driven untrained neural network (UNN). The random patterns are generated to encode analog signals (i.e., the ghost) without any training datasets or labeled data, and are used as information carriers in a free-space optical channel. A coherent light source modulated by the random patterns propagates through complex media, and a single-pixel detector is utilized to collect light intensities at the receiving end. A series of optical experiments has been conducted to verify the proposed approach. Experimental results demonstrate that the proposed method can realize high-fidelity and high-robustness analog-signal (ghost) transmission in complex environments, e.g., around a corner or in dynamic and turbid water. The proposed approach using the designed physics-driven UNN could open an avenue for high-fidelity free-space ghost transmission through complex media.
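The measurement model is easy to simulate: random amplitude-only patterns illuminate the signal to be transmitted, and a single-pixel detector records one intensity per pattern. The sketch below pairs that model with the classical correlation ("ghost imaging") estimate as a point of reference; the pattern count, image size, and decoding step are illustrative assumptions, not the physics-driven UNN of the paper.

```python
# Minimal sketch (illustrative, not the authors' physics-driven UNN): the
# single-pixel / ghost measurement model and the classical correlation
# reconstruction. Random amplitude-only patterns illuminate an object, a
# single-pixel detector records total intensities, and correlating the two
# recovers a (noisy) estimate of the object.
import numpy as np

n, m = 32, 4000                                           # image size, number of patterns (assumed)
rng = np.random.default_rng(0)

obj = np.zeros((n, n)); obj[8:24, 12:20] = 1.0            # toy object ("ghost" to transmit)
patterns = rng.random((m, n, n))                          # random amplitude-only patterns
bucket = (patterns * obj).sum(axis=(1, 2))                # single-pixel intensities

# Classical ghost-imaging estimate: correlation of fluctuations
ghost = ((bucket - bucket.mean())[:, None, None] * (patterns - patterns.mean(axis=0))).mean(axis=0)
print("correlation with object:", np.corrcoef(ghost.ravel(), obj.ravel())[0, 1])
```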
Collapse
|
29
|
Ivanenko M, Smolik WT, Wanta D, Midura M, Wróblewski P, Hou X, Yan X. Image Reconstruction Using Supervised Learning in Wearable Electrical Impedance Tomography of the Thorax. SENSORS (BASEL, SWITZERLAND) 2023; 23:7774. [PMID: 37765831 PMCID: PMC10538128 DOI: 10.3390/s23187774] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/02/2023] [Revised: 09/05/2023] [Accepted: 09/06/2023] [Indexed: 09/29/2023]
Abstract
Electrical impedance tomography (EIT) is a non-invasive technique for visualizing the internal structure of a human body. Capacitively coupled electrical impedance tomography (CCEIT) is a new contactless EIT technique that can potentially be used as a wearable device. Recent studies have shown that a machine learning-based approach is very promising for EIT image reconstruction. Most of the studies concern models containing up to 22 electrodes and focus on using different artificial neural network models, from simple shallow networks to complex convolutional networks. However, the use of convolutional networks in image reconstruction with a higher number of electrodes requires further investigation. In this work, two different architectures of artificial networks were used for CCEIT image reconstruction: a fully connected deep neural network and a conditional generative adversarial network (cGAN). The training dataset was generated by numerical simulation of a thorax phantom with healthy and illness-affected lungs. Three kinds of illnesses, pneumothorax, pleural effusion, and hydropneumothorax, were modeled using the electrical properties of the tissues. The thorax phantom included the heart, aorta, spine, and lungs. A sensor with 32 area electrodes was used in the numerical model. The custom-designed ECTsim toolbox for Matlab was used to solve the forward problem and simulate the measurements. Two artificial neural networks were trained with supervision for image reconstruction. Reconstruction quality was compared between those networks and one-step algebraic reconstruction methods such as linear back projection and pseudoinverse with Tikhonov regularization. This evaluation was based on pixel-to-pixel metrics such as root-mean-square error, structural similarity index, 2D correlation coefficient, and peak signal-to-noise ratio. Additionally, the diagnostic value measured by the ROC AUC metric was used to assess the image quality. The results showed that obtaining information about regional lung function (regions affected by pneumothorax or pleural effusion) is possible using image reconstruction based on supervised learning and deep neural networks in EIT. The results obtained using the cGAN are substantially better than those obtained using a fully connected network, especially in the case of noisy measurement data. However, the estimation of diagnostic value showed that even algebraic methods allow us to obtain satisfactory results.
Collapse
Affiliation(s)
- Mikhail Ivanenko
- Faculty of Electronics and Information Technology, Warsaw University of Technology, Nowowiejska 15/19, 00-665 Warsaw, Poland; (M.I.); (D.W.); (M.M.); (P.W.)
| | - Waldemar T. Smolik
- Faculty of Electronics and Information Technology, Warsaw University of Technology, Nowowiejska 15/19, 00-665 Warsaw, Poland; (M.I.); (D.W.); (M.M.); (P.W.)
| | - Damian Wanta
- Faculty of Electronics and Information Technology, Warsaw University of Technology, Nowowiejska 15/19, 00-665 Warsaw, Poland; (M.I.); (D.W.); (M.M.); (P.W.)
| | - Mateusz Midura
- Faculty of Electronics and Information Technology, Warsaw University of Technology, Nowowiejska 15/19, 00-665 Warsaw, Poland; (M.I.); (D.W.); (M.M.); (P.W.)
| | - Przemysław Wróblewski
- Faculty of Electronics and Information Technology, Warsaw University of Technology, Nowowiejska 15/19, 00-665 Warsaw, Poland; (M.I.); (D.W.); (M.M.); (P.W.)
| | - Xiaohan Hou
- Faculty of Electrical and Control Engineering, Liaoning Technical University, No. 188 Longwan Street, Huludao 125105, China; (X.H.); (X.Y.)
| | - Xiaoheng Yan
- Faculty of Electrical and Control Engineering, Liaoning Technical University, No. 188 Longwan Street, Huludao 125105, China; (X.H.); (X.Y.)
| |
Collapse
|
30
|
Shamshad F, Khan S, Zamir SW, Khan MH, Hayat M, Khan FS, Fu H. Transformers in medical imaging: A survey. Med Image Anal 2023; 88:102802. [PMID: 37315483 DOI: 10.1016/j.media.2023.102802] [Citation(s) in RCA: 186] [Impact Index Per Article: 93.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/03/2022] [Revised: 03/11/2023] [Accepted: 03/23/2023] [Indexed: 06/16/2023]
Abstract
Following unprecedented success on natural language tasks, Transformers have been successfully applied to several computer vision problems, achieving state-of-the-art results and prompting researchers to reconsider the supremacy of convolutional neural networks (CNNs) as de facto operators. Capitalizing on these advances in computer vision, the medical imaging field has also witnessed growing interest in Transformers, which can capture global context, in contrast to CNNs with local receptive fields. Inspired by this transition, in this survey, we attempt to provide a comprehensive review of the applications of Transformers in medical imaging covering various aspects, ranging from recently proposed architectural designs to unsolved issues. Specifically, we survey the use of Transformers in medical image segmentation, detection, classification, restoration, synthesis, registration, clinical report generation, and other tasks. In particular, for each of these applications, we develop a taxonomy, identify application-specific challenges as well as provide insights to solve them, and highlight recent trends. Further, we provide a critical discussion of the field's current state as a whole, including the identification of key challenges and open problems, and outline promising future directions. We hope this survey will ignite further interest in the community and provide researchers with an up-to-date reference regarding applications of Transformer models in medical imaging. Finally, to cope with the rapid development in this field, we intend to regularly update the relevant latest papers and their open-source implementations at https://github.com/fahadshamshad/awesome-transformers-in-medical-imaging.
Collapse
Affiliation(s)
- Fahad Shamshad
- MBZ University of Artificial Intelligence, Abu Dhabi, United Arab Emirates.
| | - Salman Khan
- MBZ University of Artificial Intelligence, Abu Dhabi, United Arab Emirates; CECS, Australian National University, Canberra ACT 0200, Australia
| | - Syed Waqas Zamir
- Inception Institute of Artificial Intelligence, Abu Dhabi, United Arab Emirates
| | | | - Munawar Hayat
- Faculty of IT, Monash University, Clayton VIC 3800, Australia
| | - Fahad Shahbaz Khan
- MBZ University of Artificial Intelligence, Abu Dhabi, United Arab Emirates; Computer Vision Laboratory, Linköping University, Sweden
| | - Huazhu Fu
- Institute of High Performance Computing, Agency for Science, Technology and Research (A*STAR), Singapore
| |
Collapse
|
31
|
Feshki M, Martel S, De Koninck Y, Gosselin B. Improving flat fluorescence microscopy in scattering tissue through deep learning strategies. OPTICS EXPRESS 2023; 31:23008-23026. [PMID: 37475396 DOI: 10.1364/oe.489677] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/09/2023] [Accepted: 05/24/2023] [Indexed: 07/22/2023]
Abstract
Intravital microscopy in small animals increasingly contributes to the visualization of short- and long-term mammalian biological processes. Miniaturized fluorescence microscopy has revolutionized the observation of live animals' neural circuits. The technology's ability to miniaturize further, and thereby improve freely moving experimental settings, is limited by its standard lens-based layout. Typical miniature microscope designs contain a stack of heavy and bulky optical components adjusted at relatively long distances. Computational lensless microscopy can overcome this limitation by replacing the lenses with a simple thin mask. Among other critical applications, the Flat Fluorescence Microscope (FFM) holds promise to allow real-time imaging of brain circuits in freely moving animals, but recent research reports show that the quality needs to be improved compared with imaging in clear tissue, for instance. Although promising results were reported with mask-based fluorescence microscopes in clear tissues, the impact of light scattering in biological tissue remains a major challenge. The outstanding performance of deep learning (DL) networks in computational flat cameras and imaging through scattering media studies motivates the development of deep learning models for FFMs. Our holistic ray-tracing and Monte Carlo FFM computational model assisted us in evaluating deep scattering-medium imaging with DL techniques. We demonstrate that physics-based DL models combined with the classical reconstruction technique of the alternating direction method of multipliers (ADMM) perform fast and robust image reconstruction, particularly in scattering media. The structural similarity indexes of the reconstructed images in scattering-media recordings were increased by up to 20% compared with the prevalent iterative models. We also introduce and discuss the challenges of DL approaches for FFMs under physics-informed supervised and unsupervised learning.
Collapse
|
32
|
Chen YJ, Vyas S, Huang HM, Luo Y. Self-supervised neural network for phase retrieval in QDPC microscopy. OPTICS EXPRESS 2023; 31:19897-19908. [PMID: 37381395 DOI: 10.1364/oe.491496] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/24/2023] [Accepted: 05/08/2023] [Indexed: 06/30/2023]
Abstract
Quantitative differential phase contrast (QDPC) microscopy plays an important role in biomedical research since it can provide high-resolution images and quantitative phase information for thin transparent objects without staining. Under the weak-phase assumption, the retrieval of phase information in QDPC can be treated as a linear inverse problem which can be solved by Tikhonov regularization. However, the weak-phase assumption is limited to thin objects, and tuning the regularization parameter manually is inconvenient. A self-supervised learning method based on the deep image prior (DIP) is proposed to retrieve phase information from intensity measurements. The DIP model, which takes intensity measurements as input, is trained to output the phase image. To achieve this goal, a physical layer that synthesizes the intensity measurements from the predicted phase is used. By minimizing the difference between the measured and predicted intensities, the trained DIP model is expected to reconstruct the phase image from its intensity measurements. To evaluate the performance of the proposed method, we conducted two phantom studies and reconstructed a micro-lens array and standard phase targets with different phase values. In the experimental results, the deviation of the reconstructed phase values obtained with the proposed method was less than 10% of the theoretical values. Our results show the feasibility of the proposed method to predict quantitative phase with high accuracy, without the use of a ground-truth phase.
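The self-supervised training loop can be written compactly: a network maps the measured intensities to a phase estimate, a differentiable physics layer re-synthesizes intensities from that phase, and the loss compares them to the measurements, with no ground-truth phase involved. In the sketch below the forward model is a generic linear operator standing in for the QDPC weak-phase transfer function, and the small CNN is an illustrative assumption.

```python
# Minimal sketch of the self-supervised / deep-image-prior idea (the CNN and
# the linear operator standing in for the QDPC weak-phase forward model are
# illustrative assumptions): intensities in, phase out, physics layer back to
# intensities, loss against the actual measurements.
import torch
import torch.nn as nn

n = 64
H = torch.randn(n * n, n * n) * 0.01                      # stand-in linear forward operator
phase_true = torch.zeros(n, n); phase_true[20:44, 20:44] = 0.5
with torch.no_grad():
    meas = (H @ phase_true.reshape(-1)).reshape(n, n)      # "measured" intensities

net = nn.Sequential(                                       # small CNN prior (illustrative)
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, 3, padding=1),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for it in range(200):
    phase_pred = net(meas[None, None])                     # intensities in, phase out
    meas_pred = (H @ phase_pred.reshape(-1)).reshape(n, n) # physics layer
    loss = ((meas_pred - meas) ** 2).mean()                # self-supervised objective
    opt.zero_grad(); loss.backward(); opt.step()
print("final measurement loss:", float(loss))
```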
Collapse
|
33
|
Qayyum A, Ilahi I, Shamshad F, Boussaid F, Bennamoun M, Qadir J. Untrained Neural Network Priors for Inverse Imaging Problems: A Survey. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2023; 45:6511-6536. [PMID: 36063506 DOI: 10.1109/tpami.2022.3204527] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/15/2023]
Abstract
In recent years, advancements in machine learning (ML) techniques, in particular, deep learning (DL) methods have gained a lot of momentum in solving inverse imaging problems, often surpassing the performance provided by hand-crafted approaches. Traditionally, analytical methods have been used to solve inverse imaging problems such as image restoration, inpainting, and superresolution. Unlike analytical methods for which the problem is explicitly defined and the domain knowledge is carefully engineered into the solution, DL models do not benefit from such prior knowledge and instead make use of large datasets to predict an unknown solution to the inverse problem. Recently, a new paradigm of training deep models using a single image, named untrained neural network prior (UNNP) has been proposed to solve a variety of inverse tasks, e.g., restoration and inpainting. Since then, many researchers have proposed various applications and variants of UNNP. In this paper, we present a comprehensive review of such studies and various UNNP applications for different tasks and highlight various open research problems which require further research.
Collapse
|
34
|
Hasani H, Sun J, Zhu SI, Rong Q, Willomitzer F, Amor R, McConnell G, Cossairt O, Goodhill GJ. Whole-brain imaging of freely-moving zebrafish. Front Neurosci 2023; 17:1127574. [PMID: 37139528 PMCID: PMC10150962 DOI: 10.3389/fnins.2023.1127574] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/19/2022] [Accepted: 03/28/2023] [Indexed: 05/05/2023] Open
Abstract
One of the holy grails of neuroscience is to record the activity of every neuron in the brain while an animal moves freely and performs complex behavioral tasks. While important steps forward have been taken recently in large-scale neural recording in rodent models, single-neuron resolution across the entire mammalian brain remains elusive. In contrast, the larval zebrafish offers great promise in this regard. Zebrafish are a vertebrate model with substantial homology to the mammalian brain, but their transparency allows whole-brain recordings of genetically encoded fluorescent indicators at single-neuron resolution using optical microscopy techniques. Furthermore, zebrafish begin to show a complex repertoire of natural behavior from an early age, including hunting small, fast-moving prey using visual cues. Until recently, work to address the neural bases of these behaviors mostly relied on assays where the fish was immobilized under the microscope objective, and stimuli such as prey were presented virtually. However, significant progress has recently been made in developing brain imaging techniques for zebrafish that are not immobilized. Here we discuss recent advances, focusing particularly on techniques based on light-field microscopy. We also draw attention to several important outstanding issues that remain to be addressed to increase the ecological validity of the results obtained.
Collapse
Affiliation(s)
- Hamid Hasani
- Department of Electrical and Computer Engineering, Northwestern University, Evanston, IL, United States
| | - Jipeng Sun
- Department of Computer Science, Northwestern University, Evanston, IL, United States
| | - Shuyu I. Zhu
- Departments of Developmental Biology and Neuroscience, Washington University in St. Louis, St. Louis, MO, United States
| | - Qiangzhou Rong
- Departments of Developmental Biology and Neuroscience, Washington University in St. Louis, St. Louis, MO, United States
| | - Florian Willomitzer
- Wyant College of Optical Sciences, University of Arizona, Tucson, AZ, United States
| | - Rumelo Amor
- Queensland Brain Institute, The University of Queensland, Brisbane, QLD, Australia
| | - Gail McConnell
- Centre for Biophotonics, Strathclyde Institute of Pharmacy and Biomedical Sciences, University of Strathclyde, Glasgow, United Kingdom
| | - Oliver Cossairt
- Department of Computer Science, Northwestern University, Evanston, IL, United States
| | - Geoffrey J. Goodhill
- Departments of Developmental Biology and Neuroscience, Washington University in St. Louis, St. Louis, MO, United States
| |
Collapse
|
35
|
Zheng S, Zhu M, Chen M. Hybrid Multi-Dimensional Attention U-Net for Hyperspectral Snapshot Compressive Imaging Reconstruction. ENTROPY (BASEL, SWITZERLAND) 2023; 25:e25040649. [PMID: 37190437 PMCID: PMC10137936 DOI: 10.3390/e25040649] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/21/2023] [Revised: 04/10/2023] [Accepted: 04/10/2023] [Indexed: 05/17/2023]
Abstract
In order to capture the spatial-spectral (x,y,λ) information of the scene, various techniques have been proposed. Different from the widely used scanning-based methods, spectral snapshot compressive imaging (SCI) utilizes the idea of compressive sensing to compressively capture the 3D spatial-spectral data-cube in a single-shot 2D measurement and thus it is efficient, enjoying the advantages of high-speed and low bandwidth. However, the reconstruction process, i.e., to retrieve the 3D cube from the 2D measurement, is an ill-posed problem and it is challenging to reconstruct high quality images. Previous works usually use 2D convolutions and preliminary attention to address this challenge. However, these networks and attention do not exactly extract spectral features. On the other hand, 3D convolutions can extract more features in a 3D cube, but increase computational cost significantly. To balance this trade-off, in this paper, we propose a hybrid multi-dimensional attention U-Net (HMDAU-Net) to reconstruct hyperspectral images from the 2D measurement in an end-to-end manner. HMDAU-Net integrates 3D and 2D convolutions in an encoder-decoder structure to fully utilize the abundant spectral information of hyperspectral images with a trade-off between performance and computational cost. Furthermore, attention gates are employed to highlight salient features and suppress the noise carried by the skip connections. Our proposed HMDAU-Net achieves superior performance over previous state-of-the-art reconstruction algorithms.
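The SCI forward model itself is compact: each spectral band of the (x, y, λ) cube is modulated by its own coding mask and the modulated bands are summed into a single 2D snapshot. The sketch below simulates that measurement; the cube size, number of bands, and binary masks are illustrative assumptions.

```python
# Minimal sketch of the spectral SCI forward model (sizes and masks are
# illustrative assumptions): each band of the (x, y, lambda) cube is modulated
# by its own binary mask and all modulated bands are summed into one 2D
# snapshot -- the measurement a reconstruction network would then invert.
import numpy as np

h, w, n_bands = 64, 64, 24
rng = np.random.default_rng(1)

cube = rng.random((h, w, n_bands))                         # ground-truth spectral cube
masks = (rng.random((h, w, n_bands)) > 0.5).astype(float)  # per-band coding masks

measurement = (cube * masks).sum(axis=2)                   # single 2D snapshot
print("compression:", cube.size, "->", measurement.size)
```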
Collapse
Affiliation(s)
- Siming Zheng
- Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Mingyu Zhu
- School of Engineering, Westlake University, Hangzhou 310024, China
| | - Mingliang Chen
- Shanghai Institute of Optics and Fine Mechanics, Chinese Academy of Sciences, Shanghai 201800, China
| |
Collapse
|
36
|
Barmada S, Di Barba P, Formisano A, Mognaschi ME, Tucci M. Learning-Based Approaches to Current Identification from Magnetic Sensors. SENSORS (BASEL, SWITZERLAND) 2023; 23:3832. [PMID: 37112172 PMCID: PMC10146113 DOI: 10.3390/s23083832] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 03/03/2023] [Revised: 04/03/2023] [Accepted: 04/06/2023] [Indexed: 06/19/2023]
Abstract
Direct measurement of electric currents can be prevented by poor accessibility or prohibitive technical conditions. In such cases, magnetic sensors can be used to measure the field in regions adjacent to the sources, and the measured data can then be used to estimate the source currents. Unfortunately, this is classified as an Electromagnetic Inverse Problem (EIP), and data from sensors must be treated cautiously to obtain meaningful current measurements. The usual approach requires using suitable regularization schemes. On the other hand, behavioral approaches have recently been gaining ground for this class of problems. The reconstructed model is not obliged to follow the physics equations, and this implies approximations which must be accurately controlled, especially if the aim is to reconstruct an inverse model from examples. In this paper, a systematic study of the role of different learning parameters (or rules) in the (re-)construction of an EIP model is proposed, in comparison with more established regularization techniques. Attention is particularly devoted to linear EIPs, and in this class, a benchmark problem is used to illustrate the results in practice. It is shown that, by applying classical regularization methods and analogous correcting actions in behavioral models, similar results can be obtained. Both classical methodologies and neural approaches are described and compared in the paper.
Collapse
Affiliation(s)
- Sami Barmada
- Department of Energy, Systems, Territory and Construction Engineering (DESTEC), University of Pisa, 56122 Pisa, Italy
| | - Paolo Di Barba
- Department of Electrical, Computer and Biomedical Engineering, University of Pavia, 27100 Pavia, Italy
| | - Alessandro Formisano
- Department of Engineering, University of Campania “Luigi Vanvitelli”, 81031 Aversa, Italy
| | - Maria Evelina Mognaschi
- Department of Electrical, Computer and Biomedical Engineering, University of Pavia, 27100 Pavia, Italy
| | - Mauro Tucci
- Department of Energy, Systems, Territory and Construction Engineering (DESTEC), University of Pisa, 56122 Pisa, Italy
| |
Collapse
|
37
|
Fang X, Wen K, An S, Zheng J, Li J, Zalevsky Z, Gao P. Reconstruction algorithm using 2N+1 raw images for structured illumination microscopy. JOURNAL OF THE OPTICAL SOCIETY OF AMERICA. A, OPTICS, IMAGE SCIENCE, AND VISION 2023; 40:765-773. [PMID: 37132974 DOI: 10.1364/josaa.483884] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/04/2023]
Abstract
This paper presents a structured illumination microscopy (SIM) reconstruction algorithm that allows the reconstruction of super-resolved images with 2N + 1 raw intensity images, with N being the number of structured illumination directions used. The intensity images are recorded after using a 2D grating for the projection fringe and a spatial light modulator to select two orthogonal fringe orientations and perform phase shifting. Super-resolution images can be reconstructed from the five intensity images, enhancing the imaging speed and reducing the photobleaching by 17%, compared to conventional two-direction and three-step phase-shifting SIM. We believe the proposed technique will be further developed and widely applied in many fields.
Collapse
|
38
|
Zhang Z, Du H, Qiu B. FFVN: An explicit feature fusion-based variational network for accelerated multi-coil MRI reconstruction. Magn Reson Imaging 2023; 97:31-45. [PMID: 36586627 DOI: 10.1016/j.mri.2022.12.018] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/30/2022] [Revised: 11/15/2022] [Accepted: 12/23/2022] [Indexed: 12/30/2022]
Abstract
Magnetic Resonance Imaging (MRI) is a leading diagnostic imaging modality that provides high contrast of soft tissues without invasiveness or radiation. Nonetheless, it suffers from long scan times owing to the inherent physics of its data acquisition process, hampering its development and applications. Traditional strategies such as Compressed Sensing (CS) and Parallel Imaging (PI) allow for MRI acceleration via a sub-sampling strategy and multiple coils, respectively. When Deep Learning (DL) is incorporated, both strategies are revitalized to achieve even faster reconstruction in various reconstruction methods, among which the variational network is a previously proposed method that combines the mathematical structure of variational models with DL for fast MRI reconstruction. However, in our study we observe that the information in MR features is not efficiently or explicitly exploited in previous works based on the variational network. Instead, we introduce a variational network with explicit feature fusion that combines CS, PI, and DL for accelerated multi-coil MRI reconstruction. By explicitly leveraging the extra information via feature fusion following feature extraction, our proposed method achieves performance comparable to the state-of-the-art methods without much computational overhead on a public multi-coil brain dataset under 5-fold and 10-fold acceleration.
Collapse
Affiliation(s)
- Zhenxi Zhang
- Biomedical Engineering Center, University of Science and Technology of China, Hefei, Anhui 230026, China
| | - Hongwei Du
- Biomedical Engineering Center, University of Science and Technology of China, Hefei, Anhui 230026, China.
| | - Bensheng Qiu
- Biomedical Engineering Center, University of Science and Technology of China, Hefei, Anhui 230026, China
| |
Collapse
|
39
|
Federated End-to-End Unrolled Models for Magnetic Resonance Image Reconstruction. Bioengineering (Basel) 2023; 10:bioengineering10030364. [PMID: 36978755 PMCID: PMC10045102 DOI: 10.3390/bioengineering10030364] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/06/2023] [Revised: 03/05/2023] [Accepted: 03/07/2023] [Indexed: 03/19/2023] Open
Abstract
Image reconstruction is the process of recovering an image from raw, under-sampled signal measurements, and is a critical step in diagnostic medical imaging, such as magnetic resonance imaging (MRI). Recently, data-driven methods have led to improved image quality in MRI reconstruction using a limited number of measurements, but these methods typically rely on the existence of a large, centralized database of fully sampled scans for training. In this work, we investigate federated learning for MRI reconstruction using end-to-end unrolled deep learning models as a means of training global models across multiple clients (data sites), while keeping individual scans local. We empirically identify a low-data regime across a large number of heterogeneous scans, where a small number of training samples per client are available and non-collaborative models lead to performance drops. In this regime, we investigate the performance of adaptive federated optimization algorithms as a function of client data distribution and communication budget. Experimental results show that adaptive optimization algorithms are well suited for the federated learning of unrolled models, even in a limited-data regime (50 slices per data site), and that client-sided personalization can improve reconstruction quality for clients that did not participate in training.
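The core federated step is weight averaging across clients. The sketch below shows a size-weighted federated averaging (FedAvg-style) update with toy models standing in for the unrolled reconstruction networks; the adaptive federated optimizers studied in the paper are not reproduced here.

```python
# Minimal sketch of the federated averaging step (the tiny model and client
# count are toy stand-ins, not the paper's unrolled reconstruction networks):
# each client trains locally on its own scans, then the server averages the
# client weights into a new global model.
import copy
import torch
import torch.nn as nn

def fed_avg(client_models, client_sizes):
    total = sum(client_sizes)
    global_state = copy.deepcopy(client_models[0].state_dict())
    for key in global_state:
        global_state[key] = sum(
            m.state_dict()[key] * (sz / total)             # size-weighted average
            for m, sz in zip(client_models, client_sizes)
        )
    return global_state

clients = [nn.Linear(8, 8) for _ in range(3)]              # stand-ins for per-site models
sizes = [50, 80, 120]                                      # e.g., slices available per site
new_global = fed_avg(clients, sizes)
for c in clients:
    c.load_state_dict(new_global)                          # broadcast the averaged model
```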
Collapse
|
40
|
Pramanik A, Zimmerman MB, Jacob M. Memory-efficient model-based deep learning with convergence and robustness guarantees. IEEE TRANSACTIONS ON COMPUTATIONAL IMAGING 2023; 9:260-275. [PMID: 37090026 PMCID: PMC10121192 DOI: 10.1109/tci.2023.3252268] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/03/2023]
Abstract
Computational imaging has been revolutionized by compressed sensing algorithms, which offer guaranteed uniqueness, convergence, and stability properties. Model-based deep learning methods that combine imaging physics with learned regularization priors have emerged as more powerful alternatives for image recovery. The main focus of this paper is to introduce a memory efficient model-based algorithm with similar theoretical guarantees as CS methods. The proposed iterative algorithm alternates between a gradient descent involving the score function and a conjugate gradient algorithm to encourage data consistency. The score function is modeled as a monotone convolutional neural network. Our analysis shows that the monotone constraint is necessary and sufficient to enforce the uniqueness of the fixed point in arbitrary inverse problems. In addition, it also guarantees the convergence to a fixed point, which is robust to input perturbations. We introduce two implementations of the proposed MOL framework, which differ in the way the monotone property is imposed. The first approach enforces a strict monotone constraint, while the second one relies on an approximation. The guarantees are not valid for the second approach in the strict sense. However, our empirical studies show that the convergence and robustness of both approaches are comparable, while the less constrained approximate implementation offers better performance. The proposed deep equilibrium formulation is significantly more memory efficient than unrolled methods, which allows us to apply it to 3D or 2D+time problems that current unrolled algorithms cannot handle.
Collapse
Affiliation(s)
- Aniket Pramanik
- Department of Electrical and Computer Engineering at the University of Iowa, Iowa City, IA, 52242, USA
| | - M Bridget Zimmerman
- Department of Biostatistics at the University of Iowa, Iowa City, IA, 52242, USA
| | - Mathews Jacob
- Department of Electrical and Computer Engineering at the University of Iowa, Iowa City, IA, 52242, USA
| |
Collapse
|
41
|
Gkillas A, Ampeliotis D, Berberidis K. Connections Between Deep Equilibrium and Sparse Representation Models With Application to Hyperspectral Image Denoising. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2023; 32:1513-1528. [PMID: 37027683 DOI: 10.1109/tip.2023.3245323] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/19/2023]
Abstract
In this study, the problem of computing a sparse representation of multi-dimensional visual data is considered. In general, such data, e.g., hyperspectral images, color images, or video, consist of signals that exhibit strong local dependencies. A new computationally efficient sparse coding optimization problem is derived by employing regularization terms that are adapted to the properties of the signals of interest. Exploiting the merits of learnable regularization techniques, a neural network is employed to act as a structure prior and reveal the underlying signal dependencies. To solve the optimization problem, deep unrolling and deep equilibrium-based algorithms are developed, forming highly interpretable and concise deep-learning-based architectures that process the input dataset in a block-by-block fashion. Extensive simulation results, in the context of hyperspectral image denoising, are provided, demonstrating that the proposed algorithms significantly outperform other sparse coding approaches and exhibit superior performance against recent state-of-the-art deep-learning-based denoising models. In a wider perspective, our work provides a unique bridge between a classic approach, that is, the sparse representation theory, and modern representation tools that are based on deep learning modeling.
Collapse
|
42
|
Danan E, Cohen NE, Schwarz A, Shemer A, Danan Y. Deep learning method for pinhole array color image reconstruction. OPTICS LETTERS 2023; 48:1116-1119. [PMID: 36857227 DOI: 10.1364/ol.477693] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/21/2022] [Accepted: 01/24/2023] [Indexed: 06/18/2023]
Abstract
The following paper proposes a combination of a supervised encoder-decoder neural network with coded apertures. Coded apertures provide improved sensitivity and signal-to-noise ratio (SNR) in planar images. The unique array design of this method overcomes the spatial frequency cutoff found in standard multi-pinhole arrays. In this design, the pinholes were positioned to minimize loss in spatial frequencies. The large number of pinholes results in significant overlapping on the detector. To overcome the overlapping issue, reconstruction of the object from the obtained image is done using inverse filtering methods. However, traces of duplications remain leading to a decline in SNR, contrast, and resolution. The proposed technique addresses the challenge of image distortion caused by the lack of accuracy in the inverse filter methods, by using a deep neural network. In this work, the coded aperture is combined with a deep convolutional neural network (CNN) to remove noise caused by pinhole imaging and inverse filter limitations. Compared to only using Wiener filtering, the proposed method delivers higher SNR, contrast, and resolution. The imaging system is presented in detail with experimental results that illustrate its efficiency.
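As a point of reference for the inverse-filtering step the CNN is meant to improve upon, the sketch below simulates a multi-pinhole detector image as a convolution of the object with the pinhole-array PSF and undoes the overlap with Wiener deconvolution in the Fourier domain. The PSF, noise level, and noise-to-signal ratio are illustrative assumptions.

```python
# Minimal sketch of the inverse-filtering baseline (the multi-pinhole PSF here
# is a toy stand-in for the paper's coded aperture): the detector image is the
# object convolved with the pinhole-array PSF, and Wiener deconvolution undoes
# the overlap up to noise amplification -- the residual artifacts a CNN would
# then be trained to remove.
import numpy as np

n = 128
rng = np.random.default_rng(2)

obj = np.zeros((n, n)); obj[40:80, 50:90] = 1.0            # toy object
psf = np.zeros((n, n))
for _ in range(25):                                        # 25 random "pinholes"
    psf[rng.integers(0, n), rng.integers(0, n)] = 1.0
psf /= psf.sum()

OTF = np.fft.fft2(np.fft.ifftshift(psf))
detector = np.real(np.fft.ifft2(np.fft.fft2(obj) * OTF))   # overlapped detector image
detector += 0.01 * rng.standard_normal((n, n))             # detector noise

nsr = 1e-2                                                  # assumed noise-to-signal ratio
wiener = np.conj(OTF) / (np.abs(OTF) ** 2 + nsr)
recon = np.real(np.fft.ifft2(np.fft.fft2(detector) * wiener))
print("reconstruction error:", np.linalg.norm(recon - obj) / np.linalg.norm(obj))
```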
Collapse
|
43
|
Cordero-Grande L, Ortuno-Fisac JE, Del Hoyo AA, Uus A, Deprez M, Santos A, Hajnal JV, Ledesma-Carbayo MJ. Fetal MRI by Robust Deep Generative Prior Reconstruction and Diffeomorphic Registration. IEEE TRANSACTIONS ON MEDICAL IMAGING 2023; 42:810-822. [PMID: 36288233 DOI: 10.1109/tmi.2022.3217725] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/16/2023]
Abstract
Magnetic resonance imaging of the whole fetal body and placenta is limited by different sources of motion affecting the womb. Usual scanning techniques employ single-shot multi-slice sequences where anatomical information in different slices may be subject to different deformations, contrast variations, or artifacts. Volumetric reconstruction formulations have been proposed to correct for these factors, but they must accommodate a non-homogeneous and non-isotropic sampling, so regularization becomes necessary. Thus, in this paper we propose a deep generative prior for robust volumetric reconstructions integrated with a diffeomorphic volume-to-slice registration method. Experiments are performed to validate our contributions and compare with state-of-the-art methods in the literature in a cohort of 72 fetal datasets in the range of 20-36 weeks gestational age. Quantitative as well as radiological assessment suggests improved image quality and more accurate prediction of gestational age at scan when comparing to state-of-the-art reconstruction methods. In addition, gestational age prediction results from our volumetric reconstructions are competitive with existing brain-based approaches, with boosted accuracy when integrating information of organs other than the brain. Namely, a mean absolute error of 0.618 weeks (R2 = 0.958) is achieved when combining fetal brain and trunk information.
Collapse
|
44
|
Mom K, Langer M, Sixou B. Deep Gauss-Newton for phase retrieval. OPTICS LETTERS 2023; 48:1136-1139. [PMID: 36857232 DOI: 10.1364/ol.484862] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/03/2023] [Accepted: 01/26/2023] [Indexed: 06/18/2023]
Abstract
We propose the deep Gauss-Newton (DGN) algorithm. The DGN allows one to take into account the knowledge of the forward model in a deep neural network by unrolling a Gauss-Newton optimization method. No regularization or step size needs to be chosen; they are learned through convolutional neural networks. The proposed algorithm does not require an initial reconstruction and is able to retrieve simultaneously the phase and absorption from a single-distance diffraction pattern. The DGN method was applied to both simulated and experimental data and permitted large improvements of the reconstruction error and of the resolution compared with a state-of-the-art iterative method and another neural-network-based reconstruction algorithm.
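For context, the sketch below shows the classical damped Gauss-Newton iteration that the DGN unrolls: linearize the forward model, solve the normal equations for an update, and repeat. The toy nonlinear forward model, the hand-chosen damping, and the initial guess are illustrative assumptions; in the DGN these choices are replaced by learned convolutional networks.

```python
# Minimal sketch of the classical Gauss-Newton iteration that the DGN unrolls
# (the forward model is a generic toy nonlinear map, not the phase-retrieval
# operator; damping and the initial guess are hand-chosen here, whereas the
# DGN learns the corresponding quantities with CNNs).
import numpy as np

def forward(x):                                            # toy nonlinear forward model
    return np.array([x[0] ** 2 + x[1], np.sin(x[0]) + x[1] ** 2, x[0] * x[1]])

def jacobian(x):
    return np.array([[2 * x[0], 1.0],
                     [np.cos(x[0]), 2 * x[1]],
                     [x[1], x[0]]])

x_true = np.array([1.2, -0.7])
y = forward(x_true) + 1e-3 * np.random.randn(3)

x = np.array([1.0, -0.5])                                  # initial guess (assumed close enough)
for _ in range(20):
    r = forward(x) - y                                     # residual
    J = jacobian(x)
    # Gauss-Newton step with a small damping term for numerical stability
    dx = np.linalg.solve(J.T @ J + 1e-6 * np.eye(2), J.T @ r)
    x = x - dx
print("estimate:", x, "truth:", x_true)
```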
Collapse
|
45
|
Fernandez-Grande E, Karakonstantis X, Caviedes-Nozal D, Gerstoft P. Generative models for sound field reconstruction. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2023; 153:1179. [PMID: 36859132 DOI: 10.1121/10.0016896] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/20/2022] [Accepted: 01/03/2023] [Indexed: 06/18/2023]
Abstract
This work examines the use of generative adversarial networks for reconstructing sound fields from experimental data. It is investigated whether generative models, which learn the underlying statistics of a given signal or process, can improve the spatio-temporal reconstruction of a sound field by extending its bandwidth. The problem is significant as acoustic array processing is naturally band limited by the spatial sampling of the sound field (due to the difficulty to satisfy the Nyquist criterion in space domain at high frequencies). In this study, the reconstruction of spatial room impulse responses in a conventional room is tested based on three different generative adversarial models. The results indicate that the models can improve the reconstruction, mostly by recovering some of the sound field energy that would otherwise be lost at high frequencies. There is an encouraging outlook in the use of statistical learning models to overcome the bandwidth limitations of acoustic sensor arrays. The approach can be of interest in other areas, such as computational acoustics, to alleviate the classical computational burden at high frequencies.
Collapse
Affiliation(s)
- Efren Fernandez-Grande
- Department of Electrical and Photonics Engineering, Technical University of Denmark, Kongens Lyngby, Denmark
| | - Xenofon Karakonstantis
- Department of Electrical and Photonics Engineering, Technical University of Denmark, Kongens Lyngby, Denmark
| | - Diego Caviedes-Nozal
- Department of Electrical and Photonics Engineering, Technical University of Denmark, Kongens Lyngby, Denmark
| | - Peter Gerstoft
- Scripps Institution of Oceanography, University of California San Diego, La Jolla, California 92037, USA
| |
Collapse
|
46
|
Cheng Z, Chen B, Lu R, Wang Z, Zhang H, Meng Z, Yuan X. Recurrent Neural Networks for Snapshot Compressive Imaging. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2023; 45:2264-2281. [PMID: 35324434 DOI: 10.1109/tpami.2022.3161934] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]
Abstract
Conventional high-speed and spectral imaging systems are expensive and they usually consume a significant amount of memory and bandwidth to save and transmit the high-dimensional data. By contrast, snapshot compressive imaging (SCI), where multiple sequential frames are coded by different masks and then summed to a single measurement, is a promising idea to use a 2-dimensional camera to capture 3-dimensional scenes. In this paper, we consider the reconstruction problem in SCI, i.e., recovering a series of scenes from a compressed measurement. Specifically, the measurement and modulation masks are fed into our proposed network, dubbed BIdirectional Recurrent Neural networks with Adversarial Training (BIRNAT) to reconstruct the desired frames. BIRNAT employs a deep convolutional neural network with residual blocks and self-attention to reconstruct the first frame, based on which a bidirectional recurrent neural network is utilized to sequentially reconstruct the following frames. Moreover, we build an extended BIRNAT-color algorithm for color videos aiming at joint reconstruction and demosaicing. Extensive results on both video and spectral, simulation and real data from three SCI cameras demonstrate the superior performance of BIRNAT.
Collapse
|
47
|
Schledewitz T, Klein M, Rueter D. Magnetic Induction Tomography: Separation of the Ill-Posed and Non-Linear Inverse Problem into a Series of Isolated and Less Demanding Subproblems. SENSORS (BASEL, SWITZERLAND) 2023; 23:1059. [PMID: 36772097 PMCID: PMC9920446 DOI: 10.3390/s23031059] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 11/16/2022] [Revised: 01/09/2023] [Accepted: 01/12/2023] [Indexed: 06/18/2023]
Abstract
Magnetic induction tomography (MIT) is based on remotely excited eddy currents inside a measurement object. The conductivity distribution shapes the eddies, and their secondary fields are detected and used to reconstruct the conductivities. While the forward problem from given conductivities to detected signals can be unambiguously simulated, the inverse problem from received signals back to the sought conductivities is a non-linear, ill-posed problem that compromises MIT and results in rather blurry imaging. An MIT inversion is commonly applied over the entire process (i.e., localized conductivities are directly determined from specific signal features), but this involves considerable computation. The present, more theoretical work treats the inverse problem as a non-retroactive series of four individual subproblems, each of which is less demanding on its own. The decoupled tasks yield better insight and control and promote more efficient computation. The overall problem is divided into an ill-posed but linear problem for reconstructing eddy currents from given signals and a nonlinear but benign problem for reconstructing conductivities from given eddies. The separated approach is unsuitable for common, circular MIT designs; it fits only the data structure of a recently presented planar 3D MIT realization for large biomedical phantoms. For this MIT scanner, in discretization, the number of unknown and independent eddy current elements matches the number of ultimately sought conductivities. For clarity and better representation, representative 2D bodies are used here and measured at the depth of the 3D scanner. The overall difficulty is not substantially smaller or different than for 3D bodies. In summary, the linear problem from signals to eddies dominates the overall MIT performance.
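The two-stage split described above, a linear ill-posed step followed by a benign nonlinear one, can be illustrated generically. The following numpy sketch assumes a made-up linear signal model and a hypothetical monotone eddy-to-conductivity law; it is not the authors' scanner-specific pipeline, and every quantity in it is a placeholder.

```python
# Minimal sketch of splitting an MIT-style inversion into (1) a regularized linear
# step from signals to eddy-current amplitudes and (2) a benign nonlinear step
# from eddy amplitudes to conductivities. Purely illustrative.
import numpy as np

rng = np.random.default_rng(1)
n_signals, n_eddies = 120, 80

A = rng.standard_normal((n_signals, n_eddies))      # assumed known linear signal model
true_eddies = rng.standard_normal(n_eddies)
signals = A @ true_eddies + 0.01 * rng.standard_normal(n_signals)

# Stage 1: Tikhonov-regularized inversion of the ill-posed linear problem
lam = 1e-2
eddies = np.linalg.solve(A.T @ A + lam * np.eye(n_eddies), A.T @ signals)

# Stage 2: a hypothetical monotone law e = log(1 + sigma) relating eddy amplitude
# to conductivity, inverted in closed form purely for illustration
eddy_amplitude = np.abs(eddies)
sigma = np.expm1(eddy_amplitude)

print(sigma.shape, float(np.linalg.norm(eddies - true_eddies)))
```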
Collapse
|
48
|
Genzel M, Macdonald J, Marz M. Solving Inverse Problems With Deep Neural Networks - Robustness Included? IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2023; 45:1119-1134. [PMID: 35119999 DOI: 10.1109/tpami.2022.3148324] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]
Abstract
In the past five years, deep learning methods have become state-of-the-art in solving various inverse problems. Before such approaches can find application in safety-critical fields, a verification of their reliability appears mandatory. Recent works have pointed out instabilities of deep neural networks for several image reconstruction tasks. In analogy to adversarial attacks in classification, it was shown that slight distortions in the input domain may cause severe artifacts. The present article sheds new light on this concern by conducting an extensive study of the robustness of deep-learning-based algorithms for solving underdetermined inverse problems. This covers compressed sensing with Gaussian measurements as well as image recovery from Fourier and Radon measurements, including a real-world scenario for magnetic resonance imaging (using the NYU-fastMRI dataset). Our main focus is on computing adversarial perturbations of the measurements that maximize the reconstruction error. A distinctive feature of our approach is the quantitative and qualitative comparison with total-variation minimization, which serves as a provably robust reference method. In contrast to previous findings, our results reveal that standard end-to-end network architectures are resilient not only against statistical noise but also against adversarial perturbations. All considered networks are trained by common deep learning techniques, without sophisticated defense strategies.
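The central experiment, perturbing the measurements so as to maximize reconstruction error, follows a familiar projected-gradient-ascent pattern. Below is a minimal PyTorch sketch of that generic pattern with a stand-in linear forward operator and an untrained stand-in reconstructor; it is not the paper's exact attack, operator, or architecture.

```python
# Minimal PGD-style sketch: find a norm-bounded perturbation of the measurement y
# that maximizes the reconstruction error of a learned inverse map f: y -> x.
import torch
import torch.nn as nn

torch.manual_seed(0)
n_meas, n_pix = 128, 256
A = torch.randn(n_meas, n_pix) / n_meas**0.5        # underdetermined forward operator
f = nn.Sequential(nn.Linear(n_meas, 512), nn.ReLU(), nn.Linear(512, n_pix))  # stand-in reconstructor

x_true = torch.randn(n_pix)
y = A @ x_true
eps, step, n_iter = 0.05 * y.norm(), 0.01 * y.norm(), 40

delta = torch.zeros_like(y, requires_grad=True)
for _ in range(n_iter):
    loss = ((f(y + delta) - x_true) ** 2).sum()     # reconstruction error to maximize
    loss.backward()
    with torch.no_grad():
        delta += step * delta.grad / (delta.grad.norm() + 1e-12)   # gradient-ascent step
        delta *= min(1.0, float(eps / delta.norm()))               # project onto the eps-ball
    delta.grad.zero_()

print(float(((f(y) - x_true) ** 2).sum()),
      float(((f(y + delta) - x_true) ** 2).sum()))  # error without vs. with perturbation
```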
Collapse
|
49
|
Jia Y, McMichael N, Mokarzel P, Thompson B, Si D, Humphries T. Superiorization-inspired unrolled SART algorithm with U-Net generated perturbations for sparse-view and limited-angle CT reconstruction. PHYSICS IN MEDICINE AND BIOLOGY 2022; 67. [PMID: 36541524 DOI: 10.1088/1361-6560/aca513] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/02/2022] [Accepted: 11/22/2022] [Indexed: 11/23/2022]
Abstract
Objective. Unrolled algorithms are a promising approach for reconstruction of CT images in challenging scenarios, such as low-dose, sparse-view and limited-angle imaging. In an unrolled algorithm, a fixed number of iterations of a reconstruction method are unrolled into multiple layers of a neural network and interspersed with trainable layers. The entire network is then trained end-to-end in a supervised fashion to learn an appropriate regularizer from training data. In this paper we propose a novel unrolled algorithm and compare its performance with several other approaches on sparse-view and limited-angle CT. Approach. The proposed algorithm is inspired by the superiorization methodology, an optimization heuristic in which iterates of a feasibility-seeking method are perturbed between iterations, typically using descent directions of a model-based penalty function. Our algorithm instead uses a modified U-net architecture to introduce the perturbations, allowing the network to learn beneficial perturbations to the image at various stages of the reconstruction, based on the training data. Main Results. In several numerical experiments modeling sparse-view and limited-angle CT scenarios, the algorithm provides excellent results. In particular, it outperforms several competing unrolled methods in limited-angle scenarios, while providing comparable or better performance on sparse-view scenarios. Significance. This work represents a first step towards exploiting the power of deep learning within the superiorization methodology. Additionally, it studies the effect of network architecture on the performance of unrolled methods, as well as the effectiveness of the unrolled approach on limited-angle CT, whereas previous studies have primarily focused on the sparse-view and low-dose cases.
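The interleaving of a feasibility-seeking update with a learned perturbation, as described in the Approach section, has a simple unrolled structure. The following PyTorch sketch assumes a simplified SART-like step and a small stand-in perturbation network in place of the U-net; it is not the authors' implementation, and the system matrix and sizes are placeholders.

```python
# Minimal sketch of the superiorization-style unrolled pattern: alternate a
# SART-like data-consistency step with a learned perturbation, then train end-to-end.
import torch
import torch.nn as nn

class LearnedPerturbation(nn.Module):
    """Stand-in for the U-net that proposes a perturbation at each unrolled stage."""
    def __init__(self, n_pix):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(n_pix, n_pix), nn.ReLU(), nn.Linear(n_pix, n_pix))
    def forward(self, x):
        return self.net(x)

def sart_step(x, A, y, relax=0.5):
    # simplified SART update: x <- x + relax * D_col^-1 A^T D_row^-1 (y - A x)
    row_sums = A.abs().sum(dim=1) + 1e-8
    col_sums = A.abs().sum(dim=0) + 1e-8
    return x + relax * (A.t() @ ((y - A @ x) / row_sums)) / col_sums

torch.manual_seed(0)
n_meas, n_pix, n_stages = 90, 128, 5
A = torch.rand(n_meas, n_pix)                    # stand-in system matrix (e.g. sparse-view)
x_true = torch.rand(n_pix)
y = A @ x_true

perturb = nn.ModuleList([LearnedPerturbation(n_pix) for _ in range(n_stages)])
x = torch.zeros(n_pix)
for k in range(n_stages):                        # unrolled iterations, trained end-to-end
    x = sart_step(x, A, y)                       # feasibility-seeking step
    x = x + perturb[k](x)                        # learned perturbation of the iterate

loss = nn.functional.mse_loss(x, x_true)         # supervised training loss
loss.backward()
```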
Collapse
Affiliation(s)
- Yiran Jia
- School of STEM, University of Washington Bothell, Bothell, WA 98011, United States of America
| | - Noah McMichael
- School of STEM, University of Washington Bothell, Bothell, WA 98011, United States of America
| | - Pedro Mokarzel
- School of STEM, University of Washington Bothell, Bothell, WA 98011, United States of America
| | - Brandon Thompson
- School of STEM, University of Washington Bothell, Bothell, WA 98011, United States of America
| | - Dong Si
- School of STEM, University of Washington Bothell, Bothell, WA 98011, United States of America
| | - Thomas Humphries
- School of STEM, University of Washington Bothell, Bothell, WA 98011, United States of America
| |
Collapse
|
50
|
Wu X, Wu Z, Shanmugavel SC, Yu HZ, Zhu Y. Physics-informed neural network for phase imaging based on transport of intensity equation. OPTICS EXPRESS 2022; 30:43398-43416. [PMID: 36523038 DOI: 10.1364/oe.462844] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/03/2022] [Accepted: 09/26/2022] [Indexed: 06/17/2023]
Abstract
Non-interferometric quantitative phase imaging based on the Transport of Intensity Equation (TIE) has been widely used in biomedical imaging. However, analytic TIE phase retrieval is prone to low-spatial-frequency noise amplification, which is caused by the ill-posedness of the inversion at the origin of the spectrum. There are also retrieval ambiguities resulting from the lack of sensitivity to the curl component of the Poynting vector occurring with strong absorption. Here, we establish a physics-informed neural network (PINN) to address these issues by integrating the forward and inverse physics models into a cascaded deep neural network. We demonstrate that the proposed PINN is efficiently trained using a small set of sample data, enabling the conversion of noise-corrupted 2-shot TIE phase retrievals to high-quality phase images under partially coherent LED illumination. The efficacy of the proposed approach is demonstrated by both simulation using a standard image database and experiment using human buccal epithelial cells. In particular, high image quality (SSIM = 0.919) is achieved experimentally using a reduced amount of labeled data (140 image pairs). We discuss the robustness of the proposed approach against insufficient training data, and demonstrate that the parallel architecture of the PINN is efficient for transfer learning.
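The physics model being embedded here is the transport-of-intensity equation, k ∂I/∂z = -∇·(I∇φ). Below is a minimal numpy sketch of that forward operator evaluated with finite differences, the kind of term a physics-informed loss can penalize against measured defocus data; it is illustrative only, not the authors' PINN, and all image data are random placeholders.

```python
# Minimal sketch of the TIE forward model: given the in-focus intensity I and a
# phase estimate phi, predict the axial intensity derivative dI/dz, and compare it
# with a measured derivative to form a physics-consistency loss.
import numpy as np

def tie_forward(intensity, phase, k=2 * np.pi / 633e-9, dx=1e-6):
    """Predict dI/dz = -div(I * grad(phi)) / k via central finite differences."""
    gy, gx = np.gradient(phase, dx)                 # transverse phase gradients
    fy = intensity * gy
    fx = intensity * gx
    div = np.gradient(fy, dx, axis=0) + np.gradient(fx, dx, axis=1)
    return -div / k

rng = np.random.default_rng(0)
I = 1.0 + 0.1 * rng.random((64, 64))                # placeholder in-focus intensity
phi_pred = rng.random((64, 64))                     # placeholder network phase estimate
dIdz_meas = rng.random((64, 64))                    # placeholder measured axial derivative

physics_loss = np.mean((tie_forward(I, phi_pred) - dIdz_meas) ** 2)
print(physics_loss)
```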
Collapse
|