1. Wang Z, Hu B, Zhang M, Li J, Li L, Gong M, Gao X. Diffusion Model-Based Visual Compensation Guidance and Visual Difference Analysis for No-Reference Image Quality Assessment. IEEE Transactions on Image Processing 2025; PP:263-278. [PMID: 40030878] [DOI: 10.1109/TIP.2024.3523800]
Abstract
Existing free-energy-guided No-Reference Image Quality Assessment (NR-IQA) methods still struggle to effectively restore images with complex distortions. The features that guide the main quality-assessment network lack interpretability, and efficiently leveraging high-level feature information remains a significant challenge. As a novel class of state-of-the-art (SOTA) generative models, the diffusion model can capture intricate relationships, which improves image restoration, and the intermediate variables produced during its denoising iterations carry clearer and more interpretable high-level visual information for guidance. In view of this, we pioneer the use of the diffusion model in the NR-IQA domain. We design a novel diffusion model for enhancing images with various types of distortion, yielding higher-quality and more interpretable high-level visual information. Our experiments demonstrate that the diffusion model establishes a clear mapping between image reconstruction and image quality scores, which the network learns in order to guide quality assessment. Finally, to fully exploit high-level visual information, we design two complementary visual branches that collaboratively perform quality evaluation. Extensive experiments on seven public NR-IQA datasets demonstrate that the proposed model outperforms SOTA NR-IQA methods. The code will be available at https://github.com/handsomewzy/DiffV2IQA.
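The dual-branch evaluation idea sketched in this abstract can be illustrated with a minimal PyTorch model; the layer sizes, fusion scheme, and regressor below are assumptions for illustration, not the authors' DiffV2IQA implementation. One branch encodes the distorted image, the other encodes a restored image (which would come from the diffusion model), and the fused features are regressed to a quality score.

```python
# Minimal two-branch NR-IQA sketch (assumed architecture, not the paper's code).
import torch
import torch.nn as nn

def conv_encoder(in_ch=3, width=32):
    # small stack of strided convolutions ending in global average pooling
    return nn.Sequential(
        nn.Conv2d(in_ch, width, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(width, width * 2, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(width * 2, width * 4, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    )

class TwoBranchIQA(nn.Module):
    def __init__(self):
        super().__init__()
        self.distorted_branch = conv_encoder()
        self.restored_branch = conv_encoder()
        self.regressor = nn.Sequential(nn.Linear(256, 128), nn.ReLU(inplace=True), nn.Linear(128, 1))

    def forward(self, distorted, restored):
        f = torch.cat([self.distorted_branch(distorted), self.restored_branch(restored)], dim=1)
        return self.regressor(f).squeeze(1)

model = TwoBranchIQA()
# the restored input would be produced by the diffusion model in the paper's pipeline
score = model(torch.rand(2, 3, 224, 224), torch.rand(2, 3, 224, 224))
print(score.shape)  # torch.Size([2])
```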
2. Zhang Z, Tian S, Zou W, Morin L, Zhang L. EDDMF: An Efficient Deep Discrepancy Measuring Framework for Full-Reference Light Field Image Quality Assessment. IEEE Transactions on Image Processing 2023; 32:6426-6440. [PMID: 37966926] [DOI: 10.1109/TIP.2023.3329663]
Abstract
The increasing demand for immersive experience has greatly promoted the quality assessment research of Light Field Image (LFI). In this paper, we propose an efficient deep discrepancy measuring framework for full-reference light field image quality assessment. The main idea of the proposed framework is to efficiently evaluate the quality degradation of distorted LFIs by measuring the discrepancy between reference and distorted LFI patches. Firstly, a patch generation module is proposed to extract spatio-angular patches and sub-aperture patches from LFIs, which greatly reduces the computational cost. Then, we design a hierarchical discrepancy network based on convolutional neural networks to extract the hierarchical discrepancy features between reference and distorted spatio-angular patches. Besides, the local discrepancy features between reference and distorted sub-aperture patches are extracted as complementary features. After that, the angular-dominant hierarchical discrepancy features and the spatial-dominant local discrepancy features are combined to evaluate the patch quality. Finally, the quality of all patches is pooled to obtain the overall quality of distorted LFIs. To the best of our knowledge, the proposed framework is the first patch-based full-reference light field image quality assessment metric based on deep-learning technology. Experimental results on four representative LFI datasets show that our proposed framework achieves superior performance as well as lower computational complexity compared to other state-of-the-art metrics.
3. Wu G, Fei L, Deng L, Yang H, Han M, Han Z, Zhao L. Identification of Soybean Mutant Lines Based on Dual-Branch CNN Model Fusion Framework Utilizing Images from Different Organs. Plants (Basel) 2023; 12:2315. [PMID: 37375940] [DOI: 10.3390/plants12122315]
Abstract
The accurate identification and classification of soybean mutant lines is essential for developing new plant varieties through mutation breeding. However, most existing studies have focused on the classification of soybean varieties. Distinguishing mutant lines solely by their seeds can be challenging due to their high genetic similarities. Therefore, in this paper, we designed a dual-branch convolutional neural network (CNN) composed of two identical single CNNs to fuse the image features of pods and seeds together to solve the soybean mutant line classification problem. Four single CNNs (AlexNet, GoogLeNet, ResNet18, and ResNet50) were used to extract features, and the output features were fused and input into the classifier for classification. The results demonstrate that dual-branch CNNs outperform single CNNs, with the dual-ResNet50 fusion framework achieving a 90.22 ± 0.19% classification rate. We also identified the most similar mutant lines and genetic relationships between certain soybean lines using a clustering tree and t-distributed stochastic neighbor embedding algorithm. Our study represents one of the primary efforts to combine various organs for the identification of soybean mutant lines. The findings of this investigation provide a new path to select potential lines for soybean mutation breeding and signify a meaningful advancement in the propagation of soybean mutant line recognition technology.
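The dual-branch fusion pattern described above can be sketched in a few lines of PyTorch; the class count, the plain concatenation-plus-linear classifier, and the untrained ResNet50 backbones are assumptions for illustration, not the authors' released training setup.

```python
# Illustrative dual-branch (pod + seed) fusion classifier in the spirit of dual-ResNet50.
import torch
import torch.nn as nn
from torchvision import models

class DualBranchClassifier(nn.Module):
    def __init__(self, num_classes=20):          # number of mutant lines is an assumption
        super().__init__()
        self.pod_branch = models.resnet50(weights=None)
        self.seed_branch = models.resnet50(weights=None)
        feat_dim = self.pod_branch.fc.in_features            # 2048 per branch
        self.pod_branch.fc = nn.Identity()                    # keep pooled features only
        self.seed_branch.fc = nn.Identity()
        self.classifier = nn.Linear(feat_dim * 2, num_classes)

    def forward(self, pod_img, seed_img):
        fused = torch.cat([self.pod_branch(pod_img), self.seed_branch(seed_img)], dim=1)
        return self.classifier(fused)

logits = DualBranchClassifier()(torch.rand(4, 3, 224, 224), torch.rand(4, 3, 224, 224))
print(logits.shape)  # torch.Size([4, 20])
```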
Collapse
Affiliation(s)
- Guangxia Wu
- College of Agronomy, Qingdao Agricultural University, Qingdao 266109, China
- Academy of Dongying Efficient Agricultural Technology and Industry on Saline and Alkaline Land in Collaboration with Qingdao Agricultural University, Dongying 257091, China
- Qingdao Key Laboratory of Specialty Plant Germplasm Innovation and Utilization in Saline Soils of Coastal Beach, Qingdao Agricultural University, Qingdao 266109, China
- Lin Fei
- College of Grassland Science, Qingdao Agricultural University, Qingdao 266109, China
- Limiao Deng
- Academy of Dongying Efficient Agricultural Technology and Industry on Saline and Alkaline Land in Collaboration with Qingdao Agricultural University, Dongying 257091, China
- College of Science and Information, Qingdao Agricultural University, Qingdao 266109, China
- Haoyan Yang
- College of Science and Information, Qingdao Agricultural University, Qingdao 266109, China
- Meng Han
- Rural Revitalization Service Center, Shizhong District, Zaozhuang 277000, China
- Zhongzhi Han
- Academy of Dongying Efficient Agricultural Technology and Industry on Saline and Alkaline Land in Collaboration with Qingdao Agricultural University, Dongying 257091, China
- College of Science and Information, Qingdao Agricultural University, Qingdao 266109, China
- Longgang Zhao
- Academy of Dongying Efficient Agricultural Technology and Industry on Saline and Alkaline Land in Collaboration with Qingdao Agricultural University, Dongying 257091, China
- College of Grassland Science, Qingdao Agricultural University, Qingdao 266109, China
4. Chen F, Fu H, Yu H, Chu Y. Using HVS Dual-Pathway and Contrast Sensitivity to Blindly Assess Image Quality. Sensors (Basel) 2023; 23:4974. [PMID: 37430884] [DOI: 10.3390/s23104974]
Abstract
Blind image quality assessment (BIQA) aims to evaluate image quality in a way that closely matches human perception. To achieve this goal, the strengths of deep learning and the characteristics of the human visual system (HVS) can be combined. In this paper, inspired by the ventral pathway and the dorsal pathway of the HVS, a dual-pathway convolutional neural network is proposed for BIQA tasks. The proposed method consists of two pathways: the "what" pathway, which mimics the ventral pathway of the HVS to extract the content features of distorted images, and the "where" pathway, which mimics the dorsal pathway of the HVS to extract the global shape features of distorted images. Then, the features from the two pathways are fused and mapped to an image quality score. Additionally, gradient images weighted by contrast sensitivity are used as the input to the "where" pathway, allowing it to extract global shape features that are more sensitive to human perception. Moreover, a dual-pathway multi-scale feature fusion module is designed to fuse the multi-scale features of the two pathways, enabling the model to capture both global features and local details, thus improving the overall performance of the model. Experiments conducted on six databases show that the proposed method achieves state-of-the-art performance.
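As a rough illustration of the "where"-pathway input described above, the snippet below builds a gradient-magnitude map and re-weights its spectrum with a contrast sensitivity function. The Mannos-Sakrison CSF is used here as one common choice, and the frequency scaling is arbitrary; the paper's exact CSF weighting and the dual-pathway network itself are not reproduced.

```python
# Sketch of a contrast-sensitivity-weighted gradient image (grayscale, NumPy/SciPy).
import numpy as np
from scipy import ndimage

def csf_weighted_gradient(gray):
    gx = ndimage.sobel(gray, axis=1)
    gy = ndimage.sobel(gray, axis=0)
    grad = np.hypot(gx, gy)                               # gradient magnitude
    fy = np.fft.fftfreq(gray.shape[0])[:, None]
    fx = np.fft.fftfreq(gray.shape[1])[None, :]
    f = np.hypot(fx, fy) * gray.shape[0]                  # assumed radial-frequency scaling
    csf = 2.6 * (0.0192 + 0.114 * f) * np.exp(-(0.114 * f) ** 1.1)  # Mannos-Sakrison CSF
    return np.real(np.fft.ifft2(np.fft.fft2(grad) * csf))

img = np.random.rand(256, 256)
print(csf_weighted_gradient(img).shape)  # (256, 256)
```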
Collapse
Affiliation(s)
- Fan Chen
- Department of Artificial Intelligence, Shenzhen University, Shenzhen 518060, China
- Hong Fu
- Department of Mathematics and Information Technology, The Education University of Hong Kong, Hong Kong, China
- Hengyong Yu
- Department of Electrical and Computer Engineering, University of Massachusetts Lowell, Lowell, MA 01854, USA
- Ying Chu
- Department of Artificial Intelligence, Shenzhen University, Shenzhen 518060, China
5. Zhu Y, Li C, Hu K, Luo H, Zhou M, Li X, Gao X. A new two-stream network based on feature separation and complementation for ultrasound image segmentation. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2022.104567]
6. Automatic No-Reference kidney tissue whole slide image quality assessment based on composite fusion models. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2022.104547]
7. No-reference image quality assessment with multi-scale weighted residuals and channel attention mechanism. Soft Comput 2022. [DOI: 10.1007/s00500-022-07535-5]
8. Ma L, Liu R, Zhang J, Fan X, Luo Z. Learning Deep Context-Sensitive Decomposition for Low-Light Image Enhancement. IEEE Transactions on Neural Networks and Learning Systems 2022; 33:5666-5680. [PMID: 33929967] [DOI: 10.1109/TNNLS.2021.3071245]
Abstract
Enhancing the quality of low-light (LOL) images plays a very important role in many image processing and multimedia applications. In recent years, a variety of deep learning techniques have been developed to address this challenging task. A typical framework is to simultaneously estimate the illumination and reflectance, but they disregard the scene-level contextual information encapsulated in feature spaces, causing many unfavorable outcomes, e.g., details loss, color unsaturation, and artifacts. To address these issues, we develop a new context-sensitive decomposition network (CSDNet) architecture to exploit the scene-level contextual dependencies on spatial scales. More concretely, we build a two-stream estimation mechanism including reflectance and illumination estimation network. We design a novel context-sensitive decomposition connection to bridge the two-stream mechanism by incorporating the physical principle. The spatially varying illumination guidance is further constructed for achieving the edge-aware smoothness property of the illumination component. According to different training patterns, we construct CSDNet (paired supervision) and context-sensitive decomposition generative adversarial network (CSDGAN) (unpaired supervision) to fully evaluate our designed architecture. We test our method on seven testing benchmarks [including massachusetts institute of technology (MIT)-Adobe FiveK, LOL, ExDark, and naturalness preserved enhancement (NPE)] to conduct plenty of analytical and evaluated experiments. Thanks to our designed context-sensitive decomposition connection, we successfully realized excellent enhanced results (with sufficient details, vivid colors, and few noises), which fully indicates our superiority against existing state-of-the-art approaches. Finally, considering the practical needs for high efficiency, we develop a lightweight CSDNet (named LiteCSDNet) by reducing the number of channels. Furthermore, by sharing an encoder for these two components, we obtain a more lightweight version (SLiteCSDNet for short). SLiteCSDNet just contains 0.0301M parameters but achieves the almost same performance as CSDNet. Code is available at https://github.com/KarelZhang/CSDNet-CSDGAN.
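The two-stream Retinex-style decomposition this work builds on can be illustrated with a bare-bones PyTorch sketch: one head predicts reflectance, one predicts illumination, and the physical constraint I ≈ R ⊙ L drives the reconstruction loss. The network, the smoothness term, and the loss weights below are assumptions; the actual CSDNet context-sensitive decomposition connection and losses are more elaborate.

```python
# Toy Retinex decomposition with a reconstruction constraint I = R * L.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Decomposer(nn.Module):
    def __init__(self, ch=16):
        super().__init__()
        self.shared = nn.Sequential(nn.Conv2d(3, ch, 3, padding=1), nn.ReLU(inplace=True))
        self.reflectance = nn.Sequential(nn.Conv2d(ch, 3, 3, padding=1), nn.Sigmoid())
        self.illumination = nn.Sequential(nn.Conv2d(ch, 1, 3, padding=1), nn.Sigmoid())

    def forward(self, x):
        h = self.shared(x)
        return self.reflectance(h), self.illumination(h)

net = Decomposer()
low = torch.rand(2, 3, 128, 128)
R, L = net(low)
recon_loss = F.l1_loss(R * L, low)                     # physical principle: I = R ⊙ L
smooth_loss = F.l1_loss(L[..., 1:], L[..., :-1])       # crude horizontal smoothness on illumination
loss = recon_loss + 0.1 * smooth_loss                  # weight 0.1 is an arbitrary assumption
loss.backward()
```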
9. Shi H, Wang L, Wang G. Blind Quality Prediction for View Synthesis Based on Heterogeneous Distortion Perception. Sensors (Basel) 2022; 22:7081. [PMID: 36146438] [PMCID: PMC9504726] [DOI: 10.3390/s22187081]
Abstract
The quality of synthesized images directly affects the practical application of virtual view synthesis technology, which typically uses a depth-image-based rendering (DIBR) algorithm to generate a new viewpoint based on texture and depth images. Current view synthesis quality metrics commonly evaluate the quality of DIBR-synthesized images, where the DIBR process is computationally expensive and time-consuming. In addition, the existing view synthesis quality metrics cannot achieve robustness due to the shallow hand-crafted features. To avoid the complicated DIBR process and learn more efficient features, this paper presents a blind quality prediction model for view synthesis based on HEterogeneous DIstortion Perception, dubbed HEDIP, which predicts the image quality of view synthesis from texture and depth images. Specifically, the texture and depth images are first fused based on discrete cosine transform to simulate the distortion of view synthesis images, and then the spatial and gradient domain features are extracted in a Two-Channel Convolutional Neural Network (TCCNN). Finally, a fully connected layer maps the extracted features to a quality score. Notably, the ground-truth score of the source image cannot effectively represent the labels of each image patch during training due to the presence of local distortions in view synthesis image. So, we design a Heterogeneous Distortion Perception (HDP) module to provide effective training labels for each image patch. Experiments show that with the help of the HDP module, the proposed model can effectively predict the quality of view synthesis. Experimental results demonstrate the effectiveness of the proposed model.
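The DCT-domain fusion of texture and depth patches mentioned above can be sketched compactly; the equal-weight coefficient mixing below is a placeholder assumption, not the HEDIP formulation, and the two-channel CNN that follows in the paper is omitted.

```python
# Fuse a texture patch and a depth patch in the DCT domain (SciPy).
import numpy as np
from scipy.fft import dctn, idctn

def dct_fuse(texture_patch, depth_patch, w=0.5):
    t = dctn(texture_patch, norm="ortho")
    d = dctn(depth_patch, norm="ortho")
    fused = w * t + (1.0 - w) * d          # assumed coefficient-level mixing rule
    return idctn(fused, norm="ortho")

tex = np.random.rand(32, 32)
dep = np.random.rand(32, 32)
print(dct_fuse(tex, dep).shape)  # (32, 32)
```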
Collapse
Affiliation(s)
- Haozhi Shi
- School of Physics, Xidian University, Xi’an 710071, China
- Lanmei Wang
- School of Physics, Xidian University, Xi’an 710071, China
- Guibao Wang
- School of Physics and Telecommunication Engineering, Shaanxi University of Technology, Hanzhong 723001, China
10. Wang Z, Ma K. Active Fine-Tuning From gMAD Examples Improves Blind Image Quality Assessment. IEEE Transactions on Pattern Analysis and Machine Intelligence 2022; 44:4577-4590. [PMID: 33830918] [DOI: 10.1109/TPAMI.2021.3071759]
Abstract
The research in image quality assessment (IQA) has a long history, and significant progress has been made by leveraging recent advances in deep neural networks (DNNs). Despite high correlation numbers on existing IQA datasets, DNN-based models may be easily falsified in the group maximum differentiation (gMAD) competition. Here we show that gMAD examples can be used to improve blind IQA (BIQA) methods. Specifically, we first pre-train a DNN-based BIQA model using multiple noisy annotators, and fine-tune it on multiple synthetically distorted images, resulting in a "top-performing" baseline model. We then seek pairs of images by comparing the baseline model with a set of full-reference IQA methods in gMAD. The spotted gMAD examples are most likely to reveal the weaknesses of the baseline, and suggest potential ways for refinement. We query human quality annotations for the selected images in a well-controlled laboratory environment, and further fine-tune the baseline on the combination of human-rated images from gMAD and existing databases. This process may be iterated, enabling active fine-tuning from gMAD examples for BIQA. We demonstrate the feasibility of our active learning scheme on a large-scale unlabeled image set, and show that the fine-tuned quality model achieves improved generalizability in gMAD, without destroying performance on previously seen databases.
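The gMAD-style pair spotting described above can be illustrated with a toy NumPy routine: images are binned to the same quality level according to one model (the "defender") and, within each bin, the pair on which a second model (the "attacker") disagrees most is selected for human annotation. The binning granularity and the defender/attacker roles are assumptions for illustration, not the paper's exact competition protocol.

```python
# Toy gMAD pair selection from two sets of model scores on an unlabeled image set.
import numpy as np

def spot_gmad_pairs(defender_scores, attacker_scores, n_bins=10):
    defender_scores = np.asarray(defender_scores)
    attacker_scores = np.asarray(attacker_scores)
    edges = np.linspace(defender_scores.min(), defender_scores.max(), n_bins)
    bins = np.digitize(defender_scores, edges)
    pairs = []
    for b in np.unique(bins):
        idx = np.where(bins == b)[0]
        if len(idx) < 2:
            continue
        att = attacker_scores[idx]
        lo, hi = idx[att.argmin()], idx[att.argmax()]   # maximal disagreement at a fixed defender level
        pairs.append((lo, hi))
    return pairs

rng = np.random.default_rng(0)
print(spot_gmad_pairs(rng.random(100), rng.random(100))[:3])
```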
11. Cao J, Wu W, Wang R, Kwong S. No-reference image quality assessment by using convolutional neural networks via object detection. Int J Mach Learn Cybern 2022. [DOI: 10.1007/s13042-022-01611-w]
12. Joint Inversion of Evaporation Duct Based on Radar Sea Clutter and Target Echo Using Deep Learning. Electronics 2022. [DOI: 10.3390/electronics11142157]
Abstract
Tropospheric duct is an anomalous atmospheric phenomenon over the sea surface that seriously affects the normal operation and performance evaluation of electromagnetic communication equipment at sea. Therefore, achieving precise sensing of tropospheric duct is of profound significance for the propagation of electromagnetic signals. The approach of inverting atmospheric refractivity from easily measurable radar sea clutter is also known as the refractivity from clutter (RFC) technique. However, inversion precision of the conventional RFC technique is low in the low-altitude evaporation duct environment. Due to the weak attenuation of the over-the-horizon target signal as it passes through the tropospheric duct, its strength is much stronger than that of sea clutter. Therefore, this study proposes a new method for the joint inversion of evaporation duct height (EDH) based on sea clutter and target echo by combining deep learning. By testing the inversion performance and noise immunity of the new joint inversion method, the experimental results show that the mean error RMSE and MAE of the new method proposed in this paper are reduced by 41.2% and 40.3%, respectively, compared with the conventional method in the EDH range from 0 to 40 m. In particular, the RMSE and MAE in the EDH range from 0 to 16.7 m are reduced by 54.2% and 56.4%, respectively, compared with the conventional method. It shows that the target signal is more sensitive to the lower evaporation duct, which obviously enhances the inversion precision of the lower evaporation duct and has effectively improved the weak practicality of the conventional RFC technique.
13. A dark and bright channel prior guided deep network for retinal image quality assessment. Biocybern Biomed Eng 2022. [DOI: 10.1016/j.bbe.2022.06.002]
14. Xing Y, Golodetz S, Everitt A, Markham A, Trigoni N. Multiscale Human Activity Recognition and Anticipation Network. IEEE Transactions on Neural Networks and Learning Systems 2022; PP:451-465. [PMID: 35622807] [DOI: 10.1109/TNNLS.2022.3175480]
Abstract
Deep convolutional neural networks have been leveraged to achieve huge improvements in video understanding and human activity recognition performance in the past decade. However, most existing methods focus on activities that have similar time scales, leaving the task of action recognition on multiscale human behaviors less explored. In this study, a two-stream multiscale human activity recognition and anticipation (MS-HARA) network is proposed, which is jointly optimized using a multitask learning method. The MS-HARA network fuses the two streams of the network using an efficient temporal-channel attention (TCA)-based fusion approach to improve the model's representational ability for both temporal and spatial features. We investigate the multiscale human activities from two basic categories, namely, midterm activities and long-term activities. The network is designed to function as part of a real-time processing framework to support interaction and mutual understanding between humans and intelligent machines. It achieves state-of-the-art results on several datasets for different tasks and different application domains. The midterm and long-term action recognition and anticipation performance, as well as the network fusion, are extensively tested to show the efficiency of the proposed network. The results show that the MS-HARA network can easily be extended to different application domains.
15. Yang W, Wu J, Tian S, Li L, Dong W, Shi G. Fine-Grained Image Quality Caption With Hierarchical Semantics Degradation. IEEE Transactions on Image Processing 2022; 31:3578-3590. [PMID: 35511851] [DOI: 10.1109/TIP.2022.3171445]
Abstract
Blind image quality assessment (BIQA), which is capable of precisely and automatically estimating human perceived image quality with no pristine image for comparison, attracts extensive attention and is of wide applications. Recently, many existing BIQA methods commonly represent image quality with a quantitative value, which is inconsistent with human cognition. Generally, human beings are good at perceiving image quality in terms of semantic description rather than quantitative value. Moreover, cognition is a needs-oriented task where humans are able to extract image contents with local to global semantics as they need. The mediocre quality value represents coarse or holistic image quality and fails to reflect degradation on hierarchical semantics. In this paper, to comply with human cognition, a novel quality caption model is inventively proposed to measure fine-grained image quality with hierarchical semantics degradation. Research on human visual system indicates there are hierarchy and reverse hierarchy correlations between hierarchical semantics. Meanwhile, empirical evidence shows that there are also bi-directional degradation dependencies between them. Thus, a novel bi-directional relationship-based network (BDRNet) is proposed for semantics degradation description, through adaptively exploring those correlations and degradation dependencies in a bi-directional manner. Extensive experiments demonstrate that our method outperforms the state-of-the-arts in terms of both evaluation performance and generalization ability.
16. Singh A, Kaur A, Dhillon A, Ahuja S, Vohra H. Software system to predict the infection in COVID-19 patients using deep learning and web of things. Software: Practice & Experience 2022; 52:868-886. [PMID: 34538962] [PMCID: PMC8441673] [DOI: 10.1002/spe.3011]
Abstract
Since the end of 2019, the novel coronavirus disease 2019 (COVID-19) has been detected and has spread rapidly through many countries across the world, and computed tomography (CT) images have been used as an important substitute for the time-consuming Reverse Transcriptase polymerase chain reaction (RT-PCR) test. Medical imaging such as CT offers great potential given growing skepticism toward the sensitivity of RT-PCR as a screening tool, and automated image segmentation is highly desirable as a clinical decision aid and for disease monitoring. However, publicly accessible COVID-19 image data are limited, which leads conventional approaches to overfit. To address this issue, the present paper focuses on data augmentation techniques to create synthetic data. Further, a framework is proposed that uses the Web of Things (WoT) and a traditional U-Net with an EfficientNet-B0 backbone to segment the COVID Radiopaedia and Medseg datasets automatically. The framework achieves an F-score of 0.96, the best among state-of-the-art methods. The performance of the proposed framework, also computed using sensitivity, specificity, and Dice coefficient, reaches 84.5%, 93.9%, and 65.0%, respectively. Finally, the proposed work is validated using three quality-of-service (QoS) parameters, namely server latency, response time, and network latency, which improve performance by 8%, 7%, and 10%, respectively.
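The evaluation metrics quoted above (Dice/F-score, sensitivity, specificity) can be computed directly from binary segmentation masks, as in the self-contained sketch below; the smoothing constant and the random test masks are generic choices, not taken from the paper.

```python
# Pixel-wise segmentation metrics: Dice (equivalent to F1), sensitivity, specificity.
import numpy as np

def segmentation_metrics(pred, target, eps=1e-7):
    pred = pred.astype(bool)
    target = target.astype(bool)
    tp = np.logical_and(pred, target).sum()
    tn = np.logical_and(~pred, ~target).sum()
    fp = np.logical_and(pred, ~target).sum()
    fn = np.logical_and(~pred, target).sum()
    dice = 2 * tp / (2 * tp + fp + fn + eps)
    sensitivity = tp / (tp + fn + eps)
    specificity = tn / (tn + fp + eps)
    return dice, sensitivity, specificity

pred = np.random.rand(128, 128) > 0.5
target = np.random.rand(128, 128) > 0.5
print(segmentation_metrics(pred, target))
```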
Collapse
Affiliation(s)
- Ashima Singh
- CSED, Thapar Institute of Engineering and Technology, Patiala, India
- Amrita Kaur
- CSED, Thapar Institute of Engineering and Technology, Patiala, India
- Sahil Ahuja
- CSED, Thapar Institute of Engineering and Technology, Patiala, India
- Harpreet Vohra
- ECED, Thapar Institute of Engineering and Technology, Patiala, India
17. Zhang T, Zhang K, Xiao C, Xiong Z, Lu J. Joint channel-spatial attention network for super-resolution image quality assessment. Appl Intell 2022. [DOI: 10.1007/s10489-022-03338-1]
18. Pan Z, Yuan F, Lei J, Fang Y, Shao X, Kwong S. VCRNet: Visual Compensation Restoration Network for No-Reference Image Quality Assessment. IEEE Transactions on Image Processing 2022; 31:1613-1627. [PMID: 35081029] [DOI: 10.1109/TIP.2022.3144892]
Abstract
Guided by the free-energy principle, generative adversarial network (GAN)-based no-reference image quality assessment (NR-IQA) methods have improved image quality prediction accuracy. However, the GAN cannot handle the restoration task well for free-energy-principle-guided NR-IQA methods, especially for severely destroyed images, so the quality reconstruction relationship between the distorted image and its restored image cannot be accurately built. To address this problem, a visual compensation restoration network (VCRNet)-based NR-IQA method is proposed, which uses a non-adversarial model to efficiently handle the distorted-image restoration task. The proposed VCRNet consists of a visual restoration network and a quality estimation network. To accurately build the quality reconstruction relationship between the distorted image and its restored image, a visual compensation module, an optimized asymmetric residual block, and an error-map-based mixed loss function are proposed to increase the restoration capability of the visual restoration network. To further address the NR-IQA problem for severely destroyed images, the multi-level restoration features obtained from the visual restoration network are used for image quality estimation. To prove the effectiveness of the proposed VCRNet, seven representative IQA databases are used, and experimental results show that the proposed VCRNet achieves state-of-the-art image quality prediction accuracy. The implementation of the proposed VCRNet has been released at https://github.com/NUIST-Videocoding/VCRNet.
19. Liu X, He W, Zhang Y, Yao S, Cui Z. Effect of dual-convolutional neural network model fusion for Aluminum profile surface defects classification and recognition. Math Biosci Eng 2022; 19:997-1025. [PMID: 34903023] [DOI: 10.3934/mbe.2022046]
Abstract
Classifying and identifying surface defects is essential during the production and use of aluminum profiles. Recently, the dual-convolutional neural network (CNN) model fusion framework has shown promising performance for defect classification and recognition. Spurred by this trend, this paper proposes an improved dual-CNN model fusion framework to classify and identify defects in aluminum profiles. Compared with traditional dual-CNN model fusion frameworks, the proposed architecture involves an improved fusion layer, fusion strategy, and classifier block. Specifically, the suggested method extracts the feature map of the aluminum profile RGB image from the pre-trained VGG16 model's pool5 layer and the feature map of the maximum pooling layer of the suggested A4 network, which is added after the AlexNet model. Then, weighted bilinear interpolation upsamples the feature maps extracted from the maximum pooling layer of the A4 part; the network layers and upsampling scheme ensure equal feature map dimensions so that the feature maps can be merged using an improved wavelet transform. Finally, global average pooling is employed in the classifier block instead of dense layers to reduce the model's parameters and avoid overfitting. The fused feature map is then input into the classifier block for classification. The experimental setup involves data augmentation and transfer learning to prevent overfitting due to the small data sets used, while K-fold cross-validation is employed to evaluate the model's performance during training. The experimental results demonstrate that the proposed dual-CNN model fusion framework attains a classification accuracy higher than current techniques: specifically, 4.3% higher than AlexNet, 2.5% than VGG16, 2.9% than Inception v3, 2.2% than VGG19, 3.6% than ResNet50, 3% than ResNet101, and 0.7% and 1.2% than the conventional dual-CNN fusion frameworks 1 and 2, respectively, proving the effectiveness of the proposed strategy.
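A loose PyTorch sketch of this dual-CNN fusion pattern is given below: VGG16 pool5 features are fused with an auxiliary branch's pooled features after bilinear resizing, then classified through global average pooling. The auxiliary "A4" branch, the improved wavelet-transform merging, and the weighting scheme are replaced here by simple assumptions (a small conv stack and an element-wise sum), so this is not the authors' architecture.

```python
# Dual-CNN fusion with bilinear resizing and a GAP classifier (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models

class DualCNNFusion(nn.Module):
    def __init__(self, num_classes=10):                               # defect-class count assumed
        super().__init__()
        self.vgg_features = models.vgg16(weights=None).features       # ends at pool5 (512 channels)
        self.aux_branch = nn.Sequential(                              # stand-in for the A4 branch
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 512, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
        )
        self.head = nn.Sequential(nn.Conv2d(512, num_classes, 1), nn.AdaptiveAvgPool2d(1), nn.Flatten())

    def forward(self, x):
        deep = self.vgg_features(x)
        aux = self.aux_branch(x)
        aux = F.interpolate(aux, size=deep.shape[-2:], mode="bilinear", align_corners=False)
        return self.head(deep + aux)                                  # GAP classifier instead of dense layers

logits = DualCNNFusion()(torch.rand(2, 3, 224, 224))
print(logits.shape)  # torch.Size([2, 10])
```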
Collapse
Affiliation(s)
- Xiaochen Liu
- School of Mechanical Engineering, Dalian Jiaotong University, Dalian 116028, China
- Weidong He
- School of Mechanical Engineering, Dalian Jiaotong University, Dalian 116028, China
- Yinghui Zhang
- School of Mechanical Engineering, Dalian Jiaotong University, Dalian 116028, China
- Shixuan Yao
- School of Software Engineering, Dalian University of Foreign Languages, Dalian 116044, China
- Ze Cui
- School of Control Science and Engineering, Dalian University of Technology, Dalian 116024, China
20. GMANet: Gradient Mask Attention Network for Finding Clearest Human Fecal Microscopic Image in Autofocus Process. Applied Sciences (Basel) 2021. [DOI: 10.3390/app112110293]
Abstract
The intelligent recognition of formed elements in microscopic images is a research hotspot. Whether the microscopic image is clear or blurred is the key factor affecting the recognition accuracy. Microscopic images of human feces contain numerous items, such as undigested food, epithelium, bacteria and other formed elements, leading to a complex image composition. Consequently, traditional image quality assessment (IQA) methods cannot accurately assess the quality of fecal microscopic images or even identify the clearest image in the autofocus process. In response to this difficulty, we propose a blind IQA method based on a deep convolutional neural network (CNN), namely GMANet. The gradient information of the microscopic image is introduced into a low-level convolutional layer of the CNN as a mask attention mechanism to force high-level features to pay more attention to sharp regions. Experimental results show that the proposed network has good consistency with human visual properties and can accurately identify the clearest microscopic image in the autofocus process. Our proposed model, trained on fecal microscopic images, can be directly applied to the autofocus process of leucorrhea and blood samples without additional transfer learning. Our study is valuable for the autofocus task of microscopic images with complex compositions.
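The gradient-mask-attention idea can be illustrated with a minimal PyTorch snippet: the image gradient magnitude is turned into a soft mask and multiplied onto a low-level feature map so that sharp regions dominate. The finite-difference gradient, the mask normalization, and the "1 + mask" re-weighting are assumptions; the real GMANet wiring is richer.

```python
# Gradient-derived soft mask applied to low-level CNN features (illustrative).
import torch
import torch.nn as nn
import torch.nn.functional as F

def gradient_mask(img):
    # finite-difference gradient magnitude per image (grayscale input, NCHW)
    gx = img[..., :, 1:] - img[..., :, :-1]
    gy = img[..., 1:, :] - img[..., :-1, :]
    g = F.pad(gx.abs(), (0, 1, 0, 0)) + F.pad(gy.abs(), (0, 0, 0, 1))
    return g / (g.amax(dim=(-2, -1), keepdim=True) + 1e-7)

conv = nn.Conv2d(1, 16, 3, stride=2, padding=1)
img = torch.rand(2, 1, 128, 128)
feat = F.relu(conv(img))                                   # low-level features
mask = F.interpolate(gradient_mask(img), size=feat.shape[-2:], mode="bilinear", align_corners=False)
attended = feat * (1.0 + mask)                             # emphasize sharp regions, keep the rest
print(attended.shape)  # torch.Size([2, 16, 64, 64])
```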
21. Yan Q, Wang B, Zhang W, Luo C, Xu W, Xu Z, Zhang Y, Shi Q, Zhang L, You Z. Attention-Guided Deep Neural Network With Multi-Scale Feature Fusion for Liver Vessel Segmentation. IEEE J Biomed Health Inform 2021; 25:2629-2642. [PMID: 33264097] [DOI: 10.1109/JBHI.2020.3042069]
Abstract
Liver vessel segmentation is fast becoming a key instrument in the diagnosis and surgical planning of liver diseases. In clinical practice, liver vessels are normally manually annotated by clinicians on each slice of CT images, which is extremely laborious. Several deep learning methods exist for liver vessel segmentation; however, improving segmentation performance remains a major challenge due to the large variations and complex structure of liver vessels. Previous methods mainly use the existing UNet architecture, but not all features of the encoder are useful for segmentation and some even cause interference. To overcome this problem, we propose a novel deep neural network for liver vessel segmentation, called LVSNet, which employs special designs to obtain the accurate structure of the liver vessel. Specifically, we design an Attention-Guided Concatenation (AGC) module to adaptively select useful context features from low-level features guided by high-level features. The proposed AGC module focuses on capturing rich complementary information to obtain more details. In addition, we introduce an innovative multi-scale fusion block that constructs hierarchical residual-like connections within a single residual block, which is of great importance for effectively linking local blood vessel fragments together. Furthermore, we construct a new dataset containing 40 thin-slice (0.625 mm) cases consisting of CT volumes and annotated vessels. To evaluate the effectiveness of the method on minor vessels, we also propose an automatic stratification method to split major and minor liver vessels. Extensive experimental results demonstrate that the proposed LVSNet outperforms previous methods on liver vessel segmentation datasets. Additionally, we conduct a series of ablation studies that comprehensively support the superiority of the underlying concepts.
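A short PyTorch sketch of an attention-guided concatenation step in the spirit of the AGC module is shown below: high-level features produce channel weights that gate the low-level features before concatenation. The channel sizes and the squeeze-style gating are illustrative assumptions, not the published LVSNet module.

```python
# Channel attention from high-level features gating low-level features before concat.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionGuidedConcat(nn.Module):
    def __init__(self, low_ch=64, high_ch=128):
        super().__init__()
        self.gate = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Conv2d(high_ch, low_ch, 1), nn.Sigmoid())

    def forward(self, low, high):
        weights = self.gate(high)                                     # (N, low_ch, 1, 1) channel attention
        high_up = F.interpolate(high, size=low.shape[-2:], mode="bilinear", align_corners=False)
        return torch.cat([low * weights, high_up], dim=1)

agc = AttentionGuidedConcat()
out = agc(torch.rand(1, 64, 32, 32), torch.rand(1, 128, 16, 16))
print(out.shape)  # torch.Size([1, 192, 32, 32])
```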
22. Wang B, Yang J, Ai J, Luo N, An L, Feng H, Yang B, You Z. Accurate Tumor Segmentation via Octave Convolution Neural Network. Front Med (Lausanne) 2021; 8:653913. [PMID: 34095168] [PMCID: PMC8169966] [DOI: 10.3389/fmed.2021.653913]
Abstract
Three-dimensional (3D) liver tumor segmentation from Computed Tomography (CT) images is a prerequisite for computer-aided diagnosis, treatment planning, and monitoring of liver cancer. Despite many years of research, 3D liver tumor segmentation remains a challenging task. In this paper, we propose an effective and efficient method for tumor segmentation in liver CT images using encoder-decoder based octave convolution networks. Compared with other convolution networks utilizing standard convolution for feature extraction, the proposed method utilizes octave convolutions for learning multiple-spatial-frequency features, thus can better capture tumors with varying sizes and shapes. The proposed network takes advantage of a fully convolutional architecture which performs efficient end-to-end learning and inference. More importantly, we introduce a deep supervision mechanism during the learning process to combat potential optimization difficulties, and thus the model can acquire a much faster convergence rate and more powerful discrimination capability. Finally, we integrate octave convolutions into the encoder-decoder architecture of UNet, which can generate high resolution tumor segmentation in one single forward feeding without post-processing steps. Both architectures are trained on a subset of the LiTS (Liver Tumor Segmentation) Challenge. The proposed approach is shown to significantly outperform other networks in terms of various accuracy measures and processing speed.
Collapse
Affiliation(s)
- Bo Wang
- The State Key Laboratory of Precision Measurement Technology and Instruments, Department of Precision Instrument, Tsinghua University, Beijing, China; Innovation Center for Future Chips, Tsinghua University, Beijing, China; Beijing Jingzhen Medical Technology Ltd., Beijing, China
- Jingyi Yang
- School of Artificial Intelligence, Xidian University, Xi'an, China
- Jingyang Ai
- Beijing Jingzhen Medical Technology Ltd., Beijing, China
- Nana Luo
- Affiliated Hospital of Jining Medical University, Jining, China
- Lihua An
- Affiliated Hospital of Jining Medical University, Jining, China
- Haixia Feng
- Affiliated Hospital of Jining Medical University, Jining, China
- Bo Yang
- China Institute of Marine Technology & Economy, Beijing, China
- Zheng You
- The State Key Laboratory of Precision Measurement Technology and Instruments, Department of Precision Instrument, Tsinghua University, Beijing, China; Innovation Center for Future Chips, Tsinghua University, Beijing, China
23. Wang B, Yang J, Peng H, Ai J, An L, Yang B, You Z, Ma L. Brain Tumor Segmentation via Multi-Modalities Interactive Feature Learning. Front Med (Lausanne) 2021; 8:653925. [PMID: 34055832] [PMCID: PMC8158657] [DOI: 10.3389/fmed.2021.653925]
Abstract
Automatic segmentation of brain tumors from multi-modalities magnetic resonance image data has the potential to enable preoperative planning and intraoperative volume measurement. Recent advances in deep convolutional neural network technology have opened up an opportunity to achieve end-to-end segmenting the brain tumor areas. However, the medical image data used in brain tumor segmentation are relatively scarce and the appearance of brain tumors is varied, so that it is difficult to find a learnable pattern to directly describe tumor regions. In this paper, we propose a novel cross-modalities interactive feature learning framework to segment brain tumors from the multi-modalities data. The core idea is that the multi-modality MR data contain rich patterns of the normal brain regions, which can be easily captured and can be potentially used to detect the non-normal brain regions, i.e., brain tumor regions. The proposed multi-modalities interactive feature learning framework consists of two modules: cross-modality feature extracting module and attention guided feature fusing module, which aim at exploring the rich patterns cross multi-modalities and guiding the interacting and the fusing process for the rich features from different modalities. Comprehensive experiments are conducted on the BraTS 2018 benchmark, which show that the proposed cross-modality feature learning framework can effectively improve the brain tumor segmentation performance when compared with the baseline methods and state-of-the-art methods.
Collapse
Affiliation(s)
- Bo Wang
- The State Key Laboratory of Precision Measurement Technology and Instruments, Department of Precision Instrument, Tsinghua University, Beijing, China; Beijing Jingzhen Medical Technology Ltd., Beijing, China
- Jingyi Yang
- School of Artificial Intelligence, Xidian University, Xi'an, China
- Hong Peng
- Department of Radiology, The 1st Medical Center, Chinese PLA General Hospital, Beijing, China
- Jingyang Ai
- Beijing Jingzhen Medical Technology Ltd., Beijing, China
- Lihua An
- Radiology Department, Affiliated Hospital of Jining Medical University, Jining, China
- Bo Yang
- China Institute of Marine Technology & Economy, Beijing, China
- Zheng You
- The State Key Laboratory of Precision Measurement Technology and Instruments, Department of Precision Instrument, Tsinghua University, Beijing, China
- Lin Ma
- Department of Radiology, The 1st Medical Center, Chinese PLA General Hospital, Beijing, China
24.
25. Yan Q, Wang B, Gong D, Luo C, Zhao W, Shen J, Ai J, Shi Q, Zhang Y, Jin S, Zhang L, You Z. COVID-19 Chest CT Image Segmentation Network by Multi-Scale Fusion and Enhancement Operations. IEEE Transactions on Big Data 2021; 7:13-24. [PMID: 36811064] [PMCID: PMC8769014] [DOI: 10.1109/TBDATA.2021.3056564]
Abstract
The novel coronavirus disease 2019 (COVID-19) was detected and has spread rapidly across various countries around the world since the end of 2019. Computed Tomography (CT) images have been used as a crucial alternative to the time-consuming RT-PCR test. However, purely manual segmentation of CT images faces a serious challenge as the number of suspected cases increases, resulting in an urgent need for accurate and automatic segmentation of COVID-19 infections. Unfortunately, since the imaging characteristics of COVID-19 infection are diverse and similar to the background, existing medical image segmentation methods cannot achieve satisfactory performance. In this article, we establish a new deep convolutional neural network tailored for segmenting chest CT images with COVID-19 infections. We first build a large new chest CT image dataset consisting of 165,667 annotated chest CT images from 861 patients with confirmed COVID-19. Inspired by the observation that the boundary of the infected lung can be enhanced by adjusting the global intensity, we introduce into the proposed deep CNN a feature variation (FV) block which adaptively adjusts the global properties of the features for segmenting COVID-19 infection. The proposed FV block can enhance the capability of feature representation effectively and adaptively for diverse cases. We fuse features at different scales by proposing Progressive Atrous Spatial Pyramid Pooling to handle the sophisticated infection areas with diverse appearances and shapes. The proposed method achieves state-of-the-art performance: Dice similarity coefficients are 0.987 and 0.726 for lung and COVID-19 segmentation, respectively. We conducted experiments on data collected in China and Germany and show that the proposed deep CNN produces impressive performance. The proposed network enhances the segmentation of COVID-19 infections, connects with other techniques, and contributes to the development of methods for remedying COVID-19 infection.
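Atrous spatial pyramid pooling, the building block behind the Progressive ASPP mentioned above, can be sketched generically in PyTorch; the dilation rates and channel widths below are common defaults, and the progressive/cascaded variant and the feature-variation block from the paper are not reproduced.

```python
# Generic atrous spatial pyramid pooling (ASPP) block.
import torch
import torch.nn as nn

class ASPP(nn.Module):
    def __init__(self, in_ch=256, out_ch=64, rates=(1, 6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r) for r in rates
        ])
        self.project = nn.Conv2d(out_ch * len(rates), out_ch, 1)

    def forward(self, x):
        feats = [torch.relu(b(x)) for b in self.branches]   # parallel dilated convolutions
        return self.project(torch.cat(feats, dim=1))

y = ASPP()(torch.rand(1, 256, 32, 32))
print(y.shape)  # torch.Size([1, 64, 32, 32])
```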
Collapse
Affiliation(s)
- Qingsen Yan
- Australian Institute for Machine Learning, University of Adelaide, Adelaide, SA 5005, Australia
- Bo Wang
- State Key Laboratory of Precision Measurement Technology and Instruments, Department of Precision Instrument, Innovation Center for Future Chips, Tsinghua University (THU), Beijing 100084, China
- Beijing Jingzhen Medical Technology Ltd., Beijing 100015, China
- Dong Gong
- Australian Institute for Machine Learning, University of Adelaide, Adelaide, SA 5005, Australia
- Chuan Luo
- State Key Laboratory of Precision Measurement Technology and Instruments, Tsinghua University, Beijing 100084, China
- Wei Zhao
- Beijing Jingzhen Medical Technology Ltd., Beijing 100015, China
- Jianhu Shen
- Beijing Jingzhen Medical Technology Ltd., Beijing 100015, China
- Jingyang Ai
- Beijing Jingzhen Medical Technology Ltd., Beijing 100015, China
- Qinfeng Shi
- Australian Institute for Machine Learning, University of Adelaide, Adelaide, SA 5005, Australia
- Yanning Zhang
- School of Computer Science, Northwestern Polytechnical University, Xi'an 710072, China
- Shuo Jin
- Beijing Tsinghua Changgung Hospital, School of Clinical Medicine, Tsinghua University, Beijing 100084, China
- Liang Zhang
- School of Computer Science and Technology, Xidian University, Xi'an 710071, China
- Zheng You
- State Key Laboratory of Precision Measurement Technology and Instruments, Department of Precision Instrument, Innovation Center for Future Chips, Tsinghua University (THU), Beijing 100084, China
26. Gao Q, Zhu M, Li D, Bian Z, Ma J. [CT image quality assessment based on prior information of pre-restored images]. Nan Fang Yi Ke Da Xue Xue Bao (Journal of Southern Medical University) 2021; 41:230-237. [PMID: 33624596] [PMCID: PMC7905247] [DOI: 10.12122/j.issn.1673-4254.2021.02.10]
Abstract
OBJECTIVE To propose a CT image quality assessment (IQA) strategy based on the prior information of pre-restored images (PR-IQA) to improve the performance of IQA models.
METHODS We propose a CNN-based no-reference CT IQA strategy that exploits the prior information of image quality features from an image restoration algorithm, which is combined with the original distorted image information and fed into two CNNs through the pre-restored image and the residual image. Multi-information fusion is used to improve the feature extraction ability and prediction performance of the CNNs. We built a CT IQA dataset based on spiral CT data published by the Mayo Clinic. The performance of PR-IQA was evaluated by calculating quantitative metrics and performing statistical tests, and the influence of different hyperparameter settings on PR-IQA was analyzed. We then compared PR-IQA with a BASELINE model based on a single CNN that evaluates the original distorted image without a reference image, as well as with eight other IQA algorithms.
RESULTS The comparative experiments showed that the PR-IQA model based on the prior information of three different image restoration algorithms (BF, NLM and BM3D) outperformed all the tested IQA algorithms. Compared with the BASELINE method, the proposed method showed significantly improved performance: the mean PLCC increased by 12.56%, SROCC increased by 19.95%, and RMSE decreased by 22.77%.
CONCLUSION The proposed PR-IQA method can make full use of the prior information of the image restoration algorithm to effectively predict the quality of CT images.
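A rough sketch of the PR-IQA input construction and two-stream fusion is given below: a pre-restored image and the residual to the distorted input feed two small CNNs whose features are fused for score regression. The box-blur stand-in for the restoration step, the network sizes, and the fusion layer are assumptions, not the authors' BF/NLM/BM3D configuration.

```python
# Pre-restored image + residual image as priors for a two-stream quality regressor.
import torch
import torch.nn as nn
import torch.nn.functional as F

def pre_restore(x):
    # stand-in restoration: 3x3 box blur (a real system would use BF, NLM or BM3D as in the paper)
    kernel = torch.ones(1, 1, 3, 3) / 9.0
    return F.conv2d(x, kernel, padding=1)

def small_cnn():
    return nn.Sequential(nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(inplace=True),
                         nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
                         nn.AdaptiveAvgPool2d(1), nn.Flatten())

distorted = torch.rand(2, 1, 64, 64)                 # single-channel CT patch
restored = pre_restore(distorted)
residual = distorted - restored                      # prior information from the restoration step
branch_a, branch_b = small_cnn(), small_cnn()
regressor = nn.Linear(64, 1)
score = regressor(torch.cat([branch_a(restored), branch_b(residual)], dim=1))
print(score.shape)  # torch.Size([2, 1])
```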
Collapse
Affiliation(s)
- Qi Gao
- School of Biomedical Engineering, Southern Medical University; Guangzhou Key Laboratory of Medical Radiation Imaging and Detection Technology, Guangzhou 510515, China
- Manman Zhu
- School of Biomedical Engineering, Southern Medical University; Guangzhou Key Laboratory of Medical Radiation Imaging and Detection Technology, Guangzhou 510515, China
- Danyang Li
- School of Biomedical Engineering, Southern Medical University; Guangzhou Key Laboratory of Medical Radiation Imaging and Detection Technology, Guangzhou 510515, China
- Zhaoying Bian
- School of Biomedical Engineering, Southern Medical University; Guangzhou Key Laboratory of Medical Radiation Imaging and Detection Technology, Guangzhou 510515, China
- Jianhua Ma
- School of Biomedical Engineering, Southern Medical University; Guangzhou Key Laboratory of Medical Radiation Imaging and Detection Technology, Guangzhou 510515, China
27. Shen L, Chen X, Pan Z, Fan K, Li F, Lei J. No-reference stereoscopic image quality assessment based on global and local content characteristics. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2020.10.024]
28. Lyu J, Ling SH, Banerjee S, Zheng JY, Lai KL, Yang D, Zheng YP, Bi X, Su S, Chamoli U. Ultrasound volume projection image quality selection by ranking from convolutional RankNet. Comput Med Imaging Graph 2021; 89:101847. [PMID: 33476927] [DOI: 10.1016/j.compmedimag.2020.101847]
Abstract
Periodic inspection and assessment are important for scoliosis patients. 3D ultrasound imaging has become an important means of scoliosis assessment as it is a real-time, cost-effective and radiation-free imaging technique. With the generation of a 3D ultrasound volume projection spine image using our Scolioscan system, a series of 2D coronal ultrasound images are produced at different depths with different qualities. Selecting a high quality image from these 2D images is the crucial task for further scoliosis measurement. However, adjacent images are similar and difficult to distinguish. To learn the nuances between these images, we propose selecting the best image automatically, based on their quality rankings. Here, the ranking algorithm we use is a pairwise learning-to-ranking network, RankNet. Then, to extract more efficient features of input images and to improve the discriminative ability of the model, we adopt the convolutional neural network as the backbone due to its high power of image exploration. Finally, by inputting the images in pairs into the proposed convolutional RankNet, we can select the best images from each case based on the output ranking orders. The experimental result shows that convolutional RankNet achieves better than 95.5% top-3 accuracy, and we prove that this performance is beyond the experience of a human expert.
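The pairwise ranking mechanism described above can be illustrated with a minimal convolutional RankNet in PyTorch: a shared CNN scores each image of a pair, and a cross-entropy loss on the score difference learns the quality ordering. The backbone, input size, and training details are assumptions, not the authors' configuration.

```python
# Minimal convolutional RankNet with a pairwise logistic loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvRankNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 1))

    def forward(self, img_a, img_b):
        return self.encoder(img_a) - self.encoder(img_b)   # score difference s_a - s_b

net = ConvRankNet()
img_a, img_b = torch.rand(4, 1, 128, 128), torch.rand(4, 1, 128, 128)
label = torch.ones(4, 1)                                   # 1 means img_a is the better-quality image
loss = F.binary_cross_entropy_with_logits(net(img_a, img_b), label)
loss.backward()
print(float(loss))
```

At inference time, the learned scalar scores can simply be sorted to pick the clearest coronal image from each case.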
Collapse
Affiliation(s)
- Juan Lyu
- College of Information and Communication Engineering, Harbin Engineering University, Harbin, China
- Sai Ho Ling
- School of Biomedical Engineering, University of Technology Sydney, Ultimo, NSW 2007, Australia
- S Banerjee
- School of Biomedical Engineering, University of Technology Sydney, Ultimo, NSW 2007, Australia
- J Y Zheng
- Department of Computer Science, Imperial College London, UK
- K L Lai
- Department of Biomedical Engineering, The Hong Kong Polytechnic University, Hung Hom, Hong Kong
- D Yang
- Department of Biomedical Engineering, The Hong Kong Polytechnic University, Hung Hom, Hong Kong
- Y P Zheng
- Department of Biomedical Engineering, The Hong Kong Polytechnic University, Hung Hom, Hong Kong
- Xiaojun Bi
- College of Information and Communication Engineering, Harbin Engineering University, Harbin, China; College of Information Engineering, Minzu University of China, Beijing, China
- Steven Su
- School of Biomedical Engineering, University of Technology Sydney, Ultimo, NSW 2007, Australia
- Uphar Chamoli
- School of Biomedical Engineering, University of Technology Sydney, Ultimo, NSW 2007, Australia
29. Ma J, Wu J, Li L, Dong W, Xie X, Shi G, Lin W. Blind Image Quality Assessment With Active Inference. IEEE Transactions on Image Processing 2021; 30:3650-3663. [PMID: 33705313] [DOI: 10.1109/TIP.2021.3064195]
Abstract
Blind image quality assessment (BIQA) is a useful but challenging task. It is a promising idea to design BIQA methods by mimicking the working mechanism of human visual system (HVS). The internal generative mechanism (IGM) indicates that the HVS actively infers the primary content (i.e., meaningful information) of an image for better understanding. Inspired by that, this paper presents a novel BIQA metric by mimicking the active inference process of IGM. Firstly, an active inference module based on the generative adversarial network (GAN) is established to predict the primary content, in which the semantic similarity and the structural dissimilarity (i.e., semantic consistency and structural completeness) are both considered during the optimization. Then, the image quality is measured on the basis of its primary content. Generally, the image quality is highly related to three aspects, i.e., the scene information (content-dependency), the distortion type (distortion-dependency), and the content degradation (degradation-dependency). According to the correlation between the distorted image and its primary content, the three aspects are analyzed and calculated respectively with a multi-stream convolutional neural network (CNN) based quality evaluator. As a result, with the help of the primary content obtained from the active inference and the comprehensive quality degradation measurement from the multi-stream CNN, our method achieves competitive performance on five popular IQA databases. Especially in cross-database evaluations, our method achieves significant improvements.
30. Wang B, Jin S, Yan Q, Xu H, Luo C, Wei L, Zhao W, Hou X, Ma W, Xu Z, Zheng Z, Sun W, Lan L, Zhang W, Mu X, Shi C, Wang Z, Lee J, Jin Z, Lin M, Jin H, Zhang L, Guo J, Zhao B, Ren Z, Wang S, Xu W, Wang X, Wang J, You Z, Dong J. AI-assisted CT imaging analysis for COVID-19 screening: Building and deploying a medical AI system. Appl Soft Comput 2021; 98:106897. [PMID: 33199977] [PMCID: PMC7654325] [DOI: 10.1016/j.asoc.2020.106897]
Abstract
The sudden outbreak of the novel coronavirus 2019 (COVID-19) increased the diagnostic burden on radiologists. In a time of epidemic crisis, we hope artificial intelligence (AI) can reduce physician workload in regions with outbreaks and improve diagnostic accuracy for physicians before they acquire enough experience with the new disease. In this paper, we present our experience in building and deploying an AI system that automatically analyzes CT images and provides the probability of infection to rapidly detect COVID-19 pneumonia. The proposed system, which consists of classification and segmentation, saves about 30%-40% of the detection time for physicians and improves the performance of COVID-19 detection. Specifically, working in an interdisciplinary team of over 30 people with medical and/or AI backgrounds, geographically distributed in Beijing and Wuhan, we were able to overcome a series of challenges (e.g., data discrepancy, testing the time-effectiveness of the model, data security) in this particular situation and deploy the system in four weeks. In addition, since the proposed AI system provides the priority of each CT image with its probability of infection, physicians can confirm and segregate infected patients in time. Using 1,136 training cases (723 positives for COVID-19) from five hospitals, we achieve a sensitivity of 0.974 and specificity of 0.922 on the test dataset, which included a variety of pulmonary diseases.
Collapse
Affiliation(s)
- Bo Wang
  - State Key Laboratory of Precision Measurement Technology and Instruments, Department of Precision Instrument, Tsinghua University, Beijing, China
  - Beijing Innovation Center for Future Chips, Tsinghua University, Beijing, China
  - Beijing Jingzhen Medical Technology Ltd., Beijing, China
- Shuo Jin
  - Beijing Tsinghua Changgung Hospital, School of Clinical Medicine, Tsinghua University, Beijing, China
  - Institute for Precision Medicine, Tsinghua University, Beijing, China
- Qingsen Yan
  - University of Adelaide, SA, Australia
  - Beijing Jingzhen Medical Technology Ltd., Beijing, China
- Haibo Xu
  - Department of Radiology, Zhongnan Hospital of Wuhan University, Wuhan, China
- Chuan Luo
  - State Key Laboratory of Precision Measurement Technology and Instruments, Department of Precision Instrument, Tsinghua University, Beijing, China
  - Beijing Laboratory for Biomedical Detection Technology and Instrument, Tsinghua University, Beijing, China
- Lai Wei
  - Beijing Tsinghua Changgung Hospital, School of Clinical Medicine, Tsinghua University, Beijing, China
  - Institute for Precision Medicine, Tsinghua University, Beijing, China
- Wei Zhao
  - Beijing Jingzhen Medical Technology Ltd., Beijing, China
- Xuexue Hou
  - Beijing Jingzhen Medical Technology Ltd., Beijing, China
- Wenshuo Ma
  - Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
- Zhengqing Xu
  - Beijing Jingzhen Medical Technology Ltd., Beijing, China
- Zhuozhao Zheng
  - Beijing Tsinghua Changgung Hospital, School of Clinical Medicine, Tsinghua University, Beijing, China
- Wenbo Sun
  - Department of Radiology, Zhongnan Hospital of Wuhan University, Wuhan, China
- Lan Lan
  - Department of Radiology, Zhongnan Hospital of Wuhan University, Wuhan, China
- Wei Zhang
  - Beijing Jingzhen Medical Technology Ltd., Beijing, China
  - School of Telecommunication Engineering, Xidian University, Xi'an, China
- Xiangdong Mu
  - Beijing Tsinghua Changgung Hospital, School of Clinical Medicine, Tsinghua University, Beijing, China
  - Institute for Precision Medicine, Tsinghua University, Beijing, China
- Chenxi Shi
  - Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
- Zhongxiao Wang
  - Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
- Jihae Lee
  - Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
- Zijian Jin
  - Beijing Jingzhen Medical Technology Ltd., Beijing, China
- Minggui Lin
  - Beijing Tsinghua Changgung Hospital, School of Clinical Medicine, Tsinghua University, Beijing, China
- Hongbo Jin
  - Beijing Jingzhen Medical Technology Ltd., Beijing, China
- Liang Zhang
  - School of Computer Science and Technology, Xidian University, Xi'an, China
- Jun Guo
  - Beijing Tsinghua Changgung Hospital, School of Clinical Medicine, Tsinghua University, Beijing, China
- Benqi Zhao
  - Beijing Tsinghua Changgung Hospital, School of Clinical Medicine, Tsinghua University, Beijing, China
- Zhizhong Ren
  - Beijing Tsinghua Changgung Hospital, School of Clinical Medicine, Tsinghua University, Beijing, China
- Shuhao Wang
  - Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
  - Thorough Images, Beijing, China
- Wei Xu
  - Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
- Xinghuan Wang
  - Center for Evidence-Based and Translational Medicine, Zhongnan Hospital of Wuhan University, Wuhan, China
  - Wuhan Leishenshan Hospital, Wuhan, China
- Jianming Wang
  - Department of Biliary and Pancreatic Surgery/Cancer Research Center Affiliated Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
  - Tianyou Hospital Affiliated To Wuhan University of Science and Technology, Wuhan, China
- Zheng You
  - State Key Laboratory of Precision Measurement Technology and Instruments, Department of Precision Instrument, Tsinghua University, Beijing, China
  - Beijing Innovation Center for Future Chips, Tsinghua University, Beijing, China
  - Beijing Laboratory for Biomedical Detection Technology and Instrument, Tsinghua University, Beijing, China
- Jiahong Dong
  - Beijing Tsinghua Changgung Hospital, School of Clinical Medicine, Tsinghua University, Beijing, China
  - Institute for Precision Medicine, Tsinghua University, Beijing, China
Collapse
|
31
|
Yang J, Zhao Y, Jiang B, Lu W, Gao X. No-Reference Quality Evaluation of Stereoscopic Video Based on Spatio-Temporal Texture. IEEE TRANSACTIONS ON MULTIMEDIA 2020; 22:2635-2644. [PMID: 0 DOI: 10.1109/tmm.2019.2961209] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/26/2023]
|
32
|
Yang X, Li F, Liu H. Deep feature importance awareness based no-reference image quality prediction. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2020.03.072] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/24/2022]
|
33
|
Liu S, Thung KH, Lin W, Yap PT, Shen D. Real-Time Quality Assessment of Pediatric MRI via Semi-Supervised Deep Nonlocal Residual Neural Networks. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2020; 29:10.1109/TIP.2020.2992079. [PMID: 32396089 PMCID: PMC7648726 DOI: 10.1109/tip.2020.2992079] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
In this paper, we introduce an image quality assessment (IQA) method for pediatric T1- and T2-weighted MR images. IQA is first performed slice-wise using a nonlocal residual neural network (NR-Net) and then volume-wise by agglomerating the slice QA results with a random forest. Our method requires only a small number of quality-annotated images for training and is designed to be robust to annotation noise that might occur due to rater errors and the inevitable mix of good and bad slices in an image volume. Using a small set of quality-assessed images, we pre-train NR-Net to annotate each image slice with an initial quality rating (i.e., pass, questionable, fail), which we then refine by semi-supervised learning and iterative self-training. Experimental results demonstrate that our method, trained using only samples of modest size, exhibits strong generalizability and is capable of real-time (milliseconds per volume) large-scale IQA with near-perfect accuracy.
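A minimal sketch of the slice-to-volume aggregation idea, assuming per-slice pass/questionable/fail probabilities are already produced by the slice-level network (stubbed here with random values). The descriptor statistics, simulated data, and random-forest settings are illustrative assumptions built on scikit-learn, not the authors' implementation.

# Hedged sketch of the two-stage aggregation: per-slice class probabilities are
# summarised into a fixed-length volume descriptor and classified with a random forest.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def volume_features(slice_probs):
    # slice_probs: (n_slices, 3) softmax outputs for pass/questionable/fail.
    # Per-class mean, per-class minimum, and the fraction of slices assigned to
    # each class give a simple fixed-length descriptor for the whole volume.
    return np.concatenate([
        slice_probs.mean(axis=0),
        slice_probs.min(axis=0),
        (slice_probs.argmax(axis=1)[:, None] == np.arange(3)).mean(axis=0),
    ])

rng = np.random.default_rng(0)
# Simulated training set: 100 volumes with varying slice counts and binary volume labels.
X = np.stack([volume_features(rng.dirichlet(np.ones(3), size=rng.integers(30, 60)))
              for _ in range(100)])
y = rng.integers(0, 2, size=100)

forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(forest.predict(X[:5]))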
Collapse
|
34
|
Zhao Y, Ji X, Liu Z. Blind image quality assessment based on statistics features and perceptual features. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS 2020. [DOI: 10.3233/jifs-190998] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/25/2022]
Affiliation(s)
- Youen Zhao
  - School of Computer Science and Technology, Shandong University of Finance and Economics, Jinan, China
  - Provincial Key Laboratory of Digital Media Technology, Shandong University of Finance and Economics, Jinan, China
- Xiuhua Ji
  - School of Computer Science and Technology, Shandong University of Finance and Economics, Jinan, China
  - Provincial Key Laboratory of Digital Media Technology, Shandong University of Finance and Economics, Jinan, China
- Zhaoguang Liu
  - School of Computer Science and Technology, Shandong University of Finance and Economics, Jinan, China
  - Provincial Key Laboratory of Digital Media Technology, Shandong University of Finance and Economics, Jinan, China
Collapse
|
35
|
Abstract
In this paper, we propose a no-reference image quality assessment (NR-IQA) approach for authentically distorted images based on expanding proxy labels. To distinguish them from human labels, we define the quality scores generated by a traditional NR-IQA algorithm as "proxy labels". "Proxy" means that the objective results are obtained by a computer through feature extraction and assessment of the image, rather than by human judgment. To address the limited size of image quality assessment (IQA) datasets, we adopt a cascading transfer-learning method. First, we obtain a large number of proxy labels, denoting the quality scores of authentically distorted images, by using a traditional no-reference IQA method. Then a deep network is trained on the proxy labels in order to learn IQA-related knowledge from this large set of images and their scores. Finally, we use fine-tuning to inherit the knowledge represented in the trained network; during this procedure, the learned mapping fits human visual perception more closely. Experimental results demonstrate that the proposed algorithm outperforms existing algorithms. On the LIVE In the Wild Image Quality Challenge database and the KonIQ-10k database (two standard databases for authentically distorted image quality assessment), the algorithm achieves good consistency between human visual perception and the predicted quality scores of authentically distorted images.
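A minimal sketch of the cascading transfer-learning recipe under stated assumptions: the traditional NR-IQA scorer is stubbed with a placeholder statistic, the backbone is a toy PyTorch CNN regressor, and both the unlabeled images and the human opinion scores are random stand-ins. None of this reproduces the paper's actual algorithm, backbone, or hyper-parameter choices; it only illustrates the pre-train-on-proxy-labels-then-fine-tune flow.

# Stage 1: pre-train a quality regressor on proxy labels from a traditional scorer.
# Stage 2: fine-tune the same network on a small human-labelled set.
import torch
import torch.nn as nn

def traditional_nriqa_score(image):
    # Placeholder for a classical NR-IQA algorithm producing the proxy label.
    return image.std()  # stand-in statistic, NOT a real quality metric

backbone = nn.Sequential(
    nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 1),
)
loss_fn = nn.MSELoss()
opt = torch.optim.Adam(backbone.parameters(), lr=1e-3)

# Stage 1: pre-train on (simulated) unlabelled images using proxy labels.
for _ in range(10):
    imgs = torch.rand(8, 3, 64, 64)
    proxy = torch.stack([traditional_nriqa_score(im) for im in imgs]).unsqueeze(1)
    opt.zero_grad()
    loss_fn(backbone(imgs), proxy).backward()
    opt.step()

# Stage 2: fine-tune on the small human-labelled set (random stand-ins here).
for _ in range(10):
    imgs, mos = torch.rand(4, 3, 64, 64), torch.rand(4, 1)
    opt.zero_grad()
    loss_fn(backbone(imgs), mos).backward()
    opt.step()

print(backbone(torch.rand(1, 3, 64, 64)).item())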
Collapse
|
36
|
|