1
Tahir YS, Rosdi BA. FV-EffResNet: an efficient lightweight convolutional neural network for finger vein recognition. PeerJ Comput Sci 2024;10:e1837. PMID: 38435623; PMCID: PMC10909234; DOI: 10.7717/peerj-cs.1837.
Abstract
Several deep neural networks have been introduced for finger vein recognition over time, and these networks have demonstrated high levels of performance. However, most current state-of-the-art deep learning systems use networks with increasing layers and parameters, resulting in greater computational costs and complexity. This can make them impractical for real-time implementation, particularly on embedded hardware. To address these challenges, this article concentrates on developing a lightweight convolutional neural network (CNN) named FV-EffResNet for finger vein recognition, aiming to find a balance between network size, speed, and accuracy. The key improvement lies in the utilization of the proposed novel convolution block named the Efficient Residual (EffRes) block, crafted to facilitate efficient feature extraction while minimizing the parameter count. The block decomposes the convolution process, employing pointwise and depthwise convolutions with a specific rectangular dimension realized in two layers (n × 1) and (1 × m) for enhanced handling of finger vein data. The approach achieves computational efficiency through a combination of squeeze units, depthwise convolution, and a pooling strategy. The hidden layers of the network use the Swish activation function, which has been shown to enhance performance compared to conventional functions like ReLU or Leaky ReLU. Furthermore, the article adopts cyclical learning rate techniques to expedite the training process of the proposed network. The effectiveness of the proposed pipeline is demonstrated through comprehensive experiments conducted on four benchmark databases, namely FV-USM, SDUMLA, MMCBNU_600, and NUPT-FV. The experimental results reveal that the EffRes block has a remarkable impact on finger vein recognition. 
The proposed FV-EffResNet achieves state-of-the-art performance in both identification and verification settings, leveraging the benefits of being lightweight and incurring low computational costs.
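Two of the training ingredients the abstract names, the Swish activation and a cyclical learning rate, can be sketched in a few lines. This is a generic sketch, not the authors' implementation: the triangular schedule (in the spirit of Smith's CLR) and its hyperparameters are illustrative assumptions.

```python
import math

def swish(x):
    """Swish activation: x * sigmoid(x); smooth and non-monotonic,
    unlike ReLU or Leaky ReLU."""
    return x / (1.0 + math.exp(-x))

def triangular_clr(step, base_lr=1e-4, max_lr=1e-2, half_cycle=2000):
    """Triangular cyclical learning rate: oscillates linearly between
    base_lr and max_lr with a period of 2 * half_cycle steps.
    All hyperparameter values here are assumptions for illustration."""
    cycle_pos = step % (2 * half_cycle)
    frac = cycle_pos / half_cycle
    if frac > 1.0:            # descending half of the cycle
        frac = 2.0 - frac
    return base_lr + (max_lr - base_lr) * frac
```

In practice such a schedule is queried once per optimizer step, letting the learning rate sweep a range instead of decaying monotonically.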
Affiliation(s)
- Yusuf Suleiman Tahir
- School of Electrical and Electronic Engineering, Universiti Sains Malaysia, Nibong Tebal, Penang, Malaysia
- Bakhtiar Affendi Rosdi
- School of Electrical and Electronic Engineering, Universiti Sains Malaysia, Nibong Tebal, Penang, Malaysia
2
Yin Y, Luo S, Zhou J, Kang L, Chen CYC. LDCNet: Lightweight dynamic convolution network for laparoscopic procedures image segmentation. Neural Netw 2024;170:441-452. PMID: 38039682; DOI: 10.1016/j.neunet.2023.11.055.
Abstract
Medical image segmentation is fundamental for modern healthcare systems, especially for reducing the risk of surgery and for treatment planning. Transanal total mesorectal excision (TaTME) has emerged as a recent focal point in laparoscopic research, representing a pivotal modality in the treatment of colorectal cancer. Real-time instance segmentation of surgical imagery during TaTME procedures can serve as an invaluable tool for assisting surgeons, ultimately reducing surgical risks. The dynamic variations in the size and shape of anatomical structures within intraoperative images pose a formidable challenge, rendering precise instance segmentation of TaTME images a task of considerable complexity. Deep learning has exhibited its efficacy in medical image segmentation. However, existing models have struggled to achieve satisfactory accuracy while maintaining manageable computational complexity on TaTME data. To address this conundrum, we propose a lightweight dynamic convolution network (LDCNet) that matches the segmentation performance of state-of-the-art (SOTA) medical image segmentation networks while running at the speed of a lightweight convolutional neural network. Experimental results demonstrate the promising performance of LDCNet, which consistently exceeds previous SOTA approaches. Code is available at github.com/yinyiyang416/LDCNet.
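The abstract does not detail LDCNet's dynamic convolution, but the general technique it names can be sketched: a small attention branch produces input-dependent weights that mix K candidate kernels into a single kernel before convolving. A minimal 1-D NumPy sketch, with the attention logits passed in directly as an assumption (a real network would compute them from the input):

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(z - z.max())
    return e / e.sum()

def dynamic_conv1d(x, kernels, attn_logits):
    """Dynamic convolution (sketch): aggregate K candidate kernels with
    input-dependent softmax weights, then apply one 'valid' 1-D
    convolution with the aggregated kernel.
    x: (L,) signal; kernels: (K, k) kernel bank; attn_logits: (K,)
    scores (assumed to come from a small attention branch)."""
    w = softmax(attn_logits)                 # (K,), sums to 1
    kernel = (w[:, None] * kernels).sum(0)   # (k,) aggregated kernel
    k = kernel.shape[0]
    return np.array([x[i:i + k] @ kernel for i in range(len(x) - k + 1)])
```

The kernel bank adds capacity, but each forward pass still pays for only one convolution, which is what keeps such designs lightweight.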
Affiliation(s)
- Yiyang Yin
- Artificial Intelligence Medical Research Center, School of Intelligent Systems Engineering, Shenzhen Campus of Sun Yat-sen University, Shenzhen, 518107, Guangdong, China
- Shuangling Luo
- Department of General Surgery (Colorectal Surgery), The Sixth Affiliated Hospital, Sun Yat-sen University, Guangzhou, 510655, Guangdong, China; Guangdong Provincial Key Laboratory of Colorectal and Pelvic Floor Diseases, Department of Colorectal Surgery, Guangzhou, 510655, Guangdong, China; The Sixth Affiliated Hospital, Sun Yat-sen University, Guangzhou, 510655, Guangdong, China
- Jun Zhou
- Artificial Intelligence Medical Research Center, School of Intelligent Systems Engineering, Shenzhen Campus of Sun Yat-sen University, Shenzhen, 518107, Guangdong, China
- Liang Kang
- Department of General Surgery (Colorectal Surgery), The Sixth Affiliated Hospital, Sun Yat-sen University, Guangzhou, 510655, Guangdong, China; Guangdong Provincial Key Laboratory of Colorectal and Pelvic Floor Diseases, Department of Colorectal Surgery, Guangzhou, 510655, Guangdong, China; The Sixth Affiliated Hospital, Sun Yat-sen University, Guangzhou, 510655, Guangdong, China.
- Calvin Yu-Chian Chen
- Artificial Intelligence Medical Research Center, School of Intelligent Systems Engineering, Shenzhen Campus of Sun Yat-sen University, Shenzhen, 518107, Guangdong, China; AI for Science (AI4S) - Preferred Program, Peking University Shenzhen Graduate School, Shenzhen, 518055, Guangdong, China; School of Electronic and Computer Engineering, Peking University Shenzhen Graduate School, Shenzhen, 518055, Guangdong, China; Department of Medical Research, China Medical University Hospital, Taichung, 40447, Guangdong, Taiwan; Department of Bioinformatics and Medical Engineering, Asia University, Taichung, 41354, Taiwan.
3
Gao W, Fan B, Fang Y, Song N. Lightweight and multi-lesion segmentation model for diabetic retinopathy based on the fusion of mixed attention and ghost feature mapping. Comput Biol Med 2024;169:107854. PMID: 38109836; DOI: 10.1016/j.compbiomed.2023.107854.
Abstract
Diabetic retinopathy is the main cause of blindness, and lesion segmentation is important groundwork for diagnosing this disease. The main lesions include soft and hard exudates, microaneurysms, and hemorrhages. However, segmenting these four types of lesions is difficult because of their variability in size and contrast and their high intertype similarity. Many current network models suffer from large parameter counts and complex computations, and most segmentation models for diabetic retinopathy focus on only one type of lesion. In this study, a lightweight algorithm based on BiSeNet V2 was proposed for segmenting multiple lesions in diabetic retinopathy fundus images. First, a hybrid attention module was embedded in the semantic branch of BiSeNet V2 at 8- and 16-fold downsampling, which helped reassign deep feature-map weights and enhanced the extraction of local key features. Second, a ghost feature-mapping unit was used to optimize the traditional convolution layers and further reduce the computational cost. Third, a new loss function based on the dynamic threshold loss function was applied to supervise training by adjusting the weights of high-loss difficult samples, which enhanced the model's attention to small targets. In experiments on the IDRiD dataset, we conducted an ablation study to verify the effectiveness of each component and compared the proposed model, BiSeNet V2-Pro, with several state-of-the-art models. Compared with the baseline BiSeNet V2, the segmentation performance of BiSeNet V2-Pro improved by 12.17%, 11.44%, and 8.49% in terms of Sensitivity (SEN), Intersection over Union (IoU), and Dice coefficient (DICE), respectively. Specifically, the IoU for microaneurysms (MA) reached 0.5716. Compared with other methods, segmentation speed was significantly improved while segmentation accuracy was preserved, and the number of model parameters was lower.
These results demonstrate the superiority of BiSeNet V2-Pro in the multi-lesion segmentation of diabetic retinopathy.
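The ghost feature-mapping unit mentioned above follows the GhostNet idea: produce a few "primary" feature maps with an ordinary convolution, then generate the remaining ("ghost") maps with cheap per-channel operations instead of more full convolutions. A minimal NumPy sketch, where a fixed scaling stands in for the learned cheap transform (the real unit uses learned depthwise filters):

```python
import numpy as np

def ghost_features(primary, ratio=2):
    """Ghost feature mapping (sketch, after GhostNet): keep the primary
    feature maps and synthesize the rest with a cheap per-channel linear
    operation; here a fixed 0.5x scaling is an illustrative stand-in for
    a learned depthwise transform.
    primary: (C, H, W) -> returns (C * ratio, H, W)."""
    ghosts = [primary * 0.5 for _ in range(ratio - 1)]  # cheap ops
    return np.concatenate([primary] + ghosts, axis=0)
```

With ratio 2, half of the output channels cost almost nothing to compute, which is the source of the parameter and FLOP savings.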
Affiliation(s)
- Weiwei Gao
- Institute of Mechanical and Automotive Engineering, Shanghai University of Engineering Science, Shanghai 201620, China.
- Bo Fan
- Institute of Mechanical and Automotive Engineering, Shanghai University of Engineering Science, Shanghai 201620, China.
- Yu Fang
- Institute of Mechanical and Automotive Engineering, Shanghai University of Engineering Science, Shanghai 201620, China.
- Nan Song
- Department of Ophthalmology, Eye & ENT Hospital of University, Shanghai 200031, China.
4
Zhou M, Han X, Liu Z, Chen Y, Sun L. A lightweight segmentation network for endoscopic surgical instruments based on edge refinement and efficient self-attention. PeerJ Comput Sci 2023;9:e1746. PMID: 38259682; PMCID: PMC10803021; DOI: 10.7717/peerj-cs.1746.
Abstract
In robot-assisted surgical systems, surgical instrument segmentation is a critical task that provides important information for surgeons to make informed decisions and ensure surgical safety. However, current mainstream models often lack precise segmentation edges and suffer from an excess of parameters, rendering their deployment challenging. To address these issues, this article proposes a lightweight semantic segmentation model based on edge refinement and efficient self-attention. The proposed model utilizes a lightweight densely connected network for feature extraction, which is able to extract high-quality semantic information with fewer parameters. The decoder combines a feature pyramid module with an efficient criss-cross self-attention module. This fusion integrates multi-scale data, strengthens the focus on surgical instrument details, and enhances edge segmentation accuracy. To train and evaluate the proposed model, the authors developed a private dataset of endoscopic surgical instruments containing 1,406 images for training, 469 for validation, and 469 for testing. The proposed model performs well on this dataset with only 466 K parameters, achieving a mean Intersection over Union (mIoU) of 97.11%. In addition, the model was trained on the public Kvasir-instrument and Endovis2017 datasets, achieving excellent mIoU scores of 93.24% and 95.83%, respectively, which demonstrates the method's superiority and effectiveness. Experimental results show that the proposed model has fewer parameters and higher accuracy than other state-of-the-art models. The proposed model thus lays the foundation for further research in the field of surgical instrument segmentation.
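The mIoU figures quoted above are per-class intersection-over-union scores averaged across classes. A minimal NumPy sketch of the metric (the standard definition, not the authors' evaluation code):

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean Intersection-over-Union: per-class
    IoU = |pred ∩ target| / |pred ∪ target|, averaged over the classes
    that appear in either mask.
    pred, target: integer label arrays of the same shape."""
    ious = []
    for c in range(num_classes):
        p, t = (pred == c), (target == c)
        union = np.logical_or(p, t).sum()
        if union == 0:          # class absent from both masks
            continue
        ious.append(np.logical_and(p, t).sum() / union)
    return float(np.mean(ious))
```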
Affiliation(s)
- Mengyu Zhou
- School of Medical Instruments, Shanghai University of Medicine & Health Sciences, Shanghai, P.R.China
- School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai, China
- Xiaoxiang Han
- School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai, China
- Zhoujin Liu
- School of Medical Instruments, Shanghai University of Medicine & Health Sciences, Shanghai, P.R.China
- Yitong Chen
- School of Medical Instruments, Shanghai University of Medicine & Health Sciences, Shanghai, P.R.China
- Liping Sun
- School of Medical Instruments, Shanghai University of Medicine & Health Sciences, Shanghai, P.R.China
- School of Information Science and Technology, Fudan University, Shanghai, China
5
Haq I, Mazhar T, Naz Asif R, Yasin Ghadi Y, Saleem R, Mallek F, Hamam H. A deep learning approach for the detection and counting of colon cancer cells (HT-29 cells) bunches and impurities. PeerJ Comput Sci 2023;9:e1651. PMID: 38192457; PMCID: PMC10773923; DOI: 10.7717/peerj-cs.1651.
Abstract
HT-29 is a human colorectal cancer cell line with an epithelial appearance. Early detection of colorectal cancer can enhance survival rates. This study aims to detect and count HT-29 cells using a deep-learning approach (ResNet-50). The cell lines were procured from Procell Life Science & Technology Co., Ltd. (Wuhan, China). The dataset of 566 images was self-prepared through lab experiments and cell culture. These images contain two classes: HT-29 human colorectal adenocarcinoma cells (blue shapes in bunches) and impurities (tiny circular grey shapes). The images were annotated as impurity or cancer cells with the help of an image labeller, and the ResNet-50 model was then trained, validated, and tested on them. Finally, the number of impurities and cancer cells in each image is counted to assess the accuracy of the proposed model. Accuracy and computational expense are used to gauge the network's performance. Each model is tested ten times with non-overlapping training sets and random test splits. The effect of data pre-processing is also examined across several tasks. The results show an accuracy of 95.5% during training and 95.3% in validation for detecting and counting HT-29 cells. HT-29 cell detection and counting using deep learning is novel because of the scarcity of research in this area, the application of deep learning, and potential performance improvements over traditional methods. By addressing a gap in the literature, employing a unique dataset, and using a custom model architecture, this approach contributes to advancing colon cancer understanding and diagnosis techniques.
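The "ten times with non-overlapping train and random test splits" protocol can be sketched as a seeded random split repeated with different seeds; the 20% test fraction here is an illustrative assumption, not the authors' stated setting:

```python
import random

def split_dataset(n_images, test_frac=0.2, seed=0):
    """Seeded, non-overlapping random train/test split; calling this
    with seeds 0..9 would realize the ten evaluation runs the abstract
    mentions (the test fraction is an assumption)."""
    idx = list(range(n_images))
    random.Random(seed).shuffle(idx)
    n_test = int(n_images * test_frac)
    return sorted(idx[n_test:]), sorted(idx[:n_test])  # train, test
```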
Affiliation(s)
- Inayatul Haq
- School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou, Henan, China
- Tehseen Mazhar
- Department of Computer Science, Virtual University of Pakistan, Lahore, Pakistan
- Rizwana Naz Asif
- School of Computer Science, National College of Business Administration and Economics, Lahore, Pakistan
- Yazeed Yasin Ghadi
- Department of Computer Science and Software Engineering, Al Ain University, Abu Dhabi, United Arab Emirates
- Rabea Saleem
- Department of Computer Science and Software Engineering, Air University, Multan, Pakistan
- Fatma Mallek
- Faculty of Engineering, University of Moncton, Moncton, Canada
- Habib Hamam
- Faculty of Engineering, University of Moncton, Moncton, Canada
- Spectrum of Knowledge Production, Skills Development, Sfax, Tunisia
- College of Computer Science and Engineering, University of Ha’il, Ha’il, Saudi Arabia
- International Institute of Technology and Management, Libreville, Commune d’Akanda, Gabon
- Department of Electrical and Electronic Engineering Science, School of Electrical Engineering, University of Johannesburg, Johannesburg, South Africa
6
Han X, Liu Y, Liu G, Lin Y, Liu Q. LOANet: a lightweight network using object attention for extracting buildings and roads from UAV aerial remote sensing images. PeerJ Comput Sci 2023;9:e1467. PMID: 37547422; PMCID: PMC10403170; DOI: 10.7717/peerj-cs.1467.
Abstract
Semantic segmentation by deep learning for extracting buildings and roads from uncrewed aerial vehicle (UAV) remote sensing images has become a more efficient and convenient method than traditional manual segmentation in the surveying and mapping fields. To make the model lightweight and improve its accuracy, a lightweight network using object attention (LOANet) for extracting buildings and roads from UAV aerial remote sensing images is proposed. The proposed network adopts an encoder-decoder architecture in which a lightweight densely connected network (LDCNet) is developed as the encoder. In the decoder, dual multi-scale context modules consisting of an atrous spatial pyramid pooling module (ASPP) and an object attention module (OAM) are designed to capture more context information from the feature maps of UAV remote sensing images. Between ASPP and OAM, a feature pyramid network (FPN) module is used to fuse the multi-scale features extracted by ASPP. A private dataset of remote sensing images taken by UAV, containing 2,431 training images, 945 validation images, and 475 test images, was constructed. The proposed basic model performs well on this dataset, with only 1.4M parameters and 5.48G floating point operations (FLOPs), achieving excellent mean Intersection-over-Union (mIoU). Further experiments on the publicly available LoveDA and CITY-OSM datasets validate the effectiveness of the proposed basic and large models, with outstanding mIoU results. All code is available at https://github.com/GtLinyer/LOANet.
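The ASPP module mentioned above runs parallel atrous (dilated) convolutions at several rates to capture multi-scale context without adding parameters. A minimal 1-D NumPy sketch of the idea (real ASPP operates on 2-D feature maps, and fuses its branches with a 1x1 convolution rather than the plain sum used here):

```python
import numpy as np

def dilated_conv1d(x, kernel, dilation):
    """'Valid' 1-D atrous (dilated) convolution: taps are spaced
    `dilation` samples apart, enlarging the receptive field with no
    extra parameters."""
    k = len(kernel)
    span = (k - 1) * dilation + 1
    return np.array([
        sum(kernel[j] * x[i + j * dilation] for j in range(k))
        for i in range(len(x) - span + 1)
    ])

def aspp_1d(x, kernel, rates=(1, 2, 4)):
    """ASPP sketch: run parallel dilated convolutions at several rates,
    crop to a common length, and sum the branches."""
    outs = [dilated_conv1d(x, kernel, r) for r in rates]
    n = min(len(o) for o in outs)
    return sum(o[:n] for o in outs)
```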
Affiliation(s)
- Xiaoxiang Han
- School of Medical Instruments, Shanghai University of Medicine and Health Sciences, Shanghai, People’s Republic of China
- School of Health Sciences and Engineering, University of Shanghai for Science and Technology, Shanghai, People’s Republic of China
- Yiman Liu
- Department of Pediatric Cardiology, Shanghai Children’s Medical Center, School of Medicine, Shanghai Jiao Tong University, Shanghai, People’s Republic of China
- Shanghai Key Laboratory of Multidimensional Information Processing, School of Communication & Electronic Engineering, East China Normal University, Shanghai, People’s Republic of China
- Gang Liu
- Key Laboratory of Earthquake Geodesy, Institute of Seismology, China Earthquake Administration, Wuhan, Hubei, People’s Republic of China
- Yuanjie Lin
- School of Health Sciences and Engineering, University of Shanghai for Science and Technology, Shanghai, People’s Republic of China
- Qiaohong Liu
- School of Medical Instruments, Shanghai University of Medicine and Health Sciences, Shanghai, People’s Republic of China
7
Wei J, Yu S, Du Y, Liu K, Xu Y, Xu X. Automatic Segmentation of Hyperreflective Foci in OCT Images Based on Lightweight DBR Network. J Digit Imaging 2023;36:1148-1157. PMID: 36749455; PMCID: PMC10287852; DOI: 10.1007/s10278-023-00786-0.
Abstract
Hyperreflective foci (HF) reflect inflammatory responses in fundus diseases such as diabetic macular edema (DME), retinal vein occlusion (RVO), and central serous chorioretinopathy (CSC). Because HF appear with high contrast and reflectivity in optical coherence tomography (OCT) images, their automatic segmentation is helpful for the prognosis of fundus diseases. Previous traditional methods were time-consuming and required high computing power. Hence, we proposed a lightweight network to segment HF (with a speed of 57 ms per OCT image, at least 150 ms faster than other methods). Our framework consists of two stages: an NLM filter and patch-based splitting to preprocess images, followed by a lightweight DBR neural network to segment HF automatically. Experimental results from 3,000 OCT images of 300 patients (100 DME, 100 RVO, and 100 CSC) revealed that our method segmented HF successfully. On the test cohort, the DBR network achieved Dice similarity coefficients (DSC) of 83.65%, 76.43%, and 82.20% when segmenting HF in DME, RVO, and CSC, respectively, at least 5% higher than previous methods. HF in DME was more easily segmented than in the other two disease types. In addition, our DBR network is universally applicable to clinical practice, with the ability to segment HF in a wide range of fundus diseases.
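The DSC values above are Dice similarity coefficients over binary segmentation masks. A minimal NumPy sketch of the metric (the standard definition, not the authors' evaluation code):

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-8):
    """Dice similarity coefficient: 2|P ∩ T| / (|P| + |T|) for binary
    masks; eps guards against division by zero on empty masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return float(2.0 * inter / (pred.sum() + target.sum() + eps))
```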
Affiliation(s)
- Jin Wei
- Department of Ophthalmology, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, National Clinical Research Center for Eye Diseases, Shanghai Key Laboratory of Ocular Fundus Diseases, Shanghai Engineering Center for Visual Science and Photomedicine, Shanghai Engineering Center for Precise Diagnosis and Treatment of Eye Diseases, Shanghai, 200080, China
- Shanghai Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China
- Suqin Yu
- Department of Ophthalmology, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, National Clinical Research Center for Eye Diseases, Shanghai Key Laboratory of Ocular Fundus Diseases, Shanghai Engineering Center for Visual Science and Photomedicine, Shanghai Engineering Center for Precise Diagnosis and Treatment of Eye Diseases, Shanghai, 200080, China
- Yuchen Du
- Department of Ophthalmology, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, National Clinical Research Center for Eye Diseases, Shanghai Key Laboratory of Ocular Fundus Diseases, Shanghai Engineering Center for Visual Science and Photomedicine, Shanghai Engineering Center for Precise Diagnosis and Treatment of Eye Diseases, Shanghai, 200080, China
- Department of Automation, Shanghai Jiao Tong University, Shanghai, 200240, China
- Kun Liu
- Department of Ophthalmology, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, National Clinical Research Center for Eye Diseases, Shanghai Key Laboratory of Ocular Fundus Diseases, Shanghai Engineering Center for Visual Science and Photomedicine, Shanghai Engineering Center for Precise Diagnosis and Treatment of Eye Diseases, Shanghai, 200080, China
- Yupeng Xu
- Department of Ophthalmology, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, National Clinical Research Center for Eye Diseases, Shanghai Key Laboratory of Ocular Fundus Diseases, Shanghai Engineering Center for Visual Science and Photomedicine, Shanghai Engineering Center for Precise Diagnosis and Treatment of Eye Diseases, Shanghai, 200080, China.
- Xun Xu
- Department of Ophthalmology, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, National Clinical Research Center for Eye Diseases, Shanghai Key Laboratory of Ocular Fundus Diseases, Shanghai Engineering Center for Visual Science and Photomedicine, Shanghai Engineering Center for Precise Diagnosis and Treatment of Eye Diseases, Shanghai, 200080, China
8
Wu X, Lu S, Sun J, Yuan K. A lightweight super-resolution network with skip-connections. Curr Med Imaging 2023. PMID: 37218185; DOI: 10.2174/1573405620666230522151414. Epub ahead of print.
Abstract
INTRODUCTION: In some hospitals in remote areas, the lack of MRI scanners with high magnetic field intensity means that only low-resolution MRI images can be obtained, hindering doctors from making correct diagnoses. In our study, higher-resolution images were reconstructed from low-resolution MRI images. Because our algorithm is lightweight, with a small number of parameters, it can run in remote areas that lack computing resources, and it is of clinical significance in supporting doctors' diagnoses and treatment there. METHODS: We compared different super-resolution algorithms for obtaining high-resolution MRI images, including SRGAN, SPSR, and LESRCNN. A global skip connection was added to the original LESRCNN network to exploit global semantic information for better performance. RESULTS: Experiments showed that on our dataset, our network improved SSIM by 0.8% and also achieved clear gains in PSNR, PI, and LPIPS compared with LESRCNN. Like LESRCNN, our network has a very short running time, few parameters, and low time and space complexity, while outperforming SRGAN and SPSR. Five MRI doctors were invited to evaluate our algorithm subjectively; all agreed that the improvements were significant and that the algorithm could be used clinically in remote areas with great value. CONCLUSION: The experimental results demonstrate the performance of our algorithm in super-resolution MRI image reconstruction. It allows high-resolution images to be obtained in the absence of high-field-intensity MRI scanners, which has great clinical significance. The short running time, small number of parameters, and low time and space complexity ensure that our network can be used in grassroots hospitals in remote areas that lack computing resources, and high-resolution images can be reconstructed quickly, saving time for patients. Although our algorithm is oriented toward practical applications, doctors have affirmed its clinical value.
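The global skip connection described in METHODS can be sketched as adding the network body's predicted residual to an upsampled copy of the low-resolution input, so the trunk only has to learn high-frequency detail. The nearest-neighbour upsampling and the placeholder body function here are illustrative assumptions, not the LESRCNN design:

```python
import numpy as np

def upsample_nearest(img, scale):
    """Nearest-neighbour upsampling of a 2-D image."""
    return np.repeat(np.repeat(img, scale, axis=0), scale, axis=1)

def sr_with_global_skip(lr_img, body_fn, scale=2):
    """Global skip connection (sketch): the body predicts a residual
    that is added to an upsampled copy of the low-resolution input.
    `body_fn` stands in for the convolutional trunk."""
    base = upsample_nearest(lr_img, scale)
    return base + body_fn(base)
```

With a zero body the output is exactly the upsampled input, which is why such skips stabilize training: the network starts from a reasonable baseline.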
Affiliation(s)
- Xuzhou Wu
- Graduate School at Shenzhen, Tsinghua University, Shenzhen, China
- Shi Lu
- Graduate School at Shenzhen, Tsinghua University, Shenzhen, China
- Jirang Sun
- Sanbo Brain Hospital, Capital Medical University, Beijing, China
- Kehong Yuan
- Graduate School at Shenzhen, Tsinghua University, Shenzhen, China
9
Wang Q, Deng H, Wu X, Yang Z, Liu Y, Wang Y, Hao G. LCM-Captioner: A lightweight text-based image captioning method with collaborative mechanism between vision and text. Neural Netw 2023;162:318-329. PMID: 36934693; DOI: 10.1016/j.neunet.2023.03.010.
Abstract
Text-based image captioning (TextCap) aims to remedy a shortcoming of existing image captioning tasks, which ignore text content when describing images. It instead requires models to recognize and describe images from both their visual and textual content, achieving a deeper comprehension of the images. However, existing methods tend to use numerous complex network architectures to improve performance, which on the one hand still fails to adequately model the relationship between vision and text, and on the other hand leads to long running times, high memory consumption, and other deployment problems. To solve these issues, we developed a lightweight captioning method with a collaborative mechanism, LCM-Captioner, which balances high efficiency with high performance. First, we propose a feature-lightening transformation for the TextCap task, named TextLighT, which learns rich multimodal representations while mapping features to lower dimensions, thereby reducing memory costs. Next, we present a collaborative attention module for visual and text information, VTCAM, to facilitate the semantic alignment of multimodal information and uncover important visual objects and textual content. Finally, extensive experiments on the TextCaps dataset demonstrate the effectiveness of our method. Code is available at https://github.com/DengHY258/LCM-Captioner.
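The feature-lightening idea behind TextLighT, mapping features to lower dimensions before fusion, cuts per-token activation memory by roughly d_out/d_in. A minimal NumPy sketch with illustrative dimensions (768 to 256 is an assumption for illustration, not the paper's setting):

```python
import numpy as np

def lighten_features(feats, proj):
    """Feature-lightening sketch: project token features to a lower
    dimension with a learned matrix before the multimodal fusion
    stages.
    feats: (n_tokens, d_in); proj: (d_in, d_out)."""
    return feats @ proj

# Illustrative dimensions (assumptions, not the paper's values).
n_tokens, d_in, d_out = 100, 768, 256
rng = np.random.default_rng(0)
feats = rng.standard_normal((n_tokens, d_in))
proj = rng.standard_normal((d_in, d_out)) / np.sqrt(d_in)
lite = lighten_features(feats, proj)
# per-token storage shrinks from d_in to d_out floats (768 -> 256)
```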
Affiliation(s)
- Qi Wang
- State Key Laboratory of Public Big Data, College of Computer Science and Technology, Guizhou University, China.
- Hongyu Deng
- State Key Laboratory of Public Big Data, College of Computer Science and Technology, Guizhou University, China.
- Xue Wu
- Center for Research and Development of Fine Chemicals, Guizhou University, Guiyang, China; State Key Laboratory of Public Big Data, College of Computer Science and Technology, Guizhou University, China.
- Zhenguo Yang
- School of Computing, Guangdong University of Technology, China.
- Yun Liu
- Department of Automation, Moutai Institute, China.
- Yazhou Wang
- School of Microelectronics, Southeast University, Nanjing 210096, China.
- Gefei Hao
- State Key Laboratory of Public Big Data, College of Computer Science and Technology, Guizhou University, China.
10
Dong S, Fan Z, Chen Y, Chen K, Qin M, Zeng M, Lu X, Zhou G, Gao X, Liu JM. Performance estimation for the memristor-based computing-in-memory implementation of extremely factorized network for real-time and low-power semantic segmentation. Neural Netw 2023;160:202-215. PMID: 36657333; DOI: 10.1016/j.neunet.2023.01.008.
Abstract
Nowadays, many semantic segmentation algorithms achieve satisfactory accuracy on von Neumann platforms (e.g., GPUs), but their speed and energy consumption have not met the high requirements of certain edge applications such as autonomous driving. To tackle this issue, it is necessary to design an efficient lightweight semantic segmentation algorithm and implement it on emerging hardware platforms with high speed and energy efficiency. Here, we first propose an extremely factorized network (EFNet) that learns multi-scale context information while preserving rich spatial information with reduced model complexity. Experimental results on the Cityscapes dataset show that EFNet achieves an accuracy of 68.0% mean intersection over union (mIoU) with only 0.18M parameters, at a speed of 99 frames per second (FPS) on a single RTX 3090 GPU. Then, to further improve speed and energy efficiency, we design a memristor-based computing-in-memory (CIM) accelerator for the hardware implementation of EFNet. Simulation in DNN+NeuroSim V2.0 shows that the memristor-based CIM accelerator is ∼63× (∼4.6×) smaller in area, at most ∼9.2× (∼1000×) faster, and ∼470× (∼2400×) more energy-efficient than the RTX 3090 GPU (the Jetson Nano embedded development board), although its accuracy decreases slightly by 1.7% mIoU. Therefore, the memristor-based CIM accelerator has great potential to be deployed at the edge to implement lightweight semantic segmentation models like EFNet. This study showcases an algorithm-hardware co-design realizing real-time and low-power semantic segmentation at the edge.
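The parameter saving behind an "extremely factorized" design can be seen by replacing a k x k convolution with a k x 1 convolution followed by a 1 x k convolution. EFNet's exact layout is not given in the abstract, so this is a generic parameter-count sketch:

```python
def conv_params(c_in, c_out, kh, kw):
    """Weight count of a convolution layer (bias terms ignored)."""
    return c_in * c_out * kh * kw

def factorized_params(c_in, c_out, k):
    """Factorized replacement for a k x k convolution: a k x 1
    convolution followed by a 1 x k convolution. For k = 3 this drops
    the per-filter cost from 9 weights to 6."""
    return conv_params(c_in, c_out, k, 1) + conv_params(c_out, c_out, 1, k)
```

The ratio approaches 2/k for large channel counts, which is how factorization shrinks models like the 0.18M-parameter EFNet.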
Affiliation(s)
- Shuai Dong
- Institute for Advanced Materials, South China Academy of Advanced Optoelectronics, South China Normal University, Guangzhou, 510006, China; Guangdong Provincial Key Laboratory of Optical Information Materials and Technology, South China Academy of Advanced Optoelectronics, South China Normal University, Guangzhou, 510006, China
- Zhen Fan
- Institute for Advanced Materials, South China Academy of Advanced Optoelectronics, South China Normal University, Guangzhou, 510006, China; Guangdong Provincial Key Laboratory of Optical Information Materials and Technology, South China Academy of Advanced Optoelectronics, South China Normal University, Guangzhou, 510006, China.
- Yihong Chen
- Institute for Advanced Materials, South China Academy of Advanced Optoelectronics, South China Normal University, Guangzhou, 510006, China
- Kaihui Chen
- Institute for Advanced Materials, South China Academy of Advanced Optoelectronics, South China Normal University, Guangzhou, 510006, China
- Minghui Qin
- Institute for Advanced Materials, South China Academy of Advanced Optoelectronics, South China Normal University, Guangzhou, 510006, China
- Min Zeng
- Institute for Advanced Materials, South China Academy of Advanced Optoelectronics, South China Normal University, Guangzhou, 510006, China
- Xubing Lu
- Institute for Advanced Materials, South China Academy of Advanced Optoelectronics, South China Normal University, Guangzhou, 510006, China
- Guofu Zhou
- Guangdong Provincial Key Laboratory of Optical Information Materials and Technology, South China Academy of Advanced Optoelectronics, South China Normal University, Guangzhou, 510006, China; National Center for International Research on Green Optoelectronics, South China Normal University, Guangzhou, 510006, China
- Xingsen Gao
- Institute for Advanced Materials, South China Academy of Advanced Optoelectronics, South China Normal University, Guangzhou, 510006, China
- Jun-Ming Liu
- Laboratory of Solid State Microstructures and Innovation Center of Advanced Microstructures, Nanjing University, Nanjing, 210093, China
11
Yang Y, Zhang L, Ren L, Wang X. MMViT-Seg: A lightweight transformer and CNN fusion network for COVID-19 segmentation. Comput Methods Programs Biomed 2023; 230:107348. [PMID: 36706618 PMCID: PMC9833855 DOI: 10.1016/j.cmpb.2023.107348] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 05/16/2022] [Revised: 01/05/2023] [Accepted: 01/08/2023] [Indexed: 06/18/2023]
Abstract
BACKGROUND AND OBJECTIVE COVID-19 is a serious threat to human health. Traditional convolutional neural networks (CNNs) can perform medical image segmentation, while transformers, which capture long-range relationships better than CNNs, can also handle machine vision tasks. Combining CNNs and transformers for semantic segmentation has therefore attracted intense research interest. However, segmenting medical images with limited datasets, such as those for COVID-19, remains challenging. METHODS This study proposes a lightweight transformer+CNN model in which the encoder sub-network is a two-path design that effectively captures both the global dependence of image features and low-level spatial details. Using a CNN and MobileViT to jointly extract image features reduces the computation and complexity of the model while improving segmentation performance; the model is therefore named Mini-MobileViT-Seg (MMViT-Seg). In addition, a multi-query attention (MQA) module is proposed to fuse multi-scale features from different levels of the decoder sub-network, further improving the performance of the model. MQA can simultaneously fuse multi-input, multi-scale low-level and high-level feature maps and supports end-to-end supervised learning guided by ground truth. RESULTS Two-class infection labeling experiments were conducted on three datasets. The results show that the proposed model achieves the best performance with the fewest parameters among five popular semantic segmentation algorithms. In multi-class infection labeling, the proposed model also achieved competitive performance. CONCLUSIONS The proposed MMViT-Seg was tested on three COVID-19 segmentation datasets and outperformed the compared models. In addition, the proposed MQA module, which effectively fuses multi-scale features from different levels, further improves the segmentation accuracy.
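Fusing decoder features from different levels requires aligning their spatial resolutions first. The sketch below shows only that alignment step (nearest-neighbour upsampling followed by averaging); the paper's MQA module replaces the plain average with learned attention, so treat this as a simplified stand-in.

```python
import numpy as np

def upsample_nn(x, factor):
    """Nearest-neighbour upsampling of a (C, H, W) feature map."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def fuse_multiscale(feats):
    """Upsample every decoder-level feature map to the largest
    resolution and average them (attention-free stand-in for MQA)."""
    target = max(f.shape[1] for f in feats)
    aligned = [upsample_nn(f, target // f.shape[1]) for f in feats]
    return np.mean(aligned, axis=0)

# Three decoder levels at 16x16, 8x8 and 4x4, all with 4 channels
feats = [np.random.randn(4, 16, 16),
         np.random.randn(4, 8, 8),
         np.random.randn(4, 4, 4)]
fused = fuse_multiscale(feats)
print(fused.shape)  # (4, 16, 16)
```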
Affiliation(s)
- Yuan Yang
- Beijing Advanced Innovation Center for Big Data-Based Precision Medicine, School of Medicine and Engineering, No.37 Xueyuan Road, Haidian District, Beijing, China; Key Laboratory of Big Data-Based Precision Medicine, Ministry of Industry and Information Technology, No.37 Xueyuan Road, Haidian District, Beijing, China; School of Automation Science and Electrical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing, China
- Lin Zhang
- Beijing Advanced Innovation Center for Big Data-Based Precision Medicine, School of Medicine and Engineering, No.37 Xueyuan Road, Haidian District, Beijing, China; Key Laboratory of Big Data-Based Precision Medicine, Ministry of Industry and Information Technology, No.37 Xueyuan Road, Haidian District, Beijing, China; School of Automation Science and Electrical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing, China.
- Lei Ren
- Beijing Advanced Innovation Center for Big Data-Based Precision Medicine, School of Medicine and Engineering, No.37 Xueyuan Road, Haidian District, Beijing, China; Key Laboratory of Big Data-Based Precision Medicine, Ministry of Industry and Information Technology, No.37 Xueyuan Road, Haidian District, Beijing, China; School of Automation Science and Electrical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing, China
- Xiaohan Wang
- School of Automation Science and Electrical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing, China
12
Chen J, Deng X, Wen Y, Chen W, Zeb A, Zhang D. Weakly-supervised learning method for the recognition of potato leaf diseases. Artif Intell Rev 2022; 56:1-18. [PMID: 36573133 PMCID: PMC9771599 DOI: 10.1007/s10462-022-10374-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 12/10/2022] [Indexed: 12/24/2022]
Abstract
As a crucial food crop, potatoes are highly consumed worldwide, but they are susceptible to diverse diseases. Early detection and diagnosis can prevent epidemics of plant diseases and raise crop yields. To this end, this study proposes a weakly-supervised learning approach for the identification of potato plant diseases. The lightweight MobileNet V2 serves as the foundation network, and to enhance the learning of minute lesion features, we modified the existing MobileNet V2 architecture through fine-tuning with transfer learning. Then, atrous convolution along with an SPP module was embedded into the pre-trained network, followed by a hybrid attention mechanism containing channel attention and spatial attention submodules to efficiently extract high-dimensional features of plant disease images. The proposed approach outperformed the compared methods and achieved a superior performance gain. It realized an average recall rate of 91.99% for recognizing potato disease types on a publicly accessible dataset. In practical field scenarios, the proposed approach attained an average accuracy of 97.33% and a specificity of 98.39% on a locally collected image dataset. The experimental results present competitive performance and demonstrate the validity and feasibility of the proposed approach.
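A hybrid attention mechanism of the channel-then-spatial kind can be sketched in a few lines. This parameter-free NumPy version only shows the gating structure; real modules of this family (e.g., CBAM-style blocks) derive the gates from learned MLP and convolution weights, and the paper's exact submodules may differ.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(x):
    """Squeeze (global average pool) -> one gate per channel. x: (C, H, W)."""
    w = sigmoid(x.mean(axis=(1, 2)))
    return x * w[:, None, None]

def spatial_attention(x):
    """Pool across channels -> one gate per spatial position. x: (C, H, W)."""
    m = sigmoid(x.mean(axis=0))
    return x * m[None, :, :]

def hybrid_attention(x):
    """Channel gating followed by spatial gating."""
    return spatial_attention(channel_attention(x))

x = np.random.randn(8, 16, 16)
y = hybrid_attention(x)
print(y.shape)  # (8, 16, 16); every value is scaled toward zero by two gates in (0, 1)
```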
Affiliation(s)
- Junde Chen
- Dale E. and Sarah Ann Fowler School of Engineering, Chapman University, Orange, CA 92866 USA
- School of Informatics, Xiamen University, Xiamen, 361005 China
- Department of Electronic Commerce, Xiangtan University, Xiangtan, 411105 China
- Xiaofang Deng
- National Academy of Forestry and Grassland Administration, Beijing, 102600 China
- Yuxin Wen
- Dale E. and Sarah Ann Fowler School of Engineering, Chapman University, Orange, CA 92866 USA
- Weirong Chen
- Department of Information and Electrical Engineering, Ningde Normal University, Ningde, 352100 China
- Adnan Zeb
- School of Informatics, Xiamen University, Xiamen, 361005 China
- College of Engineering, Southern University of Science and Technology, Shenzhen, 518000 China
- Defu Zhang
- School of Informatics, Xiamen University, Xiamen, 361005 China
13
Wang Y, Cao Y, Li J, Wu H, Wang S, Dong X, Yu H. A lightweight hierarchical convolution network for brain tumor segmentation. BMC Bioinformatics 2022; 22:636. [PMID: 36513986 PMCID: PMC9749147 DOI: 10.1186/s12859-022-05039-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/03/2022] [Accepted: 11/04/2022] [Indexed: 12/14/2022] Open
Abstract
BACKGROUND Brain tumor segmentation plays a significant role in clinical treatment and surgical planning. Recently, several deep convolutional networks have been proposed for brain tumor segmentation and have achieved impressive performance. However, most state-of-the-art models use 3D convolution networks, which require high computational costs. This makes it difficult to apply these models to medical equipment in the future. Additionally, due to the large diversity of brain tumors and uncertain boundaries between sub-regions, some models cannot segment multiple tumors in the brain well at the same time. RESULTS In this paper, we propose a lightweight hierarchical convolution network, called LHC-Net. Our network uses a multi-scale strategy in which the common 3D convolution is replaced by hierarchical convolution with residual-like connections. This improves multi-scale feature extraction and greatly reduces parameters and computation. On the BraTS2020 dataset, LHC-Net achieves Dice scores of 76.38%, 90.01% and 83.32% for ET, WT and TC, respectively, better than 3D U-Net's 73.50%, 89.42% and 81.92%. On the multi-tumor set in particular, our model shows significant performance improvement. In addition, LHC-Net has 1.65M parameters and 35.58G FLOPs, roughly half the parameters and a third of the computation of 3D U-Net. CONCLUSION Our proposed method achieves automatic segmentation of tumor sub-regions from four-modal brain MRI images. LHC-Net achieves competitive segmentation performance with fewer parameters and less computation than state-of-the-art models, which means it can be applied under limited medical computing resources. By using the multi-scale strategy on channels, LHC-Net can segment multiple tumors in the patient's brain well. It has great potential for application to other multi-scale segmentation tasks.
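Hierarchical convolution with residual-like connections typically means splitting the channels into groups and feeding each group the previous group's output, so later groups see progressively larger receptive fields. The sketch below shows that dataflow with a toy `tanh` stand-in for the per-group convolution; LHC-Net's actual 3D block is defined in the paper.

```python
import numpy as np

def hierarchical_mix(x, n_groups=4):
    """Res2Net-style hierarchical channel mixing sketch.
    x: (C, H, W) with C divisible by n_groups. Each group is processed
    after adding the previous group's output, so group i aggregates
    information from groups 0..i (a granular multi-scale effect)."""
    groups = np.array_split(x, n_groups, axis=0)
    outs = [groups[0]]                     # first group passes through
    for g in groups[1:]:
        outs.append(np.tanh(g + outs[-1]))  # stand-in for conv(g + prev)
    return np.concatenate(outs, axis=0)

x = np.random.randn(8, 4, 4)
y = hierarchical_mix(x)
print(y.shape)  # (8, 4, 4): same shape, but later channel groups mix earlier ones
```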
Affiliation(s)
- Yuhu Wang
- Tianjin International Engineering Institute, Tianjin University, Tianjin, China
- Yuzhen Cao
- Department of Biomedical Engineering, Tianjin Key Laboratory of Biomedical Detecting Techniques and Instruments, Tianjin University, Tianjin, China
- Jinqiu Li
- Tianjin International Engineering Institute, Tianjin University, Tianjin, China
- Hongtao Wu
- Department of Biomedical Engineering, Tianjin Key Laboratory of Biomedical Detecting Techniques and Instruments, Tianjin University, Tianjin, China
- Shuo Wang
- Department of Biomedical Engineering, Tianjin Key Laboratory of Biomedical Detecting Techniques and Instruments, Tianjin University, Tianjin, China
- Xinming Dong
- Tianjin Rehabilitation Convalescent Center, Tianjin, China.
- Hui Yu
- Department of Biomedical Engineering, Tianjin Key Laboratory of Biomedical Detecting Techniques and Instruments, Tianjin University, Tianjin, China; Tianjin International Engineering Institute, Tianjin University, Tianjin, China; Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China
14
Zhao Y, Fu C, Xu S, Cao L, Ma HF. LFANet: Lightweight feature attention network for abnormal cell segmentation in cervical cytology images. Comput Biol Med 2022; 145:105500. [PMID: 35421793 DOI: 10.1016/j.compbiomed.2022.105500] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/23/2021] [Revised: 03/16/2022] [Accepted: 04/04/2022] [Indexed: 11/19/2022]
Abstract
With computer-aided diagnosis techniques widely applied in cervical cancer screening, cell segmentation has become a necessary step in determining the progression of cervical cancer. Traditional methods alleviate, to a certain extent, the dilemma caused by the shortage of medical resources, but their low segmentation accuracy for abnormal cells and complex processing pipelines prevent fully automatic diagnosis. Deep learning methods, by contrast, can automatically extract image features with high accuracy and small error, making artificial intelligence increasingly popular in computer-aided diagnosis. However, many such models are unsuitable for clinical practice because their complicated architectures carry large numbers of redundant parameters. To address these problems, this study proposes a lightweight feature attention network (LFANet) that extracts differentially abundant feature information from objects at various resolutions. The model can accurately segment both the nucleus and cytoplasm regions in cervical images. Specifically, a lightweight feature extraction module is designed as an encoder to extract abundant features of input images, combining depth-wise separable convolution, residual connections and an attention mechanism. In addition, a feature layer attention module is added to precisely recover pixel locations; it employs global high-level information as a guide for the low-level features, capturing dependencies among channel features. Finally, our LFANet model is evaluated on four independent datasets. The experimental results demonstrate that, compared with other advanced methods, our proposed network achieves state-of-the-art performance with low computational complexity.
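Depth-wise separable convolution, the building block named in this abstract, splits a standard convolution into a per-channel spatial filter (depthwise) followed by a 1 × 1 channel-mixing filter (pointwise). A minimal NumPy sketch of both stages, with naive loops for clarity rather than speed:

```python
import numpy as np

def depthwise_conv(x, k):
    """Per-channel 'valid' convolution. x: (C, H, W), k: (C, kh, kw)."""
    C, H, W = x.shape
    _, kh, kw = k.shape
    out = np.zeros((C, H - kh + 1, W - kw + 1))
    for c in range(C):
        for i in range(out.shape[1]):
            for j in range(out.shape[2]):
                out[c, i, j] = np.sum(x[c, i:i + kh, j:j + kw] * k[c])
    return out

def pointwise_conv(x, w):
    """1x1 convolution mixing channels. x: (C, H, W), w: (C_out, C)."""
    return np.tensordot(w, x, axes=([1], [0]))

# Depthwise separable = depthwise (spatial) then pointwise (channel mixing)
x = np.random.randn(4, 8, 8)
y = pointwise_conv(depthwise_conv(x, np.random.randn(4, 3, 3)),
                   np.random.randn(16, 4))
print(y.shape)  # (16, 6, 6)
```

For C_in = 4, C_out = 16 and a 3 × 3 kernel this uses 4·9 + 16·4 = 100 weights, versus 16·4·9 = 576 for the standard convolution it replaces.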
Affiliation(s)
- Yanli Zhao
- School of Computer Science and Engineering, Northeastern University, Shenyang, 110819, China; School of Electrical Information Engineering, Ningxia Institute of Technology, Shizuishan, 753000, China
- Chong Fu
- School of Computer Science and Engineering, Northeastern University, Shenyang, 110819, China; Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, 110819, China; Engineering Research Center of Security Technology of Complex Network System, Ministry of Education, China.
- Sen Xu
- General Hospital of Northern Theatre Command, Shenyang, 110016, China
- Lin Cao
- School of Information and Communication Engineering, Beijing Information Science and Technology University, Beijing, 100101, China
- Hong-Feng Ma
- Dopamine Group Ltd., Auckland, 1542, New Zealand
15
Jiang X, Wang N, Xin J, Xia X, Yang X, Gao X. Learning lightweight super-resolution networks with weight pruning. Neural Netw 2021; 144:21-32. [PMID: 34450444 DOI: 10.1016/j.neunet.2021.08.002] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/02/2021] [Revised: 06/28/2021] [Accepted: 08/01/2021] [Indexed: 10/20/2022]
Abstract
Single image super-resolution (SISR) has achieved significant performance improvements thanks to deep convolutional neural networks (CNNs). However, deep learning-based methods are computationally intensive and memory demanding, which limits their practical deployment, especially on mobile devices. Focusing on this issue, in this paper we present a novel approach to compress SR networks by weight pruning. To achieve this goal, we first explore a progressive optimization method that gradually zeroes out the redundant parameters. Then, we construct a sparse-aware attention module by exploring a pruning-based, well-suited attention strategy. Finally, we propose an information multi-slicing network which extracts and integrates multi-scale features at a granular level to acquire a more lightweight and accurate SR network. Extensive experiments show that the pruning method can reduce the model size without a noticeable drop in performance, making it possible to apply state-of-the-art SR models in real-world applications. Furthermore, our pruned versions achieve better accuracy and visual quality than state-of-the-art methods.
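The "gradually zero out" idea can be illustrated with plain magnitude pruning under a progressive sparsity schedule. The paper's progressive optimization is more involved (it interleaves pruning with training); this sketch only shows the schedule and the thresholding step.

```python
import numpy as np

def magnitude_prune(w, sparsity):
    """Zero out the smallest-magnitude fraction of weights."""
    k = int(w.size * sparsity)
    if k == 0:
        return w.copy()
    thresh = np.sort(np.abs(w), axis=None)[k - 1]
    out = w.copy()
    out[np.abs(out) <= thresh] = 0.0
    return out

def progressive_schedule(final_sparsity, steps):
    """Ramp sparsity up in equal increments instead of pruning all at once,
    leaving room to fine-tune the surviving weights between steps."""
    return [final_sparsity * (s + 1) / steps for s in range(steps)]

w = np.random.randn(100)
for s in progressive_schedule(0.8, 4):   # 20%, 40%, 60%, 80%
    w = magnitude_prune(w, s)
    # (in a real pipeline, fine-tune the non-zero weights here)
print(np.mean(w == 0))  # fraction of zeroed weights, typically 0.8
```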
Affiliation(s)
- Xinrui Jiang
- State Key Laboratory of Integrated Services Networks, School of Telecommunications Engineering, Xidian University, Xi'an 710071, China.
- Nannan Wang
- State Key Laboratory of Integrated Services Networks, School of Telecommunications Engineering, Xidian University, Xi'an 710071, China.
- Jingwei Xin
- State Key Laboratory of Integrated Services Networks, School of Telecommunications Engineering, Xidian University, Xi'an 710071, China.
- Xiaobo Xia
- Trustworthy Machine Learning (TML) Lab, School of Computer Science, Faculty of Engineering, The University of Sydney Darlington, NSW 2008, Australia.
- Xi Yang
- State Key Laboratory of Integrated Services Networks, School of Telecommunications Engineering, Xidian University, Xi'an 710071, China.
- Xinbo Gao
- Chongqing Key Laboratory of Image Cognition, Chongqing University of Posts and Telecommunications, Chongqing 400065, China.
16
Abstract
Efficient methods developed with deep learning in the last ten years have provided objectivity and high accuracy in the diagnosis of skin diseases. They also support accurate, cost-effective and timely treatment. In addition, they provide diagnoses without the need to touch patients, which is very desirable when the disease is contagious or the patient has another contagious disease. On the other hand, it is not possible to run deep networks on resource-constrained devices (e.g., mobile phones), so lightweight network architectures have been proposed in the literature. However, only a few mobile applications have been developed for the diagnosis of skin diseases from colored photographs using lightweight networks, and those applications address only a few types of skin diseases. Additionally, they do not perform as well as deep network models, particularly for pattern recognition. Therefore, in this study, a novel model has been constructed using MobileNet, together with a novel loss function. The main contributions of this study are: (i) proposing a novel hybrid loss function; (ii) proposing a modified-MobileNet architecture; (iii) designing and implementing a mobile phone application with the modified-MobileNet and a user-friendly interface. Results indicate that the proposed technique can diagnose skin diseases with 94.76% accuracy.
Affiliation(s)
- Evgin Goceri
- Department of Biomedical Engineering, Engineering Faculty, Akdeniz University, Turkey.