1. Zhong L, Philip Chen CL, Guo J, Zhang T. Robust Incremental Broad Learning System for Data Streams of Uncertain Scale. IEEE Transactions on Neural Networks and Learning Systems 2025; 36:7580-7593. [PMID: 38758620] [DOI: 10.1109/tnnls.2024.3396659]
Abstract
Owing to its strong performance and remarkable scalability, the broad learning system (BLS) has attracted wide attention. However, its incremental learning suffers from low accuracy and long training times, especially on unstable data streams, which makes it difficult to apply in real-world scenarios. To overcome these issues, a robust incremental BLS (RI-BLS) is proposed. Its weight-update strategy introduces two memory matrices that store the learned information, so the computation of ridge regression is decomposed into a precomputed form. During incremental learning, RI-BLS efficiently updates the two memory matrices and renews the weights via this precomputed ridge regression. The update strategy is also analyzed theoretically, in terms of error, time complexity, and space complexity, against existing incremental BLSs. Unlike Greville's method used in the original incremental BLS, its results are closer to the one-shot solution. Compared with existing incremental BLSs, the proposed method exhibits more stable time complexity and lower space complexity. Experiments show that RI-BLS outperforms other incremental BLSs on both stable and unstable data streams, and that the proposed weight-update strategy also applies to other randomized neural networks.
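A minimal sketch of the precomputed ridge-regression idea described above, assuming the two memory matrices are the accumulated Gram matrix H^T H and cross-product H^T Y of the feature nodes H and targets Y (the paper's exact bookkeeping may differ):

```python
import numpy as np

class PrecomputedRidge:
    """Sketch of a precomputed ridge-regression update: two memory
    matrices accumulate H^T H and H^T Y over data chunks, so the
    weights can be re-solved exactly at any time."""

    def __init__(self, n_features, n_outputs, lam=1e-3):
        self.K = np.zeros((n_features, n_features))  # memory matrix 1: sum of H^T H
        self.C = np.zeros((n_features, n_outputs))   # memory matrix 2: sum of H^T Y
        self.lam = lam

    def partial_fit(self, H, Y):
        # Incremental step: only the memory matrices change per chunk.
        self.K += H.T @ H
        self.C += H.T @ Y
        return self

    @property
    def W(self):
        # Ridge solution over all data seen so far.
        return np.linalg.solve(self.K + self.lam * np.eye(self.K.shape[0]), self.C)

# Usage: streaming chunks of arbitrary (uncertain) size.
rng = np.random.default_rng(0)
model = PrecomputedRidge(n_features=50, n_outputs=3)
for size in (128, 17, 500):          # chunk sizes need not be fixed
    H = rng.normal(size=(size, 50))  # mapped/enhancement-node features
    Y = rng.normal(size=(size, 3))
    model.partial_fit(H, Y)
W = model.W
```

Because only the two memory matrices change per chunk, the solved weights coincide with one-shot training on all data seen so far, which mirrors the abstract's claim about closeness to the one-shot solution.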
2. Zhao H, Xu C, Chen J, Zhang Z, Wang X. BGLE-YOLO: A Lightweight Model for Underwater Bio-Detection. Sensors (Basel) 2025; 25:1595. [PMID: 40096437] [PMCID: PMC11902696] [DOI: 10.3390/s25051595]
Abstract
To cope with the low contrast, chromatic aberration, and generally small objects of underwater environments, a new underwater fish detection model, BGLE-YOLO, is proposed for accurately detecting underwater objects in images. The model has few parameters and a low computational cost, making it suitable for edge devices. First, an efficient multi-scale convolution (EMC) module is introduced into the backbone network to capture the dynamic changes of targets in the underwater environment. Second, a global and local feature-fusion module for small targets (BIG) is integrated into the neck network to preserve more feature information, reduce erroneous information in higher-level features, and improve the model's detection of small targets. Finally, to prevent the loss of detection accuracy caused by excessive lightweighting, a lightweight shared head (LSH) is constructed; a reparameterization technique further improves detection accuracy without additional parameters or computation. Experimental results on the underwater datasets DUO (Detection Underwater Objects) and RUOD (Real-World Underwater Object Detection) show that BGLE-YOLO matches the accuracy of the benchmark model at an ultra-low computational cost of 6.2 GFLOPs and an ultra-low model size of 1.6 MB.
Affiliation(s)
- Hua Zhao: School of Mathematical Sciences and School of Engineering, Hebei Normal University, Shijiazhuang 050024, China
- Chao Xu: School of Engineering, Hebei Normal University, Shijiazhuang 050024, China
- Jiaxing Chen: School of Mathematical Sciences, Hebei Normal University, Shijiazhuang 050024, China
- Zhexian Zhang: School of Engineering, Hebei Normal University, Shijiazhuang 050024, China
- Xiang Wang: School of Engineering, Hebei Normal University, Shijiazhuang 050024, China
3. Zhou Q, Shi H, Xiang W, Kang B, Latecki LJ. DPNet: Dual-Path Network for Real-Time Object Detection With Lightweight Attention. IEEE Transactions on Neural Networks and Learning Systems 2025; 36:4504-4518. [PMID: 38536700] [DOI: 10.1109/tnnls.2024.3376563]
Abstract
Recent advances in compressing high-accuracy convolutional neural networks (CNNs) have brought remarkable progress in real-time object detection. To accelerate detection, lightweight detectors usually rely on a few convolution layers in a single-path backbone. A single-path architecture, however, involves repeated pooling and downsampling, which tends to produce coarse, inaccurate feature maps that are disadvantageous for localizing objects. Moreover, due to limited network capacity, recent lightweight networks are often weak at representing large-scale visual data. To address these problems, we present DPNet, a dual-path network with a lightweight attention scheme for real-time object detection. The dual-path architecture extracts high-level semantic features and low-level object details in parallel. Although DPNet has a nearly duplicated shape with respect to single-path detectors, its computational cost and model size are not significantly increased. To enhance representation capability, a lightweight self-correlation module (LSCM) is designed to capture global interactions with little computational overhead and few network parameters. In the neck, LSCM is extended into a lightweight cross-correlation module (LCCM) that captures mutual dependencies among neighboring-scale features. We conduct exhaustive experiments on the MS COCO, Pascal VOC 2007, and ImageNet datasets. The results demonstrate that DPNet achieves a state-of-the-art trade-off between detection accuracy and implementation efficiency: 31.3% AP on MS COCO test-dev, 82.7% mAP on the Pascal VOC 2007 test set, and 41.6% mAP on the ImageNet validation set, with a model size of nearly 2.5 M, 1.04 GFLOPs, and frame rates of 164 and 196 frames/s (FPS) for the input images of the three datasets.
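The abstract does not spell out the LSCM internals, so the following is only a rough sketch of a lightweight, channel-reduced self-correlation block of the kind described; the 1x1 projections and reduction factor are assumptions, not the paper's design:

```python
import torch
import torch.nn as nn

class LiteSelfCorrelation(nn.Module):
    """Sketch of a lightweight self-correlation block: 1x1 projections
    to a reduced channel width keep the global-interaction step cheap."""

    def __init__(self, channels, reduction=4):
        super().__init__()
        mid = channels // reduction
        self.q = nn.Conv2d(channels, mid, 1)
        self.k = nn.Conv2d(channels, mid, 1)
        self.v = nn.Conv2d(channels, channels, 1)
        self.scale = mid ** -0.5

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.q(x).flatten(2).transpose(1, 2)          # (b, hw, mid)
        k = self.k(x).flatten(2)                          # (b, mid, hw)
        v = self.v(x).flatten(2).transpose(1, 2)          # (b, hw, c)
        attn = torch.softmax(q @ k * self.scale, dim=-1)  # global pairwise correlation
        out = (attn @ v).transpose(1, 2).reshape(b, c, h, w)
        return x + out                                    # residual connection

x = torch.randn(1, 64, 20, 20)
y = LiteSelfCorrelation(64)(x)  # shape preserved: (1, 64, 20, 20)
```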
4. Li X, Zhao Y, Su H, Wang Y, Chen G. Efficient underwater object detection based on feature enhancement and attention detection head. Sci Rep 2025; 15:5973. [PMID: 39966488] [PMCID: PMC11836378] [DOI: 10.1038/s41598-025-89421-2]
Abstract
Underwater object detection presents both significant challenges and opportunities for ocean exploration and conservation. Although current popular object detection algorithms generally achieve strong performance, underwater images suffer from insufficient illumination, wavelength-dependent scattering, and absorption, so detection performance on underwater objects remains suboptimal. This paper therefore proposes a local channel-information encoding method named the Partial Semantic Encoding Module (PSEM) and an attention-based detection head called the Split Dimension Weighting Head (SDWH) to strengthen a model's ability to extract and integrate semantic features of underwater targets and to locate foreground underwater objects. Specifically, PSEM enhances feature fusion across the multiple scales of the network: it successively encodes semantic feature information, applies residual point-wise addition, and encodes local channel information. SDWH serially weights the spatial and channel semantic information of the fused features, improving the detector's semantic perception and its localization of foreground underwater objects. PSEM and SDWH are improvements to the neck and detection head of YOLO-series algorithms, respectively. Extensive experiments are conducted on the UTDAC2020 and RUOD datasets. On UTDAC2020, YOLOv8n improved with PSEM and SDWH achieves a 2.8% mAP increase over the original version, YOLOv5n shows a 1.0% mAP improvement, and YOLOv6n a 3.0% mAP increase. On RUOD, PSEM and SDWH bring YOLOv8n a 2.7% mAP improvement, with YOLOv5n and YOLOv6n improving by 1.5% and 3.7% mAP, respectively. Moreover, compared with other real-time underwater SOTA algorithms, YOLOv8n enhanced with PSEM and SDWH achieves the highest mAP: 82.9% on UTDAC2020 and 80.9% on RUOD. PSEM and SDWH are thus shown to significantly improve the underwater detection accuracy of YOLO-series detectors at an acceptable computational cost, with real-time performance that fully satisfies practical requirements.
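As a hedged illustration of "serially weighting spatial and channel semantic information": the exact SDWH layout is not given in this abstract, so the channel-then-spatial gating below (in the spirit of CBAM-style attention) is an assumption, not the paper's design:

```python
import torch
import torch.nn as nn

class SerialDimensionWeighting(nn.Module):
    """Sketch: weight channel semantics first, then spatial locations,
    applied serially to a fused feature map."""

    def __init__(self, channels, reduction=8):
        super().__init__()
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1), nn.Sigmoid(),
        )
        self.spatial_gate = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3), nn.Sigmoid(),
        )

    def forward(self, x):
        x = x * self.channel_gate(x)                  # channel weighting
        pooled = torch.cat([x.mean(1, keepdim=True),
                            x.amax(1, keepdim=True)], dim=1)
        return x * self.spatial_gate(pooled)          # spatial weighting

feat = torch.randn(2, 128, 40, 40)
out = SerialDimensionWeighting(128)(feat)  # (2, 128, 40, 40)
```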
Affiliation(s)
- Xingkun Li: Naval Architecture and Port Engineering College, Shandong Jiaotong University, Weihai 264200, China
- Yuhao Zhao: Chinese Academy of Sciences, Beijing 100190, China
- Hu Su: Chinese Academy of Sciences, Beijing 100190, China
- Yugang Wang: Naval Architecture and Port Engineering College, Shandong Jiaotong University, Weihai 264200, China
- Guodong Chen: Naval Architecture and Port Engineering College, Shandong Jiaotong University, Weihai 264200, China
5. Wang C, Xu R, Xu S, Meng W, Xiao J, Zhang X. Accurate Lung Nodule Segmentation With Detailed Representation Transfer and Soft Mask Supervision. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:18381-18393. [PMID: 37824321] [DOI: 10.1109/tnnls.2023.3315271]
Abstract
Accurate lung lesion segmentation from computed tomography (CT) images is crucial to the analysis and diagnosis of lung diseases such as COVID-19 and lung cancer. However, the small size and variety of lung nodules and the lack of high-quality labeling make accurate lung nodule segmentation difficult. To address these issues, we first introduce a novel segmentation mask named the "soft mask," which provides a richer and more accurate description of edge details and better visualization, and we develop a universal automatic soft-mask annotation pipeline to handle different datasets accordingly. Then, a novel network with detailed representation transfer and soft mask supervision (DSNet) is proposed to turn low-resolution input images of lung nodules into high-quality segmentation results. DSNet contains a dedicated detailed representation transfer module (DRTM) that reconstructs detailed representations to compensate for the small size of lung nodule images, together with an adversarial training framework with soft masks that further improves segmentation accuracy. Extensive experiments validate that DSNet outperforms other state-of-the-art methods for accurate lung nodule segmentation and generalizes well to other precise medical segmentation tasks with competitive results. In addition, we provide a new, challenging lung nodule segmentation dataset for further studies (https://drive.google.com/file/d/15NNkvDTb_0Ku0IoPsNMHezJRTH1Oi1wm/view?usp=sharing).
6. Cao R, Zhang R, Yan X, Zhang J. BG-YOLO: A Bidirectional-Guided Method for Underwater Object Detection. Sensors (Basel) 2024; 24:7411. [PMID: 39599187] [PMCID: PMC11597922] [DOI: 10.3390/s24227411]
Abstract
Degraded underwater images decrease the accuracy of underwater object detection. Existing research applies image enhancement to improve the visual quality of images, but this does not necessarily benefit detection and can seriously degrade detector performance. To alleviate this problem, we propose a bidirectionally guided method for underwater object detection, referred to as BG-YOLO. The network is organized as an image enhancement branch and an object detection branch in parallel: the enhancement branch consists of an image enhancement subnet cascaded with an object detection subnet, while the detection branch consists of a detection subnet only. A feature-guided module connects the shallow convolution layers of the two branches. When the image enhancement branch is trained, its detection subnet guides the enhancement subnet to be optimized in the direction most conducive to the detection task. The shallow feature map of the trained enhancement branch is then fed to the feature-guided module, which constrains the optimization of the detection branch through a consistency loss and prompts it to learn more detailed information about the objects, thereby enhancing detection performance. During detection, only the object detection branch is retained, so no additional computational cost is introduced. Extensive experiments demonstrate that the proposed method significantly improves the detection performance of the YOLOv5s object detection network (mAP increased by up to 2.9%) while maintaining the same inference speed as YOLOv5s (132 fps).
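At its core, the feature-guided module imposes a consistency loss between the shallow feature maps of the two branches. A minimal sketch follows; the particular distance (L1 here) and the hypothetical module names in the comments are assumptions:

```python
import torch
import torch.nn.functional as F

def feature_consistency_loss(det_shallow, enh_shallow):
    """Sketch: pull the detection branch's shallow features toward
    those of the trained (frozen) enhancement branch."""
    return F.l1_loss(det_shallow, enh_shallow.detach())

# Training step outline (hypothetical module names):
# f_det = detection_branch.backbone_stage1(img)    # trainable
# f_enh = enhancement_branch.backbone_stage1(img)  # frozen guide
# loss = detection_loss + lam * feature_consistency_loss(f_det, f_enh)
```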
Affiliation(s)
- Ruicheng Cao: School of Cybersecurity, Northwestern Polytechnical University, Xi’an 710072, China
- Ruiteng Zhang: College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China
- Xinyue Yan: College of Computer Science, Chongqing University, Chongqing 400044, China
- Jian Zhang: School of Tropical Agriculture and Forestry, Hainan University, Haikou 571158, China
7. Liu H, Jin F, Zeng H, Pu H, Fan B. Image Enhancement Guided Object Detection in Visually Degraded Scenes. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:14164-14177. [PMID: 37220059] [DOI: 10.1109/tnnls.2023.3274926]
Abstract
Object detection accuracy degrades seriously in visually degraded scenes. A natural solution is to first enhance the degraded image and then perform object detection. However, because the image enhancement and object detection tasks are handled separately, this is suboptimal and does not necessarily improve detection. To solve this problem, we propose an image-enhancement-guided object detection method that refines the detection network with an additional enhancement branch in an end-to-end way. Specifically, the enhancement branch and detection branch are organized in parallel, and a feature-guided module is designed to connect them, optimizing the shallow features of the input image in the detection branch to be as consistent as possible with those of the enhanced image. As the enhancement branch is frozen during training, this design uses the features of enhanced images to guide the learning of the detection branch, making the learned branch aware of both image quality and object detection. At test time, the enhancement branch and feature-guided module are removed, so no additional computational cost is introduced for detection. Extensive experimental results on underwater, hazy, and low-light object detection datasets demonstrate that the proposed method significantly improves the detection performance of popular detection networks (YOLOv3, Faster R-CNN, DetectoRS) in visually degraded scenes.
8. Sarkar P, De S, Gurung S, Dey P. UICE-MIRNet guided image enhancement for underwater object detection. Sci Rep 2024; 14:22448. [PMID: 39341956] [PMCID: PMC11439071] [DOI: 10.1038/s41598-024-73243-9]
Abstract
Underwater object detection is a crucial aspect of monitoring aquaculture resources and preserving the marine ecosystem. In most cases, low-light and scattered lighting conditions challenge computer-vision-based underwater object detection, so low-colorfulness and low-light image enhancement techniques are explored. This work proposes an underwater image enhancement technique called Underwater Image Colorfulness Enhancement MIRNet (UICE-MIRNet) to increase the visibility of small, multiple, dense objects, followed by underwater object detection using YOLOv4. UICE-MIRNet is a specialized version of the classical MIRNet that handles random increments of brightness features to address the visibility problem; it restricts brightness while also improving the colorfulness of underwater images. UICE-MIRNet contains an Underwater Image-Colorfulness Enhancement Block (UI-CEB), which extracts low-colorfulness areas from underwater images and performs color correction without affecting contextual information. Its primary characteristics are the extraction of multiple features using convolutional streams, feature fusion to facilitate information flow, preservation of contextual information by discarding irrelevant features, and increased colorfulness through proper feature selection. The enhanced images are then used to train the YOLOv4 object detection model. The performance of UICE-MIRNet is evaluated quantitatively using standard metrics such as UIQM, UCIQE, entropy, and PSNR, and the method is compared with many existing image enhancement and restoration techniques. Object detection performance is assessed using precision, recall, and mAP. Extensive experiments on two standard datasets, Brackish and Trash-ICRA19, show that the proposed model outperforms many state-of-the-art techniques.
Affiliation(s)
- Pratima Sarkar: Department of Computer Science and Engineering, Sikkim Manipal Institute of Technology, Sikkim Manipal University, Rangpo, Sikkim 737136, India; Department of Computer Science and Engineering, Techno International New Town, Kolkata 700156, India
- Sourav De: Department of Computer Science and Engineering, Cooch Behar Government Engineering College, Cooch Behar 736170, India
- Sandeep Gurung: Department of Computer Science and Engineering, Sikkim Manipal Institute of Technology, Sikkim Manipal University, Rangpo, Sikkim 737136, India
- Prasenjit Dey: Department of Computer Science and Engineering, NIT Rourkela, Rourkela 769008, India
9. An S, Wang L, Wang L. MINM: Marine intelligent netting monitoring using multi-scattering model and multi-space transformation. ISA Transactions 2024; 150:278-297. [PMID: 38782640] [DOI: 10.1016/j.isatra.2024.05.008]
Abstract
Marine intelligent net-tank aquaculture monitoring plays an important role in improving aquaculture efficiency, environmental monitoring efficiency, and environmental safety. Light in the underwater environment is complex, with problems such as scattering and absorption that degrade image quality and make it difficult to analyze and judge the aquaculture environment accurately. Improving such monitoring has three advantages: 1) better observation of the aquaculture process, with timely detection of problems and abnormalities, protecting aquaculture returns and product quality; 2) more convenient and rapid monitoring of the aquaculture environment, improving monitoring efficiency and reducing cost; and 3) effective monitoring of the underwater environment around the farm, with timely detection of foreign pollution, harmful substances, and other problems, protecting the safety of the aquaculture environment. To address the two degradation problems of scattering and absorption in marine smart-net-farm monitoring, we propose MINM, a monitoring method using multiple scattering models and multiple spatial transformations. Specifically, inspired by image chromatic-aberration correction, we design a color-correction method in multiple color spaces: contrast-constrained adaptive histogram equalization is performed in the Lab color space and the gray-world assumption is applied in the RGB color space, respectively, to correct color shifts in different color models. On this basis, we propose a de-scattering method driven by a multi-scattering model, which removes the effect of scattering on underwater imaging by embedding a complete multi-scattering underwater imaging model to guide the extraction of different features. To obtain better results, we also propose an efficient perceptual fusion that mixes the outputs of de-scattering and color correction. Our method thus exploits multiple scattering models and multiple spatial transformations to effectively improve the visual quality of underwater images, producing enhanced results that fit the complete underwater imaging model and have bio-visual characteristics. In extensive experiments, MINM shows higher performance than state-of-the-art methods in both visual quality and quantitative metrics. All experimental results and datasets are available at https://github.com/An-Shunmin/MINM.
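The color-correction stage is concrete enough to sketch with OpenCV: adaptive histogram equalization on the lightness channel in Lab and a gray-world gain in RGB. MINM's perceptual fusion is learned, so the simple 50/50 blend and the input filename below are stand-in assumptions:

```python
import cv2
import numpy as np

def clahe_lab(bgr):
    # Equalize only the lightness channel in Lab space.
    lab = cv2.cvtColor(bgr, cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)
    l = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8)).apply(l)
    return cv2.cvtColor(cv2.merge([l, a, b]), cv2.COLOR_LAB2BGR)

def gray_world(bgr):
    # Gray-world assumption: rescale each channel toward the global mean.
    img = bgr.astype(np.float32)
    gains = img.mean() / (img.reshape(-1, 3).mean(axis=0) + 1e-6)
    return np.clip(img * gains, 0, 255).astype(np.uint8)

bgr = cv2.imread("underwater.png")  # hypothetical input frame
corrected = cv2.addWeighted(clahe_lab(bgr), 0.5, gray_world(bgr), 0.5, 0)
```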
Affiliation(s)
- Shunmin An: Shanghai Research Institute for Intelligent Autonomous Systems, Tongji University, Shanghai, China
- Linling Wang: Shanghai Engineering Research Center of Intelligent Maritime Search & Rescue and Underwater Vehicles, Shanghai Maritime University, Shanghai, China
- Le Wang: Institute of Logistics Science and Engineering, Shanghai Maritime University, Shanghai, China
10. Lin X, Huang X, Wang L. Underwater object detection method based on learnable query recall mechanism and lightweight adapter. PLoS One 2024; 19:e0298739. [PMID: 38416764] [PMCID: PMC10901356] [DOI: 10.1371/journal.pone.0298739]
Abstract
With the rapid development of ocean observation technology, underwater object detection has come to occupy an essential position in aquaculture, environmental monitoring, marine science, and related fields. However, due to problems unique to underwater images, such as severe noise, blurred objects, and large scale variation, deep learning-based detection algorithms lack sufficient capability to cope with these challenges. To address these issues, we adapt DETR to underwater scenarios. First, a simple and effective learnable query recall mechanism is proposed to mitigate the effect of noise, significantly improving detection performance. Second, for small and irregular underwater objects, a lightweight adapter is designed to provide multi-scale features to the encoding and decoding stages. Third, the bounding-box regression is optimized using a combined smooth-L1 and CIoU loss. Finally, we validate the designed network against other state-of-the-art methods on the RUOD dataset. The experimental results show that the proposed method is effective.
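The combined box-regression loss is straightforward to sketch; the 1:1 weighting of the two terms is an assumption, since the abstract does not give the mixing coefficients. torchvision's complete_box_iou_loss expects boxes in (x1, y1, x2, y2) format:

```python
import torch
import torch.nn.functional as F
from torchvision.ops import complete_box_iou_loss

def box_regression_loss(pred, target, alpha=1.0, beta=1.0):
    """Sketch of a combined smooth-L1 + CIoU box loss."""
    l1 = F.smooth_l1_loss(pred, target, reduction="mean")
    ciou = complete_box_iou_loss(pred, target, reduction="mean")
    return alpha * l1 + beta * ciou

pred = torch.tensor([[10., 10., 50., 60.]], requires_grad=True)
target = torch.tensor([[12., 8., 48., 62.]])
loss = box_regression_loss(pred, target)
loss.backward()  # both terms are differentiable w.r.t. the predicted box
```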
Affiliation(s)
- Xi Lin: Institute of Logistics Science and Engineering, Shanghai Maritime University, Shanghai, People's Republic of China
- Xixia Huang: Institute of Logistics Science and Engineering, Shanghai Maritime University, Shanghai, People's Republic of China
- Le Wang: Institute of Logistics Science and Engineering, Shanghai Maritime University, Shanghai, People's Republic of China
11. Li Y, Liao N, Li Y, Li H, Wu W. Color Conversion of Wide-Color-Gamut Cameras Using Optimal Training Groups. Sensors (Basel) 2023; 23:7186. [PMID: 37631723] [PMCID: PMC10460023] [DOI: 10.3390/s23167186]
Abstract
The colorimetric conversion of wide-color-gamut cameras plays an important role in wide-color-gamut displays. However, it is difficult to establish conversion models with the desired approximation accuracy over a wide color gamut. In this paper, we propose an optimal method for establishing color-conversion models that map the RGB space of a camera to the XYZ space of the CIEXYZ system. The method uses the Pearson correlation coefficient to evaluate the linear correlation between the RGB and XYZ values within a training group, so that a training group with optimal linear correlation can be obtained. Using this training group, the color-conversion models can be established and the desired conversion accuracy obtained over the whole color space. In the experiments, wide-color-gamut sample groups were designed and divided into different groups according to their hue angles and chromas in the CIE1976 L*a*b* space, with the Pearson correlation coefficient used to evaluate the linearity between the RGB and XYZ spaces. In particular, two kinds of color-conversion models, polynomial transforms with different numbers of terms and a backpropagation artificial neural network (BP-ANN), were trained and tested on the same sample groups. The experimental results show that the color-conversion errors (CIE1976 L*a*b* color difference) of the polynomial transforms can be reduced efficiently when the training groups are divided by hue angle.
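A compact sketch of the selection criterion: score each candidate training group by the Pearson correlation between its RGB and XYZ values, keep the most linear group, and fit a polynomial transform by least squares. The second-order term set and the hue-angle partition hinted at in the comments are assumptions:

```python
import numpy as np
from scipy.stats import pearsonr

def group_linearity(rgb, xyz):
    # Mean absolute Pearson r over all (RGB channel, XYZ channel) pairs.
    return np.mean([abs(pearsonr(rgb[:, i], xyz[:, j])[0])
                    for i in range(3) for j in range(3)])

def fit_polynomial_transform(rgb, xyz):
    # Second-order polynomial terms: [1, R, G, B, RG, RB, GB, R^2, G^2, B^2].
    r, g, b = rgb.T
    M = np.column_stack([np.ones_like(r), r, g, b,
                         r * g, r * b, g * b, r * r, g * g, b * b])
    coeffs, *_ = np.linalg.lstsq(M, xyz, rcond=None)
    return coeffs  # (10, 3): maps polynomial terms to X, Y, Z

# Pick the candidate group with the best linear correlation, then fit it.
# groups = [(rgb_1, xyz_1), (rgb_2, xyz_2), ...]  # e.g. partitioned by hue angle
# best_rgb, best_xyz = max(groups, key=lambda g: group_linearity(*g))
# A = fit_polynomial_transform(best_rgb, best_xyz)
```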
Affiliation(s)
- Yumei Li: State Key Discipline Laboratory of Color Science and Engineering, School of Optoelectronics, Beijing Institute of Technology, Beijing 100081, China
12. Mbani B, Buck V, Greinert J. An automated image-based workflow for detecting megabenthic fauna in optical images with examples from the Clarion-Clipperton Zone. Sci Rep 2023; 13:8350. [PMID: 37221273] [DOI: 10.1038/s41598-023-35518-5]
Abstract
Recent advances in optical underwater imaging technologies enable the acquisition of huge numbers of high-resolution seafloor images during scientific expeditions. While these images contain valuable information for non-invasive monitoring of megabenthic fauna, flora, and the marine ecosystem, traditional labor-intensive manual approaches for analyzing them are neither feasible nor scalable. Machine learning has therefore been proposed as a solution, but training the respective models still requires substantial manual annotation. Here, we present an automated image-based workflow for Megabenthic Fauna Detection with Faster R-CNN (FaunD-Fast). The workflow significantly reduces the required annotation effort by automating the detection of anomalous superpixels, i.e., regions in underwater images that have unusual properties relative to the background seafloor. The bounding-box coordinates of the detected anomalous superpixels are proposed as a set of weak annotations, which are then assigned semantic morphotype labels and used to train a Faster R-CNN object detection model. We applied this workflow to underwater images recorded during cruise SO268 to the German and Belgian contract areas for manganese-nodule exploration within the Clarion-Clipperton Zone (CCZ). A performance assessment of our FaunD-Fast model showed a mean average precision of 78.1% at an intersection-over-union threshold of 0.5, which is on a par with competing models that use costly-to-acquire annotations. In more detail, the analysis of the megafauna detection results revealed that ophiuroids and xenophyophores were among the most abundant morphotypes, accounting for 62% of all detections within the surveyed area. Investigating the regional differences between the two contract areas further revealed that both megafaunal abundance and diversity were higher in the shallower German area, which might be explained by the higher food availability in the form of sinking organic material, which decreases from east to west across the CCZ. Since these findings are consistent with studies based on conventional image-based methods, we conclude that our automated workflow significantly reduces the required human effort while still providing accurate estimates of megafaunal abundance and spatial distribution. The workflow is thus useful for quick but objective generation of baseline information to enable monitoring of remote benthic ecosystems.
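A heavily simplified sketch of the weak-annotation idea: segment the image into superpixels, score each by how far its appearance deviates from global background statistics, and convert anomalous ones into bounding-box proposals. The z-score criterion below is an assumption; the paper's actual anomaly detector is not specified in this abstract:

```python
import numpy as np
from skimage.segmentation import slic
from skimage.measure import regionprops

def anomalous_superpixel_boxes(image, n_segments=400, z_thresh=2.5):
    """image: float RGB array in [0, 1], shape (H, W, 3)."""
    labels = slic(image, n_segments=n_segments, compactness=10, start_label=1)
    n = labels.max()
    # Mean color per superpixel.
    feats = np.stack([image[labels == i].mean(axis=0) for i in range(1, n + 1)])
    # Z-score against the global (background-dominated) statistics.
    z = np.abs(feats - feats.mean(axis=0)) / (feats.std(axis=0) + 1e-8)
    anomalous = set(np.flatnonzero(z.max(axis=1) > z_thresh) + 1)
    # Bounding boxes of anomalous superpixels become weak annotations.
    return [p.bbox for p in regionprops(labels) if p.label in anomalous]

# boxes = anomalous_superpixel_boxes(img)
# -> [(min_row, min_col, max_row, max_col), ...]
```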
Affiliation(s)
- Benson Mbani: DeepSea Monitoring Group, GEOMAR Helmholtz Center for Ocean Research Kiel, Wischhofstraße 1-3, 24148 Kiel, Germany
- Valentin Buck: DeepSea Monitoring Group, GEOMAR Helmholtz Center for Ocean Research Kiel, Wischhofstraße 1-3, 24148 Kiel, Germany
- Jens Greinert: DeepSea Monitoring Group, GEOMAR Helmholtz Center for Ocean Research Kiel, Wischhofstraße 1-3, 24148 Kiel, Germany; Institute of Geosciences, Kiel University, Ludewig-Meyn-Str. 10-12, 24118 Kiel, Germany
13. Li Y, Li Y, Liao N, Li H, Lv N, Wu W. Colorimetric characterization of the wide-color-gamut camera using the multilayer artificial neural network. Journal of the Optical Society of America A: Optics, Image Science, and Vision 2023; 40:629-636. [PMID: 37133047] [DOI: 10.1364/josaa.481547]
Abstract
To realize colorimetric characterization of a wide-color-gamut camera, we propose using a multilayer artificial neural network (ML-ANN) with the error-backpropagation algorithm to model the color conversion from the camera's RGB space to the XYZ space of the CIEXYZ standard. In this paper, the architecture, forward-calculation model, error-backpropagation model, and training policy of the ML-ANN are introduced. Based on the spectral reflectance curves of the ColorChecker-SG blocks and the spectral sensitivity functions of the RGB channels of typical color cameras, a method for producing wide-color-gamut samples for training and testing the ML-ANN is proposed. A comparative experiment employing polynomial transforms with different numbers of terms and the least-squares method was also conducted. The experimental results show that, as the number of hidden layers and of neurons per hidden layer increases, the training and testing errors decrease markedly. The mean training and testing errors of the ML-ANN with optimal hidden layers fall to 0.69 and 0.84 (CIELAB color difference), respectively, which is much better than all the polynomial transforms, including the quartic polynomial transform.
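A compact sketch of the ML-ANN idea using scikit-learn; the layer sizes, solver, and the random stand-in training data are illustrative assumptions rather than the paper's configuration:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# rgb, xyz: training samples would come from spectral reflectances and the
# camera's spectral sensitivities; random stand-ins are used here.
rng = np.random.default_rng(0)
rgb = rng.uniform(size=(2000, 3))
xyz = rng.uniform(size=(2000, 3))

# Multilayer network trained with backpropagation (Adam variant here).
net = MLPRegressor(hidden_layer_sizes=(64, 64, 64), activation="relu",
                   max_iter=2000, random_state=0)
net.fit(rgb, xyz)
xyz_pred = net.predict(rgb[:5])  # estimated CIEXYZ values for 5 RGB inputs
```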
14. Liu K, Peng L, Tang S. Underwater Object Detection Using TC-YOLO with Attention Mechanisms. Sensors (Basel) 2023; 23:2567. [PMID: 36904769] [PMCID: PMC10007230] [DOI: 10.3390/s23052567]
Abstract
Underwater object detection is a key technology in the development of intelligent underwater vehicles. It faces unique challenges in underwater applications: blurry underwater images, small and dense targets, and the limited computational capacity of deployed platforms. To improve performance, we propose a new approach that combines a new detection network called TC-YOLO, image enhancement using an adaptive histogram equalization algorithm, and an optimal transport scheme for label assignment. TC-YOLO was developed from YOLOv5s, with transformer self-attention adopted in the backbone and coordinate attention in the neck to enhance feature extraction for underwater objects. Optimal transport label assignment significantly reduces the number of fuzzy boxes and improves the utilization of training data. Tests on the RUIE2020 dataset and ablation experiments demonstrate that the proposed approach outperforms the original YOLOv5s and similar networks for underwater object detection, while the size and computational cost of the model remain small enough for underwater mobile applications.
15. Jeyaraj PR, Nadar ERS. Medical image annotation and classification employing pyramidal feature specific lightweight deep convolution neural network. Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization 2023. [DOI: 10.1080/21681163.2023.2179341]
Affiliation(s)
- Pandia Rajan Jeyaraj: Department of Electrical and Electronics Engineering, Mepco Schlenk Engineering College, Sivakasi, India
- Edward Rajan Samuel Nadar: Department of Electrical and Electronics Engineering, Mepco Schlenk Engineering College, Sivakasi, India
16. Er MJ, Chen J, Zhang Y, Gao W. Research Challenges, Recent Advances, and Popular Datasets in Deep Learning-Based Underwater Marine Object Detection: A Review. Sensors (Basel) 2023; 23:1990. [PMID: 36850584] [PMCID: PMC9966468] [DOI: 10.3390/s23041990]
Abstract
Underwater marine object detection, one of the most fundamental techniques in the marine science and engineering community, has shown tremendous potential for exploring the oceans in recent years. It is widely applied in practice, for example in monitoring underwater ecosystems, exploring natural resources, and managing commercial fisheries. However, due to the complexity of the underwater environment, the characteristics of marine objects, and the limitations of exploration equipment, detection performance in terms of speed, accuracy, and robustness can degrade dramatically when conventional approaches are used. Deep learning has had a significant impact on a variety of applications, including marine engineering, and in this context we offer a review of deep learning-based underwater marine object detection techniques. Underwater object detection can be performed with different sensors, such as acoustic sonar or optical cameras; this paper focuses on vision-based detection because of its several significant advantages. To facilitate a thorough understanding of the subject, we organize the research challenges of vision-based underwater object detection into four categories: image quality degradation, small object detection, poor generalization, and real-time detection. We review recent advances in underwater marine object detection and highlight the advantages and disadvantages of existing solutions for each challenge. We also provide a detailed critical examination of the most extensively used datasets. Finally, we present comparative studies with previous reviews, notably those that leverage artificial intelligence, as well as future trends related to this hot topic.
17. Boosting R-CNN: Reweighting R-CNN Samples by RPN’s Error for Underwater Object Detection. Neurocomputing 2023. [DOI: 10.1016/j.neucom.2023.01.088]
18. Joo Er M, Chen J, Zhang Y. Marine Robotics 4.0: Present and Future of Real-Time Detection Techniques for Underwater Objects. Artif Intell 2022. [DOI: 10.5772/intechopen.107409]
Abstract
Underwater marine robots (UMRs), such as autonomous underwater vehicles, are promising platforms for performing exploration tasks in the sea. These vehicles can explore the underwater environment with onboard instruments and sensors and are extensively used in civilian applications, scientific studies, and military missions. In recent years, the flourishing growth of deep learning has fueled tremendous theoretical breakthroughs and practical applications of computer-vision-based underwater object detection techniques. With deep-learning-based detection integrated on board, the perception of underwater marine robots is expected to improve greatly, and underwater object detection will play a key role in Marine Robotics 4.0, i.e., Industry 4.0 for marine robots. This chapter reviews one of the key research challenges, real-time detection of underwater objects, which has so far prevented many real-world applications of detection techniques onboard UMRs. State-of-the-art techniques for real-time detection of underwater objects are critically analyzed, and futuristic trends in these techniques are discussed.
19. Jia S, Jiang S, Zhang S, Xu M, Jia X. Graph-in-Graph Convolutional Network for Hyperspectral Image Classification. IEEE Transactions on Neural Networks and Learning Systems 2022; PP:1157-1171. [PMID: 35724277] [DOI: 10.1109/tnnls.2022.3182715]
Abstract
With the development of hyperspectral sensors, accessible hyperspectral images (HSIs) are increasing, and pixel-oriented classification has attracted much attention. Recently, graph convolutional networks (GCNs) have been proposed to process graph-structured data in non-Euclidean domains and have been employed in HSI classification, but most GCN-based methods struggle to sufficiently exploit the information of ground objects because of feature aggregation. To solve this issue, we propose a graph-in-graph (GiG) model and a related graph-in-graph convolutional network (GiGCN) for HSI classification from a superpixel viewpoint. The GiG representation covers information inside and outside superpixels, corresponding respectively to the local and global characteristics of ground objects. Concretely, after segmenting an HSI into disjoint superpixels, each one is converted into an internal graph, and an external graph is constructed according to the spatial adjacency among superpixels. Significantly, each node in the external graph embeds a corresponding internal graph, forming the so-called GiG structure. GiGCN, composed of internal and external graph convolution (EGC), is then designed to extract hierarchical features and integrate them over multiple scales, improving its discriminability, and ensemble learning is incorporated to further boost robustness. To the best of our knowledge, we are the first to propose the GiG framework from the superpixel viewpoint and the GiGCN scheme for HSI classification. Experimental results on four benchmark datasets demonstrate that the proposed method is effective and feasible for HSI classification with limited labeled samples. For study replication, the code developed for this study is available at https://github.com/ShuGuoJ/GiGCN.git.
20. Automatically Guided Selection of a Set of Underwater Calibration Images. Journal of Marine Science and Engineering 2022. [DOI: 10.3390/jmse10060741]
Abstract
The 3D reconstruction of underwater scenes from overlapping images requires modeling the sensor. While underwater self-calibration gives good results when coupled with multi-view algorithms, calibration or pre-calibration with a pattern is still necessary when scenes are weakly textured or when there are not enough viewpoints of the same points. However, detecting patterns in underwater images, or obtaining a good distribution of these patterns across a dataset, is not an easy task. We therefore propose a methodology to guide the acquisition of a relevant underwater calibration dataset. The process provides near-real-time feedback to the operator, guiding the acquisition and stopping it once a sufficient number of relevant calibration images has been reached. To achieve this, pattern detection must be optimized in both time and success rate; we propose three variations of optimized detection algorithms, each taking different hardware capabilities into account. We present results on a homemade database of 60,000 images taken both in pools and at sea.
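The feedback loop is easy to sketch with OpenCV: detect the calibration pattern in each frame, track how well the detections cover the image plane, and stop once coverage is sufficient. The chessboard size, bin grid, and stopping rule below are assumptions:

```python
import cv2
import numpy as np

PATTERN = (9, 6)           # inner-corner grid of the chessboard (assumed)
BINS = 4                   # coverage grid over the image plane
coverage = np.zeros((BINS, BINS), dtype=int)

def process_frame(frame):
    """Return True when the dataset is good enough to stop acquiring."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    found, corners = cv2.findChessboardCorners(
        gray, PATTERN,
        flags=cv2.CALIB_CB_ADAPTIVE_THRESH | cv2.CALIB_CB_FAST_CHECK)
    if found:
        corners = cv2.cornerSubPix(
            gray, corners, (11, 11), (-1, -1),
            (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3))
        cx, cy = corners.mean(axis=0).ravel()   # pattern centroid (x, y)
        h, w = gray.shape
        coverage[min(int(cy / h * BINS), BINS - 1),
                 min(int(cx / w * BINS), BINS - 1)] += 1
    # Stop once every region of the image plane has enough detections.
    return bool((coverage >= 3).all())
```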
21. Somasunder S, Shih FY. Land Cover Image Segmentation Based on Individual Class Binary Masks. International Journal of Pattern Recognition and Artificial Intelligence 2021. [DOI: 10.1142/s0218001421540343]
Abstract
Remote sensing techniques have been developed over the past decades to acquire data without being in contact with the target object or data source, and their application to land-cover image segmentation has attracted significant attention in recent years. With the help of satellites, scientists and researchers can collect and store high-resolution image data that can be further processed, segmented, and classified. However, these research results have not yet been synthesized to provide coherent guidance on the effect of different land-cover segmentation processes. In this paper, we present a novel model that augments segmentation using smaller networks to segment individual classes. The combined network is trained on the same data, with the individual class masks combined and trained using categorical cross-entropy. Experimental results show that the proposed method produces the highest mean IoU (Intersection over Union) compared with several existing state-of-the-art models on the DeepGlobe dataset.
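A minimal sketch of the combination step: stack single-channel mask logits from per-class subnetworks into a multi-class map and train it with categorical cross-entropy. The tiny stand-in subnets are hypothetical; the paper's smaller networks are not specified here:

```python
import torch
import torch.nn as nn

class CombinedFromBinary(nn.Module):
    """Sketch: one small network per class emits a single-channel mask
    logit; stacking them yields a multi-class map for cross-entropy."""

    def __init__(self, class_nets):
        super().__init__()
        self.class_nets = nn.ModuleList(class_nets)  # each: (B,3,H,W) -> (B,1,H,W)

    def forward(self, x):
        return torch.cat([net(x) for net in self.class_nets], dim=1)  # (B,C,H,W)

# Hypothetical per-class subnets (stand-ins for the paper's smaller networks).
nets = [nn.Conv2d(3, 1, 3, padding=1) for _ in range(6)]
model = CombinedFromBinary(nets)
x = torch.randn(2, 3, 64, 64)
target = torch.randint(0, 6, (2, 64, 64))       # combined class-index mask
loss = nn.CrossEntropyLoss()(model(x), target)  # categorical cross-entropy
loss.backward()
```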
Affiliation(s)
- Frank Y. Shih: Department of Computer Science, New Jersey Institute of Technology, Newark, NJ 07102, USA; Department of Computer Science and Information Engineering, Asia University, Taichung, Taiwan