1. Shi Y, Jia Y, Zhang X. FocusDet: an efficient object detector for small object. Sci Rep 2024;14:10697. [PMID: 38730236] [DOI: 10.1038/s41598-024-61136-w]
Abstract
The scale of objects in small-object scenes varies greatly, and the objects are easily disturbed by complex backgrounds, so generic object detectors perform poorly on small object detection tasks. This paper focuses on small object detection and proposes a detector named FocusDet, which consists of three parts: a backbone, a feature-fusion structure, and a detection head. The STCF-EANet backbone performs feature extraction, maintaining sufficient global context information while extracting multi-scale features. For feature fusion, whereas generic detectors use PAN to fuse the extracted feature maps and supplement feature information, FocusDet uses Bottom Focus-PAN to capture a wider range of positions and lower-level feature information of small objects. The detection head handles object localization and recognition. For the post-processing stage, SIOU-SoftNMS is proposed to remove redundant prediction boxes: SIoU locates prediction boxes accurately across multiple dimensions, and SoftNMS removes redundant boxes with a Gaussian weighting; together they address the missed detections common among dense tiny objects. Using the VisDrone2021-DET and CCTSDB2021 object detection datasets as benchmarks, tests on the VisDrone2021-DET-test-dev and CCTSDB-val sets show that FocusDet raises mAP@0.5 from 33.6% to 46.7% on VisDrone and from 81.6% to 87.8% on CCTSDB2021, demonstrating strong small object detection performance.
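For readers unfamiliar with the Gaussian variant of Soft-NMS that SIOU-SoftNMS builds on, here is a minimal NumPy sketch. It uses plain IoU in the decay term, whereas the paper's SIoU additionally accounts for angle, distance, and shape costs; the `sigma` and threshold values are illustrative assumptions, not the authors' settings.

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box and an array of boxes, format [x1, y1, x2, y2]."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_a + area_b - inter + 1e-9)

def soft_nms_gaussian(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS: decay overlapping scores instead of discarding boxes."""
    scores = scores.copy()
    keep = []
    idxs = np.arange(len(scores))
    while idxs.size > 0:
        top = idxs[np.argmax(scores[idxs])]
        keep.append(top)
        idxs = idxs[idxs != top]
        if idxs.size == 0:
            break
        overlaps = iou(boxes[top], boxes[idxs])
        # Soft suppression: heavily overlapped boxes are down-weighted, not
        # removed outright, which reduces missed detections in dense scenes.
        scores[idxs] *= np.exp(-(overlaps ** 2) / sigma)
        idxs = idxs[scores[idxs] > score_thresh]
    return keep
```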
Affiliation(s)
- Yanli Shi
- College of Information and Control Engineering, Jilin Institute of Chemical Technology, Jilin 132000, China
- Yi Jia
- College of Information and Control Engineering, Jilin Institute of Chemical Technology, Jilin 132000, China
- Xianhe Zhang
- College of Information and Control Engineering, Jilin Institute of Chemical Technology, Jilin 132000, China
2. Zhang L, Qin L, Xu M, Chen W, Pu S, Zhang W. Randomized Spectrum Transformations for Adapting Object Detector in Unseen Domains. IEEE Trans Image Process 2023;32:4868-4879. [PMID: 37616139] [DOI: 10.1109/tip.2023.3306915]
Abstract
We propose Meta Learning on Randomized Transformations (MLRT) to learn domain-invariant object detectors. Domain generalization is the problem of learning, from multiple source domains, an invariant model that generalizes well to unseen target domains. This problem has been largely overlooked in object detection, where it is formally termed domain-generalizable object detection (DGOD). Moreover, existing domain generalization methods suffer from domain bias, easily overfitting to a specific (e.g., source) domain. To alleviate this bias, MLRT introduces a novel randomized spectrum transformation (RST) module that increases the diversity of source domains. Specifically, RST randomizes the domain-specific information of images in frequency space, transforming single or multiple source domains into various new domains. Besides, we observe that the degree of gradient imbalance among domains also reflects domain bias. We therefore further alleviate the bias from the perspective of gradient balancing, proposing a novel gradient weighting (GW) module that balances the gradients over all domains via a hand-crafted weight. Finally, we embed RST and GW into a general meta-learning framework, formalizing the MLRT model for the DGOD task. Extensive experiments on six benchmarks show that our method achieves state-of-the-art performance.
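The abstract does not spell out RST's exact form, but randomizing domain-specific information in frequency space commonly follows the Fourier recipe of perturbing amplitude spectra while keeping phase. The sketch below illustrates that general idea under those assumptions; the function name and mixing coefficient are hypothetical, not taken from the paper.

```python
import numpy as np

def spectrum_randomize(img_src, img_ref, max_lam=0.5):
    """Mix amplitude spectra of two images while keeping the source phase."""
    fft_src = np.fft.fft2(img_src, axes=(0, 1))
    fft_ref = np.fft.fft2(img_ref, axes=(0, 1))
    amp_src, phase_src = np.abs(fft_src), np.angle(fft_src)
    amp_ref = np.abs(fft_ref)
    # Amplitude mostly carries style/domain statistics while phase carries
    # structure, so interpolating amplitudes yields a "new domain" for the
    # same content.
    lam = np.random.uniform(0.0, max_lam)
    amp_mix = (1 - lam) * amp_src + lam * amp_ref
    mixed = np.fft.ifft2(amp_mix * np.exp(1j * phase_src), axes=(0, 1))
    return np.real(mixed).clip(0.0, 1.0)  # img_* assumed float (H, W, C) in [0, 1]
```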
3. Shang R, Li W, Zhu S, Jiao L, Li Y. Multi-teacher knowledge distillation based on joint Guidance of Probe and Adaptive Corrector. Neural Netw 2023;164:345-356. [PMID: 37163850] [DOI: 10.1016/j.neunet.2023.04.015]
Abstract
Knowledge distillation (KD) has been widely used in model compression. However, in current multi-teacher KD algorithms, the student can only passively acquire the knowledge of the teachers' middle layers in a single form, and all teachers apply an identical guiding scheme to the student. To solve these problems, this paper proposes a multi-teacher KD method based on the joint Guidance of a Probe and an Adaptive Corrector (GPAC). First, GPAC proposes a teacher-selection strategy guided by a Linear Classifier Probe (LCP), which lets the student select better teachers at the middle layer; teachers are evaluated by the classification accuracy measured with the LCP. Then, GPAC designs an adaptive multi-teacher instruction mechanism that uses instructional weights to emphasize the student's predicted direction and reduce the difficulty of learning from the teachers; at the same time, each teacher can formulate its own guiding scheme according to the Kullback-Leibler divergence loss between the student and itself. Finally, GPAC develops a multi-level mechanism for adjusting the spatial attention loss: a piecewise function of the training epoch classifies the student's learning of spatial attention into three levels, which efficiently exploits the teachers' spatial attention. GPAC and current state-of-the-art distillation methods are tested on the CIFAR-10 and CIFAR-100 datasets, and the experimental results demonstrate that the proposed method obtains higher classification accuracy.
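To make the Kullback-Leibler term concrete, here is a minimal PyTorch sketch of temperature-scaled distillation with a per-teacher KL loss. The adaptive weighting shown is only one plausible reading of the abstract's "instructional weights" (teachers closer to the student guide it more strongly); the real GPAC scheme may differ.

```python
import torch
import torch.nn.functional as F

def kd_kl_loss(student_logits, teacher_logits, T=4.0):
    """Temperature-scaled KL divergence between teacher and student outputs."""
    p_teacher = F.softmax(teacher_logits / T, dim=1)
    log_p_student = F.log_softmax(student_logits / T, dim=1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T * T)

def multi_teacher_loss(student_logits, teacher_logits_list):
    # Hypothetical weighting: teachers whose outputs are closer to the student's
    # current predictions (smaller KL) get larger weights, easing learning.
    losses = torch.stack([kd_kl_loss(student_logits, t) for t in teacher_logits_list])
    weights = F.softmax(-losses.detach(), dim=0)
    return (weights * losses).sum()
```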
Affiliation(s)
- Ronghua Shang
- Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education, School of Artificial Intelligence, Xidian University, Xi'an, Shaanxi, China
- Wenzheng Li
- Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education, Guangzhou Institute of Technology, Xidian University, Guangzhou, Guangdong, China
- Songling Zhu
- Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education, School of Artificial Intelligence, Xidian University, Xi'an, Shaanxi, China
- Licheng Jiao
- Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education, School of Artificial Intelligence, Xidian University, Xi'an, Shaanxi, China
- Yangyang Li
- Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education, School of Artificial Intelligence, Xidian University, Xi'an, Shaanxi, China
4. Lv P, Hu S, Hao T. Contrastive Proposal Extension With LSTM Network for Weakly Supervised Object Detection. IEEE Trans Image Process 2022;31:6879-6892. [PMID: 36306305] [DOI: 10.1109/tip.2022.3216772]
Abstract
Weakly supervised object detection (WSOD) has attracted increasing attention since it uses only image-level labels and thus saves huge annotation costs. Most WSOD methods use Multiple Instance Learning (MIL) as their basic framework, treating detection as an instance classification problem. However, MIL-based methods tend to converge only on the most discriminative regions of instances rather than their complete extents; that is, they suffer from insufficient integrity. Inspired by how humans observe things, we propose to optimize initial proposals by comparing them with extended ones. Specifically, we introduce a contrastive proposal extension (CPE) strategy for WSOD, consisting of multiple directional contrastive proposal extensions (D-CPEs), each containing LSTM-based encoders and dual-stream decoders. First, the boundary of each initial MIL proposal is extended to different positions in a well-designed sequential order. The CPE then compares the extended and initial proposals by extracting their feature semantics with the encoders, and computes the integrity of the initial proposal to refine its score. These contrastive contextual semantics guide the base WSOD model to suppress poor proposals and raise the scores of good ones. In addition, a simple dual-stream network is designed as the decoder to constrain the temporal coding of the LSTM and further improve WSOD performance. Experiments on the PASCAL VOC 2007, VOC 2012 and MS-COCO datasets show that our method achieves state-of-the-art results.
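As a toy illustration of the directional extension step (the LSTM encoders and dual-stream decoders are beyond a short sketch), the helper below grows one side of a proposal box; the extension ratio is an assumed value, not the paper's.

```python
def extend_proposal(box, direction, ratio=0.2):
    """Grow one side of a [x1, y1, x2, y2] box by a fraction of its size."""
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    if direction == "left":
        x1 -= ratio * w
    elif direction == "right":
        x2 += ratio * w
    elif direction == "up":
        y1 -= ratio * h
    elif direction == "down":
        y2 += ratio * h
    # Comparing features of the extended box against the original reveals
    # whether the proposal already covers the full object (high integrity).
    return [x1, y1, x2, y2]
```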
5. Coupled Global–Local object detection for large VHR aerial images. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.110097]
6. Zhao W, Rao Y, Tang Y, Zhou J, Lu J. VideoABC: A Real-World Video Dataset for Abductive Visual Reasoning. IEEE Trans Image Process 2022;31:6048-6061. [PMID: 36103440] [DOI: 10.1109/tip.2022.3205207]
Abstract
In this paper, we investigate the problem of abductive visual reasoning (AVR), which requires vision systems to infer the most plausible explanation for visual observations. Unlike previous work that performs visual reasoning on static images or synthesized scenes, we exploit long-term reasoning over instructional videos, which contain a wealth of detailed information about the physical world. We conceptualize two tasks for this emerging and challenging topic. The primary task is AVR: given the initial configuration and desired goal from an instructional video, the model must figure out the most plausible sequence of steps to achieve the goal. To rule out trivial solutions based on appearance rather than reasoning, we construct a second task, AVR++, which requires the model to explain why the unselected options are less plausible. We introduce a new dataset, VideoABC, consisting of 46,354 unique steps derived from 11,827 instructional videos and formulated as 13,526 abductive reasoning questions with an average reasoning duration of 51 seconds. Through an adversarial hard-hypothesis-mining algorithm, non-trivial, high-quality problems are generated efficiently and effectively. Toward human-level reasoning, we propose a Hierarchical Dual Reasoning Network (HDRNet) to capture the long-term dependencies among steps and observations. We establish a benchmark for abductive visual reasoning; our method sets the state of the art on AVR (~74%) and AVR++ (~45%), whereas humans easily achieve over 90% accuracy on both tasks. This large performance gap reveals the limitations of current video understanding models in temporal reasoning and leaves substantial room for future research on this challenging problem. Our dataset and code are available at https://github.com/wl-zhao/VideoABC.
7. Xu W, Zhang C, Wang Q, Dai P. FEA-Swin: Foreground Enhancement Attention Swin Transformer Network for Accurate UAV-Based Dense Object Detection. Sensors (Basel) 2022;22:6993. [PMID: 36146340] [PMCID: PMC9502707] [DOI: 10.3390/s22186993]
Abstract
UAV-based object detection has recently attracted a lot of attention due to its diverse applications. Most existing convolutional-neural-network object detection models perform well on common detection cases, but because objects in UAV images are spatially distributed in a very dense manner, their performance on UAV-based detection is limited. In this paper, we propose a novel transformer-based detection model to improve the accuracy of object detection in UAV images. To detect dense objects competently, a foreground enhancement attention Swin Transformer (FEA-Swin) framework is designed by integrating context information into the original Swin Transformer backbone. Moreover, to avoid losing information about small objects, an improved weighted bidirectional feature pyramid network (BiFPN) is presented by designing a skip-connection operation; it aggregates feature maps from four stages and retains abundant small-object information. Specifically, to balance detection accuracy and efficiency, we introduce an efficient BiFPN neck by removing a redundant network layer. Experimental results on both public datasets and a self-made dataset demonstrate the performance of our method against state-of-the-art methods in terms of detection accuracy.
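To illustrate the BiFPN-style fusion the abstract builds on, here is a hedged PyTorch sketch of a single weighted-fusion node using BiFPN's fast normalized fusion; the class name and channel handling are illustrative, and the paper's specific skip connection and pruned layer are not reproduced.

```python
import torch
import torch.nn as nn

class WeightedFusion(nn.Module):
    """One BiFPN-style node: fast normalized fusion of same-shape feature maps."""
    def __init__(self, n_inputs, channels):
        super().__init__()
        self.w = nn.Parameter(torch.ones(n_inputs))  # one learnable weight per input
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)
        self.act = nn.SiLU()

    def forward(self, feats):
        # feats: list of tensors with identical shape (B, C, H, W); a skip
        # connection is realized simply by appending an extra stage's map here.
        w = torch.relu(self.w)
        w = w / (w.sum() + 1e-4)  # fast normalized fusion (cheaper than softmax)
        fused = sum(wi * f for wi, f in zip(w, feats))
        return self.conv(self.act(fused))
```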
Affiliation(s)
- Wenyu Xu
- Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei 230031, China
- Science Island Branch of Graduate School, University of Science and Technology of China, Hefei 230026, China
- Chaofan Zhang
- Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei 230031, China
- Qi Wang
- Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei 230031, China
- Science Island Branch of Graduate School, University of Science and Technology of China, Hefei 230026, China
- Pangda Dai
- Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei 230031, China
8. A Novel Multi-Scale Transformer for Object Detection in Aerial Scenes. Drones 2022. [DOI: 10.3390/drones6080188]
Abstract
Deep learning has advanced object detection research in aerial scenes. However, most existing networks are limited by the large scale variation of objects and the confusion of category features. To overcome these limitations, this paper proposes a novel aerial object detection framework called DFCformer, mainly composed of three parts: the backbone network DMViT, which introduces deformation patch embedding and multi-scale adaptive self-attention to capture sufficient object features; FRGC, which guides feature interaction layer by layer to break the barriers between feature layers and improve the discrimination and processing of multi-scale critical features; and CAIM, which adopts an attention mechanism to fuse multi-scale features, performing hierarchical reasoning about the relationships between levels and fully exploiting the complementary information in multi-scale features. Extensive experiments on the FAIR1M dataset show the advantages of DFCformer, which achieves the highest scores with stronger scene adaptability.
9. Lin J, Zheng Z, Zhong Z, Luo Z, Li S, Yang Y, Sebe N. Joint Representation Learning and Keypoint Detection for Cross-View Geo-Localization. IEEE Trans Image Process 2022;31:3780-3792. [PMID: 35604972] [DOI: 10.1109/tip.2022.3175601]
Abstract
In this paper, we study the cross-view geo-localization problem of matching images from different viewpoints. The key motivation underpinning this task is to learn a discriminative, viewpoint-invariant visual representation. Inspired by the human visual system's habit of mining local patterns, we propose a new framework called RK-Net that jointly learns the discriminative Representation and detects salient Keypoints with a single Network. Specifically, we introduce a Unit Subtraction Attention Module (USAM) that automatically discovers representative keypoints from feature maps and draws attention to salient regions. USAM contains very few learnable parameters yet yields significant performance improvement, and it can easily be plugged into different networks. We demonstrate through extensive experiments that (1) by incorporating USAM, RK-Net facilitates end-to-end joint learning without requiring extra annotations. Representation learning and keypoint detection are two highly related tasks: representation learning aids keypoint detection, while keypoint detection, in turn, makes the model robust to the large appearance changes caused by viewpoint variation. (2) USAM is easy to implement and can be integrated with existing methods, further improving state-of-the-art performance. We achieve competitive geo-localization accuracy on three challenging datasets, i.e., University-1652, CVUSA and CVACT. Our code is available at https://github.com/AggMan96/RK-Net.
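The abstract does not detail USAM's internals, but a subtraction-based spatial attention can be sketched as follows: a unit-shifted copy of the feature map is subtracted to produce an edge-like response that gates the features. This is an assumption-laden approximation in the spirit of the module, not RK-Net's exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class UnitSubtractAttention(nn.Module):
    """Subtraction-based spatial attention with very few learnable parameters."""
    def __init__(self):
        super().__init__()
        self.proj = nn.Conv2d(1, 1, kernel_size=1)  # single 1x1 conv

    def forward(self, x):
        # Subtract a one-pixel-shifted copy: large responses mark sharp local
        # changes, which is where salient keypoints tend to sit.
        shifted = F.pad(x, (1, 0, 1, 0))[:, :, : x.size(2), : x.size(3)]
        response = (x - shifted).abs().mean(dim=1, keepdim=True)
        attn = torch.sigmoid(self.proj(response))
        return x + x * attn  # residual form keeps the original features intact
```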
10. Adaptive dense pyramid network for object detection in UAV imagery. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.03.033]
11. Lightweight Detection Network Based on Sub-Pixel Convolution and Objectness-Aware Structure for UAV Images. Sensors (Basel) 2021;21:5656. [PMID: 34451098] [PMCID: PMC8402490] [DOI: 10.3390/s21165656]
Abstract
Unmanned Aerial Vehicles (UAVs) serve as an ideal mobile platform in various situations, and real-time object detection with on-board apparatus gives drones greater flexibility and a higher level of intelligence. To achieve good detection results on UAV images with complex ground scenes, small object sizes and high object density, most previous work introduced models with heavy computational burdens, making deployment on mobile platforms difficult. This paper puts forward a lightweight object detection framework. Besides being anchor-free, the framework combines a lightweight backbone with a simultaneous up-sampling and detection module to form a more efficient detection architecture. Meanwhile, we add an objectness branch to assist the multi-class center-point prediction, which notably improves detection accuracy while consuming very few computing resources. Experimental results indicate that the computational cost of our method is 92.78% lower than CenterNet with a ResNet18 backbone, while mAP is 2.8 points higher on the Visdrone-2018-VID dataset, and a frame rate of about 220 FPS is achieved. Additionally, we perform ablation experiments to verify the validity of each component, and we compare the proposed method with other representative lightweight object detection methods on UAV image datasets.
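A minimal PyTorch sketch of the two ideas named in the title, sub-pixel (PixelShuffle) up-sampling and an objectness-aware head, is given below; the layer sizes and the way objectness gates the class heatmap are illustrative assumptions rather than the paper's exact head.

```python
import torch.nn as nn

class SubPixelHead(nn.Module):
    """Sub-pixel up-sampling followed by objectness-gated class heatmaps."""
    def __init__(self, in_ch, num_classes, scale=2):
        super().__init__()
        # Conv expands channels by scale^2; PixelShuffle rearranges them into a
        # (scale x) larger map, a cheap alternative to transposed convolution.
        self.up = nn.Sequential(
            nn.Conv2d(in_ch, in_ch * scale ** 2, 3, padding=1),
            nn.PixelShuffle(scale),
        )
        self.cls = nn.Conv2d(in_ch, num_classes, 1)  # per-class center heatmap
        self.obj = nn.Conv2d(in_ch, 1, 1)            # class-agnostic objectness

    def forward(self, x):
        x = self.up(x)
        # The objectness branch gates the class heatmap, one simple way an
        # auxiliary objectness signal can sharpen center-point predictions.
        return self.cls(x).sigmoid() * self.obj(x).sigmoid()
```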