1. Wang J, Wang B, Gao J, Pan S, Liu T, Yin B, Gao W. MADE: Multicurvature Adaptive Embedding for Temporal Knowledge Graph Completion. IEEE Trans Cybern 2024; PP:1-14. PMID: 38771679. DOI: 10.1109/tcyb.2024.3392957.
Abstract
Temporal knowledge graphs (TKGs) are receiving increased attention due to their time-dependent properties and the evolving nature of knowledge over time. TKGs typically contain complex geometric structures, such as hierarchical, ring, and chain structures, which are often mixed together. However, embedding TKGs into Euclidean space, as TKG completion (TKGC) models typically do, presents a challenge when dealing with high-dimensional nonlinear data and complex geometric structures. To address this issue, we propose a novel TKGC model called multicurvature adaptive embedding (MADE). MADE models TKGs in multicurvature spaces, including flat Euclidean space (zero curvature), hyperbolic space (negative curvature), and hyperspherical space (positive curvature), to handle multiple geometric structures. We assign different weights to the curvature spaces in a data-driven manner, strengthening the spaces best suited to modeling and weakening the inappropriate ones. Additionally, we introduce the quadruplet distributor (QD) to assist information interaction in each geometric space. Finally, we develop a novel temporal regularization that enhances the smoothness of timestamp embeddings by strengthening the correlation of neighboring timestamps. Experimental results show that MADE outperforms the existing state-of-the-art TKGC models.
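The data-driven weighting over curvature spaces can be illustrated with a toy sketch. This is a minimal illustration only: it assumes unit-curvature Poincaré-ball and hyperspherical distances and a naive rescaling of points into the ball, and all function names and the weighted-sum scoring form are hypothetical, not the paper's implementation.

```python
import numpy as np

def euclidean_dist(u, v):
    # Zero-curvature distance: ordinary Euclidean norm.
    return np.linalg.norm(u - v)

def hyperbolic_dist(u, v, eps=1e-7):
    # Poincare-ball distance (curvature -1); inputs must have norm < 1.
    num = 2 * np.sum((u - v) ** 2)
    den = (1 - np.sum(u ** 2)) * (1 - np.sum(v ** 2)) + eps
    return np.arccosh(1 + num / den)

def spherical_dist(u, v):
    # Geodesic distance on the unit hypersphere (curvature +1).
    cos = np.clip(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)), -1.0, 1.0)
    return np.arccos(cos)

def multicurvature_score(u, v, logits):
    # Softmax turns learnable logits into weights over the three spaces,
    # strengthening whichever space fits the local geometry best.
    w = np.exp(logits) / np.exp(logits).sum()
    d = np.array([
        euclidean_dist(u, v),
        # naive projection into the ball: rescale to norm 0.5
        hyperbolic_dist(0.5 * u / np.linalg.norm(u), 0.5 * v / np.linalg.norm(v)),
        spherical_dist(u, v),
    ])
    return float(w @ d)
```

In MADE the weights would be learned end to end; here `logits` is just a free parameter vector.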
2. Liu T, Hu Y, Gao J, Wang J, Sun Y, Yin B. Multi-modal long document classification based on Hierarchical Prompt and Multi-modal Transformer. Neural Netw 2024; 176:106322. PMID: 38653128. DOI: 10.1016/j.neunet.2024.106322. Received 2023-11-01; revised 2024-03-04; accepted 2024-04-13.
Abstract
In the realm of long document classification (LDC), previous research has predominantly focused on modeling unimodal texts, overlooking the potential of multi-modal documents incorporating images. To address this gap, we introduce an innovative approach for multi-modal long document classification based on the Hierarchical Prompt and Multi-modal Transformer (HPMT). The proposed HPMT method facilitates multi-modal interactions at both the section and sentence levels, enabling a comprehensive capture of hierarchical structural features and complex multi-modal associations of long documents. Specifically, a Multi-scale Multi-modal Transformer (MsMMT) is tailored to capture the multi-granularity correlations between sentences and images. This is achieved through the incorporation of multi-scale convolutional kernels on sentence features, enhancing the model's ability to discern intricate patterns. Furthermore, to facilitate cross-level information interaction and promote learning of specific features at different levels, we introduce a Hierarchical Prompt (HierPrompt) block. This block incorporates section-level prompts and sentence-level prompts, both derived from a global prompt via distinct projection networks. Extensive experiments are conducted on four challenging multi-modal long document datasets. The results conclusively demonstrate the superiority of our proposed method, showcasing its performance advantages over existing techniques.
Affiliation(s)
- Tengfei Liu, Yongli Hu, Jiapu Wang, Yanfeng Sun, Baocai Yin: Beijing Key Laboratory of Multimedia and Intelligent Software Technology, Beijing Institute of Artificial Intelligence, Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
- Junbin Gao: Discipline of Business Analytics, The University of Sydney Business School, The University of Sydney, NSW 2006, Australia
3. He X, Wang B, Hu Y, Gao J, Sun Y, Yin B. Parallelly Adaptive Graph Convolutional Clustering Model. IEEE Trans Neural Netw Learn Syst 2024; 35:4451-4464. PMID: 35617184. DOI: 10.1109/tnnls.2022.3176411.
Abstract
Benefiting from exploiting the data's topological structure, graph convolutional networks (GCNs) have made considerable improvements in clustering tasks. The performance of a GCN relies significantly on the quality of the pretrained graph, yet graph structures are often corrupted by noise or outliers. To overcome this problem, we replace the pretrained, fixed graph in the GCN with an adaptive graph learned from the data. In this article, we propose a novel end-to-end parallelly adaptive graph convolutional clustering (AGCC) model with two pathway networks. In the first pathway, an adaptive graph convolutional (AGC) module alternately updates the graph structure and the data representation layer by layer, so the updated graph reflects the data relationships better than a fixed graph. In the second pathway, an auto-encoder (AE) module extracts the latent data features. To effectively connect the AGC and AE modules, we propose an attention-mechanism-based fusion (AMF) module that weights and fuses the data representations of the two modules and transfers them to the AGC module, which simultaneously mitigates the over-smoothing problem of GCNs. Experimental results on six public datasets show the effectiveness of the proposed AGCC compared with multiple state-of-the-art deep clustering methods. The code is available at https://github.com/HeXiax/AGCC.
4. Wang B, Ma Y, Li X, Gao J, Hu Y, Yin B. Bridging the Cross-Modality Semantic Gap in Visual Question Answering. IEEE Trans Neural Netw Learn Syst 2024; PP:1-13. PMID: 38446647. DOI: 10.1109/tnnls.2024.3370925.
Abstract
The objective of visual question answering (VQA) is to adequately comprehend a question and identify relevant contents in an image that can provide an answer. Existing approaches in VQA often combine visual and question features directly to create a unified cross-modality representation for answer inference. However, this kind of approach fails to bridge the semantic gap between the visual and text modalities, resulting in a lack of alignment in cross-modality semantics and an inability to match key visual content accurately. In this article, we propose the caption bridge-based cross-modality alignment and contrastive learning (CBAC) model to address this issue. The CBAC model aims to reduce the semantic gap between different modalities. It consists of a caption-based cross-modality alignment module and a visual-caption (V-C) contrastive learning module. By utilizing an auxiliary caption that shares the same modality as the question and has closer semantic associations with the visual content, we effectively reduce the semantic gap by separately matching the caption with both the question and the visual features to generate pre-alignment features for each, which are then used in the subsequent fusion process. We also leverage the fact that V-C pairs exhibit stronger semantic connections than question-visual (Q-V) pairs, employing a contrastive learning mechanism on visual and caption pairs to further enhance the semantic alignment capabilities of the single-modality encoders. Extensive experiments conducted on three benchmark datasets demonstrate that the proposed model outperforms previous state-of-the-art VQA models. Ablation experiments confirm the effectiveness of each module, and a qualitative analysis visualizing the attention matrices assesses the reasoning reliability of the proposed model.
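The V-C contrastive step rests on the standard in-batch contrastive (InfoNCE-style) objective: each visual embedding should score highest against its own caption, with the other captions in the batch acting as negatives. A minimal NumPy sketch under that assumption follows; the function name and temperature value are hypothetical, and this is not the CBAC code.

```python
import numpy as np

def info_nce(visual, caption, temperature=0.1):
    # L2-normalize both batches so the dot product is cosine similarity.
    v = visual / np.linalg.norm(visual, axis=1, keepdims=True)
    c = caption / np.linalg.norm(caption, axis=1, keepdims=True)
    logits = v @ c.T / temperature                # (batch, batch) similarities
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Matched V-C pairs sit on the diagonal; minimize their negative log-likelihood.
    return float(-np.mean(np.diag(log_prob)))
```

Well-aligned pairs yield a small loss; misaligned pairs a large one, which is what drives the encoders toward semantic alignment.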
5. Yu H, Zhang X, Wang Y, Huang Q, Yin B. Fine-Grained Accident Detection: Database and Algorithm. IEEE Trans Image Process 2024; 33:1059-1069. PMID: 38265894. DOI: 10.1109/tip.2024.3355812.
Abstract
This paper presents a novel fine-grained task for traffic accident analysis. Accident detection in surveillance or dashcam videos is a common task in video-based traffic accident analysis. However, conventional accident detection does not analyze the particulars of an accident; it only identifies the accident's existence or occurrence time in a video. In this paper, we define a novel fine-grained accident detection task comprising fine-grained accident classification, temporal-spatial occurrence-region localization, and accident severity estimation. A transformer-based framework combining the RGB and optical-flow information of videos is proposed for fine-grained accident detection. Additionally, we introduce a challenging Fine-grained Accident Detection (FAD) database that covers multiple tasks in surveillance videos and places more emphasis on the overall perspective. Experimental results demonstrate that our model can effectively extract video features for multiple tasks, while also indicating that current traffic accident analysis methods have limitations in dealing with the FAD task and that further research is needed.
6. Li M, Zhang Y, Wang S, Hu Y, Yin B. Redundancy Is Not What You Need: An Embedding Fusion Graph Auto-Encoder for Self-Supervised Graph Representation Learning. IEEE Trans Neural Netw Learn Syst 2024; PP:1-15. PMID: 38300769. DOI: 10.1109/tnnls.2024.3357080.
Abstract
Attribute graphs are a crucial data structure for graph communities. However, redundancy and noise in an attribute graph can impair the aggregation of the two heterogeneous distributions of attribute and structural features, resulting in inconsistent and distorted data that ultimately compromises the accuracy and reliability of attribute graph learning. For instance, redundant or irrelevant attributes can cause overfitting, while noisy attributes can lead to underfitting; similarly, redundant or noisy structural features can degrade graph representations, making it challenging to distinguish between different nodes or communities. To address these issues, we propose the embedding fusion graph auto-encoder (EFGAE) framework for self-supervised learning (SSL), which leverages multitask learning to fuse node features across different tasks and thereby reduce redundancy. EFGAE comprises two phases: pretraining (PT) and downstream task learning (DTL). During the PT phase, EFGAE uses a graph auto-encoder (GAE) based on adversarial contrastive learning to learn structural and attribute embeddings separately and then fuses these embeddings to obtain a representation of the entire graph. During the DTL phase, we introduce an adaptive graph convolutional network (AGCN), which is applied to graph neural network (GNN) classifiers to enhance recognition for downstream tasks. Experimental results demonstrate that our approach outperforms state-of-the-art (SOTA) techniques in terms of accuracy, generalization ability, and robustness.
7. Zhang Q, Li J, Sun Y, Wang S, Gao J, Yin B. Beyond low-pass filtering on large-scale graphs via Adaptive Filtering Graph Neural Networks. Neural Netw 2024; 169:1-10. PMID: 37852165. DOI: 10.1016/j.neunet.2023.09.042. Received 2023-04-17; revised 2023-08-02; accepted 2023-09-24.
Abstract
Graph Neural Networks (GNNs) have emerged as a crucial deep learning framework for graph-structured data. However, existing GNNs suffer from a scalability limitation that hinders their practical deployment in industrial settings. Many scalable GNNs have been proposed to address this limitation, but they have been shown to act as low-pass graph filters that discard valuable middle- and high-frequency information. This paper proposes a novel graph neural network named Adaptive Filtering Graph Neural Networks (AFGNN), which can capture all frequency information on large-scale graphs. AFGNN consists of two stages. The first stage utilizes low-, middle-, and high-pass graph filters to extract comprehensive frequency information without introducing additional parameters; this computation is a one-time task, pre-computed before training, which ensures scalability. The second stage incorporates node-level attention-based feature combination, generating a customized graph filter for each node, in contrast to existing spectral GNNs that employ a uniform graph filter for the entire graph. AFGNN is suitable for mini-batch training and can efficiently capture all frequency information from large-scale graphs. We evaluate AFGNN against spectral GNNs on its ability to capture all frequency information, and against scalable GNNs on scalability. Experimental results illustrate that AFGNN surpasses both, highlighting its superiority.
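The two stages described above can be sketched from the abstract alone: precompute low-, middle-, and high-pass responses of the symmetric normalized Laplacian (whose eigenvalues lie in [0, 2]), then combine the filtered features with per-node softmax attention. The specific filter polynomials below are plausible textbook choices, not necessarily the paper's, and the function names are hypothetical.

```python
import numpy as np

def graph_filters(adj):
    # Symmetric normalized Laplacian L = I - D^{-1/2} A D^{-1/2}.
    deg = adj.sum(axis=1)
    d_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(deg, 1e-12)))
    n = len(adj)
    I = np.eye(n)
    L = I - d_inv_sqrt @ adj @ d_inv_sqrt
    low = I - L / 2            # g(lam) = 1 - lam/2: keeps smooth signals
    high = L / 2               # g(lam) = lam/2: keeps oscillating signals
    mid = L @ (2 * I - L) / 2  # g(lam) = lam*(2-lam)/2: peaks at lam = 1
    return low, mid, high

def attention_combine(feats, logits):
    # Node-level softmax over the frequency channels, so each node
    # effectively gets its own customized graph filter.
    w = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)
    return sum(w[..., k, None] * feats[k] for k in range(len(feats)))
```

Because `graph_filters` depends only on the adjacency matrix, the filtered features can be materialized once before training, matching the abstract's one-time precomputation claim.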
Affiliation(s)
- Qi Zhang, Jinghua Li, Yanfeng Sun, Shaofan Wang, Baocai Yin: Beijing Key Laboratory of Multimedia and Intelligent Software Technology, Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
- Junbin Gao: Discipline of Business Analytics, The University of Sydney Business School, The University of Sydney, NSW 2006, Australia
8. Liu Y, Long C, Zhang Z, Liu B, Zhang Q, Yin B, Yang X. Explore Contextual Information for 3D Scene Graph Generation. IEEE Trans Vis Comput Graph 2023; 29:5556-5568. PMID: 36367917. DOI: 10.1109/tvcg.2022.3219451.
Abstract
3D scene graph generation (SGG) has attracted high interest in computer vision. Although the accuracy of 3D SGG on coarse classification and single relation labels has gradually improved, the performance of existing works is still far from perfect for fine-grained and multi-label situations. In this article, we propose a framework that fully explores contextual information for the 3D SGG task, attempting to satisfy the requirements of fine-grained entity classes, multiple relation labels, and high accuracy simultaneously. Our proposed approach is composed of a Graph Feature Extraction module and a Graph Contextual Reasoning module, achieving feature extraction with appropriate information redundancy, structured organization, and hierarchical inference. Our approach achieves superior or competitive performance over previous methods on the 3DSSG dataset, especially on the relationship-prediction subtask.
9. Tan H, Liu X, Yin B, Li X. DR-GAN: Distribution Regularization for Text-to-Image Generation. IEEE Trans Neural Netw Learn Syst 2023; 34:10309-10323. PMID: 35442894. DOI: 10.1109/tnnls.2022.3165573.
Abstract
This article presents a new text-to-image (T2I) generation model, named distribution regularization generative adversarial network (DR-GAN), which generates images from text descriptions through improved distribution learning. In DR-GAN, we introduce two novel modules: a semantic disentangling module (SDM) and a distribution normalization module (DNM). The SDM combines a spatial self-attention mechanism (SSAM) with a new semantic disentangling loss (SDL) to help the generator distill key semantic information for image generation. The DNM uses a variational auto-encoder (VAE) to normalize and denoise the image latent distribution, helping the discriminator better distinguish synthesized images from real ones. The DNM also adopts a distribution adversarial loss (DAL) to guide the generator to align with normalized real-image distributions in the latent space. Extensive experiments on two public datasets demonstrate that DR-GAN achieves competitive performance on the T2I task. Code: https://github.com/Tan-H-C/DR-GAN-Distribution-Regularization-for-Text-to-Image-Generation.
10. Fu Y, Li M, Liu W, Wang Y, Zhang J, Yin B, Wei X, Yang X. Distractor-Aware Event-Based Tracking. IEEE Trans Image Process 2023; 32:6129-6141. PMID: 37889807. DOI: 10.1109/tip.2023.3326683.
Abstract
Event cameras, or dynamic vision sensors, have recently achieved success in tasks ranging from fundamental vision to high-level vision research. Because they asynchronously capture light-intensity changes, event cameras have an inherent advantage in capturing moving objects in challenging scenarios, including objects under low light, high dynamic range, or fast motion, making them a natural fit for visual object tracking. However, current event-based trackers derived from RGB trackers simply convert the input images to event frames and still follow the conventional tracking pipeline, which mainly focuses on object texture for target distinction. As a result, these trackers may not be robust in challenging scenarios such as moving cameras and cluttered foregrounds. In this paper, we propose a distractor-aware event-based tracker, named DANet, that introduces transformer modules into a Siamese network architecture. Specifically, our model is mainly composed of a motion-aware network and a target-aware network, which simultaneously exploit both motion cues and object contours from event data to discover moving objects and identify the target object by removing dynamic distractors. DANet can be trained end to end without any post-processing and runs at over 80 FPS on a single V100. We conduct comprehensive experiments on two large event-tracking datasets to validate the proposed model and demonstrate that our tracker outperforms state-of-the-art trackers in both accuracy and efficiency.
11. Tan H, Liu X, Yin B, Li X. MHSA-Net: Multihead Self-Attention Network for Occluded Person Re-Identification. IEEE Trans Neural Netw Learn Syst 2023; 34:8210-8224. PMID: 35312622. DOI: 10.1109/tnnls.2022.3144163.
Abstract
This article presents a novel person re-identification model, named multihead self-attention network (MHSA-Net), to prune unimportant information and capture key local information from person images. MHSA-Net contains two main novel components: a multihead self-attention branch (MHSAB) and an attention competition mechanism (ACM). The MHSAB adaptively captures key local person information and then produces effective, diverse embeddings of an image for person matching. The ACM further helps filter out attention noise and non-key information. Through extensive ablation studies, we verified that the MHSAB and ACM both contribute to the performance improvement of MHSA-Net. MHSA-Net achieves competitive performance on the standard and occluded person Re-ID tasks.
12. Long T, Sun Y, Gao J, Hu Y, Yin B. Domain Adaptation as Optimal Transport on Grassmann Manifolds. IEEE Trans Neural Netw Learn Syst 2023; 34:7196-7209. PMID: 35061594. DOI: 10.1109/tnnls.2021.3139119.
Abstract
Domain adaptation in Euclidean space is a challenging task on which researchers have recently made great progress. In practice, however, many rich data representations are not Euclidean; for example, much high-dimensional data in computer vision is modeled by a low-dimensional manifold. This prompts the need to explore domain adaptation between non-Euclidean manifold spaces. This article is concerned with domain adaptation over the classic Grassmann manifolds. We propose an optimal-transport-based domain adaptation model on Grassmann manifolds, which implements the adaptation between datasets by minimizing the Wasserstein distances between the projected source data and the target data on Grassmann manifolds. Four regularization terms are introduced to keep task-related consistency during the adaptation process. Furthermore, to reduce the computational cost, a simplified model that preserves the necessary adaptation properties, together with an efficient algorithm, is proposed and tested. Experiments on several publicly available datasets show that the proposed model outperforms several relevant baseline domain adaptation methods.
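The transport plan minimizing a Wasserstein distance between two sample sets is commonly approximated with entropy-regularized Sinkhorn iterations. Below is a minimal sketch of that generic step, with a plain Euclidean cost matrix standing in for the Grassmann geodesic cost used in the paper; the function name and the regularization value are assumptions, not the paper's algorithm.

```python
import numpy as np

def sinkhorn(cost, a, b, reg=0.1, n_iter=200):
    # Entropy-regularized optimal transport: alternately rescale the Gibbs
    # kernel K so the coupling's marginals approach a (rows) and b (columns).
    K = np.exp(-cost / reg)
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]  # coupling matrix
```

In the paper's setting the rows would index projected source samples, the columns target samples, and the resulting coupling drives the adaptation objective.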
13. Liu T, Hu Y, Wang B, Sun Y, Gao J, Yin B. Hierarchical Graph Convolutional Networks for Structured Long Document Classification. IEEE Trans Neural Netw Learn Syst 2023; 34:8071-8085. PMID: 35767491. DOI: 10.1109/tnnls.2022.3185295.
Abstract
Long document classification (LDC) has recently attracted focused interest in natural language processing (NLP) with the exponential increase in publications. Building on pretrained language models, many LDC methods have been proposed and have achieved considerable progress. However, most existing methods model long documents as sequences of text while omitting the document structure, limiting their capability to effectively represent long texts that carry structural information. To mitigate this limitation, we propose a novel hierarchical graph convolutional network (HGCN) for structured LDC in this article, in which a section graph network models the macrostructure of a document and a word graph network with a decoupled graph convolutional block extracts a document's fine-grained features. In addition, an interaction strategy integrates the two networks into a whole by propagating features between them. To verify the effectiveness of the proposed model, four structured long document datasets are constructed, and extensive experiments on these datasets and another unstructured dataset show that the proposed method outperforms state-of-the-art classification methods.
14. Liu J, Tan H, Hu Y, Sun Y, Wang H, Yin B. Global and Local Interactive Perception Network for Referring Image Segmentation. IEEE Trans Neural Netw Learn Syst 2023; PP:1-14. PMID: 37695953. DOI: 10.1109/tnnls.2023.3308550.
Abstract
Effective modal fusion and perception between language and image are necessary for inferring the referred instance in the referring image segmentation (RIS) task. In this article, we propose a novel RIS network, the global and local interactive perception network (GLIPN), to enhance the quality of modal fusion between language and image from both local and global perspectives. The core of GLIPN is the global and local interactive perception (GLIP) scheme, which contains a local perception module (LPM) and a global perception module (GPM). The LPM is designed to enhance local modal fusion through the correspondence between words and local image semantics. The GPM is designed to inject the global structured semantics of the image into the modal fusion process, better guiding the word embeddings to perceive the image's global structure. With this combination of local and global context-semantic fusion, extensive experiments on several benchmark datasets demonstrate the advantage of the proposed GLIPN over most state-of-the-art approaches.
15. He X, Wang B, Li R, Gao J, Hu Y, Huo G, Yin B. Graph structure learning layer and its graph convolution clustering application. Neural Netw 2023; 165:1010-1020. PMID: 37467583. DOI: 10.1016/j.neunet.2023.06.024. Received 2022-03-03; revised 2023-05-06; accepted 2023-06-22.
Abstract
To learn embedding representations of graph-structured data corrupted by noise and outliers, existing graph structure learning networks usually follow a two-step paradigm: constructing a "good" graph structure, then performing message passing for signals supported on the learned graph. However, data corrupted by noise may make the learned graph structure unreliable. In this paper, we propose an adaptive graph convolutional clustering network that alternately adjusts the graph structure and node representations layer by layer with back-propagation. Specifically, we design a Graph Structure Learning layer before each Graph Convolutional layer to learn a sparse graph structure from the node representations, where the graph structure is implicitly determined by the solution to an optimal self-expression problem. This is one of the first works to use an optimization process as a graph network layer, which clearly differs from the functional operations of traditional deep learning layers. An efficient iterative optimization algorithm is given to solve the optimal self-expression problem in the Graph Structure Learning layer. Experimental results show that the proposed method can effectively defend against the negative effects of inaccurate graph structures. The code is available at https://github.com/HeXiax/SSGNN.
Affiliation(s)
- Xiaxia He, Boyue Wang, Yongli Hu, Guangyu Huo, Baocai Yin: Beijing Municipal Key Laboratory of Multimedia and Intelligent Software Technology, Beijing 100124, China; Beijing Institute of Artificial Intelligence, Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
- Ruikun Li, Junbin Gao: The University of Sydney Business School, The University of Sydney, Camperdown NSW 2006, Australia
16. Wang J, Wang B, Gao J, Li X, Hu Y, Yin B. QDN: A Quadruplet Distributor Network for Temporal Knowledge Graph Completion. IEEE Trans Neural Netw Learn Syst 2023; PP:1-13. PMID: 37224351. DOI: 10.1109/tnnls.2023.3274230.
Abstract
Temporal knowledge graph completion (TKGC) extends traditional static knowledge graph completion (SKGC) by introducing the timestamp. Existing TKGC methods generally translate the original quadruplet into triplet form by integrating the timestamp into the entity/relation and then use SKGC methods to infer the missing item. However, such an integration largely limits the expressive ability of temporal information and ignores the semantic loss that arises because entities, relations, and timestamps are located in different spaces. In this article, we propose a novel TKGC method called the quadruplet distributor network (QDN), which independently models the embeddings of entities, relations, and timestamps in their specific spaces to fully capture the semantics, and builds the quadruplet distributor (QD) to facilitate information aggregation and distribution among them. Furthermore, the interaction among entities, relations, and timestamps is integrated using a novel quadruplet-specific decoder, which stretches the third-order tensor to fourth order to satisfy the TKGC criterion. Equally important, we design a novel temporal regularization that imposes a smoothness constraint on temporal embeddings. Experimental results show that the proposed method outperforms existing state-of-the-art TKGC methods. The source codes of this article are available at https://github.com/QDN for Temporal Knowledge Graph Completion.git.
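The temporal regularization described above penalizes differences between embeddings of neighboring timestamps. A minimal sketch of such a smoothness term follows; the exact norm, weighting, and name used in QDN may differ.

```python
import numpy as np

def temporal_smoothness(timestamp_emb, p=2):
    # Penalize large jumps between consecutive timestamp embeddings,
    # encouraging a smooth temporal trajectory in embedding space.
    # timestamp_emb: (num_timestamps, dim), rows in chronological order.
    diffs = timestamp_emb[1:] - timestamp_emb[:-1]
    return float(np.sum(np.abs(diffs) ** p)) / max(len(diffs), 1)
```

Added to the training loss with a small coefficient, this term strengthens the correlation of neighboring timestamps without constraining distant ones directly.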
Collapse
|
17
|
Li MJ, Kumari P, Lin YS, Yao ML, Zhang BH, Yin B, Duan SJ, Cornell RA, Marazita ML, Shi B, Jia ZL. A Variant in the IRF6 Promoter Associated with the Risk for Orofacial Clefting. J Dent Res 2023:220345231165210. [PMID: 37161310 PMCID: PMC10399074 DOI: 10.1177/00220345231165210] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 05/11/2023] Open
Abstract
The single-nucleotide polymorphism (SNP) rs2235371 (IRF6 V274I) is associated with nonsyndromic cleft lip with or without cleft palate (NSCL/P) in Han Chinese and other populations but appears to be without a functional effect. To find the common etiologic variant or variants within the haplotype tagged by rs2235371, we carried out targeted sequencing of an interval containing IRF6 in 159 Han Chinese with NSCL/P. This study revealed that the SNP rs12403599, within the IRF6 promoter, is associated with all phenotypes of NSCL/P, especially nonsyndromic cleft lip only (NSCLO) and a subphenotype of it, microform cleft lip (MCL). This association was replicated in 2 additional, much larger cohorts of Han Chinese cases and controls. Conditional logistic analysis indicated that the association of rs2235371 with NSCL/P was lost if rs12403599 was excluded. rs12403599 contributes the most risk to MCL: its G allele is responsible for 38.47% of the genetic contribution to MCL, and the odds ratios of the G/C and G/G genotypes for MCL were 2.91 and 6.58, respectively. To test whether rs12403599 is functional, we carried out reporter assays in a fetal oral epithelial cell line (GMSM-K). Unexpectedly, the risk allele G yielded higher promoter activity in GMSM-K. Consistent with the reporter studies, expression of IRF6 in lip tissues from NSCLO and MCL patients with the G/G genotype was higher than in those from patients with the C/C genotype. These results indicate that rs12403599 tags the risk haplotype for NSCL/P better than rs2235371 in Han Chinese, and they support investigation of the mechanisms by which the rs12403599 allele affects IRF6 expression as well as tests of this association in different populations.
Collapse
Affiliation(s)
- M-J Li
- State Key Laboratory of Oral Diseases & National Clinical Research Center for Oral Diseases & Department of Cleft Lip and Palate, West China Hospital of Stomatology, Sichuan University, Chengdu, China
| | - P Kumari
- Department of Oral Health Sciences, University of Washington, Seattle, WA, USA
| | - Y-S Lin
- State Key Laboratory of Oral Diseases & National Clinical Research Center for Oral Diseases & Department of Cleft Lip and Palate, West China Hospital of Stomatology, Sichuan University, Chengdu, China
| | - M-L Yao
- State Key Laboratory of Oral Diseases & National Clinical Research Center for Oral Diseases & Department of Cleft Lip and Palate, West China Hospital of Stomatology, Sichuan University, Chengdu, China
| | - B-H Zhang
- State Key Laboratory of Oral Diseases & National Clinical Research Center for Oral Diseases & Department of Cleft Lip and Palate, West China Hospital of Stomatology, Sichuan University, Chengdu, China
| | - B Yin
- State Key Laboratory of Oral Diseases & National Clinical Research Center for Oral Diseases & Department of Cleft Lip and Palate, West China Hospital of Stomatology, Sichuan University, Chengdu, China
| | - S-J Duan
- State Key Laboratory of Oral Diseases & National Clinical Research Center for Oral Diseases & Department of Cleft Lip and Palate, West China Hospital of Stomatology, Sichuan University, Chengdu, China
| | - R A Cornell
- Department of Oral Health Sciences, University of Washington, Seattle, WA, USA
| | - M L Marazita
- Centre for Craniofacial and Dental Genetics, Department of Oral Biology, University of Pittsburgh, Pittsburgh, PA, USA
| | - B Shi
- State Key Laboratory of Oral Diseases & National Clinical Research Center for Oral Diseases & Department of Cleft Lip and Palate, West China Hospital of Stomatology, Sichuan University, Chengdu, China
| | - Z-L Jia
- State Key Laboratory of Oral Diseases & National Clinical Research Center for Oral Diseases & Department of Cleft Lip and Palate, West China Hospital of Stomatology, Sichuan University, Chengdu, China
| |
Collapse
|
18
|
Zhang Z, Han X, Dong B, Li T, Yin B, Yang X. Point Cloud Scene Completion with Joint Color and Semantic Estimation from Single RGB-D Image. IEEE Trans Pattern Anal Mach Intell 2023; PP:1-18. [PMID: 37018106 DOI: 10.1109/tpami.2023.3264449] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/19/2023]
Abstract
We present a deep reinforcement learning method of progressive view inpainting for colored semantic point cloud scene completion under volume guidance, achieving high-quality scene reconstruction from only a single RGB-D image with severe occlusion. Our approach is end-to-end, consisting of three modules: 3D scene volume reconstruction, 2D RGB-D and segmentation image inpainting, and multi-view selection for completion. Given a single RGB-D image, our method first predicts its semantic segmentation map and goes through the 3D volume branch to obtain a volumetric scene reconstruction that guides the next view-inpainting step, which attempts to fill in the missing information; the third step projects the volume under the same view as the input, concatenates them to complete the current-view RGB-D and segmentation map, and integrates all RGB-D and segmentation maps into the point cloud. Since the occluded areas are unavailable, we resort to an A3C network to glance around and progressively pick the next best view for large hole completion until the scene is adequately reconstructed, while guaranteeing validity. All steps are learned jointly to achieve robust and consistent results. We perform qualitative and quantitative evaluations with extensive experiments on the 3D-FUTURE data, obtaining better results than state-of-the-art methods.
Collapse
|
19
|
Kong Y, Wang H, Kong L, Liu Y, Yao C, Yin B. Absolute and Relative Depth-Induced Network for RGB-D Salient Object Detection. Sensors (Basel) 2023; 23:3611. [PMID: 37050670 PMCID: PMC10098920 DOI: 10.3390/s23073611] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 02/10/2023] [Revised: 03/21/2023] [Accepted: 03/23/2023] [Indexed: 06/19/2023]
Abstract
Detecting salient objects in complicated scenarios is a challenging problem. Besides semantic features from the RGB image, spatial information from the depth image also provides sufficient cues about the object. Therefore, it is crucial to rationally integrate RGB and depth features for the RGB-D salient object detection task. Most existing RGB-D saliency detectors modulate RGB semantic features with absolute depth values. However, they ignore the appearance contrast and structure knowledge indicated by relative depth values between pixels. In this work, we propose a depth-induced network (DIN) for RGB-D salient object detection, to take full advantage of both absolute and relative depth information and, further, to enforce in-depth fusion of the RGB-D cross-modalities. Specifically, an absolute depth-induced module (ADIM) is proposed to hierarchically integrate absolute depth values and RGB features, allowing interaction between the appearance and structural information in the encoding stage. A relative depth-induced module (RDIM) is designed to capture detailed saliency cues by exploring contrastive and structural information from relative depth values in the decoding stage. By combining the ADIM and RDIM, we can accurately locate salient objects with clear boundaries, even in complex scenes. The proposed DIN is a lightweight network, and its model size is much smaller than that of state-of-the-art algorithms. Extensive experiments on six challenging benchmarks show that our method outperforms most existing RGB-D salient object detection models.
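As a toy illustration of the relative-depth cues the abstract contrasts with absolute values, the sketch below computes signed depth differences between adjacent pixels. The function name and the right/bottom neighbor scheme are our own assumptions, not the DIN modules themselves:

```python
import numpy as np

def relative_depth(depth: np.ndarray) -> np.ndarray:
    """Signed depth differences to the right and bottom neighbor of each pixel.

    depth: (H, W) absolute depth map. Returns an (H, W, 2) array whose channels
    hold depth[y, x+1] - depth[y, x] and depth[y+1, x] - depth[y, x],
    zero-padded at the border. These differences expose the contrast and
    structure cues that absolute depth values alone do not.
    """
    dx = np.zeros_like(depth)
    dy = np.zeros_like(depth)
    dx[:, :-1] = depth[:, 1:] - depth[:, :-1]
    dy[:-1, :] = depth[1:, :] - depth[:-1, :]
    return np.stack([dx, dy], axis=-1)

# A flat plane has zero relative depth everywhere except at an object edge.
scene = np.zeros((4, 4))
scene[:, 2:] = 1.0  # depth step between columns 1 and 2
rel = relative_depth(scene)
```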
Collapse
Affiliation(s)
- Yuqiu Kong
- School of Innovation and Entrepreneurship, Dalian University of Technology, Dalian 116024, China; (Y.K.)
| | - He Wang
- School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China
| | - Lingwei Kong
- School of Information and Communication Engineering, Dalian University of Technology, Dalian 116024, China
| | - Yang Liu
- School of Innovation and Entrepreneurship, Dalian University of Technology, Dalian 116024, China; (Y.K.)
| | - Cuili Yao
- School of Innovation and Entrepreneurship, Dalian University of Technology, Dalian 116024, China; (Y.K.)
| | - Baocai Yin
- School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China
| |
Collapse
|
20
|
Guo J, Sun Y, Gao J, Hu Y, Yin B. Logarithmic Schatten- p Norm Minimization for Tensorial Multi-View Subspace Clustering. IEEE Trans Pattern Anal Mach Intell 2023; 45:3396-3410. [PMID: 35648873 DOI: 10.1109/tpami.2022.3179556] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/15/2023]
Abstract
The low-rank tensor can characterize inner structure and explore high-order correlation among multi-view representations, so it has been widely used in multi-view clustering. Existing approaches adopt the tensor nuclear norm (TNN) as a convex approximation of the non-convex tensor rank function. However, TNN treats the different singular values equally and over-penalizes the main rank components, leading to a sub-optimal tensor representation. In this paper, we devise a better surrogate of tensor rank, namely the tensor logarithmic Schatten-p norm (TLSpN), which fully considers the physical difference between singular values through a non-convex and non-linear penalty function. Further, a tensor logarithmic Schatten-p norm minimization (TLSpNM)-based multi-view subspace clustering (TLSpNM-MSC) model is proposed. Specifically, the proposed TLSpN can not only protect the larger singular values encoded with useful structural information, but also remove the smaller ones encoded with redundant information. Thus, the learned tensor representation, with its compact low-rank structure, will well explore the complementary information and accurately characterize the high-order correlation among the multiple views. The alternating direction method of multipliers (ADMM) is used to solve the non-convex multi-block TLSpNM-MSC model, where the challenging TLSpNM problem is carefully handled. Importantly, convergence of the algorithm is mathematically established by showing that the sequence it generates is a Cauchy sequence converging to a Karush-Kuhn-Tucker (KKT) point. Experimental results on nine benchmark databases reveal the superiority of the TLSpNM-MSC model.
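To see why a logarithmic penalty "protects" larger singular values relative to the nuclear norm, here is a minimal numerical sketch. The exact norm definition and the constants `p` and `eps` are assumptions; the paper's formulation may differ:

```python
import numpy as np

def log_schatten_p(X: np.ndarray, p: float = 0.5, eps: float = 1.0) -> float:
    """Logarithmic Schatten-p penalty, taken here as sum_i log(1 + sigma_i^p / eps).

    The log grows much more slowly than the identity for large singular
    values, so dominant rank components are penalized far less than under
    the nuclear norm, while small (noise) singular values still contribute.
    """
    sigma = np.linalg.svd(X, compute_uv=False)
    return float(np.sum(np.log1p(sigma ** p / eps)))

def nuclear(X: np.ndarray) -> float:
    """Nuclear norm: the plain sum of singular values, for comparison."""
    return float(np.sum(np.linalg.svd(X, compute_uv=False)))

# One dominant singular value and one near-noise singular value.
X = np.diag([10.0, 0.1])
```

Here the nuclear norm charges the dominant component its full magnitude, while the log penalty compresses it, which is the "protection" effect described above.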
Collapse
|
21
|
Li M, Zhang Y, Li X, Lin X, Yin B. Inferring student social link from spatiotemporal behavior data via entropy-based analyzing model. INTELL DATA ANAL 2023. [DOI: 10.3233/ida-216318] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/04/2023]
Abstract
Social links are an important index for understanding master's students' mental health and social ability in educational management. Extracting hidden social strength from students' rich daily-life behaviors has therefore become an attractive research topic. Devices with positioning functions record large amounts of students' spatiotemporal behavior data, from which students' social links can be inferred. However, under the guidance of school regulations, students' daily activities have a certain regularity and periodicity. Traditional methods usually compare the co-occurrence frequency of two users to infer social association, but do not consider the location-intensive and time-sensitive nature of campus scenes. Aiming at the campus environment, a Spatiotemporal Entropy-Based Analyzing (S-EBA) model for inferring students' social strength is proposed. The model calculates the frequency of co-occurrence under the influence of time intervals from students' multi-source heterogeneous behavioral data. Then, the three features of diversity, spatiotemporal hotspot, and behavior similarity are introduced to calculate social strength. Experiments show that our method is superior to traditional methods under many evaluation criteria. The inferred social strength is further used as edge weights to construct a social network, in order to analyze its impact on students' educational management.
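A hypothetical sketch of time-sensitive co-occurrence scoring in the spirit of the abstract: the exponential decay, the `tau` constant, and the data layout are illustrative assumptions, and the actual S-EBA model additionally weights locations by entropy:

```python
import math
from collections import defaultdict

def cooccurrence_strength(events_a, events_b, tau=600.0):
    """Time-decayed co-occurrence score between two students.

    events_*: lists of (location, unix_time) records. Two records at the
    same location contribute exp(-|dt| / tau), so near-simultaneous visits
    count almost fully while visits far apart in time count little.
    """
    by_loc = defaultdict(list)
    for loc, t in events_b:
        by_loc[loc].append(t)
    score = 0.0
    for loc, t in events_a:
        for t2 in by_loc[loc]:
            score += math.exp(-abs(t - t2) / tau)
    return score

# Two students at the library one minute apart score close to 1;
# visits to different places contribute nothing.
a = [("library", 0), ("canteen", 3600)]
b = [("library", 60), ("gym", 3600)]
close = cooccurrence_strength(a, b)
```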
Collapse
|
22
|
Yang Y, Sun Y, Ju F, Wang S, Gao J, Yin B. Multi-graph Fusion Graph Convolutional Networks with pseudo-label supervision. Neural Netw 2023; 158:305-317. [PMID: 36493533 DOI: 10.1016/j.neunet.2022.11.027] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/24/2022] [Revised: 09/13/2022] [Accepted: 11/21/2022] [Indexed: 11/29/2022]
Abstract
Graph convolutional networks (GCNs) have become a popular tool for learning from unstructured graph data due to their powerful learning ability. Many researchers have been interested in fusing topological structures and node features to extract correlation information for classification tasks. However, simply integrating the embeddings from the topology and feature spaces is inadequate for obtaining the most correlated information. At the same time, most GCN-based methods assume that the topology graph or feature graph is compatible with the properties of GCNs, but this is usually not satisfied, since meaningless, missing, or even unreal edges are very common in actual graphs. To obtain a more robust and accurate graph structure, we construct an adaptive graph from the topology and feature graphs. We propose Multi-graph Fusion Graph Convolutional Networks with pseudo-label supervision (MFGCN), which learns a connected embedding by fusing the multiple graphs and node features. We obtain the final node embedding for semi-supervised node classification by propagating node features over the multiple graphs. Furthermore, to alleviate the problem of missing labels in semi-supervised classification, a pseudo-label generation mechanism is proposed to generate more reliable pseudo-labels based on the similarity of node features. Extensive experiments on six benchmark datasets demonstrate the superiority of MFGCN over state-of-the-art classification methods.
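The pseudo-label generation mechanism can be sketched roughly as follows. The cosine similarity, the confidence threshold, and the nearest-labeled-node rule are illustrative assumptions rather than MFGCN's exact procedure:

```python
import numpy as np

def pseudo_labels(feats, labels, labeled_idx, threshold=0.9):
    """Assign pseudo-labels to unlabeled nodes from their most similar labeled node.

    feats: (N, d) node features; labels: (N,) with valid entries at labeled_idx,
    -1 elsewhere. An unlabeled node copies the label of its most cosine-similar
    labeled node, but only when that similarity clears the threshold; otherwise
    it stays at -1 (not confident enough to pseudo-label).
    """
    normed = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    out = np.full(len(labels), -1, dtype=int)
    out[labeled_idx] = labels[labeled_idx]
    for i in range(len(labels)):
        if i in labeled_idx:
            continue
        sims = normed[labeled_idx] @ normed[i]   # similarity to each labeled node
        j = int(np.argmax(sims))
        if sims[j] >= threshold:
            out[i] = labels[labeled_idx[j]]
    return out

feats = np.array([[1.0, 0.0], [0.99, 0.1], [0.0, 1.0], [0.5, 0.5]])
labels = np.array([0, -1, 1, -1])
labeled_idx = [0, 2]
pl = pseudo_labels(feats, labels, labeled_idx)
```

Node 1 is near-duplicate of node 0 and inherits label 0; node 3 sits between the classes and remains unlabeled.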
Collapse
Affiliation(s)
- Yachao Yang
- Beijing Key Laboratory of Multimedia and Intelligent Software Technology, Faculty of Information Technology, Beijing University of Technology, Beijing, 100124, China
| | - Yanfeng Sun
- Beijing Key Laboratory of Multimedia and Intelligent Software Technology, Faculty of Information Technology, Beijing University of Technology, Beijing, 100124, China.
| | - Fujiao Ju
- Beijing Key Laboratory of Multimedia and Intelligent Software Technology, Faculty of Information Technology, Beijing University of Technology, Beijing, 100124, China
| | - Shaofan Wang
- Beijing Key Laboratory of Multimedia and Intelligent Software Technology, Faculty of Information Technology, Beijing University of Technology, Beijing, 100124, China
| | - Junbin Gao
- Discipline of Business Analytics, The University of Sydney Business School, The University of Sydney, NSW 2006, Australia
| | - Baocai Yin
- Beijing Key Laboratory of Multimedia and Intelligent Software Technology, Faculty of Information Technology, Beijing University of Technology, Beijing, 100124, China
| |
Collapse
|
23
|
Wei B, Ye X, Long C, Du Z, Li B, Yin B, Yang X. Discriminative Active Learning for Robotic Grasping in Cluttered Scene. IEEE Robot Autom Lett 2023. [DOI: 10.1109/lra.2023.3243474] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/11/2023]
Affiliation(s)
- Boyan Wei
- School of Computer Science and Technology, Dalian University of Technology, Dalian, China
| | - Xianfeng Ye
- School of Computer Science and Technology, Dalian University of Technology, Dalian, China
| | | | - Zhenjun Du
- Shenyang SIASUN Robot and Automation Co. Ltd, ShenYang, China
| | - Bangyu Li
- Shenyang SIASUN Robot and Automation Co. Ltd, ShenYang, China
| | - Baocai Yin
- School of Computer Science and Technology, Dalian University of Technology, Dalian, China
| | - Xin Yang
- School of Computer Science and Technology, Dalian University of Technology, Dalian, China
| |
Collapse
|
24
|
Guo J, Sun Y, Gao J, Hu Y, Yin B. Multi-Attribute Subspace Clustering via Auto-Weighted Tensor Nuclear Norm Minimization. IEEE Trans Image Process 2022; 31:7191-7205. [PMID: 36355733 DOI: 10.1109/tip.2022.3220949] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/16/2023]
Abstract
Self-expressiveness-based subspace clustering methods have received wide attention for unsupervised learning tasks. However, most existing subspace clustering methods consider data features as a whole and focus on only a single self-representation. These approaches ignore the intrinsic multi-attribute information embedded in the original data features and result in a single-attribute self-representation. This paper proposes a novel multi-attribute subspace clustering (MASC) model that understands data from multiple attributes. MASC simultaneously learns multiple subspace representations corresponding to each specific attribute by exploiting the intrinsic multi-attribute features drawn from the original data. In order to better capture the high-order correlation among multi-attribute representations, we represent them as a low-rank tensor and propose the auto-weighted tensor nuclear norm (AWTNN) as a superior low-rank tensor approximation. In particular, the non-convex AWTNN fully considers the difference between singular values through implicit and adaptive weight splitting during the AWTNN optimization procedure. We further develop an efficient algorithm to optimize the non-convex and multi-block MASC model and establish convergence guarantees. A more comprehensive subspace representation can be obtained by aggregating these multi-attribute representations, which can be used to construct a clustering-friendly affinity matrix. Extensive experiments on eight real-world databases reveal that the proposed MASC exhibits superior performance over other subspace clustering methods.
Collapse
|
25
|
Ji Q, Sun Y, Gao J, Hu Y, Yin B. A Decoder-Free Variational Deep Embedding for Unsupervised Clustering. IEEE Trans Neural Netw Learn Syst 2022; 33:5681-5693. [PMID: 33882000 DOI: 10.1109/tnnls.2021.3071275] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]
Abstract
In deep clustering frameworks, autoencoder (AE)- or variational AE-based clustering approaches are the most popular and competitive ones, encouraging the model to obtain suitable representations while avoiding the tendency toward degenerate solutions. However, for the clustering task, the decoder for reconstructing the original input is usually useless once the model has finished training. Moreover, the encoder-decoder architecture limits the depth of the encoder, severely reducing its learning capacity. In this article, we propose a decoder-free variational deep embedding for unsupervised clustering (DFVC). It is well known that minimizing reconstruction error amounts to maximizing a lower bound on the mutual information (MI) between the input and its representation. This provides a theoretical guarantee that the bloated decoder can be discarded. Inspired by contrastive self-supervised learning, we directly calculate or estimate the MI of the continuous variables. Specifically, we investigate unsupervised representation learning by simultaneously considering MI estimation for continuous representations and MI computation for categorical representations. By introducing data augmentation, we incorporate the original input, the augmented input, and their high-level representations into the MI estimation framework to learn more discriminative representations. Instead of adversarially matching a simple standard normal distribution, we use end-to-end learning to constrain the latent space to be cluster-friendly by applying a Gaussian mixture distribution as the prior. Extensive experiments on challenging data sets show that our model achieves higher performance than a wide range of state-of-the-art clustering approaches.
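A minimal sketch of the contrastive MI lower-bound estimation the abstract appeals to, using a generic InfoNCE-style estimator. The function name, temperature, and the choice of estimator are assumptions, not DFVC's exact objective:

```python
import numpy as np

def infonce_lower_bound(z1, z2, temp=0.1):
    """InfoNCE estimate of a mutual-information lower bound between paired views.

    z1, z2: (N, d) representations of the same N samples under two
    augmentations. Returns log(N) minus the mean cross-entropy of matching
    each z1[i] to its positive z2[i] among all N candidates; well-aligned
    views push the bound toward log(N), mismatched views push it below zero.
    """
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temp                      # (N, N) similarity logits
    logits -= logits.max(axis=1, keepdims=True)    # stabilize the softmax
    logp = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    n = len(z1)
    return float(np.log(n) + np.mean(np.diag(logp)))
```

In a decoder-free setup, maximizing such a bound between an input's representation and its augmented counterpart replaces the reconstruction objective.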
Collapse
|
26
|
Xin J, Wang L, Wang S, Liu Y, Yang C, Yin B. Recommending Fine-Grained Tool Consistent With Common Sense Knowledge for Robot. IEEE Robot Autom Lett 2022. [DOI: 10.1109/lra.2022.3187536] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
Affiliation(s)
- Jianjia Xin
- Beijing Key Laboratory of Multimedia and Intelligent Software Technology, Beijing Artificial Intelligence Institute, Faculty of Information Technology, Beijing University of Technology, Beijing, China
| | - Lichun Wang
- Beijing Key Laboratory of Multimedia and Intelligent Software Technology, Beijing Artificial Intelligence Institute, Faculty of Information Technology, Beijing University of Technology, Beijing, China
| | - Shaofan Wang
- Beijing Key Laboratory of Multimedia and Intelligent Software Technology, Beijing Artificial Intelligence Institute, Faculty of Information Technology, Beijing University of Technology, Beijing, China
| | - Yukun Liu
- Beijing Key Laboratory of Multimedia and Intelligent Software Technology, Beijing Artificial Intelligence Institute, Faculty of Information Technology, Beijing University of Technology, Beijing, China
| | - Chao Yang
- Beijing Key Laboratory of Multimedia and Intelligent Software Technology, Beijing Artificial Intelligence Institute, Faculty of Information Technology, Beijing University of Technology, Beijing, China
| | - Baocai Yin
- Beijing Key Laboratory of Multimedia and Intelligent Software Technology, Beijing Artificial Intelligence Institute, Faculty of Information Technology, Beijing University of Technology, Beijing, China
| |
Collapse
|
27
|
Pandey S, Krause E, DeRose J, MacCrann N, Jain B, Crocce M, Blazek J, Choi A, Huang H, To C, Fang X, Elvin-Poole J, Prat J, Porredon A, Secco L, Rodriguez-Monroy M, Weaverdyck N, Park Y, Raveri M, Rozo E, Rykoff E, Bernstein G, Sánchez C, Jarvis M, Troxel M, Zacharegkas G, Chang C, Alarcon A, Alves O, Amon A, Andrade-Oliveira F, Baxter E, Bechtol K, Becker M, Camacho H, Campos A, Carnero Rosell A, Carrasco Kind M, Cawthon R, Chen R, Chintalapati P, Davis C, Di Valentino E, Diehl H, Dodelson S, Doux C, Drlica-Wagner A, Eckert K, Eifler T, Elsner F, Everett S, Farahi A, Ferté A, Fosalba P, Friedrich O, Gatti M, Giannini G, Gruen D, Gruendl R, Harrison I, Hartley W, Huff E, Huterer D, Kovacs A, Leget P, McCullough J, Muir J, Myles J, Navarro-Alsina A, Omori Y, Rollins R, Roodman A, Rosenfeld R, Sevilla-Noarbe I, Sheldon E, Shin T, Troja A, Tutusaus I, Varga T, Wechsler R, Yanny B, Yin B, Zhang Y, Zuntz J, Abbott T, Aguena M, Allam S, Annis J, Bacon D, Bertin E, Brooks D, Burke D, Carretero J, Conselice C, Costanzi M, da Costa L, Pereira M, De Vicente J, Dietrich J, Doel P, Evrard A, Ferrero I, Flaugher B, Frieman J, García-Bellido J, Gaztanaga E, Gerdes D, Giannantonio T, Gschwend J, Gutierrez G, Hinton S, Hollowood D, Honscheid K, James D, Jeltema T, Kuehn K, Kuropatkin N, Lahav O, Lima M, Lin H, Maia M, Marshall J, Melchior P, Menanteau F, Miller C, Miquel R, Mohr J, Morgan R, Palmese A, Paz-Chinchón F, Petravick D, Pieres A, Plazas Malagón A, Sanchez E, Scarpine V, Serrano S, Smith M, Soares-Santos M, Suchyta E, Tarle G, Thomas D, Weller J. Dark Energy Survey year 3 results: Constraints on cosmological parameters and galaxy-bias models from galaxy clustering and galaxy-galaxy lensing using the redMaGiC sample. Phys Rev D 2022; 106:043520. [DOI: 10.1103/physrevd.106.043520] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]
|
28
|
Hu Y, Zhang H, Jiang H, Bi Y, Yin B. CGNN: Caption-Assisted Graph Neural Network for Image-Text Retrieval. Pattern Recognit Lett 2022. [DOI: 10.1016/j.patrec.2022.08.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
|
29
|
Da Q, Huang X, Li Z, Zuo Y, Zhang C, Liu J, Chen W, Li J, Xu D, Hu Z, Yi H, Guo Y, Wang Z, Chen L, Zhang L, He X, Zhang X, Mei K, Zhu C, Lu W, Shen L, Shi J, Li J, S S, Krishnamurthi G, Yang J, Lin T, Song Q, Liu X, Graham S, Bashir RMS, Yang C, Qin S, Tian X, Yin B, Zhao J, Metaxas DN, Li H, Wang C, Zhang S. DigestPath: A benchmark dataset with challenge review for the pathological detection and segmentation of digestive-system. Med Image Anal 2022; 80:102485. [DOI: 10.1016/j.media.2022.102485] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/02/2021] [Revised: 04/08/2022] [Accepted: 05/20/2022] [Indexed: 12/19/2022]
|
30
|
Guo J, Sun Y, Gao J, Hu Y, Yin B. Rank Consistency Induced Multiview Subspace Clustering via Low-Rank Matrix Factorization. IEEE Trans Neural Netw Learn Syst 2022; 33:3157-3170. [PMID: 33882005 DOI: 10.1109/tnnls.2021.3071797] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]
Abstract
Multiview subspace clustering has been demonstrated to achieve excellent performance in practice by exploiting multiview complementary information. One of the strategies used in most existing methods is to learn a shared self-expressiveness coefficient matrix for all the view data. Different from such a strategy, this article proposes a rank-consistency-induced multiview subspace clustering model to pursue a consistent low-rank structure among view-specific self-expressiveness coefficient matrices. To facilitate a practical model, we parameterize the low-rank structure on all self-expressiveness coefficient matrices through tri-factorization along with orthogonal constraints. This specification ensures that the self-expressiveness coefficient matrices of different views have the same rank, effectively promoting structural consistency across views. Such a model can simultaneously learn a consistent subspace structure and fully exploit the complementary information from the view-specific self-expressiveness coefficient matrices. The proposed model is formulated as a nonconvex optimization problem, and an efficient optimization algorithm with guaranteed convergence under mild conditions is proposed. Extensive experiments on several benchmark databases demonstrate the advantage of the proposed model over state-of-the-art multiview clustering approaches.
Collapse
|
31
|
Affiliation(s)
- Y Xie
- Department of Dermatovenereology, Chengdu Second People's Hospital, Chengdu, China
| | - B Yin
- Department of Dermatovenereology, Chengdu Second People's Hospital, Chengdu, China
| | - X Shi
- Department of Dermatovenereology, The Second Affiliated Hospital of Soochow University, Suzhou, China
| |
Collapse
|
32
|
Yin Y, Yin B, Bi X. P-290 Real-world evidence of anlotinib in patients with advanced hepatocellular carcinoma and clinical role of α-fetoprotein. Ann Oncol 2022. [DOI: 10.1016/j.annonc.2022.04.379] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/01/2022] Open
|
33
|
Rong Z, Wang S, Kong D, Yin B. Improving object detection quality with structural constraints. PLoS One 2022; 17:e0267863. [PMID: 35584103 PMCID: PMC9116628 DOI: 10.1371/journal.pone.0267863] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/18/2021] [Accepted: 04/16/2022] [Indexed: 12/03/2022] Open
Abstract
Recent research has revealed that object detection networks using the simple “classification loss + localization loss” training objective are not effectively optimized in many cases, while providing additional constraints on network features can effectively improve object detection quality. Specifically, some works used constraints on training-sample relations to successfully learn discriminative network features. Based on these observations, we propose a structural constraint for improving object detection quality. The structural constraint supervises feature learning in the classification and localization network branches with a Fisher Loss and an Equi-proportion Loss, respectively, by requiring the feature similarities of training sample pairs to be consistent with the corresponding ground-truth label similarities. The structural constraint can be applied to all object detection network architectures with the assistance of our Proxy Feature design. Our experimental results showed that the structural constraint mechanism is able to optimize the distribution of object class instances in network feature space, and consequently improve detection results. Evaluations on the MSCOCO2017 and KITTI datasets showed that our structural constraint mechanism enables baseline networks to outperform modern counterpart detectors in terms of object detection quality.
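The idea of requiring feature similarities to be consistent with label similarities can be sketched as follows. This is a generic stand-in loss for illustration; the paper's Fisher Loss and Equi-proportion Loss differ in form:

```python
import numpy as np

def similarity_consistency_loss(feats, labels):
    """Penalize mismatch between pairwise feature similarity and label agreement.

    feats: (N, d) sample features; labels: (N,) class ids. For every pair we
    compare cosine similarity against a 0/1 "same class" target and take the
    mean squared gap, so same-class pairs are pulled together and
    different-class pairs are pushed apart.
    """
    normed = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    sim = normed @ normed.T                                # (N, N) cosine sims
    target = (labels[:, None] == labels[None, :]).astype(float)
    return float(np.mean((sim - target) ** 2))

# Well-separated class features incur a lower loss than mixed-up features.
good = np.array([[1.0, 0.0], [1.0, 0.01], [0.0, 1.0], [0.01, 1.0]])
bad = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
y = np.array([0, 0, 1, 1])
```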
Collapse
Affiliation(s)
- Zihao Rong
- Beijing Key Laboratory of Multimedia and Intelligent Software Technology, Beijing Institute of Artificial Intelligence, Faculty of Information Technology, Beijing University of Technology, Beijing, China
| | - Shaofan Wang
- Beijing Key Laboratory of Multimedia and Intelligent Software Technology, Beijing Institute of Artificial Intelligence, Faculty of Information Technology, Beijing University of Technology, Beijing, China
| | - Dehui Kong
- Beijing Key Laboratory of Multimedia and Intelligent Software Technology, Beijing Institute of Artificial Intelligence, Faculty of Information Technology, Beijing University of Technology, Beijing, China
| | - Baocai Yin
- Beijing Key Laboratory of Multimedia and Intelligent Software Technology, Beijing Institute of Artificial Intelligence, Faculty of Information Technology, Beijing University of Technology, Beijing, China
| |
|
34
|
Kong D, Li X, Wang S, Li J, Yin B. Learning visual-and-semantic knowledge embedding for zero-shot image classification. APPL INTELL 2022. [DOI: 10.1007/s10489-022-03443-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/27/2022]
|
35
|
|
36
|
Wang L, Zhang Y, Zeng Z, Zhou H, He J, Liu P, Chen M, Han J, Srolovitz DJ, Teng J, Guo Y, Yang G, Kong D, Ma E, Hu Y, Yin B, Huang X, Zhang Z, Zhu T, Han X. Tracking the sliding of grain boundaries at the atomic scale. Science 2022; 375:1261-1265. [PMID: 35298254 DOI: 10.1126/science.abm2612] [Citation(s) in RCA: 34] [Impact Index Per Article: 17.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/24/2022]
Abstract
Grain boundaries (GBs) play an important role in the mechanical behavior of polycrystalline materials. Despite decades of investigation, the atomic-scale dynamic processes of GB deformation remain elusive, particularly for the GBs in polycrystals, which are commonly of the asymmetric and general type. We conducted an in situ atomic-resolution study to reveal how sliding-dominant deformation is accomplished at general tilt GBs in platinum bicrystals. We observed either direct atomic-scale sliding along the GB or sliding with atom transfer across the boundary plane. The latter sliding process was mediated by movements of disconnections that enabled the transport of GB atoms, leading to a previously unrecognized mode of coupled GB sliding and atomic plane transfer. These results enable an atomic-scale understanding of how general GBs slide in polycrystalline materials.
Affiliation(s)
- Lihua Wang
  - Institute of Microstructure and Property of Advanced Materials, Beijing Key Lab of Microstructure and Property of Advanced Materials, Beijing University of Technology, Beijing 100124, China
- Yin Zhang
  - Woodruff School of Mechanical Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA
- Zhi Zeng
  - Woodruff School of Mechanical Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA
- Hao Zhou
  - Woodruff School of Mechanical Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA
- Jian He
  - Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA
- Pan Liu
  - Shanghai Key Laboratory of Advanced High-Temperature Materials and Precision Forming, State Key Laboratory of Metal Matrix Composites, School of Materials Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, P. R. China
- Mingwei Chen
  - Department of Materials Science and Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
- Jian Han
  - Department of Materials Science and Engineering, City University of Hong Kong, Kowloon, Hong Kong SAR, China
- David J Srolovitz
  - Department of Mechanical Engineering, The University of Hong Kong, Hong Kong SAR, China; International Digital Economy Academy (IDEA), Shenzhen, China
- Jiao Teng
  - Department of Material Physics and Chemistry, University of Science and Technology Beijing, Beijing 100083, China
- Yizhong Guo
  - Institute of Microstructure and Property of Advanced Materials, Beijing Key Lab of Microstructure and Property of Advanced Materials, Beijing University of Technology, Beijing 100124, China
- Guo Yang
  - Institute of Microstructure and Property of Advanced Materials, Beijing Key Lab of Microstructure and Property of Advanced Materials, Beijing University of Technology, Beijing 100124, China
- Deli Kong
  - Institute of Microstructure and Property of Advanced Materials, Beijing Key Lab of Microstructure and Property of Advanced Materials, Beijing University of Technology, Beijing 100124, China
- En Ma
  - Center for Alloy Innovation and Design (CAID), State Key Laboratory for Mechanical Behavior of Materials, Xi'an Jiaotong University, Xi'an 710049, China
- Yongli Hu
  - Beijing Institute of Artificial Intelligence, Faculty of Information Technology, Beijing Key Laboratory of Multimedia and Intelligent Software Technology, Beijing University of Technology, Beijing 100124, China
- Baocai Yin
  - Beijing Institute of Artificial Intelligence, Faculty of Information Technology, Beijing Key Laboratory of Multimedia and Intelligent Software Technology, Beijing University of Technology, Beijing 100124, China
- XiaoXu Huang
  - College of Materials Science and Engineering, Chongqing University, Chongqing 40044, China
- Ze Zhang
  - Institute of Microstructure and Property of Advanced Materials, Beijing Key Lab of Microstructure and Property of Advanced Materials, Beijing University of Technology, Beijing 100124, China; Department of Materials Science, Zhejiang University, Hangzhou 310008, China
- Ting Zhu
  - Woodruff School of Mechanical Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA
- Xiaodong Han
  - Institute of Microstructure and Property of Advanced Materials, Beijing Key Lab of Microstructure and Property of Advanced Materials, Beijing University of Technology, Beijing 100124, China
|
37
|
Hu X, Sun Y, Gao J, Hu Y, Ju F, Yin B. Probabilistic Linear Discriminant Analysis Based on L1-Norm and Its Bayesian Variational Inference. IEEE Trans Cybern 2022; 52:1616-1627. [PMID: 32386179 DOI: 10.1109/tcyb.2020.2985997] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
Probabilistic linear discriminant analysis (PLDA) is a very effective feature extraction approach and has found extensive, successful application in supervised learning tasks. It employs the squared L2-norm to measure model errors, which implicitly assumes a Gaussian noise distribution. However, the noise in real-life applications may not follow a Gaussian distribution; in particular, the squared L2-norm can greatly exaggerate the influence of data outliers. To address this issue, this article proposes a robust PLDA model under the assumption of a Laplacian noise distribution, called L1-PLDA. The learning process expresses the Laplacian density function as a superposition of an infinite number of Gaussian distributions by introducing a new latent variable, and then adopts the variational expectation-maximization (EM) algorithm to learn the parameters. The most significant advantage of the new model is that the introduced latent variable can be used to detect data outliers. Experiments on several public databases show the superiority of the proposed L1-PLDA model in terms of classification and outlier detection.
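The Gaussian scale-mixture identity that underlies this construction can be checked numerically: drawing a variance from an exponential distribution and then sampling a zero-mean Gaussian with that variance yields a Laplacian variable. This is a sketch of the identity only, not the paper's variational EM procedure.

```python
import numpy as np

rng = np.random.default_rng(42)
b, n = 1.5, 200_000          # Laplacian scale and sample count

# Scale mixture: V ~ Exponential(mean = 2 b^2), then X | V ~ N(0, V)
v = rng.exponential(scale=2 * b**2, size=n)
x = rng.normal(0.0, np.sqrt(v))

# Moments of Laplace(0, b) for comparison: E|X| = b, Var(X) = 2 b^2
print(np.mean(np.abs(x)), np.var(x))
```

The empirical mean absolute deviation and variance should match b and 2b² to within sampling error, which is what makes the Laplacian likelihood tractable via Gaussian machinery.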
|
38
|
Jia X, Zhou Y, Li W, Li J, Yin B. Data-aware relation learning-based graph convolution neural network for facial action unit recognition. Pattern Recognit Lett 2022. [DOI: 10.1016/j.patrec.2022.02.010] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/19/2022]
|
39
|
Abbott T, Aguena M, Alarcon A, Allam S, Alves O, Amon A, Andrade-Oliveira F, Annis J, Avila S, Bacon D, Baxter E, Bechtol K, Becker M, Bernstein G, Bhargava S, Birrer S, Blazek J, Brandao-Souza A, Bridle S, Brooks D, Buckley-Geer E, Burke D, Camacho H, Campos A, Carnero Rosell A, Carrasco Kind M, Carretero J, Castander F, Cawthon R, Chang C, Chen A, Chen R, Choi A, Conselice C, Cordero J, Costanzi M, Crocce M, da Costa L, da Silva Pereira M, Davis C, Davis T, De Vicente J, DeRose J, Desai S, Di Valentino E, Diehl H, Dietrich J, Dodelson S, Doel P, Doux C, Drlica-Wagner A, Eckert K, Eifler T, Elsner F, Elvin-Poole J, Everett S, Evrard A, Fang X, Farahi A, Fernandez E, Ferrero I, Ferté A, Fosalba P, Friedrich O, Frieman J, García-Bellido J, Gatti M, Gaztanaga E, Gerdes D, Giannantonio T, Giannini G, Gruen D, Gruendl R, Gschwend J, Gutierrez G, Harrison I, Hartley W, Herner K, Hinton S, Hollowood D, Honscheid K, Hoyle B, Huff E, Huterer D, Jain B, James D, Jarvis M, Jeffrey N, Jeltema T, Kovacs A, Krause E, Kron R, Kuehn K, Kuropatkin N, Lahav O, Leget PF, Lemos P, Liddle A, Lidman C, Lima M, Lin H, MacCrann N, Maia M, Marshall J, Martini P, McCullough J, Melchior P, Mena-Fernández J, Menanteau F, Miquel R, Mohr J, Morgan R, Muir J, Myles J, Nadathur S, Navarro-Alsina A, Nichol R, Ogando R, Omori Y, Palmese A, Pandey S, Park Y, Paz-Chinchón F, Petravick D, Pieres A, Plazas Malagón A, Porredon A, Prat J, Raveri M, Rodriguez-Monroy M, Rollins R, Romer A, Roodman A, Rosenfeld R, Ross A, Rykoff E, Samuroff S, Sánchez C, Sanchez E, Sanchez J, Sanchez Cid D, Scarpine V, Schubnell M, Scolnic D, Secco L, Serrano S, Sevilla-Noarbe I, Sheldon E, Shin T, Smith M, Soares-Santos M, Suchyta E, Swanson M, Tabbutt M, Tarle G, Thomas D, To C, Troja A, Troxel M, Tucker D, Tutusaus I, Varga T, Walker A, Weaverdyck N, Wechsler R, Weller J, Yanny B, Yin B, Zhang Y, Zuntz J. Dark Energy Survey Year 3 results: Cosmological constraints from galaxy clustering and weak lensing. 
Phys Rev D 2022. [DOI: 10.1103/physrevd.105.023520] [Citation(s) in RCA: 106] [Impact Index Per Article: 53.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]
|
40
|
Amon A, Gruen D, Troxel M, MacCrann N, Dodelson S, Choi A, Doux C, Secco L, Samuroff S, Krause E, Cordero J, Myles J, DeRose J, Wechsler R, Gatti M, Navarro-Alsina A, Bernstein G, Jain B, Blazek J, Alarcon A, Ferté A, Lemos P, Raveri M, Campos A, Prat J, Sánchez C, Jarvis M, Alves O, Andrade-Oliveira F, Baxter E, Bechtol K, Becker M, Bridle S, Camacho H, Carnero Rosell A, Carrasco Kind M, Cawthon R, Chang C, Chen R, Chintalapati P, Crocce M, Davis C, Diehl H, Drlica-Wagner A, Eckert K, Eifler T, Elvin-Poole J, Everett S, Fang X, Fosalba P, Friedrich O, Gaztanaga E, Giannini G, Gruendl R, Harrison I, Hartley W, Herner K, Huang H, Huff E, Huterer D, Kuropatkin N, Leget P, Liddle A, McCullough J, Muir J, Pandey S, Park Y, Porredon A, Refregier A, Rollins R, Roodman A, Rosenfeld R, Ross A, Rykoff E, Sanchez J, Sevilla-Noarbe I, Sheldon E, Shin T, Troja A, Tutusaus I, Varga T, Weaverdyck N, Yanny B, Yin B, Zhang Y, Zuntz J, Aguena M, Allam S, Annis J, Bacon D, Bertin E, Bhargava S, Brooks D, Buckley-Geer E, Burke D, Carretero J, Costanzi M, da Costa L, Pereira M, De Vicente J, Desai S, Dietrich J, Doel P, Ferrero I, Flaugher B, Frieman J, García-Bellido J, Gaztanaga E, Gerdes D, Giannantonio T, Gschwend J, Gutierrez G, Hinton S, Hollowood D, Honscheid K, Hoyle B, James D, Kron R, Kuehn K, Lahav O, Lima M, Lin H, Maia M, Marshall J, Martini P, Melchior P, Menanteau F, Miquel R, Mohr J, Morgan R, Ogando R, Palmese A, Paz-Chinchón F, Petravick D, Pieres A, Romer A, Sanchez E, Scarpine V, Schubnell M, Serrano S, Smith M, Soares-Santos M, Tarle G, Thomas D, To C, Weller J. Dark Energy Survey Year 3 results: Cosmology from cosmic shear and robustness to data calibration. Phys Rev D 2022. [DOI: 10.1103/physrevd.105.023514] [Citation(s) in RCA: 42] [Impact Index Per Article: 21.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]
|
41
|
Yin B, Xia B. Expression and Clinical Significance of Micro Ribonucleic Acid-132 and Sex-Determining Region Y-Box 4 in Colon Cancer. Indian J Pharm Sci 2022. [DOI: 10.36468/pharmaceutical-sciences.spl.505] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022] Open
|
42
|
Zhao Y, Yin B, Xia B. Expression and Clinical Significance of Long Non-Coding Ribonucleic Acid LOC554202 and H19 in Serum of Cervical Cancer. Indian J Pharm Sci 2022. [DOI: 10.36468/pharmaceutical-sciences.spl.515] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022] Open
|
43
|
|
44
|
Chen XQ, Zheng DY, Xiao YY, Dong BL, Cao CW, Ma L, Tong ZS, Zhu M, Liu ZH, Xi LY, Fu M, Jin Y, Yin B, Li FQ, Li XF, Abliz P, Liu HF, Zhang Y, Yu N, Wu WW, Xiong XC, Zeng JS, Huang HQ, Jiang YP, Chen GZ, Pan WH, Sang H, Wang Y, Guo Y, Shi DM, Yang JX, Chen W, Wan Z, Li RY, Wang AP, Ran YP, Yu J. Aetiology of tinea capitis in China: A multicentre prospective study. Br J Dermatol 2021; 186:705-712. [PMID: 34741300 DOI: 10.1111/bjd.20875] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 10/30/2021] [Indexed: 02/05/2023]
Abstract
BACKGROUND Tinea capitis is still common in developing countries, such as China. Its pathogen spectrum varies across regions and changes over time. OBJECTIVES This study aimed to clarify the current epidemiological characteristics and pathogen spectrum of tinea capitis in China. METHODS A multicentre, prospective descriptive study involving 29 tertiary hospitals in China was conducted. From August 2019 to July 2020, 611 patients with tinea capitis were enrolled. Data concerning demography, risk factors and fungal tests were collected. The pathogens were further identified by morphology or molecular sequencing when necessary in the central laboratory. RESULTS Among all enrolled patients, 74.1% of the cases were 2- to 8-year-olds. The children with tinea capitis were mainly boys (56.2%) and more likely to have an animal contact history (57.4% vs. 35.3%, P = 0.012) and zoophilic dermatophyte infection (73.5%). The adults were mainly females (83.3%) and more likely to have anthropophilic agent infection (53.5%). The most common pathogen was zoophilic Microsporum canis (354, 65.2%), followed by anthropophilic Trichophyton violaceum (74, 13.6%). In contrast to the eastern, western and northeastern regions where zoophilic M. canis predominated, anthropophilic T. violaceum predominated in central China (69.2%, P < 0.0001), where the patients had the most tinea at other sites (20.3%) and dermatophytosis contact (25.9%) with the least animal contact (38.8%). Microsporum ferrugineum was the most common anthropophilic agent in the western area, especially in Xinjiang Province. CONCLUSIONS Boys aged approximately 5 years were mainly affected. Dermatologists are advised to pay more attention to the different transmission routes and pathogen spectra in different age groups from different regions.
Affiliation(s)
- X-Q Chen
  - Department of Dermatology and Venereology, Peking University First Hospital, National Clinical Research Centre for Skin and Immune Diseases, Beijing Key Laboratory of Molecular Diagnosis on Dermatoses, NMPA Key Laboratory for Quality Control and Evaluation of Cosmetics, Beijing, China
- D-Y Zheng
  - Department of Dermatology and Venereology, the First Affiliated Hospital of Guangxi Medical University, Nanning, China
- Y-Y Xiao
  - Department of Dermatology, Beijing Children's Hospital, Capital Medical University, National Center for Children's Health, Beijing, China
- B-L Dong
  - Department of Dermatology, Wuhan No.1 Hospital, Wuhan, China
- C-W Cao
  - Department of Dermatology and Venereology, the First Affiliated Hospital of Guangxi Medical University, Nanning, China
- L Ma
  - Department of Dermatology, Beijing Children's Hospital, Capital Medical University, National Center for Children's Health, Beijing, China
- Z-S Tong
  - Department of Dermatology, Wuhan No.1 Hospital, Wuhan, China
- M Zhu
  - Department of Dermatology, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai, China
- Z-H Liu
  - Department of Dermatology, Hangzhou Third People's Hospital, Affiliated Hangzhou Dermatology Hospital, Zhejiang University School of Medicine, Hangzhou, China
- L-Y Xi
  - Department of Dermatology, Sun Yat-sen Memorial Hospital, Sun Yat-sen University, Guangzhou, China
- M Fu
  - Department of Dermatology, Xijing Hospital, Xi'an, China
- Y Jin
  - Department of Dermatology, Dermatology Hospital of Jiangxi Province, Nanchang, China
- B Yin
  - Department of Dermatology, Chengdu Second People's Hospital, Chengdu, China
- F-Q Li
  - Department of Dermatology, the Second Hospital of Jilin University, Changchun, China
- X-F Li
  - Institute of Dermatology, Chinese Academy of Medical Sciences and Peking Union Medical College, Nanjing, China
- P Abliz
  - Department of Dermatology, the First Affiliated Hospital of Xinjiang Medical University, Urumqi, China
- H-F Liu
  - Department of Dermatology, Dermatology Hospital of Southern Medical University, Guangzhou, China
- Y Zhang
  - Department of Dermatology, Tianjin Academy of Traditional Chinese Medicine Affiliated Hospital, Tianjin, China
- N Yu
  - Department of Dermatology, General Hospital of Ningxia Medical University, Yinchuan, China
- W-W Wu
  - Department of Dermatology, the Fifth People's Hospital of Hainan Province, Haikou, China
- X-C Xiong
  - Department of Dermatology, Affiliated Hospital of North Sichuan Medical College, Nanchong, China
- J-S Zeng
  - Department of Dermatology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- H-Q Huang
  - Department of Dermatology and Venereology, Third Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
- Y-P Jiang
  - Department of Dermatology, the Affiliated Hospital of Guizhou Medical University, Guiyang, China
- G-Z Chen
  - Department of Dermatology, the Affiliated Hospital of Qingdao University, Qingdao, China
- W-H Pan
  - Department of Dermatology, Shanghai Changzheng Hospital, Naval Military Medical University, Shanghai, China
- H Sang
  - Department of Dermatology, Jinling Hospital, Medical School of Nanjing University, Nanjing, China
- Y Wang
  - Department of Dermatology, Changhai Hospital of Shanghai, Shanghai, China
- Y Guo
  - Department of Dermatology, the Second Affiliated Hospital of Kunming Medical University, Kunming, China
- D-M Shi
  - Department of Dermatology, Jining No, People's Hospital, Jining, China
- J-X Yang
  - Department of Dermatology, 2nd Affiliated Hospital of Harbin Medical University, Harbin, China
- W Chen
  - Department of Dermatology and Venereology, Peking University First Hospital, National Clinical Research Centre for Skin and Immune Diseases, Beijing Key Laboratory of Molecular Diagnosis on Dermatoses, NMPA Key Laboratory for Quality Control and Evaluation of Cosmetics, Beijing, China
- Z Wan
  - Department of Dermatology and Venereology, Peking University First Hospital, National Clinical Research Centre for Skin and Immune Diseases, Beijing Key Laboratory of Molecular Diagnosis on Dermatoses, NMPA Key Laboratory for Quality Control and Evaluation of Cosmetics, Beijing, China
- R-Y Li
  - Department of Dermatology and Venereology, Peking University First Hospital, National Clinical Research Centre for Skin and Immune Diseases, Beijing Key Laboratory of Molecular Diagnosis on Dermatoses, NMPA Key Laboratory for Quality Control and Evaluation of Cosmetics, Beijing, China
- A-P Wang
  - Department of Dermatology and Venereology, Peking University First Hospital, National Clinical Research Centre for Skin and Immune Diseases, Beijing Key Laboratory of Molecular Diagnosis on Dermatoses, NMPA Key Laboratory for Quality Control and Evaluation of Cosmetics, Beijing, China
- Y-P Ran
  - Department of Dermatology, West China Hospital, Sichuan University, Chengdu, China
- J Yu
  - Department of Dermatology and Venereology, Peking University First Hospital, National Clinical Research Centre for Skin and Immune Diseases, Beijing Key Laboratory of Molecular Diagnosis on Dermatoses, NMPA Key Laboratory for Quality Control and Evaluation of Cosmetics, Beijing, China
|
45
|
Xu K, Tian X, Yang X, Yin B, Lau RWH. Intensity-Aware Single-Image Deraining With Semantic and Color Regularization. IEEE Trans Image Process 2021; 30:8497-8509. [PMID: 34623268 DOI: 10.1109/tip.2021.3116794] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]
Abstract
Rain degrades image visual quality and disrupts object structures, obscuring their details and erasing their colors. Existing deraining methods are primarily based on modeling either the visual appearance of rain or its physical characteristics (e.g., rain direction and density), and thus suffer from two common problems. First, due to the stochastic nature of rain, they tend to fail to recognize rain streaks correctly and wrongly remove image structures and details. Second, they fail to recover the image colors erased by heavy rain. In this paper, we address these two problems with the following three contributions. First, we propose a novel PHP block to aggregate comprehensive spatial and hierarchical information for removing rain streaks of different sizes. Second, we propose a novel network that first removes rain streaks, then recovers object structures/colors, and finally enhances details. Third, to train the network, we prepare a new dataset and propose a novel loss function to introduce semantic and color regularization for deraining. Extensive experiments demonstrate the superiority of the proposed method over state-of-the-art deraining methods on both synthesized and real-world data, in terms of visual quality, quantitative accuracy, and running speed.
|
46
|
Abstract
BACKGROUND Chest computed tomography (CT) plays an important role in the diagnosis and assessment of coronavirus disease 2019 (COVID-19). OBJECTIVE To evaluate the value of an artificial intelligence (AI) scoring system for radiologically assessing the severity of COVID-19. MATERIALS AND METHODS Chest CT images of 81 patients (61 of normal type and 20 of severe type) with confirmed COVID-19 were used. The test data were anonymized. The scores obtained by four methods (junior radiologists; AI scoring system; human-AI segmentation system; human-AI scoring system) were compared with those of two experienced radiologists (reference score). The mean absolute errors (MAEs) between each of the four methods and the experienced radiologists were calculated separately. The Wilcoxon test was used to assess the significance of COVID-19 severity, and Spearman correlation and ROC analyses were used to evaluate the performance of the different scores. RESULTS The AI score had a relatively low MAE (1.67-2.21). The score of the human-AI scoring system had the lowest MAE (1.67), a diagnostic value almost equal to the reference score (r = 0.97), and the strongest correlation with clinical severity (r = 0.59, p < 0.001). The AUCs of the reference score, the score of junior radiologists, the AI score, the score of the human-AI segmentation system, and the score of the human-AI scoring system were 0.874, 0.841, 0.852, 0.857, and 0.865, respectively. CONCLUSION The human-AI scoring system can help radiologists improve the accuracy of COVID-19 severity assessment.
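The two core evaluation quantities here, MAE against a reference score and Spearman rank correlation, are easy to compute directly. The sketch below uses made-up scores for five hypothetical patients (not the study's data) and a tie-free rank-based Spearman.

```python
import numpy as np

def mae(pred, ref):
    """Mean absolute error between a method's severity scores and
    the experienced radiologists' reference scores."""
    pred, ref = np.asarray(pred, float), np.asarray(ref, float)
    return float(np.mean(np.abs(pred - ref)))

def spearman(a, b):
    """Spearman rho as the Pearson correlation of ranks (assumes no ties)."""
    ra = np.argsort(np.argsort(a)).astype(float)
    rb = np.argsort(np.argsort(b)).astype(float)
    ra -= ra.mean(); rb -= rb.mean()
    return float(ra @ rb / np.sqrt((ra @ ra) * (rb @ rb)))

# Hypothetical severity scores for five patients
reference = [3, 7, 5, 9, 2]
ai_score  = [4, 6, 5, 10, 2]
print(mae(ai_score, reference), spearman(ai_score, reference))
```

For real data with tied scores, a tie-aware implementation such as `scipy.stats.spearmanr` should be used instead.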
Affiliation(s)
- Mingzhu Liu
  - Department of Radiology, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
- Weifu Lv
  - Department of Radiology, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
- Baocai Yin
  - Anhui iFlytek Healthcare Information Technology Co., Ltd, Hefei, Anhui, China
- Wei Wei
  - Department of Radiology, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
|
47
|
Sun B, Wang S, Kong D, Wang L, Yin B. Real-Time Human Action Recognition Using Locally Aggregated Kinematic-Guided Skeletonlet and Supervised Hashing-by-Analysis Model. IEEE Trans Cybern 2021; PP:4837-4849. [PMID: 34437085 DOI: 10.1109/tcyb.2021.3100507] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]
Abstract
3-D action recognition refers to the classification of action sequences consisting of 3-D skeleton joints. While many research works have been devoted to 3-D action recognition, the task mainly suffers from three problems: 1) highly complicated articulation; 2) a great amount of noise; and 3) low implementation efficiency. To tackle all these problems, we propose a real-time 3-D action-recognition framework that integrates a locally aggregated kinematic-guided skeletonlet (LAKS) with a supervised hashing-by-analysis (SHA) model. We first define the skeletonlet as a few combinations of joint offsets grouped according to the kinematic principle, and then represent an action sequence using LAKS, which consists of a denoising phase and a locally aggregating phase. The denoising phase detects noisy action data and adjusts it by replacing all the features within it with the features of the corresponding previous frame, while the locally aggregating phase sums the difference between an offset feature of the skeletonlet and its cluster center over all the offset features of the sequence. Finally, the SHA model combines sparse representation with a hashing model, aiming to promote recognition accuracy while maintaining high efficiency. Experimental results on the MSRAction3D, UTKinectAction3D, and Florence3DAction datasets demonstrate that the proposed method outperforms state-of-the-art methods in both recognition accuracy and implementation efficiency.
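The locally aggregating phase described above, summing the difference between each offset feature and its cluster center over the sequence, can be sketched as a VLAD-style aggregation. This is our assumed reading of that phase, with hypothetical names and data; the paper's exact assignment and normalization may differ.

```python
import numpy as np

def locally_aggregate(offset_feats, centers):
    """Assign each offset feature to its nearest cluster center and
    sum the residuals (feature minus center) per cluster, then
    concatenate into one sequence descriptor."""
    k, d = centers.shape
    agg = np.zeros((k, d))
    for f in offset_feats:
        j = int(np.argmin(np.linalg.norm(centers - f, axis=1)))
        agg[j] += f - centers[j]
    return agg.ravel()

# Toy sequence of 2-D offset features and two cluster centers
centers = np.array([[0.0, 0.0], [10.0, 10.0]])
seq = np.array([[1.0, 0.0], [0.0, 1.0], [9.0, 10.0]])
print(locally_aggregate(seq, centers))
```

A sequence whose offset features coincide with the cluster centers yields the zero descriptor, so the descriptor measures how the sequence deviates from the learned vocabulary.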
|
48
|
Wang B, Hu Y, Gao J, Sun Y, Ju F, Yin B. Adaptive Fusion of Heterogeneous Manifolds for Subspace Clustering. IEEE Trans Neural Netw Learn Syst 2021; 32:3484-3497. [PMID: 32776883 DOI: 10.1109/tnnls.2020.3011717] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
Multiview clustering (MVC) has recently received great interest due to its pleasing efficacy in combining abundant and complementary information to improve clustering performance, overcoming the single-view limitation of standard clustering. However, existing MVC methods are mostly designed for vectorial data from linear spaces and are thus not suitable for multidimensional data with intrinsic nonlinear manifold structures, e.g., videos or image sets. Some works have introduced manifold representations of data into MVC and obtained considerable improvements, but how to fuse multiple manifolds efficiently for clustering remains a challenging problem; for heterogeneous manifolds in particular, it is an entirely new one. In this article, we propose to represent complicated multiview data as heterogeneous manifolds, together with a fusion framework of heterogeneous manifolds for clustering. Unlike empirical weighting methods, an adaptive fusion strategy is designed to weight the importance of different manifolds in a data-driven manner. In addition, the low-rank representation is generalized onto the fused heterogeneous manifolds to explore the low-dimensional subspace structures embedded in data for clustering. We assessed the proposed method on several public data sets, including human action video, facial image, and traffic scenario video. The experimental results show that our method clearly outperforms a number of state-of-the-art clustering methods.
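One common way to realize "weighting the importance of different manifolds in a data-driven manner" is a softmax over negative per-view fitting errors, so better-fitting views receive larger weights. This is a hypothetical illustration of the general idea, not the paper's actual fusion strategy; the function name and `gamma` parameter are ours.

```python
import numpy as np

def adaptive_view_weights(view_errors, gamma=1.0):
    """Softmax over negative per-view errors: lower error ->
    larger weight; gamma controls how sharply weights concentrate."""
    e = np.asarray(view_errors, dtype=float)
    w = np.exp(-gamma * (e - e.min()))   # shift by min for numerical stability
    return w / w.sum()

# Three views with increasing fitting error
w = adaptive_view_weights([0.2, 1.0, 3.0])
print(w)
```

Larger `gamma` pushes the fusion toward relying on the single best view, while `gamma -> 0` recovers uniform (empirical) weighting.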
|
49
|
Wu G, Shi Y, Sun X, Wang J, Yin B. SMSIR: Spherical Measure Based Spherical Image Representation. IEEE Trans Image Process 2021; 30:6377-6391. [PMID: 34003750 DOI: 10.1109/tip.2021.3079797] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]
Abstract
This paper presents a spherical measure based spherical image representation (SMSIR) and sphere-based resampling methods for generating it; on this basis, a spherical wavelet transform is also proposed. We first give a formal recursive definition of the spherical triangle elements of SMSIR and a dyadic index scheme. The index scheme, which supports global random access and need not be precomputed and stored, can index the elements of SMSIR as efficiently as planar images. Two resampling methods are presented for generating SMSIR from the most commonly used ERP (equirectangular projection) representation. Notably, the spherical measure based resampling, which exploits the mapping between the spherical and the parameter domain, achieves higher computational efficiency than the spherical RBF (radial basis function) based resampling. Finally, we design high-pass and low-pass filters with lifting schemes based on the dyadic index, to further verify the efficiency of our index and to handle spherical isotropy. This provides a novel multi-resolution analysis (MRA) for spherical images. Experiments on continuous synthetic spherical images indicate that our representation can recover the original image signals with higher accuracy than the ERP and CMP (cubemap) representations at the same sampling rate. The resampling experiments on natural spherical images show that our resampling methods outperform bilinear and bicubic interpolation in both subjective and objective quality; in particular, a gain as high as 2 dB in terms of S-PSNR is achieved. Experiments also show that our spherical image transform captures more geometric features of spherical images than the traditional wavelet transform.
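A dyadic index over recursively subdivided triangles can be sketched with plain bit arithmetic: if each spherical triangle splits into four children, a child's index appends two bits to its parent's, so any element can be addressed (and its ancestry recovered) without precomputed tables. This is an assumed scheme for illustration; the paper's exact indexing may differ.

```python
def child_index(parent, child):
    """Append two bits selecting one of the four sub-triangles."""
    assert 0 <= child < 4
    return (parent << 2) | child

def path_from_root(index, depth):
    """Recover the sequence of child choices from the root by
    reading the index two bits at a time."""
    return [(index >> (2 * i)) & 3 for i in range(depth)][::-1]

# root -> child 2 -> child 3
idx = child_index(child_index(0, 2), 3)
print(idx, path_from_root(idx, 2))
```

Because the index is just the concatenated child choices, random access at any subdivision level is O(depth) bit operations, which is what makes lifting-scheme filters over the hierarchy cheap.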
|
50
|
|