1
Liu B, Liu J, Xiao Y, Chen Q, Wang K, Huang R, Li L. A new self-paced learning method for privilege-based positive and unlabeled learning. Inf Sci (N Y) 2022. [DOI: 10.1016/j.ins.2022.07.143] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]

2
Zhang J, Liu M, Lu K, Gao Y. Group-Wise Learning for Aurora Image Classification With Multiple Representations. IEEE TRANSACTIONS ON CYBERNETICS 2021; 51:4112-4124. [PMID: 30932858 DOI: 10.1109/tcyb.2019.2903591] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/09/2023]
Abstract
In conventional aurora image classification methods, it is common to employ only a single feature representation to capture the morphological characteristics of aurora images, which makes it difficult to describe the complicated morphologies of different aurora categories. Although several studies have proposed using multiple feature representations, the inherent correlation among these representations is usually neglected. To address this problem, we propose a group-wise learning (GWL) method for automatic aurora image classification using multiple representations. Specifically, we first extract multiple feature representations for aurora images, and then construct a graph in each of the feature spaces. To model the correlation among different representations, we partition the graphs into several groups via a clustering algorithm. We further propose a GWL model to automatically estimate class labels for aurora images and optimal weights for the multiple representations in a data-driven manner. Finally, we develop a label fusion approach to make a final classification decision for new testing samples. The proposed GWL method focuses on the diverse properties of multiple feature representations by clustering correlated representations into the same group. We evaluate our method on an aurora image data set that contains 12 682 aurora images from 19 days. The experimental results demonstrate that the proposed GWL method achieves approximately 6% improvement in classification accuracy compared to methods using a single feature representation.

3
A supervised and distributed framework for cold-start author disambiguation in large-scale publications. Neural Comput Appl 2021. [DOI: 10.1007/s00521-020-05684-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/22/2022]

4
Wu Z, Cao J, Wang Y, Wang Y, Zhang L, Wu J. hPSD: A Hybrid PU-Learning-Based Spammer Detection Model for Product Reviews. IEEE TRANSACTIONS ON CYBERNETICS 2020; 50:1595-1606. [PMID: 30403648 DOI: 10.1109/tcyb.2018.2877161] [Citation(s) in RCA: 43] [Impact Index Per Article: 8.6] [Indexed: 06/08/2023]
Abstract
Spammers, who manipulate online reviews to promote or suppress products, are flooding online commerce. To combat this trend, a great deal of research has focused on detecting review spammers, most of which designs diversified features and then develops various classifiers. The widespread growth of crowdsourcing platforms has created large numbers of deceptive review writers who behave more like normal users, so that they can more easily evade detection by classifiers that are purely based on fixed characteristics. In this paper, we propose a hybrid semisupervised learning model, titled hybrid PU-learning-based spammer detection (hPSD), that leverages both the users' characteristics and the user-product relations. Specifically, the hPSD model can iteratively detect multitype spammers by injecting different positive samples, and allows the construction of classifiers in a semisupervised hybrid learning framework. Comprehensive experiments on a movie dataset with shilling injection confirm the superior performance of hPSD over existing baseline methods. hPSD is then used to detect hidden spammers in real-life Amazon data. A set of spammers and their underlying employers (e.g., book publishers) are successfully discovered and validated. These results demonstrate that hPSD suits real-world application scenarios and can effectively detect potentially deceptive review writers.

5
Manuweera B, Reynolds G, Kahanda I. Computational methods for the ab initio identification of novel microRNA in plants: a systematic review. PeerJ Comput Sci 2019; 5:e233. [PMID: 33816886 PMCID: PMC7924660 DOI: 10.7717/peerj-cs.233] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 04/02/2019] [Accepted: 10/14/2019] [Indexed: 06/12/2023]
Abstract
BACKGROUND: MicroRNAs (miRNAs) play a vital role as post-transcriptional regulators in gene expression. Experimental determination of miRNA sequence and structure is both expensive and time consuming. The next-generation sequencing revolution, which facilitated the rapid accumulation of biological data, has brought biology into the "big data" domain. As such, developing computational methods to predict miRNAs has become an active area of interdisciplinary research.
OBJECTIVE: The objective of this systematic review is to focus on the developments of ab initio plant miRNA identification methods over the last decade.
DATA SOURCES: Five databases were searched for relevant articles, according to a well-defined review protocol.
STUDY SELECTION: The search results were further filtered using selection criteria that included only studies on novel plant miRNA identification using machine learning.
DATA EXTRACTION: Relevant data from each study were extracted in order to analyze their methodologies and findings.
RESULTS: In the last decade, 20 articles were published on novel miRNA identification methods in plants, of which only 11 were primarily focused on plant microRNA identification. Our findings suggest a need for more stringent plant-focused miRNA identification studies.
CONCLUSION: Overall, the study accuracies are of a satisfactory level, although the methods may generate a considerable number of false negatives. In the future, attention must be paid to the biological plausibility of computationally identified miRNAs to prevent further propagation of biologically questionable miRNA sequences.
Affiliation(s)
- Buwani Manuweera: Gianforte School of Computing, Montana State University, Bozeman, MT, United States of America
- Gillian Reynolds: Gianforte School of Computing, Montana State University, Bozeman, MT, United States of America; Department of Plant Sciences and Plant Pathology, Montana State University, Bozeman, MT, United States of America
- Indika Kahanda: Gianforte School of Computing, Montana State University, Bozeman, MT, United States of America

6
Luo F, Du B, Zhang L, Zhang L, Tao D. Feature Learning Using Spatial-Spectral Hypergraph Discriminant Analysis for Hyperspectral Image. IEEE TRANSACTIONS ON CYBERNETICS 2019; 49:2406-2419. [PMID: 29994036 DOI: 10.1109/tcyb.2018.2810806] [Citation(s) in RCA: 45] [Impact Index Per Article: 7.5] [Indexed: 05/24/2023]
Abstract
A hyperspectral image (HSI) contains a large amount of spatial-spectral information, which makes it an enormous challenge for traditional classification methods to discriminate land-cover types. Feature learning is very effective for improving classification performance. However, current feature learning approaches are mostly based on a simple intrinsic structure. To represent the complex intrinsic spatial-spectral structure of HSI, a novel feature learning algorithm, termed spatial-spectral hypergraph discriminant analysis (SSHGDA), is proposed on the basis of spatial-spectral information, discriminant information, and hypergraph learning. SSHGDA constructs a reconstruction between-class scatter matrix, a weighted within-class scatter matrix, an intraclass spatial-spectral hypergraph, and an interclass spatial-spectral hypergraph to represent the intrinsic properties of HSI. Then, in low-dimensional space, a feature learning model is designed to compact the intraclass information and separate the interclass information. With this model, an optimal projection matrix can be obtained to extract the spatial-spectral features of HSI. SSHGDA can effectively reveal the complex spatial-spectral structures of HSI and enhance the discriminating power of features for land-cover classification. Experimental results on the Indian Pines and PaviaU HSI data sets show that SSHGDA can achieve better classification accuracies than some state-of-the-art methods.

7
Liu Z, Xie G, Zhang L, Pu J. Fusion linear representation-based classification. Soft Comput 2019. [DOI: 10.1007/s00500-017-2898-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]

8
Learning from crowds with active learning and self-healing. Neural Comput Appl 2018. [DOI: 10.1007/s00521-017-2878-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]

9
Şeyma Küçükaşcı E, Gökçe Baydoğan M. Bag encoding strategies in multiple instance learning problems. Inf Sci (N Y) 2018. [DOI: 10.1016/j.ins.2018.08.020] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 11/28/2022]

10
Wu J, Pan S, Zhu X, Zhang C, Yu PS. Multiple Structure-View Learning for Graph Classification. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2018; 29:3236-3251. [PMID: 28945603 DOI: 10.1109/tnnls.2017.2703832] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 06/07/2023]
Abstract
Many applications involve objects containing structure and rich content information, each describing different feature aspects of the object. Graph learning and classification is a common tool for handling such objects. To date, graph classification has been limited to the single-graph setting, with each object represented as one graph from a single structure view. This inherently limits its use for classifying complicated objects containing complex structures and uncertain labels. In this paper, we advance graph classification to handle multigraph learning for complicated objects from multiple structure views, where each object is represented as a bag containing several graphs and the label is available only for each graph bag, not for the individual graphs inside the bag. To learn such graph classification models, we propose a multistructure-view bag constrained learning (MSVBL) algorithm, which aims to explore substructure features across multiple structure views for learning. By enabling joint regularization across multiple structure views and enforcing labeling constraints at the bag and graph levels, MSVBL is able to discover the most effective substructure features across all structure views. Experiments and comparisons on real-world data sets demonstrate the superior performance of MSVBL in representing complicated objects as multigraphs for classification; e.g., MSVBL outperforms state-of-the-art multiview graph classification and multiview multi-instance learning approaches.

11
Bao H, Sakai T, Sato I, Sugiyama M. Convex formulation of multiple instance learning from positive and unlabeled bags. Neural Netw 2018; 105:132-141. [PMID: 29804041 DOI: 10.1016/j.neunet.2018.05.001] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 10/17/2017] [Revised: 05/01/2018] [Accepted: 05/01/2018] [Indexed: 10/16/2022]
Abstract
Multiple instance learning (MIL) is a variation of traditional supervised learning problems where data (referred to as bags) are composed of sub-elements (referred to as instances) and only bag labels are available. MIL has a variety of applications such as content-based image retrieval, text categorization, and medical diagnosis. Most previous work on MIL assumes that training bags are fully labeled. However, it is often difficult to obtain a sufficient number of labeled bags in practical situations, while many unlabeled bags are available. A learning framework called PU classification (positive and unlabeled classification) can address this problem. In this paper, we propose a convex PU classification method to solve an MIL problem. We experimentally show that the proposed method achieves better performance with significantly lower computation costs than an existing method for PU-MIL.
Affiliation(s)
- Han Bao: Department of Computer Science, The University of Tokyo, Japan; Center for Advanced Intelligence Project, RIKEN, Japan.
- Tomoya Sakai: Department of Complexity Science and Engineering, The University of Tokyo, Japan; Center for Advanced Intelligence Project, RIKEN, Japan.
- Issei Sato: Department of Complexity Science and Engineering, The University of Tokyo, Japan; Department of Computer Science, The University of Tokyo, Japan; Center for Advanced Intelligence Project, RIKEN, Japan.
- Masashi Sugiyama: Center for Advanced Intelligence Project, RIKEN, Japan; International Research Center for Neurointelligence, The University of Tokyo, Japan; Department of Complexity Science and Engineering, The University of Tokyo, Japan; Department of Computer Science, The University of Tokyo, Japan.

12
Zhu Z, Zhao Y. Multi-Graph Multi-Label Learning Based on Entropy. ENTROPY (BASEL, SWITZERLAND) 2018; 20:e20040245. [PMID: 33265336 PMCID: PMC7512760 DOI: 10.3390/e20040245] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 03/25/2018] [Revised: 03/30/2018] [Accepted: 03/30/2018] [Indexed: 06/12/2023]
Abstract
Recently, Multi-Graph Learning was proposed as an extension of Multi-Instance Learning and has achieved some success. However, to the best of our knowledge, there is currently no study on Multi-Graph Multi-Label Learning, where each object is represented as a bag containing a number of graphs and each bag is marked with multiple class labels. It is an interesting problem arising in many applications, such as image classification and medicinal analysis. In this paper, we propose an innovative algorithm to address the problem. First, it uses more precise structures, multiple graphs, instead of instances to represent an image so that classification accuracy can be improved. Then, it uses multiple labels as the output to eliminate the semantic ambiguity of the image. Furthermore, it calculates the entropy to mine informative subgraphs instead of just mining frequent subgraphs, which enables the selection of more accurate features for classification. Lastly, since current algorithms cannot directly deal with graph structures, we reduce Multi-Graph Multi-Label Learning to Multi-Instance Multi-Label Learning in order to solve it with MIML-ELM (Improving Multi-Instance Multi-Label Learning by Extreme Learning Machine). The performance study shows that our algorithm outperforms its competitors in terms of both effectiveness and efficiency.
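To make the idea of entropy-based subgraph mining concrete, the sketch below scores a candidate subgraph by the information gain of its presence over bag labels. The abstract does not state the exact criterion, so this uses standard information gain as a stand-in; the function names, the boolean presence vector, and the restriction to a single label per bag are all assumptions made for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels).values()
    return -sum((c / total) * log2(c / total) for c in counts)


def information_gain(contains_subgraph, labels):
    """Reduction in label entropy after splitting bags on subgraph presence.

    contains_subgraph[k] is True when bag k holds a graph containing the
    candidate subgraph (a hypothetical precomputed indicator).
    """
    present = [y for has, y in zip(contains_subgraph, labels) if has]
    absent = [y for has, y in zip(contains_subgraph, labels) if not has]
    n = len(labels)
    conditional = (len(present) / n) * entropy(present) \
        + (len(absent) / n) * entropy(absent)
    return entropy(labels) - conditional
```

A subgraph that appears in exactly the positive bags receives the maximum gain, while one that appears equally often in both classes receives zero, regardless of how frequent it is.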

13
Yan J, Li C, Li Y, Cao G. Adaptive Discrete Hypergraph Matching. IEEE TRANSACTIONS ON CYBERNETICS 2018; 48:765-779. [PMID: 28222006 DOI: 10.1109/tcyb.2017.2655538] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Indexed: 06/06/2023]
Abstract
This paper addresses the problem of hypergraph matching using higher-order affinity information. We propose a solver that iteratively updates the solution in the discrete domain by linear assignment approximation. The proposed method is guaranteed to converge to a stationary discrete solution and avoids the annealing procedure and ad hoc post-binarization step that are required in several previous methods. Specifically, we start with a simple iterative discrete gradient assignment solver. This solver can be trapped in an m-circle sequence under moderate conditions, where m is the order of the graph matching problem. We then devise an adaptive relaxation mechanism to jump out of this degenerate case and show that the resulting new path will converge to a fixed solution in the discrete domain. The proposed method is tested on both synthetic and real-world benchmarks. The experimental results corroborate the efficacy of our method.
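The iterative discrete update described in the abstract can be illustrated for the simplest, second-order (pairwise) case. The sketch below is not the authors' solver: the affinity function, the brute-force assignment step (practical only for very small graphs), and all names are assumptions, and the adaptive relaxation mechanism is not reproduced. It only shows the basic loop of linearizing the score at the current assignment, solving a linear assignment problem, and stopping at a fixed point or a detected cycle.

```python
from itertools import permutations


def pair_affinity(w1, w2, i, a, j, b):
    """Affinity of matching edge (i, j) of graph 1 to edge (a, b) of graph 2.

    Assumed form: 1 - |weight difference|, so identical edges score 1.
    """
    return 1.0 - abs(w1[i][j] - w2[a][b])


def discrete_update(w1, w2, current):
    """Linearize the pairwise score at `current`, then solve the resulting
    linear assignment exactly by enumerating permutations (small n only)."""
    n = len(w1)
    # gradient[i][a]: support for matching i -> a given the current matches
    gradient = [
        [sum(pair_affinity(w1, w2, i, a, j, current[j]) for j in range(n))
         for a in range(n)]
        for i in range(n)
    ]
    return max(
        permutations(range(n)),
        key=lambda p: sum(gradient[i][p[i]] for i in range(n)),
    )


def match(w1, w2, start, max_iters=50):
    """Iterate the update; return (assignment, reached_fixed_point)."""
    current = tuple(start)
    seen = {current}
    for _ in range(max_iters):
        nxt = discrete_update(w1, w2, current)
        if nxt == current:
            return current, True   # stationary discrete solution
        if nxt in seen:
            return nxt, False      # revisited an assignment: trapped in a cycle
        seen.add(nxt)
        current = nxt
    return current, False
```

Here `w1` and `w2` are symmetric edge-weight matrices given as nested lists, and `start` is an initial assignment mapping node `i` of the first graph to node `start[i]` of the second. The `False` return path corresponds to the cyclic behaviour that the abstract says the plain solver can exhibit.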

14
Yu J, Yang X, Gao F, Tao D. Deep Multimodal Distance Metric Learning Using Click Constraints for Image Ranking. IEEE TRANSACTIONS ON CYBERNETICS 2017; 47:4014-4024. [PMID: 27529881 DOI: 10.1109/tcyb.2016.2591583] [Citation(s) in RCA: 105] [Impact Index Per Article: 13.1] [Indexed: 06/06/2023]
Abstract
How do we retrieve images accurately? Also, how do we rank a group of images precisely and efficiently for specific queries? These problems are critical for researchers and engineers seeking to build a novel image search engine. First, it is important to obtain an appropriate description that effectively represents the images. In this paper, multimodal features are considered for describing images. The unique properties of images are reflected by visual features, which are correlated with each other. However, semantic gaps always exist between the visual features of images and their semantics. Therefore, we utilize click features to reduce the semantic gap. The second key issue is learning an appropriate distance metric to combine these multimodal features. This paper develops a novel deep multimodal distance metric learning (Deep-MDML) method. A structured ranking model is adopted to utilize both visual and click features in distance metric learning (DML). Specifically, images and their related ranking results are first collected to form the training set. Multimodal features, including click and visual features, are collected with these images. Next, a group of autoencoders is applied to obtain an initial distance metric in different visual spaces, and an MDML method is used to assign optimal weights for different modalities. We then conduct alternating optimization to train the ranking model, which is used for the ranking of new queries with click features. Compared with existing image ranking methods, the proposed method adopts a new ranking model to use multimodal features, including click features and visual features, in DML. We conducted experiments to analyze the proposed Deep-MDML on two benchmark data sets, and the results validate the effectiveness of the method.

15
Wang Y, Liu F, Xia ST, Wu J. Link sign prediction by Variational Bayesian Probabilistic Matrix Factorization with Student-t Prior. Inf Sci (N Y) 2017. [DOI: 10.1016/j.ins.2017.04.014] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Indexed: 11/26/2022]