1.
Cao X, Tsang IW, Xu J. Cold-Start Active Sampling Via γ-Tube. IEEE TRANSACTIONS ON CYBERNETICS 2022; 52:6034-6045. [PMID: 33878008] [DOI: 10.1109/tcyb.2021.3069956]
Abstract
Active learning (AL) improves the generalization performance of the current classification hypothesis by querying labels from a pool of unlabeled data. The sampling process is typically assessed by an informative, representative, or diverse evaluation policy. However, such a policy needs an initial labeled set to start, and its performance may degenerate under a cold-start hypothesis. In this article, we first show that typical AL sampling can be equivalently formulated as geometric sampling over minimum enclosing balls (MEBs) of clusters. (The MEB here denotes a conceptual geometry over a cluster used in the generalization analysis; in the SVM community, it is related to hard-margin support vector data description.) Following the γ-tube structure in geometric clustering, we then divide one MEB covering a cluster into two parts: 1) a γ-tube and 2) a γ-ball. By estimating the error disagreement between sampling in the MEB and in the γ-ball, our theoretical insight reveals that the γ-tube can effectively measure the disagreement between hypotheses in the original space over the MEB and the sampling space over the γ-ball. To tighten this insight, we present a generalization analysis, whose results show that sampling in the γ-tube yields a higher probability bound for achieving nearly zero generalization error. With these analyses, we finally apply the informative sampling policy of AL over the γ-tube to obtain a tube AL (TAL) algorithm that counters the cold-start sampling issue. As a result, the dependency between the querying process and the evaluation policy of active sampling can be alleviated. Experimental results show that, by using the γ-tube structure to deal with cold-start sampling, TAL achieves superior performance over standard AL evaluation baselines, with substantial accuracy improvements. Image edge recognition experiments extend our theoretical results.
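The cold-start issue the abstract describes can be illustrated with a generic sketch: seed the labeled set geometrically (no labels needed), then switch to the usual informative policy. This is only a toy illustration under assumed data and a perceptron-style learner, not the TAL algorithm or its γ-tube geometry.

```python
# Toy sketch of cold-start vs. warm active sampling (NOT the paper's TAL).
import numpy as np

def geometric_seed(X, k, rng):
    """Pick k seeds spread over the pool (farthest-point traversal),
    a label-free stand-in for sampling from cluster geometry."""
    idx = [int(rng.integers(len(X)))]
    for _ in range(k - 1):
        d = np.min(np.linalg.norm(X[:, None] - X[idx], axis=2), axis=1)
        idx.append(int(np.argmax(d)))
    return idx

def uncertainty_query(X, w, b, labeled):
    """Standard informative policy: query the unlabeled point closest
    to the current linear decision boundary |w.x + b|."""
    margin = np.abs(X @ w + b)
    margin[list(labeled)] = np.inf   # never re-query labeled points
    return int(np.argmin(margin))

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

labeled = set(geometric_seed(X, 5, rng))   # cold start: no labels required
w, b = np.zeros(2), 0.0
for _ in range(20):                        # warm phase: informative policy
    for i in labeled:                      # crude perceptron-style updates
        pred = 1 if X[i] @ w + b > 0 else 0
        w += (y[i] - pred) * X[i]
        b += (y[i] - pred)
    labeled.add(uncertainty_query(X, w, b, labeled))

acc = np.mean((X @ w + b > 0).astype(int) == y)
print(len(labeled), round(acc, 2))
```

The geometric seed gives the uncertainty policy a usable first hypothesis; starting the same loop with an empty labeled set would leave the margin query uninformed.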
2.
Ghafarian SH, Yazdi HS. Prepare for the Worst, Hope for the Best: Active Robust Learning On Distributions. IEEE TRANSACTIONS ON CYBERNETICS 2022; 52:5573-5586. [PMID: 34033565] [DOI: 10.1109/tcyb.2021.3071547]
Abstract
In recent years, many learning systems have been developed for higher level forms of data, such as learning on distributions, in which each example is itself a distribution. This article proposes active robust learning on distributions. In learning on distributions, there is no access to the distributions themselves; access is only through a sample drawn from each distribution. Therefore, as in robust learning, any estimates of examples are inexact. To address these difficulties, we provide an upper bound on the risk of the classifier in the next stage of active learning, where the size of the labeled dataset increases. Based on this upper bound, we propose probabilistic minimax active learning (PMAL), a general multiclass active learning method that is easy to use in many Bayesian settings and provably selects the example whose label, once known, minimizes the expected risk. To deal with the intractability of the proposed method for active robust learning, we present an efficient approximation of the objective with a known error bound. Here we face a nonconvex problem, which we solve by means of a related convex problem with a bound on the norm of the difference between their solutions. To utilize the information in the estimates of distributions, we propose an active robust learning on distributions method based on learning the kernel embedding of distributions with a recent Bayesian method. The experiments demonstrate the effectiveness of the resulting method on a set of synthetic and real-world distributional datasets.
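The "prepare for the worst" flavor of query selection can be sketched in a toy version-space setting: among candidate queries, pick the one whose worst-case label outcome leaves the smallest set of consistent hypotheses. This is a generic minimax illustration with made-up thresholds and pool values, not the PMAL objective or its Bayesian risk bound.

```python
# Toy minimax query selection over a 1-D threshold hypothesis class.
thresholds = [t / 10 for t in range(11)]      # hypotheses: label(x) = (x >= t)
pool = [0.05, 0.33, 0.48, 0.61, 0.92]         # unlabeled candidate queries
version_space = set(thresholds)

def consistent(space, x, y):
    """Hypotheses in `space` that predict label y on point x."""
    return {t for t in space if (x >= t) == y}

def minimax_query(space, pool):
    """Pick the x whose WORST label outcome shrinks the space the most."""
    best_x, best_worst = None, float("inf")
    for x in pool:
        worst = max(len(consistent(space, x, y)) for y in (False, True))
        if worst < best_worst:
            best_x, best_worst = x, worst
    return best_x

x = minimax_query(version_space, pool)
print(x)
```

The minimax choice lands near the median of the surviving thresholds, since that is where either label outcome eliminates close to half the hypotheses.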
3.
Chen X, Zhou G, Wang Y, Hou M, Zhao Q, Xie S. Accommodating Multiple Tasks' Disparities With Distributed Knowledge-Sharing Mechanism. IEEE TRANSACTIONS ON CYBERNETICS 2022; 52:2440-2452. [PMID: 32649285] [DOI: 10.1109/tcyb.2020.3002911]
Abstract
Deep multitask learning (MTL) shares beneficial knowledge across participating tasks, alleviating the impact of adverse learning conditions, such as data scarcity, on their performance. In practice, tasks stemming from different domain sources often have varied complexities and input sizes, for example, in the joint learning of computer vision tasks with RGB and grayscale images. To adapt to these differences, it is appropriate to design networks with proper representational capacities and to construct neural layers with corresponding widths. Nevertheless, most state-of-the-art methods pay little attention to such situations and in fact fail to handle the disparities. To cope with dissimilar network designs across tasks, this article presents a distributed knowledge-sharing framework called tensor ring multitask learning (TRMTL), in which knowledge sharing is decoupled from the original weight matrices. TRMTL is flexible: it is not only capable of sharing knowledge across heterogeneous networks but also able to jointly learn tasks with varied input sizes, significantly improving the performance of data-insufficient tasks. Comprehensive experiments on challenging datasets empirically validate the effectiveness, efficiency, and flexibility of TRMTL in dealing with the disparities in MTL.
4.
Cao X, Tsang IW. Shattering Distribution for Active Learning. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2022; 33:215-228. [PMID: 33085620] [DOI: 10.1109/tnnls.2020.3027605]
Abstract
Active learning (AL) aims to maximize the learning performance of the current hypothesis by drawing as few labels as possible from an input distribution. Most existing AL algorithms prune the hypothesis set by querying labels of unlabeled samples and can thus be viewed as hypothesis-pruning strategies. However, this process depends critically on the initial hypothesis and its subsequent updates. This article presents a distribution-shattering strategy that requires no estimation of hypotheses: it shatters the number density of the input distribution. For any hypothesis class, we halve the number density of an input distribution to obtain a shattered distribution, which characterizes any hypothesis with a lower bound on the VC dimension. Our analysis shows that sampling in a shattered distribution reduces label complexity and error disagreement. With this guarantee, a Shattered Distribution-based AL (SDAL) algorithm is derived to continuously split the shattered distribution into a number of representative samples. An empirical evaluation on benchmark datasets further verifies the effectiveness of the halving and querying abilities of SDAL in real-world AL tasks with limited labels. Experiments on active querying with adversarial examples and noisy labels further verify our theoretical insights on the performance disagreement between the hypothesis-pruning and distribution-shattering strategies. Our code is available at https://github.com/XiaofengCao-MachineLearning/Shattering-Distribution-for-Active-Learning.
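The idea of repeatedly splitting a pool into representative samples can be illustrated with a simple recursive halving sketch: split the data along its widest axis and keep each leaf's medoid. This shows only the flavor of distribution-level (rather than hypothesis-level) sampling; the split rule and medoid choice are assumptions, not the SDAL algorithm.

```python
# Hedged sketch of halving a pool into representative samples.
import numpy as np

def representatives(X, depth):
    """Recursively halve X along its widest axis; return one medoid per leaf."""
    if depth == 0 or len(X) <= 1:
        center = X.mean(axis=0)
        return [X[np.argmin(np.linalg.norm(X - center, axis=1))]]
    axis = int(np.argmax(X.max(axis=0) - X.min(axis=0)))
    median = np.median(X[:, axis])
    left, right = X[X[:, axis] <= median], X[X[:, axis] > median]
    if len(left) == 0 or len(right) == 0:   # degenerate split: stop early
        return representatives(X, 0)
    return representatives(left, depth - 1) + representatives(right, depth - 1)

rng = np.random.default_rng(1)
X = rng.normal(size=(128, 2))
reps = representatives(X, 3)                # 2**3 = 8 representatives
print(len(reps))
```

Each returned representative is an actual pool point, so a label budget of 2^depth queries covers the pool's geometry without ever consulting a hypothesis.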
5.
Zhang C, Cheng J, Tian Q. Multiview Semantic Representation for Visual Recognition. IEEE TRANSACTIONS ON CYBERNETICS 2020; 50:2038-2049. [PMID: 30418893] [DOI: 10.1109/tcyb.2018.2875728]
Abstract
Due to interclass and intraclass variations, images of different classes are often cluttered together, which makes efficient classification difficult. Discriminative classification algorithms help to alleviate this problem. However, accurately modeling the relationships between visual representations and human perception remains an open problem. To alleviate these problems, in this paper we propose a novel multiview semantic representation (MVSR) algorithm for efficient visual recognition. First, we leverage visual methods to obtain initial image representations. We then use both visual and semantic similarities to divide images into groups, which are then used for semantic representations. We treat different image representation strategies, partition methods, and group numbers as different views. A graph is then used to combine the discriminative power of the different views, and the similarities between images can be obtained by measuring the similarities of their graphs. Finally, we train classifiers to predict the categories of images. We evaluate the discriminative power of the proposed MVSR method for visual recognition on several public image datasets. Experimental results show the effectiveness of the proposed method.
6.
Wang J, Wang Q, Zhang H, Chen J, Wang S, Shen D. Sparse Multiview Task-Centralized Ensemble Learning for ASD Diagnosis Based on Age- and Sex-Related Functional Connectivity Patterns. IEEE TRANSACTIONS ON CYBERNETICS 2019; 49:3141-3154. [PMID: 29994137] [PMCID: PMC6411442] [DOI: 10.1109/tcyb.2018.2839693]
Abstract
Autism spectrum disorder (ASD) is an age- and sex-related neurodevelopmental disorder that alters the brain's functional connectivity (FC). The changes caused by ASD are associated with different age- and sex-related patterns in neuroimaging data. However, most contemporary computer-assisted ASD diagnosis methods ignore the aforementioned age-/sex-related patterns. In this paper, we propose a novel sparse multiview task-centralized (Sparse-MVTC) ensemble classification method for image-based ASD diagnosis. Specifically, with the age and sex information of each subject, we formulate the classification as a multitask learning problem, where each task corresponds to learning upon a specific age/sex group. We also extract multiview features per subject to better reveal the FC changes. Then, in Sparse-MVTC learning, we select a certain central task and treat the rest as auxiliary tasks. By considering both task-task and view-view relationships between the central task and each auxiliary task, we can learn better upon the entire dataset. Finally, by selecting the central task, in turn, we are able to derive multiple classifiers for each task/group. An ensemble strategy is further adopted, such that the final diagnosis can be integrated for each subject. Our comprehensive experiments on the ABIDE database demonstrate that our proposed Sparse-MVTC ensemble learning can significantly outperform the state-of-the-art classification methods for ASD diagnosis.
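The group-then-ensemble idea in the abstract — one model per demographic group, with all models voting on the final call — can be sketched generically. The data, the nearest-centroid "classifier," and the majority vote are all placeholders chosen for brevity; they are not the Sparse-MVTC learner or its task-centralized coupling.

```python
# Illustrative group-wise ensemble (NOT the paper's Sparse-MVTC method).
import numpy as np

rng = np.random.default_rng(2)

def fit_centroids(X, y):
    """Nearest-centroid 'classifier' for one group: store both class means."""
    return X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)

def predict(model, X):
    c0, c1 = model
    return (np.linalg.norm(X - c1, axis=1) < np.linalg.norm(X - c0, axis=1)).astype(int)

# four demographic groups, each with its own (shifted) feature distribution
groups = {}
for g in range(4):
    shift = rng.normal(scale=0.5, size=5)
    X = rng.normal(size=(60, 5)) + shift
    y = (X @ np.ones(5) > shift.sum()).astype(int)
    groups[g] = (X, y)

models = {g: fit_centroids(X, y) for g, (X, y) in groups.items()}

# ensemble: every group-specific model votes on a new subject
x_new = rng.normal(size=(1, 5))
votes = np.array([predict(m, x_new)[0] for m in models.values()])
diagnosis = int(votes.sum() > len(votes) / 2)
print(votes, diagnosis)
```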
Collapse
Affiliation(s)
- Jun Wang: Department of Radiology and BRIC, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599 USA; also with the School of Digital Media, Jiangnan University, Wuxi 214122, China, and the Jiangsu Key Laboratory of Media Design and Software Technology, Jiangnan University, Wuxi 214122, China
- Qian Wang: Institute for Medical Imaging Technology, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai 200030, China
- Han Zhang: Department of Radiology and BRIC, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599 USA
- Jiawei Chen: Department of Radiology and BRIC, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599 USA
- Shitong Wang: School of Digital Media, Jiangnan University, Wuxi 214122, China; also with the Jiangsu Key Laboratory of Media Design and Software Technology, Jiangnan University, Wuxi 214122, China
- Dinggang Shen: Department of Radiology and BRIC, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599 USA; also with the Department of Brain and Cognitive Engineering, Korea University, Seoul 02841, South Korea
7.
Fang M, Zhou T, Yin J, Wang Y, Tao D. Data Subset Selection With Imperfect Multiple Labels. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2019; 30:2212-2221. [PMID: 30507515] [DOI: 10.1109/tnnls.2018.2875470]
Abstract
We study the problem of selecting a subset of weakly labeled data, where the labels of each data instance are redundant and imperfect. In real applications, less-than-expert labels are obtained at low cost so that many labels can be acquired for each instance and then used to estimate the ground truth. However, on the one hand, preparing and processing data can sometimes be even more expensive than labeling; on the other hand, noisy labels decrease the performance of supervised learning methods. Thus, we introduce a new quality-control mechanism on the labels of each instance and use it to select an optimal subset of the data. The mechanism estimates the labeling quality of each instance and thereby indicates which instances already have enough reliable labels and how many labels still need to be collected for the others. In this paper, we first consider the data subset selection problem under the probably approximately correct model. We then show how to find an ϵ-optimal labeled instance based on expected labeling quality. Furthermore, we propose new algorithms to select the k instances with the highest expected labeling quality. Using a reliable subset of the data provides substantial benefit over using all data with imperfect multiple labels, and the expected labeling quality is a good indicator of where to allocate labeling effort: it shows how many labels should be acquired for an instance and which instances qualify for selection compared with others. Both theoretical guarantees and comprehensive experiments demonstrate the effectiveness and efficiency of our algorithms.
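A minimal version of "score each instance's labeling quality, then keep the top k" can be sketched with a deliberately crude quality model: assume each redundant label is independently correct with probability p, so the chance that the majority vote is right grows with the number of labels. The i.i.d. noise model, the value p = 0.7, and the example label sets are all assumptions for illustration, not the paper's estimator.

```python
# Hedged sketch: rank instances by a simple expected-labeling-quality score.
from math import comb

def majority_quality(labels, p=0.7):
    """P(majority vote is correct) if each of the n labels is independently
    right with probability p (ties among an even count are treated as wrong).
    Note this crude model uses only the label count, not the observed votes."""
    n = len(labels)
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n // 2 + 1, n + 1))

def select_top_k(label_sets, k):
    """Indices of the k instances with the highest expected labeling quality."""
    scored = sorted(range(len(label_sets)),
                    key=lambda i: majority_quality(label_sets[i]),
                    reverse=True)
    return scored[:k]

label_sets = [
    [1, 1, 1, 0, 1],         # 5 labels
    [1, 0],                  # 2 labels: tie-prone, lowest quality
    [0, 0, 0],               # 3 labels
    [1, 1, 1, 1, 1, 0, 1],   # 7 labels: most redundancy, highest quality
]
best = select_top_k(label_sets, 2)
print(best)
```

Under this model the score also answers the budgeting question: an instance whose quality is below a target threshold needs more labels collected before it qualifies for the subset.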