1. Wen Z, Wu H, Ying S. Histopathology Image Classification With Noisy Labels via The Ranking Margins. IEEE TRANSACTIONS ON MEDICAL IMAGING 2024; 43:2790-2802. [PMID: 38526889 DOI: 10.1109/tmi.2024.3381775]
Abstract
Clinically, histopathology images offer a gold standard for disease diagnosis. With the development of artificial intelligence, digital histopathology has significantly improved diagnostic efficiency. Nevertheless, noisy labels are inevitable in histopathology images and degrade algorithm performance. Curriculum learning is a typical approach to this problem. However, existing curriculum learning methods either fail to determine the training priority between difficult samples and noisy ones or require an extra clean dataset to establish a valid curriculum scheme. Therefore, a new curriculum learning paradigm is designed based on a proposed ranking function, named The Ranking Margins (TRM). The ranking function measures the 'distances' between samples and decision boundaries, which helps distinguish difficult samples from noisy ones. The proposed method comprises three stages: the warm-up stage, the main training stage, and the fine-tuning stage. In the warm-up stage, the margin of each sample is obtained through the ranking function. In the main training stage, samples are progressively fed into the networks for training, starting from those with larger margins and moving to those with smaller ones; label correction is also performed in this stage. In the fine-tuning stage, the networks are retrained on the samples with corrected labels. In addition, we provide theoretical analysis to guarantee the feasibility of TRM. Experiments on two representative histopathology image datasets show that the proposed method achieves substantial improvements over the latest Label Noise Learning (LNL) methods.
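As an illustration only (the paper's exact ranking function is not reproduced here), a margin of this kind can be sketched as the softmax score of the annotated class minus the best competing score; the curriculum then orders samples from large margins to small:

```python
import math

def ranking_margin(logits, label):
    """Margin between the labeled class and the best competing class.

    Large positive margins suggest easy, likely-clean samples; large
    negative margins suggest hard or possibly mislabeled ones.
    """
    # Softmax probabilities (numerically stabilized).
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    probs = [e / s for e in exps]
    competing = max(p for j, p in enumerate(probs) if j != label)
    return probs[label] - competing

def curriculum_order(all_logits, labels):
    # Largest margins first (easy/clean), smallest last (hard/noisy).
    margins = [ranking_margin(z, y) for z, y in zip(all_logits, labels)]
    return sorted(range(len(labels)), key=lambda i: -margins[i])
```

The cut-off between "feed now" and "feed later" and the label-correction step are where the paper's actual contribution lies; this sketch only shows the ordering.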
2. Wang J, Tang Y, Xiao Y, Zhou JT, Fang Z, Yang F. GREnet: Gradually REcurrent Network With Curriculum Learning for 2-D Medical Image Segmentation. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:10018-10032. [PMID: 37022080 DOI: 10.1109/tnnls.2023.3238381]
Abstract
Medical image segmentation is a vital stage in medical image analysis. Numerous deep-learning methods have been developed to improve the performance of 2-D medical image segmentation, owing to the fast growth of the convolutional neural network. Generally, the manually defined ground truth is used directly to supervise models in the training phase. However, direct supervision by the ground truth often leads to ambiguity and distractors when complex challenges appear simultaneously. To alleviate this issue, we propose a gradually recurrent network with curriculum learning, supervised by gradual information from the ground truth. The whole model is composed of two independent networks. One is the segmentation network, denoted GREnet, which formulates 2-D medical image segmentation as a temporal task supervised by pixel-level gradual curricula in the training phase. The other is a curriculum-mining network, which provides curricula of increasing difficulty by progressively uncovering hard-to-segment pixels in the ground truth of the training set in a data-driven manner. Given that segmentation is a pixel-level dense-prediction challenge, to the best of our knowledge, this is the first work to formulate 2-D medical image segmentation as a temporal task with pixel-level curriculum learning. In GREnet, the naive UNet is adopted as the backbone, while ConvLSTM establishes the temporal link between gradual curricula. In the curriculum-mining network, UNet++ supplemented by a transformer delivers curricula through the outputs of the modified UNet++ at different layers. Experimental results demonstrate the effectiveness of GREnet on seven datasets: three lesion segmentation datasets in dermoscopic images, an optic disc and cup segmentation dataset and a blood vessel segmentation dataset in retinal images, a breast lesion segmentation dataset in ultrasound images, and a lung segmentation dataset in computed tomography (CT).
3. Ruan J, Zheng Q, Zhao R, Dong B. Biased Complementary-Label Learning Without True Labels. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:2616-2627. [PMID: 35862328 DOI: 10.1109/tnnls.2022.3190528]
Abstract
In complementary-label learning (CLL), the complementary transition matrix, denoting the probabilities that true labels flip into complementary labels (CLs), which specify classes that observations do not belong to, is crucial to building statistically consistent classifiers. Most existing works implicitly assume that the transition probabilities are identical, which is not true in practice and may lead to undesirable bias in solutions. A few recent works have extended the problem to a biased setting but limit their explorations to modeling the transition matrix by exploiting the complementary class posteriors of anchor points (i.e., instances that almost certainly belong to a specific class). However, due to the severe corruption and unevenness of biased CLs, both anchor points and complementary class posteriors are difficult to predict accurately in the absence of true labels. In this article, rather than directly predicting these two error-prone items, we instead propose a divided-T estimator as an alternative that effectively learns transition matrices from only biased CLs. Specifically, we exploit semantic clustering to mitigate the adverse effects arising from CLs. By introducing the learned semantic clusters as an intermediate class, we factorize the original transition matrix into the product of two easy-to-estimate matrices that do not rely on the two error-prone items. Both theoretical analyses and empirical results justify the effectiveness of the divided-T estimator for estimating transition matrices under a mild assumption. Experimental results on benchmark datasets further demonstrate that the divided-T estimator outperforms state-of-the-art (SOTA) methods by a substantial margin.
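A minimal sketch of the factorization idea, with hypothetical matrices: if the class-to-cluster and cluster-to-complementary-label matrices are each row-stochastic, their product is again a valid transition matrix, so the hard-to-estimate original matrix can be recovered from the two easier factors:

```python
def matmul(A, B):
    # Plain row-by-column matrix product on nested lists.
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def compose_transition(class_to_cluster, cluster_to_cl):
    """Divided-T idea: the class -> complementary-label matrix is the
    product of two easier matrices routed through semantic clusters."""
    T = matmul(class_to_cluster, cluster_to_cl)
    # Each row should remain a probability distribution.
    assert all(abs(sum(row) - 1.0) < 1e-9 for row in T)
    return T
```

How the two factors are themselves estimated from biased CLs is the paper's contribution and is not shown here.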
4. Ma F, Wu Y, Yu X, Yang Y. Learning With Noisy Labels via Self-Reweighting From Class Centroids. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2022; 33:6275-6285. [PMID: 33961567 DOI: 10.1109/tnnls.2021.3073248]
Abstract
Although deep neural networks have proven effective in many applications, they are data hungry, and training deep models often requires laboriously labeled data. When labeled data contain erroneous labels, however, they often lead to model performance degradation. A common solution is to assign each sample a dynamic weight during optimization, adjusted in accordance with the loss. However, those weights are usually unreliable since they are measured from the losses of corrupted labels, so this scheme might impede the discriminative ability of neural networks trained on noisy data. To address this issue, we propose a novel reweighting method, dubbed self-reweighting from class centroids (SRCC), which assigns sample weights based on the similarities between samples and our online-learned class centroids. Since we exploit statistical class centers in the image feature space to reweight data samples during learning, our method is robust to noise caused by corrupted labels. In addition, even after reweighting the noisy data, the decision boundaries might still suffer distortions. Thus, we leverage mixed inputs, generated by linearly interpolating two random images and their labels, to further regularize the boundaries, and we employ the learned class centroids to evaluate the confidence of the generated mixed data by measuring feature similarities. During network optimization, the class centroids are updated as more discriminative feature representations of the original images are learned. In doing so, SRCC generates more robust weighting coefficients for noisy and mixed data and facilitates our feature representation learning in return. Extensive experiments on both synthetic and real image recognition tasks demonstrate that SRCC outperforms the state of the art on learning with noisy data.
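The centroid-based reweighting step can be sketched as follows; the cosine similarity measure, the clipping to [0, 1], and the momentum value are illustrative assumptions, not necessarily the paper's exact choices:

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def sample_weight(feature, label, centroids):
    # Similarity to the centroid of the annotated class, clipped to [0, 1]:
    # samples far from their class centroid are likely mislabeled and get
    # small weights in the loss.
    return max(0.0, cosine(feature, centroids[label]))

def update_centroid(centroid, feature, momentum=0.9):
    # Online (exponential moving average) centroid update as features improve.
    return [momentum * c + (1 - momentum) * f for c, f in zip(centroid, feature)]
```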
5. Algan G, Ulusoy I. MetaLabelNet: Learning to Generate Soft-Labels From Noisy-Labels. IEEE TRANSACTIONS ON IMAGE PROCESSING 2022; 31:4352-4362. [PMID: 35731778 DOI: 10.1109/tip.2022.3183841]
Abstract
Real-world datasets commonly have noisy labels, which negatively affect the performance of deep neural networks (DNNs). To address this problem, we propose a label-noise-robust learning algorithm in which the base classifier is trained on soft-labels produced according to a meta-objective. In each iteration, before conventional training, the meta-training loop updates the soft-labels so that the resulting gradient updates on the base classifier yield minimum loss on the meta-data. Soft-labels are generated from extracted features of data instances, and the mapping function is learned by a single-layer perceptron (SLP) network called MetaLabelNet. The base classifier is then trained on these generated soft-labels, and these iterations are repeated for each batch of training data. Our algorithm uses a small amount of clean data as meta-data, which can be obtained effortlessly in many cases. We perform extensive experiments on benchmark datasets with both synthetic and real-world noise. Results show that our approach outperforms existing baselines. The source code of the proposed model is available at https://github.com/gorkemalgan/MetaLabelNet.
6. Xu Q, Yang Z, Jiang Y, Cao X, Yao Y, Huang Q. Not All Samples are Trustworthy: Towards Deep Robust SVP Prediction. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2022; 44:3154-3169. [PMID: 33373295 DOI: 10.1109/tpami.2020.3047817]
Abstract
In this paper, we study the problem of estimating subjective visual properties (SVP) of images, an emerging task in computer vision. Generally speaking, collecting SVP datasets involves a crowdsourcing process in which annotations are obtained from a wide range of online users. Since this process lacks quality control, SVP datasets are known to suffer from noise, so not all samples are trustworthy. Facing this problem, we need to develop robust models for learning SVP from noisy crowdsourced annotations. In this paper, we construct two general robust learning frameworks for this application. In the first, we propose a probabilistic framework that explicitly models the sparse unreliable patterns in the dataset. We then provide an alternative framework that reformulates the sparse unreliable patterns as a "contraction" operation over the original loss function; the latter admits not only efficient end-to-end training but also rigorous theoretical analysis. To apply these frameworks, we further provide two models as implementations, in which the sparse noise parameters can be interpreted with HodgeRank theory. Finally, extensive theoretical and empirical studies show the effectiveness of the proposed frameworks.
7. Zhang CB, Jiang PT, Hou Q, Wei Y, Han Q, Li Z, Cheng MM. Delving Deep Into Label Smoothing. IEEE TRANSACTIONS ON IMAGE PROCESSING 2021; 30:5984-5996. [PMID: 34166191 DOI: 10.1109/tip.2021.3089942]
Abstract
Label smoothing is an effective regularization tool for deep neural networks (DNNs), which generates soft labels by applying a weighted average between the uniform distribution and the hard label. It is often used to reduce the overfitting problem of training DNNs and further improve classification performance. In this paper, we aim to investigate how to generate more reliable soft labels. We present an Online Label Smoothing (OLS) strategy, which generates soft labels based on the statistics of the model prediction for the target category. The proposed OLS constructs a more reasonable probability distribution between the target categories and non-target categories to supervise DNNs. Experiments demonstrate that based on the same classification models, the proposed approach can effectively improve the classification performance on CIFAR-100, ImageNet, and fine-grained datasets. Additionally, the proposed method can significantly improve the robustness of DNN models to noisy labels compared to current label smoothing approaches. The source code is available at our project page: https://mmcheng.net/ols/.
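The mechanism described above (per-class accumulation of predictions from correctly classified samples, normalized into the next epoch's soft labels) can be sketched as below; the class name, one-hot initialization, and bookkeeping details are our own assumptions:

```python
class OnlineLabelSmoother:
    """Accumulates model predictions per target class over an epoch and
    uses the normalized average as next epoch's soft label for that class."""

    def __init__(self, num_classes):
        self.num_classes = num_classes
        # Start from one-hot labels (equivalent to no smoothing).
        self.soft_labels = [[1.0 if i == j else 0.0 for j in range(num_classes)]
                            for i in range(num_classes)]
        self._sums = [[0.0] * num_classes for _ in range(num_classes)]
        self._counts = [0] * num_classes

    def observe(self, probs, target):
        # Following OLS, only correctly classified samples contribute.
        if max(range(self.num_classes), key=lambda j: probs[j]) == target:
            for j in range(self.num_classes):
                self._sums[target][j] += probs[j]
            self._counts[target] += 1

    def end_epoch(self):
        # Normalize the accumulated statistics into next epoch's soft labels.
        for c in range(self.num_classes):
            if self._counts[c]:
                s = sum(self._sums[c])
                self.soft_labels[c] = [v / s for v in self._sums[c]]
            self._sums[c] = [0.0] * self.num_classes
            self._counts[c] = 0
```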
8. Matiisen T, Oliver A, Cohen T, Schulman J. Teacher-Student Curriculum Learning. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2020; 31:3732-3740. [PMID: 31502993 DOI: 10.1109/tnnls.2019.2934906]
Abstract
We propose Teacher-Student Curriculum Learning (TSCL), a framework for automatic curriculum learning in which the Student tries to learn a complex task and the Teacher automatically chooses subtasks from a given set for the Student to train on. We describe a family of Teacher algorithms that rely on the intuition that the Student should practice most on those tasks on which it makes the fastest progress, i.e., where the slope of the learning curve is highest. The Teacher algorithms also address the problem of forgetting by choosing tasks on which the Student's performance is getting worse. We demonstrate that TSCL matches or surpasses the results of carefully hand-crafted curricula in two tasks: addition of decimal numbers with long short-term memory (LSTM) and navigation in Minecraft. Our automatically ordered curriculum of submazes enabled the agent to solve a Minecraft maze that could not be solved at all when training directly on that maze, and learning was an order of magnitude faster than with uniform sampling of those submazes.
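A slope-based Teacher can be sketched as below. The windowed finite-difference slope, the handling of unseen tasks, and the absence of explicit exploration are illustrative simplifications, not the paper's exact algorithms:

```python
class Teacher:
    """Picks the subtask whose recent learning curve has the steepest slope
    (in absolute value, so tasks where performance is worsening, i.e.
    forgetting, are also revisited)."""

    def __init__(self, num_tasks, window=5):
        self.scores = [[] for _ in range(num_tasks)]
        self.window = window

    def report(self, task, score):
        # Student reports its score on `task` after a training episode.
        self.scores[task].append(score)

    def _slope(self, history):
        recent = history[-self.window:]
        if len(recent) < 2:
            return float('inf')  # force never-tried tasks to be sampled first
        return (recent[-1] - recent[0]) / (len(recent) - 1)

    def choose(self):
        slopes = [abs(self._slope(h)) for h in self.scores]
        return max(range(len(slopes)), key=lambda t: slopes[t])
```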
9. Wei Y, Gong C, Chen S, Liu T, Yang J, Tao D. Harnessing Side Information for Classification Under Label Noise. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2020; 31:3178-3192. [PMID: 31562108 DOI: 10.1109/tnnls.2019.2938782]
Abstract
Practical data sets often contain label noise caused by various human factors or measurement errors, which means that a fraction of training examples might be mistakenly labeled. Such noisy labels mislead classifier training and severely degrade classification performance. Existing approaches to this problem are usually developed through various surrogate loss functions under the framework of empirical risk minimization; however, they are only suitable for binary classification and also require strong prior knowledge. Therefore, this article treats the example features as side information and formulates the noisy-label removal problem as a matrix recovery problem. We denote our proposed method "label noise handling via side information" (LNSI). Specifically, the observed label matrix is decomposed as the sum of two parts: the first reveals the true labels and can be obtained by conducting a low-rank mapping on the side information, while the second captures the incorrect labels and is modeled by a row-sparse matrix. The merits of this formulation lie in three aspects: 1) the strong recovery ability of this strategy has been sufficiently demonstrated by intensive theoretical work on side information; 2) multi-class situations can be directly handled with the aid of the learned projection matrix; and 3) only very weak assumptions are required for model design, making LNSI applicable to a wide range of practical problems. Moreover, we theoretically derive the generalization bound of LNSI and show that its expected classification error is upper bounded. Experimental results on a variety of data sets, including UCI benchmark data sets and practical data sets, confirm the superiority of LNSI over state-of-the-art approaches to label noise handling.
10. Saab SS, Shen D. Multidimensional Gains for Stochastic Approximation. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2020; 31:1602-1615. [PMID: 31265420 DOI: 10.1109/tnnls.2019.2920930]
Abstract
This paper deals with an iterative Jacobian-based recursion technique for the root-finding problem of a vector-valued function whose evaluations are contaminated by noise. Instead of a scalar step size, we use an iterate-dependent matrix gain to effectively weigh the different elements associated with the noisy observations. The analytical development of the matrix gain is built on an iterate-dependent linear function interfered with by additive zero-mean white noise, where the dimension of the function is M ≥ 1 and the dimension of the unknown variable is N ≥ 1. Necessary and sufficient conditions for M ≥ N algorithms are presented pertaining to algorithm stability and convergence of the estimate error covariance matrix. Two algorithms are proposed: one for the case M ≥ N and one for the antithesis (M < N). Both algorithms assume full knowledge of the Jacobian and recursively generate the optimal iterate-dependent matrix gain, aiming for per-iteration minimization of the mean square estimate error. We show that the proposed algorithm satisfies the presented conditions for stability and convergence of the covariance, and that the convergence rate of the estimation error covariance is inversely proportional to the number of iterations. For the antithesis (M < N), contraction of the error covariance is guaranteed; such underdetermined systems of equations arise, for example, in training neural networks. Numerical examples illustrate the performance capabilities of the proposed multidimensional gain on nonlinear functions.
11. Gong C, Liu T, Yang J, Tao D. Large-Margin Label-Calibrated Support Vector Machines for Positive and Unlabeled Learning. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2019; 30:3471-3483. [PMID: 30736009 DOI: 10.1109/tnnls.2019.2892403]
Abstract
Positive and unlabeled learning (PU learning) aims to train a binary classifier on only positive and unlabeled data. Existing methods usually cast PU learning as a label noise learning problem or a cost-sensitive learning problem. However, none of them fully takes the data distribution information into consideration when designing the model, which hinders them from acquiring more encouraging performance. In this paper, we argue that the clusters formed by positive examples and potential negative examples in the feature space should be critically utilized to establish a PU learning model, especially when the negative data are not explicitly available. To this end, we introduce a hat loss to discover the margin between data clusters and a label calibration regularizer to amend the biased decision boundary toward the potentially correct one, and we propose a novel discriminative PU classifier termed "Large-margin Label-calibrated Support Vector Machines" (LLSVM). Our LLSVM classifier works properly in the absence of negative training examples and effectively achieves the max-margin effect between positive and negative classes. Theoretically, we derive the generalization error bound of LLSVM, which reveals that the introduction of PU data does help to enhance algorithm performance. Empirically, we compared LLSVM with state-of-the-art PU methods on various synthetic and practical data sets, and the results confirm that LLSVM is more effective than the compared methods at PU learning tasks.
12. Zhang J, Sheng VS, Wu J. Crowdsourced Label Aggregation Using Bilayer Collaborative Clustering. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2019; 30:3172-3185. [PMID: 30703041 DOI: 10.1109/tnnls.2018.2890148]
Abstract
With online crowdsourcing platforms, labels can be acquired at relatively low cost from massive numbers of nonexpert workers. To improve the quality of labels obtained from these imperfect crowdsourced workers, we usually let different workers provide labels for the same instance and then estimate the true labels from these multiple noisy labels. This traditional general-purpose label aggregation process, relying solely on the collected noisy labels, cannot significantly improve the accuracy of integrated labels when labeling quality is low. This paper proposes a novel bilayer collaborative clustering (BLCC) method for label aggregation in crowdsourcing. BLCC first generates conceptual-level features for the instances from their multiple noisy labels and infers the initially integrated labels by clustering the conceptual-level features. It then performs another clustering on the physical-level features to form estimates of the true labels on the physical layer. The clustering results on both layers facilitate tracking changes in the uncertainties of the instances. Finally, the initially integrated labels that are likely to be wrongly inferred on the conceptual layer can be remedied using the estimated labels on the physical layer; the clustering processes on the two layers keep providing guidance information for each other over multiple label-remedy rounds. Experimental results on 12 real-world crowdsourcing data sets show that the proposed method outperforms state-of-the-art methods in terms of accuracy.
13. Zhou JT, Fang M, Zhang H, Gong C, Peng X, Cao Z, Goh RSM. Learning With Annotation of Various Degrees. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2019; 30:2794-2804. [PMID: 30640630 DOI: 10.1109/tnnls.2018.2885854]
Abstract
In this paper, we study a new problem in the scenario of sequence labeling. Specifically, we consider training data with annotations of various degrees: fully labeled, unlabeled, and partially labeled sequences. Learning with fully labeled or unlabeled sequences corresponds to the standard supervised or unsupervised setting, while the proposed partial labeling specifies a class that an element does not belong to. Partially labeled data are cheaper to obtain than fully labeled data, though less informative, especially when the task requires substantial domain knowledge. To address this practical challenge, we propose a novel deep conditional random field (CRF) model that uses end-to-end learning to smoothly handle fully labeled, unlabeled, and partially labeled sequences within a unified framework. To the best of our knowledge, this is one of the first works to utilize partially labeled instances for sequence labeling, and the proposed algorithm unifies deep learning and CRFs in an end-to-end framework. Extensive experiments show that our method achieves state-of-the-art performance in two sequence labeling tasks on several popular data sets.
14. Tao D, Cheng J, Yu Z, Yue K, Wang L. Domain-Weighted Majority Voting for Crowdsourcing. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2019; 30:163-174. [PMID: 29994339 DOI: 10.1109/tnnls.2018.2836969]
Abstract
Crowdsourcing labeling systems provide an efficient way to collect multiple inaccurate labels for given observations. If the competence level, or "reputation" (the probability of annotating the right label), is equal for all annotators and biased toward the right label, majority voting (MV) is the optimal decision rule for merging the multiple labels into a single reliable one. In practice, however, the competence levels of annotators employed by crowdsourcing labeling systems vary widely; in these cases, weighted MV is preferred, with weights determined by the competence levels. However, since the annotators are anonymous and the ground-truth labels are usually unknown, it is hard to compute the competence levels directly. In this paper, we propose to learn the weights for weighted MV by exploiting the expertise of annotators. Specifically, we model the domain knowledge of different annotators with different distributions and treat the crowdsourcing problem as a domain adaptation problem: the annotators provide labels to the source domains, and the target domain is assumed to be associated with the ground-truth labels. The weights are obtained by matching the source domains with the target domain. Although the target-domain labels are unknown, we prove that they can be estimated under mild conditions. Both theoretical and empirical analyses verify the effectiveness of the proposed method, with large performance gains on specific data sets.
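Given annotator weights (however estimated), the aggregation step itself is simple. This sketch assumes the weights are already available; the paper's actual contribution is learning them via domain matching, which is not shown:

```python
from collections import defaultdict

def weighted_majority_vote(labels, weights):
    """Aggregate one instance's crowd labels.

    `labels` maps worker id -> proposed label; `weights` maps worker id ->
    estimated competence. Unknown workers default to weight 1.0 (plain MV).
    """
    tally = defaultdict(float)
    for worker, label in labels.items():
        tally[label] += weights.get(worker, 1.0)
    return max(tally, key=tally.get)
```

With all weights equal this reduces to ordinary majority voting, which is why the weights only matter when annotator competence is diverse.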
15. Yu X, Liu T, Gong M, Batmanghelich K, Tao D. An Efficient and Provable Approach for Mixture Proportion Estimation Using Linear Independence Assumption. IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW) 2018; 2018:4480-4489. [PMID: 32089968 DOI: 10.1109/cvpr.2018.00471]
Abstract
In this paper, we study the mixture proportion estimation (MPE) problem in a new setting: given samples from the mixture and the component distributions, we identify the proportions of the components in the mixture distribution. To address this problem, we make use of a linear independence assumption, i.e., that the component distributions are linearly independent of each other, which is much weaker than the assumptions exploited by previous MPE methods. Based on this assumption, we propose a method that (1) uniquely identifies the mixture proportions, (2) provably converges to the optimal solution, and (3) is computationally efficient. We show the superiority of the proposed method over state-of-the-art methods in two applications, learning with label noise and semi-supervised learning, on both synthetic and real-world datasets.
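To see why linear independence makes the proportions identifiable: when the component distributions are known as histograms and linearly independent, the proportions are the unique least-squares solution. This sketch (normal equations solved by Gaussian elimination) is a simplification of the paper's sample-based method:

```python
def estimate_proportions(mixture, components):
    """Least-squares proportions of component histograms in a mixture
    histogram. Linear independence of the components makes the normal
    equations (C C^T) pi = C m nonsingular, so pi is unique."""
    k = len(components)
    nbins = len(mixture)
    A = [[sum(components[i][b] * components[j][b] for b in range(nbins))
          for j in range(k)] for i in range(k)]
    rhs = [sum(components[i][b] * mixture[b] for b in range(nbins))
           for i in range(k)]
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        rhs[col], rhs[piv] = rhs[piv], rhs[col]
        for r in range(col + 1, k):
            f = A[r][col] / A[col][col]
            for c in range(col, k):
                A[r][c] -= f * A[col][c]
            rhs[r] -= f * rhs[col]
    # Back substitution.
    pi = [0.0] * k
    for r in range(k - 1, -1, -1):
        pi[r] = (rhs[r] - sum(A[r][c] * pi[c] for c in range(r + 1, k))) / A[r][r]
    return pi
```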
Affiliation(s)
- Xiyu Yu
- UBTECH Sydney AI Centre, SIT, FEIT, The University of Sydney, Australia
- Tongliang Liu
- UBTECH Sydney AI Centre, SIT, FEIT, The University of Sydney, Australia
- Mingming Gong
- Department of Biomedical Informatics, University of Pittsburgh; Department of Philosophy, Carnegie Mellon University
- Dacheng Tao
- UBTECH Sydney AI Centre, SIT, FEIT, The University of Sydney, Australia