1. Yu J, Ma T, Fu Y, Chen H, Lai M, Zhuo C, Xu Y. Local-to-global spatial learning for whole-slide image representation and classification. Comput Med Imaging Graph 2023; 107:102230. [PMID: 37116341] [DOI: 10.1016/j.compmedimag.2023.102230]
Abstract
Whole-slide images (WSIs) provide an important reference for clinical diagnosis. Classification with only WSI-level labels can be formulated as a multi-instance learning (MIL) task. However, most existing MIL-based WSI classification methods mine correlations between instances only weakly, limited by their instance-level classification strategy. Herein, we propose a novel local-to-global spatial learning method, Global-Local Attentional Multi-Instance Learning (GLAMIL), which redefines the MIL-based WSI classification strategy to mine global positional and local morphological information and better learn WSI-level representations. GLAMIL focuses on regional relationships rather than single instances. It first learns relationships between patches in a local pool to aggregate region-level correlations (the tissue types of a WSI). These correlations are then further mined to build the WSI-level representation, in which positional correlations between different regions are modeled. Furthermore, Transformer layers are employed to model global and local spatial information rather than being used simply as feature extractors, and the corresponding structural improvements are presented. We evaluate GLAMIL on three benchmarks covering various challenging factors and achieve satisfactory results: GLAMIL outperforms state-of-the-art methods and baselines by about 1% and 10%, respectively.
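For readers unfamiliar with attention-based MIL pooling, the snippet below is a minimal sketch of the general idea of weighting patch embeddings before forming a slide-level representation. It is not the authors' GLAMIL code; the module names, feature dimensions, and class count are assumptions for illustration only.

```python
import torch
import torch.nn as nn

class AttentionMILPooling(nn.Module):
    """Minimal attention-based MIL pooling: score each patch embedding,
    weight the embeddings, and sum them into one slide-level representation."""
    def __init__(self, feat_dim=512, hidden_dim=128, num_classes=2):
        super().__init__()
        self.attention = nn.Sequential(
            nn.Linear(feat_dim, hidden_dim),
            nn.Tanh(),
            nn.Linear(hidden_dim, 1),
        )
        self.classifier = nn.Linear(feat_dim, num_classes)

    def forward(self, patch_feats):            # patch_feats: (num_patches, feat_dim)
        scores = self.attention(patch_feats)   # (num_patches, 1)
        weights = torch.softmax(scores, dim=0)
        slide_repr = (weights * patch_feats).sum(dim=0)  # (feat_dim,)
        return self.classifier(slide_repr), weights

# Example: 1000 patch embeddings from a frozen feature extractor.
feats = torch.randn(1000, 512)
logits, attn = AttentionMILPooling()(feats)
```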
Affiliation(s)
- Jiahui Yu
- Department of Biomedical Engineering, Key Laboratory of Biomedical Engineering of Ministry of Education, State Key Laboratory of Modern Optical Instrumentation, Zhejiang Provincial Key Laboratory of Cardio-Cerebral Vascular Detection Technology and Medicinal Effectiveness Appraisal, Zhejiang University, Hangzhou 310027, China; Innovation Center for Smart Medical Technologies & Devices, Binjiang Institute of Zhejiang University, Hangzhou 310053, China
- Tianyu Ma
- Innovation Center for Smart Medical Technologies & Devices, Binjiang Institute of Zhejiang University, Hangzhou 310053, China
- Yu Fu
- College of Information Science & Electronic Engineering, Zhejiang University, Hangzhou 310027, China
- Hang Chen
- Department of Biomedical Engineering, Key Laboratory of Biomedical Engineering of Ministry of Education, State Key Laboratory of Modern Optical Instrumentation, Zhejiang Provincial Key Laboratory of Cardio-Cerebral Vascular Detection Technology and Medicinal Effectiveness Appraisal, Zhejiang University, Hangzhou 310027, China
- Maode Lai
- Department of Pathology, School of Medicine, Zhejiang University, Hangzhou 310053, China
- Cheng Zhuo
- College of Information Science & Electronic Engineering, Zhejiang University, Hangzhou 310027, China
- Yingke Xu
- Department of Biomedical Engineering, Key Laboratory of Biomedical Engineering of Ministry of Education, State Key Laboratory of Modern Optical Instrumentation, Zhejiang Provincial Key Laboratory of Cardio-Cerebral Vascular Detection Technology and Medicinal Effectiveness Appraisal, Zhejiang University, Hangzhou 310027, China; Innovation Center for Smart Medical Technologies & Devices, Binjiang Institute of Zhejiang University, Hangzhou 310053, China; Department of Endocrinology, Children's Hospital of Zhejiang University School of Medicine, National Clinical Research Center for Children's Health, Hangzhou, Zhejiang 310051, China.
2. Xie L, Luo Y, Su SF, Wei H. Graph Regularized Structured Output SVM for Early Expression Detection With Online Extension. IEEE Trans Cybern 2023; 53:1419-1431. [PMID: 34495865] [DOI: 10.1109/tcyb.2021.3108143]
Abstract
In this study, a graph regularized algorithm for early expression detection (EED), called GraphEED, is proposed. EED aims to detect a specified expression in the early stage of a video. Existing EED detectors fail to explicitly exploit the local geometrical structure of the data distribution, which can significantly affect prediction performance. According to manifold learning, data in real-world applications are likely to reside on a low-dimensional submanifold embedded in the high-dimensional ambient space. The proposed graph Laplacian consists of two parts: 1) a k-nearest neighbor graph is first constructed to encode the geometrical information under the manifold assumption, and 2) complete expressions are treated as must-link constraints, since they all contain the full duration information; this too can be formulated as a graph regularization. GraphEED learns a detection function that respects these graph structures. Even with the graph Laplacian included, GraphEED has the same computational complexity as max-margin EED, a well-known learning-based EED detector, while its detection performance is largely improved. To make the model suitable for large-scale applications, GraphEED is further extended with online learning to online GraphEED (OGraphEED), in which a buffering technique keeps the optimization practical by reducing computation and storage costs. Extensive experiments on three video-based datasets demonstrate the superiority of the proposed methods in terms of both effectiveness and efficiency.
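As a rough illustration of the k-nearest-neighbor graph Laplacian regularizer the abstract describes, the sketch below builds a k-NN graph and evaluates the penalty f^T L f on a vector of detection scores. This is a generic sketch, not the GraphEED solver; the feature shapes and the choice of k are assumptions.

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph

def knn_laplacian(X, k=5):
    """Build a symmetric k-NN adjacency matrix and its graph Laplacian L = D - W."""
    W = kneighbors_graph(X, n_neighbors=k, mode="connectivity").toarray()
    W = np.maximum(W, W.T)           # symmetrize the k-NN graph
    D = np.diag(W.sum(axis=1))
    return D - W

def laplacian_penalty(scores, L):
    """Graph regularization term f^T L f: neighboring samples should score similarly."""
    return float(scores @ L @ scores)

X = np.random.randn(200, 64)          # frame-level features (placeholder data)
f = np.random.randn(200)              # detection scores for those frames
L = knn_laplacian(X, k=5)
print(laplacian_penalty(f, L))
```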
3. Liu J, Li B, Lei M, Shi Y. Self-supervised knowledge distillation for complementary label learning. Neural Netw 2022; 155:318-327. [DOI: 10.1016/j.neunet.2022.08.014]
4
|
Xie L, Guo W, Wei H, Tang Y, Tao D. Efficient Unsupervised Dimension Reduction for Streaming Multiview Data. IEEE TRANSACTIONS ON CYBERNETICS 2022; 52:1772-1784. [PMID: 32525809 DOI: 10.1109/tcyb.2020.2996684] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
Multiview learning has received substantial attention over the past decade due to its powerful capacity to integrate various types of information. Conventional unsupervised multiview dimension reduction (UMDR) methods are usually conducted offline and may fail in many real-world applications where data arrive sequentially and the data distribution changes periodically; their high memory consumption and expensive retraining also make them hard to apply in large-scale scenarios. To remedy these drawbacks, we propose an online UMDR (OUMDR) framework that seeks a low-dimensional, informative consensus representation for streaming multiview data. View-specific weights are also learned to reflect the contribution of each view to the final consensus representation. A specific model called OUMDR-E is developed by introducing the exclusive group LASSO (EG-LASSO) to exploit intra-view and inter-view correlations. We then develop an efficient iterative optimization algorithm with limited memory and time requirements, where the convergence of each update is theoretically guaranteed. We evaluate the proposed approach on video-based expression recognition applications, and the experimental results demonstrate its superiority in terms of both effectiveness and efficiency.
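The exclusive group LASSO (EG-LASSO) term mentioned above applies an l1 norm within each group and a squared combination across groups. The snippet below is a generic sketch of that penalty under an assumed grouping of projection weights by view; it is not the OUMDR-E optimization itself.

```python
import numpy as np

def eg_lasso_penalty(weights, groups):
    """Exclusive group LASSO: sum over groups of (||w_g||_1)^2.
    The l1 norm inside a group promotes sparsity within the group, while the
    squared combination across groups spreads weight over groups instead of
    concentrating it in a single one."""
    return sum(np.abs(weights[idx]).sum() ** 2 for idx in groups)

# Example: projection weights for three views, grouped by view (assumed layout).
w = np.random.randn(9)
view_groups = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
print(eg_lasso_penalty(w, view_groups))
```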
5. Duan M, Li K, Li K, Tian Q. A Novel Multi-task Tensor Correlation Neural Network for Facial Attribute Prediction. ACM Trans Intell Syst Technol 2021. [DOI: 10.1145/3418285]
Abstract
Multi-task learning plays an important role in facial multi-attribute prediction. At present, most studies mine the information shared between attributes by sharing all convolutional layers. However, it is not appropriate to treat the low-level and high-level features of facial attributes equally, because high-level features are more biased toward category-specific content. In this article, a novel multi-attribute tensor correlation neural network (MTCN) is proposed to predict face attributes. MTCN shares all attribute features at the low-level layers and then separates the features of each attribute at the high-level layers. To better mine the correlations among high-level attribute features, each sub-network draws useful information from the other networks to enhance its own representation. A tensor canonical correlation analysis method is then used to capture the correlations among the highest-level attributes, further enhancing each attribute's representation, after which these features are mapped into a highly correlated space through the correlation matrix. Finally, extensive experiments on the CelebA and LFWA datasets verify the performance of MTCN, which achieves the best results compared with the latest multi-attribute recognition algorithms under the same settings.
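A minimal sketch of the "share low-level layers, separate high-level heads" pattern described above is given below. It omits the tensor canonical correlation analysis step, and the layer sizes and attribute count are assumptions rather than the MTCN architecture.

```python
import torch
import torch.nn as nn

class SharedBackboneMultiAttr(nn.Module):
    """Shared low-level convolutional trunk with one high-level head per attribute."""
    def __init__(self, num_attributes=40):
        super().__init__()
        self.shared = nn.Sequential(               # low-level features shared by all attributes
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4),
        )
        self.heads = nn.ModuleList([               # high-level, attribute-specific branches
            nn.Sequential(nn.Flatten(),
                          nn.Linear(64 * 4 * 4, 128), nn.ReLU(),
                          nn.Linear(128, 1))
            for _ in range(num_attributes)
        ])

    def forward(self, x):                          # x: (batch, 3, H, W)
        h = self.shared(x)
        return torch.cat([head(h) for head in self.heads], dim=1)  # (batch, num_attributes)

model = SharedBackboneMultiAttr()
logits = model(torch.randn(2, 3, 64, 64))          # one logit per attribute per image
```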
Affiliation(s)
- Keqin Li
- State University of New York, USA
6. Shetty RP, Sathyabhama A, Pai PS. An efficient online sequential extreme learning machine model based on feature selection and parameter optimization using cuckoo search algorithm for multi-step wind speed forecasting. Soft Comput 2020. [DOI: 10.1007/s00500-020-05222-x]
7. Zhong P, Gong Z, Shan J. Multiple Instance Learning for Multiple Diverse Hyperspectral Target Characterizations. IEEE Trans Neural Netw Learn Syst 2020; 31:246-258. [PMID: 30892253] [DOI: 10.1109/tnnls.2019.2900465]
Abstract
A practical hyperspectral target characterization task estimates a target signature from imprecisely labeled training data. The imprecision arises from the characteristics of real-world tasks: first, accurate pixel-level labels on training data are often unavailable; second, subpixel and occluded targets cause the training samples to contain mixed data and multiple target types. To address these imprecisions, this paper proposes a new hyperspectral target characterization method that produces multiple diverse hyperspectral target signatures under a multiple instance learning (MIL) framework. The proposed method uses only bag-level training samples and labels, which resolves the problems arising from mixed data and the lack of pixel-level labels. Moreover, by formulating a multiple-characterization MIL problem that includes a diversity-promoting term, the method learns a set of diverse target signatures, which resolves the problems arising from multiple target types in the training samples. Experiments on hyperspectral target detection using the learned target signatures on synthetic and real-world data show the effectiveness of the proposed method.
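One common way to keep a set of learned target signatures diverse, in the spirit of the diversity-promoting term described above, is to penalize their pairwise similarity while rewarding bag-level responses. The loss sketch below is a generic illustration under assumed tensor shapes, not the paper's formulation.

```python
import torch
import torch.nn.functional as F

def diversity_penalty(signatures):
    """Penalize pairwise cosine similarity between target signatures (rows),
    encouraging the learned set to stay diverse."""
    s = F.normalize(signatures, dim=1)
    sim = s @ s.T                                  # cosine similarity matrix
    off_diag = sim - torch.diag(torch.diag(sim))   # drop self-similarity
    return off_diag.abs().sum() / (s.shape[0] * (s.shape[0] - 1))

def bag_score(bag, signatures):
    """Bag-level detection score: best response of any pixel to any signature."""
    return (bag @ signatures.T).max()

signatures = torch.randn(4, 120, requires_grad=True)  # 4 signatures, 120 spectral bands
bag = torch.randn(50, 120)                             # one positive bag of 50 pixels
loss = -bag_score(bag, signatures) + 0.1 * diversity_penalty(signatures)
loss.backward()
```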