1
|
Guo S, Wang H, Lin S, Kou Z, Geng X. Addressing Skewed Heterogeneity via Federated Prototype Rectification With Personalization. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2025; 36:8442-8454. [PMID: 39190523 DOI: 10.1109/tnnls.2024.3438281] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/29/2024]
Abstract
Federated learning (FL) is an efficient framework designed to facilitate collaborative model training across multiple distributed devices while preserving user data privacy. A significant challenge of FL is data-level heterogeneity, i.e., skewed or long-tailed distribution of private data. Although various methods have been proposed to address this challenge, most of them assume that the underlying global data are uniformly distributed across all clients. This article investigates data-level heterogeneity FL with a brief review and redefines a more practical and challenging setting called skewed heterogeneous FL (SHFL). Accordingly, we propose a novel federated prototype rectification with personalization (FedPRP) which consists of two parts: federated personalization and federated prototype rectification. The former aims to construct balanced decision boundaries between dominant and minority classes based on private data, while the latter exploits both interclass discrimination and intraclass consistency to rectify empirical prototypes. Experiments on three popular benchmarks show that the proposed approach outperforms current state-of-the-art methods and achieves balanced performance in both personalization and generalization.
|
2
|
Shen S, Qi W, Liu X, Zeng J, Li S, Zhu X, Dong C, Wang B, Shi Y, Yao J, Wang B, Jing L, Cao S, Liang G. From virtual to reality: innovative practices of digital twins in tumor therapy. J Transl Med 2025; 23:348. [PMID: 40108714 PMCID: PMC11921680 DOI: 10.1186/s12967-025-06371-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2025] [Accepted: 03/10/2025] [Indexed: 03/22/2025] Open
Abstract
BACKGROUND As global cancer incidence and mortality rise, digital twin technology in precision medicine offers new opportunities for cancer treatment. OBJECTIVE This study aims to systematically analyze the current applications, research trends, and challenges of digital twin technology in tumor therapy, while exploring future directions. METHODS Relevant literature up to 2024 was retrieved from PubMed, Web of Science, and other databases. Data visualization was performed using R and VOSviewer software. The analysis includes the research initiation and trends, funding models, global research distribution, sample size analysis, and data processing and artificial intelligence applications. Furthermore, the study investigates the specific applications and effectiveness of digital twin technology in tumor diagnosis, treatment decision-making, prognosis prediction, and personalized management. RESULTS Since 2020, research on digital twin technology in oncology has surged, with significant contributions from the United States, Germany, Switzerland, and China. Funding primarily comes from government agencies, particularly the National Institutes of Health in the U.S. Sample size analysis reveals that large-sample studies have greater clinical reliability, while small-sample studies emphasize technology validation. In data processing and artificial intelligence applications, the integration of medical imaging, multi-omics data, and AI algorithms is key. By combining multimodal data integration with dynamic modeling, the accuracy of digital twin models has been significantly improved. However, the integration of different data types still faces challenges related to tool interoperability and limited standardization. Specific applications of digital twin technology have shown significant advantages in diagnosis, treatment decision-making, prognosis prediction, and surgical planning. 
CONCLUSION Digital twin technology holds substantial promise in tumor therapy by optimizing personalized treatment plans through integrated multimodal data and dynamic modeling. However, the study is limited by factors such as language restrictions, potential selection bias, and the relatively small number of published studies in this emerging field, which may affect the comprehensiveness and generalizability of our findings. Moreover, issues related to data heterogeneity, technical integration, and data privacy and ethics continue to impede its broader clinical application. Future research should promote international collaboration, establish unified interdisciplinary standards, and strengthen ethical regulations to accelerate the clinical translation of digital twin technology in cancer treatment.
Affiliation(s)
- Shiying Shen
- School of Nursing, Hangzhou Normal University, No.2318, Yuhangtang Road, Yuhang District, Hangzhou, 310021, China
| | - Wenhao Qi
- School of Nursing, Hangzhou Normal University, No.2318, Yuhangtang Road, Yuhang District, Hangzhou, 310021, China
| | - Xin Liu
- School of Nursing, Hangzhou Normal University, No.2318, Yuhangtang Road, Yuhang District, Hangzhou, 310021, China
| | - Jianwen Zeng
- School of Nursing, Hangzhou Normal University, No.2318, Yuhangtang Road, Yuhang District, Hangzhou, 310021, China
| | - Sixie Li
- School of Nursing, Hangzhou Normal University, No.2318, Yuhangtang Road, Yuhang District, Hangzhou, 310021, China
| | - Xiaohong Zhu
- School of Nursing, Hangzhou Normal University, No.2318, Yuhangtang Road, Yuhang District, Hangzhou, 310021, China
| | - Chaoqun Dong
- School of Nursing, Hangzhou Normal University, No.2318, Yuhangtang Road, Yuhang District, Hangzhou, 310021, China
| | - Bin Wang
- School of Nursing, Hangzhou Normal University, No.2318, Yuhangtang Road, Yuhang District, Hangzhou, 310021, China
| | - Yankai Shi
- School of Nursing, Hangzhou Normal University, No.2318, Yuhangtang Road, Yuhang District, Hangzhou, 310021, China
| | - Jiani Yao
- School of Nursing, Hangzhou Normal University, No.2318, Yuhangtang Road, Yuhang District, Hangzhou, 310021, China
| | - Bingsheng Wang
- School of Nursing, Hangzhou Normal University, No.2318, Yuhangtang Road, Yuhang District, Hangzhou, 310021, China
| | - Louxia Jing
- School of Nursing, Hangzhou Normal University, No.2318, Yuhangtang Road, Yuhang District, Hangzhou, 310021, China
| | - Shihua Cao
- School of Nursing, Hangzhou Normal University, No.2318, Yuhangtang Road, Yuhang District, Hangzhou, 310021, China.
- Key Engineering Research Center of Mobile Health Management System, Ministry of Education, Hangzhou, China.
| | - Guanmian Liang
- Zhejiang Cancer Hospital, Hangzhou, China
- Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, Hangzhou, China
| |
|
3
|
Zhang X, Zhang Y, Zhang Y, Cheng Y, Liu Q, Deng H, Ma Y, Bai L, Liu L. High-risk nuclide screening and parameter sensitivity analysis based on numerical simulation and machine learning. JOURNAL OF HAZARDOUS MATERIALS 2024; 480:136002. [PMID: 39378595 DOI: 10.1016/j.jhazmat.2024.136002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2024] [Revised: 09/01/2024] [Accepted: 09/27/2024] [Indexed: 10/10/2024]
Abstract
During nuclear accidents, large quantities of radionuclides are released into the environment, posing serious health hazards to local residents. Screening high-risk nuclides is critical for developing subsequent nuclear emergency response measures. To overcome the shortcomings of traditional screening methods, a machine learning method was proposed to screen high-risk nuclides and predict their contamination of groundwater more effectively. The performances of the Support Vector Machine (SVM), Random Forest (RF) and Back Propagation Neural Network (BPNN) algorithms were compared, and sensitivity analyses of the initial leakage concentration ratio (C0/Cp), distribution coefficient (Kd) and decay coefficient (λ) on the model outputs were performed. Results showed that the RF classification model achieved the highest prediction accuracy for screening high-risk nuclides. The contribution of the input parameters ranked as Kd > λ > C0/Cp. The BPNN regression model was found to be the best for predicting when high-risk nuclides would pollute groundwater. The output was negatively correlated with C0/Cp and positively correlated with Kd and λ, with the parameter influence ranking as Kd > C0/Cp > λ. The contribution of Kd came mainly from the parameter itself, whereas the contributions of C0/Cp and λ came mainly from their interactions with other parameters.
Affiliation(s)
- Xin Zhang
- College of Construction Engineering, Jilin University, Changchun 130026, China; Engineering Research Center of Geothermal Resources Development Technology and Equipment, Ministry of Education, Jilin University, Changchun 130026, China
| | - Yanjun Zhang
- College of Construction Engineering, Jilin University, Changchun 130026, China; Engineering Research Center of Geothermal Resources Development Technology and Equipment, Ministry of Education, Jilin University, Changchun 130026, China.
| | - Yu Zhang
- State Key Laboratory for Geomechanics and Deep Underground Engineering, China University of Mining and Technology, Xuzhou, Jiangsu 221116, China
| | - Yuxiang Cheng
- Engineering Research Center of Geothermal Resources Development Technology and Equipment, Ministry of Education, Jilin University, Changchun 130026, China; Key Lab of Groundwater Resource and Environment, Ministry of Education, Jilin University, Changchun 130026, China.
| | - Qiangbin Liu
- College of Construction Engineering, Jilin University, Changchun 130026, China
| | - Hao Deng
- College of Construction Engineering, Jilin University, Changchun 130026, China
| | - Yongjie Ma
- Zhejiang Huadong Geotechnical Investigation & Design Institute Co., Ltd, Hangzhou 310023, China
| | - Lin Bai
- College of Construction Engineering, Jilin University, Changchun 130026, China
| | - Lei Liu
- Chinergy Co., Ltd, Beijing 100193, China
| |
|
4
|
Chen X, Zhang Z, Abed AM, Lin L, Zhang H, Escorcia-Gutierrez J, Shohan AAA, Ali E, Xu H, Assilzadeh H, Zhen L. Designing energy-efficient buildings in urban centers through machine learning and enhanced clean water managements. ENVIRONMENTAL RESEARCH 2024; 260:119526. [PMID: 38972341 DOI: 10.1016/j.envres.2024.119526] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/06/2024] [Revised: 06/06/2024] [Accepted: 06/30/2024] [Indexed: 07/09/2024]
Abstract
Rainwater Harvesting (RWH) is increasingly recognized as a vital sustainable practice in urban environments, aimed at enhancing water conservation and reducing energy consumption. This study introduces an innovative integration of nano-composite materials such as Silver Nanoparticles (AgNPs) into RWH systems to elevate water treatment efficiency and assess the resulting environmental and energy-saving benefits. The study pursues this objective through a regression analysis approach with Support Vector Machines (SVM) and K-Nearest Neighbors (KNN). The inputs are building attributes, environmental parameters, sociodemographic factors, and the algorithms SVM and KNN; the outputs are predicted energy consumption, visual comfort outcomes, ROC-AUC values, and Kappa Indices. The integration of AgNPs into RWH systems demonstrated substantial environmental and operational benefits, achieving a 57% reduction in microbial content and 20% reductions in both chemical usage and energy consumption. These improvements highlight the potential of AgNPs to enhance water safety and reduce the environmental impact of traditional water treatments, making them a viable alternative for sustainable water management. Additionally, a hybrid SVM-KNN model predicted building energy usage and visual comfort with high accuracy and precision, underscoring its utility in optimizing urban building environments for sustainability and comfort.
Affiliation(s)
- Ximo Chen
- Zhejiang College of Security Technology, Wenzhou, 325000, China.
| | - Zhaojuan Zhang
- College of Information Engineering, China Jiliang University, Hangzhou, 310018, China.
| | - Azher M Abed
- Mechanical power Techniques Engineering Department, College of Engineering and Technologies, Al-Mustaqbal University, Babylon, 51001, Iraq; Al-Mustaqbal Center for energy research, Al-Mustaqbal University, Babylon, 51001, Iraq.
| | - Luning Lin
- Institute of Intelligent Media Computing, Hangzhou Dianzi University, Hangzhou 310018, China
| | - Haqi Zhang
- Institute of Intelligent Media Computing, Hangzhou Dianzi University, Hangzhou 310018, China
| | - José Escorcia-Gutierrez
- Department of Computational Science and Electronics, Universidad de la Costa, CUC, Barranquilla, 080002, Colombia.
| | - Ahmed Ali A Shohan
- Architecture Department, College of Architecture and Planning, King Khalid University, Saudi Arabia
| | - Elimam Ali
- Department of Civil Engineering, College of Engineering in Al-Kharj, Prince Sattam Bin Abdulaziz University, Al-Kharj, 11942, Saudi Arabia
| | - Huiting Xu
- Institute of Intelligent Media Computing, Hangzhou Dianzi University, Hangzhou 310018, China
| | - Hamid Assilzadeh
- Department of Biomaterials, Saveetha Dental College and Hospital, Saveetha Institute of Medical and Technical Sciences, Chennai 600077, India; Institute of Research and Development, Duy Tan University, Da Nang, Viet Nam; School of Engineering & Technology, Duy Tan University, Da Nang, Viet Nam; Faculty of Architecture and Urbanism, UTE University, Calle Rumipamba S/N and Bourgeois, Quito, Ecuador
| | - Lei Zhen
- Wenzhou Design Group Co. LTD, 325000, Wenzhou, China
| |
|
5
|
Wei M, Zhou Y, Li Z, Xu X. Class-imbalanced complementary-label learning via weighted loss. Neural Netw 2023; 166:555-565. [PMID: 37586256 DOI: 10.1016/j.neunet.2023.07.030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2022] [Revised: 06/17/2023] [Accepted: 07/23/2023] [Indexed: 08/18/2023]
Abstract
Complementary-label learning (CLL) is widely used in weakly supervised classification, but it faces a significant challenge in real-world datasets when confronted with class-imbalanced training samples. In such scenarios, the number of samples in one class is considerably lower than in other classes, which leads to a decline in prediction accuracy. Unfortunately, existing CLL approaches have not investigated this problem. To alleviate this challenge, we propose a novel problem setting that enables learning from class-imbalanced complementary labels for multi-class classification. To tackle this problem, we propose a novel CLL approach called Weighted Complementary-Label Learning (WCLL). The proposed method models a weighted empirical risk minimization loss by utilizing the class-imbalanced complementary labels, which is also applicable to multi-class imbalanced training samples. Furthermore, we derive an estimation error bound to provide theoretical assurance. To evaluate our approach, we conduct extensive experiments on several widely-used benchmark datasets and a real-world dataset, and compare our method with existing state-of-the-art methods. The proposed approach shows significant improvement in these datasets, even in the case of multiple class-imbalanced scenarios. Notably, the proposed method not only utilizes complementary labels to train a classifier but also solves the problem of class imbalance.
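The abstract does not give the WCLL loss itself, so the following is only a minimal Python sketch of the general idea behind a class-weighted empirical risk: each sample's cross-entropy term is scaled by a weight that grows as its class becomes rarer. It uses ordinary (not complementary) labels, and the function names, the inverse-frequency weighting rule, and the toy data are all assumptions for illustration, not the authors' method.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Assumed weighting rule: weight = n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Mean of per-sample cross-entropy terms, each scaled by its class weight."""
    total = sum(-weights[y] * math.log(p[y]) for p, y in zip(probs, labels))
    return total / len(labels)


# Toy imbalanced data (hypothetical): six samples of class 0, two of class 1.
labels = [0, 0, 0, 0, 0, 0, 1, 1]
# Hypothetical predicted class probabilities, one row per sample.
probs = [[0.9, 0.1]] * 6 + [[0.4, 0.6]] * 2

weights = inverse_frequency_weights(labels)
loss = weighted_cross_entropy(probs, labels, weights)
unweighted = weighted_cross_entropy(probs, labels, {0: 1.0, 1: 1.0})
print(weights, loss, unweighted)
```

With this rule the weights average to 1 over the training set, so the weighted loss stays on the same scale as the unweighted one while errors on the minority class count for more.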
Affiliation(s)
- Meng Wei
- School of Computer Science & Technology, China University of Mining and Technology, Xuzhou, China
| | - Yong Zhou
- School of Computer Science & Technology, China University of Mining and Technology, Xuzhou, China
| | - Zhongnian Li
- School of Computer Science & Technology, China University of Mining and Technology, Xuzhou, China
| | - Xinzheng Xu
- School of Computer Science & Technology, China University of Mining and Technology, Xuzhou, China.
| |
|
6
|
Zhao Y, Yang L. Distance metric learning based on the class center and nearest neighbor relationship. Neural Netw 2023; 164:631-644. [PMID: 37245477 DOI: 10.1016/j.neunet.2023.05.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2022] [Revised: 04/25/2023] [Accepted: 05/01/2023] [Indexed: 05/30/2023]
Abstract
Distance metric learning has been a promising technology to improve the performance of algorithms related to distance metrics. The existing distance metric learning methods are either based on the class center or the nearest neighbor relationship. In this work, we propose a new distance metric learning method based on the class center and nearest neighbor relationship (DMLCN). Specifically, when centers of different classes overlap, DMLCN first splits each class into several clusters and uses one center to represent one cluster. Then, a distance metric is learned such that each example is close to the corresponding cluster center and the nearest neighbor relationship is kept for each receptive field. Therefore, while characterizing the local structure of data, the proposed method leads to intra-class compactness and inter-class dispersion simultaneously. Further, to better process complex data, we introduce multiple metrics into DMLCN (MMLCN) by learning a local metric for each center. Following that, a new classification decision rule is designed based on the proposed methods. Moreover, we develop an iterative algorithm to optimize the proposed methods. The convergence and complexity are analyzed theoretically. Experiments on different types of data sets including artificial data sets, benchmark data sets and noise data sets show the feasibility and effectiveness of the proposed methods.
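To make the notion of a learned distance to a cluster center concrete, here is a minimal Python sketch of a Mahalanobis-style distance d(x, c) = sqrt((x - c)^T M (x - c)) and a nearest-center decision. The abstract does not specify how DMLCN learns M or what its decision rule looks like, so the matrix `M`, the centers, and the helper names below are hypothetical and simply fixed by hand.

```python
import math


def metric_distance(x, center, M):
    """Distance sqrt((x - c)^T M (x - c)) for a positive semidefinite matrix M."""
    diff = [a - b for a, b in zip(x, center)]
    dim = len(diff)
    # Matrix-vector product M @ diff, written out with plain lists.
    m_diff = [sum(M[i][j] * diff[j] for j in range(dim)) for i in range(dim)]
    return math.sqrt(sum(d * md for d, md in zip(diff, m_diff)))


def nearest_center(x, centers, M):
    """Return the label of the center closest to x under the metric M."""
    return min(centers, key=lambda label: metric_distance(x, centers[label], M))


# Hypothetical cluster centers and query point.
centers = {"a": [0.0, 0.0], "b": [2.5, 1.0]}
x = [1.0, 1.0]

identity = [[1.0, 0.0], [0.0, 1.0]]   # plain Euclidean distance
stretched = [[0.1, 0.0], [0.0, 4.0]]  # hand-picked metric that emphasizes axis 2

print(nearest_center(x, centers, identity), nearest_center(x, centers, stretched))
```

The same point is assigned to different centers under the two metrics, which is the effect a learned metric exploits: it reshapes distances so that examples fall closer to their own cluster center.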
Affiliation(s)
- Yifeng Zhao
- College of Information and Electrical Engineering, China Agricultural University, Beijing, China
| | - Liming Yang
- College of Information and Electrical Engineering, China Agricultural University, Beijing, China; College of Science, China Agricultural University, Beijing, Haidian, 100083, China.
| |
|
7
|
Mohan NJ, Murugan R, Goel T, Tanveer M, Roy P. An efficient microaneurysms detection approach in retinal fundus images. INT J MACH LEARN CYB 2023. [DOI: 10.1007/s13042-022-01696-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/19/2023]
|
8
|
Wang H, Zhu J, Feng F. Elastic net twin support vector machine and its safe screening rules. Inf Sci (N Y) 2023. [DOI: 10.1016/j.ins.2023.03.131] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/29/2023]
|
9
|
Hazarika BB, Gupta D, Kumar B. EEG Signal Classification Using a Novel Universum-Based Twin Parametric-Margin Support Vector Machine. Cognit Comput 2023. [DOI: 10.1007/s12559-023-10115-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/31/2023]
|
10
|
A Tailored Particle Swarm and Egyptian Vulture Optimization-Based Synthetic Minority-Oversampling Technique for Class Imbalance Problem. INFORMATION 2022. [DOI: 10.3390/info13080386] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022] Open
Abstract
Class imbalance is one of the significant challenges in classification problems. The uneven distribution of data samples in different classes may occur due to human error, improper/unguided collection of data samples, etc. The uneven distribution of class samples among classes may affect the classification accuracy of the developed model. The main motivation behind this study is the design and development of methodologies for handling class imbalance problems. In this study, a new variant of the synthetic minority oversampling technique (SMOTE) has been proposed with the hybridization of particle swarm optimization (PSO) and Egyptian vulture (EV). The proposed method has been termed SMOTE-PSOEV in this study. The proposed method generates an optimized set of synthetic samples from traditional SMOTE and augments the five datasets for verification and validation. The SMOTE-PSOEV is then compared with existing SMOTE variants, i.e., Tomek Link, Borderline SMOTE1, Borderline SMOTE2, Distance SMOTE, and ADASYN. After data augmentation to the minority classes, the performance of SMOTE-PSOEV has been evaluated using support vector machine (SVM), Naïve Bayes (NB), and k-nearest-neighbor (k-NN) classifiers. The results illustrate that the proposed models achieved higher accuracy than existing SMOTE variants.
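As a rough illustration of the interpolation step that SMOTE variants such as the one above build on, the following minimal Python sketch creates synthetic minority samples by moving a random fraction of the way from a minority point toward one of its nearest minority neighbors. It shows plain SMOTE-style interpolation only; the PSO and Egyptian vulture optimization of SMOTE-PSOEV is not reproduced, and the function name `smote_sketch`, the parameters `k` and `seed`, and the toy data are assumptions for illustration.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbors.

    Requires at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by Euclidean distance to the base point.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        # Pick one of the k nearest neighbors at random.
        neighbor = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


# Hypothetical minority-class samples with two features each.
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote_sketch(minority, n_new=6)
print(new_points)
```

Because every synthetic point lies on a segment between two existing minority samples, the new samples stay inside the region the minority class already occupies; optimized variants differ mainly in which of these candidates they keep.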
|
11
|
El-Deeb OM, Elbadawy W, Elzanfaly DS. The Effect of Imbalanced Classes on Students' Academic Performance Prediction. INTERNATIONAL JOURNAL OF E-COLLABORATION 2022. [DOI: 10.4018/ijec.304373] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Abstract
Imbalanced classes pose additional challenges in the educational data mining field, because most datasets collected from educational records are imbalanced by nature. Some classes dominate others and cause biased predictions. This paper studies the effects of imbalanced classes on the performance of seven different classifiers: J48, Random Forest, k-Nearest Neighbors, Naïve Bayes, Random Tree, SVM, and Linear Regression. Moreover, the effectiveness of the SMOTE technique for handling imbalanced data is evaluated against these classifiers. This is done through the proposal of an early predictive model that predicts students' academic performance and recommends their appropriate department in a multi-disciplinary institute. According to our results, the Random Forest technique performs best, with the highest accuracy of 94.585%.
|