1
Jiang Y, Liao D, Zhu Q, Lu YY. PhyloMix: enhancing microbiome-trait association prediction through phylogeny-mixing augmentation. Bioinformatics 2025; 41:btaf014. [PMID: 39799515] [PMCID: PMC11849959] [DOI: 10.1093/bioinformatics/btaf014]
Abstract
MOTIVATION: Understanding the associations between traits and microbial composition is a fundamental objective in microbiome research. Recently, researchers have turned to machine learning (ML) models to achieve this goal with promising results. However, the effectiveness of advanced ML models is often limited by the unique characteristics of microbiome data, which are typically high-dimensional, compositional, and imbalanced. These characteristics can hinder the models' ability to fully explore the relationships among taxa in predictive analyses. To address this challenge, data augmentation has become crucial. It involves generating synthetic samples with artificial labels based on existing data and incorporating these samples into the training set to improve ML model performance.
RESULTS: Here, we propose PhyloMix, a novel data augmentation method specifically designed for microbiome data to enhance predictive analyses. PhyloMix leverages the phylogenetic relationships among microbiome taxa as an informative prior to guide the generation of synthetic microbial samples. Leveraging phylogeny, PhyloMix creates new samples by removing a subtree from one sample and combining it with the corresponding subtree from another sample. Notably, PhyloMix is designed to address the compositional nature of microbiome data, effectively handling both raw counts and relative abundances. This approach introduces sufficient diversity into the augmented samples, leading to improved predictive performance. We empirically evaluated PhyloMix on six real microbiome datasets across five commonly used ML models. PhyloMix significantly outperforms distinct baseline methods including sample-mixing-based data augmentation techniques like vanilla mixup and compositional cutmix, as well as the phylogeny-based method TADA. We also demonstrated the wide applicability of PhyloMix in both supervised learning and contrastive representation learning.
AVAILABILITY AND IMPLEMENTATION: The Apache-licensed source code is available at https://github.com/batmen-lab/phylomix.
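To make the subtree-recombination idea concrete, here is a minimal NumPy sketch of phylogeny-guided mixing of two compositional samples. The function name, the way the subtree is passed in (as a set of leaf indices), and the label-mixing rule are illustrative assumptions, not the authors' implementation; the released package at https://github.com/batmen-lab/phylomix is the reference code.

```python
import numpy as np

def phylo_mix(x_a, x_b, y_a, y_b, subtree_leaves):
    """Graft one phylogenetic subtree from sample B into sample A (illustrative sketch).

    x_a, x_b       : 1-D arrays of relative abundances over the same leaf order.
    y_a, y_b       : label vectors (one-hot or class probabilities).
    subtree_leaves : indices of the leaves under a randomly chosen internal node.
    """
    mixed = np.asarray(x_a, dtype=float).copy()
    x_b = np.asarray(x_b, dtype=float)
    # replace the chosen subtree's abundances in A with those from B
    mixed[subtree_leaves] = x_b[subtree_leaves]
    total = mixed.sum()
    # mix labels in proportion to how much of the new sample's mass came from B (an assumption)
    lam = x_b[subtree_leaves].sum() / (total + 1e-12)
    # renormalise so the augmented sample remains compositional
    mixed = mixed / (total + 1e-12)
    y_mixed = (1.0 - lam) * np.asarray(y_a, dtype=float) + lam * np.asarray(y_b, dtype=float)
    return mixed, y_mixed

# toy usage: five taxa, where the chosen clade covers leaves 0-2
x1 = np.array([0.4, 0.1, 0.1, 0.2, 0.2])
x2 = np.array([0.05, 0.05, 0.3, 0.3, 0.3])
x_new, y_new = phylo_mix(x1, x2, y_a=[1, 0], y_b=[0, 1], subtree_leaves=np.array([0, 1, 2]))
```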
Affiliation(s)
- Yifan Jiang, Cheriton School of Computer Science, University of Waterloo, Waterloo, ON, N2L 3G1, Canada
- Disen Liao, Cheriton School of Computer Science, University of Waterloo, Waterloo, ON, N2L 3G1, Canada
- Qiyun Zhu, School of Life Sciences, Arizona State University, Tempe, AZ, 85281, United States
- Yang Young Lu, Cheriton School of Computer Science, University of Waterloo, Waterloo, ON, N2L 3G1, Canada
2
Civitelli E, Sortino A, Lapucci M, Bagattini F, Galvan G. A Robust Initialization of Residual Blocks for Effective ResNet Training Without Batch Normalization. IEEE Trans Neural Netw Learn Syst 2025; 36:1947-1952. [PMID: 37889824] [DOI: 10.1109/tnnls.2023.3325541]
Abstract
Batch normalization is an essential component of all state-of-the-art neural network architectures. However, since it introduces many practical issues, much recent research has been devoted to designing normalization-free architectures. In this brief, we show that weight initialization is key to training ResNet-like normalization-free networks. In particular, we propose a slight modification to the summation operation of a block output to the skip-connection branch, so that the whole network is correctly initialized. We show that this modified architecture achieves competitive results on CIFAR-10, CIFAR-100, and ImageNet without further regularization or algorithmic modifications.
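As a rough illustration of why initialization-time scaling matters in normalization-free residual networks, the sketch below down-weights the residual branch before it is summed into the skip connection. The 1/sqrt(total_blocks) factor is a common choice from the normalization-free literature and stands in for, rather than reproduces, the specific modification proposed in this brief.

```python
import torch
import torch.nn as nn

class NFResidualBlock(nn.Module):
    """Residual block without BatchNorm; the residual branch is scaled at initialisation."""

    def __init__(self, channels: int, total_blocks: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.act = nn.ReLU(inplace=True)
        # shrink the residual branch so the sum stays well-scaled without normalisation
        self.residual_scale = 1.0 / (total_blocks ** 0.5)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.conv2(self.act(self.conv1(x)))
        return x + self.residual_scale * out
```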
3
Liu M, Yu Y, Ji Z, Han J, Zhang Z. Tolerant Self-Distillation for image classification. Neural Netw 2024; 174:106215. [PMID: 38471261] [DOI: 10.1016/j.neunet.2024.106215]
Abstract
Deep neural networks tend to suffer from overfitting when the training data are not sufficient. In this paper, we introduce two metrics based on the intra-class distributions of correctly and incorrectly predicted samples to provide a new perspective on the overfitting issue. Based on these metrics, we propose Tolerant Self-Distillation (TSD), a knowledge distillation approach for alleviating overfitting that does not require pretraining a teacher model in advance. It introduces an online updating memory that selectively stores the class predictions of samples from past iterations, making it possible to distill knowledge across iterations. Specifically, the class predictions stored in the memory bank serve as soft labels for supervising samples from the same class in the current iteration in a reverse way, i.e., correctly predicted samples are supervised with the incorrect predictions while incorrectly predicted samples are supervised with the correct predictions. Consequently, the premature convergence caused by over-confident samples is mitigated, which helps the model converge to a better local optimum. Extensive experimental results on several image classification benchmarks, including small-scale, large-scale, and fine-grained datasets, demonstrate the superiority of the proposed TSD.
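A rough sketch of the reverse-supervision mechanism described above, under several simplifying assumptions (class-level running averages as the memory, a fixed distillation weight); it is meant to illustrate the idea, not to reproduce the paper's exact TSD formulation.

```python
import torch
import torch.nn.functional as F

class ReverseSoftTargets:
    def __init__(self, num_classes: int, momentum: float = 0.9, weight: float = 0.1):
        # per-class running-average predictions, kept separately for samples the
        # model currently gets right and samples it gets wrong (assumed memory design)
        self.correct_mem = torch.full((num_classes, num_classes), 1.0 / num_classes)
        self.wrong_mem = torch.full((num_classes, num_classes), 1.0 / num_classes)
        self.m, self.weight = momentum, weight

    @torch.no_grad()
    def update(self, probs: torch.Tensor, labels: torch.Tensor):
        preds = probs.argmax(dim=1)
        for p, y, ok in zip(probs, labels, preds == labels):
            mem = self.correct_mem if ok else self.wrong_mem
            mem[y] = self.m * mem[y] + (1.0 - self.m) * p

    def loss(self, logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        log_probs = F.log_softmax(logits, dim=1)
        preds = log_probs.argmax(dim=1)
        # correctly predicted samples are pulled toward the "wrong" memory, and vice versa
        targets = torch.where((preds == labels).unsqueeze(1),
                              self.wrong_mem[labels], self.correct_mem[labels])
        distill = F.kl_div(log_probs, targets, reduction="batchmean")
        return F.cross_entropy(logits, labels) + self.weight * distill
```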
Affiliation(s)
- Mushui Liu, College of Information Science and Electronic Engineering, Zhejiang University, China
- Yunlong Yu, College of Information Science and Electronic Engineering, Zhejiang University, China
- Zhong Ji, School of Electrical and Information Engineering, Tianjin University, China
- Jungong Han, Department of Computer Science, University of Sheffield, UK
- Zhongfei Zhang, Computer Science Department, Watson School, Binghamton University, State University of New York, USA
4
Ren H, Zhao Y, Zhang Y, Sun W. Learning label smoothing for text classification. PeerJ Comput Sci 2024; 10:e2005. [PMID: 38686010] [PMCID: PMC11057568] [DOI: 10.7717/peerj-cs.2005]
Abstract
Training with soft labels instead of hard labels can effectively improve the robustness and generalization of deep learning models. Label smoothing typically provides uniformly distributed soft labels during training, but it does not take semantic differences among labels into account. This article introduces discrimination-aware label smoothing, an adaptive label smoothing approach that learns appropriate label distributions for iterative optimization objectives. In this approach, positive and negative samples are employed to provide experience from both sides, and regularization and model calibration are improved through an iterative learning method. Experiments on five text classification datasets demonstrate the effectiveness of the proposed method.
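For reference, the uniform label smoothing that this adaptive method starts from can be written in a few lines; the epsilon value below is the usual default, not one taken from the article.

```python
import numpy as np

def smooth_labels(labels: np.ndarray, num_classes: int, eps: float = 0.1) -> np.ndarray:
    """Uniform label smoothing: keep 1 - eps on the true class, spread eps over all classes."""
    one_hot = np.eye(num_classes)[labels]
    return one_hot * (1.0 - eps) + eps / num_classes

# e.g. smooth_labels(np.array([2, 0]), num_classes=4) turns [0, 0, 1, 0]
# into [0.025, 0.025, 0.925, 0.025]
```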
Affiliation(s)
- Han Ren, Laboratory of Language Engineering and Computing, Guangdong University of Foreign Studies, Guangzhou, China; Laboratory of Language and Artificial Intelligence, Guangdong University of Foreign Studies, Guangzhou, China
- Yajie Zhao, School of Information Science and Technology, Guangdong University of Foreign Studies, Guangzhou, China
- Yong Zhang, School of Computer Science, Central China Normal University, Wuhan, China
- Wei Sun, School of Information Science and Technology, Qiong Tai Normal University, Haikou, China
5
Morales-Martín A, Mesas-Carrascosa FJ, Gutiérrez PA, Pérez-Porras FJ, Vargas VM, Hervás-Martínez C. Deep Ordinal Classification in Forest Areas Using Light Detection and Ranging Point Clouds. Sensors (Basel) 2024; 24:2168. [PMID: 38610379] [PMCID: PMC11014040] [DOI: 10.3390/s24072168]
Abstract
Recent advances in deep learning and aerial Light Detection And Ranging (LiDAR) have made it possible to refine the classification and segmentation of 3D point clouds and thereby contribute to the monitoring of complex environments. In this context, the present study focuses on developing an ordinal classification model in forest areas, where LiDAR point clouds can be classified into four distinct ordinal classes: ground, low vegetation, medium vegetation, and high vegetation. To do so, an effective soft labeling technique based on a proposed generalized exponential function (CE-GE) is applied to the PointNet network architecture. Statistical analyses based on the Kolmogorov-Smirnov test and Student's t-test reveal that the CE-GE method achieves the best results for all the evaluation metrics compared with the other methodologies. Regarding the confusion matrices of the best-performing alternative and the standard categorical cross-entropy method, the smoothed ordinal classification obtains a more consistent classification than the nominal approach. Thus, the proposed methodology significantly improves the point-by-point classification of PointNet, reducing the errors in distinguishing between the middle classes (low vegetation and medium vegetation).
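A minimal sketch of distance-aware soft ordinal targets, assuming an exponential decay over ordinal distance; the paper's CE-GE uses its own generalized exponential parameterization, so the decay rate and functional form here are placeholders.

```python
import numpy as np

def ordinal_soft_target(true_class: int, num_classes: int, alpha: float = 1.5) -> np.ndarray:
    """Soft target whose mass decays with ordinal distance from the true class."""
    distances = np.abs(np.arange(num_classes) - true_class)
    weights = np.exp(-alpha * distances)
    return weights / weights.sum()

# e.g. ordinal_soft_target(1, 4) is roughly [0.15, 0.67, 0.15, 0.03]:
# "low vegetation" shares mass mainly with its ordinal neighbours.
```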
Affiliation(s)
- Alejandro Morales-Martín, Department of Computer Science and Numerical Analysis, University of Córdoba, Campus de Rabanales, 14071 Córdoba, Spain
- Francisco-Javier Mesas-Carrascosa, Department of Graphic Engineering and Geomatics, University of Córdoba, Campus de Rabanales, 14071 Córdoba, Spain
- Pedro Antonio Gutiérrez, Department of Computer Science and Numerical Analysis, University of Córdoba, Campus de Rabanales, 14071 Córdoba, Spain
- Fernando-Juan Pérez-Porras, Department of Graphic Engineering and Geomatics, University of Córdoba, Campus de Rabanales, 14071 Córdoba, Spain
- Víctor Manuel Vargas, Department of Computer Science and Numerical Analysis, University of Córdoba, Campus de Rabanales, 14071 Córdoba, Spain
- César Hervás-Martínez, Department of Computer Science and Numerical Analysis, University of Córdoba, Campus de Rabanales, 14071 Córdoba, Spain
6
Moshkov N, Bornholdt M, Benoit S, Smith M, McQuin C, Goodman A, Senft RA, Han Y, Babadi M, Horvath P, Cimini BA, Carpenter AE, Singh S, Caicedo JC. Learning representations for image-based profiling of perturbations. Nat Commun 2024; 15:1594. [PMID: 38383513] [PMCID: PMC10881515] [DOI: 10.1038/s41467-024-45999-1]
Abstract
Measuring the phenotypic effect of treatments on cells through imaging assays is an efficient and powerful way of studying cell biology, and requires computational methods for transforming images into quantitative data. Here, we present an improved strategy for learning representations of treatment effects from high-throughput imaging, following a causal interpretation. We use weakly supervised learning to model associations between images and treatments, and show that it encodes both confounding factors and phenotypic features in the learned representation. To facilitate their separation, we constructed a large training dataset with images from five different studies to maximize experimental diversity, following insights from our causal analysis. Training a model with this dataset successfully improves downstream performance and produces a reusable convolutional network for image-based profiling, which we call Cell Painting CNN. We evaluated our strategy on three publicly available Cell Painting datasets and observed that the Cell Painting CNN improves performance in downstream analysis by up to 30% relative to classical features, while also being more computationally efficient.
Affiliation(s)
- Nikita Moshkov, HUN-REN Biological Research Centre, 62 Temesvári krt, Szeged, 6726, Hungary
- Michael Bornholdt, Broad Institute of MIT and Harvard, 415 Main St, Cambridge, MA, 02141, USA
- Santiago Benoit, Broad Institute of MIT and Harvard, 415 Main St, Cambridge, MA, 02141, USA; Carnegie Mellon University, 5000 Forbes Ave, Pittsburgh, PA, 15213, USA
- Matthew Smith, Broad Institute of MIT and Harvard, 415 Main St, Cambridge, MA, 02141, USA; Harvard College, 86 Brattle Street, Cambridge, MA, 02138, USA
- Claire McQuin, Broad Institute of MIT and Harvard, 415 Main St, Cambridge, MA, 02141, USA
- Allen Goodman, Broad Institute of MIT and Harvard, 415 Main St, Cambridge, MA, 02141, USA
- Rebecca A Senft, Broad Institute of MIT and Harvard, 415 Main St, Cambridge, MA, 02141, USA
- Yu Han, Broad Institute of MIT and Harvard, 415 Main St, Cambridge, MA, 02141, USA
- Mehrtash Babadi, Broad Institute of MIT and Harvard, 415 Main St, Cambridge, MA, 02141, USA
- Peter Horvath, HUN-REN Biological Research Centre, 62 Temesvári krt, Szeged, 6726, Hungary
- Beth A Cimini, Broad Institute of MIT and Harvard, 415 Main St, Cambridge, MA, 02141, USA
- Anne E Carpenter, Broad Institute of MIT and Harvard, 415 Main St, Cambridge, MA, 02141, USA
- Shantanu Singh, Broad Institute of MIT and Harvard, 415 Main St, Cambridge, MA, 02141, USA
- Juan C Caicedo, Broad Institute of MIT and Harvard, 415 Main St, Cambridge, MA, 02141, USA; Morgridge Institute for Research, 330 N Orchard St, Madison, WI, 53715, USA; Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, 1300 University Ave, Madison, WI, 53706, USA
7
Passmore E, Kwong AL, Greenstein S, Olsen JE, Eeles AL, Cheong JLY, Spittle AJ, Ball G. Automated identification of abnormal infant movements from smart phone videos. PLOS Digit Health 2024; 3:e0000432. [PMID: 38386627] [PMCID: PMC10883563] [DOI: 10.1371/journal.pdig.0000432]
Abstract
Cerebral palsy (CP) is the most common cause of physical disability during childhood, occurring at a rate of 2.1 per 1000 live births. Early diagnosis is key to improving functional outcomes for children with CP. The General Movements (GMs) Assessment has high predictive validity for the detection of CP and is routinely used in high-risk infants, but only 50% of infants with CP have overt risk factors at birth. The implementation of CP screening programs represents an important endeavour, but feasibility is limited by access to trained GMs assessors. To facilitate progress towards this goal, we report a deep-learning framework for automating the GMs Assessment. We acquired 503 videos of infants aged between 12 and 18 weeks term-corrected age, captured at home by parents and caregivers using a dedicated smartphone app. Using a deep learning algorithm, we automatically labelled and tracked 18 key body points in each video. We designed a custom pipeline to adjust for camera movement and infant size and trained a second machine learning algorithm to predict GMs classification from body point movement. Our automated body point labelling approach achieved human-level accuracy (mean ± SD error of 3.7 ± 5.2% of infant length) compared to gold-standard human annotation. Using body point tracking data, our prediction model achieved a cross-validated area under the curve (mean ± SD) of 0.80 ± 0.08 on unseen test data for predicting expert GMs classification, with a sensitivity of 76% ± 15% for abnormal GMs and a negative predictive value of 94% ± 3%. This work highlights the potential for automated GMs screening programs to detect abnormal movements in infants as early as three months term-corrected age using digital technologies.
Affiliation(s)
- E Passmore, Murdoch Children's Research Institute, Developmental Imaging, Melbourne, Australia; University of Melbourne, Engineering and Information Technology, Melbourne, Australia; University of Melbourne, Medicine, Dentistry & Health Sciences, Melbourne, Australia; Royal Children's Hospital, Gait Analysis Laboratory, Melbourne, Australia
- A L Kwong, University of Melbourne, Medicine, Dentistry & Health Sciences, Melbourne, Australia; Murdoch Children's Research Institute, Victorian Infant Brain Studies, Melbourne, Australia; Royal Women's Hospital, Newborn Research Centre, Melbourne, Australia
- S Greenstein, Murdoch Children's Research Institute, Developmental Imaging, Melbourne, Australia
- J E Olsen, Murdoch Children's Research Institute, Victorian Infant Brain Studies, Melbourne, Australia; Royal Women's Hospital, Newborn Research Centre, Melbourne, Australia
- A L Eeles, Murdoch Children's Research Institute, Victorian Infant Brain Studies, Melbourne, Australia; Royal Women's Hospital, Newborn Research Centre, Melbourne, Australia
- J L Y Cheong, University of Melbourne, Medicine, Dentistry & Health Sciences, Melbourne, Australia; Murdoch Children's Research Institute, Victorian Infant Brain Studies, Melbourne, Australia; Royal Women's Hospital, Newborn Research Centre, Melbourne, Australia
- A J Spittle, University of Melbourne, Medicine, Dentistry & Health Sciences, Melbourne, Australia; Murdoch Children's Research Institute, Victorian Infant Brain Studies, Melbourne, Australia
- G Ball, Murdoch Children's Research Institute, Developmental Imaging, Melbourne, Australia; University of Melbourne, Medicine, Dentistry & Health Sciences, Melbourne, Australia
8
Arefinia F, Aria M, Rabiei R, Hosseini A, Ghaemian A, Roshanpoor A. Non-invasive fractional flow reserve estimation using deep learning on intermediate left anterior descending coronary artery lesion angiography images. Sci Rep 2024; 14:1818. [PMID: 38245614] [PMCID: PMC10799954] [DOI: 10.1038/s41598-024-52360-5]
Abstract
This study aimed to design an end-to-end deep learning model for estimating fractional flow reserve (FFR) from angiography images, classifying left anterior descending (LAD) branch angiography images with average stenosis between 50% and 70% into two categories: FFR > 80 and FFR ≤ 80. In this study, 3625 images were extracted from the angiography films of 41 patients. Nine pre-trained convolutional neural networks (CNN), including DenseNet121, InceptionResNetV2, VGG16, VGG19, ResNet50V2, Xception, MobileNetV3Large, DenseNet201, and DenseNet169, were used to extract image features. DenseNet169 showed higher performance than the other networks. The AUC, accuracy, sensitivity, specificity, precision, and F1-score of the proposed DenseNet169 network were 0.81, 0.81, 0.86, 0.75, 0.82, and 0.84, respectively. The deep learning-based method proposed in this study can non-invasively and consistently estimate FFR from angiographic images, offering significant clinical potential for diagnosing and treating coronary artery disease by combining anatomical and physiological parameters.
Affiliation(s)
- Farhad Arefinia, Department of Health Information Technology and Management, School of Allied Medical Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Mehrad Aria, Cancer Research Center, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Reza Rabiei, Department of Health Information Technology and Management, School of Allied Medical Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Azamossadat Hosseini, Department of Health Information Technology and Management, School of Allied Medical Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Ali Ghaemian, Department of Cardiology, Faculty of Medicine, Cardiovascular Research Center, Mazandaran University of Medical Sciences, Sari, Iran
- Arash Roshanpoor, Department of Computer, Yadegar-e-Imam Khomeini (RAH), Islamic Azad University, Janat-Abad Branch, Tehran, Iran
9
Sabater A, Montesano L, Murillo AC. Event Transformer+. A Multi-Purpose Solution for Efficient Event Data Processing. IEEE Trans Pattern Anal Mach Intell 2023; 45:16013-16020. [PMID: 37656643] [DOI: 10.1109/tpami.2023.3311336]
Abstract
Event cameras record sparse illumination changes with high temporal resolution and high dynamic range. Thanks to their sparse recording and low consumption, they are increasingly used in applications such as AR/VR and autonomous driving. Current top-performing methods often ignore specific event-data properties, leading to the development of generic but computationally expensive algorithms, while event-aware methods do not perform as well. We propose Event Transformer+, which improves our seminal work EvT with a refined patch-based event representation and a more robust backbone to achieve more accurate results, while still benefiting from event-data sparsity to increase its efficiency. Additionally, we show how our system can work with different data modalities and propose specific output heads for event-stream classification (i.e., action recognition) and per-pixel predictions (dense depth estimation). Evaluation results show better performance than the state of the art while requiring minimal computation resources, both on GPU and CPU.
10
Alexandridis KP, Luo S, Nguyen A, Deng J, Zafeiriou S. Inverse Image Frequency for Long-Tailed Image Recognition. IEEE Trans Image Process 2023; 32:5721-5736. [PMID: 37824316] [DOI: 10.1109/tip.2023.3321461]
Abstract
The long-tailed distribution is a common phenomenon in the real world. Large-scale image datasets inevitably exhibit this long-tailed property, and models trained on imbalanced data can achieve high performance on the over-represented categories but struggle on the under-represented ones, leading to biased predictions and performance degradation. To address this challenge, we propose a novel de-biasing method named Inverse Image Frequency (IIF). IIF is a multiplicative margin adjustment transformation of the logits in the classification layer of a convolutional neural network. Our method achieves stronger performance than similar works, and it is especially useful for downstream tasks such as long-tailed instance segmentation, as it produces fewer false positive detections. Our extensive experiments show that IIF surpasses the state of the art on many long-tailed benchmarks such as ImageNet-LT, CIFAR-LT, Places-LT and LVIS, reaching 55.8% top-1 accuracy with ResNet50 on ImageNet-LT and 26.3% segmentation AP with MaskRCNN ResNet50 on LVIS. Code is available at https://github.com/kostas1515/iif.
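A minimal sketch of a multiplicative, frequency-driven logit adjustment in the spirit of IIF: rare classes receive larger weights that scale the logits before the usual cross-entropy. The log-inverse-frequency weighting shown here is an assumption for illustration rather than the paper's exact formulation; the linked repository holds the reference code.

```python
import torch
import torch.nn.functional as F

def iif_adjusted_loss(logits: torch.Tensor, targets: torch.Tensor,
                      class_counts: torch.Tensor) -> torch.Tensor:
    """Cross-entropy over multiplicatively adjusted logits (illustrative sketch)."""
    # assumes every class has at least one training image
    freq = class_counts.float() / class_counts.sum()
    iif_weights = torch.log(1.0 / freq)      # rare classes get larger weights
    adjusted_logits = logits * iif_weights    # multiplicative margin adjustment
    return F.cross_entropy(adjusted_logits, targets)
```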
11
Nie L, Sun Z, Shan F, Li C, Ding X, Shen C. An artificial intelligence framework for the diagnosis of prosthetic joint infection based on 99mTc-MDP dynamic bone scintigraphy. Eur Radiol 2023; 33:6794-6803. [PMID: 37115217] [DOI: 10.1007/s00330-023-09687-w]
Abstract
OBJECTIVES: Dynamic bone scintigraphy (DBS) is the first widely reliable and simple imaging modality in nuclear medicine that can be used to diagnose prosthetic joint infection (PJI). We aimed to apply artificial intelligence to diagnose PJI in patients after total hip or knee arthroplasty (THA or TKA) based on 99mTc-methylene diphosphonate (99mTc-MDP) DBS.
METHODS: A total of 449 patients (255 THAs and 194 TKAs) with a final diagnosis were retrospectively enrolled and analyzed. The dataset was divided into a training and validation set and an independent test set. A customized framework composed of two data preprocessing algorithms and a diagnosis model (dynamic bone scintigraphy effective neural network, DBS-eNet) was compared with mainstream modified classification models and experienced nuclear medicine specialists on the corresponding datasets.
RESULTS: In the fivefold cross-validation test, diagnostic accuracies of 86.48% for prosthetic knee infection (PKI) and 86.33% for prosthetic hip infection (PHI) were obtained using the proposed framework. On the independent test set, the diagnostic accuracies and AUC values were 87.74% and 0.957 for PKI and 86.36% and 0.906 for PHI, respectively. The customized framework demonstrated better overall diagnostic performance compared to other classification models and showed superiority in diagnosing PKI and consistency in diagnosing PHI compared to specialists.
CONCLUSION: The customized framework can be used to effectively and accurately diagnose PJI based on 99mTc-MDP DBS. The excellent diagnostic performance of this method indicates its potential clinical practical value in the future.
KEY POINTS:
• The proposed framework in the current study achieved high diagnostic performance for prosthetic knee infection (PKI) and prosthetic hip infection (PHI) with AUC values of 0.957 and 0.906, respectively.
• The customized framework demonstrated better overall diagnostic performance compared to other classification models.
• Compared to experienced nuclear medicine physicians, the customized framework showed superiority in diagnosing PKI and consistency in diagnosing PHI.
Affiliation(s)
- Liangbing Nie, School of Computer Engineering and Science, Shanghai University, Shanghai, 200444, China
- Zhenkui Sun, Department of Nuclear Medicine, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, 200233, China; Bone Nonunion & Bone Infection MDT, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, 200233, China
- Fengling Shan, Department of Nuclear Medicine, Shanghai Pudong Hospital, Fudan University Pudong Medical Center, Shanghai, 201399, China
- Chengfan Li, School of Computer Engineering and Science, Shanghai University, Shanghai, 200444, China
- Xuehai Ding, School of Computer Engineering and Science, Shanghai University, Shanghai, 200444, China
- Chentian Shen, Department of Nuclear Medicine, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, 200233, China; Bone Nonunion & Bone Infection MDT, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, 200233, China
12
Fu L, Peng H, Liu S. KG-MFEND: an efficient knowledge graph-based model for multi-domain fake news detection. J Supercomput 2023; 79:1-28. [PMID: 37359329] [PMCID: PMC10184086] [DOI: 10.1007/s11227-023-05381-2]
Abstract
The widespread dissemination of fake news on social media has adverse effects on the public and on social development. Most existing techniques are limited to a single domain (e.g., medicine or politics) when identifying fake news. However, many differences, such as word usage, commonly exist across domains, which leads these methods to perform poorly in other domains. In the real world, social media releases millions of news pieces in diverse domains every day. Therefore, it is of significant practical importance to propose a fake news detection model that can be applied to multiple domains. In this paper, we propose a novel framework based on knowledge graphs (KG) for multi-domain fake news detection, named KG-MFEND. The model's performance is enhanced by improving BERT and integrating external knowledge to alleviate domain differences at the word level. Specifically, we construct a new KG that encompasses multi-domain knowledge and injects entity triples to build a sentence tree that enriches the news background knowledge. To address the problems of the embedding space and knowledge noise, we use the soft position and visible matrix in knowledge embedding. To reduce the influence of label noise, we add label smoothing to the training. Extensive experiments are conducted on real Chinese datasets, and the results show that KG-MFEND has a strong generalization capability in single, mixed, and multiple domains and outperforms the current state-of-the-art methods for multi-domain fake news detection.
Affiliation(s)
- Lifang Fu, Northeast Agricultural University, Harbin, 150030, China
- Huanxin Peng, School of Engineering, Northeast Agricultural University, Harbin, 150030, China
- Shuai Liu, School of Engineering, Northeast Agricultural University, Harbin, 150030, China
13
Multi-task learning based high-value patent and standard-essential patent identification model. Inf Process Manag 2023. [DOI: 10.1016/j.ipm.2023.103327]
14
Li P, Han T, Ren Y, Xu P, Yu H. Improved YOLOv4-tiny based on attention mechanism for skin detection. PeerJ Comput Sci 2023; 9:e1288. [PMID: 37346516] [PMCID: PMC10280476] [DOI: 10.7717/peerj-cs.1288]
Abstract
Background: An automatic bathing robot needs to identify the area to be bathed in order to perform visually-guided bathing tasks. Skin detection is the first step. Deep convolutional neural network (CNN)-based object detection algorithms show excellent robustness to light and environmental changes when performing skin detection, and one-stage object detection algorithms have good real-time performance and are widely used in practical projects.
Methods: In our previous work, we performed skin detection using Faster R-CNN (ResNet50 as backbone), Faster R-CNN (MobileNetV2 as backbone), YOLOv3 (DarkNet53 as backbone), YOLOv4 (CSPDarknet53 as backbone), and CenterNet (Hourglass as backbone), and found that YOLOv4 had the best performance. In this study, we considered the convenience of practical deployment and used the lightweight version of YOLOv4, i.e., YOLOv4-tiny, for skin detection. Additionally, we added three kinds of attention mechanisms to strengthen feature extraction: SE, ECA, and CBAM. We added the attention module to the two feature layers of the backbone output, and in the enhanced feature extraction network we applied the attention module to the up-sampled features. For a full comparison, we used other lightweight methods that use MobileNetV1, MobileNetV2, and MobileNetV3 as the backbone of YOLOv4. We established a comprehensive evaluation index to evaluate the performance of the models that mainly reflects the balance between model size and mAP.
Results: The experimental results revealed that the weight file of YOLOv4-tiny without attention mechanisms was reduced to 9.2% of that of YOLOv4, while its mAP maintained 67.3% of that of YOLOv4. YOLOv4-tiny's performance improved after combining the CBAM and ECA modules, but the addition of SE deteriorated its performance. MobileNetVX_YOLOv4 (X = 1, 2, 3), which used MobileNetV1, MobileNetV2, and MobileNetV3 as the backbone of YOLOv4, showed higher mAP than the YOLOv4-tiny series (YOLOv4-tiny and the three attention-based improvements) but had larger weight files. The network performance was evaluated using the comprehensive evaluation index. The model that integrates the CBAM attention mechanism and YOLOv4-tiny achieved a good balance between model size and detection accuracy.
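Of the three attention modules compared above, the squeeze-and-excitation (SE) block is the simplest to sketch: channel descriptors from global average pooling are turned into per-channel gates that rescale the feature map. The reduction ratio below is the usual default, not a value from the paper.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Standard squeeze-and-excitation channel attention."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)            # squeeze: global spatial average
        self.fc = nn.Sequential(                        # excitation: per-channel gates
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        gate = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * gate                                 # rescale each channel of the feature map
```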
Affiliation(s)
- Ping Li, Institute of Rehabilitation Engineering and Technology, University of Shanghai for Science and Technology, Shanghai, China; Department of Biomedical Engineering, Changzhi Medical College, Changzhi, Shanxi, China
- Taiyu Han, Institute of Rehabilitation Engineering and Technology, University of Shanghai for Science and Technology, Shanghai, China
- Yifei Ren, Institute of Rehabilitation Engineering and Technology, University of Shanghai for Science and Technology, Shanghai, China
- Peng Xu, Institute of Rehabilitation Engineering and Technology, University of Shanghai for Science and Technology, Shanghai, China
- Hongliu Yu, Institute of Rehabilitation Engineering and Technology, University of Shanghai for Science and Technology, Shanghai, China
15
Vargas VM, Gutiérrez PA, Rosati R, Romeo L, Frontoni E, Hervás-Martínez C. Exponential loss regularisation for encouraging ordinal constraint to shotgun stocks quality assessment. Appl Soft Comput 2023. [DOI: 10.1016/j.asoc.2023.110191]
16
Lu L, Cai Y, Huang H, Wang P. An efficient fine-grained vehicle recognition method based on part-level feature optimization. Neurocomputing 2023. [DOI: 10.1016/j.neucom.2023.03.035]
17
Wang X, Zong C. Learning Category Distribution for Text Classification. ACM Trans Asian Low-Resour Lang Inf Process 2023. [DOI: 10.1145/3585279]
Abstract
Label smoothing has a wide range of applications in the machine learning field. Nonetheless, label smoothing only softens the targets by adding a uniform distribution to one-hot vectors, which cannot truthfully reflect the underlying relations among categories. However, learning category relations is of vital importance in many fields, such as emotion taxonomy and open set recognition. In this work, we propose a method to obtain a label distribution for each category (category distribution) that reveals category relations. Furthermore, based on the learned category distribution, we calculate new soft targets to improve the classification performance of the model. Compared with existing methods, our algorithm can improve neural network models without any side information or additional neural network modules by considering category relations. Extensive experiments have been conducted on four original datasets and ten constructed noisy datasets with three basic neural network models to validate our algorithm. The results demonstrate the effectiveness of our algorithm on the classification task. In addition, three experiments (arrangement, clustering, and similarity) were also conducted to validate the intrinsic quality of the learned category distribution. The results indicate that the learned category distribution can well express the underlying relations among categories.
Affiliation(s)
- Xiangyu Wang, National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
- Chengqing Zong, National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
18
Hallaji E, Razavi-Far R, Saif M, Herrera-Viedma E. Label noise analysis meets adversarial training: A defense against label poisoning in federated learning. Knowl Based Syst 2023. [DOI: 10.1016/j.knosys.2023.110384]
19
Fan DP, Zhang J, Xu G, Cheng MM, Shao L. Salient Objects in Clutter. IEEE Trans Pattern Anal Mach Intell 2023; 45:2344-2366. [PMID: 35404809] [DOI: 10.1109/tpami.2022.3166451]
Abstract
In this paper, we identify and address a serious design bias of existing salient object detection (SOD) datasets, which unrealistically assume that each image should contain at least one clear and uncluttered salient object. This design bias has led to a saturation in performance for state-of-the-art SOD models when evaluated on existing datasets. However, these models are still far from satisfactory when applied to real-world scenes. Based on our analyses, we propose a new high-quality dataset and update the previous saliency benchmark. Specifically, our dataset, called Salient Objects in Clutter (SOC), includes images with both salient and non-salient objects from several common object categories. In addition to object category annotations, each salient image is accompanied by attributes that reflect common challenges in common scenes, which can help provide deeper insight into the SOD problem. Further, with a given saliency encoder, e.g., the backbone network, existing saliency models are designed to achieve mapping from the training image set to the training ground-truth set. We therefore argue that improving the dataset can yield higher performance gains than focusing only on the decoder design. With this in mind, we investigate several dataset-enhancement strategies, including label smoothing to implicitly emphasize salient boundaries, random image augmentation to adapt saliency models to various scenarios, and self-supervised learning as a regularization strategy to learn from small datasets. Our extensive results demonstrate the effectiveness of these tricks. We also provide a comprehensive benchmark for SOD, which can be found in our repository: https://github.com/DengPingFan/SODBenchmark.
20
Vargas VM, Gutiérrez PA, Rosati R, Romeo L, Frontoni E, Hervás-Martínez C. Deep learning based hierarchical classifier for weapon stock aesthetic quality control assessment. Comput Ind 2023. [DOI: 10.1016/j.compind.2022.103786]
21
A Soft Label Deep Learning to Assist Breast Cancer Target Therapy and Thyroid Cancer Diagnosis. Cancers (Basel) 2022; 14:5312. [PMID: 36358732] [PMCID: PMC9657740] [DOI: 10.3390/cancers14215312]
Abstract
According to the World Health Organization Report 2022, cancer is the most common cause of death contributing to nearly one out of six deaths worldwide. Early cancer diagnosis and prognosis have become essential in reducing the mortality rate. On the other hand, cancer detection is a challenging task in cancer pathology. Trained pathologists can detect cancer, but their decisions are subjective to high intra- and inter-observer variability, which can lead to poor patient care owing to false-positive and false-negative results. In this study, we present a soft label fully convolutional network (SL-FCN) to assist in breast cancer target therapy and thyroid cancer diagnosis, using four datasets. To aid in breast cancer target therapy, the proposed method automatically segments human epidermal growth factor receptor 2 (HER2) amplification in fluorescence in situ hybridization (FISH) and dual in situ hybridization (DISH) images. To help in thyroid cancer diagnosis, the proposed method automatically segments papillary thyroid carcinoma (PTC) on Papanicolaou-stained fine needle aspiration and thin prep whole slide images (WSIs). In the evaluation of segmentation of HER2 amplification in FISH and DISH images, we compare the proposed method with thirteen deep learning approaches, including U-Net, U-Net with InceptionV5, Ensemble of U-Net with Inception-v4, Inception-Resnet-v2 encoder, and ResNet-34 encoder, SegNet, FCN, modified FCN, YOLOv5, CPN, SOLOv2, BCNet, and DeepLabv3+ with three different backbones, including MobileNet, ResNet, and Xception, on three clinical datasets, including two DISH datasets on two different magnification levels and a FISH dataset. The result on DISH breast dataset 1 shows that the proposed method achieves high accuracy of 87.77 ± 14.97%, recall of 91.20 ± 7.72%, and F1-score of 81.67 ± 17.76%, while, on DISH breast dataset 2, the proposed method achieves high accuracy of 94.64 ± 2.23%, recall of 83.78 ± 6.42%, and F1-score of 85.14 ± 6.61% and, on the FISH breast dataset, the proposed method achieves high accuracy of 93.54 ± 5.24%, recall of 83.52 ± 13.15%, and F1-score of 86.98 ± 9.85%, respectively. Furthermore, the proposed method outperforms most of the benchmark approaches by a significant margin (p <0.001). In evaluation of segmentation of PTC on Papanicolaou-stained WSIs, the proposed method is compared with three deep learning methods, including Modified FCN, U-Net, and SegNet. The experimental result demonstrates that the proposed method achieves high accuracy of 99.99 ± 0.01%, precision of 92.02 ± 16.6%, recall of 90.90 ± 14.25%, and F1-score of 89.82 ± 14.92% and significantly outperforms the baseline methods, including U-Net and FCN (p <0.001). With the high degree of accuracy, precision, and recall, the results show that the proposed method could be used in assisting breast cancer target therapy and thyroid cancer diagnosis with faster evaluation and minimizing human judgment errors.
22
Li H, Huang G, Li Y, Zhang X, Wang Y. Concept-Based Label Distribution Learning for Text Classification. Int J Comput Intell Syst 2022. [DOI: 10.1007/s44196-022-00144-y]
Abstract
Text classification is a crucial task in data mining and artificial intelligence. In recent years, deep learning-based text classification methods have developed rapidly. Deep learning methods supervise model training by representing a label as a one-hot vector. However, the one-hot label representation cannot adequately reflect the relation between an instance and the labels, as labels are often not completely independent and an instance may be associated with multiple labels in practice. Simply representing the labels as one-hot vectors leads to overconfidence in the model, making it difficult to distinguish some label confusions. In this paper, we propose a simulated label distribution method based on concepts (SLDC) to tackle this problem. This method captures the overlap between labels by computing the similarity between an instance and the labels and generates a new simulated label distribution to assist model training. In particular, we incorporate conceptual information from a knowledge base into the representations of instances and labels to address the surface mismatching problem that arises when instances and labels are compared for similarity. Moreover, to make full use of the simulated label distribution and the original label vector, we set up a multi-loss function to supervise the training process. Extensive experiments demonstrate the effectiveness of SLDC on five complex text classification datasets. Further experiments also verify that SLDC is especially helpful for confused datasets.
23
Pan Y, Chen J, Zhang Y, Zhang Y. An efficient CNN-LSTM Network with spectral normalization and label smoothing technologies for SSVEP frequency recognition. J Neural Eng 2022; 19. [PMID: 36041426] [DOI: 10.1088/1741-2552/ac8dc5]
Abstract
OBJECTIVE: Steady-state visual evoked potential (SSVEP)-based brain-computer interfaces (BCIs) have received great interest owing to their high information transfer rate (ITR) and the large number of available targets. However, the performance of frequency recognition methods heavily depends on the amount of calibration data for intra-subject classification. Some research has adopted deep learning (DL) algorithms to conduct inter-subject classification, which could reduce the calculation procedure, but the performance still has large room for improvement compared with intra-subject classification.
APPROACH: To address these issues, we proposed an efficient SSVEP DL network (termed SSVEPNET) based on 1D convolution and long short-term memory (LSTM) modules. To enhance the performance of SSVEPNET, we adopted spectral normalization and label smoothing technologies when implementing the network architecture. We evaluated SSVEPNET and compared it with other methods for intra- and inter-subject classification under different conditions, i.e., two datasets, two time-window lengths (1 s and 0.5 s), and three sizes of training data.
MAIN RESULTS: Under all the experimental settings, the proposed SSVEPNET achieved the highest average accuracy for intra- and inter-subject classification on the two SSVEP datasets, when compared with other traditional and DL baseline methods.
SIGNIFICANCE: The extensive experimental results demonstrate that the proposed DL model holds promise for enhancing frequency recognition performance in SSVEP-based BCIs. Besides, mixed network structures with CNN and LSTM, together with spectral normalization and label smoothing, could be useful optimization strategies for designing efficient models for EEG data.
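The two training-side strategies named above map directly onto stock PyTorch utilities. The toy model below is only a placeholder to show where spectral normalization and label smoothing plug in; it is not the SSVEPNET architecture, and its channel counts, window length, and number of classes are arbitrary assumptions.

```python
import torch.nn as nn
from torch.nn.utils import spectral_norm

# assumes 8-channel EEG windows of 256 samples and 12 stimulation frequencies
model = nn.Sequential(
    spectral_norm(nn.Conv1d(8, 16, kernel_size=7, padding=3)),  # spectrally normalised conv layer
    nn.ReLU(),
    nn.Flatten(),
    spectral_norm(nn.Linear(16 * 256, 12)),                     # spectrally normalised classifier
)
criterion = nn.CrossEntropyLoss(label_smoothing=0.1)            # soft targets during training
```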
Affiliation(s)
- YuDong Pan, Laboratory for Brain Science and Medical Artificial Intelligence, Southwest University of Science and Technology, Mianyang, 621010, China
- Jianbo Chen, Laboratory for Brain Science and Medical Artificial Intelligence, Southwest University of Science and Technology, Mianyang, 621010, China
- Yangsong Zhang, School of Computer Science and Technology, Southwest University of Science and Technology, Mianyang, 621010, China
- Yu Zhang, Department of Bioengineering, Lehigh University, Bethlehem, PA 18015, USA
24
Liu K, Chen K, Jia K. Convolutional Fine-Grained Classification With Self-Supervised Target Relation Regularization. IEEE Trans Image Process 2022; 31:5570-5584. [PMID: 35981063] [DOI: 10.1109/tip.2022.3197931]
Abstract
Fine-grained visual classification can be addressed by deep representation learning under the supervision of manually pre-defined targets (e.g., one-hot or Hadamard codes). Such target coding schemes are less flexible for modeling inter-class correlation and are also sensitive to sparse and imbalanced data distributions. In light of this, this paper introduces a novel target coding scheme, dynamic target relation graphs (DTRG), which, as an auxiliary feature regularization, is a self-generated structural output to be mapped from input images. Specifically, online computation of class-level feature centers is designed to generate cross-category distances in the representation space, which can thus be depicted by a dynamic graph in a non-parametric manner. Explicitly minimizing intra-class feature variations anchored on those class-level centers encourages the learning of discriminative features. Moreover, by exploiting inter-class dependency, the proposed target graphs can alleviate data sparsity and imbalance in representation learning. Inspired by the recent success of mixup-style data augmentation, this paper introduces randomness into the soft construction of dynamic target relation graphs to further explore the relation diversity of target classes. Experimental results demonstrate the effectiveness of our method on a number of diverse benchmarks of multiple visual classification tasks, especially achieving state-of-the-art performance on three popular fine-grained object benchmarks and superior robustness against sparse and imbalanced data. Source code is made publicly available at https://github.com/AkonLau/DTRG.
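A sketch of the two ingredients described above: exponential-moving-average class centers computed online from batch features, and a dynamic relation structure given by pairwise distances between those centers. The momentum value and the use of Euclidean distance are assumptions for illustration; the linked repository holds the authors' implementation.

```python
import torch

class ClassCenters:
    """Online class-level feature centers and their cross-category distance graph."""

    def __init__(self, num_classes: int, feat_dim: int, momentum: float = 0.9):
        self.centers = torch.zeros(num_classes, feat_dim)
        self.m = momentum

    @torch.no_grad()
    def update(self, feats: torch.Tensor, labels: torch.Tensor):
        # exponential moving average of each class's mean feature in the batch
        for c in labels.unique():
            batch_mean = feats[labels == c].mean(dim=0)
            self.centers[c] = self.m * self.centers[c] + (1.0 - self.m) * batch_mean

    def relation_graph(self) -> torch.Tensor:
        # cross-category distance matrix used as a self-generated target structure
        return torch.cdist(self.centers, self.centers)
```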
25
Ge FX, Bai Y, Li M, Zhu G, Yin J. Label distribution-guided transfer learning for underwater source localization. J Acoust Soc Am 2022; 151:4140. [PMID: 35778193] [DOI: 10.1121/10.0011741]
Abstract
Underwater source localization by deep neural networks (DNNs) is challenging since training these DNNs generally requires a large amount of experimental data and is computationally expensive. In this paper, label distribution-guided transfer learning (LD-TL) for underwater source localization is proposed, where a one-dimensional convolutional neural network (1D-CNN) is pre-trained with the simulation data generated by an underwater acoustic propagation model and then fine-tuned with a very limited amount of experimental data. In particular, the experimental data for fine-tuning the pre-trained 1D-CNN are labeled with label distribution vectors instead of one-hot encoded vectors. Experimental results show that the performance of underwater source localization with a very limited amount of experimental data is significantly improved by the proposed LD-TL.
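A minimal sketch of fine-tuning with label-distribution targets rather than one-hot labels, assuming the source location is discretized into bins and each experimental sample is labeled with a Gaussian-shaped distribution peaked at its true bin. The distribution shape, its width, and the KL-divergence fine-tuning loss are illustrative assumptions, not the paper's exact choices.

```python
import torch
import torch.nn.functional as F

def gaussian_label_distribution(true_bin: int, num_bins: int, sigma: float = 1.5) -> torch.Tensor:
    """Soft label vector peaked at the true location bin instead of a one-hot vector."""
    bins = torch.arange(num_bins, dtype=torch.float32)
    weights = torch.exp(-0.5 * ((bins - true_bin) / sigma) ** 2)
    return weights / weights.sum()

def fine_tune_step(model, optimizer, x, true_bins, num_bins):
    """One fine-tuning step of a pre-trained 1D-CNN on labeled experimental data."""
    targets = torch.stack([gaussian_label_distribution(b, num_bins) for b in true_bins])
    log_probs = F.log_softmax(model(x), dim=1)
    loss = F.kl_div(log_probs, targets, reduction="batchmean")
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```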
Affiliation(s)
- Feng-Xiang Ge, School of Artificial Intelligence, Beijing Normal University, Beijing 100875, China
- Yanyu Bai, School of Artificial Intelligence, Beijing Normal University, Beijing 100875, China
- Mengjia Li, School of Artificial Intelligence, Beijing Normal University, Beijing 100875, China
- Guangping Zhu, College of Underwater Acoustic Engineering, Harbin Engineering University, Harbin 150001, China
- Jingwei Yin, College of Underwater Acoustic Engineering, Harbin Engineering University, Harbin 150001, China
26
Qin J, He Y, Ge J, Liang Y. A multi-task feature fusion model for cervical cell classification. IEEE J Biomed Health Inform 2022; 26:4668-4678. [DOI: 10.1109/jbhi.2022.3180989]
Affiliation(s)
- Jian Qin, School of Computer Science and Technology, Harbin University of Science and Technology, Harbin, China
- Yongjun He, School of Computer Science and Technology, Harbin University of Science and Technology, Harbin, China
- Jinping Ge, School of Computer Science and Technology, Harbin University of Science and Technology, Harbin, China
- Yiqin Liang, School of Computer Science and Technology, Harbin University of Science and Technology, Harbin, China
27
Rajaraman S, Zamzmi G, Antani SK. Novel loss functions for ensemble-based medical image classification. PLoS One 2021; 16:e0261307. [PMID: 34968393] [PMCID: PMC8718001] [DOI: 10.1371/journal.pone.0261307]
Abstract
Medical images commonly exhibit multiple abnormalities. Predicting them requires multi-class classifiers whose training and reliable performance can be affected by a combination of factors, such as dataset size, data source, distribution, and the loss function used to train deep neural networks. Currently, the cross-entropy loss remains the de facto loss function for training deep learning classifiers. This loss function, however, asserts equal learning from all classes, leading to a bias toward the majority class. Although the choice of loss function impacts model performance, to the best of our knowledge, no existing literature performs a comprehensive analysis and selection of an appropriate loss function for the classification task under study. In this work, we benchmark various state-of-the-art loss functions, critically analyze model performance, and propose improved loss functions for a multi-class classification task. We select a pediatric chest X-ray (CXR) dataset that includes images with no abnormality (normal) and images exhibiting manifestations consistent with bacterial and viral pneumonia. We construct prediction-level and model-level ensembles to improve classification performance. Our results show that, compared to the individual models and the state-of-the-art literature, the weighted averaging of the predictions for the top-3 and top-5 model-level ensembles delivered significantly superior classification performance (p < 0.05) in terms of the MCC metric (0.9068, 95% confidence interval (0.8839, 0.9297)). Finally, we performed localization studies to interpret model behavior and confirm that the individual models and ensembles learned task-specific features and highlighted disease-specific regions of interest. The code is available at https://github.com/sivaramakrishnan-rajaraman/multiloss_ensemble_models.
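Prediction-level weighted averaging, as used for the top-3 and top-5 ensembles above, reduces to a weighted sum of per-model probability tensors. The weights in the usage comment are placeholders, not values from the paper.

```python
import numpy as np

def weighted_ensemble(prob_list, weights):
    """Weighted average of per-model class-probability arrays."""
    probs = np.stack(prob_list)                 # shape: (num_models, num_samples, num_classes)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                             # normalise so the weights sum to one
    return np.tensordot(w, probs, axes=1)       # shape: (num_samples, num_classes)

# e.g. combining three models' softmax outputs, weighting each by a validation score:
# ensemble_probs = weighted_ensemble([p1, p2, p3], weights=[0.91, 0.89, 0.90])
```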
Affiliation(s)
- Ghada Zamzmi, National Library of Medicine, National Institutes of Health, Bethesda, MD, United States of America
- Sameer K. Antani, National Library of Medicine, National Institutes of Health, Bethesda, MD, United States of America