1
Wang F, Li X, Wen R, Luo H, Liu D, Qi S, Jing Y, Wang P, Deng G, Huang C, Du T, Wang L, Liang H, Wang J, Liu C. Pneumonia-Plus: a deep learning model for the classification of bacterial, fungal, and viral pneumonia based on CT tomography. Eur Radiol 2023; 33:8869-8878. [PMID: 37389609] [DOI: 10.1007/s00330-023-09833-4]
Abstract
OBJECTIVES: This study aimed to develop a deep learning algorithm, Pneumonia-Plus, for the accurate classification of bacterial, fungal, and viral pneumonia from computed tomography (CT) images.
METHODS: A total of 2763 participants with chest CT images and a definite pathogen diagnosis were included to train and validate the algorithm. Pneumonia-Plus was then prospectively tested on a nonoverlapping dataset of 173 patients. Its performance in classifying the three types of pneumonia was compared with that of three radiologists using the McNemar test to verify its clinical usefulness.
RESULTS: Among the 173 patients, the area under the curve (AUC) values for viral, fungal, and bacterial pneumonia were 0.816, 0.715, and 0.934, respectively. Viral pneumonia was classified with a sensitivity, specificity, and accuracy of 0.847, 0.919, and 0.873, respectively. The three radiologists showed good consistency with Pneumonia-Plus. Their AUC values for bacterial, fungal, and viral pneumonia were 0.480, 0.541, and 0.580 (radiologist 1: 3 years of experience); 0.637, 0.693, and 0.730 (radiologist 2: 7 years of experience); and 0.734, 0.757, and 0.847 (radiologist 3: 12 years of experience), respectively. McNemar tests on sensitivity showed that the algorithm performed significantly better than radiologist 1 and radiologist 2 (p < 0.05) in differentiating bacterial and viral pneumonia. Radiologist 3 had a higher diagnostic accuracy than the algorithm.
CONCLUSIONS: Pneumonia-Plus differentiates bacterial, fungal, and viral pneumonia at the level of an attending radiologist, reducing the risk of misdiagnosis. It supports appropriate treatment, helps avoid unnecessary antibiotics, and provides timely information to guide clinical decision-making and improve patient outcomes.
CLINICAL RELEVANCE STATEMENT: The Pneumonia-Plus algorithm can assist in the accurate classification of pneumonia from CT images, which has great clinical value in avoiding unnecessary antibiotics and providing timely information to guide clinical decision-making and improve patient outcomes.
KEY POINTS:
• The Pneumonia-Plus algorithm, trained on data collected from multiple centers, can accurately identify bacterial, fungal, and viral pneumonia.
• Pneumonia-Plus showed better sensitivity than radiologist 1 (3 years of experience) and radiologist 2 (7 years of experience) in classifying viral and bacterial pneumonia.
• Pneumonia-Plus differentiates bacterial, fungal, and viral pneumonia at the level of an attending radiologist.
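A note on the statistics used above: in its exact form, the McNemar test reduces to a binomial test on the discordant pairs of a paired 2x2 table. A minimal Python sketch of that computation follows; the discordant counts are purely illustrative and not taken from the paper.

# Exact McNemar test on paired reads, in the spirit of the comparison
# between Pneumonia-Plus and each radiologist. Counts are hypothetical.
from scipy.stats import binomtest

def mcnemar_exact(b, c):
    """b: cases the algorithm got right and the reader missed;
    c: cases the reader got right and the algorithm missed."""
    # Under H0 (equal sensitivity), each discordant case is a fair coin flip.
    return binomtest(b, b + c, p=0.5).pvalue

# Illustrative discordant counts for viral pneumonia (not from the paper):
print(f"exact McNemar p = {mcnemar_exact(b=18, c=6):.4f}")  # p < 0.05 -> performances differ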
Affiliation(s)
- Fang Wang
- Department of Radiology, Southwest Hospital, Third Military Medical University (Army Medical University), 30 Gao Tan Yan St, Chongqing, 400038, China
- Xiaoming Li
- Department of Radiology, Southwest Hospital, Third Military Medical University (Army Medical University), 30 Gao Tan Yan St, Chongqing, 400038, China
- Ru Wen
- Medical College, Guizhou University, Guiyang, Guizhou Province, 550000, China
- Hu Luo
- No. 1 Intensive Care Unit, Huoshenshan Hospital, Wuhan, China
- Department of Respiratory and Critical Care Medicine, Southwest Hospital, Third Military Medical University (Army Medical University), Chongqing, China
- Dong Liu
- Huiying Medical Technology Co., Ltd, Dongsheng Science and Technology Park, Haidian District, Beijing, China
- Shuai Qi
- Huiying Medical Technology Co., Ltd, Dongsheng Science and Technology Park, Haidian District, Beijing, China
- Yang Jing
- Huiying Medical Technology Co., Ltd, Dongsheng Science and Technology Park, Haidian District, Beijing, China
- Peng Wang
- Medical Big Data and Artificial Intelligence Center, Southwest Hospital, Third Military Medical University (Army Medical University), Chongqing, China
- Gang Deng
- Department of Radiology, Maternal and Child Health Hospital of Hubei Province, Guanggu District, Wuhan, China
- Cong Huang
- Department of Radiology, The 926 Hospital of PLA, Kaiyuan, China
- Tingting Du
- Department of Radiology, Chongqing Traditional Chinese Medicine Hospital, Chongqing, China
- Limei Wang
- Department of Radiology, Southwest Hospital, Third Military Medical University (Army Medical University), 30 Gao Tan Yan St, Chongqing, 400038, China
- Hongqin Liang
- Department of Radiology, Southwest Hospital, Third Military Medical University (Army Medical University), 30 Gao Tan Yan St, Chongqing, 400038, China
- Jian Wang
- Department of Radiology, Southwest Hospital, Third Military Medical University (Army Medical University), 30 Gao Tan Yan St, Chongqing, 400038, China
- Chen Liu
- Department of Radiology, Southwest Hospital, Third Military Medical University (Army Medical University), 30 Gao Tan Yan St, Chongqing, 400038, China
2
Tu Z, Liu Y, Zhang Y, Mu Q, Yuan J. DTCM: Joint Optimization of Dark Enhancement and Action Recognition in Videos. IEEE Trans Image Process 2023; 32:3507-3520. [PMID: 37335800] [DOI: 10.1109/tip.2023.3286254]
Abstract
Recognizing human actions in dark videos is a useful yet challenging visual task. Existing augmentation-based methods separate dark enhancement from action recognition in a two-stage pipeline, which leads to inconsistent learning of the temporal representation needed for action recognition. To address this issue, we propose a novel end-to-end framework, the Dark Temporal Consistency Model (DTCM), which jointly optimizes dark enhancement and action recognition and enforces temporal consistency to guide downstream dark feature learning. Specifically, DTCM cascades the action classification head with the dark augmentation network to perform dark video action recognition in a one-stage pipeline. Our spatio-temporal consistency loss, which uses the RGB-difference of dark video frames to encourage temporal coherence of the enhanced frames, is effective for boosting spatio-temporal representation learning. Extensive experiments demonstrate DTCM's remarkable performance: 1) competitive accuracy, outperforming the state of the art on the ARID dataset by 2.32% and on the UAVHuman-Fisheye dataset by 4.19%; 2) high efficiency, surpassing the most advanced current method (Chen et al., 2021) with only 6.4% of its GFLOPs and 71.3% of its parameters; and 3) strong generalization, as it can be combined with various action recognition methods (e.g., TSM, I3D, 3D-ResNeXt-101, Video-Swin) to significantly improve their performance.
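The consistency idea is simple enough to sketch. Below is a minimal PyTorch rendering of a temporal-consistency loss driven by RGB-differences, under the assumption that it penalizes the L1 gap between the frame differences of the dark input and those of the enhanced output; DTCM's exact formulation and loss weighting may differ.

import torch
import torch.nn.functional as F

def temporal_consistency_loss(dark, enhanced):
    """dark, enhanced: video tensors of shape (B, T, C, H, W)."""
    dark_diff = dark[:, 1:] - dark[:, :-1]         # RGB-difference of inputs
    enh_diff = enhanced[:, 1:] - enhanced[:, :-1]  # RGB-difference of outputs
    # Encourage the enhanced video to change over time the way the input does.
    return F.l1_loss(enh_diff, dark_diff)

# One-stage joint objective (lambda_tc is an assumed weighting):
# loss = cross_entropy(logits, labels) + lambda_tc * temporal_consistency_loss(dark, enhanced)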
4
Zhao H, Zeng H, Qin X, Fu Y, Wang H, Omar B, Li X. What and Where: Learn to Plug Adapters via NAS for Multidomain Learning. IEEE Trans Neural Netw Learn Syst 2022; 33:6532-6544. [PMID: 34310322] [DOI: 10.1109/tnnls.2021.3082316]
Abstract
As an important and challenging problem, multidomain learning (MDL) typically seeks a set of effective, lightweight, domain-specific adapter modules plugged into a common domain-agnostic network. In existing approaches, the adapter plugging positions and structures are handcrafted and fixed for all domains before model learning, resulting in inflexible learning and heavy computation. Motivated by this, we propose to learn a data-driven adapter plugging strategy with neural architecture search (NAS), which automatically determines where to plug the adapter modules. Furthermore, we propose an NAS-adapter module for adapter structure design within an NAS-driven learning scheme, which automatically discovers effective adapter structures for different domains. Experimental results demonstrate the effectiveness of our MDL model against existing approaches at comparable performance.
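A sketch of the core mechanism, under assumptions: a lightweight residual adapter whose plug-or-skip decision at a given backbone layer is relaxed with Gumbel-Softmax, so that "where to plug" can be searched by gradient descent. The paper's actual adapter search space is richer than this toy module.

import torch
import torch.nn as nn
import torch.nn.functional as F

class PluggableAdapter(nn.Module):
    """Residual 1x1 bottleneck adapter with a learnable plug/skip gate."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.adapter = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
        )
        self.alpha = nn.Parameter(torch.zeros(2))  # architecture logits: [plug, skip]

    def forward(self, x):
        # Differentiable architecture choice during the search phase;
        # after search, keep the adapter only at layers where gate[0] wins.
        gate = F.gumbel_softmax(self.alpha, tau=1.0, hard=False)
        return x + gate[0] * self.adapter(x)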
7
Su B, Zhou J, Wen JR, Wu Y. Linear and Deep Order-Preserving Wasserstein Discriminant Analysis. IEEE Trans Pattern Anal Mach Intell 2022; 44:3123-3138. [PMID: 33434122] [DOI: 10.1109/tpami.2021.3050750]
Abstract
Supervised dimensionality reduction for sequence data learns a transformation that maps the observations in sequences onto a low-dimensional subspace by maximizing the separability of sequences in different classes. It is typically more challenging than conventional dimensionality reduction for static data, because measuring the separability of sequences involves non-linear procedures that manipulate their temporal structures. In this paper, we propose a linear method, order-preserving Wasserstein discriminant analysis (OWDA), and its deep extension, DeepOWDA, to learn linear and non-linear discriminative subspaces for sequence data, respectively. We construct novel separability measures between sequence classes based on the order-preserving Wasserstein (OPW) distance to capture the essential differences among their temporal structures. Specifically, for each class we extract the OPW barycenter and define the intra-class scatter as the dispersion of the training sequences around that barycenter; the inter-class distance is measured as the OPW distance between the corresponding barycenters. We learn the linear and non-linear transformations by maximizing the inter-class distance and minimizing the intra-class scatter. In this way, OWDA and DeepOWDA are able to concentrate on the distinctive differences among classes by lifting the geometric relations with temporal constraints. Experiments on four 3D action recognition datasets show the effectiveness of OWDA and DeepOWDA.
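A rough Python sketch of the criterion, with plain entropic optimal transport from the POT library standing in for the OPW distance (OPW additionally rewards temporally order-preserving couplings); the function names and arguments are illustrative, not the authors' API.

import numpy as np
import ot  # POT: Python Optimal Transport

def seq_ot_distance(X, Y, reg=0.1):
    """OT cost between two projected sequences X: (m, d), Y: (n, d)."""
    M = ot.dist(X, Y)                        # pairwise squared-Euclidean costs
    a = np.full(X.shape[0], 1.0 / X.shape[0])
    b = np.full(Y.shape[0], 1.0 / Y.shape[0])
    return ot.sinkhorn2(a, b, M, reg)        # entropic OT (stand-in for OPW)

def owda_objective(W, class_seqs, barycenters):
    """Inter-class barycenter distance over intra-class scatter."""
    inter = sum(seq_ot_distance(bi @ W, bj @ W)
                for i, bi in enumerate(barycenters)
                for bj in barycenters[i + 1:])
    intra = sum(seq_ot_distance(X @ W, barycenters[c] @ W)
                for c, seqs in enumerate(class_seqs) for X in seqs)
    return inter / intra  # maximize over the projection W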
9
Zhang J, Zhang X, Zhang Y, Duan Y, Li Y, Pan Z. Meta-Knowledge Learning and Domain Adaptation for Unseen Background Subtraction. IEEE Trans Image Process 2021; 30:9058-9068. [PMID: 34714746] [DOI: 10.1109/tip.2021.3122102]
Abstract
Background subtraction is a classic video processing task that pervades numerous visual applications such as video surveillance and traffic monitoring. Given the diversity and variability of real application scenes, an ideal background subtraction model should be robust to various scenarios. Even though deep-learning approaches have demonstrated unprecedented improvements, they often fail to generalize to unseen scenarios, making them less suitable for extensive deployment. In this work, we propose to tackle cross-scene background subtraction via a two-phase framework comprising meta-knowledge learning and domain adaptation. Specifically, observing that meta-knowledge (i.e., scene-independent common knowledge) is the cornerstone of generalization to unseen scenes, we draw on traditional frame-differencing algorithms and design a deep difference network (DDN) to encode meta-knowledge, especially temporal-change knowledge, from diverse cross-scene data (the source domain) without intermittent foreground motion patterns. In addition, we explore a self-training domain adaptation strategy based on iterative evolution. With iteratively updated pseudo-labels, the DDN is continuously fine-tuned and progressively evolves toward unseen scenes (the target domain) in an unsupervised fashion. Our framework can be deployed on unseen scenes without relying on their annotations. As evidenced by our experiments on the CDnet2014 dataset, it brings a significant improvement to background subtraction. Our method has a favorable processing speed (70 fps) and outperforms the best unsupervised algorithm and the top supervised algorithm designed for unseen scenes by 9% and 3%, respectively.
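A hedged sketch of the two ingredients: frame differencing as the scene-independent change cue fed to a segmentation backbone, and a confidence-thresholded self-training step on the unseen scene. The backbone, threshold, and loss are assumptions, not the authors' code.

import torch
import torch.nn as nn
import torch.nn.functional as F

class DeepDifferenceNet(nn.Module):
    def __init__(self, backbone: nn.Module):
        super().__init__()
        self.backbone = backbone  # any segmentation net accepting 6 input channels

    def forward(self, frame, background):
        diff = torch.abs(frame - background)                    # classic frame differencing
        return self.backbone(torch.cat([frame, diff], dim=1))   # foreground logits

def self_training_step(model, optimizer, frames, backgrounds, thresh=0.9):
    # Pseudo-labels from the model's own confident predictions on the
    # unseen target scene; refined over successive rounds.
    with torch.no_grad():
        prob = torch.sigmoid(model(frames, backgrounds))
        confident = (prob > thresh) | (prob < 1.0 - thresh)  # keep confident pixels
        pseudo = (prob > thresh).float()
    logits = model(frames, backgrounds)
    loss = F.binary_cross_entropy_with_logits(logits[confident], pseudo[confident])
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()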
10
Serpush F, Rezaei M. Complex Human Action Recognition Using a Hierarchical Feature Reduction and Deep Learning-Based Method. SN Comput Sci 2021; 2:94. [PMID: 33615240] [PMCID: PMC7881322] [DOI: 10.1007/s42979-021-00484-0]
Abstract
Automated human action recognition is one of the most attractive and practical research fields in computer vision. In such systems, human action labelling is based on the appearance and motion patterns in the video sequences; however, most existing research, conventional methodologies, and classic neural networks either neglect or cannot exploit temporal information for action prediction in a video sequence. Moreover, the computational cost of proper, accurate human action recognition is high. In this paper, we address the challenges of the preprocessing phase through automated selection of representative frames from the input sequences, extracting the key features of the representative frames rather than the entire feature set. We propose a hierarchical technique using background subtraction and HOG, followed by a deep neural network and a skeletal modelling method. A combination of a CNN and an LSTM recurrent network is used for feature selection and for retaining prior information, and finally a Softmax-KNN classifier labels the human activities. We name our model the "Hierarchical Feature Reduction & Deep Learning"-based action recognition method, HFR-DL for short. To evaluate the proposed method, we benchmark on the UCF101 dataset, which is widely used in action recognition research and includes 101 complex activities in the wild. Experimental results show a significant improvement in accuracy and speed compared with eight state-of-the-art methods.
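The preprocessing stage lends itself to a short sketch: change-based selection of representative frames followed by HOG descriptors via OpenCV. The threshold and window size below are illustrative choices, not those of HFR-DL.

import cv2
import numpy as np

def select_representative_frames(frames, motion_thresh=12.0):
    """Keep a frame only if it differs enough from the last kept frame."""
    kept = [frames[0]]
    for f in frames[1:]:
        if np.mean(cv2.absdiff(f, kept[-1])) > motion_thresh:
            kept.append(f)
    return kept

hog = cv2.HOGDescriptor()  # default 64x128 window, 9 orientation bins

def hog_features(frame_bgr):
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    gray = cv2.resize(gray, (64, 128))  # exactly one detection window
    return hog.compute(gray).ravel()    # 3780-dim descriptor

# The per-frame descriptors would then feed the CNN+LSTM stage for
# temporal modelling, and finally the Softmax-KNN classifier.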
Affiliation(s)
- Fatemeh Serpush
- Faculty of Computer and Information Technology Engineering, Qazvin Branch, Islamic Azad University, Qazvin, Iran
- Mahdi Rezaei
- Institute for Transport Studies, University of Leeds, 34-40 University Road, Leeds LS2 9JT, UK