1. Huang KH, Huang YB, Lin YX, Hua KL, Tanveer M, Lu X, Razzak I. GRA: Graph Representation Alignment for Semi-Supervised Action Recognition. IEEE Trans Neural Netw Learn Syst 2024; PP:1-10. [PMID: 38215319] [DOI: 10.1109/tnnls.2023.3347593]
Abstract
Graph convolutional networks (GCNs) have emerged as a powerful tool for action recognition, leveraging skeletal graphs to encapsulate human motion. Despite their efficacy, their dependency on large labeled datasets remains a significant challenge. Acquiring such datasets is often prohibitive, and the frequent occurrence of incomplete skeleton data, typified by absent joints and frames, complicates the testing phase. To tackle these issues, we present graph representation alignment (GRA), a novel approach with two main contributions: 1) a self-training (ST) paradigm that substantially reduces the need for labeled data by generating high-quality pseudo-labels, ensuring model stability even with minimal labeled inputs; and 2) a representation alignment (RA) technique that uses consistency regularization to reduce the impact of missing data components. Our extensive evaluations on the NTU RGB+D and Northwestern-UCLA (N-UCLA) benchmarks demonstrate that GRA not only improves GCN performance in data-constrained environments but also retains impressive performance in the face of data incompleteness.
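The self-training paradigm summarized above relies on confidence-filtered pseudo-labels. As an illustration only (the function name and the 0.9 threshold are our assumptions, not details from the paper), the standard selection rule can be sketched as:

```python
import numpy as np

def select_pseudo_labels(probs, threshold=0.9):
    """Keep unlabeled samples whose top-class confidence exceeds the
    threshold and return (indices, pseudo_labels).  `probs` is an
    (N, C) array of per-class softmax probabilities."""
    confidence = probs.max(axis=1)           # top-class confidence per sample
    keep = np.where(confidence >= threshold)[0]
    return keep, probs[keep].argmax(axis=1)  # high-confidence pseudo-labels

# Toy example: 3 unlabeled samples, 2 classes.
probs = np.array([[0.95, 0.05],   # confident -> pseudo-labeled as class 0
                  [0.60, 0.40],   # ambiguous -> discarded
                  [0.08, 0.92]])  # confident -> pseudo-labeled as class 1
idx, labels = select_pseudo_labels(probs)
```

The discarded ambiguous samples are what keeps the retrained model stable when very few true labels are available.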
2. Lee WJ, Leu YS, Chen JS, Dai KY, Hou TC, Chang CT, Li CJ, Hua KL, Chen YJ. Real-Time Tracking of Laryngeal Motion via the Surface Depth-Sensing Technique for Radiotherapy in Laryngeal Cancer Patients. Bioengineering (Basel) 2023; 10:908. [PMID: 37627793] [PMCID: PMC10451758] [DOI: 10.3390/bioengineering10080908]
Abstract
Radiotherapy (RT) is an important modality for laryngeal cancer treatment because it preserves laryngeal function. During beam delivery, however, laryngeal motion remains uncontrollable and may compromise tumor-targeting efficacy. We aimed to examine real-time laryngeal motion by developing a surface depth-sensing (SDS) technique and testing it preliminarily during RT-based treatment of patients with laryngeal cancer. An SDS camera was set up and integrated into the RT simulation procedure. By recording the natural swallowing of patients, the SDS calculation was performed using a pose estimation model and a deep neural network. Seven male patients with laryngeal cancer were enrolled in this prospective study. The calculated motion distances of the laryngeal prominence (mean ± standard deviation) were 1.6 ± 0.8 mm, 21.4 ± 5.1 mm, and 6.4 ± 3.3 mm in the left-right, cranio-caudal, and anterior-posterior directions, respectively, and 22.7 ± 4.9 mm for the spatial displacement. The calculated differences in the 3D margins for generating the planning tumor volume by senior physicians with and without SDS data were -0.7 ± 1.0 mm (-18%), 11.3 ± 6.8 mm (235%), and 1.8 ± 2.6 mm (45%) in the left-right, cranio-caudal, and anterior-posterior directions, respectively. The SDS technique developed for detecting laryngeal motion during swallowing may serve as a practical guide for individualized RT design in the treatment of laryngeal cancer.
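For context, the reported spatial displacement is consistent with combining the three axis motions as a Euclidean norm; a minimal sketch of that geometry (an illustration, not the paper's exact per-frame computation):

```python
import math

def spatial_displacement(dx, dy, dz):
    """Euclidean norm of per-axis displacements (all in mm)."""
    return math.sqrt(dx**2 + dy**2 + dz**2)

# Combining single-axis motions of 1.6, 21.4, and 6.4 mm:
d = spatial_displacement(1.6, 21.4, 6.4)
```

Note that the norm of the mean axis values (about 22.4 mm) only approximates the reported mean spatial displacement of 22.7 mm, since the mean of per-frame norms need not equal the norm of the means.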
Affiliation(s)
- Wan-Ju Lee
- Department of Radiation Oncology, MacKay Memorial Hospital, Taipei 104217, Taiwan
- Yi-Shing Leu
- Department of Otorhinolaryngology, MacKay Memorial Hospital, Taipei 104217, Taiwan
- Jing-Sheng Chen
- Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei 106335, Taiwan
- Kun-Yao Dai
- Department of Radiation Oncology, MacKay Memorial Hospital, Taipei 104217, Taiwan
- Tien-Chi Hou
- Department of Radiation Oncology, MacKay Memorial Hospital, Taipei 104217, Taiwan
- Chung-Ting Chang
- Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei 106335, Taiwan
- Chi-Jung Li
- Department of Radiation Oncology, MacKay Memorial Hospital, Taipei 104217, Taiwan
- Kai-Lung Hua
- Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei 106335, Taiwan
- Yu-Jen Chen
- Department of Radiation Oncology, MacKay Memorial Hospital, Taipei 104217, Taiwan
- Department of Medical Research, MacKay Memorial Hospital, Taipei 104217, Taiwan
- Department of Artificial Intelligence and Medical Application, MacKay Junior College of Medicine, Nursing and Management, Taipei 112021, Taiwan
- Department of Medical Research, China Medical University Hospital, Taichung 404332, Taiwan
3. Lin JD, Han YH, Huang PH, Tan J, Chen JC, Tanveer M, Hua KL. DEFAEK: Domain Effective Fast Adaptive Network for Face Anti-Spoofing. Neural Netw 2023; 161:83-91. [PMID: 36736002] [DOI: 10.1016/j.neunet.2023.01.018]
Abstract
Existing deep-learning-based face anti-spoofing (FAS) or deepfake detection approaches usually rely on large-scale datasets and powerful networks with a significant number of parameters to achieve satisfactory performance. However, this makes them resource-heavy and unsuitable for handheld devices. Moreover, they are limited to the spoof types in the dataset they were trained on and require considerable training time. To produce a robust FAS model, they need large datasets covering the widest possible variety of predefined presentation attacks. Testing on new or unseen attacks or environments generally results in poor performance. Ideally, a FAS model should learn discriminative features that generalize well even to unseen spoof types. In this paper, we propose a fast learning approach called Domain Effective Fast Adaptive Network (DEFAEK), a face anti-spoofing approach based on the optimization-based meta-learning paradigm that effectively and quickly adapts to new tasks. DEFAEK treats differences in environments as domains and simulates multiple domain shifts during training. To further improve the effectiveness and efficiency of meta-learning, we adopt metric learning in the inner-loop update with careful sample selection. In extensive experiments on the challenging CelebA-Spoof and FaceForensics++ datasets, the evaluation results show that DEFAEK can learn cues independent of the environment with good generalization capability. In addition, the resulting model is lightweight, following the design principles of modern lightweight network architectures, and still generalizes well to unseen classes. We also demonstrate our model's capabilities by comparing the number of parameters, FLOPs, and model performance with other state-of-the-art methods.
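The optimization-based meta-learning paradigm mentioned above follows the inner/outer-loop structure popularized by MAML. The sketch below illustrates that structure on a toy quadratic loss; it is a first-order simplification with made-up names, losses, and learning rates, not the paper's algorithm:

```python
import numpy as np

def maml_step(theta, tasks, inner_lr=0.01, outer_lr=0.001):
    """One first-order MAML-style update on the toy per-task loss
    L_t(theta) = ||theta - t||^2, where each task is a target vector t.
    Inner loop: one gradient step adapts theta to each task.
    Outer loop: the shared initialization moves along the average of
    the post-adaptation gradients (first-order approximation)."""
    meta_grad = np.zeros_like(theta)
    for t in tasks:
        adapted = theta - inner_lr * 2.0 * (theta - t)  # inner-loop adaptation
        meta_grad += 2.0 * (adapted - t)                # gradient after adaptation
    return theta - outer_lr * meta_grad / len(tasks)

# The initialization drifts toward a point that adapts quickly to both tasks.
theta = maml_step(np.zeros(2), [np.array([1.0, 0.0]), np.array([0.0, 1.0])])
```

DEFAEK additionally replaces the plain inner-loop loss with a metric-learning objective over carefully selected samples, which this toy loss does not capture.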
Affiliation(s)
- Jiun-Da Lin
- Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei 106335, Taiwan, ROC; Research Center for Information Technology Innovation, Academia Sinica, Taipei 115201, Taiwan, ROC
- Yue-Hua Han
- Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei 106335, Taiwan, ROC; Research Center for Information Technology Innovation, Academia Sinica, Taipei 115201, Taiwan, ROC
- Po-Han Huang
- Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei 106335, Taiwan, ROC; Research Center for Information Technology Innovation, Academia Sinica, Taipei 115201, Taiwan, ROC
- Julianne Tan
- Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei 106335, Taiwan, ROC
- Jun-Cheng Chen
- Research Center for Information Technology Innovation, Academia Sinica, Taipei 115201, Taiwan, ROC
- M Tanveer
- Department of Mathematics, Indian Institute of Technology Indore, Simrol 453552, India
- Kai-Lung Hua
- Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei 106335, Taiwan, ROC
4. Tan DS, Soeseno JH, Hua KL. Controllable and Identity-Aware Facial Attribute Transformation. IEEE Trans Cybern 2022; 52:4825-4836. [PMID: 34043518] [DOI: 10.1109/tcyb.2021.3071172]
Abstract
Modifying facial attributes without a paired dataset proves to be a challenging task. Previous approaches either required supervision from a ground-truth transformed image or required training a separate model for mapping every pair of attributes. These limit the scalability of the models to larger sets of attributes, since the number of models to train grows exponentially. Another major drawback of previous approaches is the unintentional alteration of the person's identity as the facial attributes are transformed. We propose a method that allows controllable and identity-aware transformations across multiple facial attributes using only a single model. Our approach is to train a generative adversarial network (GAN) with a multitask conditional discriminator that recognizes the identity of the face, distinguishes real images from fake ones, and identifies the facial attributes present in an image. This guides the generator into producing an output that is realistic while preserving the person's identity and facial attributes. Through this framework, our model also learns meaningful image representations in a lower-dimensional latent space and semantically associates separate parts of the encoded vector with the person's identity and facial attributes. This opens up the possibility of generating new faces and other transformations, such as making the face thinner or chubbier. Furthermore, our model encodes the image only once and allows multiple transformations using the encoded vector. This makes transformations faster, since the entire image does not need to be reprocessed for every transformation. We show the effectiveness of our proposed method through both qualitative and quantitative evaluations, such as ablative studies, visual inspection, and face verification. Competitive results are achieved compared to the main competing method (CycleGAN), with considerable gains in space and extensibility from using a single model.
5. Hua KL, Huo MK, Dong ZC, Li S, Wang PJ, Li Y, Ren YK. [Study on the dynamic changes of peripheral platelet-to-lymphocyte ratio in the prognosis of neoadjuvant chemotherapy patients with gastric cancer]. Zhonghua Yi Xue Za Zhi 2022; 102:858-863. [PMID: 35330579] [DOI: 10.3760/cma.j.cn112137-20211204-02700]
Abstract
Objective: To investigate the significance of the platelet-to-lymphocyte ratio (PLR) before and after neoadjuvant chemotherapy in advanced gastric cancer (AGC). Methods: The medical records of 247 AGC patients who underwent surgery between May 2015 and October 2016 were retrospectively reviewed. The relationship between the PLR value and its changes before and after neoadjuvant therapy and the clinicopathological features and prognosis was analyzed. Results: ΔPLR was defined as the change in PLR from before to after neoadjuvant therapy: patients with a negative value formed the "reduced group" (n=138) and those with a positive value or zero formed the "unreduced group" (n=109). The two ΔPLR groups differed significantly in tumor size, nerve invasion, presence or absence of vascular tumor thrombus, ypT stage, ypN stage, ypTNM stage, and pathological response (all P<0.05), but not in age, gender, or postoperative adjuvant chemotherapy (all P>0.05). Survival analysis showed that the 5-year disease-free survival rates of the two groups were 39.0% and 54.0%, respectively (P=0.025), and the 5-year overall survival rates were 41.8% and 58.1%, respectively (P=0.035); the differences were statistically significant. Multivariate analysis showed that ypT3-4 stage, ypN3b stage, and ΔPLR were independent risk factors for the 5-year disease-free survival rate (HR=2.731/2.676, 95%CI: 1.026-7.268/1.014-6.985; HR=4.717, 95%CI: 1.922-11.579; HR=2.854, 95%CI: 1.117-4.124; all P<0.05) and the 5-year overall survival rate (HR=3.226/2.655, 95%CI: 1.280-9.227/0.945-7.548; HR=4.550, 95%CI: 1.842-11.239; HR=2.897, 95%CI: 1.049-5.251; all P<0.05). Conclusion: ΔPLR can better predict the prognosis of AGC patients receiving neoadjuvant chemotherapy.
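The grouping rule in the abstract is a simple arithmetic criterion on the change in PLR; a minimal sketch (illustrative counts and function names, not data or code from the study):

```python
def plr(platelets, lymphocytes):
    """Platelet-to-lymphocyte ratio from absolute blood counts."""
    return platelets / lymphocytes

def delta_plr_group(pre_plr, post_plr):
    """Apply the study's grouping: a negative change assigns the patient
    to the 'reduced' group; zero or a positive change to 'unreduced'."""
    return "reduced" if (post_plr - pre_plr) < 0 else "unreduced"

# Example: PLR falls from 180 to 150 across neoadjuvant therapy.
group = delta_plr_group(plr(270.0, 1.5), plr(225.0, 1.5))
```

Under the study's findings, the "reduced" group corresponded to better 5-year disease-free and overall survival.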
Affiliation(s)
- K L Hua
- Department of General Surgery, Henan Cancer Hospital, Affiliated Tumor Hospital of Zhengzhou University, Zhengzhou 450008, China
- M K Huo
- Department of General Surgery, Henan Cancer Hospital, Affiliated Tumor Hospital of Zhengzhou University, Zhengzhou 450008, China
- Z C Dong
- Department of General Surgery, Henan Cancer Hospital, Affiliated Tumor Hospital of Zhengzhou University, Zhengzhou 450008, China
- S Li
- Department of General Surgery, Henan Cancer Hospital, Affiliated Tumor Hospital of Zhengzhou University, Zhengzhou 450008, China
- P J Wang
- Department of General Surgery, Henan Cancer Hospital, Affiliated Tumor Hospital of Zhengzhou University, Zhengzhou 450008, China
- Y Li
- The Third Department of Surgery, the Fourth Hospital of Hebei Medical University, Shijiazhuang 050011, China
- Y K Ren
- Department of General Surgery, Henan Cancer Hospital, Affiliated Tumor Hospital of Zhengzhou University, Zhengzhou 450008, China
6. Yeh YC, Dy J, Huang TM, Chen YY, Hua KL. VDNet: video deinterlacing network based on coarse adaptive module and deformable recurrent residual network. Neural Comput Appl 2022. [DOI: 10.1007/s00521-022-07116-5]
7. Lin YH, Hua KL, Chen YY, Chen IY, Tsai YC. A New Photographic Reproduction Method Based on Feature Fusion and Virtual Combined Histogram Equalization. Sensors (Basel) 2021; 21:6038. [PMID: 34577244] [PMCID: PMC8471737] [DOI: 10.3390/s21186038]
Abstract
A desirable photographic reproduction method should be able to compress high-dynamic-range images to low-dynamic-range displays while faithfully preserving all visual information. However, during the compression process, most reproduction methods struggle to strike a balance between maintaining global contrast and retaining the majority of local details in a real-world scene. To address this problem, this study proposes a new photographic reproduction method that smoothly takes both global and local features into account. First, a highlight/shadow region detection scheme is used to obtain prior information and generate a weight map. Second, a mutually hybrid histogram analysis is performed to extract global and local features in parallel. Third, we propose a feature fusion scheme to construct a virtual combined histogram, achieved by adaptively fusing the global and local features through Gaussian mixtures according to the weight map. Finally, the virtual combined histogram is used to formulate the pixel-wise mapping function. As both global and local features are considered simultaneously, the output image has a natural and visually pleasing appearance. The experimental results demonstrated the effectiveness of the proposed method and its superiority over seven other state-of-the-art methods.
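The virtual-combined-histogram idea can be illustrated in simplified form: blend a global and a local histogram, then derive the tone-mapping function from the cumulative distribution of the blend, as in classic histogram equalization. This sketch replaces the paper's weight-map-driven Gaussian-mixture fusion with a single scalar weight (our simplification, not the authors' method):

```python
import numpy as np

def fused_equalization_lut(hist_global, hist_local, w):
    """Blend two 256-bin histograms with weight w in [0, 1] (higher w
    favours the local histogram), then build an 8-bit look-up table
    from the cumulative distribution of the blended histogram."""
    hist = (1.0 - w) * hist_global + w * hist_local
    cdf = np.cumsum(hist / hist.sum())            # normalized CDF
    return np.round(255.0 * cdf).astype(np.uint8)  # pixel-wise mapping

# Two uniform histograms blend to a uniform histogram, whose CDF is
# linear, so the resulting LUT approximates the identity mapping.
uniform = np.full(256, 1.0 / 256)
lut = fused_equalization_lut(uniform, uniform, 0.5)
```

Because the mapping is built from a single fused histogram, global contrast and local detail trade off continuously through the weight rather than via hard region switching.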
Affiliation(s)
- Yu-Hsiu Lin
- Graduate Institute of Automation Technology, National Taipei University of Technology, Taipei 106, Taiwan
- Kai-Lung Hua
- Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei 106, Taiwan
- Yung-Yao Chen
- Department of Electronic and Computer Engineering, National Taiwan University of Science and Technology, Taipei 106, Taiwan
- Correspondence: ; Tel.: +886-2-2737-6378
- I-Ying Chen
- Graduate Institute of Automation Technology, National Taipei University of Technology, Taipei 106, Taiwan
- Yun-Chen Tsai
- Graduate Institute of Automation Technology, National Taipei University of Technology, Taipei 106, Taiwan
8. Lin JD, Lin HH, Dy J, Chen JC, Tanveer M, Razzak I, Hua KL. Lightweight Face Anti-Spoofing Network for Telehealth Applications. IEEE J Biomed Health Inform 2021; 26:1987-1996. [PMID: 34432642] [DOI: 10.1109/jbhi.2021.3107735]
Abstract
Online healthcare applications have grown more popular over the years. For instance, telehealth is an online healthcare application that allows patients and doctors to schedule consultations, prescribe medication, share medical documents, and monitor health conditions conveniently. Apart from this, telehealth can also be used to store a patient's personal and medical information. Given the amount of sensitive data it stores, security measures are necessary. With its rise in usage due to COVID-19, its usefulness may be undermined if security issues are not addressed. A simple way of making these applications more secure is user authentication. One of the most commonly used authentication methods is face recognition. It is convenient and easy to use. However, face recognition systems are not foolproof. They are prone to malicious attacks such as printed photos, paper cutouts, replayed videos, and 3D masks. To counter this, multiple face anti-spoofing methods have been proposed. The goal of face anti-spoofing is to differentiate real users (live) from attackers (spoof). Although effective in terms of performance, existing methods use a significant number of parameters, making them resource-heavy and unsuitable for handheld devices. Apart from this, they fail to generalize well to new environments, such as changes in lighting or background. This paper proposes a lightweight face anti-spoofing framework that does not compromise on performance. A lightweight model is critical for applications like telehealth that run on handheld devices. Our proposed method achieves good performance with the help of an ArcFace classifier (AC), which encourages differentiation between spoof and live samples by drawing clear boundaries between them. With clear boundaries, classification becomes more accurate. We further demonstrate our model's capabilities by comparing the number of parameters, FLOPs, and performance with other state-of-the-art methods.
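The ArcFace classifier mentioned above separates classes by adding an angular margin to the target-class logit. A minimal numpy sketch of the margin computation (the scale s=30 and margin m=0.5 are common ArcFace defaults, not values stated in the paper):

```python
import numpy as np

def arcface_logits(embeddings, weights, labels, s=30.0, m=0.5):
    """Additive angular margin (ArcFace) logits.  Embeddings and class
    weight vectors are L2-normalized so their inner product is
    cos(theta); the margin m is added to the target-class angle
    before rescaling by s."""
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=1, keepdims=True)
    cos = np.clip(e @ w.T, -1.0, 1.0)
    theta = np.arccos(cos)
    target = np.eye(weights.shape[0])[labels].astype(bool)
    return s * np.where(target, np.cos(theta + m), cos)

# A live embedding perfectly aligned with its class weight still pays
# the margin: its logit is s*cos(m), below the unpenalized maximum s.
logits = arcface_logits(np.array([[1.0, 0.0]]),
                        np.array([[1.0, 0.0], [0.0, 1.0]]),
                        np.array([0]))
```

Penalizing even well-classified samples is what pushes live and spoof clusters apart, producing the "clear boundaries" the abstract describes.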
9. Tanveer M, Rashid AH, Ganaie MA, Reza M, Razzak I, Hua KL. Classification of Alzheimer's disease using ensemble of deep neural networks trained through transfer learning. IEEE J Biomed Health Inform 2021; 26:1453-1463. [PMID: 34033550] [DOI: 10.1109/jbhi.2021.3083274]
Keywords
Deep learning; transfer learning; ensemble learning; Alzheimer's disease.
10. Ho DKN, Chiu WC, Lee YC, Su HY, Chang CC, Yao CY, Hua KL, Chu HK, Hsu CY, Chang JS. Integration of an Image-Based Dietary Assessment Paradigm into Dietetic Training Improves Food Portion Estimates by Future Dietitians. Nutrients 2021; 13:175. [PMID: 33430147] [PMCID: PMC7827495] [DOI: 10.3390/nu13010175]
Abstract
The use of image-based dietary assessments (IBDAs) has rapidly increased; however, there is no formalized training program to enhance the digital viewing skills of dietitians. An IBDA was integrated into a nutritional practicum course in the School of Nutrition and Health Sciences, Taipei Medical University, Taiwan. An online IBDA platform was created as an off-campus remedial teaching tool to reinforce the conceptualization of food portion sizes. Dietetic students' receptiveness and response to the IBDA, and their performance in food identification and quantification, were compared between the IBDA and real-food visual estimation (RFVE). No differences were found between the IBDA and RFVE in terms of food identification (67% vs. 71%) or quantification (±10% of estimated calories: 23% vs. 24%). A Spearman correlation analysis showed a moderate to high correlation for calorie estimates between the IBDA and RFVE (r = 0.33-0.75, all p < 0.0001). Repeated IBDA training significantly improved students' image-viewing skills (food identification: first semester 67%, pretest 77%, second semester 84%) and quantification (±10%: first semester 23%, pretest 28%, second semester 32%; ±20%: first semester 38%, pretest 48%, second semester 59%), and reduced absolute estimation errors from 27% (first semester) to 16% (second semester). Training also greatly improved the identification of omitted foods (e.g., condiments, sugar, cooking oil, and batter coatings) and the accuracy of food portion size estimates. The integration of an IBDA into dietetic courses has the potential to help students develop knowledge and skills related to "e-dietetics".
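The ±10% calorie criterion used above is a within-tolerance accuracy; a small sketch of how such a rate is computed (hypothetical numbers, not study data):

```python
def within_tolerance_rate(estimates, truths, tol=0.10):
    """Fraction of estimates whose absolute relative error against the
    ground-truth calories is within the tolerance (default +/-10%)."""
    hits = sum(abs(e - t) / t <= tol for e, t in zip(estimates, truths))
    return hits / len(truths)

# Two of these three estimates fall within 10% of the true calories.
rate = within_tolerance_rate([95, 130, 210], [100, 100, 200])
```

Widening the tolerance to ±20% gives the second, looser accuracy the abstract also reports.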
Affiliation(s)
- Dang Khanh Ngan Ho
- School of Nutrition and Health Sciences, College of Nutrition, Taipei Medical University, Taipei 110, Taiwan
- Wan-Chun Chiu
- School of Nutrition and Health Sciences, College of Nutrition, Taipei Medical University, Taipei 110, Taiwan
- Research Center of Geriatric Nutrition, College of Nutrition, Taipei Medical University, Taipei 11031, Taiwan
- Yu-Chieh Lee
- Department of Obstetrics and Gynecology, Taipei Medical University Hospital, Taipei 110, Taiwan
- Hsiu-Yueh Su
- School of Nutrition and Health Sciences, College of Nutrition, Taipei Medical University, Taipei 110, Taiwan
- Department of Dietetics, Taipei Medical University Hospital, Taipei 110, Taiwan
- Chun-Chao Chang
- Department of Physical Medicine and Rehabilitation, Taipei Medical University Hospital, Taipei 110, Taiwan
- Department of Physical Medicine and Rehabilitation, School of Medicine, College of Medicine, Taipei Medical University, Taipei 110, Taiwan
- Chih-Yuan Yao
- Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei 110, Taiwan
- Kai-Lung Hua
- Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei 110, Taiwan
- Hung-Kuo Chu
- Department of Computer Science, National Tsing Hua University, Hsinchu 300, Taiwan
- Chien-Yeh Hsu
- Department of Information Management, National Taipei University of Nursing and Health Sciences, Taipei 110, Taiwan
- Master Program in Global Health and Development, College of Public Health, Taipei Medical University, Taipei 110, Taiwan
- Jung-Su Chang
- School of Nutrition and Health Sciences, College of Nutrition, Taipei Medical University, Taipei 110, Taiwan
- Graduate Institute of Metabolism and Obesity Sciences, College of Nutrition, Taipei Medical University, Taipei 110, Taiwan
- Nutrition Research Center, Taipei Medical University Hospital, Taipei 110, Taiwan
- Chinese Taipei Society for the Study of Obesity (CTSSO), Taipei 110, Taiwan
- Correspondence: ; Tel.: +886-(2)-27361661 (ext. 6542); Fax: +886-(2)-2737-3112
11. Shan N, Tan DS, Denekew MS, Chen YY, Cheng WH, Hua KL. Photobomb Defusal Expert: Automatically Remove Distracting People From Photos. IEEE Trans Emerg Top Comput Intell 2020. [DOI: 10.1109/tetci.2018.2865215]
12. Lin YH, Hua KL, Lu HH, Sun WL, Chen YY. An Adaptive Exposure Fusion Method Using Fuzzy Logic and Multivariate Normal Conditional Random Fields. Sensors (Basel) 2019; 19:4743. [PMID: 31683704] [PMCID: PMC6864834] [DOI: 10.3390/s19214743]
Abstract
High dynamic range (HDR) imaging has wide applications in intelligent vision sensing, including enhanced electronic imaging, smart surveillance, self-driving cars, and intelligent medical diagnosis. Exposure fusion is an essential HDR technique that fuses different exposures of the same scene into an HDR-like image. However, determining the appropriate fusion weights is difficult because each differently exposed image contains only a subset of the scene's details. When blending, the problem of local color inconsistency is even more challenging; thus, manual tuning is often required to avoid image artifacts. To address this problem, we present an adaptive coarse-to-fine searching approach to find the optimal fusion weights. In the coarse-tuning stage, fuzzy logic is used to efficiently decide the initial weights. In the fine-tuning stage, a multivariate normal conditional random field (MNCRF) model is used to adjust the fuzzy-based initial weights, which allows us to consider both intra- and inter-image information in the data. Moreover, a multiscale enhanced fusion scheme is proposed to blend the input images while maintaining the details at each scale level. The proposed fuzzy-based MNCRF fusion method provides a smoother blending result and a more natural look, while the details in the highlighted and dark regions are preserved simultaneously. The experimental results demonstrate that our work outperforms state-of-the-art methods not only in several objective quality measures but also in a user study analysis.
Affiliation(s)
- Yu-Hsiu Lin
- Department of Electrical Engineering, Ming Chi University of Technology, New Taipei 243, Taiwan
- Kai-Lung Hua
- Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei 106, Taiwan
- Hsin-Han Lu
- Graduate Institute of Automation Technology, National Taipei University of Technology, Taipei 106, Taiwan
- Wei-Lun Sun
- Graduate Institute of Automation Technology, National Taipei University of Technology, Taipei 106, Taiwan
- Yung-Yao Chen
- Graduate Institute of Automation Technology, National Taipei University of Technology, Taipei 106, Taiwan
13. Hou TC, Dai KY, Wu MC, Hua KL, Tai HC, Huang WC, Chen YJ. Bio-physic constraint model using spatial registration of delta 18F-fluorodeoxyglucose positron emission tomography/computed tomography images for predicting radiation pneumonitis in esophageal squamous cell carcinoma patients receiving neoadjuvant chemoradiation. Onco Targets Ther 2019; 12:6439-6451. [PMID: 31496743] [PMCID: PMC6698165] [DOI: 10.2147/ott.s205803]
Abstract
PURPOSE: This study integrated the clinical outcomes and radiomics of advanced thoracic esophageal squamous cell carcinoma patients receiving neoadjuvant concurrent chemoradiotherapy (NACCRT) to establish a novel constraint model for predicting radiation pneumonitis (RP). PATIENTS AND METHODS: We conducted a retrospective review of thoracic advanced esophageal cancer patients who received NACCRT. From 2013 to 2018, 89 patients were eligible for review. The staging workup and response evaluation included positron emission tomography/computed tomography (PET/CT) scans and endoscopic ultrasound. Patients received RT with 48 Gy to the gross tumor and 43.2 Gy to the elective nodal area in a simultaneous integrated boost delivered in 24 fractions. Weekly platinum-based chemotherapy was administered concurrently. Side effects were evaluated using CTCAE v4. Images of 2-fluoro-2-deoxyglucose PET/CT before and after NACCRT were registered to the planning CT images to create regions of interest for dosimetry parameters that spatially matched RP-related regions, including V10, V20, V50%, V27, and V30. The correlation between bio-physic parameters and toxicity was used to establish a constraint model for avoiding RP. RESULTS: In the investigated cohort, the clinical downstaging, complete pathological response, and 5-year overall survival rates were 59.6%, 40%, and 34.4%, respectively. Multivariate logistic regression analysis demonstrated that the standardized uptake value ratios (SUVRs) of each individual image set, whether pre- or post-NACCRT, were not predictive. Interestingly, cutoff increments of 6.2% and 8.9% in the SUVRs (delta-SUVR) in the registered V20 and V27 regions were powerful predictors of acute and chronic RP, respectively. CONCLUSION: Spatial registration of metabolic and planning CT images with delta-radiomics analysis of the pre- and post-treatment image sets can establish a unique bio-physic prediction model for avoiding RP in esophageal cancer patients receiving NACCRT.
Affiliation(s)
- Tien-Chi Hou
- Department of Radiation Oncology, Mackay Memorial Hospital, Taipei, Taiwan
- Kun-Yao Dai
- Department of Radiation Oncology, Mackay Memorial Hospital, Taipei, Taiwan
- Ming-Che Wu
- Department of Nuclear Medicine, Mackay Memorial Hospital, Taipei, Taiwan
- Kai-Lung Hua
- Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan
- Hung-Chi Tai
- Department of Radiation Oncology, Mackay Memorial Hospital, Taipei, Taiwan
- Wen-Chien Huang
- Department of Surgery, Division of Thoracic Surgery, Mackay Memorial Hospital, Taipei City 10449, Taiwan
- Yu-Jen Chen
- Department of Radiation Oncology, Mackay Memorial Hospital, Taipei, Taiwan
- Department of Medical Research, China Medical University Hospital, Taichung 40402, Taiwan
14
Tan DS, Yao CY, Ruiz C, Hua KL. Single-Image Depth Inference Using Generative Adversarial Networks. Sensors (Basel) 2019; 19:E1708. [PMID: 30974774 PMCID: PMC6480060 DOI: 10.3390/s19071708] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 02/16/2019] [Revised: 04/01/2019] [Accepted: 04/08/2019] [Indexed: 11/17/2022]
Abstract
Depth has been a valuable piece of information for perception tasks such as robot grasping, obstacle avoidance, and navigation, which are essential tasks for developing smart homes and smart cities. However, not all applications have the luxury of using depth sensors or multiple cameras to obtain depth information. In this paper, we tackle the problem of estimating the per-pixel depths from a single image. Inspired by the recent works on generative neural network models, we formulate the task of depth estimation as a generative task where we synthesize an image of the depth map from a single Red, Green, and Blue (RGB) input image. We propose a novel generative adversarial network that has an encoder-decoder type generator with residual transposed convolution blocks trained with an adversarial loss. Quantitative and qualitative experimental results demonstrate the effectiveness of our approach over several depth estimation works.
Affiliation(s)
- Daniel Stanley Tan
- Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei 10607, Taiwan.
- Chih-Yuan Yao
- Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei 10607, Taiwan.
- Conrado Ruiz
- Software Technology Department, De La Salle University, Manila 1004, Philippines.
- Kai-Lung Hua
- Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei 10607, Taiwan.
- Center for Cyber-Physical System Innovation, National Taiwan University of Science and Technology, Taipei 10607, Taiwan.
15
Tan DS, Lin JM, Lai YC, Ilao J, Hua KL. Depth Map Upsampling via Multi-Modal Generative Adversarial Network. Sensors (Basel) 2019; 19:s19071587. [PMID: 30986925 PMCID: PMC6480680 DOI: 10.3390/s19071587] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 02/16/2019] [Revised: 03/28/2019] [Accepted: 03/29/2019] [Indexed: 11/16/2022]
Abstract
Autonomous robots for smart homes and smart cities mostly require depth perception in order to interact with their environments. However, depth maps are usually captured at a lower resolution than RGB color images due to inherent sensor limitations. Naively increasing their resolution often leads to loss of sharpness and incorrect estimates, especially in regions with depth discontinuities or depth boundaries. In this paper, we propose a novel Generative Adversarial Network (GAN)-based framework for depth map super-resolution that preserves smooth areas as well as the sharp edges at the boundaries of the depth map. Our proposed model is trained on two different modalities, namely color images and depth maps; at test time, however, it requires only the depth map to produce a higher-resolution version. We evaluated our model both quantitatively and qualitatively, and our experiments show that our method performs better than existing state-of-the-art models.
Affiliation(s)
- Daniel Stanley Tan
- Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei 10607, Taiwan.
- Jun-Ming Lin
- Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei 10607, Taiwan.
- Yu-Chi Lai
- Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei 10607, Taiwan.
- Joel Ilao
- Center for Automation Research, College of Computer Studies, De La Salle University, Manila 1004, Philippines.
- Kai-Lung Hua
- Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei 10607, Taiwan.
- Center for Cyber-Physical System Innovation, National Taiwan University of Science and Technology, Taipei 10607, Taiwan.
16
Sun SW, Mou TC, Fang CC, Chang PC, Hua KL, Shih HC. Baseball Player Behavior Classification System Using Long Short-Term Memory with Multimodal Features. Sensors (Basel) 2019; 19:s19061425. [PMID: 30909503 PMCID: PMC6471259 DOI: 10.3390/s19061425] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 02/14/2019] [Revised: 03/12/2019] [Accepted: 03/20/2019] [Indexed: 11/16/2022]
Abstract
In this paper, a preliminary baseball player behavior classification system is proposed. Using multiple IoT sensors and cameras, the proposed method accurately recognizes a variety of baseball player behaviors by analyzing signals from heterogeneous sensors. The contribution of this paper is threefold: (i) signals from a depth camera and from multiple inertial sensors are obtained and segmented, (ii) the time-variant skeleton vector projection from the depth camera and the statistical features extracted from the inertial sensors are used as features, and (iii) a deep learning-based scheme is proposed for training behavior classifiers. The experimental results demonstrate that the proposed deep learning behavior system achieves an accuracy of greater than 95% on the proposed dataset.
Affiliation(s)
- Shih-Wei Sun
- Department of New Media Art, Taipei National University of the Arts, Taipei 112, Taiwan.
- Computer Center, Taipei National University of the Arts, Taipei 112, Taiwan.
- Ting-Chen Mou
- Department of Communication Engineering, National Central University, Taoyuan 320, Taiwan.
- Chih-Chieh Fang
- Graduate Institute of Dance Theory, Taipei National University of the Arts, Taipei 112, Taiwan.
- Pao-Chi Chang
- Department of Communication Engineering, National Central University, Taoyuan 320, Taiwan.
- Kai-Lung Hua
- Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei 106, Taiwan.
- Center for Cyber-Physical System Innovation, National Taiwan University of Science and Technology, Taipei 106, Taiwan.
- Huang-Chia Shih
- Department of Electrical Engineering, Yuan Ze University, Taoyuan 320, Taiwan.
17
Tan DS, Chen WY, Hua KL. DeepDemosaicking: Adaptive Image Demosaicking via Multiple Deep Fully Convolutional Networks. IEEE Trans Image Process 2018; 27:2408-2419. [PMID: 29994510 DOI: 10.1109/tip.2018.2803341] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 06/08/2023]
Abstract
Convolutional neural networks are currently the state-of-the-art solution for a wide range of image processing tasks. Their deep architecture extracts low and high-level features from images, thus, improving the model's performance. In this paper, we propose a method for image demosaicking based on deep convolutional neural networks. Demosaicking is the task of reproducing full color images from incomplete images formed from overlaid color filter arrays on image sensors found in digital cameras. Instead of producing the output image directly, the proposed method divides the demosaicking task into an initial demosaicking step and a refinement step. The initial step produces a rough demosaicked image containing unwanted color artifacts. The refinement step then reduces these color artifacts using deep residual estimation and multi-model fusion producing a higher quality image. Experimental results show that the proposed method outperforms several existing and state-of-the-art methods in terms of both subjective and objective evaluations.
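To make the two-step pipeline concrete, the sketch below implements only the initial demosaicking step as plain bilinear interpolation of an RGGB Bayer mosaic (a common baseline, not the paper's networks); its color-artifact-prone output is what the deep residual refinement step would then correct:

```python
import numpy as np

def _fill(channel, mask):
    """Fill missing samples (mask == False) with the average of the
    available 3x3 neighbors — plain bilinear interpolation."""
    h, w = channel.shape
    padded_c = np.pad(channel, 1)
    padded_m = np.pad(mask.astype(float), 1)
    out = channel.astype(float).copy()
    for y in range(h):
        for x in range(w):
            if not mask[y, x]:
                win_c = padded_c[y:y + 3, x:x + 3]
                win_m = padded_m[y:y + 3, x:x + 3]
                out[y, x] = win_c.sum() / max(win_m.sum(), 1)
    return out

def bilinear_demosaic(bayer):
    """Toy initial demosaicking of an RGGB Bayer mosaic: split the
    single-channel mosaic into sparse R/G/B planes, then interpolate."""
    h, w = bayer.shape
    yy, xx = np.mgrid[0:h, 0:w]
    r_mask = (yy % 2 == 0) & (xx % 2 == 0)
    b_mask = (yy % 2 == 1) & (xx % 2 == 1)
    g_mask = ~r_mask & ~b_mask
    rgb = np.zeros((h, w, 3))
    for c, mask in enumerate([r_mask, g_mask, b_mask]):
        rgb[..., c] = _fill(bayer * mask, mask)
    return rgb

# A constant-intensity mosaic should reconstruct to a constant image
bayer = np.full((4, 4), 0.5)
rgb = bilinear_demosaic(bayer)
```

Near edges, this interpolation produces the zippering and false-color artifacts that motivate the refinement stage.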
18
Abstract
Compared to the color images, their associated depth images captured by the RGB-D sensors are typically with lower resolution. The task of depth map super-resolution (SR) aims at increasing the resolution of the range data by utilizing the high-resolution (HR) color image, while the details of the depth information are to be properly preserved. In this paper, we present a joint trilateral filtering (JTF) algorithm for depth image SR. The proposed JTF first observes context information from the HR color image. In addition to the extracted spatial and range information of local pixels, our JTF further integrates local gradient information of the depth image, which allows the prediction and refinement of HR depth image outputs without artifacts like textural copies or edge discontinuities. Quantitative and qualitative experimental results demonstrate the effectiveness and robustness of our approach over prior depth map upsampling works.
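A minimal sketch of the trilateral weighting idea, assuming the common formulation with Gaussian kernels for spatial distance, guidance-color similarity, and depth gradient (the paper's exact kernels and refinement stage may differ):

```python
import numpy as np

def jtf_weights(coords, center, colors, color_center, depth_grads,
                sigma_s=1.0, sigma_r=0.1, sigma_g=0.1):
    """Per-neighbor weight: spatial closeness x color similarity in the
    HR guidance image x penalty on the local depth gradient."""
    spatial = np.exp(-np.sum((coords - center) ** 2, axis=1) / (2 * sigma_s ** 2))
    rng = np.exp(-np.sum((colors - color_center) ** 2, axis=1) / (2 * sigma_r ** 2))
    grad = np.exp(-(depth_grads ** 2) / (2 * sigma_g ** 2))
    return spatial * rng * grad

def jtf_pixel(depths, weights):
    """Filtered depth at the center: normalized weighted average."""
    return float(np.sum(weights * depths) / np.sum(weights))

# Toy neighborhood: the third neighbor lies across a color edge in the
# guidance image, so its very different depth is effectively ignored.
coords = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
colors = np.array([[0.2, 0.2, 0.2], [0.2, 0.2, 0.2], [0.9, 0.9, 0.9]])
depths = np.array([1.0, 1.0, 5.0])
w = jtf_weights(coords, np.zeros(2), colors, colors[0], np.zeros(3))
d = jtf_pixel(depths, w)  # stays ≈ 1.0 instead of being pulled toward 5.0
```

This is how the filter avoids edge discontinuities: neighbors across a guidance-image edge receive near-zero weight, so the depth boundary is preserved rather than blurred.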
19
Hua KL, Wang HC, Yeh CH, Cheng WH, Lai YC. Background Extraction Using Random Walk Image Fusion. IEEE Trans Cybern 2018; 48:423-435. [PMID: 28026799 DOI: 10.1109/tcyb.2016.2640288] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/06/2023]
Abstract
It is important to extract a clean background for computer vision and augmented reality. Background extraction generally assumes that a clean background shot exists somewhere in the input sequence, but real-world sequences, such as highway traffic videos, may violate this assumption. Therefore, our probabilistic model-based method formulates the fusion of candidate background patches of the input sequence as a random walk problem and seeks a globally optimal solution based on their temporal and spatial relationships. Furthermore, we design two quality measures that consider spatial and temporal coherence and contrast distinctness among pixels as the basis for background selection. A static background should have high temporal coherence among frames, and thus we improve fusion precision with a temporal contrast filter and an optical-flow-based motionless patch extractor. Experiments demonstrate that our algorithm successfully extracts artifact-free background images at low computational cost compared to state-of-the-art algorithms.
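For contrast, the naive baseline that such fusion methods improve upon is a per-pixel temporal median (a sketch of the baseline, not the paper's method):

```python
import numpy as np

def median_background(frames):
    """Per-pixel temporal median over a frame stack. Works when the
    background is visible at each pixel in most frames; persistent
    foreground causes the ghosting that patch fusion methods avoid."""
    return np.median(np.stack(frames), axis=0)

# A pixel covered by a passing object in 2 of 5 frames still recovers
# its background value; a pixel covered in most frames would not.
frames = [np.array([[10.0, 20.0]]) for _ in range(3)]
frames += [np.array([[200.0, 20.0]]) for _ in range(2)]
bg = median_background(frames)
```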
20
Lee J, Hua KL, Hsu SM, Lin JB, Lee CH, Lu KW, Dai KY, Huang XN, Huang JZ, Wu MH, Chen YJ. Development of delineation for the left anterior descending coronary artery region in left breast cancer radiotherapy: An optimized organ at risk. Radiother Oncol 2017; 122:423-430. [PMID: 28087071 DOI: 10.1016/j.radonc.2016.12.029] [Citation(s) in RCA: 30] [Impact Index Per Article: 4.3] [Received: 05/19/2016] [Revised: 11/27/2016] [Accepted: 12/23/2016] [Indexed: 12/25/2022]
Abstract
BACKGROUND AND PURPOSE: The left anterior descending coronary artery (LAD) and its diagonal branches (DBs) are blurred on computed tomography (CT). We aimed to define an LAD region (LADR) that adequately includes the LAD and DBs and improves contouring consistency.
METHODS AND MATERIALS: The LADR was defined using coronary CT angiograms. The inclusion ratio was used to assess how well the LADR covered the LAD and DBs. Four radiation oncologists delineated the LAD and LADR using contrast-enhanced CT of 15 patients undergoing left breast radiotherapy. The Sørensen-Dice similarity index (DSI), Jaccard similarity index (JSI), and Hausdorff distance (HD) were calculated to assess similarity. The mean dose (Dmean) and maximum dose (Dmax) to the LAD and LADR were calculated to compare consistency. Correlations were evaluated using Pearson's correlation coefficient.
RESULTS: The inclusion ratio of the LAD by the LADR was 96%. The mean DSI, JSI, and HD values were 27.9%, 16.7%, and 0.42 mm, respectively, for the LAD, and 83.1%, 73.0%, and 0.18 mm for the LADR. The Dmean values of the LAD and LADR were strongly correlated (r=0.93).
CONCLUSION: Delineation of the LADR significantly improved contouring similarity and consistency for dose reporting, which could optimize dose estimation in breast radiotherapy.
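The DSI and JSI used here are standard overlap measures; for a single pair of binary contour masks they can be computed as follows (an illustrative sketch, not the authors' evaluation code):

```python
import numpy as np

def dice_jaccard(a, b):
    """Sørensen-Dice index and Jaccard index for two binary masks."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    dice = 2.0 * inter / (a.sum() + b.sum())
    jaccard = inter / union
    return dice, jaccard

# Two 1-D "contours" overlapping in one of their three occupied pixels
dsi, jsi = dice_jaccard([1, 1, 0, 0], [0, 1, 1, 0])  # 0.5, 1/3
```

For one mask pair, JSI = DSI/(2 − DSI); the percentages reported above are averages over observers and patients, so they need not satisfy this identity exactly.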
Affiliation(s)
- Jie Lee
- Department of Radiation Oncology, MacKay Memorial Hospital, Taipei, Taiwan; Department of Medicine, MacKay Medical College, Taiwan; Department of Biomedical Imaging and Radiological Sciences, National Yang-Ming University, Taipei, Taiwan
- Kai-Lung Hua
- Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan
- Shih-Ming Hsu
- Department of Biomedical Imaging and Radiological Sciences, National Yang-Ming University, Taipei, Taiwan
- Jhen-Bin Lin
- Department of Radiation Oncology, Changhua Christian Hospital, Taiwan
- Chou-Hsien Lee
- Department of Radiation Oncology, E-Da Cancer Hospital, Kaohsiung, Taiwan
- Kuo-Wei Lu
- Department of Radiation Oncology, MacKay Memorial Hospital, Taipei, Taiwan
- Kun-Yao Dai
- Department of Radiation Oncology, MacKay Memorial Hospital, Taipei, Taiwan
- Xu-Nian Huang
- Department of Radiation Oncology, MacKay Memorial Hospital, Taipei, Taiwan
- Jun-Zhao Huang
- Department of Radiology, MacKay Memorial Hospital, Taipei, Taiwan
- Meng-Hao Wu
- Department of Radiation Oncology, MacKay Memorial Hospital, Taipei, Taiwan; Department of Medicine, MacKay Medical College, Taiwan.
- Yu-Jen Chen
- Department of Radiation Oncology, MacKay Memorial Hospital, Taipei, Taiwan; Department of Medicine, MacKay Medical College, Taiwan.
21
Tsai TH, Jhou WC, Cheng WH, Hu MC, Shen IC, Lim T, Hua KL, Ghoneim A, Hossain MA, Hidayati SC. Photo sundial: Estimating the time of capture in consumer photos. Neurocomputing 2016. [DOI: 10.1016/j.neucom.2015.11.050] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 10/22/2022]
22
Hua KL, Hsu CH, Hidayati SC, Cheng WH, Chen YJ. Computer-aided classification of lung nodules on computed tomography images via deep learning technique. Onco Targets Ther 2015; 8:2015-22. [PMID: 26346558 PMCID: PMC4531007 DOI: 10.2147/ott.s80733] [Citation(s) in RCA: 116] [Impact Index Per Article: 12.9] [Indexed: 12/24/2022] Open
Abstract
Lung cancer has a poor prognosis when not diagnosed early and unresectable lesions are present. The management of small lung nodules noted on computed tomography scans is controversial due to uncertain tumor characteristics. A conventional computer-aided diagnosis (CAD) scheme requires several image processing and pattern recognition steps to accomplish a quantitative tumor differentiation result. In such an ad hoc image analysis pipeline, every step depends heavily on the performance of the previous step. Accordingly, tuning classification performance in a conventional CAD scheme is complicated and arduous. Deep learning techniques, on the other hand, have the intrinsic advantage of automatic feature exploitation and seamless performance tuning. In this study, we attempted to simplify the image analysis pipeline of conventional CAD with deep learning techniques. Specifically, we introduced models of a deep belief network and a convolutional neural network in the context of nodule classification in computed tomography images. Two baseline methods with feature computing steps were implemented for comparison. The experimental results suggest that deep learning methods could achieve better discriminative results and hold promise in the CAD application domain.
Affiliation(s)
- Kai-Lung Hua
- Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan
- Che-Hao Hsu
- Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan
- Shintami Chusnul Hidayati
- Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan
- Wen-Huang Cheng
- Research Center for Information Technology Innovation, Academia Sinica, Taipei, Taiwan
- Yu-Jen Chen
- Department of Radiation Oncology, MacKay Memorial Hospital, Taipei, Taiwan
23