1. Thijssen LCP, de Rooij M, Barentsz JO, Huisman HJ. Radiomics based automated quality assessment for T2W prostate MR images. Eur J Radiol 2023;165:110928. [PMID: 37354769] [DOI: 10.1016/j.ejrad.2023.110928]
Abstract
PURPOSE The guidelines for prostate cancer recommend the use of MRI in the prostate cancer pathway. Because prostate MR image quality varies, the reliability of this technique in detecting prostate cancer is highly variable in clinical practice. This creates a need for an objective, automated assessment of image quality to ensure adequate acquisition and thereby improve the reliability of MRI. The aim of this study is to investigate the feasibility of the Blind/Referenceless Image Spatial Quality Evaluator (BRISQUE) and radiomics for automated quality assessment of T2-weighted (T2W) images. METHOD Anonymized axial T2W images from 140 patients were scored for quality on a five-point Likert scale (low, suboptimal, acceptable, good, very good) in consensus by two readers. Images were dichotomized into clinically acceptable (very good, good, and acceptable) and clinically unacceptable (low and suboptimal) in order to train and verify the model. Radiomics and BRISQUE features were extracted from a central cuboid volume including the prostate. A reduced feature set was used to fit a Linear Discriminant Analysis (LDA) model to predict image quality. Two-hundred-times-repeated 5-fold cross-validation was used to train the model and test performance by assessing classification accuracy, discrimination accuracy as area under the receiver operating characteristic curve (ROC-AUC), and confusion matrices. RESULTS Thirty-four images were classified as clinically unacceptable and 106 as clinically acceptable. The accuracy on the independent test set (mean ± standard deviation) was 85.4 ± 5.5%. The ROC-AUC was 0.856 (95% confidence interval: 0.851-0.861). CONCLUSIONS Radiomics AI can automatically detect a significant portion of T2W images of suboptimal quality. This can help improve image quality at the time of acquisition, reducing repeat scans and improving diagnostic accuracy.
2. Yang Y, Hu Y, Zhang X, Wang S. Two-Stage Selective Ensemble of CNN via Deep Tree Training for Medical Image Classification. IEEE Trans Cybern 2022;52:9194-9207. [PMID: 33705343] [DOI: 10.1109/tcyb.2021.3061147]
Abstract
Medical image classification is an important task in computer-aided diagnosis systems. Its performance is critically determined by the descriptiveness and discriminative power of the features extracted from images. With the rapid development of deep learning, deep convolutional neural networks (CNNs) have been widely used to learn optimal high-level features from the raw pixels of images for a given classification task. However, owing to the limited number of labeled medical images, some with quality distortions, such techniques suffer from training difficulties, including overfitting, local optima, and vanishing gradients. To address these problems, in this article we propose a two-stage selective ensemble of CNN branches via a novel training strategy called deep tree training (DTT). In our approach, DTT jointly trains a series of networks constructed from the hidden layers of a CNN in a hierarchical manner, so that vanishing gradients are mitigated by supplying gradients to the hidden layers, and base classifiers on middle-level features are obtained intrinsically with minimal computational burden for an ensemble solution. Moreover, the CNN branches serving as base learners are combined into the optimal classifier via the proposed two-stage selective ensemble approach based on both accuracy and diversity criteria. Extensive experiments on the CIFAR-10 benchmark and two specific medical image datasets show that our approach achieves better performance in terms of accuracy, sensitivity, specificity, and F1 score.
3. Liu H, Li H, Wang X, Li H, Ou M, Hao L, Hu Y, Liu J. Understanding How Fundus Image Quality Degradation Affects CNN-based Diagnosis. Annu Int Conf IEEE Eng Med Biol Soc 2022;2022:438-442. [PMID: 36086182] [DOI: 10.1109/embc48229.2022.9871507]
Abstract
Quality degradation (QD) is common in fundus images collected in clinical environments. Although diagnosis models based on convolutional neural networks (CNNs) have been extensively used to interpret retinal fundus images, their performance under QD has not been assessed. To understand the effects of QD on CNN-based diagnosis models, this paper presents a systematic study. In our study, the QD of fundus images is controlled by independently or simultaneously introducing quantified interferences (e.g., image blurring, retinal artifacts, and light transmission disturbance). The effects on diabetic retinopathy (DR) grading are then analyzed from the diagnostic performance on the degraded images. Several CNN-based DR grading models (e.g., AlexNet, SqueezeNet, VGG, DenseNet, and ResNet) are evaluated on images degraded by the quantified interferences. The experiments demonstrate that image blurring causes a significant decrease in performance, whereas the impacts of light transmission disturbance and retinal artifacts are relatively slight. VGG, DenseNet, and ResNet achieve superior performance in the absence of image degradation and remain robust under controlled degradation.
4. Deep Learning-Based Optimization Algorithm for Enterprise Personnel Identity Authentication. Comput Intell Neurosci 2022;2022:9662817. [PMID: 35800683] [PMCID: PMC9256414] [DOI: 10.1155/2022/9662817]
Abstract
Enterprise strategic management is not only an important part of enterprise work, but also an important factor in deepening the reform of the management system and promoting centralized, unified management of enterprises. Enterprise strategic management studies the major problems of the survival and development of enterprises in a competitive environment from an overall, long-term point of view, and is a core function of senior leaders of modern enterprises. Starting from the characteristics of the recognition object, this paper analyzes individual differences in biometrics through intelligent face image recognition technology, which can be used to distinguish different individuals. The paper studies the main problems of personnel identity authentication in current enterprise strategic management systems. Based on identity management and supported by face image recognition, deep learning, and cloud computing technology, a personnel management model for the management system is constructed, which addresses the problems of authenticating personnel identities and controlling personnel safety behavior. Experiments show that the model can simplify the workflow, improve operational efficiency, and reduce management costs. From the perspective of enterprise system development, building a scientific enterprise strategic management system is of great significance for improving the scientific level of enterprise system management.
5. Boussaad L, Boucetta A. Deep-learning based descriptors in application to aging problem in face recognition. J King Saud Univ Comput Inf Sci 2022. [DOI: 10.1016/j.jksuci.2020.10.002]
6. Tian F, Xie H, Song Y, Hu S, Liu J. The Face Inversion Effect in Deep Convolutional Neural Networks. Front Comput Neurosci 2022;16:854218. [PMID: 35615057] [PMCID: PMC9124772] [DOI: 10.3389/fncom.2022.854218]
Abstract
The face inversion effect (FIE) is a behavioral marker of face-specific processing whereby the recognition of inverted faces is disproportionately more disrupted than that of inverted non-face objects. One hypothesis is that while upright faces are represented by a face-specific mechanism, inverted faces are processed as objects. However, evidence from neuroimaging studies is inconclusive, possibly because the face system, such as the fusiform face area, interacts with the object system, so observations from the face system may indirectly reflect influences from the object system. Here we examined the FIE in an artificial face system, Visual Geometry Group network-face (VGG-Face), a deep convolutional neural network (DCNN) specialized for identifying faces. In line with neuroimaging studies on humans, a stronger FIE was found in VGG-Face than in a DCNN pretrained to process objects. Critically, further classification error analysis revealed that VGG-Face miscategorized inverted faces as objects behaviorally, and analysis of internal representations revealed that VGG-Face represented inverted faces in a similar fashion to objects. In short, our study supports the hypothesis that inverted faces are represented as objects in a pure face system.
Affiliation(s)
- Fang Tian, State Key Laboratory of Cognitive Neuroscience and Learning and IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, China
- Hailun Xie, Beijing Key Laboratory of Applied Experimental Psychology, Faculty of Psychology, Beijing Normal University, Beijing, China
- Yiying Song (corresponding author), Beijing Key Laboratory of Applied Experimental Psychology, Faculty of Psychology, Beijing Normal University, Beijing, China
- Siyuan Hu, Beijing Key Laboratory of Applied Experimental Psychology, Faculty of Psychology, Beijing Normal University, Beijing, China
- Jia Liu, Department of Psychology & Tsinghua Laboratory of Brain and Intelligence, Tsinghua University, Beijing, China
7. Križaj J, Dobrišek S, Štruc V. Making the Most of Single Sensor Information: A Novel Fusion Approach for 3D Face Recognition Using Region Covariance Descriptors and Gaussian Mixture Models. Sensors 2022;22:2388. [PMID: 35336559] [PMCID: PMC8950587] [DOI: 10.3390/s22062388]
Abstract
Most commercially successful face recognition systems combine information from multiple sensors (2D and 3D, visible light and infrared, etc.) to achieve reliable recognition in various environments. When only a single sensor is available, both the robustness and the efficacy of the recognition process suffer. In this paper, we focus on face recognition using images captured by a single 3D sensor and propose a method based on region covariance matrices and Gaussian mixture models (GMMs). All steps of the proposed framework are automated, and no metadata, such as pre-annotated eye, nose, or mouth positions, is required; only a very simple clustering-based face detection is performed. The framework computes a set of region covariance descriptors from local regions of different face image representations and then uses the unscented transform to derive low-dimensional feature vectors, which are finally modeled by GMMs. In the last step, a support vector machine classification scheme is used to decide the identity of the input 3D facial image. The proposed framework has several desirable characteristics, such as an inherent mechanism for data fusion/integration (through the region covariance matrices), the ability to explore facial images at different levels of locality, and the ability to integrate domain-specific prior knowledge into the modeling procedure. Several normalization techniques are incorporated into the proposed framework to further improve performance. Extensive experiments on three prominent databases (FRGC v2, CASIA, and UMB-DB) yield competitive results.
8. Deep Learning for Smart Healthcare-A Survey on Brain Tumor Detection from Medical Imaging. Sensors 2022;22:1960. [PMID: 35271115] [PMCID: PMC8915095] [DOI: 10.3390/s22051960]
Abstract
Advances in technology have affected all aspects of human life; for example, the use of technology in medicine has made significant contributions to human society. In this article, we focus on technology assistance for one of the most common and deadly diseases, brain tumors. Every year, many people die of brain tumors; according to estimates from the "braintumor" website, about 700,000 people in the U.S. have primary brain tumors, and about 85,000 people are added to this number every year. To address this problem, artificial intelligence has come to the aid of medicine. Magnetic resonance imaging (MRI) is the most common method of diagnosing brain tumors, and is commonly used in medical imaging and image processing to detect abnormalities in different parts of the body. In this study, we conducted a comprehensive review of existing efforts to apply different types of deep learning methods to MRI data, identified the existing challenges in the domain, and outlined potential future directions. One branch of deep learning that has been very successful in processing medical images is the CNN; therefore, this survey reviews various CNN architectures with a focus on the processing of medical images, especially brain MRI.
9. Baaqeel H, Olatunji S. Spoofing detection on adaptive authentication system: A survey. IET Biometrics 2021. [DOI: 10.1049/bme2.12060]
Affiliation(s)
- Hind Baaqeel, Computer Science Department, College of Computer Science and Information Technology, Imam Abdulrahman Bin Faisal University, Dammam, Saudi Arabia
- Sunday Olatunji, Computer Science Department and SAUDI ARAMCO Cybersecurity Chair, College of Computer Science and Information Technology, Imam Abdulrahman Bin Faisal University, Dammam, Saudi Arabia
10. Lee G, Kim M. Deepfake Detection Using the Rate of Change between Frames Based on Computer Vision. Sensors 2021;21:7367. [PMID: 34770675] [PMCID: PMC8588474] [DOI: 10.3390/s21217367]
Abstract
Recently, artificial intelligence has been used successfully in fields such as computer vision, voice, and big data analysis. However, various problems, such as security, privacy, and ethics, also arise from the development of artificial intelligence. One such problem is deepfakes. Deepfake is a compound word of deep learning and fake; it refers to a fake video created using artificial intelligence technology, or to the production process itself. Deepfakes can be exploited for political abuse, pornography, and fake information. This paper proposes a method to determine integrity by analyzing the computer vision features of digital content. The proposed method extracts the rate of change in the computer vision features of adjacent frames and then checks whether the video has been manipulated. In tests, the method achieved a detection rate of 97%, the highest compared with existing and machine-learning methods. It also maintained a detection rate of 96% in a test that manipulates the image matrix to evade convolutional neural network based detection.
Affiliation(s)
- Mihui Kim (corresponding author; Tel.: +82-31-670-5167)
11. Ge H, Dai Y, Zhu Z, Wang B. Robust face recognition based on multi-task convolutional neural network. Math Biosci Eng 2021;18:6638-6651. [PMID: 34517549] [DOI: 10.3934/mbe.2021329]
Abstract
PURPOSE Due to the lack of prior knowledge of face images, large illumination changes, and complex backgrounds, the accuracy of face recognition is low. To address this issue, we propose a face detection and recognition algorithm based on a multi-task convolutional neural network (MTCNN). METHODS In our paper, MTCNN mainly uses three cascaded networks and adopts the idea of candidate boxes plus classifiers to perform fast and efficient face recognition. The model is trained on a database of 50 faces we collected, and Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index Measure (SSIM), and receiver operating characteristic (ROC) curves are used to compare MTCNN, Region-CNN (R-CNN), and Faster R-CNN. RESULTS The average PSNR of this technique is 1.24 dB higher than that of R-CNN and 0.94 dB higher than that of Faster R-CNN. The average SSIM value of MTCNN is 10.3% higher than R-CNN and 8.7% higher than Faster R-CNN. The Area Under Curve (AUC) of MTCNN is 97.56%, versus 91.24% for R-CNN and 92.01% for Faster R-CNN. MTCNN has the best comprehensive performance in face recognition, and retains the best performance for face images with defective features. CONCLUSIONS The algorithm can effectively improve face recognition to a certain extent. Improved detection accuracy and a reduced false detection rate not only make the method suitable for security-critical locations, helping to protect people and property, but also reduce wasted human resources and improve efficiency.
Affiliation(s)
- Huilin Ge, Yuewei Dai, Zhiyu Zhu, and Biao Wang, School of Electronic Information, Jiangsu University of Science and Technology, Zhenjiang 212003, China
12. Rajagopalan N, Venkateswaran N, Josephraj AN, Srithaladevi E. Diagnosis of retinal disorders from Optical Coherence Tomography images using CNN. PLoS One 2021;16:e0254180. [PMID: 34314421] [PMCID: PMC8315505] [DOI: 10.1371/journal.pone.0254180]
Abstract
An efficient automatic decision support system for the detection of retinal disorders is important and is the need of the hour. Optical Coherence Tomography (OCT) is the current imaging modality for the non-invasive early detection of retinal disorders. In this work, a Convolutional Neural Network (CNN) model is proposed to classify three types of retinal disorders: choroidal neovascularization (CNV), drusen macular degeneration (DMD), and diabetic macular edema (DME). The hyperparameters of the model, such as batch size, number of epochs, dropout rate, and type of optimizer, are tuned using random search optimization for better classification performance. The proposed architecture provides an accuracy of 97.01%, sensitivity of 93.43%, and specificity of 98.07%, outperforming existing models in comparison. The proposed model can be used effectively for the large-scale screening of retinal disorders.
Affiliation(s)
- Nithya Rajagopalan (corresponding author), Department of Biomedical Engineering, Sri Sivasubramaniya Nadar College of Engineering, Chennai, India
- Venkateswaran N., Department of Electronics and Communication Engineering, Sri Sivasubramaniya Nadar College of Engineering, Chennai, India
- Alex Noel Josephraj, Department of Electronic Engineering, College of Engineering, Shantou University, Shantou, China
- Srithaladevi E., Department of Biomedical Engineering, Sri Sivasubramaniya Nadar College of Engineering, Chennai, India
13. Compression Helps Deep Learning in Image Classification. Entropy 2021;23:881. [PMID: 34356422] [PMCID: PMC8304067] [DOI: 10.3390/e23070881]
Abstract
The impact of JPEG compression on deep learning (DL) in image classification is revisited. Given an underlying deep neural network (DNN) pre-trained on pristine ImageNet images, it is demonstrated that if, for any original image, one can select, among its many JPEG compressed versions including the original, a suitable version as input to the underlying DNN, then the classification accuracy of the DNN can be improved significantly while the size in bits of the selected input is, on average, reduced dramatically in comparison with the original image. This contrasts with the conventional understanding that JPEG compression generally degrades the classification accuracy of DL. Specifically, for each original image, consider its 10 JPEG compressed versions with quality factor (QF) values from {100,90,80,70,60,50,40,30,20,10}. Under the assumption that the ground truth label of the original image is known at the time of selecting an input, but unknown to the underlying DNN, we present a selector called the Highest Rank Selector (HRS). HRS is shown to be optimal in the sense of achieving the highest Top k accuracy on any set of images for any k among all possible selectors. When the underlying DNN is Inception V3 or ResNet-50 V2, HRS improves, on average, the Top 1 and Top 5 classification accuracy on the whole ImageNet validation dataset by 5.6% and 1.9%, respectively, while reducing the input size in bits dramatically: the compression ratio (CR) between the size of the original images and the size of the inputs selected by HRS is 8 for the whole ImageNet validation dataset. When the ground truth label of the original image is unknown at the time of selection, we further propose a new convolutional neural network (CNN) topology which is based on the underlying DNN and takes the original image and its 10 JPEG compressed versions as 11 parallel inputs. It is demonstrated that the proposed topology, even when partially trained, can consistently improve the Top 1 accuracy of Inception V3 and ResNet-50 V2 by approximately 0.4%, and the Top 5 accuracy by 0.32% and 0.2%, respectively. Other selectors that do not require the ground truth label of the original image are also presented; they maintain the Top 1 accuracy, the Top 5 accuracy, or both, of the underlying DNN while achieving CRs of 8.8, 3.3, and 3.1, respectively.
14. Akkoca Gazioğlu BS, Kamaşak ME. Effects of objects and image quality on melanoma classification using deep neural networks. Biomed Signal Process Control 2021. [DOI: 10.1016/j.bspc.2021.102530]
16. How to Correctly Detect Face-Masks for COVID-19 from Visual Information? Appl Sci (Basel) 2021. [DOI: 10.3390/app11052070]
Abstract
The new Coronavirus disease (COVID-19) has seriously affected the world. By the end of November 2020, the global number of coronavirus cases had already exceeded 60 million and the number of deaths had reached 1,410,378, according to information from the World Health Organization (WHO). To limit the spread of the disease, mandatory face-mask rules are now becoming common in public settings around the world. Additionally, many public service providers require customers to wear face-masks in accordance with predefined rules (e.g., covering both mouth and nose) when using public services. These developments inspired research into automatic (computer-vision-based) techniques for face-mask detection that can help monitor public behavior and contribute towards constraining the COVID-19 pandemic. Although existing research in this area has produced efficient techniques for face-mask detection, these usually operate under the assumption that modern face detectors provide perfect detection performance (even for masked faces) and that the main goal is only to detect the presence of face-masks. In this study, we revisit these common assumptions and explore the following research questions: (i) How well do existing face detectors perform on masked-face images? (ii) Is it possible to detect proper (regulation-compliant) placement of facial masks? (iii) How useful are existing face-mask detection techniques for monitoring applications during the COVID-19 pandemic? To answer these and related questions, we conduct a comprehensive experimental evaluation of several recent face detectors on masked-face images. Furthermore, we investigate the usefulness of multiple off-the-shelf deep-learning models for recognizing correct face-mask placement. Finally, we design a complete pipeline for recognizing whether face-masks are worn correctly and compare its performance with standard face-mask detection models from the literature. To facilitate the study, we compile a large dataset of facial images from the publicly available MAFA and Wider Face datasets and annotate it with compliant and non-compliant labels. The annotated dataset, called the Face-Mask-Label Dataset (FMLD), is made publicly available to the research community.
17. Taskiran M, Kahraman N, Eroglu Erdem C. Hybrid face recognition under adverse conditions using appearance-based and dynamic features of smile expression. IET Biometrics 2020. [DOI: 10.1049/bme2.12006]
Affiliation(s)
- Murat Taskiran and Nihan Kahraman, Department of Electronics and Communication Engineering, Yildiz Technical University, Istanbul, Turkey
18. Drone Image Segmentation Using Machine and Deep Learning for Mapping Raised Bog Vegetation Communities. Remote Sens 2020. [DOI: 10.3390/rs12162602]
Abstract
The application of drones has recently revolutionised the mapping of wetlands due to their high spatial resolution and flexibility in capturing images. In this study, drone imagery was used to map key vegetation communities in an Irish wetland, Clara Bog, for the spring season. The mapping, carried out through image segmentation or semantic segmentation, was performed using machine learning (ML) and deep learning (DL) algorithms. With the aim of identifying the most appropriate, cost-efficient, and accurate segmentation method, multiple ML classifiers and DL models were compared. Random forest (RF) was identified as the best pixel-based ML classifier, providing good accuracy (≈85%) when used in conjunction with a graph cut algorithm for image segmentation. Amongst the DL networks, a convolutional neural network (CNN) architecture in a transfer learning framework was utilised, and a combination of ResNet50 and SegNet architectures gave the best semantic segmentation results (≈90%). The high accuracy of the DL networks was accompanied by significantly larger labelled training datasets, computation time, and hardware requirements compared with ML classifiers of slightly lower accuracy. For specific applications such as wetland mapping, where networks must be trained for each different site, topography, season, and other atmospheric conditions, ML classifiers proved to be a more pragmatic choice.
19. Chaves D, Fidalgo E, Alegre E, Alaiz-Rodríguez R, Jáñez-Martino F, Azzopardi G. Assessment and Estimation of Face Detection Performance Based on Deep Learning for Forensic Applications. Sensors 2020;20:4491. [PMID: 32796644] [PMCID: PMC7472057] [DOI: 10.3390/s20164491]
Abstract
Face recognition is a valuable forensic tool for criminal investigators, since it helps identify individuals in scenarios of criminal activity, such as fugitive or child sexual abuse cases. It is, however, a very challenging task, as it must handle low-quality images of real-world settings and fulfill real-time requirements. Deep learning approaches for face detection have proven very successful, but they require large computation power and processing time. In this work, we evaluate the speed-accuracy tradeoff of three popular deep-learning-based face detectors on the WIDER Face and UFDD datasets on several CPUs and GPUs. We also develop a regression model capable of estimating performance in terms of both processing time and accuracy. We expect this to become a very useful tool for end users in forensic laboratories, helping them estimate the performance of different face detection options. Experimental results show that the best speed-accuracy tradeoff is achieved with images resized to 50% of the original size on GPUs and to 25% of the original size on CPUs. Moreover, performance can be estimated using multiple linear regression models with a Mean Absolute Error (MAE) of 0.113, which is very promising for the forensic field.
Affiliation(s)
- Deisy Chaves
- Department of Electrical, Systems and Automation, Universidad de León, 24007 León, Spain; (E.F.); (E.A.); (R.A.-R.); (F.J.-M.)
- Researcher at INCIBE (Spanish National Cybersecurity Institute), 24005 León, Spain
- Correspondence:
- Eduardo Fidalgo
- Department of Electrical, Systems and Automation, Universidad de León, 24007 León, Spain; (E.F.); (E.A.); (R.A.-R.); (F.J.-M.)
- Researcher at INCIBE (Spanish National Cybersecurity Institute), 24005 León, Spain
- Enrique Alegre
- Department of Electrical, Systems and Automation, Universidad de León, 24007 León, Spain; (E.F.); (E.A.); (R.A.-R.); (F.J.-M.)
- Researcher at INCIBE (Spanish National Cybersecurity Institute), 24005 León, Spain
- Rocío Alaiz-Rodríguez
- Department of Electrical, Systems and Automation, Universidad de León, 24007 León, Spain; (E.F.); (E.A.); (R.A.-R.); (F.J.-M.)
- Researcher at INCIBE (Spanish National Cybersecurity Institute), 24005 León, Spain
- Francisco Jáñez-Martino
- Department of Electrical, Systems and Automation, Universidad de León, 24007 León, Spain; (E.F.); (E.A.); (R.A.-R.); (F.J.-M.)
- Researcher at INCIBE (Spanish National Cybersecurity Institute), 24005 León, Spain
- George Azzopardi
- Bernoulli Institute for Mathematics, Computer Science and Artificial Intelligence, University of Groningen, 9747 AG Groningen, The Netherlands
|
20
|
Padilha R, Andaló FA, Bertocco G, Almeida WR, Dias W, Resek T, Torres RDS, Wainer J, Rocha A. Two‐tiered face verification with low‐memory footprint for mobile devices. IET BIOMETRICS 2020. [DOI: 10.1049/iet-bmt.2020.0031] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/19/2022] Open
Affiliation(s)
- Rafael Padilha
- Institute of Computing, University of Campinas, Campinas, SP, Brazil
- William Dias
- Institute of Computing, University of Campinas, Campinas, SP, Brazil
- Ricardo da S. Torres
- Department of ICT and Natural Sciences, Faculty of Information Technology and Electrical Engineering, NTNU, Ålesund, Norway
- Jacques Wainer
- Institute of Computing, University of Campinas, Campinas, SP, Brazil
- Anderson Rocha
- Institute of Computing, University of Campinas, Campinas, SP, Brazil
|
21
|
Burti S, Longhin Osti V, Zotti A, Banzato T. Use of deep learning to detect cardiomegaly on thoracic radiographs in dogs. Vet J 2020; 262:105505. [PMID: 32792095 DOI: 10.1016/j.tvjl.2020.105505] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/04/2019] [Revised: 07/04/2020] [Accepted: 07/06/2020] [Indexed: 12/31/2022]
Abstract
The purpose of this study was to develop a computer-aided detection (CAD) device based on convolutional neural networks (CNNs) to detect cardiomegaly from plain radiographs in dogs. Right lateral chest radiographs (n = 1465) were retrospectively selected from archives. The radiographs were classified as having a normal cardiac silhouette (No-vertebral heart scale [VHS]-Cardiomegaly) or an enlarged cardiac silhouette (VHS-Cardiomegaly) based on the breed-specific VHS. The database was divided into a training set (1153 images) and a test set (315 images). The diagnostic accuracy of four different CNN models in the detection of cardiomegaly was calculated using the test set. All tested models had an area under the curve >0.9, demonstrating high diagnostic accuracy. There was a statistically significant difference between Model C and the remaining models (Model A vs. Model C, P = 0.0298; Model B vs. Model C, P = 0.003; Model C vs. Model D, P = 0.0018), but there were no significant differences between the other combinations of models (Model A vs. Model B, P = 0.395; Model A vs. Model D, P = 0.128; Model B vs. Model D, P = 0.373). Convolutional neural networks could therefore assist veterinarians in detecting cardiomegaly in dogs from plain radiographs.
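The area under the ROC curve used to compare the models above is equivalent to the probability that a randomly chosen positive case outscores a randomly chosen negative one (ties counting half). A minimal sketch with made-up scores:

```python
def roc_auc(scores_pos, scores_neg):
    """AUC via the Mann-Whitney pairwise comparison: the fraction of
    (positive, negative) pairs where the positive scores higher,
    with ties counted as half a win."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))
```

An AUC above 0.9, as reported for all four models, means a radiograph with cardiomegaly outscores a normal one in over 90% of such pairs.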
Affiliation(s)
- S Burti
- Department of Animal Medicine, Productions and Health, University of Padua, Viale Dell'Università 16, 35020 Legnaro, Padua, Italy
- V Longhin Osti
- Department of Animal Medicine, Productions and Health, University of Padua, Viale Dell'Università 16, 35020 Legnaro, Padua, Italy
- A Zotti
- Department of Animal Medicine, Productions and Health, University of Padua, Viale Dell'Università 16, 35020 Legnaro, Padua, Italy
- T Banzato
- Department of Animal Medicine, Productions and Health, University of Padua, Viale Dell'Università 16, 35020 Legnaro, Padua, Italy
|
22
|
Liu CYJ, Wilkinson C. A guided manual method for juvenile age progression using digital images. Forensic Sci Int 2020; 308:110170. [PMID: 32066014 DOI: 10.1016/j.forsciint.2020.110170] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/04/2019] [Revised: 01/21/2020] [Accepted: 01/24/2020] [Indexed: 12/01/2022]
Abstract
To predict possible age-related changes to a child's face, age progression methods modify the shape, colour and texture of a facial image while retaining the identity of the individual. However, techniques vary between practitioners. This study combines different age progression techniques for juvenile subjects, drawing on research based on longitudinal radiographic data, physical anthropometric measurements of the head and face, and digital image measurements in pixels. Utilising 12 anthropometric measurements of the face, this study documents a new workflow for digital manual age progression. An inter-observer error study (n = 5) included the comparison of two age progressions of the same individual at different ages. The proposed age progression method recorded satisfactory levels of repeatability based on the 12 anthropometric measurements. Seven measurements achieved an error below 8.60%. Facial anthropometric measurements involving the nasion (n) and trichion (tr) showed the most inconsistency (14–34% difference between practitioners). Overall, the horizontal measurements were more accurate than the vertical measurements. The age progression images were compared using a manual morphological method and machine-based face recognition. The confidence scores generated by three different facial recognition APIs suggest that the performance of any age progression varies not only between practitioners but also between facial recognition systems. The suggested new workflow was able to guide the positioning of the facial features, but the process of age progression remains dependent on artistic interpretation.
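One common way to express the inter-observer differences reported above (such as the 14–34% range for nasion- and trichion-based measurements) is the absolute difference between two practitioners' values as a percentage of their mean. The measurement names and pixel values below are hypothetical, for illustration only.

```python
def percent_difference(a, b):
    """Absolute difference between two practitioners' measurements,
    expressed as a percentage of their mean."""
    return abs(a - b) / ((a + b) / 2.0) * 100.0

# Hypothetical pixel measurements by two practitioners; names are
# illustrative anthropometric landmark pairs, not the study's data.
measurements = {"n-gn": (310.0, 318.0), "tr-n": (150.0, 190.0)}

# Flag measurements whose disagreement exceeds the 8.60% level
# mentioned in the abstract.
flagged = {name: percent_difference(a, b) > 8.6
           for name, (a, b) in measurements.items()}
```

With these invented numbers, the vertical trichion–nasion distance would be flagged as inconsistent while the face-height measurement would not, mirroring the pattern the study reports.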
Affiliation(s)
- Ching Yiu Jessica Liu
- Face Lab, IC1 Liverpool Science Park, 131 Mount Pleasant, Liverpool, L3 5TF, United Kingdom
- Caroline Wilkinson
- Liverpool School of Art & Design, Duckinfield Street, Liverpool, L3 5RD, United Kingdom
|
23
|
Liu CYJ, Wilkinson C. Image conditions for machine-based face recognition of juvenile faces. Sci Justice 2020; 60:43-52. [PMID: 31924288 DOI: 10.1016/j.scijus.2019.10.001] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 02/20/2019] [Revised: 08/06/2019] [Accepted: 10/06/2019] [Indexed: 11/27/2022]
Affiliation(s)
- Ching Yiu Jessica Liu
- Liverpool John Moores University, Liverpool School of Art and Design, IC1 Liverpool Science Park, 131 Mount Pleasant, Liverpool, Merseyside L3 5TF, United Kingdom
- Caroline Wilkinson
- Liverpool John Moores University, Art and Design Academy, 2 Duckinfield Street, Liverpool, Merseyside L3 5RD, United Kingdom
|
24
|
Enhancing human iris recognition performance in unconstrained environment using ensemble of convolutional and residual deep neural network models. Soft comput 2019. [DOI: 10.1007/s00500-019-04610-2] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
|
25
|
Sepas‐Moghaddam A, Pereira FM, Correia PL. Face recognition: a novel multi‐level taxonomy based survey. IET BIOMETRICS 2019. [DOI: 10.1049/iet-bmt.2019.0001] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022] Open
Affiliation(s)
- Alireza Sepas‐Moghaddam
- Instituto de Telecomunicações, Instituto Superior Técnico – Universidade de Lisboa, Lisbon, Portugal
- Fernando M. Pereira
- Instituto de Telecomunicações, Instituto Superior Técnico – Universidade de Lisboa, Lisbon, Portugal
- Paulo Lobato Correia
- Instituto de Telecomunicações, Instituto Superior Técnico – Universidade de Lisboa, Lisbon, Portugal
|
26
|
Olisah CC, Smith L. Understanding unconventional preprocessors in deep convolutional neural networks for face identification. SN APPLIED SCIENCES 2019. [DOI: 10.1007/s42452-019-1538-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/27/2022] Open
Abstract
Deep convolutional neural networks have achieved huge successes in application domains such as object and face recognition. The performance gain is attributed to different facets of the network architecture, such as the depth of the convolutional layers, the activation function, pooling, batch normalization, and forward and backward propagation. However, very little emphasis is placed on the network's preprocessing module. Therefore, in this paper, the preprocessing module is varied across different preprocessing approaches while the other facets of the deep network architecture are kept constant, to investigate the contribution preprocessing makes to the network. The commonly used preprocessors, data augmentation and normalization, are termed conventional preprocessors. The others are termed unconventional preprocessors: color space converters; grey-level resolution preprocessors; full-based and plane-based image quantization; Gaussian blur; and illumination normalization and illumination-insensitive feature preprocessors. To keep the network parameters fixed, CNNs with transfer learning are employed. The aim is to transfer knowledge from the high-level feature vectors of the Inception-V3 network to offline-preprocessed LFW target data, with the features trained using a SoftMax classifier for face identification. The experiments show that the discriminative capability of deep networks can be improved by preprocessing RGB data with some of the unconventional preprocessors before feeding it to the CNNs. However, for best performance, the right combination of preprocessed data with augmentation and/or normalization is required. In summary, preprocessing data before it is fed to the deep network is found to increase the homogeneity of neighborhood pixels even at reduced bit depth, which serves for better storage efficiency.
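A grey-level resolution preprocessor of the kind compared in this study can be sketched as uniform quantization of an 8-bit image to a lower bit depth, which is also what increases neighbourhood-pixel homogeneity at reduced bit depth. The function name and values are illustrative.

```python
import numpy as np

def quantize(image, bits):
    """Reduce an 8-bit image to the given bit depth by mapping each
    pixel to the floor of its quantization bin, so only 2**bits
    distinct grey levels remain."""
    levels = 2 ** bits
    step = 256 // levels
    return (image // step) * step
```

At 2 bits, for example, all 256 input levels collapse onto just four values, so neighbouring pixels that differed slightly before quantization become identical afterwards.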
|
27
|
Grm K, Scheirer WJ, Struc V. Face Hallucination Using Cascaded Super-Resolution and Identity Priors. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2019; 29:2150-2165. [PMID: 31613762 DOI: 10.1109/tip.2019.2945835] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
In this paper we address the problem of hallucinating high-resolution facial images from low-resolution inputs at high magnification factors. We approach this task with convolutional neural networks (CNNs) and propose a novel (deep) face hallucination model that incorporates identity priors into the learning procedure. The model consists of two main parts: i) a cascaded super-resolution network that upscales the low-resolution facial images, and ii) an ensemble of face recognition models that act as identity priors for the super-resolution network during training. Different from most competing super-resolution techniques that rely on a single model for upscaling (even with large magnification factors), our network uses a cascade of multiple SR models that progressively upscale the low-resolution images using steps of 2×. This characteristic allows us to apply supervision signals (target appearances) at different resolutions and incorporate identity constraints at multiple scales. The proposed C-SRIP model (Cascaded Super Resolution with Identity Priors) is able to upscale (tiny) low-resolution images captured in unconstrained conditions and produce visually convincing results for diverse low-resolution inputs. We rigorously evaluate the proposed model on the Labeled Faces in the Wild (LFW), Helen and CelebA datasets and report superior performance compared to the existing state-of-the-art.
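The progressive-2× idea can be illustrated with a toy nearest-neighbour cascade: reaching a large magnification factor by repeated 2× stages, each of which could receive its own supervision signal. This is a structural sketch only, not the C-SRIP network itself.

```python
import numpy as np

def upscale2x(img):
    """Nearest-neighbour 2x upscale - a trivial stand-in for one
    super-resolution stage in the cascade."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def cascade(img, factor):
    """Apply log2(factor) stages of 2x upscaling, mirroring how a
    cascaded SR model reaches large magnifications in 2x steps."""
    assert factor & (factor - 1) == 0, "factor must be a power of two"
    while factor > 1:
        img = upscale2x(img)  # one stage; C-SRIP trains a CNN per stage
        factor //= 2
    return img
```

An 8× magnification thus decomposes into three stages, which is what lets supervision and identity constraints be applied at each intermediate resolution.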
|
28
|
Affiliation(s)
- Jinane Mounsef
- School of Electrical, Computer & Energy Engineering, Arizona State University, Tempe, AZ, USA
- Lina Karam
- School of Electrical, Computer & Energy Engineering, Arizona State University, Tempe, AZ, USA
|
29
|
Kim JY, Lee HE, Choi YH, Lee SJ, Jeon JS. CNN-based diagnosis models for canine ulcerative keratitis. Sci Rep 2019; 9:14209. [PMID: 31578338 PMCID: PMC6775068 DOI: 10.1038/s41598-019-50437-0] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/04/2018] [Accepted: 09/12/2019] [Indexed: 12/21/2022] Open
Abstract
The purpose of this methodological study was to develop a convolutional neural network (CNN), a recently developed deep-learning-based image recognition method, to determine corneal ulcer severity in dogs. The CNN model was trained with images whose corneal ulcer severity (normal, superficial, or deep) had previously been classified through veterinary ophthalmologists' diagnostic evaluations of corneal photographs from patients who visited the Veterinary Medical Teaching Hospital (VMTH) at Konkuk University and 3 different veterinary ophthalmology specialty hospitals in Korea. The original images (36 normal corneas, 47 superficial ulcers and 47 deep ulcers), flipped images (total 520), rotated images (total 520), and both flipped and rotated images (total 1,040) were labeled and then used to train and evaluate GoogLeNet, ResNet, and VGGNet models, and the severity of each corneal ulcer image was determined. To accomplish this task, models based on TensorFlow, an open-source software library developed by Google, were used, and the labeled images were converted into TensorFlow record (TFRecord) format. The models were fine-tuned from a CNN model trained on the ImageNet dataset and then used to predict severity. Most of the models achieved accuracies of over 90% when classifying superficial and deep corneal ulcers, and ResNet and VGGNet achieved accuracies over 90% for classifying normal corneas, corneas with superficial ulcers, and corneas with deep ulcers. This study proposes a method to effectively determine corneal ulcer severity in dogs by using a CNN and concludes that multiple image classification models can be used in the veterinary field.
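The flip-and-rotate augmentation used above to expand the corneal photograph dataset can be sketched with NumPy; the exact variant set the authors generated may differ from this illustrative one.

```python
import numpy as np

def augment(image):
    """Generate flipped and rotated variants of one image - the kind
    of augmentation used to multiply a small labeled dataset."""
    variants = [image]
    variants.append(np.fliplr(image))  # horizontal flip
    for k in (1, 2, 3):                # 90/180/270 degree rotations
        variants.append(np.rot90(image, k))
    return variants
```

Each original photograph thus yields several labeled training examples that share the same severity class, which is how a dataset of 130 originals can be expanded several-fold without new annotation effort.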
Affiliation(s)
- Joon Young Kim
- Veterinary Medical Teaching Hospital, Konkuk University, Seoul, 05029, Republic of Korea
- Ha Eun Lee
- Veterinary Medical Teaching Hospital, Konkuk University, Seoul, 05029, Republic of Korea
- Yeon Hyung Choi
- Veterinary Medical Teaching Hospital, Konkuk University, Seoul, 05029, Republic of Korea
- Suk Jun Lee
- Division of Business Administration, College of Business, Kwangwoon University, Seoul, 01897, Republic of Korea
- Jong Soo Jeon
- Division of Business Administration, College of Business, Kwangwoon University, Seoul, 01897, Republic of Korea
|
30
|
|
31
|
Emeršič Ž, Meden B, Peer P, Štruc V. Evaluation and analysis of ear recognition models: performance, complexity and resource requirements. Neural Comput Appl 2018. [DOI: 10.1007/s00521-018-3530-1] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
|
32
|
Meden B, Emeršič Ž, Štruc V, Peer P. k-Same-Net: k-Anonymity with Generative Deep Neural Networks for Face Deidentification. ENTROPY 2018; 20:e20010060. [PMID: 33265147 PMCID: PMC7512257 DOI: 10.3390/e20010060] [Citation(s) in RCA: 39] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 12/01/2017] [Revised: 12/31/2017] [Accepted: 01/09/2018] [Indexed: 11/16/2022]
Abstract
Image and video data are today being shared between government entities and other relevant stakeholders on a regular basis and require careful handling of the personal information contained therein. A popular approach to ensure privacy protection in such data is the use of deidentification techniques, which aim at concealing the identity of individuals in the imagery while still preserving certain aspects of the data after deidentification. In this work, we propose a novel approach towards face deidentification, called k-Same-Net, which combines recent Generative Neural Networks (GNNs) with the well-known k-Anonymity mechanism and provides formal guarantees regarding privacy protection on a closed set of identities. Our GNN is able to generate synthetic surrogate face images for deidentification by seamlessly combining features of identities used to train the GNN model. Furthermore, it allows us to control the image-generation process with a small set of appearance-related parameters that can be used to alter specific aspects (e.g., facial expressions, age, gender) of the synthesized surrogate images. We demonstrate the feasibility of k-Same-Net in comprehensive experiments on the XM2VTS and CK+ datasets. We evaluate the efficacy of the proposed approach through reidentification experiments with recent recognition models and compare our results with competing deidentification techniques from the literature. We also present facial expression recognition experiments to demonstrate the utility-preservation capabilities of k-Same-Net. Our experimental results suggest that k-Same-Net is a viable option for facial deidentification that exhibits several desirable characteristics when compared to existing solutions in this area.
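The k-Anonymity bookkeeping behind k-Same-Net can be illustrated with the classic pixel-space k-Same algorithm: every output is the average of a group of k inputs, so it matches at least k identities equally well. k-Same-Net replaces this raw averaging with a generative network fed blended identity parameters, but the grouping logic is the same; the sketch below is the simple pixel variant, not the authors' network.

```python
import numpy as np

def k_same_pixel(faces, k):
    """Classic k-Same pixel averaging: replace each consecutive group
    of k face vectors with the group mean, so any deidentified output
    is indistinguishable among at least k source identities."""
    faces = np.asarray(faces, dtype=float)
    out = np.empty_like(faces)
    for start in range(0, len(faces), k):
        group = faces[start:start + k]
        out[start:start + k] = group.mean(axis=0)  # one surrogate per group
    return out
```

The privacy guarantee comes from the grouping, not the averaging itself, which is why a generative model can be substituted for the mean while keeping the same closed-set k-anonymity argument.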
Affiliation(s)
- Blaž Meden
- Faculty of Computer and Information Science, University of Ljubljana, Večna pot 113, SI-1000 Ljubljana, Slovenia
- Correspondence: Tel.: +386-1-479-8245
- Žiga Emeršič
- Faculty of Computer and Information Science, University of Ljubljana, Večna pot 113, SI-1000 Ljubljana, Slovenia
- Vitomir Štruc
- Faculty of Electrical Engineering, University of Ljubljana, Tržaška cesta 25, SI-1000 Ljubljana, Slovenia
- Peter Peer
- Faculty of Computer and Information Science, University of Ljubljana, Večna pot 113, SI-1000 Ljubljana, Slovenia
|