1. Kumar Y, Koul A, Kamini, Woźniak M, Shafi J, Ijaz MF. Automated detection and recognition system for chewable food items using advanced deep learning models. Sci Rep 2024;14:6589. PMID: 38504098; PMCID: PMC10951243; DOI: 10.1038/s41598-024-57077-z.
Abstract
Identifying and recognizing food on the basis of its eating sounds is a challenging task, and it plays an important role in avoiding allergenic foods, guiding people who are restricted to a particular diet, and understanding the cultural significance of food. In this research paper, the aim is to design a novel methodology that helps to identify food items by analyzing their eating sounds using various deep learning models. To achieve this objective, a system has been proposed that extracts meaningful features from food-eating sounds with the help of signal processing techniques and deep learning models for classifying them into their respective food classes. Initially, 1200 labeled audio files for 20 food items have been collected and visualized to find relationships between the sound files of different food items. Later, to extract meaningful features, various techniques such as spectrograms, spectral rolloff, spectral bandwidth, and mel-frequency cepstral coefficients are used to clean the audio files and to capture the unique characteristics of different food items. In the next phase, various deep learning models such as GRU, LSTM, InceptionResNetV2, and a customized CNN model have been trained to learn spectral and temporal patterns in the audio signals. Besides this, the models have also been hybridized (Bidirectional LSTM + GRU, RNN + Bidirectional LSTM, and RNN + Bidirectional GRU) to analyze their performance on the same labeled data and to associate particular sound patterns with their corresponding food class. During evaluation, the highest accuracy, precision, F1 score, and recall have been obtained by GRU with 99.28%, Bidirectional LSTM + GRU with 97.7% as well as 97.3%, and RNN + Bidirectional LSTM with 97.45%, respectively. The results of this study demonstrate that deep learning models have the potential to precisely identify foods on the basis of their sound, achieving the best outcomes.
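A minimal sketch of the kind of pipeline the abstract describes: MFCC features extracted from eating-sound recordings feeding a GRU classifier over food classes. File paths, feature dimensions, and layer sizes here are illustrative assumptions, not the authors' exact configuration.

```python
import numpy as np
import librosa
import tensorflow as tf

N_MFCC, N_FRAMES, N_CLASSES = 40, 128, 20   # 20 food classes, as in the study

def extract_mfcc(path, sr=22050):
    """Load an audio file and return a fixed-size (N_FRAMES, N_MFCC) MFCC matrix."""
    y, sr = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=N_MFCC).T          # (frames, N_MFCC)
    # pad or truncate so every example has the same number of frames
    if mfcc.shape[0] < N_FRAMES:
        mfcc = np.pad(mfcc, ((0, N_FRAMES - mfcc.shape[0]), (0, 0)))
    return mfcc[:N_FRAMES]

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(N_FRAMES, N_MFCC)),
    tf.keras.layers.GRU(128),                                # temporal modeling of the MFCC sequence
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(N_CLASSES, activation="softmax"),  # one output per food item
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
# model.fit(X_train, y_train, validation_split=0.2, epochs=30)   # X: (n, N_FRAMES, N_MFCC), y: int labels
```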
Affiliation(s)
- Yogesh Kumar, Department of CSE, School of Technology, Pandit Deendayal Energy University, Gandhinagar, Gujarat, India
- Apeksha Koul, Department of Computer Science and Engineering, Punjabi University, Patiala, Punjab, India
- Kamini, Southern Alberta Institute of Technology, Calgary, Alberta, Canada
- Marcin Woźniak, Faculty of Applied Mathematics, Silesian University of Technology, Kaszubska 23, 44100 Gliwice, Poland
- Jana Shafi, Department of Computer Engineering and Information, College of Engineering in Wadi Al Dawasir, Prince Sattam Bin Abdulaziz University, 11991 Wadi Al Dawasir, Saudi Arabia
- Muhammad Fazal Ijaz, School of IT and Engineering, Melbourne Institute of Technology, Melbourne, 3000, Australia
2. Papapanagiotou V, Ganotakis S, Delopoulos A. Bite-Weight Estimation Using Commercial Ear Buds. Annu Int Conf IEEE Eng Med Biol Soc 2021;2021:7182-7185. PMID: 34892757; DOI: 10.1109/EMBC46164.2021.9630500.
Abstract
While automatic tracking and measuring of physical activity is a well-established domain, not only in research but also in commercial products and everyday lifestyle, automatic measurement of eating behavior is significantly more limited. Despite the abundance of methods and algorithms available in the literature, commercial solutions are mostly limited to digital logging applications for smartphones. One factor that limits the adoption of such solutions is that they usually require specialized hardware or sensors. Based on this, we evaluate the potential for estimating the weight of consumed food (per bite) based only on the audio signal captured by commercial ear buds (Samsung Galaxy Buds). Specifically, we examine a combination of features (both audio and non-audio) and trainable estimators (linear regression, support vector regression, and neural-network-based estimators) and evaluate on an in-house dataset of 8 participants and 4 food types. Results indicate good potential for this approach: our best results yield a mean absolute error of less than 1 g for 3 out of 4 food types when training food-specific models, and 2.1 g when training on all food types together, both of which improve over an existing literature approach.
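A minimal sketch of per-bite weight regression from hand-crafted audio features using support vector regression, one of the estimator families named in the abstract. The toy features and the `load_bite_dataset` loader are assumptions; the paper's exact feature set and estimator settings are not reproduced here.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_absolute_error

def bite_features(bite_audio, sr):
    """Toy per-bite features: duration, RMS energy, and zero-crossing rate."""
    duration = len(bite_audio) / sr
    rms = np.sqrt(np.mean(bite_audio ** 2))
    zcr = np.mean(np.abs(np.diff(np.sign(bite_audio)))) / 2
    return [duration, rms, zcr]

# X: one feature vector per detected bite; y: reference bite weight in grams
# X_train, y_train, X_test, y_test = load_bite_dataset()    # hypothetical loader
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.1))
# model.fit(X_train, y_train)
# print("MAE (g):", mean_absolute_error(y_test, model.predict(X_test)))
```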
3. Papapanagiotou V, Diou C, Delopoulos A. Self-Supervised Feature Learning of 1D Convolutional Neural Networks with Contrastive Loss for Eating Detection Using an In-Ear Microphone. Annu Int Conf IEEE Eng Med Biol Soc 2021;2021:7186-7189. PMID: 34892758; DOI: 10.1109/EMBC46164.2021.9630399.
Abstract
The importance of automated and objective monitoring of dietary behavior is becoming increasingly accepted. Advancements in sensor technology, along with recent achievements in machine-learning-based signal-processing algorithms, have enabled the development of dietary monitoring solutions that yield highly accurate results. A common bottleneck for developing and training machine learning algorithms is obtaining labeled data for training supervised algorithms, and in particular ground-truth annotations. Manual ground-truth annotation is laborious, cumbersome, can introduce errors, and is sometimes impossible in free-living data collection. As a result, there is a need to decrease the amount of labeled data required for training. Additionally, unlabeled data gathered in the wild from existing wearables (such as Bluetooth earbuds) can be used to train and fine-tune eating-detection models. In this work, we focus on training a feature extractor for audio signals captured by an in-ear microphone for the task of eating detection in a self-supervised way. We base our approach on the SimCLR method, originally proposed by Chen et al. for image classification in the domain of computer vision. Results are promising, as our self-supervised method achieves results similar to supervised training alternatives, and its overall effectiveness is comparable to current state-of-the-art methods. Code is available at https://github.com/mug-auth/ssl-chewing.
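A minimal sketch of SimCLR-style self-supervised pre-training on raw audio windows: a 1D-CNN encoder plus the NT-Xent contrastive loss over two augmented views of each window. The window length, architecture, and projection size are assumptions, not the authors' exact setup (their released code is at the URL above).

```python
import tensorflow as tf

def encoder(win_len=8000, emb_dim=64):
    """1D convolutional encoder mapping a raw audio window to an embedding."""
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=(win_len, 1)),
        tf.keras.layers.Conv1D(16, 64, strides=4, activation="relu"),
        tf.keras.layers.Conv1D(32, 32, strides=4, activation="relu"),
        tf.keras.layers.GlobalAveragePooling1D(),
        tf.keras.layers.Dense(emb_dim),                 # projection head
    ])

def nt_xent_loss(z1, z2, temperature=0.1):
    """NT-Xent loss: embeddings z1[i] and z2[i] are two augmented views of the same window."""
    batch = tf.shape(z1)[0]
    z = tf.math.l2_normalize(tf.concat([z1, z2], axis=0), axis=1)      # (2B, d)
    sim = tf.matmul(z, z, transpose_b=True) / temperature              # (2B, 2B) cosine similarities
    sim = sim - tf.eye(2 * batch) * 1e9                                # mask self-similarity
    labels = tf.concat([tf.range(batch) + batch, tf.range(batch)], 0)  # index of each positive pair
    return tf.reduce_mean(
        tf.keras.losses.sparse_categorical_crossentropy(labels, sim, from_logits=True))
```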
4. Chen G, Jia W, Zhao Y, Mao ZH, Lo B, Anderson AK, Frost G, Jobarteh ML, McCrory MA, Sazonov E, Steiner-Asiedu M, Ansong RS, Baranowski T, Burke L, Sun M. Food/Non-Food Classification of Real-Life Egocentric Images in Low- and Middle-Income Countries Based on Image Tagging Features. Front Artif Intell 2021;4:644712. PMID: 33870184; PMCID: PMC8047062; DOI: 10.3389/frai.2021.644712.
Abstract
Malnutrition, including both undernutrition and obesity, is a significant problem in low- and middle-income countries (LMICs). In order to study malnutrition and develop effective intervention strategies, it is crucial to evaluate nutritional status in LMICs at the individual, household, and community levels. In a multinational research project supported by the Bill & Melinda Gates Foundation, we have been using a wearable technology to conduct objective dietary assessment in sub-Saharan Africa. Our assessment includes multiple diet-related activities in urban and rural families, including food sources (e.g., shopping, harvesting, and gathering), preservation/storage, preparation, cooking, and consumption (e.g., portion size and nutrition analysis). Our wearable device ("eButton", worn on the chest) acquires real-life images automatically during waking hours at preset time intervals. The recorded images, numbering in the tens of thousands per day, are post-processed to obtain the information of interest. Although we expect future Artificial Intelligence (AI) technology to extract this information automatically, at present we utilize AI to separate the acquired images into two classes: images with (Class 1) and without (Class 0) edible items. As a result, researchers need only study Class-1 images, reducing their workload significantly. In this paper, we present a composite machine learning method to perform this classification, meeting the specific challenges of high complexity and diversity in real-world LMIC data. Our method consists of a deep neural network (DNN) and a shallow learning network (SLN) connected by a novel probabilistic network interface layer. After presenting the details of our method, an image dataset acquired from Ghana is utilized to train and evaluate the machine learning system. Our comparative experiment indicates that the new composite method performs better than the conventional deep learning method when assessed by integrated measures of sensitivity, specificity, and burden index, as indicated by the Receiver Operating Characteristic (ROC) curve.
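For illustration only, a generic two-stage pipeline in the same spirit: a pretrained deep network used as a feature extractor with a shallow classifier on top for binary food/non-food image screening. This is not the paper's composite DNN/SLN with its probabilistic interface layer; the MobileNetV2 backbone, the logistic-regression classifier, and the `load_egocentric_images` loader are all assumptions.

```python
import tensorflow as tf
from sklearn.linear_model import LogisticRegression

# Frozen ImageNet-pretrained backbone acting as the deep feature extractor
backbone = tf.keras.applications.MobileNetV2(include_top=False, pooling="avg", weights="imagenet")

def deep_features(images):
    """images: float array of shape (n, 224, 224, 3) with pixel values in [0, 255]."""
    x = tf.keras.applications.mobilenet_v2.preprocess_input(images)
    return backbone.predict(x, verbose=0)                   # (n, 1280) feature vectors

# X_img, y = load_egocentric_images()                        # hypothetical loader; y in {0: non-food, 1: food}
# clf = LogisticRegression(max_iter=1000).fit(deep_features(X_img), y)
# food_prob = clf.predict_proba(deep_features(X_new))[:, 1]  # rank new images for researcher review
```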
Affiliation(s)
- Guangzong Chen, Department of Electrical and Computer Engineering, University of Pittsburgh, PA, United States
- Wenyan Jia, Department of Electrical and Computer Engineering, University of Pittsburgh, PA, United States
- Yifan Zhao, Department of Electrical and Computer Engineering, University of Pittsburgh, PA, United States
- Zhi-Hong Mao, Department of Electrical and Computer Engineering, University of Pittsburgh, PA, United States
- Benny Lo, Hamlyn Centre, Imperial College London, London, United Kingdom
- Alex K. Anderson, Department of Foods and Nutrition, University of Georgia, Athens, GA, United States
- Gary Frost, Section for Nutrition Research, Department of Metabolism, Digestion and Reproduction, Imperial College London, London, United Kingdom
- Modou L. Jobarteh, Section for Nutrition Research, Department of Metabolism, Digestion and Reproduction, Imperial College London, London, United Kingdom
- Megan A. McCrory, Department of Health Sciences, Boston University, Boston, MA, United States
- Edward Sazonov, Department of Electrical and Computer Engineering, University of Alabama, Tuscaloosa, AL, United States
- Richard S. Ansong, Department of Nutrition and Food Science, University of Ghana, Legon-Accra, Ghana
- Thomas Baranowski, USDA/ARS Children's Nutrition Research Center, Department of Pediatrics, Baylor College of Medicine, Houston, TX, United States
- Lora Burke, School of Nursing, University of Pittsburgh, Pittsburgh, PA, United States
- Mingui Sun, Department of Electrical and Computer Engineering and Department of Neurosurgery, University of Pittsburgh, Pittsburgh, PA, United States
5. Kyritsis K, Diou C, Delopoulos A. A Data Driven End-to-End Approach for In-the-Wild Monitoring of Eating Behavior Using Smartwatches. IEEE J Biomed Health Inform 2021;25:22-34. PMID: 32750897; DOI: 10.1109/JBHI.2020.2984907.
Abstract
The increased worldwide prevalence of obesity has sparked the interest of the scientific community in tools that objectively and automatically monitor eating behavior. Although the study of obesity is in the spotlight, such tools can also be used to study eating disorders (e.g., anorexia nervosa) or to provide a personalized monitoring platform for patients or athletes. This paper presents a complete framework for the automated (i) modeling of in-meal eating behavior and (ii) temporal localization of meals, from raw inertial data collected in the wild using commercially available smartwatches. Initially, we present an end-to-end neural network that detects food intake events (i.e., bites). The proposed network uses both convolutional and recurrent layers that are trained simultaneously. Subsequently, we show how the distribution of the detected bites throughout the day can be used to estimate the start and end points of meals using signal processing algorithms. We perform extensive evaluation of each framework part individually. Leave-one-subject-out (LOSO) evaluation shows that our bite detection approach outperforms four state-of-the-art algorithms in detecting bites during the course of a meal (0.923 F1 score). Furthermore, LOSO and held-out-set experiments regarding the estimation of meal start/end points reveal that the proposed approach outperforms a relevant approach from the literature (Jaccard index of 0.820 and 0.821 for the LOSO and held-out experiments, respectively). Experiments are performed using our publicly available FIC dataset and the newly introduced FreeFIC dataset.
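A minimal sketch of the first stage described in the abstract: a network combining convolutional and recurrent layers that maps a window of smartwatch inertial data (3-axis acceleration plus 3-axis gyroscope) to a bite / no-bite probability. Window length and layer sizes are assumptions, not the paper's exact architecture; meal start/end points would then be estimated from the daily distribution of detected bites, as the abstract states.

```python
import tensorflow as tf

WIN_LEN, N_CHANNELS = 500, 6    # e.g. 5 s windows at 100 Hz, 6 inertial channels (assumed)

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(WIN_LEN, N_CHANNELS)),
    tf.keras.layers.Conv1D(32, 9, activation="relu", padding="same"),
    tf.keras.layers.MaxPooling1D(2),
    tf.keras.layers.Conv1D(64, 9, activation="relu", padding="same"),
    tf.keras.layers.MaxPooling1D(2),
    tf.keras.layers.LSTM(64),                         # recurrent layer on top of conv features
    tf.keras.layers.Dense(1, activation="sigmoid"),   # probability that the window contains a bite
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(X_windows, y_bite, epochs=20)   # X: (n, WIN_LEN, N_CHANNELS), y: {0, 1}
```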
6. Kyritsis K, Diou C, Delopoulos A. Modeling Wrist Micromovements to Measure In-Meal Eating Behavior From Inertial Sensor Data. IEEE J Biomed Health Inform 2019;23:2325-2334. PMID: 30629523; DOI: 10.1109/JBHI.2019.2892011.
Abstract
Overweight and obesity are both associated with in-meal eating parameters such as eating speed. Recently, the plethora of wearable devices available on the market has ignited the interest of both the scientific community and industry in unobtrusive solutions for eating behavior monitoring. In this paper, we present an algorithm for automatically detecting in-meal food intake cycles using the inertial signals (acceleration and orientation velocity) from an off-the-shelf smartwatch. We use five specific wrist micromovements to model the series of actions leading to and following an intake event (i.e., a bite). Food intake detection is performed in two steps. In the first step, we process windows of raw sensor streams and estimate their micromovement probability distributions by means of a convolutional neural network. In the second step, we use a long short-term memory network to capture the temporal evolution and classify sequences of windows as food intake cycles. Evaluation is performed using a challenging dataset of 21 meals from 12 subjects. In our experiments, we compare the performance of our algorithm against three state-of-the-art approaches, with our approach achieving the highest F1 detection score (0.913 in the leave-one-subject-out experiment). The dataset used in the experiments is available at https://mug.ee.auth.gr/intake-cycle-detection/.
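A minimal sketch of the two-step structure described in the abstract: a per-window CNN outputs a probability distribution over five wrist micromovements, and an LSTM classifies a sequence of those distributions as an intake cycle or not. Shapes, layer sizes, and the sequence length are assumptions, not the paper's exact configuration.

```python
import tensorflow as tf

WIN_LEN, N_CHANNELS, N_MICRO, SEQ_LEN = 100, 6, 5, 30   # assumed window/sequence sizes

# Step 1: window-level CNN producing a micromovement probability distribution
window_cnn = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(WIN_LEN, N_CHANNELS)),
    tf.keras.layers.Conv1D(32, 5, activation="relu", padding="same"),
    tf.keras.layers.MaxPooling1D(2),
    tf.keras.layers.Conv1D(64, 5, activation="relu", padding="same"),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(N_MICRO, activation="softmax"),   # distribution over 5 micromovements
])

# Step 2: sequence-level LSTM over the per-window probability vectors
sequence_lstm = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(SEQ_LEN, N_MICRO)),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(1, activation="sigmoid"),          # intake cycle vs. background
])
# Training: fit window_cnn on labelled micromovement windows, then feed sequences of its
# softmax outputs to sequence_lstm labelled as intake cycle / not an intake cycle.
```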