451
|
Forecasting Air Temperature on Edge Devices with Embedded AI. SENSORS 2021; 21:s21123973. [PMID: 34207546 PMCID: PMC8228015 DOI: 10.3390/s21123973] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 04/12/2021] [Revised: 06/03/2021] [Accepted: 06/07/2021] [Indexed: 11/17/2022]
Abstract
With the advent of Smart Agriculture, the joint utilization of the Internet of Things (IoT) and Machine Learning (ML) holds the promise to significantly improve agricultural production and sustainability. In this paper, the design of a Neural Network (NN)-based prediction model of a greenhouse's internal air temperature, to be deployed and run on an edge device with constrained capabilities, is investigated. The model relies on a time series-oriented approach, taking as input variables the past and present values of the air temperature to forecast the future ones. In detail, we evaluate three different NN architecture types, namely Long Short-Term Memory (LSTM) networks, Recurrent NNs (RNNs), and Artificial NNs (ANNs), with various values of the sliding window associated with the input data. Experimental results show that the three best-performing models have a Root Mean Squared Error (RMSE) value in the range 0.289 to 0.402 °C, a Mean Absolute Percentage Error (MAPE) in the range 0.87 to 1.04%, and a coefficient of determination (R²) not smaller than 0.997. The overall best-performing model, based on an ANN, combines good prediction performance with low computational and architectural complexity (evaluated on the basis of the NetScore metric), making its deployment on an edge device feasible.
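A minimal sketch of the sliding-window time-series setup this abstract describes, together with the RMSE and MAPE metrics it reports. The data, window size, and the naive persistence baseline are invented for illustration; this is not the authors' model:

```python
import numpy as np

def make_windows(series, window):
    """Build (samples, window) input rows and next-step targets from a 1-D series."""
    X = np.array([series[i:i + window] for i in range(len(series) - window)])
    y = np.array(series[window:])
    return X, y

def rmse(y_true, y_pred):
    """Root Mean Squared Error."""
    return float(np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)))

def mape(y_true, y_pred):
    """Mean Absolute Percentage Error, in percent."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100)

# Toy greenhouse temperatures; the "model" here is just persistence
# (predict the last value of each window) to exercise the metrics.
temps = [20.1, 20.4, 20.9, 21.3, 21.0, 20.7, 20.5]
X, y = make_windows(temps, window=3)
baseline = X[:, -1]
```

In the paper, the windowed rows would instead feed an ANN/RNN/LSTM; the windowing and metric code stays the same.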
|
452
|
Zhang Wu M, Luo J, Fang X, Xu M, Zhao P. Modeling multivariate cyber risks: deep learning dating extreme value theory. J Appl Stat 2021; 50:610-630. [PMID: 36819078 PMCID: PMC9930783 DOI: 10.1080/02664763.2021.1936468] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 10/21/2022]
Abstract
Modeling cyber risks is an important but challenging task in the domain of cyber security, mainly because of the high dimensionality and heavy tails of risk patterns. These obstacles have hindered the development of statistical modeling of multivariate cyber risks. In this work, we propose a novel approach for modeling multivariate cyber risks which relies on deep learning and extreme value theory. The proposed model not only delivers highly accurate point predictions via deep learning but also provides satisfactory high-quantile predictions via extreme value theory. Both simulation and empirical studies show that the proposed approach models multivariate cyber risks very well and provides satisfactory prediction performance.
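The extreme-value half of such a hybrid is typically a peaks-over-threshold fit. The sketch below estimates a high quantile by fitting a Generalized Pareto Distribution to threshold exceedances with method-of-moments estimators; the paper's actual estimator, threshold choice, and data are unknown, so everything here (including the synthetic Pareto "losses") is an illustrative assumption:

```python
import numpy as np

def gpd_high_quantile(losses, threshold, p):
    """Peaks-over-threshold (POT) estimate of the p-quantile: fit a
    Generalized Pareto Distribution (GPD) to exceedances over `threshold`
    by the method of moments, then invert the GPD tail formula."""
    losses = np.asarray(losses, float)
    exceed = losses[losses > threshold] - threshold
    n, n_u = len(losses), len(exceed)
    m, s2 = exceed.mean(), exceed.var(ddof=1)
    xi = 0.5 * (1.0 - m * m / s2)            # shape parameter
    sigma = 0.5 * m * (1.0 + m * m / s2)     # scale parameter
    # GPD tail quantile: u + (sigma/xi) * ((n/n_u * (1-p))^(-xi) - 1)
    return threshold + (sigma / xi) * ((n / n_u * (1.0 - p)) ** (-xi) - 1.0)

rng = np.random.default_rng(0)
losses = rng.pareto(3.0, 2000)               # synthetic heavy-tailed "losses"
u = np.quantile(losses, 0.9)                 # 90th percentile as threshold
q95 = gpd_high_quantile(losses, u, 0.95)
q99 = gpd_high_quantile(losses, u, 0.99)
```

In the proposed hybrid, the deep network would supply the point forecast while a fit like this supplies the high-quantile correction.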
|
453
|
Sudarshan VK, Brabrand M, Range TM, Wiil UK. Performance evaluation of Emergency Department patient arrivals forecasting models by including meteorological and calendar information: A comparative study. Comput Biol Med 2021; 135:104541. [PMID: 34166880 DOI: 10.1016/j.compbiomed.2021.104541] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 01/08/2021] [Revised: 05/30/2021] [Accepted: 05/30/2021] [Indexed: 11/30/2022]
Abstract
The volume of daily patient arrivals at Emergency Departments (EDs) is unpredictable and is a significant cause of ED crowding in hospitals worldwide. Timely forecasts of patients arriving at the ED can help hospital management with early planning and the avoidance of overcrowding. Many ED patient arrival forecasting models have previously been proposed using time series analysis methods. Even though time series methods such as Linear and Logistic Regression, Autoregressive Integrated Moving Average (ARIMA), Seasonal ARIMA (SARIMA), Exponential Smoothing (ES), and Artificial Neural Networks (ANNs) have been explored extensively for ED forecasting model development, a few significant limitations of these methods in the analysis of time series data make the models inadequate in many practical situations. Therefore, in this paper, a Machine Learning (ML)-based Random Forest (RF) regressor and Deep Neural Network (DNN)-based Long Short-Term Memory (LSTM) and Convolutional Neural Network (CNN) methods, which have not been explored to the same extent as the other time series techniques, are implemented by incorporating meteorological and calendar parameters for the development of forecasting models. The performance of the three developed models in forecasting ED patient arrivals is evaluated. Among the three models, CNN performed best for short-term (3 days in advance) patient arrival prediction, with a Mean Absolute Percentage Error (MAPE) of 9.24%, and LSTM performed better for moderate-term (7 days in advance) prediction, with a MAPE of 8.91%, using weather forecast information. For current-day prediction of patient arrivals using 3 days of past weather information, the LSTM model outperformed the others with a MAPE of 8.04%, compared to 9.53% for CNN and 10.10% for RF. Thus, for short-term ED patient arrival forecasting, the DNN-based models performed better than the ML-based RF regressor.
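The calendar covariates such models consume can be derived directly from the date. A small illustrative sketch (the feature set and holiday list are assumptions; the paper does not publish its exact encoding):

```python
from datetime import date, timedelta

def calendar_features(day, holidays=frozenset()):
    """One row of calendar covariates for a daily ED-arrivals model."""
    return {
        "day_of_week": day.weekday(),      # 0 = Monday ... 6 = Sunday
        "is_weekend": day.weekday() >= 5,
        "month": day.month,
        "is_holiday": day in holidays,
    }

holidays = {date(2021, 1, 1)}              # New Year's Day, for illustration
rows = [calendar_features(date(2021, 1, 1) + timedelta(days=i), holidays)
        for i in range(7)]
```

Each row would be concatenated with the meteorological parameters for that day before being fed to the RF, LSTM, or CNN model.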
|
454
|
Banerjee S, Lian Y. Data driven covid-19 spread prediction based on mobility and mask mandate information. APPL INTELL 2021; 52:1969-1978. [PMID: 34764603 PMCID: PMC8172182 DOI: 10.1007/s10489-021-02381-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Accepted: 03/24/2021] [Indexed: 11/20/2022]
Abstract
COVID-19 is one of the most widespread pandemic diseases in the documented history of mankind. Human-to-human interaction is the most prolific method of transmission of the virus. Nations across the globe started to issue stay-at-home orders and to mandate wearing masks or another form of face covering in public, to minimize transmission by reducing contact among the populace. The epidemiological models used in the literature have considerable drawbacks in their assumption of homogeneous mixing among the populace. Moreover, the effect of mitigation strategies such as mask mandates and stay-at-home orders cannot be efficiently accounted for in these models. In this work, we propose a novel data-driven approach using an LSTM (Long Short-Term Memory) neural network model to form a functional mapping from mobility data, quantified from cell phone traffic information, and mask mandate information to daily new confirmed cases. With this approach, no pre-defined equations are used to predict the spread, no homogeneous mixing assumption is made, and the effect of mitigation strategies can be accounted for. The model learns the spread of the virus from factual data obtained from verified resources. A study of the number of cases in the states of New York (NY) and Florida (FL) in the USA is performed using the model. The model correctly predicts that cases increase with higher mobility and vice versa. It further predicts that the rate of new cases would decline if a mask mandate were administered. Both these predictions agree with the opinions of leading medical and immunological experts. The model also predicts that, with a mask mandate, even higher mobility would produce fewer daily cases than lower mobility without masks.
We additionally generate results and provide an RMSE (Root Mean Square Error) comparison with the ARIMA-based model of other published work for Italy, Turkey, Australia, Brazil, Canada, Egypt, Japan, and the UK. Our model reports a lower RMSE than the ARIMA-based work for all eight countries tested. The proposed model provides administrations with a quantifiable basis for how mobility and mask mandates relate to new confirmed cases; so far, no epidemiological model provides that information. It gives fast and relatively accurate predictions of the number of cases and would enable administrations to make informed decisions and plan mitigation strategies and changes in hospital resources.
|
455
|
Quddus A, Shahidi Zandi A, Prest L, Comeau FJE. Using long short term memory and convolutional neural networks for driver drowsiness detection. ACCIDENT; ANALYSIS AND PREVENTION 2021; 156:106107. [PMID: 33848710 DOI: 10.1016/j.aap.2021.106107] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/23/2019] [Revised: 07/19/2020] [Accepted: 03/27/2021] [Indexed: 06/12/2023]
Abstract
Fatigue negatively affects the safety and performance of drivers on the road. In fact, drowsiness and fatigue cause a substantial number of motor vehicle accidents. Drowsiness among drivers can be detected using a variety of modalities, including electroencephalogram (EEG), eye movement, and vehicle driving dynamics. Among these, EEG is highly accurate but very intrusive and cumbersome. On the other hand, vehicle driving dynamics are very easy to acquire but not very accurate. An eye-movement-based approach is attractive as a balance between these two extremes. However, eye-movement-based techniques normally require an eye-tracking device consisting of a high-speed camera with sophisticated algorithms to extract eye-movement-related parameters such as blinking, eye closure, saccades, fixations, etc. This makes eye-tracking-based drowsiness detection difficult to implement as a practical system, especially on an embedded platform. In this paper, the authors propose to use eye images from a camera directly, without the need for an expensive eye-tracking system. Here, eye-related movements are captured by a Recurrent Neural Network (RNN) to detect drowsiness. Long Short-Term Memory (LSTM) is a class of RNN which has several advantages over vanilla RNNs. In this work, an array of LSTM cells is utilized to model the eye movements. Two types of LSTMs were employed: a 1-D LSTM (R-LSTM), used as a baseline, and a convolutional LSTM (C-LSTM), which facilitates using 2-D images directly. Patches of size 48 × 48 around each eye were extracted from 38 subjects participating in a simulated driving experiment. The state of vigilance of the subjects was independently assessed by power spectral analysis of simultaneously recorded multichannel electroencephalogram (EEG) signals, and binary labels of alert and drowsy (baseline) were generated. Results show the high efficacy of the proposed system.
The R-LSTM-based approach resulted in an accuracy of around 82%, and the C-LSTM-based approach resulted in accuracies in the range of 95%-97%. A comparison is also provided with a recently published eye-tracking-based approach, showing that the proposed LSTM technique outperforms it by a wide margin.
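The 48 × 48 patch extraction step can be sketched as a clamped crop around a detected eye centre; the clamping rule and the synthetic frame below are illustrative assumptions, not the authors' pipeline:

```python
import numpy as np

def extract_patch(frame, center, size=48):
    """Crop a size x size patch around an (row, col) eye centre,
    clamping the window so it stays fully inside the frame."""
    h, w = frame.shape[:2]
    half = size // 2
    r = min(max(center[0] - half, 0), h - size)
    c = min(max(center[1] - half, 0), w - size)
    return frame[r:r + size, c:c + size]

frame = np.zeros((120, 160), dtype=np.uint8)   # stand-in camera frame
patch = extract_patch(frame, (10, 150))        # eye near the top-right corner
```

A sequence of such patches per eye is what the C-LSTM would consume frame by frame.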
|
456
|
Barren M, Hauskrecht M. Improving Prediction of Low-Prior Clinical Events with Simultaneous General Patient-State Representation Learning. ARTIFICIAL INTELLIGENCE IN MEDICINE. CONFERENCE ON ARTIFICIAL INTELLIGENCE IN MEDICINE (2005- ) 2021; 12721:479-490. [PMID: 34308430 PMCID: PMC8301230 DOI: 10.1007/978-3-030-77211-6_57] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2023]
Abstract
Low-prior targets are common among many important clinical events, which introduces the challenge of having enough data to support learning of their predictive models. Many prior works have addressed this problem by first building a general patient-state representation model and then adapting it to a new low-prior prediction target. In this scheme, predictive performance can be hindered by misalignment between the general patient-state model and the target task. To overcome this challenge, we propose a new method that simultaneously optimizes a shared model through multi-task learning of both the low-prior supervised target and a general-purpose patient-state representation (GPSR). More specifically, our method improves the prediction performance of a low-prior task by jointly optimizing a shared model that combines the loss of the target event with those of a broad range of generic clinical events. We study the approach in the context of Recurrent Neural Networks (RNNs). Through extensive experiments on multiple clinical event targets using MIMIC-III [8] data, we show that the inclusion of general patient-state representation tasks during model training improves the prediction of individual low-prior targets.
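The joint objective of combining a target-event loss with auxiliary GPSR losses can be sketched as a weighted sum; the binary cross-entropy choice, the mean over auxiliary tasks, and the weight `lam` are assumptions for illustration, not the paper's exact formulation:

```python
import numpy as np

def bce(y, p, eps=1e-7):
    """Binary cross-entropy of predicted probabilities p against labels y."""
    p = np.clip(p, eps, 1 - eps)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

def multitask_loss(target_pair, aux_pairs, lam=0.5):
    """Joint objective: loss on the low-prior target event plus a
    weighted mean of losses over generic auxiliary (GPSR) events."""
    y_t, p_t = target_pair
    aux = np.mean([bce(y, p) for y, p in aux_pairs]) if aux_pairs else 0.0
    return bce(y_t, p_t) + lam * aux
```

During training, gradients of this combined loss flow through the shared RNN encoder, so the representation is shaped by both the rare target and the generic events.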
|
457
|
Protein Structure Prediction: Conventional and Deep Learning Perspectives. Protein J 2021; 40:522-544. [PMID: 34050498 DOI: 10.1007/s10930-021-10003-y] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Accepted: 05/21/2021] [Indexed: 10/21/2022]
Abstract
Protein structure prediction is a way to bridge the sequence-structure gap, one of the main challenges in computational biology and chemistry. Predicting any protein's accurate structure is of paramount importance for the scientific community, as these structures govern their function. Moreover, this is one of the most complicated optimization problems that computational biologists have ever faced. Experimental protein structure determination methods include X-ray crystallography, Nuclear Magnetic Resonance spectroscopy, and electron microscopy. All of these are tedious and time-consuming procedures that require expertise. To make the process less cumbersome, scientists use predictive tools as part of computational methods, using data consolidated in the protein repositories. In recent years, machine learning approaches have raised the interest of the structure prediction community. Most of the machine learning approaches for protein structure prediction are centred on co-evolution-based methods, whose accuracy depends on the number of homologous protein sequences available in the databases. The prediction problem becomes challenging for many proteins, especially those without enough sequence homologs. Deep learning methods allow the extraction of intricate features from protein sequence data without relying on hand-crafted intuitions. Accurately predicted protein structures are employed for drug discovery, antibody design, and understanding protein-protein interactions and interactions with other molecules. This article provides a review of conventional and deep learning approaches to protein structure prediction. We conclude this review by outlining a few publicly available datasets and deep learning architectures currently employed for protein structure prediction tasks.
|
458
|
One-Year Lesson: Machine Learning Prediction of COVID-19 Positive Cases with Meteorological Data and Mobility Estimate in Japan. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2021; 18:ijerph18115736. [PMID: 34071801 PMCID: PMC8198917 DOI: 10.3390/ijerph18115736] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 04/22/2021] [Revised: 05/22/2021] [Accepted: 05/25/2021] [Indexed: 12/13/2022]
Abstract
With the wide spread of COVID-19 and its corresponding negative impact on different aspects of life, it has become important to understand ways to deal with the pandemic as part of the daily routine. After a year of the COVID-19 pandemic, it has become obvious that different factors, including meteorological ones, influence the speed at which the disease spreads and the potential fatalities. However, the impact of each factor on the speed at which COVID-19 spreads remains controversial. Accurate forecasting of potential positive cases may lead to better management of healthcare resources and provide guidelines for government policies in terms of the action required within an effective timeframe. Recently, Google Cloud has provided online COVID-19 forecasting data for the United States and Japan, which help in predicting future situations on a state/prefecture scale and are updated daily. In this study, we propose a deep learning architecture to predict the spread of COVID-19 considering various factors, such as meteorological data and public mobility estimates, and apply it to data collected in Japan to demonstrate its effectiveness. The proposed model was constructed using a neural network architecture based on a long short-term memory (LSTM) network. The model consists of multi-path LSTM layers that are trained using time-series meteorological data and public mobility data obtained from open sources. The model was tested using different time frames, and the results were compared to Google Cloud forecasts. Public mobility is a dominant factor in estimating new positive cases, whereas meteorological data improve the accuracy of the estimates. The average relative error of the proposed model ranged from 16.1% to 22.6% in major regions, a significant improvement over the Google Cloud forecasts. This model can be used to raise public awareness of the morbidity risk of the COVID-19 pandemic in a feasible manner.
|
459
|
Abstract
Chatbots can potentially address deficits in the availability of the traditional health workforce and could help to stem concerning rates of youth mental health issues, including high suicide rates. While chatbots have shown some positive results in helping people cope with mental health issues, there remain deep concerns regarding their ability to identify emergency situations and act accordingly. Risk of suicide/self-harm is one such concern, which we have addressed in this project. A chatbot decides its response based on the text input from the user and must correctly recognize the significance of a given input. We have designed a self-harm classifier which uses the user's response to the chatbot to predict whether the response indicates intent for self-harm. Given the difficulty of accessing confidential counselling data, we looked for alternative data sources and found that Twitter and Reddit provide data similar to what we would expect from a chatbot user. We trained a sentiment analysis classifier on the Twitter data and a self-harm classifier on the Reddit data, and combined the results of the two models to improve performance. We obtained the best results from an LSTM-RNN classifier using BERT encoding, with a best model accuracy of 92.13%. We tested the model on new data from Reddit and obtained an impressive accuracy of 97%. Such a model is promising for future embedding in mental health chatbots to improve their safety through accurate detection of self-harm talk by users.
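One simple way to combine the two model outputs, as the abstract describes, is late fusion of their scores. The weighted-average scheme, the weight, and the decision threshold below are illustrative assumptions; the paper does not state its exact combination rule:

```python
def fuse_predictions(sentiment_neg, self_harm_prob, w=0.3, threshold=0.5):
    """Late fusion of two classifiers: a weighted average of the sentiment
    model's negativity score and the self-harm classifier's probability,
    plus a binary flag against a decision threshold."""
    fused = w * sentiment_neg + (1 - w) * self_harm_prob
    return fused, fused >= threshold

score, flagged = fuse_predictions(0.9, 0.8)    # both models alarmed
```

In a deployed chatbot, `flagged` would trigger an escalation path rather than a normal conversational reply.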
|
460
|
Fekri P, Dargahi J, Zadeh M. Corrigendum: Deep Learning-Based Haptic Guidance for Surgical Skills Transfer. Front Robot AI 2021; 8:691570. [PMID: 34026860 PMCID: PMC8132118 DOI: 10.3389/frobt.2021.691570] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/07/2021] [Accepted: 04/12/2021] [Indexed: 12/02/2022] Open
|
461
|
Prediction of Head Movement in 360-Degree Videos Using Attention Model. SENSORS 2021; 21:s21113678. [PMID: 34070560 PMCID: PMC8198419 DOI: 10.3390/s21113678] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 03/16/2021] [Revised: 05/18/2021] [Accepted: 05/23/2021] [Indexed: 12/03/2022]
Abstract
In this paper, we propose a prediction algorithm, a combination of Long Short-Term Memory (LSTM) and an attention model, based on machine learning models to predict the vision coordinates when watching 360-degree videos in a Virtual Reality (VR) or Augmented Reality (AR) system. Predicting the vision coordinates during video streaming is important when the network condition is degraded. However, traditional prediction models such as Moving Average (MA) and Autoregressive Moving Average (ARMA) are linear, so they cannot capture nonlinear relationships. Therefore, machine learning models based on deep learning have recently been used for nonlinear predictions. We use the Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) neural network methods, which originate from Recurrent Neural Networks (RNNs), to predict the head position in 360-degree videos. Furthermore, we add the attention model to the LSTM to produce more accurate results. We also compare the performance of the proposed model with other machine learning models, such as the Multi-Layer Perceptron (MLP) and RNN, using the root mean squared error (RMSE) between predicted and real coordinates. We demonstrate that our model can predict the vision coordinates more accurately than the other models across various videos.
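The attention step over the LSTM's per-timestep outputs can be sketched as dot-product attention: score each hidden state against a query, softmax the scores, and take the weighted sum. The tiny hidden states and query below are invented for illustration; the paper's exact attention variant may differ:

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attention_pool(hidden_states, query):
    """Dot-product attention over (T, D) hidden states: returns the
    attention-weighted context vector and the attention weights."""
    scores = hidden_states @ query        # (T,) one score per timestep
    weights = softmax(scores)
    context = weights @ hidden_states     # (D,) weighted sum of states
    return context, weights

H = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])   # 3 timesteps, dim 2
q = np.array([2.0, 0.0])
context, w = attention_pool(H, q)
```

The context vector, rather than only the last LSTM state, then feeds the coordinate-regression head.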
|
462
|
Utilization of Micro-Doppler Radar to Classify Gait Patterns of Young and Elderly Adults: An Approach Using a Long Short-Term Memory Network. SENSORS 2021; 21:s21113643. [PMID: 34073806 PMCID: PMC8197185 DOI: 10.3390/s21113643] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/30/2021] [Revised: 05/20/2021] [Accepted: 05/21/2021] [Indexed: 11/16/2022]
Abstract
To develop a daily monitoring system for early detection of fall risk in elderly people during walking, this study presents a highly accurate micro-Doppler radar (MDR)-based gait classification method for young and elderly adults. Our method utilizes a time series of velocity corresponding to leg motion during walking, extracted from the MDR spectrogram (time-velocity distribution), in an experimental study involving 300 participants. The extracted time series was input to a long short-term memory recurrent neural network to classify the gaits of the young and elderly participant groups. We achieved a classification accuracy of 94.9%, which is significantly higher than that of a previously presented velocity-parameter-based classification method.
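Extracting a velocity time series from a time-velocity spectrogram can be sketched as taking the peak-power velocity bin per time frame. This is a deliberately simplified stand-in (the toy spectrogram and the argmax rule are assumptions; the paper's extraction is more involved):

```python
import numpy as np

def leg_velocity_series(spectrogram, velocity_bins):
    """For each time frame, take the velocity bin with peak Doppler power
    as a proxy for the dominant leg velocity."""
    peak_idx = np.argmax(spectrogram, axis=0)   # one bin index per frame
    return velocity_bins[peak_idx]

bins = np.linspace(-2.0, 2.0, 5)                # velocity axis, m/s
spec = np.array([[0, 1, 0],                     # rows: velocity bins
                 [0, 0, 0],                     # cols: time frames
                 [5, 0, 0],
                 [0, 0, 2],
                 [0, 9, 0]], dtype=float)
series = leg_velocity_series(spec, bins)
```

The resulting 1-D velocity sequence is the kind of input the LSTM classifier would consume.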
|
463
|
Lu X, Sha YH, Li Z, Huang Y, Chen W, Chen D, Shen J, Chen Y, Fung JCH. Development and application of a hybrid long-short term memory - three dimensional variational technique for the improvement of PM2.5 forecasting. THE SCIENCE OF THE TOTAL ENVIRONMENT 2021; 770:144221. [PMID: 33513492 DOI: 10.1016/j.scitotenv.2020.144221] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/18/2020] [Revised: 10/31/2020] [Accepted: 11/23/2020] [Indexed: 06/12/2023]
Abstract
The current state-of-the-art three-dimensional (3D) numerical models for air quality forecasting are restricted by uncertainty from the emission inventory, physical/chemical parameterization, and meteorological prediction. Forecasting performance can be improved by using the 3D variational (3D-VAR) technique to assimilate observation data, which corrects the initial concentration field. However, errors from the prognostic model cause the correction effects of the first hour to be erased, and the bias of the forecast grows relatively fast as the simulation progresses. As an emerging alternative technique, long short-term memory (LSTM) shows promising performance in air quality forecasting for individual stations and outperforms traditional persistent statistical models. In this study, a new method was developed that combines a 3D numerical model with the 3D-VAR and LSTM techniques. This method integrates the advantage of LSTM, namely its high-accuracy forecasting for a single station, with that of the 3D-VAR technique, namely its ability to extend the improvement to the whole simulation domain. This hybrid method can effectively improve PM2.5 forecasting for the next 24 h relative to forecasting with the 3D-VAR technique alone, which only corrects the initial-hour concentration. Results showed that the root-mean-square error and normalized mean error at the validation stations were decreased by 29.3% and 33.3%, respectively. The LSTM-3D-VAR method developed in this study can be further applied in other regions to improve the forecasting of PM2.5 and other ambient pollutants.
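A single 3D-VAR analysis step minimizes J(x) = (x - x_b)' B⁻¹ (x - x_b) + (Hx - y)' R⁻¹ (Hx - y), whose closed-form minimizer is the background plus a gain times the innovation. The tiny two-grid-point example below is an illustrative sketch (the matrices and numbers are invented, not the paper's configuration):

```python
import numpy as np

def var_analysis(xb, y, H, B, R):
    """Closed-form minimizer of the 3D-VAR cost function:
    x_a = x_b + B H' (H B H' + R)^-1 (y - H x_b)."""
    K = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)   # gain matrix
    return xb + K @ (y - H @ xb)

# Two grid points; one observation at the first point only.
xb = np.array([40.0, 60.0])                # background PM2.5 field
H = np.array([[1.0, 0.0]])                 # observation operator
B = np.array([[4.0, 2.0], [2.0, 4.0]])     # background error covariance
R = np.array([[4.0]])                      # observation error covariance
y = np.array([48.0])                       # observed PM2.5
xa = var_analysis(xb, y, H, B, R)
```

Note how the off-diagonal term of B spreads the single-station correction to the unobserved grid point, which is exactly the property the hybrid method exploits to propagate LSTM station forecasts domain-wide.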
|
464
|
Moreira de Lima JM, Ugulino de Araújo FM. Industrial Semi-Supervised Dynamic Soft-Sensor Modeling Approach Based on Deep Relevant Representation Learning. SENSORS 2021; 21:s21103430. [PMID: 34069123 PMCID: PMC8156853 DOI: 10.3390/s21103430] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 02/25/2021] [Revised: 03/30/2021] [Accepted: 04/02/2021] [Indexed: 11/16/2022]
Abstract
Soft sensors based on deep learning have been growing in industrial process applications, inferring hard-to-measure but crucial quality-related variables. However, applications may present strong non-linearity, dynamic behavior, and a lack of labeled data. To deal with these problems, the extraction of relevant features is becoming a field of interest in soft-sensing. A novel deep representative learning soft-sensor modeling approach is proposed based on a stacked autoencoder (SAE), mutual information (MI), and long short-term memory (LSTM). The SAE is trained layer by layer, with MI evaluated between the extracted features and the targeted output to assess the relevance of the learned representation in each layer. This approach highlights relevant information and eliminates irrelevant information from the current layer, so deep output-related representative features are retrieved. In the supervised fine-tuning stage, an LSTM is coupled to the tail of the SAE to address the system's inherent dynamic behavior. Also, a k-fold cross-validation ensemble strategy is applied to enhance the soft sensor's reliability. Two real-world industrial non-linear processes are employed to evaluate the performance of the proposed method. The results show improved prediction performance in comparison to other traditional and state-of-the-art methods: the proposed model achieves RMSE improvements of more than 38.6% and 39.4% for the two analyzed industrial cases.
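The per-layer relevance check hinges on estimating mutual information between each learned feature and the target. A common plug-in (histogram) estimator is sketched below; the bin count and the synthetic feature/target data are assumptions, and the paper's exact MI estimator may differ:

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """Histogram (plug-in) estimate of the mutual information between a
    learned feature x and the target y, usable to rank feature relevance."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = pxy / pxy.sum()                          # joint distribution
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)      # marginals
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / np.outer(px, py)[nz])))

rng = np.random.default_rng(1)
feature = rng.normal(size=2000)
target = feature + 0.1 * rng.normal(size=2000)     # output-related feature
noise = rng.normal(size=2000)                      # irrelevant feature
```

Features scoring low against the target would be pruned from the current SAE layer before the next layer is trained.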
|
465
|
de Bardeci M, Ip CT, Olbrich S. Deep learning applied to electroencephalogram data in mental disorders: A systematic review. Biol Psychol 2021; 162:108117. [PMID: 33991592 DOI: 10.1016/j.biopsycho.2021.108117] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 12/30/2020] [Revised: 04/19/2021] [Accepted: 05/10/2021] [Indexed: 12/12/2022]
Abstract
In recent medical research, tremendous progress has been made in the application of deep learning (DL) techniques. This article systematically reviews how DL techniques have been applied to electroencephalogram (EEG) data for diagnostic and predictive purposes in research on mental disorders. EEG studies of psychiatric diseases based on the ICD-10 or DSM-V classification that used either convolutional neural networks (CNNs) or long short-term memory (LSTM) networks for classification were searched and examined for the quality of the information they contained in three domains: clinical, EEG data processing, and deep learning. Although the description of EEG acquisition and pre-processing was sufficient in most of the studies, we found that many of them lacked a systematic characterization of clinical features. Furthermore, many studies used misguided model selection procedures or flawed testing. It is recommended that future studies of psychiatric disorders using DL improve the quality of clinical data and follow state-of-the-art model selection and testing procedures, so as to achieve a higher research standard and move toward clinical significance.
|
466
|
Dashtipour K, Gogate M, Adeel A, Larijani H, Hussain A. Sentiment Analysis of Persian Movie Reviews Using Deep Learning. ENTROPY (BASEL, SWITZERLAND) 2021; 23:596. [PMID: 34066133 PMCID: PMC8151596 DOI: 10.3390/e23050596] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 03/04/2021] [Revised: 05/03/2021] [Accepted: 05/04/2021] [Indexed: 02/07/2023]
Abstract
Sentiment analysis aims to automatically classify a subject's sentiment (e.g., positive, negative, or neutral) towards a particular aspect, such as a topic, product, movie, or news item. Deep learning has recently emerged as a powerful machine learning technique to tackle the growing demand for accurate sentiment analysis. However, the majority of research efforts are devoted to English only, while information of great importance is also available in other languages. This paper presents a novel, context-aware, deep-learning-driven Persian sentiment analysis approach. Specifically, the proposed deep-learning-driven automated feature-engineering approach classifies Persian movie reviews as having positive or negative sentiment. Two deep learning algorithms, convolutional neural networks (CNNs) and long short-term memory (LSTM), are applied and compared with our previously proposed manual-feature-engineering-driven, SVM-based approach. Simulation results demonstrate that the LSTM obtained better performance than the multilayer perceptron (MLP), autoencoder, support vector machine (SVM), logistic regression, and CNN algorithms.
|
467
|
Siraj A, Lim DY, Tayara H, Chong KT. UbiComb: A Hybrid Deep Learning Model for Predicting Plant-Specific Protein Ubiquitylation Sites. Genes (Basel) 2021; 12:genes12050717. [PMID: 34064731 PMCID: PMC8151217 DOI: 10.3390/genes12050717] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 04/15/2021] [Revised: 05/06/2021] [Accepted: 05/07/2021] [Indexed: 12/11/2022] Open
Abstract
Protein ubiquitylation is an essential post-translational modification process that plays a critical role in a wide range of biological functions, and even a degenerative role in certain diseases; it is consequently used as a promising target for the treatment of various diseases. Owing to the significant role of protein ubiquitylation, these sites can be identified by enzymatic approaches, mass spectrometry analysis, and combinations of multidimensional liquid chromatography and tandem mass spectrometry. However, these large-scale experimental screening techniques are time-consuming, expensive, and laborious. To overcome the drawbacks of experimental methods, machine learning and deep-learning-based predictors have been considered for prediction in a timely and cost-effective manner. Several computational predictors have been published in the literature across species; however, predictors are species-specific because of the unclear patterns across different species. In this study, we propose a novel approach for predicting plant ubiquitylation sites using a hybrid deep learning model that utilizes a convolutional neural network and long short-term memory. The proposed method uses the actual protein sequence and physicochemical properties as inputs to the model and provides more robust predictions. The proposed predictor achieved the best results, with accuracy values of 80% and 81% and F-scores of 79% and 82% on 10-fold cross-validation and an independent dataset, respectively. Moreover, we compared testing on the independent dataset with popular ubiquitylation predictors; the results demonstrate that our model significantly outperforms the other methods.
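Feeding "the actual protein sequence and physicochemical properties" to a CNN + LSTM typically starts with a per-residue encoding. The sketch below one-hot encodes a peptide window and appends a hydrophobicity channel; the abbreviated Kyte-Doolittle-style values and the single-property choice are illustrative assumptions, not UbiComb's actual feature set:

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Abbreviated, illustrative hydropathy values (Kyte-Doolittle scale):
HYDROPATHY = {"A": 1.8, "C": 2.5, "L": 3.8, "K": -3.9, "S": -0.8}

def encode_sequence(seq):
    """One-hot encode a peptide window (20 channels) and append a
    per-residue physicochemical channel, for a CNN + LSTM model."""
    onehot = np.zeros((len(seq), len(AMINO_ACIDS)))
    for i, aa in enumerate(seq):
        onehot[i, AMINO_ACIDS.index(aa)] = 1.0
    phys = np.array([[HYDROPATHY.get(aa, 0.0)] for aa in seq])
    return np.hstack([onehot, phys])

x = encode_sequence("ALKS")   # toy window centred on a candidate site
```

Windows around each candidate lysine, encoded this way, would form the model's input tensor.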
|
468
|
Guleryuz D. Forecasting outbreak of COVID-19 in Turkey; Comparison of Box-Jenkins, Brown's exponential smoothing and long short-term memory models. Process Safety and Environmental Protection 2021; 149:927-935. [PMID: 33776248 PMCID: PMC7983456 DOI: 10.1016/j.psep.2021.03.032] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 06/03/2020] [Accepted: 03/15/2021] [Indexed: 05/27/2023]
Abstract
The new coronavirus disease (COVID-19), which first appeared in China in December 2019, has spread throughout the world. Because the epidemic started later in Turkey than in other European countries, Turkey has the lowest number of deaths according to current data. Outbreak management in COVID-19 is of great importance for public safety and public health. For this reason, prediction models can inform precautionary warnings to control the spread of the disease. Therefore, this study aims to develop a forecasting model based on statistical data for Turkey. Box-Jenkins (ARIMA) methods, Brown's exponential smoothing model, and RNN-LSTM are employed. ARIMA models with the lowest AIC values (12.0342, -2.51411, 12.0253, 3.67729, -4.24405, and 3.66077) were selected as the best fit for the number of total cases, the growth rate of total cases, the number of new cases, the number of total deaths, the growth rate of total deaths, and the number of new deaths, respectively. The forecast values of each indicator are stable over time, and the number of cases in Turkey is not expected to show an increasing trend in the near future. In addition, the pandemic is expected to reach a steady state, with no increase in mortality rates between 17 and 31 May. ARIMA models can be used in fresh outbreak situations to support health and safety. It is vital to make quick and accurate decisions on precautions for epidemic preparedness and management, so that corrective and preventive actions can be updated in light of the obtained values.
|
469
|
Tăuţan AM, Rossi AC, de Francisco R, Ionescu B. Dimensionality reduction for EEG-based sleep stage detection: comparison of autoencoders, principal component analysis and factor analysis. Biomed Eng-Biomed Te 2021; 66:125-136. [PMID: 33048831 DOI: 10.1515/bmt-2020-0139] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/25/2020] [Accepted: 08/19/2020] [Indexed: 11/15/2022]
Abstract
Methods developed for automatic sleep stage detection make use of large amounts of data in the form of polysomnographic (PSG) recordings to build predictive models. In this study, we investigate the effect of several dimensionality reduction techniques, i.e., principal component analysis (PCA), factor analysis (FA), and autoencoders (AE), on common classifiers, e.g., random forests (RF), multilayer perceptrons (MLP), and long short-term memory (LSTM) networks, for automated sleep stage detection. Experimental testing is carried out on the MGH Dataset provided in the "You Snooze, You Win: The PhysioNet/Computing in Cardiology Challenge 2018". The signals used as input are the six available electroencephalographic (EEG) channels and combinations with the other PSG signals provided: ECG (electrocardiogram), EMG (electromyogram), and respiration-based signals (respiratory efforts and airflow). We observe that a similar or improved accuracy is obtained in most cases when using any of the dimensionality reduction techniques, which is a promising result, as it makes it possible to reduce the computational load while maintaining, and in some cases improving, the accuracy of automated sleep stage detection. In our study, using autoencoders for dimensionality reduction maintains the performance of the model, while using PCA and FA in most cases improves the accuracy of the models.
|
470
|
Liu X, Richardson AG. Edge deep learning for neural implants: a case study of seizure detection and prediction. J Neural Eng 2021; 18. [PMID: 33794507 DOI: 10.1088/1741-2552/abf473] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 12/01/2020] [Accepted: 04/01/2021] [Indexed: 11/12/2022]
Abstract
Objective. Implanted devices providing real-time neural activity classification and control are increasingly used to treat neurological disorders, such as epilepsy and Parkinson's disease. Classification performance is critical to identifying brain states appropriate for the therapeutic action (e.g. neural stimulation). However, advanced algorithms that have shown promise in offline studies, in particular deep learning (DL) methods, have not been deployed on resource-constrained neural implants. Here, we designed and optimized three DL models for edge deployment and evaluated their inference performance in a case study of seizure detection. Approach. A deep neural network (DNN), a convolutional neural network (CNN), and a long short-term memory (LSTM) network were designed and trained with TensorFlow to classify ictal, preictal, and interictal phases from the CHB-MIT scalp EEG database. A sliding-window-based weighted majority voting algorithm was developed to detect seizure events based on each DL model's classification results. After iterative model compression and coefficient quantization, the algorithms were deployed on a general-purpose, off-the-shelf microcontroller for real-time testing. Inference sensitivity, false positive rate (FPR), execution time, memory size, and power consumption were quantified. Main results. For seizure event detection, the sensitivity and FPR for the DNN, CNN, and LSTM models were 87.36%/0.169 h-1, 96.70%/0.102 h-1, and 97.61%/0.071 h-1, respectively. Predicting seizures for early warnings was also feasible. The LSTM model achieved the best overall performance at the expense of the highest power consumption. The DNN model achieved the shortest execution time. The CNN model offered balanced performance and power with the minimum memory requirement. The implemented model compression and quantization achieved significant savings in power and memory with an accuracy degradation of less than 0.5%. Significance. Inference with embedded DL models achieved performance comparable to many prior implementations that had no time or computational resource limitations. Generic microcontrollers can provide the required memory and computational resources, while model designs can be migrated to application-specific integrated circuits for further optimization and power saving. The results suggest that edge DL inference is a feasible option for future neural implants to improve classification performance and therapeutic outcomes.
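The sliding-window weighted majority voting step lends itself to a short illustration. This is a minimal sketch, not the authors' implementation: the window length, the uniform weights, and the class labels are assumptions made for the example.

```python
from collections import deque

def weighted_majority_vote(labels, weights):
    """Return the label with the highest total weight in one window."""
    scores = {}
    for label, w in zip(labels, weights):
        scores[label] = scores.get(label, 0.0) + w
    return max(scores, key=scores.get)

def detect_events(frame_labels, window=5, weights=None, target="ictal"):
    """Slide a window over per-frame classifications and flag frame
    indices whose windowed weighted vote equals the target class."""
    if weights is None:
        weights = [1.0] * window  # uniform weights as a default assumption
    buf = deque(maxlen=window)
    events = []
    for i, label in enumerate(frame_labels):
        buf.append(label)
        if len(buf) == window and weighted_majority_vote(list(buf), weights) == target:
            events.append(i)
    return events
```

With uniform weights this reduces to a plain majority filter; non-uniform weights would let more recent frames, or more confident models, dominate the vote.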
|
471
|
Younis MC. Evaluation of deep learning approaches for identification of different corona-virus species and time series prediction. Comput Med Imaging Graph 2021; 90:101921. [PMID: 33930734 PMCID: PMC8062905 DOI: 10.1016/j.compmedimag.2021.101921] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 05/27/2020] [Revised: 01/28/2021] [Accepted: 04/05/2021] [Indexed: 01/01/2023]
Abstract
The novel corona-virus (nCOV) has been declared a pandemic that started in the city of Wuhan, China. This deadly virus is infecting people rapidly and has reached 4.93 million people across the world, with 227,000 infected in Italy alone. Cases of nCOV are quickly increasing, whereas the number of nCOV test kits available in hospitals is limited. Under these conditions, an automated system for the classification of patients into nCOV-positive and -negative cases is a much-needed tool against the pandemic, helping in the selective use of the limited number of test kits. In this research, Convolutional Neural Network-based models (one-block VGG, two-block VGG, three-block VGG, four-block VGG, LeNet-5, AlexNet, and ResNet-50) have been employed for the detection of corona-virus and SARS/MERS-infected patients, distinguishing them from healthy subjects using lung X-ray scans, which has proven to be a challenging task due to the overlapping characteristics of different corona-virus types. Furthermore, an LSTM model has been used for time series forecasting of nCOV cases in Italy over the following 10 days. The evaluation results proved that the VGG1 model distinguishes the three classes at an accuracy of almost 91%, compared to the other models, whereas the LSTM-based approach predicts the number of nCOV cases with 99% accuracy.
|
472
|
Harfiya LN, Chang CC, Li YH. Continuous Blood Pressure Estimation Using Exclusively Photoplethysmography by LSTM-Based Signal-to-Signal Translation. Sensors (Basel) 2021; 21:2952. [PMID: 33922447 PMCID: PMC8122812 DOI: 10.3390/s21092952] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 03/10/2021] [Revised: 04/12/2021] [Accepted: 04/19/2021] [Indexed: 11/16/2022]
Abstract
Monitoring a continuous BP signal is an important issue, because blood pressure (BP) varies over days, minutes, or even seconds in short-term cases. Most photoplethysmography (PPG)-based BP estimation methods are susceptible to noise and provide only systolic blood pressure (SBP) and diastolic blood pressure (DBP) predictions. Here, instead of estimating a discrete value, we take a different perspective and estimate the whole BP waveform. We propose a novel deep learning model that learns to perform signal-to-signal translation from PPG to arterial blood pressure (ABP). Using only a raw PPG signal as the input, the output of the proposed model is a continuous ABP signal. Based on the translated ABP signal, we extract the SBP and DBP values to ease the comparative evaluation. Our predictions achieve an average absolute error under 5 mmHg, with 70% confidence for SBP and 95% confidence for DBP, without complex feature engineering. These results fulfill the standards of the Association for the Advancement of Medical Instrumentation (AAMI) and the British Hypertension Society (BHS) with grade A. From these results, we believe that our model is applicable and can potentially boost the accuracy of effective signal-to-signal continuous blood pressure estimation.
|
473
|
Chen X, Huang R, Li X, Xiao L, Zhou M, Zhang L. A Novel User Emotional Interaction Design Model Using Long and Short-Term Memory Networks and Deep Learning. Front Psychol 2021; 12:674853. [PMID: 33959083 PMCID: PMC8093774 DOI: 10.3389/fpsyg.2021.674853] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/02/2021] [Accepted: 03/18/2021] [Indexed: 01/17/2023] Open
Abstract
Emotional design is an important development trend in interaction design. Emotional design in products plays a key role in enhancing user experience and inducing emotional resonance in users. In recent years, based on the user's emotional experience, strengthening product emotional design has become a new direction for most designers seeking to improve their design thinking. In emotional interaction design, the machine needs to capture the user's key information in real time, recognize the user's emotional state, and use a variety of cues to determine the appropriate user model. Against this background, this research uses a deep learning mechanism for more accurate and effective emotion recognition, thereby optimizing the design of the interactive system and improving the user experience. First, this research discusses how user characteristics such as speech, facial expression, video, and heartbeat can help machines recognize human emotions more accurately. Through the analysis of these characteristics, speech is selected as the experimental material. Second, a speech-based emotion recognition method is proposed. The mel-frequency cepstral coefficients (MFCC) of the speech signal are used as the input to an improved long short-term memory network (ILSTM). To ensure the integrity of the information and the accuracy of the output at the next moment, ILSTM adds peephole connections to the forget gate and input gate of the LSTM and adds the unit state as input data to the threshold layer. The emotional features obtained by ILSTM are input into an attention layer, and a self-attention mechanism is used to calculate the weight of each frame of the speech signal. The speech features with higher weights are used to distinguish different emotions and complete the emotion recognition of the speech signal. Experiments on the EMO-DB and CASIA datasets verify the effectiveness of the model for emotion recognition. Finally, the feasibility of emotional interaction system design is discussed.
|
474
|
Optical Gas Sensing with Liquid Crystal Droplets and Convolutional Neural Networks. Sensors (Basel) 2021; 21:2854. [PMID: 33919620 PMCID: PMC8073403 DOI: 10.3390/s21082854] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/18/2021] [Revised: 04/13/2021] [Accepted: 04/14/2021] [Indexed: 01/14/2023]
Abstract
Liquid crystal (LC)-based materials are promising platforms for developing rapid, miniaturised and low-cost gas sensor devices. In hybrid gel films containing LC droplets, characteristic optical texture variations are observed due to orientational transitions of LC molecules in the presence of distinct volatile organic compounds (VOC). Here, we investigate the use of deep convolutional neural networks (CNN) as pattern recognition systems to analyse optical texture dynamics in LC droplets exposed to a set of different VOCs. LC droplet responses to VOCs were video recorded under polarised optical microscopy (POM). CNNs were then used to extract features from the responses and, in separate tasks, to recognise and quantify the vapours exposed to the films. The impact of droplet diameter on the results was also analysed. With our classification models, we show that a single individual droplet can recognise 11 VOCs with small structural and functional differences (F1-score above 93%). The optical texture variation pattern of a droplet also reflects VOC concentration changes, as suggested by applying a regression model to acetone at 0.9-4.0% (v/v) (mean absolute errors below 0.25% (v/v)). The CNN-based methodology is thus a promising approach for VOC sensing using responses from individual LC droplets.
|
475
|
Sikandar T, Rabbi MF, Ghazali KH, Altwijri O, Alqahtani M, Almijalli M, Altayyar S, Ahamed NU. Using a Deep Learning Method and Data from Two-Dimensional (2D) Marker-Less Video-Based Images for Walking Speed Classification. Sensors (Basel) 2021; 21:2836. [PMID: 33920617 PMCID: PMC8072769 DOI: 10.3390/s21082836] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/26/2021] [Revised: 04/10/2021] [Accepted: 04/13/2021] [Indexed: 01/09/2023]
Abstract
Human body measurement data related to walking can characterize functional movement and thereby become an important tool for health assessment. Single-camera-captured two-dimensional (2D) image sequences of marker-less walking individuals might be a simple approach for estimating human body measurement data which could be used in walking speed-related health assessment. Conventional body measurement data of 2D images are dependent on body-worn garments (used as segmental markers) and are susceptible to changes in the distance between the participant and camera in indoor and outdoor settings. In this study, we propose five ratio-based body measurement data that can be extracted from 2D images and can be used to classify three walking speeds (i.e., slow, normal, and fast) using a deep learning-based bidirectional long short-term memory classification model. The results showed that average classification accuracies of 88.08% and 79.18% could be achieved in indoor and outdoor environments, respectively. Additionally, the proposed ratio-based body measurement data are independent of body-worn garments and not susceptible to changes in the distance between the walking individual and camera. As a simple but efficient technique, the proposed walking speed classification has great potential to be employed in clinics and aged care homes.
|
476
|
Ullah W, Ullah A, Hussain T, Khan ZA, Baik SW. An Efficient Anomaly Recognition Framework Using an Attention Residual LSTM in Surveillance Videos. Sensors (Basel) 2021; 21:2811. [PMID: 33923712 PMCID: PMC8072779 DOI: 10.3390/s21082811] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 02/25/2021] [Revised: 04/08/2021] [Accepted: 04/12/2021] [Indexed: 11/16/2022]
Abstract
Video anomaly recognition in smart cities is an important computer vision task that plays a vital role in smart surveillance and public safety, but it is challenging because anomalies are diverse, complex, and infrequent in real-time surveillance environments. Various deep learning models require significant amounts of training data, lack generalization ability, and have high time complexity. To overcome these problems, in the current work we present an efficient lightweight convolutional neural network (CNN)-based anomaly recognition framework that is functional in a surveillance environment with reduced time complexity. We extract spatial CNN features from a series of video frames and feed them to the proposed residual attention-based long short-term memory (LSTM) network, which can precisely recognize anomalous activity in surveillance videos. The representative CNN features combined with the residual-block concept in the LSTM for sequence learning prove to be effective for anomaly detection and recognition, validating our model's effective usage in smart city video surveillance. Extensive experiments on the real-world benchmark UCF-Crime dataset validate the effectiveness of the proposed model within complex surveillance environments and demonstrate that it outperforms state-of-the-art models with a 1.77%, 0.76%, and 8.62% increase in accuracy on the UCF-Crime, UMN, and Avenue datasets, respectively.
|
477
|
Mehta P, Pandya S, Kotecha K. Harvesting social media sentiment analysis to enhance stock market prediction using deep learning. PeerJ Comput Sci 2021; 7:e476. [PMID: 33954250 PMCID: PMC8053016 DOI: 10.7717/peerj-cs.476] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 11/23/2020] [Accepted: 03/16/2021] [Indexed: 05/14/2023]
Abstract
Information gathering has become an integral part of assessing people's behaviors and actions. The Internet is used as an online learning site for sharing and exchanging ideas. People can actively give their reviews and recommendations for a variety of products and services using popular social sites and personal blogs. Social networking sites, including Twitter, Facebook, and Google+, are examples of sites used to share opinions. The stock market (SM) is an essential area of the economy and plays a significant role in trade and industry development. Predicting SM movements is a well-known area of interest to researchers. Social networking closely reflects the public's views of current affairs. Financial news stories are thought to have an impact on the return of stock trend prices, and many data mining techniques are used to address fluctuations in the SM. Machine learning can provide a more accurate and robust approach to handling SM-related predictions. We sought to identify how movements in a company's stock prices correlate with the expressed opinions (sentiments) of the public about that company. We designed and implemented a stock price prediction tool that considers public sentiment alongside other parameters. The proposed algorithm considers public sentiment, opinions, news, and historical stock prices to forecast future stock prices. Our experiments were performed using machine-learning and deep-learning methods, including Support Vector Machine, the MNB classifier, linear regression, Naïve Bayes, and Long Short-Term Memory. Our results validate the success of the proposed methodology.
|
478
|
Chowdhury AA, Hasan KT, Hoque KKS. Analysis and Prediction of COVID-19 Pandemic in Bangladesh by Using ANFIS and LSTM Network. Cognit Comput 2021; 13:761-770. [PMID: 33868501 PMCID: PMC8041393 DOI: 10.1007/s12559-021-09859-0] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 07/22/2020] [Accepted: 03/30/2021] [Indexed: 02/05/2023]
Abstract
The dangerously contagious virus named "COVID-19" has struck the world hard and has locked down billions of people in their homes to stop its further spread. Researchers and scientists across various fields are continually developing vaccines and prevention methods to relieve the world from this challenging situation. However, a reliable prediction of the epidemic may help control this contagious disease until a cure is available. Machine learning techniques are among the leading approaches for predicting this outbreak's future trend and behavior. Our research is focused on finding a suitable machine learning algorithm that can predict COVID-19 daily new cases with higher accuracy. This research used the adaptive neuro-fuzzy inference system (ANFIS) and the long short-term memory (LSTM) network to foresee newly infected cases in Bangladesh. We compared the results of both experiments, and LSTM showed more satisfactory results. Upon studying and testing several models, we show that LSTM works better on a scenario-based model for Bangladesh, with a mean absolute percentage error (MAPE) of 4.51, root-mean-square error (RMSE) of 6.55, and correlation coefficient of 0.75. This study is expected to shed light on COVID-19 prediction models for researchers working with machine learning techniques and to help avoid proven failures, especially for small, imprecise datasets.
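The MAPE and RMSE figures quoted above follow the standard definitions, which can be sketched in a few lines (the function names are ours, for illustration):

```python
import math

def rmse(actual, predicted):
    """Root-mean-square error: sqrt of the mean squared residual."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mape(actual, predicted):
    """Mean absolute percentage error (in %); assumes no zero actuals."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)
```

MAPE is scale-free, which is why it is often reported alongside RMSE when comparing forecasts across series of different magnitudes.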
|
479
|
To QG, To KG, Huynh VAN, Nguyen NTQ, Ngo DTN, Alley SJ, Tran ANQ, Tran ANP, Pham NTT, Bui TX, Vandelanotte C. Applying Machine Learning to Identify Anti-Vaccination Tweets during the COVID-19 Pandemic. Int J Environ Res Public Health 2021; 18:4069. [PMID: 33921539 PMCID: PMC8069687 DOI: 10.3390/ijerph18084069] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 03/21/2021] [Revised: 04/05/2021] [Accepted: 04/08/2021] [Indexed: 12/16/2022]
Abstract
Anti-vaccination attitudes have been an issue since the development of the first vaccines. The increasing use of social media as a source of health information may contribute to vaccine hesitancy due to anti-vaccination content widely available on social media, including Twitter. Being able to identify anti-vaccination tweets could provide useful information for formulating strategies to reduce anti-vaccination sentiments among different groups. This study aims to evaluate the performance of different natural language processing models in identifying anti-vaccination tweets published during the COVID-19 pandemic. We compared the performance of bidirectional encoder representations from transformers (BERT) and bidirectional long short-term memory networks with pre-trained GLoVe embeddings (Bi-LSTM) with classic machine learning methods including support vector machine (SVM) and naïve Bayes (NB). On the test set, the BERT model performed at: accuracy = 91.6%, precision = 93.4%, recall = 97.6%, F1 score = 95.5%, and AUC = 84.7%. The Bi-LSTM model performed at: accuracy = 89.8%, precision = 44.0%, recall = 47.2%, F1 score = 45.5%, and AUC = 85.8%. SVM with a linear kernel performed at: accuracy = 92.3%, precision = 19.5%, recall = 78.6%, F1 score = 31.2%, and AUC = 85.6%. Complement NB performed at: accuracy = 88.8%, precision = 23.0%, recall = 32.8%, F1 score = 27.1%, and AUC = 62.7%. In conclusion, the BERT model outperformed the Bi-LSTM, SVM, and NB models in this task. Moreover, the BERT model achieved excellent performance and can be used to identify anti-vaccination tweets in future studies.
|
480
|
Time-aware deep neural networks for needle tip localization in 2D ultrasound. Int J Comput Assist Radiol Surg 2021; 16:819-827. [PMID: 33840037 DOI: 10.1007/s11548-021-02361-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2021] [Accepted: 03/25/2021] [Indexed: 10/21/2022]
Abstract
PURPOSE Accurate placement of the needle is critical in interventions like biopsies and regional anesthesia, during which incorrect needle insertion can lead to procedure failure and complications. Therefore, ultrasound guidance is widely used to improve needle placement accuracy. However, at steep and deep insertions, the visibility of the needle is lost. Computational methods for automatic needle tip localization could improve the clinical success rate in these scenarios. METHODS We propose a novel algorithm for needle tip localization during challenging ultrasound-guided insertions when the shaft may be invisible, and the tip has a low intensity. There are two key steps in our approach. First, we enhance the needle tip features in consecutive ultrasound frames using a detection scheme which recognizes subtle intensity variations caused by needle tip movement. We then employ a hybrid deep neural network comprising a convolutional neural network and long short-term memory recurrent units. The input to the network is a consecutive plurality of fused enhanced frames and the corresponding original B-mode frames, and this spatiotemporal information is used to predict the needle tip location. RESULTS We evaluate our approach on an ex vivo dataset collected with in-plane and out-of-plane insertion of 17G and 22G needles in bovine, porcine, and chicken tissue, acquired using two different ultrasound systems. We train the model with 5000 frames from 42 video sequences. Evaluation on 600 frames from 30 sequences yields a tip localization error of [Formula: see text] mm and an overall inference time of 0.064 s (15 fps). Comparison against prior art on challenging datasets reveals a 30% improvement in tip localization accuracy. CONCLUSION The proposed method automatically models temporal dynamics associated with needle tip motion and is more accurate than state-of-the-art methods. Therefore, it has the potential to improve needle tip localization in challenging ultrasound-guided interventions.
|
481
|
Bhimala KR, Patra GK, Mopuri R, Mutheneni SR. Prediction of COVID-19 cases using the weather integrated deep learning approach for India. Transbound Emerg Dis 2021; 69:1349-1363. [PMID: 33837675 PMCID: PMC8250893 DOI: 10.1111/tbed.14102] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 11/10/2020] [Revised: 03/31/2021] [Accepted: 04/04/2021] [Indexed: 12/30/2022]
Abstract
Advanced and accurate forecasting of COVID-19 cases plays a crucial role in planning and supplying resources effectively. Artificial Intelligence (AI) techniques have proved their capability in forecasting non-linear time series. In the present study, the relationship between weather factors and COVID-19 cases was assessed, and a forecasting model was developed using long short-term memory (LSTM), a deep learning model. The study found that, across various geographic locations of India, specific humidity and minimum temperature correlate positively with cases, whereas maximum temperature correlates negatively. The weather data and COVID-19 confirmed case data (1 April to 30 June 2020) were used to optimize univariate and multivariate LSTM time series forecast models. The optimized models were used to forecast the daily COVID-19 cases for the period 1 July 2020 to 31 July 2020 with 1 to 14 days of lead time. The results showed that the univariate LSTM model was reasonably good for the short-term (1-day lead) forecast of COVID-19 cases (relative error <20%). Moreover, the multivariate LSTM model improved the medium-range forecast skill (1-7 days lead) after including the weather factors. The study observed that specific humidity played a crucial role in improving the forecast skill, mainly in the west and northwest regions of India. Similarly, temperature played a significant role in model enhancement in the southern and eastern regions of India.
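Univariate LSTM forecasting of the kind described above starts by recasting the series as supervised (lag-window, lead-target) pairs. A minimal sketch, with illustrative window and lead values rather than the study's settings:

```python
def make_supervised(series, window=7, lead=1):
    """Turn a univariate series into (input window, target) pairs,
    where the target is the value `lead` steps past the window end."""
    pairs = []
    for i in range(len(series) - window - lead + 1):
        x = series[i:i + window]          # lagged inputs
        y = series[i + window + lead - 1]  # value `lead` steps ahead
        pairs.append((x, y))
    return pairs
```

For a multivariate model, each window element would be a vector (cases plus weather factors) instead of a scalar, but the windowing logic is the same.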
|
482
|
Asgari Mehrabadi M, Dutt N, Rahmani AM. The Causality Inference of Public Interest in Restaurants and Bars on Daily COVID-19 Cases in the United States: Google Trends Analysis. JMIR Public Health Surveill 2021; 7:e22880. [PMID: 33690143 PMCID: PMC8025919 DOI: 10.2196/22880] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/25/2020] [Revised: 12/07/2020] [Accepted: 03/09/2021] [Indexed: 12/24/2022] Open
Abstract
BACKGROUND The COVID-19 pandemic has affected virtually every region in the world. At the time of this study, the number of daily new cases in the United States was greater than that in any other country, and the trend was increasing in most states. Google Trends provides data regarding public interest in various topics during different periods. Analyzing these trends using data mining methods may provide useful insights and observations regarding the COVID-19 outbreak. OBJECTIVE The objective of this study is to consider the predictive ability of different search terms not directly related to COVID-19 with regard to the increase of daily cases in the United States. In particular, we are concerned with searches related to dine-in restaurants and bars. Data were obtained from the Google Trends application programming interface and the COVID-19 Tracking Project. METHODS To test the causation of one time series on another, we used the Granger causality test. We considered the causation of two different search query trends related to dine-in restaurants and bars on daily positive cases in the US states and territories with the 10 highest and 10 lowest numbers of daily new cases of COVID-19. In addition, we used Pearson correlations to measure the linear relationships between different trends. RESULTS Our results showed that for states and territories with higher numbers of daily cases, the historical trends in search queries related to bars and restaurants, which mainly occurred after reopening, significantly affected the number of daily new cases on average. California, for example, showed the most searches for restaurants on June 7, 2020; this affected the number of new cases within two weeks after the peak, with a P value of .004 for the Granger causality test. 
CONCLUSIONS Although a limited number of search queries were considered, Google search trends for restaurants and bars showed a significant effect on daily new cases in US states and territories with higher numbers of daily new cases. We showed that these influential search trends can be used to provide additional information for prediction tasks regarding new cases in each region. These predictions can help health care leaders manage and control the impact of the COVID-19 outbreak on society and prepare for its outcomes.
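The Granger causality test used in this study can be illustrated with a minimal lag-regression F-test: fit an autoregressive model of daily cases with and without lagged search-trend terms and compare residual sums of squares. This is a simplified sketch (the function and variable names are ours, not the authors'), not the study's actual pipeline:

```python
import numpy as np

def granger_f(y, x, lags=2):
    """F statistic: do lagged values of x improve an AR(lags) model of y?"""
    t = y[lags:]  # target: y_t
    # Lagged regressor matrices, row i holds y[i+lags-1..i] / x[i+lags-1..i]
    Yl = np.column_stack([y[lags - k: len(y) - k] for k in range(1, lags + 1)])
    Xl = np.column_stack([x[lags - k: len(x) - k] for k in range(1, lags + 1)])
    ones = np.ones((len(t), 1))

    def rss(design):
        beta, *_ = np.linalg.lstsq(design, t, rcond=None)
        resid = t - design @ beta
        return resid @ resid

    rss_r = rss(np.hstack([ones, Yl]))        # restricted: y history only
    rss_u = rss(np.hstack([ones, Yl, Xl]))    # unrestricted: plus x history
    df1 = lags
    df2 = len(t) - (2 * lags + 1)
    return ((rss_r - rss_u) / df1) / (rss_u / df2)
```

A large F (compared against the F(df1, df2) distribution) suggests the search-trend series Granger-causes the case series, in the limited predictive sense the test measures.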
|
483
|
Ghany KKA, Zawbaa HM, Sabri HM. COVID-19 prediction using LSTM algorithm: GCC case study. INFORMATICS IN MEDICINE UNLOCKED 2021; 23:100566. [PMID: 33842686 PMCID: PMC8021451 DOI: 10.1016/j.imu.2021.100566] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/19/2020] [Revised: 03/31/2021] [Accepted: 03/31/2021] [Indexed: 12/20/2022] Open
Abstract
Coronavirus disease 2019 (COVID-19) is the black swan of 2020. Still, the human response to restrain the virus is also creating massive ripples through different systems, such as health, economy, education, and tourism. This paper focuses on applying Artificial Intelligence (AI) algorithms to predict COVID-19 propagation using the available time-series data and on studying the effect of quality of life, the number of tests performed, and citizens' awareness of the virus in the Gulf Cooperation Council (GCC) countries in the Gulf area. We therefore focused on cases in the Kingdom of Saudi Arabia (KSA), the United Arab Emirates (UAE), Kuwait, Bahrain, Oman, and Qatar. To this end, we used the real time-series datasets collected by the Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE). The timeline of our data runs from January 22, 2020 to January 25, 2021. We implemented the proposed model, based on Long Short-Term Memory (LSTM) with ten hidden units (neurons), to predict COVID-19 confirmed and death cases. From the experimental results, we found that KSA and Qatar will take the longest to recover from the COVID-19 virus, and that the situation will be controllable in the second half of March 2021 in the UAE, Kuwait, Oman, and Bahrain. We also calculated the root mean square error (RMSE) between the actual and predicted values for each country's confirmed and death cases; the best values for confirmed and death cases are 320.79 and 1.84, respectively, both for Bahrain, while the worst values are 1768.35 and 21.78, respectively, both for KSA. In addition, we calculated the mean absolute relative error (MARE) between the actual and predicted values for each country's confirmed and death cases; the best values for confirmed and death cases are 37.76 and 0.30, for Kuwait and Qatar respectively, while the worst values are 71.45 and 1.33, respectively, both for KSA.
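The RMSE and relative-error metrics reported above are straightforward to compute. A small sketch; the MARE definition shown (mean of |actual - predicted| / |actual|) is one common convention and is assumed here, not taken from the paper:

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean square error between actual and predicted series."""
    a, p = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.sqrt(np.mean((a - p) ** 2)))

def mare(actual, predicted):
    """Mean absolute relative error (one common definition, assumed here)."""
    a, p = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.mean(np.abs(a - p) / np.abs(a)))
```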
|
484
|
Maheshwari S, Agarwal A, Shukla A, Tiwari R. A comprehensive evaluation for the prediction of mortality in intensive care units with LSTM networks: patients with cardiovascular disease. ACTA ACUST UNITED AC 2021; 65:435-446. [PMID: 31846424 DOI: 10.1515/bmt-2018-0206] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/25/2018] [Accepted: 10/25/2019] [Indexed: 11/15/2022]
Abstract
Intensive care units (ICUs) are responsible for generating a wealth of useful data in the form of electronic health records. We aimed to build a mortality prediction model on the Medical Information Mart for Intensive Care (MIMIC-III) database and to assess whether deep learning techniques such as long short-term memory (LSTM) can effectively exploit the temporal relations among clinical variables. The models were built on the clinical-variable dynamics of the first 48 h of ICU admission for 12,550 records from the MIMIC-III database. A total of 36 variables, including 33 time-series variables and three static variables, were used for the prediction. We present the application of LSTM and LSTM-attention (LSTM-AT) models for mortality prediction with such a large clinical-variable dataset. For training and validation purposes, we used International Classification of Diseases, 9th edition (ICD-9) codes to extract patients with cardiovascular disease and with infectious and parasitic diseases, respectively. The effectiveness of the LSTM model over non-recurrent baseline models such as naïve Bayes, logistic regression (LR), support vector machines, and multilayer perceptrons (MLPs) is demonstrated by state-of-the-art results (area under the curve [AUC], 0.852). Next, by providing attention at each time stamp, we developed a model, LSTM-AT, which exhibits even better performance (AUC, 0.876).
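The attention mechanism that distinguishes LSTM-AT from the plain LSTM can be sketched as a softmax-weighted sum over the hidden states of the 48 h sequence. A minimal illustration (names and shapes are ours, not the paper's):

```python
import numpy as np

def temporal_attention(H, w):
    """H: (T, d) hidden states from an LSTM; w: (d,) learned scoring vector.
    Returns per-timestep attention weights and the context vector."""
    scores = H @ w                      # one scalar score per timestep
    e = np.exp(scores - scores.max())   # numerically stable softmax
    alpha = e / e.sum()                 # weights sum to 1
    context = alpha @ H                 # weighted sum fed to the classifier
    return alpha, context
```

In a full model, `w` (or a small scoring network) is learned jointly with the LSTM, so timestamps carrying mortality-relevant signal receive higher weight.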
|
485
|
Morales A, Costela FM, Woods RL. Saccade Landing Point Prediction Based on Fine-Grained Learning Method. IEEE ACCESS : PRACTICAL INNOVATIONS, OPEN SOLUTIONS 2021; 9:52474-52484. [PMID: 33981520 PMCID: PMC8112574 DOI: 10.1109/access.2021.3070511] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]
Abstract
The landing point of a saccade defines the new fixation region, the new region of interest. We asked whether it is possible to predict the saccade landing point early in this very fast eye movement. This work proposes a new algorithm, based on LSTM networks and a fine-grained loss function, for saccade landing point prediction in real-world scenarios. Predicting the landing point is a critical milestone toward reducing the problems caused by display-update latency in gaze-contingent systems, which make real-time changes to the display based on eye tracking. Saccadic eye movements are among the fastest human neuro-motor activities, with angular velocities of up to 1,000°/s. We present a comprehensive analysis of the performance of our method using a database of almost 220,000 saccades from 75 participants, captured during natural viewing of videos, and include a comparison with state-of-the-art saccade landing point prediction algorithms. Our proposed method outperformed existing approaches, with error reductions of up to 50%. Finally, we analyzed factors that affect prediction errors, including duration, length, age, and intrinsic user characteristics.
|
486
|
Wang L, Zhong X, Wang S, Zhang H, Liu Y. A novel end-to-end method to predict RNA secondary structure profile based on bidirectional LSTM and residual neural network. BMC Bioinformatics 2021; 22:169. [PMID: 33789581 PMCID: PMC8011163 DOI: 10.1186/s12859-021-04102-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/17/2020] [Accepted: 03/24/2021] [Indexed: 11/26/2022] Open
Abstract
Background Studies have shown that RNA secondary structure, a planar structure formed by paired bases, plays diverse vital roles in fundamental life activities and complex diseases. The RNA secondary structure profile records whether each base is paired with others; hence, accurate prediction of the secondary structure profile can help to deduce the secondary structure and binding sites of an RNA. The profile can be obtained through biological experiments or through calculation methods. The experimental route involves two approaches: chemical reagents and biological crystallization. The chemical reagent approach can generate a large amount of data, but it is costly and noisy, and limited sequencing coverage makes it difficult to obtain results for every base of an RNA. By contrast, biological crystallization yields accurate results, but requires heavy experimental work at high cost. On the other hand, the main calculation method is CROSS, a three-layer fully connected neural network; however, its simple network structure cannot fully learn the features of the RNA secondary structure profile, leading to low performance. Results In this paper, a novel end-to-end method, named RPRes, is proposed to predict the RNA secondary structure profile based on a bidirectional LSTM and a residual neural network. Conclusions RPRes uses datasets generated by multiple experimental methods as training, validation, and test sets, making it compatible with numerous prediction requirements. Compared with the experimental methods, RPRes reduces costs and improves prediction efficiency; compared with the state-of-the-art calculation method CROSS, RPRes significantly improves performance.
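Sequence models such as RPRes typically consume a numeric encoding of the RNA string; a one-hot matrix per base is a common choice. This is an illustrative assumption, not necessarily the exact encoding RPRes uses:

```python
import numpy as np

def one_hot_rna(seq, alphabet="ACGU"):
    """Encode an RNA sequence as a (len(seq), 4) one-hot matrix.
    Unknown bases (e.g. N) are left as all-zero rows."""
    index = {base: i for i, base in enumerate(alphabet)}
    mat = np.zeros((len(seq), len(alphabet)))
    for i, base in enumerate(seq.upper()):
        if base in index:
            mat[i, index[base]] = 1.0
    return mat
```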
|
487
|
ElSaadani M, Habib E, Abdelhameed AM, Bayoumi M. Assessment of a Spatiotemporal Deep Learning Approach for Soil Moisture Prediction and Filling the Gaps in Between Soil Moisture Observations. Front Artif Intell 2021; 4:636234. [PMID: 33748748 PMCID: PMC7969976 DOI: 10.3389/frai.2021.636234] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/01/2020] [Accepted: 01/26/2021] [Indexed: 11/24/2022] Open
Abstract
Soil moisture (SM) plays a significant role in determining the probability of flooding in a given area. Currently, SM is most commonly modeled using physically based numerical hydrologic models. Modeling the natural processes that take place in the soil is difficult and requires assumptions; in addition, hydrologic model runtime is highly affected by the extent and resolution of the study domain. In this study, we propose a data-driven modeling approach using Deep Learning (DL) models. Different types of DL algorithms serve different purposes: for example, the Convolutional Neural Network (CNN) algorithm is well suited to capturing and learning spatial patterns, while the Long Short-Term Memory (LSTM) algorithm is designed to exploit time-series information and learn from past observations. A DL algorithm that combines the capabilities of CNN and LSTM, called ConvLSTM, was recently developed. In this study, we investigate the applicability of the ConvLSTM algorithm to predicting SM in a study area located in south Louisiana in the United States. This study reveals that ConvLSTM significantly outperformed CNN in predicting SM. We tested the performance of ConvLSTM-based models using combinations of different sets of predictors and different LSTM sequence lengths. The results show that ConvLSTM models can predict SM with a mean areal Root Mean Squared Error (RMSE) of 2.5% and mean areal correlation coefficients of 0.9 for our study area. ConvLSTM models can also provide predictions between discrete SM observations, making them potentially useful for applications such as filling observational gaps between satellite overpasses.
|
488
|
Gruber N, Jockisch A. Are GRU Cells More Specific and LSTM Cells More Sensitive in Motive Classification of Text? Front Artif Intell 2021; 3:40. [PMID: 33733157 PMCID: PMC7861254 DOI: 10.3389/frai.2020.00040] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/29/2020] [Accepted: 05/12/2020] [Indexed: 11/17/2022] Open
Abstract
In the Thematic Apperception Test, a picture story exercise (TAT/PSE; Heckhausen, 1963), it is assumed that unconscious motives can be detected in the text someone tells about the pictures shown in the test. This text is therefore classified by trained experts according to evaluation rules. We tried to automate this coding and used a recurrent neural network (RNN) because of the sequential input data. There are two cell types designed to improve recurrent neural networks with respect to long-term dependencies in sequential input data: long short-term memory cells (LSTMs) and gated recurrent units (GRUs). Some results indicate that GRUs can outperform LSTMs; others show the opposite, so the question remains when to use GRU or LSTM cells. The results show (N = 18,000 data points, 10-fold cross-validated) that GRUs outperform LSTMs (accuracy = .85 vs. .82) for overall motive coding. Further analysis showed that GRUs have higher specificity (true negative rate) and learn less prevalent content better; LSTMs have higher sensitivity (true positive rate) and learn highly prevalent content better. A closer look at a picture x category matrix reveals that LSTMs outperform GRUs only where deep context understanding is important. As neither technique presents a clear overall advantage in the domain investigated here, an interesting topic for future work is to develop a method that combines their strengths.
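The parameter difference behind these comparisons is visible in the cell equations: a GRU uses two gates (update and reset) against the LSTM's three (input, forget, output). A minimal single-step GRU forward pass, with biases omitted for brevity (an illustration of the standard GRU, not code from the study):

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def gru_cell(x, h, Wz, Wr, Wh):
    """One GRU step. Weights act on the concatenation [h, x];
    shapes: h (d,), x (k,), each W (d, d + k)."""
    hx = np.concatenate([h, x])
    z = sigmoid(Wz @ hx)                               # update gate
    r = sigmoid(Wr @ hx)                               # reset gate
    h_cand = np.tanh(Wh @ np.concatenate([r * h, x]))  # candidate state
    return (1 - z) * h + z * h_cand                    # blend old and new
```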
|
489
|
Hastings J, Glauer M, Memariani A, Neuhaus F, Mossakowski T. Learning chemistry: exploring the suitability of machine learning for the task of structure-based chemical ontology classification. J Cheminform 2021; 13:23. [PMID: 33726837 PMCID: PMC7962259 DOI: 10.1186/s13321-021-00500-8] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/11/2020] [Accepted: 02/26/2021] [Indexed: 12/22/2022] Open
Abstract
Chemical data is increasingly openly available in databases such as PubChem, which contains approximately 110 million compound entries as of February 2021. With the availability of data at such scale, the burden has shifted to organisation, analysis and interpretation. Chemical ontologies provide structured classifications of chemical entities that can be used for navigation and filtering of the large chemical space. ChEBI is a prominent example of a chemical ontology, widely used in life science contexts. However, ChEBI is manually maintained and as such cannot easily scale to the full scope of public chemical data. There is a need for tools that are able to automatically classify chemical data into chemical ontologies, which can be framed as a hierarchical multi-class classification problem. In this paper we evaluate machine learning approaches for this task, comparing different learning frameworks including logistic regression, decision trees and long short-term memory artificial neural networks, and different encoding approaches for the chemical structures, including cheminformatics fingerprints and character-based encoding from chemical line notation representations. We find that classical learning approaches such as logistic regression perform well with sets of relatively specific, disjoint chemical classes, while the neural network is able to handle larger sets of overlapping classes but needs more examples per class to learn from, and is not able to make a class prediction for every molecule. Future work will explore hybrid and ensemble approaches, as well as alternative network architectures including neuro-symbolic approaches.
|
490
|
Budiharto W. Data science approach to stock prices forecasting in Indonesia during Covid-19 using Long Short-Term Memory ( LSTM). JOURNAL OF BIG DATA 2021; 8:47. [PMID: 33723498 PMCID: PMC7948653 DOI: 10.1186/s40537-021-00430-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 08/27/2020] [Accepted: 02/21/2021] [Indexed: 06/12/2023]
Abstract
BACKGROUND The stock market process is full of uncertainty; hence, stock price forecasting is very important in finance and business. For stockbrokers, understanding trends, supported by prediction software, is very important for decision making. This paper proposes a data science model for stock price forecasting on the Indonesian exchange, based on statistical computing in the R language and Long Short-Term Memory (LSTM). FINDINGS The first confirmed COVID-19 (coronavirus disease 2019) case in Indonesia was reported on 2 March 2020. Since then, the composite stock price index has plunged 28% from the start of the year, and the share prices of cigarette producers and banks reached their lowest values of the corona pandemic on March 24, 2020. We use data for Bank Central Asia (BCA) and Bank Mandiri of Indonesia obtained from Yahoo Finance. In our experiments, we visualize the data using data science methods and predict and simulate the key prices, called Open, High, Low, and Closing (OHLC), with various parameters. CONCLUSIONS Based on the experiments, data science is very useful for data visualization, and our proposed LSTM method can serve as a short-term predictor, with an accuracy of 94.57% obtained from the short-term (1-year) training data with a high number of epochs, rather than from 3 years of training data.
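Training an LSTM on a price series starts by slicing it into fixed-length input windows paired with next-step targets; a minimal sketch of that preprocessing step (illustrative only, not the paper's code):

```python
import numpy as np

def windowed(series, lookback):
    """Turn a 1-D price series into (X, y) pairs for one-step-ahead
    forecasting: each row of X holds `lookback` past values, y the next one."""
    X = np.array([series[i:i + lookback] for i in range(len(series) - lookback)])
    y = np.array(series[lookback:])
    return X, y
```

The same construction applies to each of the OHLC channels; an LSTM then consumes X reshaped to (samples, lookback, features).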
|
491
|
Dinh C, Samuelsson JG, Hunold A, Hämäläinen MS, Khan S. Contextual MEG and EEG Source Estimates Using Spatiotemporal LSTM Networks. Front Neurosci 2021; 15:552666. [PMID: 33767606 PMCID: PMC7985163 DOI: 10.3389/fnins.2021.552666] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/16/2020] [Accepted: 01/25/2021] [Indexed: 11/13/2022] Open
Abstract
Most magneto- and electroencephalography (M/EEG) based source estimation techniques derive their estimates sample wise, independently across time. However, neuronal assemblies are intricately interconnected, constraining the temporal evolution of neural activity that is detected by MEG and EEG; the observed neural currents must thus be highly context dependent. Here, we use a network of Long Short-Term Memory (LSTM) cells where the input is a sequence of past source estimates and the output is a prediction of the following estimate. This prediction is then used to correct the estimate. In this study, we applied this technique on noise-normalized minimum norm estimates (MNE). Because the correction is found by using past activity (context), we call this implementation Contextual MNE (CMNE), although this technique can be used in conjunction with any source estimation method. We test CMNE on simulated epileptiform activity and recorded auditory steady state response (ASSR) data, showing that the CMNE estimates exhibit a higher degree of spatial fidelity than the unfiltered estimates in the tested cases.
|
492
|
Llerena Caña JP, García Herrero J, Molina López JM. Forecasting Nonlinear Systems with LSTM: Analysis and Comparison with EKF. SENSORS 2021; 21:s21051805. [PMID: 33807681 PMCID: PMC7961344 DOI: 10.3390/s21051805] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 01/29/2021] [Revised: 02/26/2021] [Accepted: 02/27/2021] [Indexed: 11/17/2022]
Abstract
Certain difficulties in path forecasting and filtering problems stem from the initial hypotheses of estimation and filtering techniques. Common hypotheses are that the system can be modeled as linear, Markovian, Gaussian, or all at once. Although in many cases there are strategies that yield very good results, the associated engineering process can become highly complex, requiring a great deal of time or even becoming unapproachable. Tools that can tackle complex problems without starting from prior hypotheses, while still solving classic challenges and sharpening the implementation of estimation and filtering systems, are therefore of high scientific interest. This paper addresses the forecast-filter problem from a deep learning perspective, with a neural network architecture inspired by natural language processing techniques and data structures. Unlike the Kalman filter, which requires two phases, this proposal performs prediction and filtering in a single phase. We propose three study cases of increasing conceptual difficulty. The experimentation is divided into five parts: the effect of standardization on raw data, validation of the proposal, filtering, loss of measurements (forecasting), and, finally, robustness. The results are compared with a Kalman filter, showing that the proposal is comparable in terms of error in the linear case, with improved performance on non-linear systems.
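The standardization examined in the first experimental part amounts to z-scoring the raw signals before training and mapping predictions back afterwards. A minimal sketch (function names are ours):

```python
import numpy as np

def standardize(x):
    """Z-score a raw series; return the stats needed to invert the transform."""
    x = np.asarray(x, float)
    mu, sd = float(x.mean()), float(x.std())
    return (x - mu) / sd, (mu, sd)

def destandardize(z, stats):
    """Map network outputs back to the original signal scale."""
    mu, sd = stats
    return z * sd + mu
```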
|
493
|
Chen L, Gu Y, Ji X, Sun Z, Li H, Gao Y, Huang Y. Extracting medications and associated adverse drug events using a natural language processing system combining knowledge base and deep learning. J Am Med Inform Assoc 2021; 27:56-64. [PMID: 31591641 DOI: 10.1093/jamia/ocz141] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/31/2019] [Revised: 01/25/2019] [Accepted: 07/22/2019] [Indexed: 11/13/2022] Open
Abstract
OBJECTIVE Detecting adverse drug events (ADEs) and medication-related information in clinical notes is important for both hospital medical care and medical research. We describe our clinical natural language processing (NLP) system for automatically extracting medical concepts and relations related to ADEs and medications from clinical narratives. This work was part of the 2018 National NLP Clinical Challenges Shared Task and Workshop on Adverse Drug Events and Medication Extraction. MATERIALS AND METHODS The authors developed a hybrid clinical NLP system that employs a knowledge-based general clinical NLP system for medical concept extraction and a task-specific deep learning system for relation identification using attention-based bidirectional long short-term memory networks. RESULTS The systems were evaluated as part of the 2018 National NLP Clinical Challenges, and our attention-based bidirectional long short-term memory network obtained an F-measure of 0.9442 on the relation identification task, ranking fifth in the challenge with <2% difference from the best system. Error analysis was also conducted, aimed at identifying the root causes and possible approaches for improvement. CONCLUSIONS We demonstrate a generic approach to, and the practice of, connecting a general-purpose clinical NLP system to task-specific requirements with deep learning methods. Our results indicate that a well-designed hybrid NLP system is capable of ADE- and medication-related information extraction and can be used in real-world applications to support ADE-related research and medical decisions.
|
494
|
Wanyan T, Vaid A, De Freitas JK, Somani S, Miotto R, Nadkarni GN, Azad A, Ding Y, Glicksberg BS. Relational Learning Improves Prediction of Mortality in COVID-19 in the Intensive Care Unit. IEEE TRANSACTIONS ON BIG DATA 2021; 7:38-44. [PMID: 33768136 PMCID: PMC7990133 DOI: 10.1109/tbdata.2020.3048644] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/28/2020] [Revised: 10/29/2020] [Accepted: 12/21/2020] [Indexed: 05/04/2023]
Abstract
Traditional Machine Learning (ML) models have had limited success in predicting Coronavirus disease 2019 (COVID-19) outcomes from Electronic Health Record (EHR) data, partly because they do not effectively capture the inter-connectivity patterns between various data modalities. In this work, we propose a novel framework that uses relational learning based on a heterogeneous graph model (HGM) to predict mortality at different time windows in COVID-19 patients within the intensive care unit (ICU). We use the EHRs of one of the largest and most diverse patient populations, across five hospitals in a major New York City health system. In our model, an LSTM processes the time-varying patient data, and our proposed relational learning strategy is applied in the final output layer along with other static features. Here, we replace the traditional softmax layer with a Skip-Gram relational learning strategy that compares the similarity between a patient embedding and an outcome embedding representation. We demonstrate that the HGM can robustly learn patterns that classify patient outcome representations by leveraging patterns within the embeddings of similar patients. Our experimental results show that our relational-learning-based HGM achieves a higher area under the receiver operating characteristic curve (auROC) than both comparator models in all prediction time windows, with dramatic improvements in recall.
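The similarity comparison that replaces the softmax layer can be sketched as scoring each outcome embedding against the patient embedding, e.g. by cosine similarity. This is a simplified stand-in for the Skip-Gram objective described above, with names of our own choosing:

```python
import numpy as np

def outcome_score(patient_vec, outcome_vecs):
    """Cosine similarity between one patient embedding (d,) and each row
    of an outcome-embedding matrix (n_outcomes, d)."""
    p = patient_vec / np.linalg.norm(patient_vec)
    O = outcome_vecs / np.linalg.norm(outcome_vecs, axis=1, keepdims=True)
    return O @ p  # one similarity score per outcome
```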
|
495
|
Arvin R, Khattak AJ, Qi H. Safety critical event prediction through unified analysis of driver and vehicle volatilities: Application of deep learning methods. ACCIDENT; ANALYSIS AND PREVENTION 2021; 151:105949. [PMID: 33385957 DOI: 10.1016/j.aap.2020.105949] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/09/2020] [Revised: 11/12/2020] [Accepted: 12/09/2020] [Indexed: 06/12/2023]
Abstract
Transportation safety is highly correlated with driving behavior, with human error playing a key role in a large portion of crashes. Modern instrumentation and computational resources allow the driver, vehicle, and roadway/environment to be monitored in order to extract leading indicators of crashes from multi-dimensional data streams. To quantify beyond-normal variations in driver behavior and vehicle kinematics, the concept of volatility is applied. The study measures driver-vehicle volatilities using naturalistic driving data. By integrating and fusing multiple real-time data streams, i.e., driver distraction, vehicular movements and kinematics, and instability in driving, this study aims to predict the occurrence of safety-critical events and to generate appropriate feedback to drivers and surrounding vehicles. The naturalistic driving data contain 7566 normal driving events and 1315 severe events (i.e., crashes and near-crashes), along with vehicle kinematics and driver behavior collected from more than 3500 drivers. To capture the local dependency and volatility in the time-series data, 1D Convolutional Neural Networks (1D-CNN), Long Short-Term Memory (LSTM), and 1DCNN-LSTM are applied. Vehicle kinematics, driving volatility, and impaired driving (in terms of distraction) are used as the input parameters. The results reveal that the 1DCNN-LSTM model provides the best performance, with 95.45% accuracy, predicting 73.4% of crashes with a precision of 95.67%. Additional features are extracted by the CNN layers, and the temporal dependency between observations is addressed, which helps the network learn driving patterns and volatile behavior. The model can be used to monitor driving behavior in real time and provide warnings and alerts to drivers in low-level automated vehicles, reducing their crash risk.
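"Driving volatility" quantifies beyond-normal variation in a kinematic signal; a rolling standard deviation is one simple proxy (illustrative only; the study's volatility measures are more elaborate than this):

```python
import numpy as np

def rolling_volatility(signal, window):
    """Rolling standard deviation of a kinematic signal
    (e.g., longitudinal acceleration) over a fixed window of samples."""
    s = np.asarray(signal, float)
    return np.array([s[i:i + window].std() for i in range(len(s) - window + 1)])
```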
|
496
|
Mekruksavanich S, Jitpattanakul A. LSTM Networks Using Smartphone Data for Sensor-Based Human Activity Recognition in Smart Homes. SENSORS 2021; 21:s21051636. [PMID: 33652697 PMCID: PMC7956629 DOI: 10.3390/s21051636] [Citation(s) in RCA: 39] [Impact Index Per Article: 13.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 02/05/2021] [Revised: 02/22/2021] [Accepted: 02/22/2021] [Indexed: 11/16/2022]
Abstract
Human Activity Recognition (HAR) employing inertial motion data has gained considerable momentum in recent years, both in research and in industrial applications. At an abstract level, this has been driven by the accelerating construction of intelligent and smart environments and systems covering all aspects of human life, including healthcare, sports, manufacturing, commerce, etc. Such environments and systems necessitate and subsume activity recognition, which aims to recognize the actions, characteristics, and goals of one or more individuals from a temporal series of observations streamed from one or more sensors. Because conventional Machine Learning (ML) techniques rely on handcrafted features in the extraction process, current research suggests that deep learning approaches are better suited to automated feature extraction from raw sensor data. In this work, a generic HAR framework for smartphone sensor data is proposed, based on Long Short-Term Memory (LSTM) networks for time-series domains. Four baseline LSTM networks are comparatively studied to analyze the impact of using different kinds of smartphone sensor data. In addition, a hybrid LSTM network called 4-layer CNN-LSTM is proposed to improve recognition performance. The HAR method is evaluated on the public UCI-HAR smartphone dataset through various combinations of sample generation processes (OW and NOW) and validation protocols (10-fold and LOSO cross-validation). Moreover, Bayesian optimization techniques are used in this study because they are advantageous for tuning the hyperparameters of each LSTM network. The experimental results indicate that the proposed 4-layer CNN-LSTM network performs well in activity recognition, improving the average accuracy by up to 2.24% compared with prior state-of-the-art approaches.
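The OW and NOW sample generation processes mentioned above differ only in the stride used when windowing the sensor stream. A minimal sketch of that segmentation (names are ours):

```python
import numpy as np

def segment(signal, win, step):
    """Fixed-width sliding windows over sensor samples.
    step < win gives overlapping windows (OW); step == win gives
    non-overlapping windows (NOW)."""
    s = np.asarray(signal)
    return np.array([s[i:i + win] for i in range(0, len(s) - win + 1, step)])
```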
|
497
|
Huang Q, Zhou W, Guo F, Xu L, Zhang L. 6mA-Pred: identifying DNA N6-methyladenine sites based on deep learning. PeerJ 2021; 9:e10813. [PMID: 33604189 PMCID: PMC7866889 DOI: 10.7717/peerj.10813] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/28/2020] [Accepted: 12/30/2020] [Indexed: 01/03/2023] Open
Abstract
With the accumulation of data on 6mA modification sites, an increasing number of scholars have begun to focus on the identification of 6mA sites. Despite the recognized importance of 6mA sites, methods for their identification remain lacking, and most existing methods target individual species. In the present study, we aimed to develop an identification method suitable for multiple species. Building on previous research, we propose a method for 6mA site recognition. Our experiments show that the proposed 6mA-Pred method is effective for identifying 6mA sites in genes from taxa such as rice, Mus musculus, and human. A series of experimental results shows that 6mA-Pred is an excellent method. We provide the source code used in the study, which can be obtained from http://39.100.246.211:5004/6mA_Pred/.
|
498
|
Hasan KT, Rahman MM, Ahmmed MM, Chowdhury AA, Islam MK. 4P Model for Dynamic Prediction of COVID-19: a Statistical and Machine Learning Approach. Cognit Comput 2021:1-14. [PMID: 33619436 PMCID: PMC7888531 DOI: 10.1007/s12559-020-09786-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/20/2020] [Accepted: 10/21/2020] [Indexed: 02/05/2023]
Abstract
Around the world, scientists are racing to understand how the COVID-19 epidemic spreads and grows, trying to find ways to contain it before medications become available. Many different models correlating different factors have been proposed so far. Some are too localized to indicate a general trend of the pandemic, while others have established only transient correlations. Hence, in this study, taking Bangladesh as a case, a 4P model has been proposed based on four probabilities (4P) that have been found to hold for all affected countries. Efficiency scores have been estimated from survey analysis, both for the governing authorities' management of the situation (P(G)) and for the compliance of the citizens (P(P)). Since immune responses to a specific pathogen can vary from person to person, the probability of a person getting infected after being exposed (P(I)) has also been estimated. The vital one is the probability of test positivity (P(T)), a strong indicator of how effectively infected people are diagnosed and isolated from the rest of the group, which affects the rate of growth. All four parameters have been fitted in a non-linear exponential model that partly updates itself periodically with everyday facts. Along with the model, all four probabilistic parameters are used to train a recurrent neural network with long short-term memory units, and the subsequent trial confirmed the governing role of the 4Ps.
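The abstract does not spell out the functional form of its non-linear exponential model, so the following is only a sketch of the general idea under the simplest assumption: cumulative cases grow as C(t) = C0·exp(r·t), and the fit is refreshed periodically from recent daily counts (the "partly updates itself" step). The data below are synthetic:

```python
import numpy as np

def fit_exponential(days, cases):
    """Least-squares fit of log(cases) = log(C0) + r * days.

    Returns (C0, r); refitting on a sliding window of recent days
    gives a model that periodically updates itself with new facts.
    """
    r, log_c0 = np.polyfit(days, np.log(cases), 1)
    return np.exp(log_c0), r

days = np.arange(14)
cases = 50.0 * np.exp(0.12 * days)   # synthetic 12%-per-day growth
c0, r = fit_exponential(days, cases)
forecast = c0 * np.exp(r * 21)       # project one week past the window
```

In the paper's setting, parameters like P(G), P(P), P(I), and P(T) would modulate the fitted growth rate; here the point is only the log-linear refit-and-project loop.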
499
Zhi W, Feng D, Tsai WP, Sterle G, Harpold A, Shen C, Li L. From Hydrometeorology to River Water Quality: Can a Deep Learning Model Predict Dissolved Oxygen at the Continental Scale? ENVIRONMENTAL SCIENCE & TECHNOLOGY 2021; 55:2357-2368. [PMID: 33533608 DOI: 10.1021/acs.est.0c06783] [Citation(s) in RCA: 37] [Impact Index Per Article: 12.3] [Indexed: 06/12/2023]
Abstract
Dissolved oxygen (DO) reflects river metabolic pulses and is an essential water-quality measure. Our ability to forecast DO, however, remains limited. Water-quality data, specifically DO data here, often have large gaps and sparse areal and temporal coverage. Earth-surface and hydrometeorology data, on the other hand, have become widely available. Here we ask: can a Long Short-Term Memory (LSTM) model learn river DO dynamics from sparse DO data and intensive (daily) hydrometeorology data? We used CAMELS-chem, a new data set with DO concentrations from 236 minimally disturbed watersheds across the U.S. The model generally learns the theory of DO solubility and captures its decreasing trend with increasing water temperature. It exhibits the potential of predicting DO in "chemically ungauged basins", defined as basins without any measurements of DO or, more broadly, of water quality in general. The model, however, misses some DO peaks and troughs when in-stream biogeochemical processes become important. Surprisingly, the model does not perform better where more data are available. Instead, it performs better in basins with low variations of streamflow and DO, a high runoff ratio (>0.45), and winter precipitation peaks. These results suggest that more data collection at DO peaks and troughs and in sparsely monitored areas is essential to overcome data scarcity, an outstanding challenge in the water-quality community.
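The core data problem here is dense daily forcings paired with sparse targets. The abstract does not give the training scheme, but a standard way to train a sequence model in this situation is to evaluate the loss only at time steps where an observation exists; a minimal sketch with invented DO values:

```python
import numpy as np

def masked_mse(y_pred, y_obs):
    """Mean squared error over observed entries only.

    Unobserved targets are encoded as NaN, so dense model output can be
    scored against a sparse observation record without gap-filling.
    """
    mask = ~np.isnan(y_obs)
    return float(np.mean((y_pred[mask] - y_obs[mask]) ** 2))

y_obs = np.array([8.1, np.nan, np.nan, 7.4, np.nan, 6.9])  # sparse DO, mg/L
y_pred = np.array([8.0, 7.8, 7.6, 7.5, 7.2, 7.0])          # dense model output
loss = masked_mse(y_pred, y_obs)
```

Only the three observed days contribute to the gradient signal, which is how sparse DO records can still supervise a model driven by daily hydrometeorology.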
500
Wang B, Yuan Q, Yang Q, Zhu L, Li T, Zhang L. Estimate hourly PM2.5 concentrations from Himawari-8 TOA reflectance directly using geo-intelligent long short-term memory network. ENVIRONMENTAL POLLUTION (BARKING, ESSEX : 1987) 2021; 271:116327. [PMID: 33360654 DOI: 10.1016/j.envpol.2020.116327] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 09/12/2020] [Revised: 12/07/2020] [Accepted: 12/15/2020] [Indexed: 06/12/2023]
Abstract
Fine particulate matter (PM2.5) has attracted extensive attention because of its harmful influence on human health and the environment. However, the sparse distribution of PM2.5 measuring stations limits its application to public utility and scientific research, which can be remedied by satellite observations. Therefore, we developed a geo-intelligent long short-term memory network (Geoi-LSTM) to estimate hourly ground-level PM2.5 concentrations in 2017 in the Wuhan Urban Agglomeration (WUA). We conducted contrast experiments to verify the effectiveness of our model and to explore the optimal modeling strategy. Geoi-LSTM with TOA reflectance, meteorological conditions, and NDVI as inputs performed best: the station-based cross-validation R2, root mean squared error, and mean absolute error are 0.82, 15.44 μg/m3, and 10.63 μg/m3, respectively. Based on the model results, we revealed the spatiotemporal characteristics of PM2.5 in the WUA. Generally speaking, during the day, the PM2.5 concentration remained stable at a relatively high level in the morning and decreased continuously in the afternoon. Over the year, PM2.5 concentrations were highest in winter, lowest in summer, and in between in spring and autumn. Combined with meteorological conditions, we further analyzed the whole course of a PM2.5 pollution event. Finally, we discussed the loss incurred by removing cloud-covered pixels and compared our model with several popular models. Overall, our results can reflect hourly PM2.5 concentrations seamlessly and accurately at a spatial resolution of 5 km, which benefits PM2.5 exposure evaluations and policy regulation.
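The abstract reports its station-based cross-validation skill as R2, RMSE, and MAE. As a small illustration of those three metrics (not the paper's code), with a toy four-reading example carrying a constant +1 bias:

```python
import numpy as np

def evaluate(y_true, y_pred):
    """R2, RMSE, and MAE, as computed for held-out stations."""
    resid = y_true - y_pred
    rmse = float(np.sqrt(np.mean(resid ** 2)))
    mae = float(np.mean(np.abs(resid)))
    r2 = float(1.0 - np.sum(resid ** 2) / np.sum((y_true - y_true.mean()) ** 2))
    return r2, rmse, mae

# Toy check: a constant +1 bias on four readings.
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = y_true + 1.0
r2, rmse, mae = evaluate(y_true, y_pred)
```

A constant bias gives RMSE = MAE = 1 but R2 of only 0.2 here, which is why papers in this area report all three: RMSE/MAE measure absolute error in μg/m3, while R2 measures explained variance relative to the stations' own spread.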