1
Yang Y, Wang M, Wang J, Li P, Zhou M. Multi-Agent Deep Reinforcement Learning for Integrated Demand Forecasting and Inventory Optimization in Sensor-Enabled Retail Supply Chains. SENSORS (BASEL, SWITZERLAND) 2025; 25:2428. [PMID: 40285118 PMCID: PMC12031219 DOI: 10.3390/s25082428] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2025] [Revised: 03/30/2025] [Accepted: 04/10/2025] [Indexed: 04/29/2025]
Abstract
The retail industry faces increasing challenges in matching supply with demand due to evolving consumer behaviors, market volatility, and supply chain disruptions. While existing approaches employ statistical and machine learning methods for demand forecasting, they often fail to capture complex temporal dependencies and lack the ability to simultaneously optimize inventory decisions. This paper proposes a novel multi-agent deep reinforcement learning framework that jointly optimizes demand forecasting and inventory management in retail supply chains, leveraging data from IoT sensors, RFID tracking systems, and smart shelf monitoring devices. Our approach combines transformer-based sequence modeling for demand patterns with hierarchical reinforcement learning agents that coordinate inventory decisions across distribution networks. The framework integrates both historical sales data and real-time sensor measurements, employing attention mechanisms to capture seasonal patterns, promotional effects, and environmental conditions detected through temperature and humidity sensors. Through extensive experiments on large-scale retail datasets incorporating sensor network data, we demonstrate that our method achieves 18.2% lower forecast error and 23.5% reduced stockout rates compared with state-of-the-art baselines. The results show particular improvements in handling promotional events and seasonal transitions, where traditional methods often struggle. Our work provides new insights into leveraging deep reinforcement learning for integrated retail operations optimization and offers a scalable solution for modern sensor-enabled supply chain challenges.
Affiliation(s)
- Yongbin Yang
  - Viterbi School of Engineering, University of Southern California, Los Angeles, CA 90007, USA
- Mengdie Wang
  - School of Taxation and Public Administration, Shanghai Lixin University of Accounting and Finance, Shanghai 201620, China
- Jiyuan Wang
  - The Fuqua School of Business, Duke University, Durham, NC 27708, USA
- Pan Li
  - The Business School, University of Hull, Hull HU6 7R, UK
- Mengjie Zhou
  - Department of Computer Science, The University of Bristol, Bristol BS8 1QU, UK
2
He H, Zhang Q, Yi K, Shi K, Niu Z, Cao L. Distributional Drift Adaptation With Temporal Conditional Variational Autoencoder for Multivariate Time Series Forecasting. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2025; 36:7287-7301. [PMID: 38683706 DOI: 10.1109/tnnls.2024.3384842] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/02/2024]
Abstract
Because real-world multivariate time series (MTS) are nonstationary, their distribution changes over time, which is known as distribution drift. Most existing MTS forecasting models suffer greatly from distribution drift, and their forecasting performance degrades over time. Existing methods address distribution drift by adapting to the most recently arrived data or by self-correcting according to meta knowledge derived from future data. Despite their great success in MTS forecasting, these methods hardly capture the intrinsic distribution changes, especially from a distributional perspective. Accordingly, we propose a novel framework, the temporal conditional variational autoencoder (TCVAE), to model the dynamic distributional dependencies over time between historical observations and future data in MTSs and infer the dependencies as a temporal conditional distribution to leverage latent variables. Specifically, a novel temporal Hawkes attention (THA) mechanism represents temporal factors that are subsequently fed into feedforward networks to estimate the prior Gaussian distribution of latent variables. The representation of temporal factors further dynamically adjusts the structures of the Transformer-based encoder and decoder to distribution changes by leveraging a gated attention mechanism (GAM). Moreover, we introduce conditional continuous normalization flow (CCNF) to transform the prior Gaussian to a complex and form-free distribution to facilitate flexible inference of the temporal conditional distribution. Extensive experiments conducted on six real-world MTS datasets demonstrate the TCVAE's superior robustness and effectiveness over the state-of-the-art MTS forecasting baselines. We further illustrate the TCVAE's applicability through multifaceted case studies and visualization in real-world scenarios.
3
Guo Q, Fang L, Wang R, Zhang C. Multivariate Time Series Forecasting Using Multiscale Recurrent Networks With Scale Attention and Cross-Scale Guidance. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2025; 36:540-554. [PMID: 37903050 DOI: 10.1109/tnnls.2023.3326140] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 11/01/2023]
Abstract
Multivariate time series (MTS) forecasting is considered a challenging task due to complex and nonlinear interdependencies between time steps and series. With the advance of deep learning, significant efforts have been made to model long-term and short-term temporal patterns hidden in historical information by recurrent neural networks (RNNs) with a temporal attention mechanism. Although various forecasting models have been developed, most of them are single-scale oriented, resulting in scale information loss. In this article, we seamlessly integrate multiscale analysis into deep learning frameworks to build scale-aware recurrent networks and propose two multiscale recurrent network (MRN) models for MTS forecasting. The first model, called MRN-SA, adopts a scale attention mechanism to dynamically select the most relevant information from different scales and simultaneously employs input attention and temporal attention to make predictions. The second, named MRN-CSG, introduces a novel cross-scale guidance mechanism that exploits information from the coarse scale to guide the decoding process at the fine scale, which results in a lightweight and more easily trained model without obvious loss of accuracy. Extensive experimental results demonstrate that both MRN-SA and MRN-CSG can achieve state-of-the-art performance on five typical MTS datasets in different domains. The source code will be publicly available at https://github.com/qguo2010/MRN.
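As a rough illustration of what "multiscale" views of a series can look like, the sketch below builds coarser versions of one series by averaging non-overlapping windows. This is only an assumption for illustration: the abstract does not state how MRN constructs its scales, and the function names `average_pool` and `multiscale_views` are hypothetical.

```python
def average_pool(series, scale):
    """Average non-overlapping windows of length `scale`.

    A trailing partial window is dropped. This pooling rule is an
    assumption for illustration, not the MRN authors' procedure.
    """
    n = len(series) // scale
    return [sum(series[i * scale:(i + 1) * scale]) / scale for i in range(n)]


def multiscale_views(series, scales=(1, 2, 4)):
    """Return a dict mapping each scale to its pooled view of the series."""
    return {s: average_pool(series, s) for s in scales}


series = [1.0, 3.0, 2.0, 4.0, 6.0, 8.0, 7.0, 9.0]
views = multiscale_views(series)
# views[1] is the original series; views[4] is the coarsest view.
```

A coarse view such as `views[4]` keeps the slow movement of the series and discards fine detail, which is the kind of information a cross-scale mechanism could pass down to a fine-scale decoder.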
4
Huang X, Li Y, Wang X. Integrating a multi-variable scenario with Attention-LSTM model to forecast long-term coastal beach erosion. THE SCIENCE OF THE TOTAL ENVIRONMENT 2024; 954:176257. [PMID: 39288874 DOI: 10.1016/j.scitotenv.2024.176257] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2024] [Revised: 08/31/2024] [Accepted: 09/11/2024] [Indexed: 09/19/2024]
Abstract
Beach erosion is an adverse impact of climate change and human development activities. Effective beach management necessitates integrating natural and anthropogenic factors to address future erosion trends, yet most current prediction models focus only on natural factors, which may provide an incomplete and potentially inaccurate representation of erosion dynamics. This study enhances prediction methods by integrating both natural and anthropogenic factors, thereby improving the accuracy and reliability of erosion projections. By extracting historical shorelines from 1986 to 2020 with the CoastSat model, we develop multivariable scenarios with an Attention-LSTM model to predict the regional impacts of natural and anthropogenic factors on the erosion of sandy beaches along the typical shoreline of Shenzhen in China. Results reveal that Shenzhen's beaches experienced erosion of up to 12 m over the past 35 years. We project a decrease in the mean erosion rate of the beaches and identify population growth (21.0%) as the main controlling factor before mid-century across a range of scenarios. We find that the Attention-LSTM multi-model ensemble approach provides overall improved accuracy and reliability over a wide range of beach erosion compared with the Attention-LSTM scenario prediction model and the statistical Digital Shoreline Analysis System (DSAS) model, yielding an average uncertainty of 10.99 compared to 13.29. These insights point to policies for safeguarding beaches, given the rising demand for beaches due to human factors; coupled with decreased impervious surfaces through ecological conservation, such policies can lead to mitigation of beach erosion. Accurate forecasts empower policymakers to implement effective coastal management strategies, safeguard resources, and mitigate erosion's adverse effects.
Our study offers finely tuned predictions of coastal erosion, providing crucial insights for future coastal conservation efforts and climate change adaptation along the shoreline, and serving as a foundation for further research aimed at understanding the evolving environmental impacts of beach erosion in Shenzhen.
Affiliation(s)
- Xuanhao Huang
  - Fujian Provincial Key Laboratory for Coastal Ecology and Environmental Studies, College of the Environment and Ecology, Xiamen University, Xiamen 361102, China
- Yangfan Li
  - Fujian Provincial Key Laboratory for Coastal Ecology and Environmental Studies, College of the Environment and Ecology, Xiamen University, Xiamen 361102, China
- Xinwei Wang
  - Fujian Provincial Key Laboratory for Coastal Ecology and Environmental Studies, College of the Environment and Ecology, Xiamen University, Xiamen 361102, China
5
Tsang TK, Du Q, Cowling BJ, Viboud C. An adaptive weight ensemble approach to forecast influenza activity in an irregular seasonality context. Nat Commun 2024; 15:8625. [PMID: 39366942 PMCID: PMC11452387 DOI: 10.1038/s41467-024-52504-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2024] [Accepted: 09/11/2024] [Indexed: 10/06/2024] [Open Access]
Abstract
Forecasting influenza activity in tropical and subtropical regions, such as Hong Kong, is challenging due to irregular seasonality and high variability. We develop a diverse set of statistical, machine learning, and deep learning approaches to forecast influenza activity in Hong Kong 0 to 8 weeks ahead, leveraging a unique multi-year surveillance record spanning 32 epidemics from 1998 to 2019. We consider a simple average ensemble (SAE) of the top two individual models, and develop an adaptive weight blending ensemble (AWBE) that dynamically updates model contribution. All models outperform the baseline constant incidence model, reducing the root mean square error (RMSE) by 23%-29% and weighted interval score (WIS) by 25%-31% for 8-week ahead forecasts. The SAE model performed similarly to individual models, while the AWBE model reduces RMSE by 52% and WIS by 53%, outperforming individual models for forecasts in different epidemic trends (growth, plateau, decline) and during both winter and summer seasons. Using the post-COVID data (2023-2024) as another test period, the AWBE model still reduces RMSE by 39% and WIS by 45%. Our framework contributes to comparing and benchmarking models in ensemble forecasts, enhancing evidence for synthesizing multiple models in disease forecasting for geographies with irregular influenza seasonality.
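To make the difference between a simple average ensemble and an adaptively weighted one concrete, here is a minimal sketch on toy numbers. The weighting rule (weights proportional to the inverse of each model's most recent squared error) is an assumption chosen for illustration; the abstract does not specify how AWBE updates its weights, and all names and data below are hypothetical.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def adaptive_weights(recent_errors, eps=1e-9):
    """Weights proportional to inverse squared recent error (assumed rule)."""
    inv = [1.0 / (e ** 2 + eps) for e in recent_errors]
    total = sum(inv)
    return [w / total for w in inv]


def blend(forecasts, weights):
    """Weighted combination of several model forecasts for one time step."""
    return sum(f * w for f, w in zip(forecasts, weights))


# Toy data: model_a is off by 1 at every step, model_b by 4.
observed = [10.0, 12.0, 15.0, 14.0]
model_a = [11.0, 13.0, 16.0, 15.0]
model_b = [14.0, 16.0, 19.0, 18.0]

simple_avg = [(a + b) / 2 for a, b in zip(model_a, model_b)]

blended = []
for t in range(len(observed)):
    if t == 0:
        weights = [0.5, 0.5]  # no error history yet, so start with equal weights
    else:
        # Use only the previous step's errors, which are known at forecast time.
        errors = [model_a[t - 1] - observed[t - 1], model_b[t - 1] - observed[t - 1]]
        weights = adaptive_weights(errors)
    blended.append(blend([model_a[t], model_b[t]], weights))
```

On this toy series the adaptive blend shifts weight toward the more accurate model after the first step, so its RMSE falls below that of the equal-weight average.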
Affiliation(s)
- Tim K Tsang
  - WHO Collaborating Centre for Infectious Disease Epidemiology and Control, School of Public Health, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
  - Laboratory of Data Discovery for Health Limited, Hong Kong Science and Technology Park, New Territories, Hong Kong
- Qiurui Du
  - WHO Collaborating Centre for Infectious Disease Epidemiology and Control, School of Public Health, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Benjamin J Cowling
  - WHO Collaborating Centre for Infectious Disease Epidemiology and Control, School of Public Health, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
  - Laboratory of Data Discovery for Health Limited, Hong Kong Science and Technology Park, New Territories, Hong Kong
- Cécile Viboud
  - Fogarty International Center, National Institutes of Health, Bethesda, MD, USA
6
Altieri M, Corizzo R, Ceci M. GAP-LSTM: Graph-Based Autocorrelation Preserving Networks for Geo-Distributed Forecasting. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:11773-11787. [PMID: 38758622 DOI: 10.1109/tnnls.2024.3398441] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/19/2024]
Abstract
Forecasting methods are important decision support tools in geo-distributed sensor networks. However, challenges such as the multivariate nature of data, the existence of multiple nodes, and the presence of spatio-temporal autocorrelation increase the complexity of the task. Existing forecasting methods are unable to address these challenges in a combined manner, resulting in suboptimal model accuracy. In this article, we propose GAP-LSTM, a novel geo-distributed forecasting method that leverages the synergistic interaction of graph convolution, attention-based long short-term memory (LSTM), 2-D convolution, and latent memory states to effectively exploit spatio-temporal autocorrelation in multivariate data generated by multiple nodes, resulting in improved modeling capabilities. Our extensive evaluation, involving real-world datasets from the traffic, energy, and pollution domains, showcases the ability of our method to outperform state-of-the-art forecasting methods. An ablation study confirms that all method components contribute positively to the accuracy of the extracted forecasts. The method also provides an interpretable visualization that complements forecasts with additional insights for domain experts.
7
Wan J, Xia N, Yin Y, Pan X, Hu J, Yi J. TCDformer: A transformer framework for non-stationary time series forecasting based on trend and change-point detection. Neural Netw 2024; 173:106196. [PMID: 38412739 DOI: 10.1016/j.neunet.2024.106196] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Revised: 01/25/2024] [Accepted: 02/18/2024] [Indexed: 02/29/2024]
Abstract
Although time series prediction models based on Transformer architecture have achieved significant advances, concerns have arisen regarding their performance with non-stationary real-world data. Traditional methods often use stabilization techniques to boost predictability, but this often results in the loss of non-stationarity, notably underperforming when tackling major events in practical applications. To address this challenge, this research introduces an innovative method named TCDformer (Trend and Change-point Detection Transformer). TCDformer employs a unique strategy, initially encoding abrupt changes in non-stationary time series using the local linear scaling approximation (LLSA) module. The reconstructed contextual time series is then decomposed into trend and seasonal components. The final prediction results are derived from the additive combination of a multilayer perceptron (MLP) for predicting trend components and wavelet attention mechanisms for seasonal components. Comprehensive experimental results show that on standard time series prediction datasets, TCDformer significantly surpasses existing benchmark models in terms of performance, reducing MSE by 47.36% and MAE by 31.12%. This approach offers an effective framework for managing non-stationary time series, achieving a balance between performance and interpretability, making it especially suitable for addressing non-stationarity challenges in real-world scenarios.
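The additive split into trend and seasonal components mentioned above can be sketched with a plain moving average. This is a generic decomposition under stated assumptions (odd window length, edge values repeated as padding), not TCDformer's own procedure, whose details are not given in the abstract; `moving_average` and `decompose` are hypothetical names.

```python
def moving_average(series, window):
    """Centered moving average; edge values are repeated as padding.

    Assumes an odd `window` so the output has the same length as the input.
    """
    if window % 2 == 0:
        raise ValueError("window must be odd")
    half = window // 2
    padded = [series[0]] * half + list(series) + [series[-1]] * half
    return [sum(padded[i:i + window]) / window for i in range(len(series))]


def decompose(series, window=5):
    """Split a series into trend and remainder so that trend + seasonal == series."""
    trend = moving_average(series, window)
    seasonal = [x - t for x, t in zip(series, trend)]
    return trend, seasonal


# Toy series: a slow upward drift plus a repeating +1/-1 pattern.
series = [i * 0.5 + (1.0 if i % 2 == 0 else -1.0) for i in range(12)]
trend, seasonal = decompose(series, window=5)
```

Because the split is additive, separate predictors for the two parts (an MLP for the trend and an attention module for the seasonal part, in the abstract's description) can be summed to give the final forecast.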
Affiliation(s)
- Jiashan Wan
  - College of Computer and Information Science, Hefei University of Technology, Hefei, 230601, Anhui, China
  - College of Big Data and Artificial Intelligence, Anhui Institute of Information Technology, Wuhu, 241000, Anhui, China
- Na Xia
  - College of Computer and Information Science, Hefei University of Technology, Hefei, 230601, Anhui, China
- Yutao Yin
  - Shenzhen Hangsheng electronics Co., Ltd., Shenzhen, 518103, Guangdong, China
- Xulei Pan
  - College of Big Data and Artificial Intelligence, Anhui Institute of Information Technology, Wuhu, 241000, Anhui, China
- Jin Hu
  - Shenzhen Hangsheng electronics Co., Ltd., Shenzhen, 518103, Guangdong, China
- Jun Yi
  - College of Computer and Information Science, Hefei University of Technology, Hefei, 230601, Anhui, China
8
Grecov P, Prasanna AN, Ackermann K, Campbell S, Scott D, Lubman DI, Bergmeir C. Probabilistic Causal Effect Estimation With Global Neural Network Forecasting Models. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:4999-5013. [PMID: 35853064 DOI: 10.1109/tnnls.2022.3190984] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
We introduce a novel method to estimate the causal effects of an intervention over multiple treated units by combining the techniques of probabilistic forecasting with global forecasting methods using deep learning (DL) models. Considering the counterfactual and synthetic approach for policy evaluation, we recast the causal effect estimation problem as a counterfactual prediction outcome of the treated units in the absence of the treatment. Nevertheless, in contrast to estimating only the counterfactual time series outcome, our work differs from conventional methods by proposing to estimate the counterfactual time series probability distribution based on the past preintervention set of treated and untreated time series. We rely on time series properties and forecasting methods, with shared parameters, applied to stacked univariate time series for causal identification. This article presents DeepProbCP, a framework for producing accurate quantile probabilistic forecasts for the counterfactual outcome, based on training a global autoregressive recurrent neural network model with conditional quantile functions on a large set of related time series. The output of the proposed method is the counterfactual outcome as the spline-based representation of the counterfactual distribution. We demonstrate how this probabilistic methodology added to the global DL technique to forecast the counterfactual trend and distribution outcomes overcomes many challenges faced by the baseline approaches to the policy evaluation problem. Oftentimes, some target interventions affect only the tails or the variance of the treated units' distribution rather than the mean or median, which is usual for skewed or heavy-tailed distributions. Under this scenario, the classical causal effect models based on counterfactual predictions are not capable of accurately capturing or even seeing policy effects. 
By means of empirical evaluations of synthetic and real-world datasets, we show that our framework delivers more accurate forecasts than the state-of-the-art models, depicting, in which quantiles, the intervention most affected the treated units, unlike the conventional counterfactual inference methods based on nonprobabilistic approaches.
9
Silva ADD, Gomes MFDC, Gregianini TS, Martins LG, Veiga ABGD. Machine learning in predicting severe acute respiratory infection outbreaks. CAD SAUDE PUBLICA 2024; 40:e00122823. [PMID: 38198384 PMCID: PMC10775960 DOI: 10.1590/0102-311xen122823] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/02/2023] [Revised: 09/21/2023] [Accepted: 10/02/2023] [Indexed: 01/12/2024] [Open Access]
Abstract
Severe acute respiratory infection (SARI) outbreaks occur annually, with seasonal peaks varying among geographic regions. Case notification is important to prepare healthcare networks for patient attendance and hospitalization. Thus, health managers need adequate resource planning tools for SARI seasons. This study aims to predict SARI outbreaks based on models generated with machine learning using SARI hospitalization notification data. In this study, data from the reporting of SARI hospitalization cases in Brazil from 2013 to 2020 were used, excluding SARI cases caused by COVID-19. These data were prepared to feed a neural network configured to generate predictive models for time series. The neural network was implemented with a pipeline tool. Models were generated for the five Brazilian regions and validated for different years of SARI outbreaks. By using neural networks, it was possible to generate predictive models for SARI peaks, volume of cases per season, and the beginning of the pre-epidemic period, with good weekly incidence correlation (R² = 0.97; 95%CI: 0.95-0.98, for the 2019 season in Southeastern Brazil). The predictive models achieved a good prediction of the volume of reported cases of SARI; accordingly, 9,936 cases were observed in 2019 in Southern Brazil, and the prediction made by the models showed a median of 9,405 (95%CI: 9,105-9,738). The identification of the period of occurrence of a SARI outbreak is possible using predictive models generated with neural networks and algorithms that employ time series.
Affiliation(s)
- Tatiana Schäffer Gregianini
  - Centro Estadual de Vigilância em Saúde, Secretaria de Saúde do Estado do Rio Grande do Sul, Porto Alegre, Brasil
- Leticia Garay Martins
  - Centro Estadual de Vigilância em Saúde, Secretaria de Saúde do Estado do Rio Grande do Sul, Porto Alegre, Brasil
10
Wang W, Shao J, Jumahong H. Fuzzy inference-based LSTM for long-term time series prediction. Sci Rep 2023; 13:20359. [PMID: 37990124 PMCID: PMC10663611 DOI: 10.1038/s41598-023-47812-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/17/2023] [Accepted: 11/18/2023] [Indexed: 11/23/2023] [Open Access]
Abstract
Long short-term memory (LSTM) based time series forecasting methods suffer from multiple limitations, such as accumulated error, diminishing temporal correlation, and a lack of interpretability, which compromise prediction performance. To overcome these shortcomings, a fuzzy inference-based LSTM with an embedded fuzzy system is proposed to enhance the accuracy and interpretability of LSTM for long-term time series prediction. Firstly, a fast and complete fuzzy rule construction method based on Wang-Mendel (WM) is proposed, which can enhance the computational efficiency and completeness of the WM model through fuzzy rule simplification and complement strategies. Then, the fuzzy prediction model is constructed to capture the fuzzy logic in data. Finally, the fuzzy inference-based LSTM is proposed by integrating the fuzzy prediction fusion, the strengthening memory layer, and the parameter segmentation sharing strategy into the LSTM network. Fuzzy prediction fusion increases the network reasoning capability and interpretability, the strengthening memory layer strengthens the long-term memory and alleviates the gradient dispersion problem, and the parameter segmentation sharing strategy balances processing efficiency and architecture discrimination. Experiments on publicly available time series demonstrate that the proposed method can achieve better performance than existing models for long-term time series prediction.
Affiliation(s)
- Weina Wang
  - College of Information and Control Engineering, Jilin Institute of Chemical Technology, Jilin, 132022, China
- Jiapeng Shao
  - College of Information and Control Engineering, Jilin Institute of Chemical Technology, Jilin, 132022, China
- Huxidan Jumahong
  - School of Network Security and Information technology, YiLi Normal University, Yining, 835000, China
11
Li C, Jiang W, Yang Y, Pan S, Huang G, Guo L. Predicting Best-Selling New Products in a Major Promotion Campaign Through Graph Convolutional Networks. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2023; 34:9102-9115. [PMID: 35320107 DOI: 10.1109/tnnls.2022.3155690] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Many e-commerce platforms, such as AliExpress, run major promotion campaigns regularly. Before such a promotion, it is important to predict potential best sellers and their respective sales volumes so that the platform can arrange its supply chains and logistics accordingly. For items with a sufficiently long sales history, accurate sales forecasts can be achieved with traditional statistical forecasting techniques. Accurately predicting the sales volume of a new item, however, is rather challenging with existing methods; time series models tend to overfit due to the very limited historical sales records of the new item, whereas models that do not utilize historical information often fail to make accurate predictions, due to the lack of strong indicators of sales volume among the item's basic attributes. This article presents the solution deployed at Alibaba in 2019, which was used in production to prepare for its annual "Double 11" promotion event, whose total sales amount exceeded US$38 billion in a single day. The main idea of the proposed solution is to predict the sales volume of each new item through its connections with older products that have a sufficiently long sales history. In other words, our solution considers the cross-selling effects between different products, which have been largely neglected in previous methods. Specifically, the proposed solution first constructs an item graph, in which each new item is connected to relevant older items. Then, a novel multitask graph convolutional neural network (GCN) is trained by a multiobjective optimization-based gradient surgery technique to predict the expected sales volumes of new items. The designs of both the item graph and the GCN exploit the fact that we only need to perform accurate sales forecasts for potential best-selling items in a major promotion, which helps reduce computational overhead.
Extensive experiments on both proprietary AliExpress data and a public dataset demonstrate that the proposed solution achieves consistent performance gains compared to existing methods for sales forecast.
12
Ma Y, Ren J, Liu B, Mao Y, Wu X, Chen S, Ma Y, Jiang L, Wu M, Zhao N, Zhang J, Wu Y, Ullah R. Secure semantic optical communication scheme based on the multi-head attention mechanism. OPTICS LETTERS 2023; 48:4408-4411. [PMID: 37582044 DOI: 10.1364/ol.498997] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2023] [Accepted: 07/24/2023] [Indexed: 08/17/2023]
Abstract
In this paper, an artificial-intelligence-based secure semantic optical communication scheme is proposed. The semantic features of the original text information are extracted using Transformer. Compared with other networks, Transformer reduces the complexity of the structure and the associated training cost by using the multi-head attention mechanism. To solve the security problem, the encryption scheme is applied to an orthogonal frequency division multiplexed passive optical network (OFDM-PON). The proposed scheme applies chaotic sequences to produce masking vectors. We encrypt the constellation and frequency, achieving a large key space of 1 × 10^270. To prove that Transformer can effectively extract the semantic features of text, we have computed the values of ROUGE-1, ROUGE-2, and ROUGE-L, which are 40.9, 18.02, and 37.17, respectively. An encrypted 16 quadrature amplitude modulation (16QAM) OFDM signal transmission over a 2 km seven-core fiber with a data rate of 78.5 Gbits/s was experimentally demonstrated. During the experiments, the bit error rate (BER) was analyzed and the results show that the proposed system improves efficiency and security in an OFDM-PON system.
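The idea of deriving a masking vector from a chaotic sequence can be shown with a toy example. The sketch below uses the logistic map and XOR masking of bytes; both choices are assumptions for illustration only, since the abstract names neither the chaotic system nor the exact masking operation (the paper masks constellation and frequency, not bytes). The parameters `r`, `burn_in`, and the key value are hypothetical, and this is not a secure cipher.

```python
def logistic_sequence(x0, r, n, burn_in=100):
    """Iterate the logistic map x -> r * x * (1 - x) and return n values.

    The first `burn_in` iterations are discarded so the output depends
    sensitively on the initial condition x0, which acts as the key.
    """
    x = x0
    for _ in range(burn_in):
        x = r * x * (1.0 - x)
    values = []
    for _ in range(n):
        x = r * x * (1.0 - x)
        values.append(x)
    return values


def masking_vector(key_x0, n, r=3.99):
    """Quantize the chaotic sequence to one byte per element."""
    return [int(v * 256) % 256 for v in logistic_sequence(key_x0, r, n)]


def apply_mask(data, key_x0):
    """XOR each byte with the masking vector; applying it twice restores the data."""
    mask = masking_vector(key_x0, len(data))
    return bytes(b ^ m for b, m in zip(data, mask))


message = b"semantic features"
key = 0.3141  # hypothetical key: the initial condition of the map
masked = apply_mask(message, key)
restored = apply_mask(masked, key)
```

Only a receiver that regenerates the same sequence from the same initial condition can remove the mask, which is why the sensitivity of chaotic maps to their initial state translates into key space.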
13
Li J, Wei S, Dai W. Combination of Manifold Learning and Deep Learning Algorithms for Mid-Term Electrical Load Forecasting. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2023; 34:2584-2593. [PMID: 34478386 DOI: 10.1109/tnnls.2021.3106968] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 05/04/2023]
Abstract
Mid-term load forecasting (MTLF) is of great significance for power system planning, operation, and power trading. However, the mid-term electrical load is affected by the coupling of multiple factors and demonstrates complex characteristics, which leads to low prediction accuracy in MTLF. Furthermore, MTLF faces the "curse of dimensionality" problem due to a large number of variables. This article proposes an MTLF method based on manifold learning, which can extract the underlying factors of load variations to help improve the accuracy of MTLF and significantly reduce the computational cost. Unlike linear dimensionality reduction methods, manifold learning has better nonlinear feature extraction ability and is more suitable for load data with nonlinear characteristics. Furthermore, long short-term memory (LSTM) neural networks are used to establish forecasting models in the low-dimensional space obtained by manifold learning. The proposed MTLF method is tested on independent system operator (ISO) New England datasets, with load forecasts produced 24, 168, and 720 h ahead. The numerical results validate that the proposed method has higher prediction accuracy than many mature methods in the mid-term time scale.
14
Wang X, Liu H, Yang Z, Du J, Dong X. CNformer: a convolutional transformer with decomposition for long-term multivariate time series forecasting. APPL INTELL 2023. [DOI: 10.1007/s10489-023-04496-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/07/2023]
15
Wang X, Liu H, Du J, Dong X, Yang Z. A long-term multivariate time series forecasting network combining series decomposition and convolutional neural networks. Appl Soft Comput 2023. [DOI: 10.1016/j.asoc.2023.110214] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/29/2023]
16
Soleymani F, Paquet E, Viktor HL, Michalowski W, Spinello D. ProtInteract: A deep learning framework for predicting protein-protein interactions. Comput Struct Biotechnol J 2023; 21:1324-1348. [PMID: 36817951 PMCID: PMC9929211 DOI: 10.1016/j.csbj.2023.01.028] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 12/05/2022] [Revised: 01/20/2023] [Accepted: 01/20/2023] [Indexed: 01/26/2023] [Open Access]
Abstract
Proteins mainly perform their functions by interacting with other proteins. Protein-protein interactions underpin various biological activities such as metabolic cycles, signal transduction, and immune response. However, due to the sheer number of proteins, experimental methods for finding interacting and non-interacting protein pairs are time-consuming and costly. We therefore developed the ProtInteract framework to predict protein-protein interactions. ProtInteract comprises two components. The first is a novel autoencoder architecture that encodes each protein's primary structure into a lower-dimensional vector while preserving its underlying sequence attributes. This leads to faster training of the second component, a deep convolutional neural network (CNN) that receives encoded proteins and predicts their interaction under three different scenarios. In each scenario, the deep CNN predicts the class of a given encoded protein pair. Each class indicates a different range of confidence scores corresponding to the probability that a predicted interaction occurs. The proposed framework features notably low computational complexity and relatively fast response. The contributions of this work are twofold. First, ProtInteract treats the protein's primary structure as a pseudo-time series. We therefore leverage the time-series nature of proteins and their physicochemical properties to encode a protein's amino acid sequence into a lower-dimensional vector space. This approach enables extracting highly informative sequence attributes while reducing computational complexity. Second, the ProtInteract framework utilises this information to identify a protein's interactions with other proteins based on its amino acid configuration. Our results suggest that the proposed framework performs with high accuracy and efficiency in predicting protein-protein interactions.
Affiliation(s)
- Farzan Soleymani: Department of Mechanical Engineering, University of Ottawa, Ottawa, ON K1N 6N5, Canada
- Eric Paquet: National Research Council, 1200 Montreal Road, Ottawa, ON K1A 0R6, Canada (corresponding author)
- Herna Lydia Viktor: School of Electrical Engineering and Computer Science, University of Ottawa, ON K1N 6N5, Canada
- Davide Spinello: Department of Mechanical Engineering, University of Ottawa, Ottawa, ON K1N 6N5, Canada
|
17
|
Bi J, Zhang L, Yuan H, Zhang J. Multi-indicator Water Quality Prediction with Attention-assisted Bidirectional LSTM and Encoder-Decoder. Inf Sci (N Y) 2023. [DOI: 10.1016/j.ins.2022.12.091] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/06/2023]
|
18
|
Zhang J, Dai Q. Latent adversarial regularized autoencoder for high-dimensional probabilistic time series prediction. Neural Netw 2022; 155:383-397. [PMID: 36115164 DOI: 10.1016/j.neunet.2022.08.025] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/13/2021] [Revised: 06/17/2022] [Accepted: 08/28/2022] [Indexed: 11/16/2022]
Abstract
Many practical applications require probabilistic prediction of time series to model the distribution on future horizons. With ever-increasing dimensions, much effort has been invested into developing methods that often make an assumption about the independence between time series. Consequently, the probabilistic prediction in high-dimensional environments has become an essential topic with significant challenges. In this paper, we propose a novel probabilistic model called latent adversarial regularized autoencoder, abbreviated as TimeLAR, specifically for high-dimensional multivariate Time Series Prediction (TSP). It integrates the flexibility of Generative Adversarial Networks (GANs) and the capability of autoencoders in extracting higher-level non-linear features. Through flexible autoencoder mapping, TimeLAR learns cross-series relationships and encodes this global information into several latent variables. We design a modified Transformer for these latent variables to capture global temporal patterns and infer latent space prediction distributions, where only one step is required to output multi-step predictions. Furthermore, we employ the GAN to further refine the performance of latent space predictions, by using a discriminator to guide the training of the autoencoder and the Transformer in an adversarial process. Finally, complex distributions of multivariate time series data can be modeled by the non-linear decoder of the autoencoder. The effectiveness of TimeLAR is empirically underpinned by extensive experiments conducted on five real-world high-dimensional time series datasets in the fields of transportation, electricity, and web page views.
Affiliation(s)
- Jing Zhang: College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
- Qun Dai: College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
|
19
|
Arslan S. A hybrid forecasting model using LSTM and Prophet for energy consumption with decomposition of time series data. PeerJ Comput Sci 2022; 8:e1001. [PMID: 35721410 PMCID: PMC9202617 DOI: 10.7717/peerj-cs.1001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2022] [Accepted: 05/13/2022] [Indexed: 06/15/2023]
Abstract
For decades, time series forecasting has had many applications in various industries, such as weather, financial, healthcare, business, retail, and energy consumption forecasting. Accurate prediction in these applications is both very important and difficult because high sampling rates lead to monthly, daily, or even hourly data. This high-frequency property of time series data results in complexity and seasonality. Moreover, time series data can have irregular fluctuations caused by various factors. Thus, using a single model does not yield good accuracy. In this study, we propose an efficient forecasting framework that hybridizes a recurrent neural network model with Facebook's Prophet to improve forecasting performance. Seasonal-trend decomposition based on Loess (STL) is applied to the original time series, and the decomposed components are used to train our recurrent neural network to reduce the impact of these irregular patterns on the final predictions. Moreover, to preserve seasonality, the original time series data is modeled with Prophet, and the outputs of both sub-models are merged into the final prediction values. In experiments, we compared our model with state-of-the-art methods on real-world energy consumption data from seven countries, and the proposed hybrid method demonstrates results competitive with these state-of-the-art methods.
|
20
|
Using Bidirectional Long-Term Memory Neural Network for Trajectory Prediction of Large Inner Wheel Routes. Sustainability 2022. [DOI: 10.3390/su14105935] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/27/2023]
Abstract
When a large vehicle turns at an intersection, it often leads to tragedy because the driver does not notice an approaching vehicle or one hidden in the vehicle's blind spot. The inner wheel difference warning systems sold for large vehicles generally add sensors or cameras to check whether other vehicles are in the blind spot. However, installing such vision assistance systems has not reduced the accident rate of large vehicles. The main reason is that motorcycle and bicycle riders often fail to pay attention to the inner wheel difference formed when large vehicles turn, resulting in collisions with large vehicles at intersections. This paper proposes a bidirectional long short-term memory (Bi-LSTM) neural network for predicting the inner wheel path trajectory of large vehicles, mainly from the perspective of motorcycle riders. The study combines YOLOv4 with a stacked Bi-LSTM model to analyze the motion of large vehicles and predict the inner wheel path trajectory. That is, the turning trajectory of large vehicles at an intersection is predicted using an object detection algorithm and a recurrent neural network model. The experiments show that the stacked Bi-LSTM trajectory prediction model, given one second of trajectory data, predicts the next second of the trajectory with an accuracy of 87.77%, and achieves an accuracy of 75.75% when predicting two seconds of trajectory data. In terms of prediction error, the system has a lower prediction error than the LSTM and Bi-LSTM models.
|
21
|
Demand Forecasting of E-Commerce Enterprises Based on Horizontal Federated Learning from the Perspective of Sustainable Development. Sustainability 2021. [DOI: 10.3390/su132313050] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 01/26/2023]
Abstract
Public health emergencies have brought great challenges to the stability of the e-commerce supply chain. Demand forecasting is a key driver for the sound development of e-commerce enterprises. To prevent the potential privacy leakage of e-commerce enterprises in the process of demand forecasting using multi-party data, and to improve the accuracy of demand forecasting models, we propose an e-commerce enterprise demand forecasting method based on Horizontal Federated Learning and ConvLSTM, from the perspective of sustainable development. First, in view of the shortcomings of traditional RNN and LSTM demand forecasting models, which cannot handle multi-dimensional time-series problems, we propose a demand forecasting model based on ConvLSTM. Secondly, to address the problem that data cannot be directly shared and exchanged between e-commerce enterprises of the same type, the goal of demand information sharing modeling is realized indirectly through Horizontal Federated Learning. Experimental results on a large number of real data sets show that, compared with benchmark experiments, our proposed method can improve the accuracy of e-commerce enterprise demand forecasting models while avoiding privacy data leakage, and the bullwhip effect value is closer to 1. Therefore, we effectively alleviate the bullwhip effect of the entire supply chain system in demand forecasting, and promote the sustainable development of e-commerce companies.
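The bullwhip effect value mentioned above is commonly measured as the ratio of order variance to demand variance, so a value near 1 means that orders are no more volatile than the demand that triggers them. The abstract does not state which formula the authors used; the following is a minimal Python sketch under that common definition, with a hypothetical function name and made-up sample numbers.

```python
import numpy as np


def bullwhip_ratio(orders, demand):
    """Ratio of order variance to demand variance (assumed definition).

    A value above 1 means orders amplify demand variability;
    a value near 1 means little or no amplification.
    """
    orders = np.asarray(orders, dtype=float)
    demand = np.asarray(demand, dtype=float)
    # Sample variances (ddof=1) of the two series.
    return np.var(orders, ddof=1) / np.var(demand, ddof=1)


# Illustrative data only: orders that double every deviation of demand
# from its mean have four times the variance of demand.
demand = np.array([10.0, 12.0, 8.0, 11.0, 9.0])
amplified_orders = demand.mean() + 2.0 * (demand - demand.mean())

ratio_same = bullwhip_ratio(demand, demand)
ratio_amplified = bullwhip_ratio(amplified_orders, demand)
```

Here `ratio_same` is 1 because the orders track demand exactly, while `ratio_amplified` is 4 because doubling the deviations quadruples the variance.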
|
22
|
Water Flow Forecasting Based on River Tributaries Using Long Short-Term Memory Ensemble Model. Energies 2021. [DOI: 10.3390/en14227707] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/16/2022]
Abstract
Water flow forecasts are essential information for energy production, management, and hydropower control. Advance actions to optimize electricity production can be taken based on the predicted information. This work proposes an ensemble strategy using recurrent neural networks to forecast water flow at the Jirau Hydroelectric Power Plant (HPP), installed on the Madeira River in Brazil. The ensemble strategy combines three long short-term memory (LSTM) networks that model the Madeira River and two of its tributaries, the Mamoré and Abunã rivers. Historical streamflow data from the Madeira River and its tributaries are used to validate the ensemble LSTM model, in which the time series of each tributary is modeled separately by an LSTM model and the results are used as input to another LSTM model that forecasts the streamflow of the main river. The experimental results show low errors on the training and test sets for both the individual LSTM networks and the ensemble model. In addition, these results were compared with the operational forecasts performed by Jirau HPP. The proposed model showed better accuracy in four of the five scenarios tested, which indicates a promising approach to be explored in water flow forecasting based on river tributaries.
|
23
|
Short-Term Net Load Forecasting with Singular Spectrum Analysis and LSTM Neural Networks. Energies 2021. [DOI: 10.3390/en14144107] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 11/17/2022]
Abstract
Short-term electricity load forecasting is key to the safe, reliable, and economical operation of power systems. An important challenge that arises with high-frequency load series, e.g., hourly load, is how to deal with the complex seasonal patterns that are present. Standard approaches suggest either removing seasonality prior to modeling or applying time series decomposition. This work proposes a hybrid approach that combines Singular Spectrum Analysis (SSA)-based decomposition and Artificial Neural Networks (ANNs) for day-ahead hourly load forecasting. First, the trajectory matrix of the time series is constructed and decomposed into trend, oscillating, and noise components. Next, the extracted components are employed as exogenous regressors in a global forecasting model, comprising either a Multilayer Perceptron (MLP) or a Long Short-Term Memory (LSTM) predictive layer. The model is further extended to include exogenous features, e.g., weather forecasts, transformed via parallel dense layers. The predictive performance is evaluated on two real-world datasets, controlling for the effect of exogenous features on predictive accuracy. The results show that the decomposition step improves the relative performance of the ANN models, with the combination of LSTM and SSA providing the best overall performance.
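The decomposition step described above (build the trajectory matrix, decompose it, group the parts into components) can be illustrated with basic SSA. The following is a minimal Python sketch using NumPy only. The window length, the grouping of singular triples, and the function name `ssa_decompose` are illustrative assumptions, not the authors' settings or code.

```python
import numpy as np


def ssa_decompose(series, window, groups):
    """Basic SSA: embed, take the SVD, group, and diagonal-average.

    series : 1-D sequence of length n
    window : embedding window length (assumed, 1 < window < n)
    groups : list of lists of singular-triple indices, one list per component
    """
    x = np.asarray(series, dtype=float)
    k = x.size - window + 1
    # Trajectory (Hankel) matrix: column j holds x[j : j + window].
    traj = np.column_stack([x[j:j + window] for j in range(k)])
    u, s, vt = np.linalg.svd(traj, full_matrices=False)
    components = []
    for idx in groups:
        # Sum of the rank-one matrices for the selected singular triples.
        elem = (u[:, idx] * s[idx]) @ vt[idx, :]
        # Diagonal averaging: average each anti-diagonal to recover a series.
        comp = np.array(
            [elem[::-1].diagonal(d).mean() for d in range(-window + 1, k)]
        )
        components.append(comp)
    return components


# Illustrative series: linear trend plus one oscillation.
t = np.arange(40)
load = 0.5 * t + np.sin(2 * np.pi * t / 8)
parts = ssa_decompose(load, window=10, groups=[[0, 1], [2, 3], list(range(4, 10))])
```

When the groups cover all singular triples, the components add up to the original series, so each one can be passed to a downstream model as a separate regressor. Which group represents trend, oscillation, or noise has to be judged from the singular values in practice.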
|
24
|
Short-Term Load Forecasting Using Neural Networks with Pattern Similarity-Based Error Weights. Energies 2021. [DOI: 10.3390/en14113224] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 11/17/2022]
Abstract
Forecasting time series with multiple seasonal cycles, as in short-term load forecasting, is a challenging problem due to the complicated relationship between input and output data. In this work, we use a pattern representation of the time series to simplify this relationship. Training a neural network on patterns is an easier task to solve. Thus, its architecture does not have to be complex and deep, nor equipped with mechanisms to deal with various time-series components. To improve the learning performance, we propose weighting the individual errors of training samples in the loss function. The error weights correspond to the similarity between the training pattern and the test query pattern. This approach makes the learning process more sensitive to the neighborhood of the test pattern: more distant patterns have less impact on the learned function around the test pattern, which leads to improved forecasting accuracy. The proposed framework is useful for a wide range of complex time-series forecasting problems. Its performance is illustrated in several short-term load-forecasting empirical studies in this work. In most cases, error weighting leads to a significant improvement in accuracy.
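The error-weighting idea above can be sketched as a weighted loss in which each training sample's error is scaled by how similar its input pattern is to the query pattern. The abstract does not specify the similarity measure. The Gaussian kernel on Euclidean distance, the `bandwidth` parameter, and both function names below are assumptions made for illustration.

```python
import numpy as np


def similarity_weights(train_patterns, query_pattern, bandwidth=1.0):
    """Weight each training pattern by its closeness to the query pattern.

    Assumed similarity: Gaussian kernel on Euclidean distance.
    """
    dist = np.linalg.norm(train_patterns - query_pattern, axis=1)
    return np.exp(-((dist / bandwidth) ** 2))


def weighted_mse(y_true, y_pred, weights):
    """Mean squared error with per-sample weights, normalized by their sum."""
    per_sample = np.mean((y_true - y_pred) ** 2, axis=1)
    return np.sum(weights * per_sample) / np.sum(weights)


# Illustrative data: two training patterns, one identical to the query.
train_x = np.array([[0.0, 0.0], [3.0, 4.0]])
query_x = np.array([0.0, 0.0])
w = similarity_weights(train_x, query_x, bandwidth=5.0)

y_true = np.array([[1.0, 2.0], [3.0, 4.0]])
y_pred = np.array([[1.0, 2.0], [5.0, 6.0]])
loss = weighted_mse(y_true, y_pred, w)
```

With these weights, the error on the distant second pattern contributes less to `loss` than it would under a plain average, which is the locality effect the abstract describes.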
|
25
|
Traditional vs. Machine-Learning Methods for Forecasting Sandy Shoreline Evolution Using Historic Satellite-Derived Shorelines. Remote Sensing 2021. [DOI: 10.3390/rs13050934] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 12/16/2022]
Abstract
Forecasting shoreline evolution for sandy coasts is important for sustainable coastal management, given the present-day increasing anthropogenic pressures and a changing future climate. Here, we evaluate eight different time-series forecasting methods for predicting future shorelines from historic satellite-derived shorelines. Analyzing more than 37,000 transects around the globe, we find that traditional forecast methods, together with some of the evaluated probabilistic Machine Learning (ML) time-series forecast algorithms, outperform Ordinary Least Squares (OLS) predictions for the majority of the sites. When forecasting seven years ahead, we find that these algorithms generate better predictions than OLS for 54% of the transect sites, producing forecasts with, on average, 29% smaller Mean Squared Error (MSE). Importantly, this advantage is shown to exist over all considered forecast horizons, i.e., from 1 up to 11 years. Although the ML algorithms do not produce significantly better predictions than traditional time-series forecast methods, some proved to be significantly more efficient in terms of computation time. We further provide insight into how these ML algorithms can be improved so that they can be expected to outperform not only OLS regression but also the traditional time-series forecast methods. These forecasting algorithms can be used by coastal engineers, managers, and scientists to generate future shoreline predictions at a global level and draw conclusions from them.
|