1
Wang S, Sun L, Yu Y. A dynamic customer segmentation approach by combining LRFMS and multivariate time series clustering. Sci Rep 2024; 14:17491. [PMID: 39080373 PMCID: PMC11289113 DOI: 10.1038/s41598-024-68621-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2024] [Accepted: 07/25/2024] [Indexed: 08/02/2024] Open
Abstract
To successfully market to automotive parts customers in the Industrial Internet era, parts agents need to perform effective customer analysis and management. Dynamic customer segmentation is an effective analytical tool that helps parts agents identify different customer groups. The RFM model and time series clustering algorithms are commonly used analytical methods in dynamic customer segmentation. However, the original RFM model suffers from randomness in the R index and ignores customers' perceived value. In addition, the time series clustering techniques in most existing studies on dynamic customer segmentation focus on univariate clustering, with less research on multivariate clustering. To address these problems, this paper proposes a dynamic customer segmentation approach that combines LRFMS and multivariate time series clustering. First, the method represents each customer's behavior as a time series of the Length, Recency, Frequency, Monetary, and Satisfaction variables. Next, a multi-dimensional time series clustering algorithm based on three distance measures, DTW-D, SBD, and CID, is applied to carry out customer segmentation. Finally, an empirical study and comparative analyses are conducted using customer transaction data of parts agents to verify the effectiveness of the approach. Additionally, a detailed analysis of different customer groups is made, and corresponding marketing suggestions are provided.
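As an illustration of one of the three distance measures named in the abstract, the sketch below computes the complexity-invariant distance (CID), which scales the Euclidean distance between two series by the ratio of their complexity estimates. It is a minimal sketch, not the authors' implementation: the function names, the handling of flat series, and the choice to sum per-variable distances across the LRFMS dimensions are all assumptions made here for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def complexity_estimate(series):
    """Root of the summed squared first differences of a series."""
    return math.sqrt(
        sum((series[i + 1] - series[i]) ** 2 for i in range(len(series) - 1))
    )


def cid(a, b):
    """Complexity-invariant distance: Euclidean distance times a complexity ratio."""
    base = euclidean(a, b)
    ce_a, ce_b = complexity_estimate(a), complexity_estimate(b)
    low, high = min(ce_a, ce_b), max(ce_a, ce_b)
    if high == 0.0:
        return base  # both series are flat, so no correction applies
    # Assumption of this sketch: clamp the denominator to avoid division by zero
    # when exactly one of the two series is flat.
    return base * high / max(low, 1e-12)


def multivariate_cid(customer_a, customer_b, variables=("L", "R", "F", "M", "S")):
    """Hypothetical multivariate extension: sum CID over each behavioral variable."""
    return sum(cid(customer_a[v], customer_b[v]) for v in variables)


# Two made-up purchase-frequency series with the same endpoints but
# different amounts of fluctuation.
steady = [1, 2, 3, 4]
erratic = [1, 3, 2, 4]
print(cid(steady, erratic))
```

Because the more erratic series has the larger complexity estimate, the correction factor exceeds one and pushes the pair further apart than the plain Euclidean distance would.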
Affiliation(s)
- Shuhai Wang
  - School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu, 611756, China
  - Manufacturing Industry Chain Collaboration and Information Support Technology Key Laboratory of Sichuan Province, Southwest Jiaotong University, Chengdu, 610031, China
- Linfu Sun
  - School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu, 611756, China
  - Manufacturing Industry Chain Collaboration and Information Support Technology Key Laboratory of Sichuan Province, Southwest Jiaotong University, Chengdu, 610031, China
- Yang Yu
  - School of Computer Science, Southwest Petroleum University, Chengdu, 610500, China
2
Bothwell S, Kaizer A, Peterson R, Ostendorf D, Catenacci V, Wrobel J. Pattern-based clustering of daily weigh-in trajectories using dynamic time warping. Biometrics 2023; 79:2719-2731. [PMID: 36217829 PMCID: PMC10393286 DOI: 10.1111/biom.13773] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2021] [Accepted: 09/29/2022] [Indexed: 11/29/2022]
Abstract
"Smart" scales are a new tool for frequent monitoring of weight change as well as weigh-in behavior. These scales give researchers the opportunity to discover patterns in how frequently individuals weigh themselves over time, and how these patterns are associated with overall weight loss. Our motivating data come from an 18-month behavioral weight loss study of 55 adults classified as overweight or obese who were instructed to weigh themselves daily. Adherence to daily weigh-in routines produces a binary time series for each subject, indicating whether a participant weighed in on a given day. To characterize weigh-in by time-invariant patterns rather than overall adherence, we propose using hierarchical clustering with dynamic time warping (DTW). We perform an extensive simulation study to evaluate the performance of DTW compared to Euclidean and Jaccard distances in recovering underlying patterns in adherence time series. In addition, we compare cluster performance using cluster validation indices (CVIs) under the single, average, complete, and Ward linkages and evaluate how internal and external CVIs compare for clustering binary time series. We apply conclusions from the simulation to cluster our real data and summarize observed weigh-in patterns. Our analysis finds that the adherence trajectory pattern is significantly associated with weight loss.
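To make the contrast between DTW and a day-by-day comparison concrete, the sketch below computes both for two binary adherence series that share a rhythm but are offset by one day. It is a minimal illustration under stated assumptions, not the study's code: the absolute-difference local cost, the absence of a warping window, the function names, and the example series are all choices made here.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences.

    Uses absolute difference as the local cost and no warping-window
    constraint (both are assumptions of this sketch).
    """
    n, m = len(a), len(b)
    # cost[i][j] = cheapest cumulative cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


def pointwise_distance(a, b):
    """Number of days on which two equal-length binary series disagree."""
    return sum(abs(x - y) for x, y in zip(a, b))


def pairwise_dtw(series_list):
    """Symmetric matrix of DTW distances between all pairs of series."""
    k = len(series_list)
    matrix = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            d = dtw_distance(series_list[i], series_list[j])
            matrix[i][j] = matrix[j][i] = d
    return matrix


# Hypothetical adherence records: 1 = weighed in that day, 0 = did not.
pattern = [1, 1, 0, 0, 1, 1, 0, 0]
pattern_shifted = [0, 1, 1, 0, 0, 1, 1, 0]  # same rhythm, one day later
print(pointwise_distance(pattern, pattern_shifted))
print(dtw_distance(pattern, pattern_shifted))
```

The two example series disagree on four of the eight days, yet their DTW distance is only 1, because warping realigns the shifted rhythm. A matrix such as the one returned by `pairwise_dtw` is the kind of input a hierarchical clustering routine would take.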
Affiliation(s)
- Samantha Bothwell
  - Department of Biostatistics and Informatics, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA
- Alex Kaizer
  - Department of Biostatistics and Informatics, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA
- Ryan Peterson
  - Department of Biostatistics and Informatics, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA
- Danielle Ostendorf
  - Department of Medicine, Division of Endocrinology, Metabolism, and Diabetes, University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA
- Victoria Catenacci
  - Department of Medicine, Division of Endocrinology, Metabolism, and Diabetes, University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA
- Julia Wrobel
  - Department of Biostatistics and Informatics, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA
3
Ostmeyer J, Cowell L, Christley S. Dynamic kernel matching for non-conforming data: A case study of T cell receptor datasets. PLoS One 2023; 18:e0265313. [PMID: 36881590 PMCID: PMC9990938 DOI: 10.1371/journal.pone.0265313] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/09/2021] [Accepted: 03/01/2022] [Indexed: 03/08/2023] Open
Abstract
Most statistical classifiers are designed to find patterns in data where numbers fit into rows and columns, as in a spreadsheet, but many kinds of data do not conform to this structure. To uncover patterns in such data, we describe an approach for modifying established statistical classifiers to handle non-conforming data, which we call dynamic kernel matching (DKM). As examples of non-conforming data, we consider (i) a dataset of T-cell receptor (TCR) sequences labelled by disease antigen and (ii) a dataset of sequenced TCR repertoires labelled by patient cytomegalovirus (CMV) serostatus, anticipating that both datasets contain signatures for diagnosing disease. We successfully fit statistical classifiers augmented with DKM to both datasets and report the performance on holdout data using standard metrics and metrics allowing for indeterminate diagnoses. Finally, we identify the patterns used by our statistical classifiers to generate predictions and show that these patterns agree with observations from experimental studies.
Affiliation(s)
- Jared Ostmeyer
  - Department of Population and Data Sciences, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America
- Lindsay Cowell
  - Department of Population and Data Sciences, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America
- Scott Christley
  - Department of Population and Data Sciences, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America
4
Liang M, Wang X, Wu S. Improving stock trend prediction through financial time series classification and temporal correlation analysis based on aligning change point. Soft Comput 2022. [DOI: 10.1007/s00500-022-07630-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
5
Hao Q, Wang C, Xiao Y, Lin H. IMGC-GNN: A multi-granularity coupled graph neural network recommendation method based on implicit relationships. Appl Intell 2022; 53:14668-14689. [PMID: 36340421 PMCID: PMC9628402 DOI: 10.1007/s10489-022-04215-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 09/26/2022] [Indexed: 11/23/2022]
Abstract
In the application recommendation field, the collaborative filtering (CF) method is often considered one of the most effective methods. As the basis of CF-based recommendation methods, representation learning needs to learn two types of factors: attribute factors revealed by independent individuals (e.g., user attributes, application types) and interaction factors contained in collaborative signals (e.g., interactions influenced by others). However, existing CF-based methods fail to learn these two factors separately; therefore, it is difficult to understand the deeper motivation behind user behaviors, resulting in suboptimal performance. From this point of view, we propose a multi-granularity coupled graph neural network recommendation method based on implicit relationships (IMGC-GNN). Specifically, we introduce contextual information (time and space) into user-application interactions and construct a three-layer coupled graph. Then, a graph neural network approach is used to learn the attribute and interaction factors separately. For attribute representation learning, we decompose the coupled graph into three homogeneous graphs with users, applications, and contexts as nodes. Next, we use multilayer aggregation operations to learn features between users, between contexts, and between applications. For interaction representation learning, we construct a homogeneous graph with user-context-application interactions as nodes. Next, we use node similarity and structural similarity to learn the deep interaction features. Finally, according to the learned representations, IMGC-GNN makes accurate application recommendations to users in different contexts. To verify the validity of the proposed method, we conduct experiments on real-world interaction data from three cities and compare our model with seven baseline methods. The experimental results show that our method has the best performance in top-k recommendation.
Affiliation(s)
- Qingbo Hao
  - School of Computer Science and Engineering, Tianjin University of Technology, Binshui West Road, Tianjin, 300191 Tianjin China
  - Tianjin Key Laboratory of Intelligence Computing and Novel Software Technology, Ministry of Education, Binshui West Road, Tianjin, 300191 Tianjin China
  - Engineering Research Center of Learning-Based Intelligent System, Ministry of Education, Binshui West Road, Tianjin, 300191 Tianjin China
- Chundong Wang
  - School of Computer Science and Engineering, Tianjin University of Technology, Binshui West Road, Tianjin, 300191 Tianjin China
  - Tianjin Key Laboratory of Intelligence Computing and Novel Software Technology, Ministry of Education, Binshui West Road, Tianjin, 300191 Tianjin China
  - Engineering Research Center of Learning-Based Intelligent System, Ministry of Education, Binshui West Road, Tianjin, 300191 Tianjin China
- Yingyuan Xiao
  - School of Computer Science and Engineering, Tianjin University of Technology, Binshui West Road, Tianjin, 300191 Tianjin China
  - Tianjin Key Laboratory of Intelligence Computing and Novel Software Technology, Ministry of Education, Binshui West Road, Tianjin, 300191 Tianjin China
  - Engineering Research Center of Learning-Based Intelligent System, Ministry of Education, Binshui West Road, Tianjin, 300191 Tianjin China
- Hao Lin
  - School of Computer Science and Engineering, Tianjin University of Technology, Binshui West Road, Tianjin, 300191 Tianjin China
  - Tianjin Key Laboratory of Intelligence Computing and Novel Software Technology, Ministry of Education, Binshui West Road, Tianjin, 300191 Tianjin China
  - Engineering Research Center of Learning-Based Intelligent System, Ministry of Education, Binshui West Road, Tianjin, 300191 Tianjin China
6
Time-series classification with SAFE: Simple and fast segmented word embedding-based neural time series classifier. Inf Process Manag 2022. [DOI: 10.1016/j.ipm.2022.103044] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
7
Duan Z, Xu H, Wang Y, Huang Y, Ren A, Xu Z, Sun Y, Wang W. Multivariate time-series classification with hierarchical variational graph pooling. Neural Netw 2022; 154:481-490. [DOI: 10.1016/j.neunet.2022.07.032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/05/2021] [Revised: 03/22/2022] [Accepted: 07/26/2022] [Indexed: 10/16/2022]
8
A novel transfer learning-based short-term solar forecasting approach for India. Neural Comput Appl 2022. [DOI: 10.1007/s00521-022-07328-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
9
Tong Y, Liu J, Yu L, Zhang L, Sun L, Li W, Ning X, Xu J, Qin H, Cai Q. Technology investigation on time series classification and prediction. PeerJ Comput Sci 2022; 8:e982. [PMID: 35634126 PMCID: PMC9138170 DOI: 10.7717/peerj-cs.982] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/15/2022] [Accepted: 04/25/2022] [Indexed: 06/01/2023]
Abstract
Time series appear in many scientific fields and are an important type of data. Time series analysis techniques are an essential means of discovering the knowledge hidden in this type of data. In recent years, many scholars have achieved fruitful results in the study of time series. A statistical analysis of 120,000 publications from 2017 to 2021 reveals that research on time series mostly focuses on classification and prediction. Therefore, in this study, we analyze the technical development routes of time series classification and prediction algorithms. Eighty-seven publications with high relevance and high citation counts are selected for analysis, aiming to provide a more comprehensive reference base for interested researchers. Time series classification is divided into supervised methods, semi-supervised methods, and early classification of time series, which are key extensions of time series classification tasks. For time series prediction, the performance and applications of different methods are discussed, from classical statistical methods to neural network methods and then to fuzzy modeling and transfer learning methods. We hope this article can aid the understanding of the current development status and help discover possible future research directions, such as exploring the interpretability of time series analysis and online learning modeling.
Affiliation(s)
- Yuerong Tong
  - Institute of Semiconductors, Chinese Academy of Sciences, Beijing, China
- Jingyi Liu
  - Institute of Semiconductors, Chinese Academy of Sciences, Beijing, China
- Lina Yu
  - Institute of Semiconductors, Chinese Academy of Sciences, Beijing, China
- Liping Zhang
  - Institute of Semiconductors, Chinese Academy of Sciences, Beijing, China
- Linjun Sun
  - Institute of Semiconductors, Chinese Academy of Sciences, Beijing, China
- Weijun Li
  - Institute of Semiconductors, Chinese Academy of Sciences, Beijing, China
  - Shenzhen DAPU Microelectronics Co., Ltd., Shenzhen, China
- Xin Ning
  - Institute of Semiconductors, Chinese Academy of Sciences, Beijing, China
- Jian Xu
  - Institute of Semiconductors, Chinese Academy of Sciences, Beijing, China
- Hong Qin
  - Institute of Semiconductors, Chinese Academy of Sciences, Beijing, China
- Qiang Cai
  - National Engineering Laboratory for Agri-product Quality Traceability, Beijing Technology and Business University, Beijing, China
10
A Novel Feature Representation for Prediction of Global Horizontal Irradiance Using a Bidirectional Model. Machine Learning and Knowledge Extraction 2021. [DOI: 10.3390/make3040047] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/17/2022]
Abstract
Complex weather conditions, in particular clouds, lead to uncertainty in photovoltaic (PV) systems, which makes solar energy prediction very difficult. Currently, in the renewable energy domain, deep-learning-based sequence models have reported better results compared to state-of-the-art machine-learning models. There are quite a few choices of deep-learning architectures, among which the Bidirectional Gated Recurrent Unit (BGRU) has apparently not been used earlier in the solar energy domain. In this paper, BGRU was used with a new augmented and bidirectional feature representation. The BGRU network used is more generalized, as it can handle unequal lengths of forward and backward context. The proposed model produced 59.21%, 37.47%, and 76.80% better prediction accuracy compared to traditional sequence-based models, bidirectional models, and some of the established state-of-the-art models, respectively. The testbed considered for evaluation of the model is far more comprehensive and reliable, considering the variability in the climatic zones and seasons, as compared to some of the recent studies in India.
11
Vázquez I, Villar JR, Sedano J, Simić S, de la Cal E. An ensemble solution for multivariate time series clustering. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2020.09.093] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/21/2022]
12
Deep Learning-Based Phenological Event Modeling for Classification of Crops. Remote Sensing 2021. [DOI: 10.3390/rs13132477] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/16/2022]
Abstract
Classification of crops using time-series vegetation index (VI) curves requires appropriate modeling of phenological events and their characteristics. The current study explores the use of capsules, a group of neurons having an activation vector, to learn the characteristic features of the phenological curves. In addition, joint optimization of denoising and classification is adopted to improve the generalizability of the approach and to make it resilient to noise. The proposed approach employs reconstruction loss as a regularizer for classification, whereas the crop-type label is used as prior information for denoising. The activity vector of the class capsule is applied to sample the latent space conditioned on the cell state of a Long Short-Term Memory (LSTM) that integrates the sequences of the phenological events. Learning of significant phenological characteristics is facilitated by adversarial variational encoding in conjunction with constraints to regulate latent representations and embed label information. The proposed architecture, called the variational capsule network (VCapsNet), significantly improves the classification and denoising results. The performance of VCapsNet can be attributed to the suitable modeling of phenological events and the resilience to outliers and noise. The maxpooling-based capsule implementation yields better results, particularly with limited training samples, compared to the conventional implementations. In addition to the confusion matrix-based accuracy measures, this study illustrates the use of interpretability-based evaluation measures. Moreover, the proposed approach is less sensitive to noise and yields good results, even at shallower depths, compared to the main existing approaches. The performance of VCapsNet in accurately classifying wheat and barley crops indicates that the approach addresses the issues in crop-type classification. 
The approach is generic and effectively models the crop-specific phenological features and events. The interpretability-based evaluation measures further indicate that the approach successfully identifies the crop transitions, in addition to the planting, heading, and harvesting dates. Due to its effectiveness in crop-type classification, the proposed approach is applicable to acreage estimation and other applications in different scales.
13
Qi J, Luo N. Using Stacked Auto-Encoder and Bi-Directional LSTM For Batch Process Quality Prediction. Journal of Chemical Engineering of Japan 2021. [DOI: 10.1252/jcej.19we235] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/11/2022]
Affiliation(s)
- Jiakang Qi
  - Key Laboratory of Advanced Control and Optimization for Chemical Processes, Ministry of Education, East China University of Science and Technology
- Na Luo
  - Key Laboratory of Advanced Control and Optimization for Chemical Processes, Ministry of Education, East China University of Science and Technology
14
A Deep Learning-Based Model for the Automated Assessment of the Activity of a Single Worker. Sensors 2020; 20:s20092571. [PMID: 32366014 PMCID: PMC7248754 DOI: 10.3390/s20092571] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 04/04/2020] [Revised: 04/27/2020] [Accepted: 04/29/2020] [Indexed: 11/24/2022]
Abstract
Nowadays, it is necessary to verify the accuracy of servicing work undertaken by new employees within a manufacturing company. A research gap has been observed in effective methods for automatically evaluating the work of a newly employed worker. The main purpose of the study is to build a new deep learning model to automatically assess the activity of a single worker. The proposed approach integrates the methods known as CNN, CNN + SVM, and CNN + R-CNN, four new algorithms, and a piece of work from a selected company, used as a self-created dataset, to create a solution enabling assessment of the activity of single workers. Data were collected from an operational manufacturing cell without any guided or scripted work. The results reveal that the model developed is able to accurately detect the correctness of the work process. The model’s accuracy mostly exceeds current state-of-the-art methods for detecting work activities in manufacturing. The proposed two-stage approach, which first assigns the appropriate graphic instruction to a given employee’s activity using CNN and then uses R-CNN to isolate the object from the reference frames, yields 94.01% and 73.15% accuracy of identification, respectively.