1
|
A framework for inferring and analyzing pharmacotherapy treatment patterns. BMC Med Inform Decis Mak 2024; 24:68. [PMID: 38459459 PMCID: PMC10924394 DOI: 10.1186/s12911-024-02469-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2023] [Accepted: 02/26/2024] [Indexed: 03/10/2024] Open
Abstract
BACKGROUND: To discover pharmacotherapy prescription patterns and their statistical associations with outcomes through a clinical pathway inference framework applied to real-world data. METHODS: We apply machine learning steps in our framework using a 2006 to 2020 cohort of veterans with major depressive disorder (MDD). Outpatient antidepressant pharmacy fills, dispensed inpatient antidepressant medications, emergency department visits, self-harm, and all-cause mortality data were extracted from the Department of Veterans Affairs Corporate Data Warehouse. RESULTS: Our MDD cohort consisted of 252,179 individuals. During the study period there were 98,417 emergency department visits, 1,016 cases of self-harm, and 1,507 deaths from all causes. The top ten prescription patterns accounted for 69.3% of the data for individuals starting antidepressants at the fluoxetine equivalent of 20-39 mg. Additionally, we found associations between outcomes and dosage change. CONCLUSIONS: For 252,179 Veterans who served in Iraq and Afghanistan with subsequent MDD noted in their electronic medical records, we documented and described the major pharmacotherapy prescription patterns implemented by Veterans Health Administration providers. Ten patterns accounted for almost 70% of the data. Associations between antidepressant usage and outcomes in observational data may be confounded. The low numbers of adverse events, especially those associated with all-cause mortality, make our calculations imprecise. Furthermore, our outcomes are also indications for both disease and treatment. Despite these limitations, we demonstrate the usefulness of our framework in providing operational insight into clinical practice, and our results underscore the need for increased monitoring during critical points of treatment.
|
2
|
Roses Have Thorns: Understanding the Downside of Oncological Care Delivery Through Visual Analytics and Sequential Rule Mining. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2024; 30:1227-1237. [PMID: 38015695 PMCID: PMC10842255 DOI: 10.1109/tvcg.2023.3326939] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2023]
Abstract
Personalized head and neck cancer therapeutics have greatly improved survival rates for patients, but often lead to understudied, long-lasting symptoms that affect quality of life. Sequential rule mining (SRM) is a promising unsupervised machine learning method for predicting longitudinal patterns in temporal data; however, it can output many repetitive patterns that are difficult to interpret without the assistance of visual analytics. We present a data-driven, human-machine visual analysis system developed in collaboration with SRM model builders in cancer symptom research, which facilitates mechanistic knowledge discovery in large-scale, multivariate cohort symptom data. Our system supports multivariate predictive modeling of post-treatment symptoms based on during-treatment symptoms. It supports this goal through an SRM, clustering, and aggregation back end, and a custom front end to help develop and tune the predictive models. The system also explains the resulting predictions in the context of therapeutic decisions typical in personalized care delivery. We evaluate the resulting models and system with an interdisciplinary group of modelers and head and neck oncology researchers. The results demonstrate that our system effectively supports clinical and symptom research.
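To make the notion of a sequential rule concrete, here is a minimal sketch of how the support and confidence of one rule X -> Y could be computed over toy symptom sequences. It is not the system described in the abstract: the function names, the symptom labels, and the "all of X, then all of Y" occurrence test are assumptions chosen for illustration.

```python
def antecedent_end(sequence, antecedent):
    """Index at which every antecedent item has been seen, or None (hypothetical helper)."""
    remaining = set(antecedent)
    for i, item in enumerate(sequence):
        remaining.discard(item)
        if not remaining:
            return i
    return None


def rule_stats(sequences, antecedent, consequent):
    """Return (support, confidence) of the sequential rule antecedent -> consequent."""
    n_antecedent = 0  # sequences containing all antecedent items
    n_rule = 0        # sequences where the consequent follows the antecedent
    for seq in sequences:
        end = antecedent_end(seq, antecedent)
        if end is None:
            continue
        n_antecedent += 1
        if set(consequent) <= set(seq[end + 1:]):
            n_rule += 1
    support = n_rule / len(sequences)
    confidence = n_rule / n_antecedent if n_antecedent else 0.0
    return support, confidence


# Toy data: each list is one patient's symptoms in the order they were reported.
sequences = [
    ["pain", "dry_mouth", "fatigue", "taste_loss"],
    ["dry_mouth", "pain", "taste_loss"],
    ["pain", "fatigue"],
    ["fatigue", "taste_loss"],
]
stats = rule_stats(sequences, {"pain", "dry_mouth"}, {"taste_loss"})
print(stats)  # (0.5, 1.0)
```

Real SRM tools enumerate many such rules and prune them by minimum support and confidence, which is where the repetitive output mentioned in the abstract comes from.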
|
3
|
Classification and variable selection using the mining of positive and negative association rules. Inf Sci (N Y) 2023. [DOI: 10.1016/j.ins.2023.02.068] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/26/2023]
|
4
|
Real-time recommendations for energy-efficient appliance usage in households. Front Big Data 2022; 5:972206. [PMID: 36204447 PMCID: PMC9530195 DOI: 10.3389/fdata.2022.972206] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2022] [Accepted: 08/30/2022] [Indexed: 11/17/2022] Open
Abstract
According to several studies, the most influential factor in a household's energy consumption is user behavior. Changing user behavior to improve energy usage leads to efficient energy consumption, saving money for the consumer and being friendlier to the environment. In this work we propose a framework that aims at assisting households in improving their energy usage by providing real-time recommendations for efficient appliance use. The framework allows for the creation of household-specific and appliance-specific energy consumption profiles by analyzing appliance usage patterns. Based on the household profile and the actual electricity use, real-time recommendations notify users of the appliances that can be switched off in order to reduce consumption. For instance, if a consumer leaves their A/C on at a time when it is usually off (e.g., when there is no one at home), the system will detect this as an outlier and notify the consumer. In the ideal scenario, a household has a smart meter monitoring system installed that records energy consumption at the appliance level. This is also reflected in the datasets available for evaluating such systems. However, in the general case, the household may only have one main meter reading. In this case, non-intrusive load monitoring (NILM) techniques, which monitor a house's energy consumption using only one meter, and data mining algorithms that disaggregate the consumption to the appliance level, can be employed. In this paper, we propose an end-to-end solution to this problem, starting with the energy disaggregation process and the creation of user profiles, which are then fed to the pattern mining and recommendation process; an intuitive UI allows users to further refine their energy consumption preferences and set goals. We employ the UK-DALE (UK Domestic Appliance-Level Electricity) dataset for our experimental evaluations and the proof-of-concept implementation. The results show that the proposed framework accurately captures the energy consumption profiles of each household, and thus the generated recommendations match the actual household energy habits and can help reduce energy consumption by 2–17%.
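The A/C example in the abstract can be sketched as a simple profile-based outlier check. This is an illustrative assumption, not the paper's method: the per-hour "fraction of time on" profile, the function names, and the threshold value are all hypothetical.

```python
from collections import defaultdict


def build_profile(history):
    """Map each hour to the fraction of past observations in which the appliance was on.

    `history` is a list of (hour, is_on) pairs for one appliance (hypothetical format).
    """
    on_count = defaultdict(int)
    total = defaultdict(int)
    for hour, is_on in history:
        total[hour] += 1
        on_count[hour] += int(is_on)
    return {hour: on_count[hour] / total[hour] for hour in total}


def should_notify(profile, hour, is_on, threshold=0.2):
    """Flag an appliance that is on at an hour when it is historically rarely on.

    Hours with no history are treated as "usually off" (an assumption).
    """
    return is_on and profile.get(hour, 0.0) < threshold


# Toy history for an A/C unit: rarely on at 14:00, usually on at 20:00.
history = (
    [(14, False)] * 9 + [(14, True)]
    + [(20, True)] * 8 + [(20, False)] * 2
)
profile = build_profile(history)
print(should_notify(profile, 14, True))  # True: on at an unusual hour
print(should_notify(profile, 20, True))  # False: matches the usual habit
```

A per-hour frequency is the simplest possible profile; a mined usage pattern could replace it without changing the notification check.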
|
5
|
Developing customer attrition management system: discovering action rules for making recommendations to retain customers. APPL INTELL 2022. [DOI: 10.1007/s10489-022-03614-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
|
6
|
Technological trend mining: identifying new technology opportunities using patent semantic analysis. Inf Process Manag 2022. [DOI: 10.1016/j.ipm.2022.102993] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/05/2022]
|
7
|
Machine learning in postgenomic biology and personalized medicine. WILEY INTERDISCIPLINARY REVIEWS. DATA MINING AND KNOWLEDGE DISCOVERY 2022; 12:e1451. [PMID: 35966173 PMCID: PMC9371441 DOI: 10.1002/widm.1451] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2020] [Accepted: 12/22/2021] [Indexed: 06/15/2023]
Abstract
In recent years, Artificial Intelligence in the form of machine learning has been revolutionizing biology, biomedical sciences, and gene-based agricultural technology capabilities. Massive data generated in biological sciences by rapid and deep gene sequencing and protein or other molecular structure determination, on the one hand, requires data analysis capabilities using machine learning that are distinctly different from classical statistical methods; on the other, these large datasets are enabling the adoption of novel data-intensive machine learning algorithms for the solution of biological problems that until recently had relied on mechanistic model-based approaches that are computationally expensive. This review provides a bird's eye view of the applications of machine learning in post-genomic biology. An attempt is also made to indicate, as far as possible, the areas of research that are poised to make further impacts in these areas, including the importance of explainable artificial intelligence (XAI) in human health. Further contributions of machine learning are expected to transform medicine, public health, and agricultural technology, as well as to provide invaluable gene-based guidance for the management of complex environments in this age of global warming.
|
8
|
A New Method Combining Pattern Prediction and Preference Prediction for Next Basket Recommendation. ENTROPY 2021; 23:e23111430. [PMID: 34828128 PMCID: PMC8623780 DOI: 10.3390/e23111430] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/16/2021] [Revised: 10/18/2021] [Accepted: 10/25/2021] [Indexed: 11/16/2022]
Abstract
Market basket prediction, which is the basis of product recommendation systems, is the concept of predicting what customers will buy in the next shopping basket based on analysis of their historical shopping records. Although product recommendation systems are developing rapidly and perform well in practice, state-of-the-art algorithms still have plenty of room for improvement. In this paper, we propose a new algorithm combining pattern prediction and preference prediction. In pattern prediction, sequential rules, periodic patterns and association rules are mined, and probability models are established based on their statistical characteristics, e.g., the distribution of periods of a periodic pattern, to make a more precise prediction. Products that have a higher probability are given priority for recommendation. If the quantity of recommended products is insufficient, then we make a preference prediction to select more products. Preference prediction is based on the frequency and tendency of products that appear in customers’ individual shopping records, where tendency is a new concept to reflect the evolution of customers’ shopping preferences. Experiments show that our algorithm outperforms the baseline methods and state-of-the-art methods on three of four real-world transaction sequence datasets.
|
9
|
|
10
|
|
11
|
An annotated association mining approach for extracting and visualizing interesting clinical events. Int J Med Inform 2020; 148:104366. [PMID: 33485216 DOI: 10.1016/j.ijmedinf.2020.104366] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/22/2020] [Revised: 11/29/2020] [Accepted: 12/08/2020] [Indexed: 11/16/2022]
Abstract
OBJECTIVE: This work aims at deriving interesting clinical events using association rule mining based on a user-annotated order of clinical features. MATERIALS AND METHODS: A user specifies a partial temporal order of features by indexing features of interest, with repeated and bundled indexes allowed as needed. An association mining algorithm plug-in was designed to generate rules that adhere to the user-specified temporal order. The plug-in uses temporal and sequence constraints to reduce rule permutations early in the rule generation process. The method was evaluated with a large medical claims dataset to generate clinical events. RESULTS: Using the plug-in algorithm, the database is scanned to calculate the support of item sequences whose sequential order conforms with the user-annotated feature order. In our experiments with 20,000 medical claim data records, our method generated rules in significantly less time than the standalone Apriori algorithm. Our approach generates dendrograms to organize the rules into meaningful hierarchies and provides a graphical interface to navigate the rules and unfold interesting clinical events. DISCUSSION: Since many associations in healthcare are of a sequential nature, some of the derived rules may describe interesting clinical flows or events, while others may be contextually irrelevant. Our method exploits user-specified sequence constraints to eliminate irrelevant rules and reduce rule permutations, speeding up rule mining. CONCLUSION: This work can be the foundation for future association rule mining studies to extract sequential events based on interestingness. The work can support clinical education where the instructor defines feature sequence constraints, and students unfold and examine extracted sequential rules.
|
12
|
eXplainable Artificial Intelligence (XAI) for the identification of biologically relevant gene expression patterns in longitudinal human studies, insights from obesity research. PLoS Comput Biol 2020; 16:e1007792. [PMID: 32275707 PMCID: PMC7176286 DOI: 10.1371/journal.pcbi.1007792] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 08/26/2019] [Revised: 04/22/2020] [Accepted: 03/17/2020] [Indexed: 12/18/2022] Open
Abstract
To date, several machine learning approaches have been proposed for the dynamic modeling of temporal omics data. Although they have yielded impressive results in terms of model accuracy and predictive ability, most of these applications are based on "black-box" algorithms, and the research community has called for more interpretable models. The recent eXplainable Artificial Intelligence (XAI) revolution offers a solution for this issue, where rule-based approaches are highly suitable for explanatory purposes. The further integration of the data mining process with functional-annotation and pathway analyses is an additional way towards more explanatory and biologically sound models. In this paper, we present a novel rule-based XAI strategy (including pre-processing, knowledge-extraction and functional validation) for finding biologically relevant sequential patterns from longitudinal human gene expression data (GED). To illustrate the performance of our pipeline, we work on in vivo temporal GED collected within the course of a long-term dietary intervention in 57 subjects with obesity (GSE77962). As validation populations, we employ three independent datasets following the same experimental design. As a result, we validate primarily extracted gene patterns and prove the goodness of our strategy for the mining of biologically relevant gene-gene temporal relations. Our whole pipeline has been gathered under open-source software and could be easily extended to other human temporal GED applications.
|
13
|
The potential role of auditory prediction error in decompensated tinnitus: An auditory mismatch negativity study. Brain Behav 2019; 9:e01242. [PMID: 30895749 PMCID: PMC6456780 DOI: 10.1002/brb3.1242] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 09/25/2018] [Revised: 12/08/2018] [Accepted: 12/10/2018] [Indexed: 01/25/2023] Open
Abstract
INTRODUCTION: Some tinnitus subjects habituate to their tinnitus, while others do not and complain greatly of its annoyance. Normal sensory memory and change detection processes are needed for detecting the tinnitus signal as a prediction error and for habituation to tinnitus. The purpose of this study was to compare auditory mismatch negativity, as an index of sensory memory and change detection, among the studied groups to search for the factors involved in the perception of tinnitus and in preventing habituation in decompensated tinnitus subjects. METHODS: Electroencephalography was recorded from scalp electrodes in compensated tinnitus, decompensated tinnitus, and no-tinnitus control subjects. Mismatch negativity was obtained using the oddball paradigm with frequency, duration, and silent gap deviants. Amplitude, latency, and area under the curve of mismatch negativities were compared among the three studied groups. RESULTS: The results showed lower mismatch negativity amplitude and area under the curve for the higher frequency deviant and for the silent gap deviant in the decompensated tinnitus group compared to the normal control and compensated tinnitus groups. CONCLUSIONS: This study revealed a deficit in sensory memory and change detection processing in decompensated tinnitus subjects. This causes persistent prediction errors; the tinnitus signal is consistently detected as a new signal, activates the brain salience network, and consequently prevents habituation to tinnitus. Mismatch negativity is proposed as an index for monitoring tinnitus rehabilitation.
|
14
|
|
15
|
|
16
|
Web page recommendation system based on partially ordered sequential rules. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS 2017. [DOI: 10.3233/jifs-169244] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 11/15/2022]
|
17
|
FCloSM, FGenSM: two efficient algorithms for mining frequent closed and generator sequences using the local pruning strategy. Knowl Inf Syst 2017. [DOI: 10.1007/s10115-017-1032-6] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Indexed: 10/20/2022]
|
18
|
Location Prediction Based on Transition Probability Matrices Constructing from Sequential Rules for Spatial-Temporal K-Anonymity Dataset. PLoS One 2016; 11:e0160629. [PMID: 27508502 PMCID: PMC4980015 DOI: 10.1371/journal.pone.0160629] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 03/31/2016] [Accepted: 07/23/2016] [Indexed: 11/19/2022] Open
Abstract
Spatial-temporal k-anonymity has become a mainstream approach among techniques for protection of users' privacy in location-based services (LBS) applications, and has been applied to several variants such as LBS snapshot queries and continuous queries. Analyzing large-scale spatial-temporal anonymity sets may benefit several LBS applications. In this paper, we propose two location prediction methods based on transition probability matrices constructed from sequential rules for spatial-temporal k-anonymity datasets. First, we define single-step sequential rules mined from sequential spatial-temporal k-anonymity datasets generated from continuous LBS queries for multiple users. We then construct transition probability matrices from the mined single-step sequential rules, and normalize the transition probabilities in the transition matrices. Next, we regard a mobility model for an LBS requester as a stationary stochastic process and compute the n-step transition probability matrices by raising the normalized transition probability matrices to the power n. Furthermore, we propose two location prediction methods: rough prediction and accurate prediction. The former obtains the probabilities of arriving at target locations along simple paths that include only current locations, target locations and transition steps. By iteratively combining the probabilities for simple paths with n steps and the probabilities for detailed paths with n-1 steps, the latter method calculates transition probabilities for detailed paths with n steps from current locations to target locations. Finally, we conduct extensive experiments, which verify the correctness and flexibility of our proposed algorithm.
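The matrix steps in this abstract (build from single-step rules, normalize, raise to the power n) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the rule counts, location labels, and function names are invented, and plain Python lists stand in for a matrix library.

```python
def build_transition_matrix(rule_counts, locations):
    """Row-normalized transition matrix from single-step rule counts.

    `rule_counts` maps (source, destination) to how often the rule was observed
    (hypothetical input format).
    """
    index = {loc: i for i, loc in enumerate(locations)}
    size = len(locations)
    matrix = [[0.0] * size for _ in range(size)]
    for (src, dst), count in rule_counts.items():
        matrix[index[src]][index[dst]] += count
    for row in matrix:
        total = sum(row)
        if total > 0:  # leave all-zero rows untouched
            for j in range(size):
                row[j] /= total
    return matrix


def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    size = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
        for i in range(size)
    ]


def mat_pow(matrix, n):
    """n-step transition matrix: the matrix raised to the power n (n >= 0)."""
    size = len(matrix)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, matrix)
    return result


locations = ["A", "B", "C"]
rule_counts = {("A", "B"): 3, ("A", "C"): 1, ("B", "C"): 2, ("C", "A"): 4}
one_step = build_transition_matrix(rule_counts, locations)
two_step = mat_pow(one_step, 2)
print(two_step[0])  # probabilities of being at A, B, C two steps after leaving A
```

Raising the one-step matrix to a power is only valid under the stationarity assumption the abstract states, i.e. that the transition probabilities do not change between steps.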
|
19
|
|
20
|
Supporting the Design of Machine Learning Workflows with a Recommendation System. ACM T INTERACT INTEL 2016. [DOI: 10.1145/2852082] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 10/22/2022]
Abstract
Machine learning and data analytics tasks in practice require several consecutive processing steps. RapidMiner is a widely used software tool for the development and execution of such analytics workflows. Unlike many other algorithm toolkits, it comprises a visual editor that allows the user to design processes on a conceptual level. This conceptual and visual approach helps the user to abstract from the technical details during the development phase and to retain a focus on the core modeling task. The large set of preimplemented data analysis and machine learning operations available in the tool, as well as their logical dependencies, can, however, be overwhelming, particularly for novice users.
In this work, we present an add-on to the RapidMiner framework that supports the user during the modeling phase by recommending additional operations to insert into the currently developed machine learning workflow. First, we propose different recommendation techniques and evaluate them in an offline setting using a pool of several thousand existing workflows. Second, we present the results of a laboratory study, which show that our tool helps users to significantly increase the efficiency of the modeling process. Finally, we report on analyses using data that were collected during the real-world deployment of the plug-in component and compare the results of the live deployment of the tool with the results obtained through an offline analysis and a replay simulation.
|
21
|
An efficient algorithm to maintain the discovered frequent sequences with record deletion. INTELL DATA ANAL 2016. [DOI: 10.3233/ida-160825] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/15/2022]
|
22
|
|
23
|
|
24
|
Abstract
Sequential pattern mining has become one of the most important topics in data mining. It has broad applications such as analyzing customer purchase data, Web access patterns, network traffic data, DNA sequencing, and so on. Previous studies have concentrated on reducing redundant patterns among the sequential patterns, and on finding meaningful patterns from huge datasets. In sequential pattern mining, closed sequential pattern mining and weighted sequential pattern mining are the two main approaches to perform mining tasks. This is because closed sequential pattern mining finds representative sequential patterns which carry exactly the same knowledge as the complete set of frequent sequential patterns, and weight-based sequential pattern mining discovers important sequential patterns by considering the importance of each sequential pattern. In this paper, we study the problem of mining robust closed weighted sequential patterns from large sequence databases by integrating the two paradigms. We first show that the joining order between the weight constraints and the closure property in sequential pattern mining leads to different sets of results. From our analysis of joining orders, we suggest robust closed weighted sequential pattern mining without information loss, and present how to discover representative important sequential patterns without information loss. Through performance tests, we show that our approach gives high performance in terms of efficiency, effectiveness, memory usage, and scalability.
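The closure property mentioned in this abstract can be illustrated with a small filter: a frequent sequential pattern is closed when no longer pattern containing it has the same support. The sketch below shows only that filter, on unweighted toy data; it is not the paper's algorithm, and the function names and supports are assumptions (the supports correspond to the toy database [a, b, c], [a, b], [a, c, b] with a minimum support of 2).

```python
def is_subsequence(short, long):
    """True if `short` appears in `long` in order, not necessarily contiguously."""
    remaining = iter(long)
    return all(item in remaining for item in short)


def closed_patterns(supports):
    """Keep only patterns with no longer super-sequence of equal support.

    `supports` maps each frequent pattern (a tuple of items) to its support count.
    """
    closed = {}
    for pattern, support in supports.items():
        absorbed = any(
            len(other) > len(pattern)
            and other_support == support
            and is_subsequence(pattern, other)
            for other, other_support in supports.items()
        )
        if not absorbed:
            closed[pattern] = support
    return closed


# Frequent patterns and their supports in the toy database described above.
supports = {
    ("a",): 3,
    ("b",): 3,
    ("c",): 2,
    ("a", "b"): 3,
    ("a", "c"): 2,
}
print(closed_patterns(supports))  # {('a', 'b'): 3, ('a', 'c'): 2}
```

The two closed patterns are enough to recover every support in the original set, which is the sense in which closed mining loses no information. Once weights are introduced, whether the weight constraint is applied before or after this filter can change the result, which is the joining-order issue the abstract raises.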
|
25
|
|
26
|
|
27
|
|
28
|
|