1
|
Lin Q, Gan W, Wu Y, Chen J, Chen CM. Smart System: Joint Utility and Frequency for Pattern Classification. ACM TRANSACTIONS ON MANAGEMENT INFORMATION SYSTEMS 2022. [DOI: 10.1145/3531480] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2022]
Abstract
Nowadays, the environments of smart systems for Industry 4.0 and Internet of Things (IoT) are experiencing fast industrial upgrading. Big data technologies such as design making, event detection, and classification are developed to help manufacturing organizations to achieve smart systems. By applying data analysis, the potential values of rich data can be maximized and thus help manufacturing organizations to finish another round of upgrading. In this paper, we propose two new algorithms with respect to big data analysis, namely UFCgen and UFCfast. Both algorithms are designed to collect three types of patterns to help people determine the market positions for different product combinations. We compare these algorithms on various types of datasets, both real and synthetic. The experimental results show that both algorithms can successfully achieve pattern classification by utilizing three different types of interesting patterns from all candidate patterns based on user-specified thresholds of utility and frequency. Furthermore, the list-based UFCfast algorithm outperforms the level-wise-based UFCgen algorithm in terms of both execution time and memory consumption.
Affiliation(s)
- Qi Lin: Jinan University of Birmingham Joint Institute, China
|
2
|
Kumar R, Singh K. A survey on soft computing-based high-utility itemsets mining. Soft comput 2022. [DOI: 10.1007/s00500-021-06613-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
|
3
|
Anari Z, Hatamlou A, Anari B. Finding Suitable Membership Functions for Mining Fuzzy Association Rules in Web Data Using Learning Automata. INT J PATTERN RECOGN 2021. [DOI: 10.1142/s0218001421590266] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/18/2022]
Abstract
Transactions in web data are huge amounts of data, often consisting of fuzzy and quantitative values. Mining fuzzy association rules can help discover interesting relationships between web data. The quality of these rules depends on membership functions, and thus, it is essential to find the suitable number and position of membership functions. The time spent by users on each web page, which shows their level of interest in those web pages, can be considered as a trapezoidal membership function (TMF). In this paper, the optimization problem was finding the appropriate number and position of TMFs for each web page. To solve this optimization problem, a learning automata-based algorithm was proposed to optimize the number and position of TMFs (LA-ONPTMF). Experiments conducted on two real datasets confirmed that the proposed algorithm enhances the efficiency of mining fuzzy association rules by extracting the optimized TMFs.
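As a rough illustration of the trapezoidal membership function (TMF) this abstract refers to, the sketch below computes the membership degree of a page-viewing time. The function name, the parameter names `a`, `b`, `c`, `d`, and the breakpoints for a "medium" viewing time are assumptions made for illustration; none of them come from the paper. The sketch shows only the membership computation, not the LA-ONPTMF optimization itself.

```python
def trapezoid(x, a, b, c, d):
    """Membership degree of x in a trapezoid with feet a, d and shoulders b, c.

    Assumes a <= b <= c <= d. Returns a value in [0, 1].
    """
    if b <= x <= c:          # flat top of the trapezoid
        return 1.0
    if x <= a or x >= d:     # outside the support
        return 0.0
    if x < b:                # rising edge between a and b
        return (x - a) / (b - a)
    return (d - x) / (d - c)  # falling edge between c and d


# Hypothetical "medium" viewing time, in seconds (breakpoints are assumed).
MEDIUM = (20, 40, 80, 100)

for seconds in (10, 30, 60, 90):
    print(seconds, trapezoid(seconds, *MEDIUM))
```

An optimizer such as the one described in the abstract would, under this reading, search over how many such trapezoids each web page gets and where the four breakpoints of each one sit.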
Affiliation(s)
- Zohreh Anari: Department of Computer Engineering and Information Technology, Payame Noor University (PNU), P. O. Box 19395-4697, Tehran, Iran
- Abdolreza Hatamlou: Department of Computer Engineering, Khoy Branch, Islamic Azad University, Khoy, Iran
- Babak Anari: Department of Computer Engineering, Shabestar Branch, Islamic Azad University, Shabestar, Iran
|
4
|
Gan W, Lin JCW, Fournier-Viger P, Chao HC, Yu PS. HUOPM: High-Utility Occupancy Pattern Mining. IEEE TRANSACTIONS ON CYBERNETICS 2020; 50:1195-1208. [PMID: 30794524 DOI: 10.1109/tcyb.2019.2896267] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Indexed: 06/09/2023]
Abstract
Mining useful patterns from varied types of databases is an important research topic, which has many real-life applications. Most studies have considered the frequency as sole interestingness measure to identify high-quality patterns. However, each object is different in nature. The relative importance of objects is not equal, in terms of criteria, such as the utility, risk, or interest. Besides, another limitation of frequent patterns is that they generally have a low occupancy, that is, they often represent small sets of items in transactions containing many items and, thus, may not be truly representative of these transactions. To extract high-quality patterns in real-life applications, this paper extends the occupancy measure to also assess the utility of patterns in transaction databases. We propose an efficient algorithm named high-utility occupancy pattern mining (HUOPM). It considers user preferences in terms of frequency, utility, and occupancy. A novel frequency-utility tree and two compact data structures, called the utility-occupancy list and frequency-utility table, are designed to provide global and partial downward closure properties for pruning the search space. The proposed method can efficiently discover the complete set of high-quality patterns without candidate generation. Extensive experiments have been conducted on several datasets to evaluate the effectiveness and efficiency of the proposed algorithm. Results show that the derived patterns are intelligible, reasonable, and acceptable, and that HUOPM with its pruning strategies outperforms the state-of-the-art algorithm, in terms of runtime and search space, respectively.
|
6
|
Managing technological knowledge for supporting R&D activities: scientometrics-based approach. KNOWLEDGE MANAGEMENT RESEARCH & PRACTICE 2017. [DOI: 10.1057/kmrp.2012.18] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/21/2022]
|
8
|
Krishnamoorthy S, Sadasivam GS, Rajalakshmi M, Kowsalyaa K, Dhivya M. Privacy Preserving Fuzzy Association Rule Mining in Data Clusters Using Particle Swarm Optimization. INTERNATIONAL JOURNAL OF INTELLIGENT INFORMATION TECHNOLOGIES 2017. [DOI: 10.4018/ijiit.2017040101] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Indexed: 11/09/2022]
Abstract
An association rule is classified as sensitive if its threat of revelation is above a certain confidence value. If these sensitive rules were revealed to the public, it would be possible to deduce sensitive knowledge from the published data, which would benefit business competitors. Earlier studies in privacy-preserving association rule mining focus on binary data and have more side effects. In practical applications, however, transactions contain the purchased quantities of the items; hence, preserving the privacy of quantitative data is essential. The main goal of the proposed system is to hide a group of interesting patterns that contain sensitive knowledge such that the modifications have minimal side effects, such as lost rules, ghost rules, and the number of modifications. The proposed system applies Particle Swarm Optimization to a few clusters of particles, thus reducing the number of modifications. Experimental results demonstrate that the proposed approach is efficient in terms of lost rules, number of modifications, and hiding failure, with complete avoidance of ghost rules.
Affiliation(s)
- G. Sudha Sadasivam: PSG College of Technology, Department of Computer Science & Engineering, Tamil Nadu, India
- M. Rajalakshmi: Coimbatore Institute of Technology, Department of Computer Science & Engineering, Tamil Nadu, India
- K. Kowsalyaa: PSG College of Technology, Department of Computer Science & Engineering, Tamil Nadu, India
- M. Dhivya: SSN College of Engineering, Department of Computer Science & Engineering, Tamil Nadu, India
|
10
|
Huang TCK, Huang CH, Chuang YT. Change discovery of learning performance in dynamic educational environments. TELEMATICS AND INFORMATICS 2016. [DOI: 10.1016/j.tele.2015.10.005] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 10/22/2022]
|
11
|
Ting CK, Wang TC, Liaw RT, Hong TP. Genetic algorithm with a structure-based representation for genetic-fuzzy data mining. Soft comput 2016. [DOI: 10.1007/s00500-016-2266-z] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Indexed: 11/27/2022]
|
16
|
Chamazi MA, Bidgoli BM, Nasiri M. Deriving support threshold values and membership functions using the multiple-level cluster-based master–slave IFG approach. Soft comput 2013. [DOI: 10.1007/s00500-012-0973-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 11/29/2022]
|
17
|
Abstract
Data mining is most commonly used in attempts to induce association rules from transaction data. Since transactions in real-world applications usually consist of quantitative values, many fuzzy association-rule mining approaches have been proposed on single- or multiple-concept levels. However, the given membership functions may have a critical influence on the final mining results. In this paper, we propose a multiple-level genetic-fuzzy mining algorithm for mining membership functions and fuzzy association rules using multiple-concept levels. It first encodes the membership functions of each item class (category) into a chromosome according to the given taxonomy. The fitness value of each individual is then evaluated by the summation of large 1-itemsets of each item in different concept levels and the suitability of membership functions in the chromosome. After the GA process terminates, a better set of multiple-level fuzzy association rules can then be expected with a more suitable set of membership functions. Experimental results on a simulation dataset also show the effectiveness of the algorithm.
Affiliation(s)
- CHUN-HAO CHEN: Department of Computer Science and Information Engineering, Tamkang University, Taipei, 251, Taiwan, R.O.C.
- TZUNG-PEI HONG: Department of Computer Science and Information Engineering, National University of Kaohsiung, Kaohsiung, 811, Taiwan, R.O.C.; Department of Computer Science and Engineering, National Sun Yat-sen University, Kaohsiung, 804, Taiwan, R.O.C.
- YEONG-CHYI LEE: Department of Information Management, Cheng Shiu University, Kaohsiung, Taiwan, R.O.C.
|
18
|
Chiu HP, Tang YT, Hsieh KL. Applying cluster-based fuzzy association rules mining framework into EC environment. Appl Soft Comput 2012. [DOI: 10.1016/j.asoc.2011.08.010] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 10/15/2022]
|
20
|
Linguistic Fuzzy Rules in Data Mining: Follow-Up Mamdani Fuzzy Modeling Principle. COMBINING EXPERIMENTATION AND THEORY 2012. [DOI: 10.1007/978-3-642-24666-1_8] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 12/13/2022]
|
21
|
Chen CH, Hong TP, Tseng V. Finding Pareto-front Membership Functions in Fuzzy Data Mining. INT J COMPUT INT SYS 2012. [DOI: 10.1080/18756891.2012.685314] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 10/26/2022]
|
23
|
HONG TZUNGPEI, KUO CHANSHENG, CHI SHENGCHAI. TRADE-OFF BETWEEN COMPUTATION TIME AND NUMBER OF RULES FOR FUZZY MINING FROM QUANTITATIVE DATA. INT J UNCERTAIN FUZZ 2011. [DOI: 10.1142/s0218488501001071] [Citation(s) in RCA: 125] [Impact Index Per Article: 8.9] [Indexed: 11/18/2022]
Abstract
Data mining is the process of extracting desirable knowledge or interesting patterns from existing databases for specific purposes. Most conventional data-mining algorithms identify the relationships among transactions using binary values. Transactions with quantitative values are, however, commonly seen in real-world applications. We proposed a fuzzy mining algorithm in which each attribute used only the linguistic term with the maximum cardinality in the mining process. The number of items was thus the same as that of the original attributes, which reduced the processing time. The fuzzy association rules derived in this way are, however, not complete. This paper thus modifies that algorithm and proposes a new fuzzy data-mining algorithm for extracting interesting knowledge from transactions stored as quantitative values. The proposed algorithm can derive a more complete set of rules, but with more computation time than the earlier method. A trade-off thus exists between the computation time and the completeness of the rules. Choosing an appropriate learning method thus depends on the requirements of the application domain.
Affiliation(s)
- TZUNG-PEI HONG: Department of Information Management, I-Shou University, Kaohsiung, 84008, Taiwan, R.O.C.
- CHAN-SHENG KUO: Department of Management Information Systems, National Chengchi University, Taipei, 11623, Taiwan, R.O.C.
- SHENG-CHAI CHI: Department of Industrial Management, Huafan University, Taipei, 223, Taiwan, R.O.C.
|
26
|
Lin SF, Chang JW, Hsu YC. A self-organization mining based hybrid evolution learning for TSK-type fuzzy model design. APPL INTELL 2010. [DOI: 10.1007/s10489-010-0271-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/29/2022]
|
27
|
Ma WM, Wang K, Liu ZP. Mining potentially more interesting association rules with fuzzy interest measure. Soft comput 2010. [DOI: 10.1007/s00500-010-0579-x] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 11/24/2022]
|
28
|
Chen CM, Chen YY, Liu CY. Learning Performance Assessment Approach Using Web-Based Learning Portfolios for E-learning Systems. ACTA ACUST UNITED AC 2007. [DOI: 10.1109/tsmcc.2007.900641] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.3] [Indexed: 11/08/2022]
|
29
|
Chien BC, Zhong MH, Wang JJ. Mining Fuzzy Association Rules on Has-A and Is-A Hierarchical Structures. JOURNAL OF ADVANCED COMPUTATIONAL INTELLIGENCE AND INTELLIGENT INFORMATICS 2007. [DOI: 10.20965/jaciii.2007.p0423] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/09/2022]
Abstract
Preliminary studies on data mining focus on finding association rules from transaction databases containing items without relationships among them. However, relationships among items often exist in real applications. Most previous works consider only the Is-A hierarchy. In this paper, hierarchical relationships that include a Has-A hierarchy and multiple Is-A hierarchies are discussed. The proposed method first reduces a Has-A & Is-A hierarchy into an extended Has-A hierarchy using the IsA-Reduce algorithm. The quantitative data is transformed into fuzzy items. The RPFApriori algorithm is then applied to find fuzzy association rules from the fuzzy item data and the extended Has-A hierarchy.
|
30
|
Subramanyam RBV, Goswami A. Mining Frequent Fuzzy Grids in Dynamic Databases with Weighted Transactions and Weighted Items. JOURNAL OF INFORMATION & KNOWLEDGE MANAGEMENT 2006. [DOI: 10.1142/s0219649206001487] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
Incremental mining algorithms that derive the latest mining output by making use of previous mining results are attractive to business organisations. In this paper, a fuzzy data mining algorithm for incremental mining of frequent fuzzy grids from quantitative dynamic databases is proposed. It extends the traditional association rule problem by allowing a weight to be associated with each item in a transaction and with each transaction in a database to reflect the interest/intensity of items and transactions. It uses the information about fuzzy grids that have already been mined from the original database and avoids a start-from-scratch process. In addition, we deal with "weights-of-significance", which are automatically regulated as the incremental databases evolve and are absorbed into the original database. We maintain "hopeful fuzzy grids" and "frequent fuzzy grids", and our algorithm changes the status of the grids that have been discovered earlier so that they reflect the pattern drift in the updated quantitative databases. Our heuristic approach avoids maintaining many "hopeful fuzzy grids" at the initial level. The algorithm is illustrated with one numerical example, and experimental results are also reported.
Affiliation(s)
- R. B. V. Subramanyam: Department of Mathematics, Indian Institute of Technology, Kharagpur, West Bengal 721302, India
- A. Goswami: Department of Mathematics, Indian Institute of Technology, Kharagpur, West Bengal 721302, India
|
32
|
Han KR, Kim JY. FCILINK: Mining Frequent Closed Itemsets Based on a Link Structure between Transactions. JOURNAL OF INFORMATION & KNOWLEDGE MANAGEMENT 2005. [DOI: 10.1142/s0219649205001213] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
The problem of discovering association rules between items in a database is an emerging area of research. Its goal is to extract significant patterns or interesting rules from large databases. Recent studies of mining association rules have proposed a closure mechanism. It is no longer necessary to mine the set of all of the frequent itemsets and their association rules. Rather, it is sufficient to mine the frequent closed itemsets and their corresponding rules. In the past, a number of algorithms for mining frequent closed itemsets have been based on items. In this paper, we use the transaction itself for mining frequent closed itemsets. An efficient algorithm called FCILINK is proposed that is based on a link structure between transactions. A given database is scanned once and then a much smaller sub-database is scanned twice. Our experimental results show that our algorithm is faster than previously proposed methods. Furthermore, our approach is significantly more efficient for dense databases.
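To make the notion of a frequent closed itemset concrete, here is a minimal brute-force sketch. It is not FCILINK and does not use a link structure between transactions; it simply enumerates itemsets and keeps those that are frequent and have no proper superset with the same support. The function name, the absolute `min_support` count, and the toy transactions are assumptions for illustration only.

```python
from itertools import combinations


def frequent_closed_itemsets(transactions, min_support):
    """Return {itemset: support} for itemsets that are frequent and closed.

    Brute force over all item combinations, so only suitable for tiny inputs.
    min_support is an absolute transaction count (an assumption of this sketch).
    """
    transactions = [frozenset(t) for t in transactions]
    items = sorted(set().union(*transactions))

    # Count the support of every candidate itemset and keep the frequent ones.
    support = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            count = sum(1 for t in transactions if itemset <= t)
            if count >= min_support:
                support[itemset] = count

    # An itemset is closed if no proper superset has the same support.
    return {
        itemset: count
        for itemset, count in support.items()
        if not any(itemset < other and count == other_count
                   for other, other_count in support.items())
    }


toy = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"a"}]
closed = frequent_closed_itemsets(toy, min_support=2)
for itemset, count in sorted(closed.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), count)
```

In this toy data, `{b}` is frequent but not closed because `{a, b}` occurs in exactly the same transactions, which is why mining only the closed itemsets loses no support information.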
Affiliation(s)
- Kyong Rok Han: Department of Industrial Engineering, Hanyang University, Seoul, Republic of Korea
- Jae Yearn Kim: Department of Industrial Engineering, Hanyang University, Seoul, Republic of Korea
|
33
|
Chen YL, Huang TCK. Discovering Fuzzy Time-Interval Sequential Patterns in Sequence Databases. ACTA ACUST UNITED AC 2005; 35:959-72. [PMID: 16240771 DOI: 10.1109/tsmcb.2005.847741] [Citation(s) in RCA: 67] [Impact Index Per Article: 3.4] [Indexed: 11/09/2022]
Abstract
Given a sequence database and minimum support threshold, the task of sequential pattern mining is to discover the complete set of sequential patterns in databases. From the discovered sequential patterns, we can know what items are frequently brought together and in what order they appear. However, they cannot tell us the time gaps between successive items in patterns. Accordingly, Chen et al. have proposed a generalization of sequential patterns, called time-interval sequential patterns, which reveals not only the order of items, but also the time intervals between successive items. An example of time-interval sequential pattern has a form like (A, I2, B, I1, C), meaning that we buy A first, then after an interval of I2 we buy B, and finally after an interval of I1 we buy C, where I2 and I1 are predetermined time ranges. Although this new type of pattern can alleviate the above concern, it causes the sharp boundary problem. That is, when a time interval is near the boundary of two predetermined time ranges, we either ignore or overemphasize it. Therefore, this paper uses the concept of fuzzy sets to extend the original research so that fuzzy time-interval sequential patterns are discovered from databases. Two efficient algorithms, the fuzzy time interval (FTI)-Apriori algorithm and the FTI-PrefixSpan algorithm, are developed for mining fuzzy time-interval sequential patterns. In our simulation results, we find that the second algorithm outperforms the first one, not only in computing time but also in scalability with respect to various parameters.
Affiliation(s)
- Yen-Liang Chen: Department of Information Management, National Central University, Chung-Li, Taiwan
|
34
|
Lee YC, Hong TP, Lin WY. Mining association rules with multiple minimum supports using maximum constraints. Int J Approx Reason 2005. [DOI: 10.1016/j.ijar.2004.11.006] [Citation(s) in RCA: 60] [Impact Index Per Article: 3.0] [Indexed: 11/15/2022]
|
35
|
Hong TP, Kuo CS, Wang SL. A fuzzy AprioriTid mining algorithm with reduced computational time. Appl Soft Comput 2004. [DOI: 10.1016/j.asoc.2004.03.009] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.4] [Indexed: 10/26/2022]
|
36
|
Gyenesei A, Teuhola J. Multidimensional fuzzy partitioning of attribute ranges for mining quantitative data. INT J INTELL SYST 2004. [DOI: 10.1002/int.20039] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.2] [Indexed: 11/09/2022]
|
37
|
Lee YC, Hong TP, Lin WY. Mining Fuzzy Association Rules with Multiple Minimum Supports Using Maximum Constraints. LECTURE NOTES IN COMPUTER SCIENCE 2004. [DOI: 10.1007/978-3-540-30133-2_171] [Citation(s) in RCA: 56] [Impact Index Per Article: 2.7] [Indexed: 01/03/2023]
|