1
Wang LM, Chen P, Mammadov M, Liu Y, Wu SY. Alleviating the independence assumptions of averaged one-dependence estimators by model weighting. INTELL DATA ANAL 2021. [DOI: 10.3233/ida-205400]
Abstract
Of the numerous proposals to refine naive Bayes by weakening its attribute independence assumption, averaged one-dependence estimators (AODE) has been shown to achieve significantly higher classification accuracy at a moderate cost in classification efficiency. However, all one-dependence estimators (ODEs) in AODE carry the same weight and are treated equally. To address this issue, model weighting, which assigns discriminative weights to ODEs and then linearly combines their probability estimates, has proved to be an efficient and effective approach. Most information-theoretic weighting metrics, including mutual information, the Kullback-Leibler measure and information gain, place more emphasis on the correlation between the root attribute (value) and the class variable. We argue that the topology of each ODE can be divided into a set of local directed acyclic graphs (DAGs) based on the independence assumption, and we introduce multivariate mutual information to measure the extent to which the DAGs fit the data. On this premise, we propose a novel weighted AODE algorithm, called AWODE, that adaptively selects weights to alleviate the independence assumption and make the learned probability distribution fit the instance. The proposed approach is validated on 40 benchmark datasets from the UCI machine learning repository. The experimental results reveal that AWODE achieves a bias-variance trade-off and is a competitive alternative to single-model Bayesian learners (such as TAN and KDB) and other weighted AODEs (such as WAODE).
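The weighting idea the abstract describes can be illustrated with a minimal numeric sketch. This is not the authors' AWODE implementation; the per-ODE probability estimates and the weights below are made-up values, used only to contrast AODE's uniform average with a weighted linear combination of ODE estimates.

```python
import numpy as np

# Hypothetical class-probability estimates for one instance:
# one row per one-dependence estimator (ODE), one column per class.
ode_probs = np.array([
    [0.7, 0.3],   # ODE rooted at attribute 0
    [0.4, 0.6],   # ODE rooted at attribute 1
    [0.8, 0.2],   # ODE rooted at attribute 2
])

# Plain AODE: every ODE carries the same weight (uniform average).
uniform = ode_probs.mean(axis=0)

# Weighted AODE: discriminative weights (e.g. derived from an
# information-theoretic metric) replace the uniform average with a
# normalized weighted linear combination of the ODE estimates.
w = np.array([0.2, 0.3, 0.5])
weighted = w @ ode_probs / w.sum()

print(uniform)    # [0.6333... 0.3666...]
print(weighted)   # [0.66 0.34]
```

With these made-up numbers both schemes favor class 0, but the weighted combination shifts the posterior toward the ODEs judged more reliable.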
Affiliation(s)
- Li-Min Wang
- College of Computer Science and Technology, Jilin University, Changchun, Jilin, China
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin, China
- Peng Chen
- College of Computer Science and Technology, Jilin University, Changchun, Jilin, China
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin, China
- Musa Mammadov
- School of Information Technology, Deakin University, Victoria, Australia
- Yang Liu
- College of Computer Science and Technology, Jilin University, Changchun, Jilin, China
- Si-Yuan Wu
- College of Computer Science and Technology, Jilin University, Changchun, Jilin, China
2
Wang L, Qi S, Liu Y, Lou H, Zuo X. Bagging k-dependence Bayesian network classifiers. INTELL DATA ANAL 2021. [DOI: 10.3233/ida-205125]
Abstract
Bagging has attracted much attention due to its simple implementation and the popularity of bootstrapping. By learning diverse classifiers from resampled datasets and averaging their outcomes, bagging explores the possibility of substantially improving the classification performance of the base classifier. Diversity has been recognized as a very important characteristic of bagging. This paper presents an efficient and effective bagging approach that learns a set of independent Bayesian network classifiers (BNCs) from disjoint data subspaces. The number of bits needed to describe the data is measured in terms of log-likelihood, and redundant edges are identified to optimize the topologies of the learned BNCs. Our extensive experimental evaluation on 54 publicly available datasets from the UCI machine learning repository reveals that the proposed algorithm achieves competitive classification performance compared with state-of-the-art BNCs that do or do not use bagging procedures, such as tree-augmented naive Bayes (TAN), the k-dependence Bayesian classifier (KDB), bagged NB and bagged TAN.
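The generic bagging scheme the abstract builds on can be sketched as follows. This is a hedged illustration, not the paper's method: the base learner here is a tiny naive Bayes (a stand-in for a k-dependence BNC), resampling is plain bootstrapping rather than the paper's disjoint data subspaces, and all names and data are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_nb(X, y):
    # Tiny naive Bayes over binary features and a binary class;
    # Laplace smoothing avoids zero probabilities on small resamples.
    model = {}
    for c in (0, 1):
        Xc = X[y == c]
        theta = (Xc.sum(axis=0) + 1) / (len(Xc) + 2)   # P(x_j = 1 | c)
        prior = (len(Xc) + 1) / (len(y) + 2)           # P(c)
        model[c] = (theta, prior)
    return model

def predict_nb(model, x):
    # Pick the class with the highest log-posterior.
    def score(c):
        theta, prior = model[c]
        p = np.where(x == 1, theta, 1 - theta)
        return np.log(prior) + np.log(p).sum()
    return max(model, key=score)

def bagging_predict(X, y, x_new, n_models=5):
    # Bagging: fit each base classifier on a bootstrap resample of the
    # training set, then take a majority vote over their predictions.
    votes = []
    for _ in range(n_models):
        idx = rng.integers(0, len(y), size=len(y))
        votes.append(predict_nb(fit_nb(X[idx], y[idx]), x_new))
    return max(set(votes), key=votes.count)

# Toy data where feature 0 determines the class.
X = np.array([[1, 0], [1, 1], [0, 0], [0, 1]] * 5)
y = X[:, 0]
print(bagging_predict(X, y, np.array([1, 0])))
```

The diversity the abstract emphasizes comes from each base classifier seeing a different resample; the paper sharpens this further by training on disjoint subspaces and pruning redundant edges via a description-length (log-likelihood) criterion.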
Affiliation(s)
- Limin Wang
- College of Computer Science and Technology, Jilin University, Changchun, Jilin, China
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin, China
- Sikai Qi
- College of Computer Science and Technology, Jilin University, Changchun, Jilin, China
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin, China
- Yang Liu
- College of Computer Science and Technology, Jilin University, Changchun, Jilin, China
- Hua Lou
- Department of Software and Big Data, Changzhou College of Information Technology, Changzhou, Jiangsu, China
- Xin Zuo
- School of Foreign Languages, Changchun University of Technology, Changchun, Jilin, China
3
Wang L, Chen P, Chen S, Sun M. A novel approach to fully representing the diversity in conditional dependencies for learning Bayesian network classifier. INTELL DATA ANAL 2021. [DOI: 10.3233/ida-194959]
Abstract
Bayesian network classifiers (BNCs) have proved their effectiveness and efficiency in the supervised learning framework. Numerous variations of the conditional independence assumption have been proposed to address the NP-hard structure learning of BNCs. However, researchers have focused on identifying conditional dependence rather than conditional independence, and information-theoretic criteria cannot capture the diversity in conditional (in)dependencies across different instances. In this paper, the maximum correlation criterion and the minimum dependence criterion are introduced to sort attributes and identify conditional independencies, respectively. A heuristic search strategy is applied to find a possible global solution that achieves a trade-off between significant dependency relationships and the independence assumption. Our extensive experimental evaluation on widely used benchmark datasets reveals that the proposed algorithm achieves competitive classification performance compared with state-of-the-art single-model learners (e.g., TAN, KDB, KNN and SVM) and ensemble learners (e.g., ATAN and AODE).
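The abstract does not spell out its maximum correlation criterion, but attribute sorting of this kind is commonly instantiated by ranking attributes by their empirical mutual information with the class; the sketch below shows that instantiation on made-up data (all names and values are illustrative, not taken from the paper).

```python
from collections import Counter
from math import log

def mutual_information(x, y):
    # Empirical I(X;Y) = sum over (x,y) of p(x,y) * log(p(x,y) / (p(x) p(y))),
    # computed from joint and marginal counts; c*n/(cx*cy) equals that ratio.
    n = len(x)
    joint, px, py = Counter(zip(x, y)), Counter(x), Counter(y)
    return sum(c / n * log(c * n / (px[xv] * py[yv]))
               for (xv, yv), c in joint.items())

# Toy data: attribute 0 determines the class, attribute 1 is noise.
rows = [(0, 0), (0, 1), (1, 0), (1, 1), (0, 0), (1, 1)]
y    = [0, 0, 1, 1, 0, 1]
cols = list(zip(*rows))   # cols[j] is the column of attribute j

order = sorted(range(len(cols)),
               key=lambda j: mutual_information(cols[j], y),
               reverse=True)
print(order)  # [0, 1]: the fully informative attribute ranks first
```

A BNC structure learner can then process attributes in this order, attaching the strongest dependencies first and leaving weakly correlated attributes conditionally independent.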
Affiliation(s)
- Limin Wang
- College of Computer Science and Technology, Jilin University, Changchun, Jilin, China
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin, China
- Peng Chen
- College of Computer Science and Technology, Jilin University, Changchun, Jilin, China
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin, China
- Shenglei Chen
- School of Economics, Nanjing Audit University, Nanjing, Jiangsu, China
- Minghui Sun
- College of Computer Science and Technology, Jilin University, Changchun, Jilin, China
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin, China
4
5
Duan Z, Wang L, Chen S, Sun M. Instance-based weighting filter for superparent one-dependence estimators. Knowl Based Syst 2020. [DOI: 10.1016/j.knosys.2020.106085]
6
Li H, Wang F, Li H. A safe control scheme under the abnormity for the thickening process of gold hydrometallurgy based on Bayesian network. Knowl Based Syst 2017. [DOI: 10.1016/j.knosys.2016.11.026]