Significance Statement
This paper introduces a new approach to ensemble learning called Collaborative Rule Generation (CRG). In CRG, multiple base algorithms learn from a single data set to generate a single rule set, with the aim of giving each individual rule a higher quality. In other words, the approach is designed to improve the quality of each rule generated, and thereby the overall classification accuracy, through scaling up algorithms.
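As an illustration of the idea, the following minimal sketch lets several base learners each propose one rule on the same data, keeps the proposal with the best quality, and repeats on the instances not yet covered. This statement does not give these details, so everything specific here is an assumption: the two toy learners (`learn_precise`, `learn_general`), the use of a separate validation set, rule confidence as the quality measure, and the removal of covered instances are all hypothetical choices made for the example.

```python
from collections import Counter

# A rule is a pair (conditions, label): conditions maps attribute -> required value.

def covers(rule, x):
    """True if instance x satisfies every condition of the rule."""
    conds, _ = rule
    return all(x.get(a) == v for a, v in conds.items())

def confidence(rule, data):
    """Fraction of the covered instances whose label matches the rule's label."""
    covered = [y for x, y in data if covers(rule, x)]
    if not covered:
        return 0.0
    return sum(1 for y in covered if y == rule[1]) / len(covered)

def one_condition_stats(data):
    """Class counts for every attribute-value pair that occurs in data."""
    groups = {}
    for x, y in data:
        for a, v in x.items():
            groups.setdefault((a, v), Counter())[y] += 1
    return groups

def learn_precise(data):
    """Hypothetical base learner: prefers the most precise one-condition rule."""
    groups = one_condition_stats(data)
    if not groups:
        return None
    def key(item):
        counts = item[1]
        total = sum(counts.values())
        return (counts.most_common(1)[0][1] / total, total)
    (a, v), counts = max(groups.items(), key=key)
    return ({a: v}, counts.most_common(1)[0][0])

def learn_general(data):
    """Hypothetical base learner: prefers the one-condition rule with most coverage."""
    groups = one_condition_stats(data)
    if not groups:
        return None
    (a, v), counts = max(groups.items(), key=lambda item: sum(item[1].values()))
    return ({a: v}, counts.most_common(1)[0][0])

def collaborative_rule_generation(train, valid, learners):
    """Build ONE rule set; each rule is the best proposal among all learners."""
    rule_set = []
    remaining = list(train)
    while remaining:
        proposals = [r for r in (learn(remaining) for learn in learners) if r]
        if not proposals:
            break
        # Assumed quality measure: confidence on the validation data.
        best = max(proposals, key=lambda r: confidence(r, valid))
        rule_set.append(best)
        # Assumed stopping scheme: drop covered instances, learn on the rest.
        remaining = [(x, y) for x, y in remaining if not covers(best, x)]
    return rule_set

# Tiny made-up data set, for illustration only.
train = [
    ({"outlook": "sunny", "windy": "no"}, "play"),
    ({"outlook": "sunny", "windy": "yes"}, "play"),
    ({"outlook": "rainy", "windy": "yes"}, "stay"),
    ({"outlook": "rainy", "windy": "no"}, "play"),
]
valid = [
    ({"outlook": "sunny", "windy": "no"}, "play"),
    ({"outlook": "rainy", "windy": "yes"}, "stay"),
    ({"outlook": "rainy", "windy": "no"}, "play"),
]
rules = collaborative_rule_generation(train, valid, [learn_precise, learn_general])
```

The result is a single ordered rule set rather than one rule set per learner, which is the property the statement attributes to CRG; the base algorithms and quality measures actually used in the paper may differ from the ones assumed here.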
The proposed approach addresses weaknesses in current ensemble learning approaches, as outlined below.
Firstly, the CRG approach generates only a single rule set, and rule-based models are highly interpretable. In contrast, some popular ensemble learning methods, such as Bagging, Boosting and Random Forests, make predictions from many different rule sets; these predictions are hard to comprehend, so the resulting models are poorly interpretable. The CRG approach is therefore better suited to knowledge discovery, especially where interpretability matters.
Secondly, Bagging, Boosting and Random Forests all aim to improve predictive accuracy through scaling down data. However, none of them attempts to improve accuracy through scaling up algorithms. To improve accuracy comprehensively, issues on both the algorithm side and the data side need to be addressed. Another ensemble learning approach, Collaborative and Competitive Random Decision Rules (CCRDR), was recently introduced by the authors to fill this gap. However, the authors argue that CCRDR only ensures that each rule set, taken as a whole, is of high quality on average, so some individual rules may still be of low quality. The authors conclude that the CRG approach can help CCRDR fill the gap relating to the quality of each single rule, and thus also complements the three popular ensemble learning methods mentioned above.
This paper includes an experimental study that validates the CRG approach and discusses the results in both quantitative and qualitative terms. In particular, the study is set up to validate that combining different rule learning algorithms usually improves both the overall accuracy and the average quality of each single rule, compared with using a single base algorithm. Twenty data sets from the UCI repository were used for the validation.
We compare the CRG approach with single base algorithms in terms of classification accuracy, and report average metrics for the quality of each single rule. The results indicate that the CRG approach is useful for improving the quality of each single rule generated, and thus the overall classification accuracy.
Citation: Han Liu, Alexander Gegov, Mihaela Cocea. Collaborative Rule Generation: An Ensemble Learning Approach. Journal of Intelligent & Fuzzy Systems, 2016, 30 (4). pp. 2277-2287.
Affiliation: School of Computing, University of Portsmouth, Buckingham Building, Lion Terrace, Portsmouth, PO1 3HE, United Kingdom.