Original scientific paper

Using the m-estimate in rule induction

Sašo Džeroski ; Artificial Intelligence Laboratory, Jožef Stefan Institute, Ljubljana, Slovenia
Bojan Cestnik ; Artificial Intelligence Laboratory, Jožef Stefan Institute, Ljubljana, Slovenia
Igor Petrovski ; Artificial Intelligence Laboratory, Jožef Stefan Institute, Ljubljana, Slovenia


Full text: English pdf 4.983 Kb

pp. 37-46


Abstract

Rule induction, a subarea of machine learning, is concerned with the problem of constructing rules from examples. In rule induction systems, various heuristic functions are used to estimate the quality of rules. Most of them use some form of probability estimate, relative frequency being the most common. This has resulted in the problem of small disjuncts, where specific rules produce high error rates due to unreliable probability estimates from small samples. To alleviate this problem, the Laplace estimate has been used in the rule induction system CN2. We have replaced the Laplace estimate with a general Bayesian probability estimate, the m-estimate, which does not rely on the Laplacian assumption of equally likely classes. The parameter m in the m-estimate allows for adapting to the learning domain. Depending on the level of noise in the examples and other properties of the domain, the appropriate level of generalization can be achieved by setting the m parameter to an appropriate value. We compare the performance of rules derived by using the Laplace estimate and the m-estimate on several practical domains in terms of classification accuracy and the theoretically underpinned measure of relative information score.
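The three probability estimates named in the abstract can be illustrated with a minimal sketch. It uses the standard textbook forms of the Laplace estimate and the m-estimate, not code taken from CN2 or from the paper; the function names, the parameter names, and the example counts are all hypothetical and chosen only for illustration.

```python
def relative_frequency(n_class, n_total):
    """Fraction of covered examples that belong to the class.

    Unreliable for small n_total; undefined (division by zero) when
    the rule covers no examples.
    """
    return n_class / n_total


def laplace_estimate(n_class, n_total, num_classes):
    """Laplace estimate: assumes all classes are equally likely a priori."""
    return (n_class + 1) / (n_total + num_classes)


def m_estimate(n_class, n_total, prior, m):
    """m-estimate: shrinks the relative frequency toward the class prior.

    prior is the prior probability of the class (for example its relative
    frequency in the whole training set); m controls how strongly the
    prior is weighted against the evidence covered by the rule.
    """
    return (n_class + m * prior) / (n_total + m)


# Illustrative counts (assumed): a rule covers 3 examples, all of one
# class, in a two-class domain where that class has prior probability 0.2.
freq = relative_frequency(3, 3)
laplace = laplace_estimate(3, 3, num_classes=2)
m_est = m_estimate(3, 3, prior=0.2, m=2)
print(freq, laplace, m_est)
```

With these counts the relative frequency is 1.0 even though only three examples support the rule, which is the small-disjunct problem in miniature. Setting m = 0 recovers the relative frequency, and setting m to the number of classes with a uniform prior recovers the Laplace estimate, so m can be read as a dial between trusting the covered examples and trusting the prior.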

Keywords

Hrčak ID:

150515

URI

https://hrcak.srce.hr/150515

Publication date:

30 March 1993
