Number of Instances for Reliable Feature Ranking in a Given Problem

Bohanec, Marko; Kljajić Borštnar, Mirjana; Robnik-Šikonja, Marko

doi:10.2478/bsrj-2018-0017

Business Systems Research : International journal of the Society for Advancing Innovation and Research in Economy, Vol. 9 No. 2, 2018.

Original scientific paper

https://doi.org/10.2478/bsrj-2018-0017


Marko Bohanec orcid.org/0000-0002-5295-5111 ; Salvirt Ltd., Ljubljana, Slovenia
Mirjana Kljajić Borštnar orcid.org/0000-0003-4608-9090 ; Faculty of Organizational Sciences, University of Maribor, Kranj, Slovenia
Marko Robnik-Šikonja orcid.org/0000-0002-1232-3320 ; Faculty of Computer and Information Science, University of Ljubljana, Ljubljana, Slovenia

Full text: English, PDF, 528 KB

pp. 35-44


Abstract

Background: In practical use of machine learning models, users may add new features to an existing classification model to reflect their (changed) empirical understanding of a field. New features can increase the classification accuracy of the model or improve its interpretability. Objectives: We introduce a guideline for determining the sample size needed to reliably estimate the impact of a new feature. Methods/Approach: Our approach is based on the feature evaluation measure ReliefF and bootstrap-based estimation of confidence intervals for feature ranks. Results: We test our approach on real-world qualitative business-to-business sales forecasting data and two UCI data sets, one with missing values. The results show that new features with a high or a low rank can be detected using a relatively small number of instances, whereas features ranked near the border of useful features need larger samples to determine their impact. Conclusions: A combination of the feature evaluation measure ReliefF and bootstrap-based estimation of confidence intervals can be used to reliably estimate the impact of a new feature in a given problem.
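
The idea in the Methods/Approach part of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses a simplified Relief scorer (one nearest hit and one nearest miss, nominal features, Hamming distance) rather than full ReliefF, and the function names, the synthetic data, and the percentile rule for the interval are assumptions made for illustration only.

```python
import random

def relief_scores(X, y, idx):
    """Simplified Relief (NOT full ReliefF): one nearest hit and one nearest
    miss per instance, nominal features, Hamming distance. `idx` is a list of
    row indices (a bootstrap sample, possibly with repeats)."""
    d = len(X[0])
    w = [0.0] * d
    for i in idx:
        hit = miss = None
        hit_d = miss_d = d + 1
        for j in idx:
            if j == i:  # skip the instance itself and its bootstrap copies
                continue
            dist = sum(a != b for a, b in zip(X[i], X[j]))
            if y[j] == y[i]:
                if dist < hit_d:
                    hit, hit_d = j, dist
            elif dist < miss_d:
                miss, miss_d = j, dist
        if hit is None or miss is None:
            continue
        for f in range(d):
            # reward features that separate classes, penalise those that
            # differ within the same class
            w[f] += (X[i][f] != X[miss][f]) - (X[i][f] != X[hit][f])
    return [v / len(idx) for v in w]

def ranks(scores):
    """Rank 1 is the best-scoring feature; ties keep the original order."""
    order = sorted(range(len(scores)), key=lambda f: -scores[f])
    r = [0] * len(scores)
    for pos, f in enumerate(order, start=1):
        r[f] = pos
    return r

def bootstrap_rank_intervals(X, y, n_boot=100, alpha=0.05, seed=0):
    """Percentile bootstrap interval (lo, hi) for the rank of each feature."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    samples = [[] for _ in range(d)]
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        for f, r in enumerate(ranks(relief_scores(X, y, idx))):
            samples[f].append(r)
    k = int((alpha / 2) * (n_boot - 1))  # index of the lower percentile
    out = []
    for s in samples:
        s.sort()
        out.append((s[k], s[n_boot - 1 - k]))
    return out

# Hypothetical synthetic data: feature 0 is strongly related to the class,
# feature 1 weakly, feature 2 is pure noise.
rng = random.Random(42)
y = [rng.randrange(2) for _ in range(60)]
X = [(c if rng.random() < 0.9 else 1 - c,
      c if rng.random() < 0.6 else 1 - c,
      rng.randrange(2)) for c in y]
for f, (lo, hi) in enumerate(bootstrap_rank_intervals(X, y)):
    print(f"feature {f}: rank interval [{lo}, {hi}]")
```

Repeating the call on subsamples of increasing size would show how the interval of a newly added feature narrows as instances accumulate, which is the kind of sample-size question the abstract addresses.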

Keywords

machine learning; feature ranking; feature evaluation

Hrčak ID:

203480

URI

https://hrcak.srce.hr/203480

Publication date:

11 July 2018
