Technical gazette, Vol. 30 No. 2, 2023.
Original scientific paper
https://doi.org/10.17559/TV-20220823104912
Research on Feature Selection Methods based on Random Forest
Zhuo Wang
School of Software, Nanchang University, Nanchang, Jiangxi, China
Abstract
To deal with irrelevant or redundant features, this paper proposes eight feature selection methods. The first seven each pair an importance-scoring method with Random Forests (RF): CART and Random Forests (CART-RF), CHAID and Random Forests (CHAID-RF), SVM and Random Forests (SVM-RF), Bayesian Network and Random Forests (BN-RF), Neural Network and Random Forests (NN-RF), K-Means and Random Forests (K-Means-RF), and Kohonen and Random Forests (Kohonen-RF). These methods use CART, CHAID, SVM, BN, NN, K-Means, and Kohonen, respectively, to evaluate the importance of the features and rank them, and then obtain feature subsets through the RF algorithm. The eighth method, a hybrid integration of these methods with Random Forests (Integrate-RF), uses the average importance across the seven methods and selects the optimal feature subset based on the out-of-bag (OOB) classification error rate. Experimental results indicate that the proposed feature selection methods can effectively select features and reduce the data dimension.
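The averaging and subset-selection step described for Integrate-RF can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names `average_importance` and `select_subset`, the toy scores, and the stand-in `toy_error` function are all hypothetical, and the sketch assumes that the importance scores from the different methods are already on a comparable scale and that candidate subsets are the nested top-k prefixes of the averaged ranking.

```python
from statistics import mean
from typing import Callable, Dict, List, Sequence, Tuple


def average_importance(scores_by_method: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Average each feature's importance over all scoring methods."""
    features = next(iter(scores_by_method.values())).keys()
    return {f: mean(s[f] for s in scores_by_method.values()) for f in features}


def select_subset(
    avg: Dict[str, float],
    error_of: Callable[[Sequence[str]], float],
) -> Tuple[List[str], float]:
    """Rank features by averaged importance, then keep the top-k prefix
    with the lowest error as reported by `error_of`."""
    ranked = sorted(avg, key=avg.get, reverse=True)
    best_subset: List[str] = []
    best_err = float("inf")
    for k in range(1, len(ranked) + 1):
        candidate = ranked[:k]
        candidate_err = error_of(candidate)
        if candidate_err < best_err:
            best_subset, best_err = candidate, candidate_err
    return best_subset, best_err


# Hypothetical importance scores from two of the seven methods.
scores = {
    "cart": {"a": 0.6, "b": 0.3, "c": 0.1},
    "svm": {"a": 0.4, "b": 0.5, "c": 0.1},
}


def toy_error(subset: Sequence[str]) -> float:
    """Stand-in for an OOB error rate (assumption): informative features
    'a' and 'b' lower the error, any other feature raises it slightly."""
    informative = {"a", "b"}
    chosen = set(subset)
    return 0.5 - 0.2 * len(chosen & informative) + 0.05 * len(chosen - informative)


avg = average_importance(scores)
subset, err = select_subset(avg, toy_error)
print(subset, err)
```

In a real experiment, `toy_error` would be replaced by a function that trains a random forest on the candidate features and returns its OOB error rate, for example one minus the `oob_score_` attribute of a scikit-learn `RandomForestClassifier` fitted with `oob_score=True`.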
Keywords
feature selection; irrelevant; random forest; redundant
Hrčak ID: 294405
Publication date: 26.2.2023.