Technical gazette, Vol. 21 No. 2, 2014.
Original scientific paper
Information-value-based feature selection algorithm for anomaly detection over data streams
Xiaozhen Zhou; Zhejiang University, 38 Zheda Road, Hangzhou 310012, China
Shanping Li; Zhejiang University, 38 Zheda Road, Hangzhou 310012, China
Cheng Chang; Zhejiang University, 38 Zheda Road, Hangzhou 310012, China
Jianfeng Wu; Technology Center of Shanghai Stock Exchange, 528 South Pudong Road, Shanghai 200120, China
Kai Liu; Technology Center of Shanghai Stock Exchange, 528 South Pudong Road, Shanghai 200120, China
Abstract
Computer systems are becoming increasingly complex, and system anomalies seriously affect system availability. One effective way to achieve high availability is to use anomaly detection tools to find abnormal activities in the computer system so that they can be repaired. Because modern computing systems are complex, many system metrics need to be monitored, which makes high dimensionality a major challenge for anomaly detection: a large number of metrics increases the processing time of anomaly detection and lowers its accuracy. To overcome this problem, we use information-value to rank the importance of features with respect to detecting anomalies. However, the information-value method does not take redundant features into account, so correlations between features are also evaluated to remove redundant ones. This paper compares the presented method with other feature selection methods on a real system anomaly data set. Experimental results show that the presented method learns the model more efficiently and detects anomalies more accurately.
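The two-step idea in the abstract (rank features by information-value, then drop features that are strongly correlated with ones already kept) can be illustrated with a minimal batch sketch. It is not the authors' algorithm: the abstract does not give the exact information-value formula, the binning scheme, the correlation measure, or the streaming update, so everything below is an assumption. The sketch uses the common weight-of-evidence form of information value over equal-width bins, Pearson correlation for redundancy, and the hypothetical names `information_value`, `pearson`, `select_features`, `n_bins`, `eps`, `top_k` and `max_corr`.

```python
import math


def information_value(values, labels, n_bins=4, eps=0.5):
    """Information value of one numeric feature against binary labels.

    Assumed form: sum over bins of (p_anom - p_norm) * ln(p_anom / p_norm),
    with equal-width bins and additive smoothing `eps` to avoid log(0).
    Labels: 1 = anomalous sample, 0 = normal sample.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # constant feature: fall back to one bin
    normal = [0] * n_bins
    anomalous = [0] * n_bins
    for v, y in zip(values, labels):
        b = min(int((v - lo) / width), n_bins - 1)
        if y == 1:
            anomalous[b] += 1
        else:
            normal[b] += 1
    n_tot = sum(normal) + eps * n_bins
    a_tot = sum(anomalous) + eps * n_bins
    iv = 0.0
    for n, a in zip(normal, anomalous):
        p_n = (n + eps) / n_tot
        p_a = (a + eps) / a_tot
        iv += (p_a - p_n) * math.log(p_a / p_n)
    return iv


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def select_features(columns, labels, top_k=2, max_corr=0.9):
    """Greedy selection: highest information value first, skipping any
    feature whose |correlation| with an already kept feature is too high."""
    ranked = sorted(
        columns,
        key=lambda name: information_value(columns[name], labels),
        reverse=True,
    )
    kept = []
    for name in ranked:
        if all(abs(pearson(columns[name], columns[k])) < max_corr for k in kept):
            kept.append(name)
        if len(kept) == top_k:
            break
    return kept


# Toy metrics (made up for illustration): "cpu_x2" duplicates "cpu".
labels = [0, 0, 0, 0, 1, 1, 1, 1]
columns = {
    "cpu": [10, 12, 11, 13, 90, 95, 92, 97],
    "cpu_x2": [20, 24, 22, 26, 180, 190, 184, 194],
    "noise": [5, 7, 5, 7, 5, 7, 5, 7],
}
print(select_features(columns, labels))
```

In this toy data the duplicated metric carries the same information as the original, so the correlation filter keeps only one of the pair; a streaming variant would have to maintain the bin counts and correlation statistics incrementally, which this sketch does not attempt.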
Keywords
anomaly detection; data stream classification; feature selection; information-value
Hrčak ID: 120371
Publication date: 26.4.2014.