Original scientific paper
https://doi.org/10.24138/jcomss-2023-0133
SOM-US: A Novel Under-Sampling Technique for Handling Class Imbalance Problem
Ajay Kumar
KIET Group of Institutions, India
Abstract
Class imbalance classification is a significant research challenge in data mining and machine learning because most real-world datasets are imbalanced. When a dataset is highly imbalanced, most available classification techniques underperform on minority-class instances, because they maximize overall accuracy and disregard the relative distribution of the classes. Techniques based on sampling, cost-sensitive learning, and ensemble methods have recently been employed to handle the class imbalance problem. This paper proposes a new clustering-based under-sampling (US) technique, called SOM-US, that handles the class imbalance problem using the self-organizing map (SOM). To validate the proposed approach, an experimental study applied SOM-US to a NASA software defect dataset in order to improve the capability of a logistic regression classifier for software defect prediction. The proposed approach was compared with six existing under-sampling methods on two performance measures. The results demonstrate that SOM-US significantly improves the prediction capability of logistic regression over the other under-sampling techniques for software defect prediction.
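The abstract does not spell out the SOM-US procedure, so the following is only a generic sketch of clustering-based under-sampling with a small self-organizing map, not the paper's algorithm. It assumes a rectangular SOM trained on the majority class only, with majority samples then drawn round-robin from the resulting clusters. The function names `train_som` and `som_undersample`, the grid size, and all hyperparameters are hypothetical choices made for illustration.

```python
import numpy as np

def train_som(X, rows=3, cols=3, epochs=20, lr0=0.5, sigma0=1.0, seed=0):
    """Train a small rectangular SOM; returns weights of shape (rows*cols, n_features)."""
    rng = np.random.default_rng(seed)
    # Grid coordinates of each node, used by the neighbourhood function.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Initialise node weights from randomly chosen training samples.
    W = X[rng.choice(len(X), size=rows * cols, replace=True)].astype(float)
    total, step = epochs * len(X), 0
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            frac = step / total
            lr = lr0 * (1.0 - frac)                 # linearly decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 1e-3    # linearly shrinking neighbourhood
            bmu = np.argmin(((W - X[i]) ** 2).sum(axis=1))  # best-matching unit
            d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
            h = np.exp(-d2 / (2.0 * sigma ** 2))    # Gaussian neighbourhood weights
            W += lr * h[:, None] * (X[i] - W)
            step += 1
    return W

def som_undersample(X, y, majority_label, n_keep, seed=0, **som_kwargs):
    """Keep about n_keep majority samples spread across SOM clusters (illustrative only)."""
    rng = np.random.default_rng(seed)
    maj_idx = np.flatnonzero(y == majority_label)
    other_idx = np.flatnonzero(y != majority_label)
    W = train_som(X[maj_idx], seed=seed, **som_kwargs)
    # Assign every majority sample to its best-matching SOM node.
    d = ((X[maj_idx][:, None, :] - W[None, :, :]) ** 2).sum(axis=2)
    bmu = d.argmin(axis=1)
    clusters = [list(rng.permutation(maj_idx[bmu == k])) for k in np.unique(bmu)]
    # Draw one sample per non-empty cluster in turn until n_keep are selected.
    kept = []
    while len(kept) < n_keep and any(clusters):
        for members in clusters:
            if members and len(kept) < n_keep:
                kept.append(members.pop())
    idx = np.sort(np.concatenate([np.array(kept, dtype=int), other_idx]))
    return X[idx], y[idx]

# Synthetic example: 100 majority samples and 10 minority samples.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(3, 1, (10, 2))])
y = np.array([0] * 100 + [1] * 10)
X_res, y_res = som_undersample(X, y, majority_label=0, n_keep=10)
```

The resampled arrays could then be passed to any classifier, such as the logistic regression model the abstract mentions. Round-robin selection is one simple way to preserve the spread of the majority class; the paper may select samples differently.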
Keywords
Class Imbalance; Under-Sampling; Software Defect Prediction
Hrčak ID: 314265
Publication date: 30 January 2024