Original scientific paper

https://doi.org/10.24138/jcomss-2023-0133

SOM-US: A Novel Under-Sampling Technique for Handling Class Imbalance Problem

Ajay Kumar ; KIET Group of Institutions, India


page 69-75

Abstract

Class imbalance is a significant research challenge in data mining and machine learning because most real-world datasets are imbalanced. When a dataset is highly imbalanced, most classification techniques underperform on minority-class instances, because they maximize overall accuracy and disregard the relative distribution of the classes. Techniques based on sampling, cost-sensitive learning, and ensemble methods have recently been employed to handle the class imbalance problem. This paper proposes a new clustering-based under-sampling (US) technique, called SOM-US, that uses the self-organizing map (SOM) to handle class imbalance. To validate the proposed approach, an experimental study applied SOM-US to a NASA software defect dataset to improve the capability of a logistic regression classifier for software defect prediction. The proposed approach was compared with six existing under-sampling methods on two performance measures. The results demonstrate that SOM-US significantly improves the prediction capability of logistic regression for software defect prediction compared with the other under-sampling techniques.
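The abstract does not spell out the SOM-US procedure, so the following is only a minimal sketch of the general idea of clustering-based under-sampling with a self-organizing map, not the paper's algorithm. It assumes a small rectangular SOM trained on the majority class only, and it assumes a round-robin selection rule that draws majority samples evenly from the SOM nodes until the majority class matches the minority class in size. The function names `train_som` and `som_undersample`, the grid size, and all hyperparameters are hypothetical choices made for illustration.

```python
import numpy as np

def train_som(X, grid=(3, 3), epochs=20, lr0=0.5, sigma0=1.0, seed=0):
    """Train a tiny rectangular SOM; returns weights of shape (n_nodes, n_features)."""
    rng = np.random.default_rng(seed)
    rows, cols = grid
    # Grid coordinates of each node, used by the Gaussian neighbourhood function.
    coords = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Initialise node weights from randomly chosen training samples.
    weights = X[rng.integers(0, len(X), size=rows * cols)].astype(float)
    n_steps = epochs * len(X)
    step = 0
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 1e-3     # linearly shrinking neighbourhood
            # Best matching unit: node whose weight vector is closest to the sample.
            bmu = np.argmin(((weights - X[i]) ** 2).sum(axis=1))
            d2 = ((coords - coords[bmu]) ** 2).sum(axis=1)
            h = np.exp(-d2 / (2.0 * sigma ** 2))
            # Pull the BMU and its grid neighbours towards the sample.
            weights += lr * h[:, None] * (X[i] - weights)
            step += 1
    return weights

def som_undersample(X, y, majority_label, grid=(3, 3), seed=0):
    """Balance a binary dataset by keeping majority samples spread across SOM nodes.

    Assumption (not from the paper): samples are picked round-robin from the
    nodes so that every occupied region of the majority class is represented.
    """
    rng = np.random.default_rng(seed)
    maj_idx = np.flatnonzero(y == majority_label)
    min_idx = np.flatnonzero(y != majority_label)
    n_keep = min(len(min_idx), len(maj_idx))
    X_maj = X[maj_idx]
    weights = train_som(X_maj, grid=grid, seed=seed)
    # Assign every majority sample to its best matching unit (its cluster).
    dists = ((X_maj[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    bmu = dists.argmin(axis=1)
    # Shuffle the members of each node, then take one per node in turn.
    clusters = [rng.permutation(maj_idx[bmu == k]) for k in range(len(weights))]
    kept = []
    depth = 0
    while len(kept) < n_keep:
        for members in clusters:
            if depth < len(members) and len(kept) < n_keep:
                kept.append(members[depth])
        depth += 1
    keep_idx = np.sort(np.concatenate([np.array(kept, dtype=int), min_idx]))
    return X[keep_idx], y[keep_idx]
```

In a pipeline like the one the abstract describes, the balanced output would be used to fit a logistic regression model, with features standardized beforehand so that the Euclidean distances used by the SOM are meaningful. Drawing evenly from the nodes, rather than in proportion to node size, is one possible design choice; it favours coverage of sparse regions over preserving the original majority-class density.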

Keywords

Class Imbalance; Under-Sampling; Software Defect Prediction

Hrčak ID:

314265

URI

https://hrcak.srce.hr/314265

Publication date:

30.1.2024.
