Original scientific paper

https://doi.org/10.17559/TV-20151219112129

Sampling imbalance dataset for software defect prediction using hybrid neuro-fuzzy systems with Naive Bayes classifier

K. Punitha ; Anna University, Sardar Patel Road, Chennai 600025, Tamil Nadu, India
B. Latha ; Sri Sai Ram Engineering College, Sai Leo Nagar, West Tambaram, Chennai 600 044, Tamil Nadu, India


page 1795-1804



Abstract

Software defect prediction (SDP) is a difficult task in software projects. It helps to identify and locate defects in software modules. The task becomes more costly as the size of the project's modules grows and more complex testing and evaluation mechanisms are added. Furthermore, measuring software in a consistent and disciplined manner offers several advantages, such as more accurate estimates of project cost and schedule and improved product and process quality. Detailed analysis of software metric data also gives significant clues about where defects are likely to be located in the code. The main goal of this work is to introduce software defect detection and prevention methods that identify defects using machine learning approaches. The work uses imbalanced datasets from NASA’s Metrics Data Program (MDP), and the software metrics are selected from these datasets with the Genetic algorithm with Ant Colony Optimization (GACO) method. A sampling process based on the semi-supervised learning Modified Co Forest method generates balanced labelled data from the imbalanced datasets; these data are then used for defect detection with Hybrid Neuro-Fuzzy Systems combined with a Naive Bayes classifier. The experimental results show that the proposed method is more efficient and performs better in software defect prediction than the other available methods.
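
The abstract describes rebalancing an imbalanced defect dataset before training a classifier. As a rough illustration of that general idea only, the following minimal Python sketch may help. It is not the authors' method: plain random oversampling stands in for the Modified Co Forest sampler, a Gaussian Naive Bayes classifier is used on its own without the neuro-fuzzy component, GACO metric selection is omitted, and the data are synthetic rather than taken from NASA's MDP. All function names and the two toy metrics (lines of code, complexity) are hypothetical.

```python
import math
import random


def make_toy_modules(seed=0, n_clean=90, n_defective=10):
    """Build a synthetic, imbalanced set of [lines_of_code, complexity] rows.

    Assumption: label 0 = clean module, label 1 = defective module.
    """
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n_clean):
        rows.append([rng.gauss(50, 10), rng.gauss(5, 1)])
        labels.append(0)
    for _ in range(n_defective):
        rows.append([rng.gauss(150, 10), rng.gauss(15, 1)])
        labels.append(1)
    return rows, labels


def oversample(rows, labels, seed=0):
    """Randomly duplicate minority-class rows until all classes have equal size."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def fit_gaussian_nb(rows, labels):
    """Estimate the prior, per-feature means, and per-feature variances of each class."""
    model = {}
    for label in set(labels):
        members = [r for r, l in zip(rows, labels) if l == label]
        n = len(members)
        means = [sum(col) / n for col in zip(*members)]
        # A small constant keeps the variance strictly positive.
        variances = [
            sum((x - m) ** 2 for x in col) / n + 1e-9
            for col, m in zip(zip(*members), means)
        ]
        model[label] = (n / len(rows), means, variances)
    return model


def predict(model, row):
    """Return the class with the highest log-posterior for one module."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for x, m, v in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


if __name__ == "__main__":
    rows, labels = make_toy_modules()
    balanced_rows, balanced_labels = oversample(rows, labels)
    model = fit_gaussian_nb(balanced_rows, balanced_labels)
    print(predict(model, [150.0, 15.0]))
```

Balancing before fitting keeps the class prior from being dominated by the majority (clean) class, which is the basic motivation for sampling an imbalanced defect dataset.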

Keywords

Genetic algorithm with Ant Colony Optimization (GACO); NASA’s Metrics Data Program (MDP); Semi supervised learning Modified Co Forest method; Software defect prediction (SDP)

Hrčak ID:

169699

URI

https://hrcak.srce.hr/169699

Publication date:

29.11.2016.
