Technical gazette, Vol. 23 No. 6, 2016.
Original scientific paper
https://doi.org/10.17559/TV-20151219112129
Sampling imbalance dataset for software defect prediction using hybrid neuro-fuzzy systems with Naive Bayes classifier
K. Punitha; Anna University, Sardar Patel Road, Chennai 600025, Tamil Nadu, India
B. Latha; Sri Sai Ram Engineering College, Sai Leo Nagar, West Tambaram, Chennai 600 044, Tamil Nadu, India
Abstract
Software defect prediction (SDP) is a difficult task in software projects. SDP helps identify and locate defects in software modules. The task tends to become more costly as module size grows and complex testing and evaluation mechanisms are added. Furthermore, measuring software in a consistent and disciplined manner offers several advantages, such as more accurate estimates of project cost and schedule and improved product and process quality. Detailed analysis of software metric data also gives significant clues about the locations of possible defects in program code. The main goal of this work is to introduce software defect detection and prevention methods that identify defects using machine learning approaches. The work uses imbalanced datasets from NASA’s Metrics Data Program (MDP), and the software metrics of these datasets are selected using the Genetic algorithm with Ant Colony Optimization (GACO) method. A sampling process based on the semi-supervised Modified Co Forest method generates balanced labelled data from the imbalanced datasets, and this data is then used for defect detection with Hybrid Neuro-Fuzzy Systems combined with a Naive Bayes classifier. The experimental results indicate that the proposed method yields higher efficiency and better defect prediction performance than the other available methods.
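The balance-then-classify idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: plain random oversampling stands in for the Modified Co Forest sampling step, a bare Gaussian Naive Bayes stands in for the hybrid neuro-fuzzy classifier, and GACO metric selection is omitted. The function and class names, the two metrics, and the toy data are all hypothetical.

```python
import math
import random
from collections import defaultdict


def oversample_minority(rows, labels, seed=0):
    """Randomly duplicate minority-class rows until every class has equal size.

    Assumption: a simple stand-in for the paper's Modified Co Forest sampling.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


class GaussianNaiveBayes:
    """Gaussian Naive Bayes: features are treated as independent given the class."""

    def fit(self, rows, labels):
        by_class = defaultdict(list)
        for row, label in zip(rows, labels):
            by_class[label].append(row)
        total = len(rows)
        self.stats = {}
        for label, members in by_class.items():
            n = len(members)
            columns = list(zip(*members))
            means = [sum(col) / n for col in columns]
            # A small variance floor avoids division by zero for constant features.
            variances = [
                sum((x - m) ** 2 for x in col) / n + 1e-9
                for col, m in zip(columns, means)
            ]
            self.stats[label] = (math.log(n / total), means, variances)
        return self

    def predict_one(self, row):
        def log_posterior(label):
            log_prior, means, variances = self.stats[label]
            return log_prior + sum(
                -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
                for x, m, v in zip(row, means, variances)
            )

        return max(self.stats, key=log_posterior)


# Hypothetical toy data: [lines_of_code, cyclomatic_complexity]; 1 = defective.
X = [[20, 2], [35, 3], [40, 4], [25, 2], [30, 3], [50, 5], [300, 25], [280, 30]]
y = [0, 0, 0, 0, 0, 0, 1, 1]

X_bal, y_bal = oversample_minority(X, y)
model = GaussianNaiveBayes().fit(X_bal, y_bal)
print(model.predict_one([290, 28]))  # large, complex module
print(model.predict_one([30, 3]))    # small, simple module
```

Balancing before fitting keeps the class prior from being dominated by the non-defective majority, which is the motivation the abstract gives for sampling the imbalanced MDP data first.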
Keywords
Genetic algorithm with Ant Colony Optimization (GACO); NASA’s Metrics Data Program (MDP); Semi supervised learning Modified Co Forest method; Software defect prediction (SDP)
Hrčak ID: 169699
Publication date: 29.11.2016.