Original scientific paper

https://doi.org/10.32985/ijeces.13.10.15

Software Defect Prediction using Deep Learning by Correlation Clustering of Testing Metrics

Kamal Kant Sharma ; Department of Information Technology, KIET Group of Institutions, Delhi-NCR, Ghaziabad, Dr. A.P.J. Abdul Kalam Technical University, Lucknow, India
Amit Sinha ; Department of Information Technology, ABES Engineering College, Ghaziabad, Dr. A.P.J. Abdul Kalam Technical University, Lucknow, India
Arun Sharma ; Department of AI and Data Sciences, Indira Gandhi Delhi Technical University for Women, Delhi, India


Full text: English, PDF, 763 KB

page 953-960


Abstract

The software industry has made significant efforts in recent years to improve software quality. Proactive defect prediction helps programmers and white-box testers detect issues early, saving time and money. Conventional software defect prediction methods rely on traditional source code metrics such as code complexity and lines of code. These features, unfortunately, cannot capture the semantics of source code. In this paper, we present a novel Correlation Clustering fine-tuned CNN (CCFT-CNN) model based on testing metrics. CCFT-CNN can predict the regions of source code that contain faults, errors, and bugs. Abstract Syntax Tree (AST) tokens are extracted from the source code as testing metric vectors. Correlations among the AST testing metrics are computed, and the metrics are clustered into a more relevant feature vector that is fed into a Convolutional Neural Network (CNN). Then, to improve the accuracy of defect prediction, the CNN model is fine-tuned by adjusting its hyperparameters. The evaluation is performed on the PROMISE dataset, which contains samples of open-source Java applications: Camel, Jedit, Poi, Synapse, Xerces, and Xalan. The results show that the CCFT-CNN model increases the average F-measure by 2% compared to the baseline model.
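
The abstract does not spell out how the correlation clustering step works, so the following is only a minimal sketch of one plausible reading: metric columns whose absolute Pearson correlation with a seed column meets a threshold are grouped together, and one representative per group is kept as the reduced feature vector. The function names, the greedy grouping strategy, and the 0.8 threshold are illustrative assumptions, not the authors' procedure.

```python
import numpy as np


def correlation_clusters(X, threshold=0.8):
    """Greedily group feature columns by absolute Pearson correlation.

    Hypothetical helper: a column joins a cluster when its |r| with the
    cluster's seed column is at least `threshold`. Constant columns would
    produce NaN correlations and are assumed to be removed beforehand.
    """
    corr = np.abs(np.corrcoef(X, rowvar=False))
    unassigned = list(range(corr.shape[0]))
    clusters = []
    while unassigned:
        seed = unassigned.pop(0)
        members = [seed] + [j for j in unassigned if corr[seed, j] >= threshold]
        unassigned = [j for j in unassigned if j not in members]
        clusters.append(members)
    return clusters


def reduce_features(X, clusters):
    """Keep the seed column of each cluster as its representative."""
    return X[:, [members[0] for members in clusters]]


# Toy metric matrix (rows = source files, columns = metrics).
a = np.arange(10, dtype=float)
b = np.array([1.0, -1.0] * 5)
X = np.column_stack([a, 2 * a + 1, b, -b])

clusters = correlation_clusters(X)
X_reduced = reduce_features(X, clusters)
print(clusters)          # column indices grouped by correlation
print(X_reduced.shape)   # one column per cluster
```

In this toy matrix, columns 0 and 1 are linearly related, as are columns 2 and 3, so four metrics collapse to two representatives. The reduced matrix is what would then be passed on to a classifier such as a CNN.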

Keywords

Software Engineering; Software Testing; Abstract Syntax Tree; Machine Learning; Convolutional Neural Network

Hrčak ID:

290502

URI

https://hrcak.srce.hr/290502

Publication date:

21.12.2022.