Mining Software Repositories for Defect Categorization

Kumaresh, Sakthi; Baskaran, Ramachandran

doi:10.24138/jcomss.v11i1.115

Journal of Communications Software and Systems, Vol. 11 No. 1, 2015.

Original scientific paper

https://doi.org/10.24138/jcomss.v11i1.115

Sakthi Kumaresh ; MOP Vaishnav College for Women, Chennai, India
Ramachandran Baskaran ; Department of Computer Science and Engineering, Anna University, Chennai, India

page 31-36

Abstract

Early detection of software defects is essential for reducing software cost and, in turn, improving software quality. The success of a software organization depends not only on gaining knowledge about software defects but also on how information about those defects is collected and used. In the software industry, people at different levels, from customers to engineers, apply diverse mechanisms to assign defects to a particular class. Categorizing bugs by their characteristics helps the software development team take appropriate action to reduce similar defects that might be reported in future releases. Manual classification is time-consuming and labor-intensive, and labeling the data requires people with expert testing skills and domain knowledge. There is therefore a strong need for automatic classification of software defects.
This work attempts to categorize defects by proposing an algorithm called Software Defect CLustering (SDCL). It mines existing online bug repositories such as Eclipse, Bugzilla and JIRA to analyze defect descriptions and their categorization. The proposed algorithm is based on text clustering and uses three major modules to determine the class to which a defect should be assigned. Software bug repositories hold defect data with attributes such as defect description, status, and open and close dates. The defect extraction module extracts defect descriptions from the various bug repositories and converts them into a unified format for further processing. The data preprocessing module removes unnecessary and irrelevant text from the defect data. Finally, the clustering module groups the defect data into clusters of similar defects. The algorithm achieves a classification accuracy of more than 80% on all three of the repositories mentioned above.
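
The three modules described above (extraction, preprocessing, clustering) can be illustrated with a minimal Python sketch. This is not the authors' SDCL implementation: the abstract does not specify the record format, the stop-word list, the term weighting, or the clustering technique. The sketch therefore assumes hypothetical input records with a "description" or "summary" field, a small illustrative stop-word list, TF-IDF weighting, and a simple single-pass threshold clustering on cosine similarity. All function names and the threshold value are assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Illustrative stop-word list (assumption; the abstract does not give one).
STOP_WORDS = {"the", "a", "an", "is", "in", "on", "when", "to", "of", "and", "for", "with"}

# Hypothetical records standing in for entries pulled from different bug trackers.
SAMPLE_RECORDS = [
    {"description": "Application crashes with null pointer exception on startup"},
    {"summary": "Null pointer exception when application starts"},
    {"description": "Button label text is misaligned in settings dialog"},
    {"summary": "Misaligned label text in the settings dialog"},
]


def extract(records):
    """Extraction step: map differently shaped records to plain description strings."""
    return [r.get("description") or r.get("summary") or "" for r in records]


def preprocess(text):
    """Preprocessing step: lowercase, keep alphabetic tokens, drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def tfidf_vectors(docs):
    """Turn token lists into sparse TF-IDF vectors (dicts of term -> weight)."""
    n = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        term_freq = Counter(doc)
        vectors.append(
            {t: c * (math.log(n / doc_freq[t]) + 1.0) for t, c in term_freq.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def cluster_defects(records, threshold=0.3):
    """Clustering step: single-pass threshold clustering (an assumed stand-in).

    Each defect joins the first cluster whose first member is at least
    `threshold` similar to it; otherwise it starts a new cluster.
    Returns clusters as lists of record indices.
    """
    vectors = tfidf_vectors([preprocess(text) for text in extract(records)])
    clusters = []
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


if __name__ == "__main__":
    print(cluster_defects(SAMPLE_RECORDS))  # -> [[0, 1], [2, 3]]
```

On the four sample records, the two crash reports and the two layout reports fall into separate clusters. Single-pass clustering is order-dependent and was chosen here only for brevity; a real evaluation would need the technique and parameters reported in the full paper.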

Keywords

Software Defect; Defect Classification; Bug Repository; Clustering; Bug Categorization

Hrčak ID:

179768

URI

https://hrcak.srce.hr/179768

Publication date:

20.3.2015.
