Design and Implementation of a Web-Based Application for Code Smells Repository

Abstract: Pitfalls in the software development process can be prevented by learning from other people's mistakes. Software practitioners and researchers document lessons learned, and knowledge about best practices is spread across the literature. The presence of code smells does not mean that software will not work, but it can reveal deeper problems and a rising risk of future failure. Software metrics are applied to detect code smells, whereas refactoring can remove them, improving code quality and making the code simpler and cleaner. Detection tools facilitate the management of code smells. Knowledge about code smells and related concepts can assist the software maintenance process. The exploratory analysis of code smells carried out in this paper covers collecting data about code smells, identifying related concepts, and categorizing and organizing this knowledge into a code smell repository that can be made available to software developers. A detailed literature survey is carried out to identify code smells and related concepts. An initial list of 22 code smells proposed in 1999 has grown over the years to 65 code smells. The relationships between code smells, software metrics, refactoring methods and detection tools reported in the literature are also documented. Templates are designed to capture knowledge about code smells and related concepts. A code smell repository is designed and implemented to maintain the information gathered about code smells and related concepts, and it is made available to software practitioners. In this way, the knowledge about code smells found in the literature is collected, organized and made accessible.


INTRODUCTION
Software quality degrades over time for various reasons, such as software ageing, improper design, unsuitable requirement analysis and inappropriate coding practices. A code smell is an indication that something is wrong in some part of the code or system design [1]. The occurrence of bad smells has a marked influence on code quality: it makes the system more complex and less comprehensible, and it causes maintainability problems [2,3]. Bad smells are usually not bugs; however, researchers have reported that many bad smells are associated with bugs and maintainability issues [4][5][6][7]. Bad smells do not currently prevent the code from functioning, but they are clear signs of design weaknesses that may slow down development or increase the probability of bugs or software rot through long-term decay. In the 1990s, Kent Beck proposed that refactoring can modify source code to improve its quality. Refactoring is a systematic process of improving source code without adding new functionality, turning disordered code into clean code with a simple design [8]. Code smell detection can be carried out effectively by using appropriate software metrics [9]. A software metric is a measurement indicator for the latent attributes of a software system or software development process [4,10,11]. Metric-based detection approaches interpret code metrics, extracted from a particular system element, by applying a set of threshold filter rules [12]. The main goal of this strategy is to give engineers a mechanism for working at a more abstract level, one that is conceptually closer to their real goals in using metrics. Furthermore, several tools have been developed to detect code smells and improve code quality during software development [13]. Software tools support developers through automatic or semi-automatic detection of bad smells, and they focus on the entities most likely to exhibit code smells [14].
This paper presents an exploratory study of code smells. The literature survey shows that there are many code smells with corresponding detection methods based on software metrics. There is also a large set of tools that support code smell detection, and there are well-defined refactoring methods that can be used to remove code smells. This knowledge needs to be organized into a code smell repository so that it is readily available to developers and practitioners.
The rest of this paper is organized as follows: Section 2 describes the background and related work. The exploratory analysis of code smells is explained in Section 3. The organization of code smell knowledge is presented in Section 4, followed by the conclusion in Section 5.

BACKGROUND AND RELATED WORKS
Several researchers, such as Opdyke et al. [15], showed that some situations in source code may need refactoring. Refactoring is the process of changing a software system in such a way that its external behaviour does not change but its internal structure improves. It can improve the design of software and reduce its complexity; after refactoring, software systems are easier to comprehend and maintain. Webster [16] and Brown et al. [17] identified some code smells such as Blob and Spaghetti Code [18]. Later, Kent Beck and Martin Fowler [19] called the situations that may need refactoring bad smells. They proposed an initial list of 22 code smells, each indicating something wrong in the system code, without categorizing them, and claimed that no set of precise metrics can be specified to recognise the need for refactoring. Thus, bad smells are a kind of compromise between vague programming judgement and precise source code metrics. Fowler presented a set of refactorings with step-by-step comments on how each smell can be removed, but he did not give the particular characteristics, detection techniques and refactoring process of each smell. Van Emden and Moonen [20] proposed the first formalization of code smells and revealed the undesirable effect of bad smells on the software product. They suggested automatic detection and visualization of code smells, with a methodology for reducing the impact of code smells on Java source code; the resulting code smell browser was named "Jcosmo". In a later study, they found that the presence of smells has a strong influence on software quality [18].
Kerievsky [21] presented more refactorings and, in his refactoring book, introduced new code smells such as Conditional Complexity, Combinatorial Explosion and Indecent Exposure. Mantyla [22] presented Divergent Change as a concealed smell that cannot be detected by simply looking at the code or by tools; detecting it requires a good understanding of the code and experience in implementing changes to the source code. In 2003, he [23] introduced more smells and proposed a classification of 22 code smells into seven groups, where the smells in each group share a similar impression [23,13]. Li and Shatnawi [24] examined the association between class error probability and code smells at three error-severity levels: High (Blocker and Critical), Medium (Major), and Low (Normal and Minor). They reported that refactoring a class improves architectural quality and decreases the probability of class errors when the system is released. In 2008, they [25] extended their study of the relationship between software metrics, code smells and class error probability. Fontana et al. [26] presented a comparative study of the code smells detected by various refactoring tools and of their support for semi-automatic refactoring. Ouni et al. [27] defined a search-based refactoring strategy for preserving the domain semantics of code when refactoring is decided or implemented automatically. They argued that a refactoring may be syntactically correct and preserve behaviour but still model the domain semantics incorrectly. Palomba et al. [28] surveyed developers' perceptions of bad smells and reported a gap between theory and practice; their survey offered insights into bad smells that had not yet been explored sufficiently.
Pinto and Kamei [29] examined StackOverflow data to explore obstacles to the adoption of code smell detection tools, and prepared a list of the adoption and usability problems that users described on StackOverflow. Tufano et al. [30] surveyed hundreds of projects to explore the problems of bad smells and identified the reasons why bad smells appear in code. Kaur and Dhiman [31] carried out a detailed survey of search-based tools and techniques for identifying bad code smells in object-oriented systems; the authors point out the lack of a standard benchmark system for comparing the outcomes of existing code smell detection strategies. Fontana et al. [32] argued that code smells and architectural smells are not the same and suggested that developers focus more on hazardous architectural smells. Reis et al. [33] performed a Systematic Literature Review (SLR) of the state-of-the-art methods and tools applied to code smell detection and visualization. Their results showed that the most frequently applied detection methods are search-based techniques, which mainly apply ML algorithms. Martins et al. [34] presented a survey on the harmfulness of co-occurrences of code smells and their influence on internal quality attributes; eliminating co-occurring code smells reduces the complexity of the system. Kaur [35] published a systematic literature review on the empirical analysis of the relationship between code smells and software quality attributes. The author observed that most data sets used in such studies are small and written in Java, and that code smells mostly affect external quality attributes. Al-Shaaby et al. [36] recently presented a systematic literature review of bad smell detection using machine learning techniques. Their results showed that God Class, Long Method, Feature Envy and Data Class are the most frequently detected code smells, and that Java and Weka are the language and tool most used by researchers.

EXPLORATORY ANALYSIS OF CODE SMELLS
Exploratory Analysis of code smells involves collecting data about code smells, identifying related concepts, categorizing and organizing this knowledge into a code smell repository so that it can be made readily available to software developers and practitioners.

Collection of Data about Code Smells
Kent Beck, the originator of extreme programming, emphasized the importance of design quality during software development in the 1990s and popularized the term code smell. The term came into general use in programming after it appeared in the book Refactoring: Improving the Design of Existing Code by Martin Fowler, a well-known software practitioner who promoted the practice of refactoring. An initial list of 22 code smells was introduced by Kent Beck and Martin Fowler in 1999 as situations indicating something improper in the system code. The initial list has grown over the years, and knowledge about a large set of code smells is spread across the literature. For this exploratory study, 65 code smells were gathered from the existing literature, as shown in Tab. 1. The names of code smells are part of the designer's vocabulary. Novice designers sometimes do not know the precise definitions of code smells and simply use the names out of familiarity. Tab. 1 therefore gives short, simple definitions to help designers and other researchers identify code smells and the motivation for resolving them.

Tab. 1 Code smells and their definitions (excerpt: rows 27 to 42 and 62 to 65)
[...]: Code that was used earlier but is not presently used.
27. Brain Class: A class that centralizes system functionality but does not use much data from foreign classes and is more cohesive.
28. Brain Method: A method that centralizes the functionality of a class.
29. Extensive (Dispersed) Coupling: A single operation calls one or a few methods from each of an excessive number of provider classes.
30. Intensive Coupling: A method calls many other operations in the system, all from one or a few classes.
31. Tradition Breaker: A derived class hardly specializes the inherited services and instead provides services unrelated to the functionality inherited from its base class.
32. Spaghetti Code: A class without structure that declares long and complex methods, which interact without parameters through global variables.
33. Speculative Generality: An abstract class that is currently unused but is expected to be used in future releases of the system.
34. Inappropriate Intimacy: Two classes that are highly coupled, or one class that uses the internal fields and methods of another class.
35. Complex Class: A class with high complexity.
36. Class Data Should Be Private (CDSBP): A class that exposes its attributes, violating the principle of data hiding.
37. Instanceof: A chain of "instanceof" operators in the same block of code.
38. Typecast: Explicit conversion of an object from one class type into another.
39. Missing Template Method: Two different components have major similarities but do not use an interface.
40. Cyclic (Circular) Dependencies: Two or more subsystems are involved in a cycle, in violation of the Acyclic Dependencies Principle.
41. Blob Operation: A huge and complex operation that tends to centralize too much of the functionality of a class or module.
42. Sibling Duplication: Equivalent functionality defined by two or more siblings in an inheritance hierarchy.
62. Functional Decomposition: Highly procedural, non-object-oriented code written in an object-oriented language. It occurs when a class is designed with the intent of performing a single function.
63. Deficient Encapsulation: The declared accessibility of one or more members of an abstraction is more permissive than actually necessary.
64. God Method: A method that accumulates functionality until it becomes out of control and difficult to maintain and extend.
65. Type Checking: Code that selects which variation of an algorithm to execute based on the value of an attribute.

Identifying Related Concepts
Code smells are symptoms in the source code that may indicate a deeper problem in the software system. Detecting code smells is challenging for practitioners and developers. Different viewpoints lead to the application of several detection metrics, detection tools and refactoring actions [14]. Software metrics are standard measurements that are compared against reference values, called thresholds, to assess the maintainability of software systems and to identify code smells. Tools are another means of code smell detection. A variety of detection tools have been developed, based on different approaches and on specific parameters for detecting particular smells. Refactoring is a systematic procedure for improving source code without adding new functionality, turning the code into clean code with a simple design. Fig. 1 depicts code smells as the centre of the study together with the related concepts: software metrics, detection tools and refactoring actions. The following subsections describe these related concepts in further detail.

Detection Tools
Several tools and IDEs (integrated development environments) are available for detecting code smells [37]. Code smells are suggested as a means for programmers to improve their software. When programmers write code, bad smells often go unnoticed. Detection tools are therefore developed to make programmers aware of bad smells in their code and to help them recognize the causes of those smells. Many code smell detection tools are available, but it is difficult to enumerate all of them and to define exactly which bad smells each can detect. A short introduction to some well-known tools is therefore provided.
a) inFusion: inFusion is the current commercial evolution of iPlasma and detects 22 code smells. Refactoring is not available, but findings are linked to the code. [13,14,26,38]
b) iPlasma: iPlasma is a platform for quality assessment of object-oriented systems that supports all steps of analysis. Refactoring and a link to the code are not available. [26,38]
c) JDeodorant: JDeodorant automatically recognizes code smells and can determine a proper sequence of refactorings. It is also linked to the code. [11,13,14,38]
d) JSpIRIT: JSpIRIT supports Java code, identifying and ranking code smells. Automated refactoring and a link to the code are not available. [14,26,38,39]
e) PMD: PMD scans programs and searches for faults. Refactoring is not available, and the detection technique is based on software metrics. [13,14,26,38]
f) Checkstyle: Checkstyle, like PMD, uses software metrics and thresholds to detect bad smells. Automatic refactoring is not available, but it is linked to the code. [13,26,38]
g) Stench Blossom: Stench Blossom provides a visualization environment that gives programmers a high-level overview of the bad smells in their code. Automated refactoring is not available, but there is a direct link to the code. [13,26,38]
h) DÉCOR: DÉCOR allows the specification and automatic detection of bad smells. Refactoring is available, as are code links. [13,26,38]
i) inCode: inCode is a commercial tool based on inFusion for detecting bad smells; it supports programmers while they write code in the programming environment. [13,40]
Tab. 2 shows the relationship between code smells and detection tools. According to the table, iPlasma detects the largest number of code smells, 17. iPlasma therefore covers a larger set of code smells than the other tools, which makes it a reasonable choice for developers who want broad coverage. Only 28 code smells are detected by at least one of these 9 detection tools.

Software Metrics
Software quality metrics measure software attributes related to software quality during the software development process. Many software metrics are available for systems realized in various paradigms, such as Object-Oriented Programming (OOP). Identifying the factors of software quality and mapping them to quantitative measures is critical to the sustained success of an end product. Software metrics have attracted considerable attention from researchers and developers over the last decade [41]. Much effort has gone into extracting quantitative information from software components. Software metrics are often classified into types [42], and the classification depends on the point of view. Shepperd and Ince [43] proposed a classification into two groups: traditional metrics and object-oriented metrics.
Later, Saker [44] suggested a categorization of software metrics based on subject and paradigm, dividing them into project-based metrics and design-based metrics. Fenton and Bieman [45] offered a two-dimensional classification: project metrics (product, process or resources) and the level of visibility (internal or external metrics) [42]. Other researchers have divided metrics into basic and additional, objective and subjective, project classification, and static and dynamic. To assess the quality of software systems, it is important to define thresholds for software metrics [46]. Software metrics are used for bad smell detection in source code. Bad smells in source code indicate an unacceptable architectural design that makes the software hard to maintain in future. Software measurement is a process that maps a characteristic of a software product or process to a numeric value [45]. The results are compared with a set of standards defined by individuals or organizations, and a conclusion about software quality is drawn [47]. Software metrics can be applied in each phase of the software development process, such as requirements, design, implementation, testing and evaluation, and maintenance and use, to evaluate software quality. Tab. 3 lists 49 software metrics used for the detection of code smells, each with its abbreviation and a short definition. Not all code smells in Tab. 1 have detection metrics. A code smell that does have metrics can be detected by one or more of them. For instance, Feature Envy can be detected by three metrics [13].
In programming, objects are used as a structure that keeps together data and the operations that process that data. Feature Envy refers to methods that seem more interested in other classes than in the class they belong to. Such methods access a lot of data from foreign classes. This may be because the methods are misplaced and should be moved to another class. Data and operations should be as close together as feasible; this proximity helps to improve cohesion and reduce ripple effects. Detection of Feature Envy is based on counting the data members that a method uses from outside its own class. The detection technique follows these steps:
1) The method uses more than a few attributes of other classes; this is measured by the ATFD (Access To Foreign Data) metric.
2) The method uses more attributes from other classes than from its own class; this is measured by the LAA (Locality of Attribute Accesses) metric.
3) The foreign attributes used come from only a few outside classes; this is measured by the FDP (Foreign Data Providers) metric. This step is needed because a method that uses foreign attributes of one or two outside classes exhibits Feature Envy, whereas a method that uses foreign attributes of many outside classes indicates the Brain Class smell. The third condition therefore separates these two smells.
In addition, researchers count all dependencies of the method, both inside and outside its own class. They use the FDP metric because a method that uses attributes from only a few foreign classes can easily be moved to one of those classes, which decreases the dispersion of classes. Also, the foreign class includes less functionality, and the Feature Envy method has high complexity and size.
Feature Envy can be detected by Eq. (1) and Fig. 2, where FEW takes the value of 5 [13,48]. In contrast, Large Class can be detected by LOC alone [13].
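The three detection steps described for Feature Envy can be read as a conjunction of threshold tests on ATFD, LAA and FDP. The following is a minimal sketch under stated assumptions: FEW = 5 comes from the text, while the LAA cutoff of one third is an assumption borrowed from commonly cited metric-based detection strategies and is not stated here. The names MethodMetrics and is_feature_envy are hypothetical.

```python
from dataclasses import dataclass

FEW = 5            # threshold value given in the text
ONE_THIRD = 1 / 3  # ASSUMED cutoff for LAA; not stated in the text


@dataclass(frozen=True)
class MethodMetrics:
    """Metric values for one method (hypothetical container)."""
    atfd: int    # Access To Foreign Data: foreign attributes the method uses
    laa: float   # Locality of Attribute Accesses: own-class share of accessed attributes, 0..1
    fdp: int     # Foreign Data Providers: classes that own those foreign attributes


def is_feature_envy(m: MethodMetrics, few: int = FEW,
                    laa_cutoff: float = ONE_THIRD) -> bool:
    """Apply the three steps as one conjunction of threshold tests."""
    uses_many_foreign_attributes = m.atfd > few   # step 1
    prefers_foreign_data = m.laa < laa_cutoff     # step 2
    few_providers = m.fdp <= few                  # step 3
    return uses_many_foreign_attributes and prefers_foreign_data and few_providers


# A method touching 8 foreign attributes owned by 2 classes, with only 20%
# of its attribute accesses staying inside its own class.
suspect = MethodMetrics(atfd=8, laa=0.2, fdp=2)
print(is_feature_envy(suspect))
```

Keeping the thresholds as parameters makes it easy to try other values when the repository is later extended with formulas and threshold values.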
Tab. 4 illustrates the relationship between some code smells and their code smell detection metrics.

Refactoring Actions
As reported by Fowler, code smells can be removed by refactoring. Refactoring improves the design of the existing code of a software system by modifying its internal structure without affecting its external behaviour. The main goal of a refactoring action is to improve software design quality and quality attributes such as understandability, flexibility and reusability. Refactoring does not shape the design of a software system during the initial design step; it improves the design during the maintenance phase [7]. Tab. 5 lists the names and definitions of refactoring actions, and Tab. 6 describes the relationship between code smells and the corresponding refactoring actions. For example, when clients interact with two classes, one of which has a preferred interface, the interfaces are unified with an adapter.

Categorization of Code Smells and Related Concepts
Categorization is the grouping of objects according to their similarities, common features or the relationships among the members of a group. It is an essential process for the cognition of things. Categorization organises knowledge and improves understandability, because each element inherits the attributes of its category. Fig. 3 gives an overall view of the categorization of code smells and related concepts.
Each of these categories is explored in further detail in the following subsections.

Categorization of Code Smells
Mantyla [23] proposed a classification of code smells because some code smells are closely related. Each category has a name that reflects the relationship between the bad smells it contains. This classification helps in understanding the smells and identifying the relationships between them. Over the years, researchers have slightly changed the initial categorization. Tab. 7 gives the classification of bad smells with the definition of each category [23,49,50].
In the literature, 22 code smells are classified. This study attempts to classify all the code smells covered in Tab. 1; Tab. 8 shows the category of each code smell.

Tab. 7 Categories of code smells
1. Bloaters: A part of the code has grown so large that it cannot be handled effectively.
2. Object-Orientation Abusers: Incorrect or incomplete use of object-oriented concepts.
3. Change Preventers: A change in one place of the code requires many changes in other places too.
4. Dispensables: Something unnecessary in the code, whose absence would make the code more effective.
5. Couplers: Smells that lead to excessive coupling among classes, or that show what happens when coupling is replaced by excessive delegation.
6. Other Smells: Smells that do not fit into any of the above categories.

Categorization of Tools
Various detection tools are able to perform automatic code inspection. Smell detection tools are categorized as either plug-ins or stand-alone applications [13], and they adopt slightly different approaches to detecting code smells. The Eclipse framework is a common integrated development environment, designed to support tools that can be used to develop applications and other tools or to handle all kinds of documents. A small plug-in loader sits at the core of Eclipse, and all additional functionality is provided by plug-ins [51]. A stand-alone tool runs locally on the device and does not need anything else to be functional. Stand-alone tools have disadvantages in continuity and interpretation. Continuous smell detection requires visual integration of the tool into the IDE (integrated development environment). A stand-alone tool cannot know which part of the code the programmer is editing, so continuity cannot be achieved. A smell detector plug-in, on the other hand, continuously shows whether any code smells have been found, without forcing the programmer to leave the IDE. After finding a smell, the tool can readily show where the smell is and suggest how to remove it, which improves usability. Thus, smell detection tools integrated into IDEs are more effective than stand-alone detection tools. Tab. 9 shows the categorization of the covered detection tools and the languages supported by each of them.

Categorization of Software Metric
Every bad smell involves a particular kind of system element, such as a class or a method, which can be assessed by its internal and external characteristics. Metrics can be applied at file level, class level, component level, method level, process level and quantitative-value level [52]. In this exploratory study, software metrics are categorized as class-level and method-level metrics. Class-level metrics measure the features of a class as well as the collaboration among classes. Class-level metrics that measure class interactions give information about the design of the system more than about its code. Some class-level metrics capture the division of labour between methods, while others capture how much code in other classes is affected by changing a given class. Ideally, changes in one class require minimal changes in other classes. When classes are highly dependent on each other, they should be located in the same package. Method-level metrics are among the most useful metrics. One guideline of good programming is that each method should perform a single, clearly defined function, because a long piece of code is difficult to understand [53,54]. Tab. 10 shows the categorization of the software metrics listed in Tab. 3.

Categorization of Refactoring Actions
Various refactoring actions are available, and some researchers have divided them into six categories, as follows [50].
1) Composing methods: Much of refactoring is concerned with composing methods correctly, because excessively long methods are a root cause of many quality problems. This group restructures methods, eliminates code duplication and paves the way for future improvements.
2) Moving features between objects: These refactoring actions support moving functionality between classes, creating new classes and hiding implementation details from public access.
3) Organizing data: This group assists with data handling, replaces primitives with rich class functionality and helps to untangle class associations, which makes classes more portable and reusable.
4) Simplifying conditional expressions: This group keeps the logic of conditionals from becoming more and more complicated over time.
5) Simplifying method calls: This group streamlines the interfaces for interaction between classes and makes method calls simpler and easier to understand.
6) Dealing with generalizations: This group handles moving functionality along the class inheritance hierarchy, creating new classes and interfaces, replacing inheritance with delegation and vice versa, and anything else related to abstraction.
This study classifies all refactoring actions in Tab. 5, and Tab. 11 shows the category to which each refactoring action belongs.

ORGANIZING THE CODE SMELL KNOWLEDGE
One of the most important objectives of this exploratory study is the organisation of knowledge about code smells. Designing a code smell repository improves the software process, narrows research gaps and provides structured resources for developers. The code smell knowledge is organised in the following steps.

Designing Code Smell Template
The code smell template is designed according to the relationships between code smells, software metrics, detection tools and refactoring actions. A code smell template is designed, and an instance of it is presented.

Code smell Template
Name: The name of the specific code smell
Alias: An alternate name for the code smell
Definition: The definition of the code smell
Links: The list of databases where information about the code smell is available
Category: The name of the category to which the code smell belongs
Detection Tools: The names of the tools that can detect the code smell
Software Metrics: The names of the metrics that can detect the code smell
Refactoring Actions: The names of the refactoring techniques that can remove the code smell
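The template fields map naturally onto a small record type. The sketch below is illustrative only: the class name CodeSmellTemplate and the field names are hypothetical, and the example instance fills in only what this paper states about Feature Envy (its description and its three detection metrics), leaving the remaining fields empty rather than guessing their values.

```python
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class CodeSmellTemplate:
    """One repository entry, mirroring the template fields (hypothetical names)."""
    name: str
    definition: str
    alias: List[str] = field(default_factory=list)
    links: List[str] = field(default_factory=list)
    category: str = ""
    detection_tools: List[str] = field(default_factory=list)
    software_metrics: List[str] = field(default_factory=list)
    refactoring_actions: List[str] = field(default_factory=list)


# Example instance; only facts stated in the text are filled in.
feature_envy = CodeSmellTemplate(
    name="Feature Envy",
    definition="A method that seems more interested in other classes "
               "than in the class it belongs to.",
    software_metrics=["ATFD", "LAA", "FDP"],
)

# asdict() yields a plain dictionary, which is convenient for storing the
# entry as a JSON-like document.
print(asdict(feature_envy))
```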

Designing Code Smell Database Schema
Based on the code smell template, a schema is designed to describe the structure of the repository. A code smell database schema specifies the tables and the corresponding fields contained in a database. It is presented as a list of tables, each containing a list of fields with their data types. The code smell database schema includes main tables, namely the code smell, metric, tool and refactoring tables, and relational tables between the main tables.
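The idea of main tables joined by relational tables can be sketched with an in-memory database. This is an illustration, not the schema of the actual repository: the application described in this paper uses MongoDB, whereas the sketch uses Python's built-in sqlite3 module so that it is self-contained, and every table and column name is a hypothetical choice.

```python
import sqlite3

# Four main tables plus one relational table per smell-to-concept link.
SCHEMA = """
CREATE TABLE code_smell  (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE,
                          alias TEXT, definition TEXT, category TEXT);
CREATE TABLE metric      (id INTEGER PRIMARY KEY, abbreviation TEXT NOT NULL UNIQUE,
                          definition TEXT);
CREATE TABLE tool        (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE refactoring (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE,
                          definition TEXT);
CREATE TABLE smell_metric (
    smell_id  INTEGER NOT NULL REFERENCES code_smell(id),
    metric_id INTEGER NOT NULL REFERENCES metric(id),
    PRIMARY KEY (smell_id, metric_id));
CREATE TABLE smell_tool (
    smell_id INTEGER NOT NULL REFERENCES code_smell(id),
    tool_id  INTEGER NOT NULL REFERENCES tool(id),
    PRIMARY KEY (smell_id, tool_id));
CREATE TABLE smell_refactoring (
    smell_id       INTEGER NOT NULL REFERENCES code_smell(id),
    refactoring_id INTEGER NOT NULL REFERENCES refactoring(id),
    PRIMARY KEY (smell_id, refactoring_id));
"""


def build_demo_db() -> sqlite3.Connection:
    """Create the schema and load the one relationship stated in the text."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO code_smell (name) VALUES (?)", ("Feature Envy",))
    conn.executemany("INSERT INTO metric (abbreviation) VALUES (?)",
                     [("ATFD",), ("LAA",), ("FDP",)])
    # Link Feature Envy to each of its three detection metrics.
    conn.execute(
        "INSERT INTO smell_metric (smell_id, metric_id) "
        "SELECT s.id, m.id FROM code_smell AS s CROSS JOIN metric AS m "
        "WHERE s.name = ?", ("Feature Envy",))
    return conn


def metrics_for(conn: sqlite3.Connection, smell_name: str) -> list:
    """Return the metric abbreviations linked to a smell, sorted by name."""
    rows = conn.execute(
        "SELECT m.abbreviation FROM metric AS m "
        "JOIN smell_metric AS sm ON sm.metric_id = m.id "
        "JOIN code_smell AS s ON s.id = sm.smell_id "
        "WHERE s.name = ? ORDER BY m.abbreviation", (smell_name,)).fetchall()
    return [row[0] for row in rows]


conn = build_demo_db()
print(metrics_for(conn, "Feature Envy"))
```

Separate relational tables keep the many-to-many links explicit: one smell can have several metrics, tools or refactoring actions, and each of those can apply to several smells.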

Making the Knowledge Accessible on Cloud Platform
The code smell knowledge collected from different sources is organized and made accessible on a cloud platform. A new code smell web application is built using Angular, Material Design, Node.js, Express.js and MongoDB to organize the code smell knowledge. Angular is an application design framework and development platform for creating efficient and sophisticated single-page applications for mobile and desktop. Some screenshots of the code smell web application are given in Fig. 4.
The application is hosted on the Heroku cloud platform at https://serene-tundra-28026.herokuapp.com and is still under construction. All the details of Tabs. 1 to 11 are available on the site.

CONCLUSION
The topic of code smells needs to be understood in depth. The objective of this exploratory study is to explore the code smell problem and its related concepts. A code smell repository is designed, code smell knowledge is arranged systematically, and the repository is accessible on a cloud platform. It gives developers and practitioners a solid foundation for exploring their ideas about code smells, and it can help other researchers save time and resources. In the future, the repository will be enhanced by adding formulas and threshold values. The repository can further be used to analyse the relationship between code smells and related concepts, in order to identify a minimal set of metrics, tools or refactoring actions that covers a maximal set of code smells. Data mining techniques such as association rule mining can be used to find representative software metrics for each code smell category, and clustering can be used to obtain a new way of categorizing code smells. The code smell repository and the techniques for extracting insights from it can be made available to developers and practitioners.
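Finding a small set of tools that together detect as many code smells as possible is an instance of the set cover problem, for which a greedy heuristic is a common first approach. The sketch below illustrates that heuristic only; it is not a method proposed in this paper, and the tool names and the tool-to-smell mapping are hypothetical placeholders rather than data from Tab. 2.

```python
from typing import Dict, List, Set


def greedy_cover(detects: Dict[str, Set[str]]) -> List[str]:
    """Repeatedly pick the tool that detects the most still-uncovered smells.

    Greedy selection does not guarantee the smallest possible set, but it is
    simple and usually close. Ties are broken alphabetically by tool name.
    """
    uncovered: Set[str] = set().union(*detects.values())
    chosen: List[str] = []
    while uncovered:
        best = max(sorted(detects), key=lambda t: len(detects[t] & uncovered))
        chosen.append(best)
        uncovered -= detects[best]
    return chosen


# HYPOTHETICAL mapping from tool to the smells it detects.
example = {
    "tool_a": {"Feature Envy", "God Class", "Long Method"},
    "tool_b": {"God Class", "Data Class"},
    "tool_c": {"Long Method"},
}
print(greedy_cover(example))
```

The same function applies unchanged to metrics or refactoring actions if the mapping is built from the corresponding relationship table.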