Market Segmentation of Leisure Boats Exhibited in the Boat Show by Using Multivariate Statistical Techniques

The aim of this study is to segment recreational boats according to their basic parameters in order to develop marketing strategies and to investigate the benefit/cost factors in consumer preferences across segments. For this purpose, 69 recreational boats under 10 meters exhibited at the Istanbul Boat Show were clustered using their basic parameters. Using hierarchical clustering and multidimensional scaling analysis, the boats were divided into four clusters, and the results are intended to serve as an input to the marketing strategies for these boats. The clusters are labelled A, B, C and D, from the lowest segment to the highest. The intended use of each segment is described on the basis of the segment averages calculated for five variables. This segmentation provides guidance in areas ranging from the arrangement of the boats within the fair to marketing, advertising and production strategies. In addition, alternative actions are identified for both the customer and the seller by revealing the costs incurred when customers prefer a different segment.


Introduction
Recreational boats are products that have a special design and can be used for several decades. These boats are used for recreational activities, fishing, fast driving, etc. (Wellsandta et al., 2015). There has been a significant increase in coastal tourism, which started in the 19th century, including recreational boats and sea sports (Davenport & Davenport, 2006). Yachts and recreational boats are gaining popularity worldwide. They have a 2% annual growth rate in the United States alone, and this growth has created a growing market for new boats (Vasconcellos & Latorre, 1999). Satellite images of almost all waterways in developed countries show that the existing water areas are largely covered by composite boats under 20 meters in length. The International Council of Marine Industry Associations (ICOMIA) estimates that there are more than 6 million recreational boats in Europe alone (Marsh, 2013). Approximately 49% of the 850 recreational boats registered in the US database are less than 8.2 m (25 ft) in length (Vasconcellos & Latorre, 1999).
Boat designs evolved from wood boats to composite boats over time, depending on the demands and expectations of customers. Since wooden boats are used at sea, they are exposed to unfavorable effects of such adverse weather conditions as sun, rain, sea water, wave force or wind (Kaygın & Aytekin, 2005). Also, over time, biofilm (germ layer), fouling (biological contamination) layers are formed in parts of wooden ships, boats and yachts that contact water (Bülbül & Filik, 2019). Small recreational boats are mostly made of composite materials. The share of composite materials in the manufacture of boats less than 50 meters in length is around 70% (Dokos & Mondal, 2013). Composite materials have many advantages over other materials in the manufacture of recreational boats. These materials provide 30-40% reduction in the total weight of the boat. Composites are also very flexible and useful in boat design (Garcia, 2013).
Boat Shows are regularly organized in order to present recreational boats to potential customers in different regions of the world. In order to better understand the needs of visitors in boat shows and respond to them effectively, more comprehensive information about boat markets is needed (Park, 2009).
The purpose of marketing is to bring consumers together with suppliers who can meet their needs and demands in the most appropriate way (Dolnicar, Grün, & Leisch, 2018). However, segmentation in marketing is about differentiation (Plenert, 2014). The segmentation method is used in many different research fields such as medical image processing, data clustering, computer algorithms, linguistics, biometrics, supply chain management and tourism research (Kainmueller, 2013; Smith, 1956). Segmentation is the process of dividing customers into homogeneous groups to develop differentiated marketing strategies (Tsiptsis & Chorianopoulos, 2009). Market segmentation is a very popular and widely used tool in strategic marketing (Dolnicar & Leisch, 2013).
Given both a marketing and research perspective, segmentation is mainly used to create a manageable number of groups that share well-defined features (Dexter, 2002). A good market segmentation ensures that cluster members are as similar as possible within the same cluster and as different as possible between clusters. Good market segmentation contributes to a full understanding of the market, an accurate prediction of behaviour, an increased probability of identifying and using new market opportunities, and identification of groups worth following. If segmentation is properly implemented, it will guide companies to adapt their products and services to groups with a high probability of purchase (Tuma, Decker, & Scholz, 2011).
One of the most frequently used methods in market segmentation is cluster analysis. Mueller and Hamm analyzed the change over time in market segmentation using cluster analysis (Mueller & Hamm, 2014). Kuo et al. compared three different cluster analysis methods in market segmentation (Kuo, Ho, & Hu, 2002). Hruschka and Natter compared the K-means clustering technique and an artificial neural network algorithm in market segmentation (Hruschka & Natter, 1999). Arimond and Elfessi used cluster analysis with categorical data in tourism market segmentation (Arimond & Elfessi, 2001). Saunders evaluated the use of cluster analysis in market segmentation (Saunders, 1980). Dolnicar evaluated cluster analysis studies used in market segmentation according to the method used, the number of variables and the number of clusters (Dolnicar, 2002). Apart from cluster analysis, there are various other techniques used in market segmentation. Green and Krieger used conjoint analysis in market segmentation (Green & Krieger, 1991; William, Dauglas, & Gary, 1979). Kelly et al. investigated the potential of artificial neural networks in comparison to other methods in market segmentation. They found that artificial neural networks perform a more accurate classification than discriminant analysis and logistic regression analysis (Kelly, James, & Milam, 1995).
Segmentation studies have been popular subjects in marketing and recreation (Park, 2009). No studies on the segmentation of recreational boats have been found in the literature. In the study conducted by Park, segmentation was carried out for visitors attending a large boat show and three different visitor segments were identified (Park, 2009). For marketing, the segmentation of the products offered to customers is as important as the segmentation of the customers themselves.
The overall dimensions of the boats are clustered around certain measures due to the evolution of the boat over time, depending on the purpose of use, the expectations of owners from the boat, the tax regime and exemptions, as well as the natural conditions in the waters and regions where the boat is used. In this case, it is necessary to divide boats into certain segments and create more homogeneous subclasses. Thus, the manufacturer will have a clearer idea of how many boats it has to produce, and the segment that consumers will demand, the benefit it will receive, and the cost to bear. It will contribute to the revival of the boat market since both the manufacturer and the consumer side will have clear ideas about the market.
Due to the high construction and operating costs of boats and the disadvantages mentioned before, the majority of recreational boats are made of composite material and have a length of less than 10 meters. Therefore, this study focused on boats that meet these criteria.
It is understood that none of the studies in the literature have made a market segmentation for recreational boats, and this is the first study with such a scope. For example, passenger cars are produced by companies in different segments, but there is no such study for boats yet. Another innovative aspect of this study is that, besides revealing market segmentation for boats, the findings of hierarchical clustering analysis and multidimensional scaling analysis are presented visually in a single graph, making it easier to understand the structure of naturally occurring boat clusters.

Data
In this study, the basic dimensions (LOA, width, underwater depth, weight, maximum power) of 69 boats under 10 meters in length, which were presented at the "The Eighth Marine Vessels Equipment and Accessories Expo (CNR Eurasia Boat Show)" held at Yeşilköy / İstanbul CNR Expo Center between 13-21 February 2015, were used.
In data collection, the values in the printed catalogs of the boats were taken. The reason for using these variables in the segmentation of boats is that they are common variables in all catalogs and they are the basic indicators of the design of boats. In this study, each of 69 boats is numbered from 1 to 69.
These boats can generally be divided into two main categories, with inboard and outboard engines. The engines of inboard boats are fixed on the boat and diesel engines are used. In outboard boats, gasoline engines that can be dismantled are used. Outboard engines are used in relatively small boats of up to a certain size, often in situations where high speed is required while inboard diesel engines are preferred in larger and heavier boats where high torque is needed. In this study, 24 of 69 boats with a length of less than 10 meters have an inboard engine while 45 of them have an outboard engine. Descriptive statistics for 69 boats used in the study are given in Table 1.
As can be seen in Table 1, maximum power variable yielded the highest coefficient of variation followed by weight, underwater depth, full length and width variables respectively.
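The ranking by coefficient of variation described above can be reproduced with a few lines of Python. The following is only a minimal sketch: the catalog values and the variable names (`length_m`, `width_m`, `max_power_hp`) are hypothetical and are not the study's data, and the sample standard deviation is assumed.

```python
from statistics import mean, stdev

# Hypothetical catalog values for a handful of boats (NOT the study's data).
boats = {
    "length_m": [4.5, 5.2, 6.1, 7.4, 9.9],
    "width_m": [1.8, 2.0, 2.3, 2.5, 3.2],
    "max_power_hp": [40, 60, 115, 200, 500],
}

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean, as a percentage."""
    return 100.0 * stdev(values) / mean(values)

# Compute the CV per variable and rank the variables from most to least variable.
cv = {name: coefficient_of_variation(vals) for name, vals in boats.items()}
ranked = sorted(cv, key=cv.get, reverse=True)
for name in ranked:
    print(f"{name}: CV = {cv[name]:.1f}%")
```

Because the coefficient of variation is unit-free, it allows variables measured in meters, kilograms and horsepower to be compared on the same footing.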

Method
The complexity of most cases requires researchers to observe and collect data on many different variables related to each other. This method is called multivariate analysis because the data includes simultaneous measurements of many variables (Johnson & Wichern, 2014). As the name implies, multivariate statistical techniques are very powerful and useful techniques as they can include many variables in the analysis at the same time. For important reasons, researchers in all scientific fields have long ceased to rely on classical univariate design (Harris, 2001). Multivariate statistics is an extension of univariate statistics. Multivariate data analysis handles many variables together, and therefore data evaluation often acquires a new and higher quality (Varmuza & Filzmoser, 2008).
The aim of this study is to present information to manufacturers and potential consumers in this market through an in-depth analysis of the dimensions and engine power of small recreational boats, applying hierarchical cluster analysis and multidimensional scaling analysis, which are multivariate statistical techniques.

Hierarchical Cluster Analysis
Cluster analysis is one of the multivariate statistical techniques for finding groups within the data (Kaufman & Rousseeuw, 2005). Cluster analysis is defined as the process of dividing objects into natural groups based on their similarities; it is used to reveal previously undetected relationships between objects, to reduce size and to detect outliers (Ferreira & Hitchcock, 2009).
Cluster analysis is basically divided into two groups, hierarchical and non-hierarchical cluster analysis. The four basic steps to follow in performing hierarchical cluster analysis are as follows (Romesburg, 2004): i) Creating a data matrix whose columns are the objects to be clustered and whose rows are the attributes that define those objects. ii) Standardizing the data matrix. iii) Calculating similarity coefficient values to measure similarities between all object pairs. iv) Using a clustering method to process the similarity coefficient values, resulting in a diagram, called a dendrogram, that shows the similarity hierarchy between all object pairs. There are several methods used in hierarchical cluster analysis: nearest neighbor, furthest neighbor, median clustering, between-groups linkage, within-groups linkage, centroid clustering and the Ward method. The Ward method is generally seen as the method that …
ANOVA is used to compare the averages of k groups for the dependent variable examined. If one categorical variable affects the dependent variable, it is called "one-way ANOVA", and if two categorical variables do, it is called "two-way ANOVA". In the analysis of variance, the following hypotheses are tested: the null hypothesis states that the k groups do not differ from each other in terms of the averages of the dependent variable examined, while the alternative hypothesis states that at least one group differs from the others in terms of these averages (Işığıçok, 2018). Post hoc tests are used to determine which groups differ from the others. In this study, the Bonferroni test, a post hoc test that assumes homogeneity of variances and can be used with unequal sample sizes, was used.
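The one-way ANOVA comparison of k group averages can be illustrated with a short computation of the F statistic. This is a minimal sketch, not the software procedure used in the study: the function name `one_way_anova_f` and the boat lengths are hypothetical, and the p-value lookup is omitted.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical boat lengths (m) in three clusters; NOT the study's data.
lengths = [[4.2, 4.5, 4.8], [5.9, 6.1, 6.4, 6.0], [8.8, 9.4, 9.9]]
f_stat, df_b, df_w = one_way_anova_f(lengths)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F value means that the differences between group averages are large relative to the spread within groups, which is the evidence against the null hypothesis of equal averages.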
Various assumptions must be made in order to perform the ANOVA. Accordingly, the data must be independent of each other, the dependent variable must be a continuous variable measured at the interval or ratio level, and the variances of the groups must be homogeneous. Also, the data in the sample group should be normally distributed. If the distribution is unknown, a sample size of at least 30 is appropriate according to the central limit theorem (Işığıçok, 2018).
Kruskal (1964) introduced multidimensional scaling as the problem of representing n objects (e.g., the boats) geometrically by n points (e.g., each boat's position in two-dimensional space) and stated that the distances between points correspond, in some sense, to experimental differences between objects. In the same study, Kruskal introduced numerical methods for multidimensional scaling, in which the aim is to find the positioning that best fits the differences between objects. To this end, Kruskal defined a natural measure of goodness of fit, called stress, to create a solid theoretical basis for multidimensional scaling. Stress measures how well any given positioning fits the data. The desired positioning is the one with the smallest stress value, found by numerical analysis methods (Kruskal, 1964). Kruskal states that the goodness of fit of a positioning can be interpreted as in Table 2 according to the stress value obtained (Kruskal, 1964). In multidimensional scaling analysis, there are two types of methods, metric and non-metric scaling, depending on the data type. Metric and non-metric methods make different assumptions about the relationship between the data and the distances calculated from the coordinates estimated by the multidimensional scaling model.
While the relationship is assumed to have at least interval-scale properties in the metric method, only ordinal-scale properties are required in the non-metric method (Mackay & Zinnes, 1986). Non-metric multidimensional scaling analysis, first introduced by Shepard (Shepard, 1962) and given a stricter algorithm with an objective optimization criterion by Kruskal, has attracted great theoretical interest as it eliminates the linearity assumption of metric methods (Kenkel & Orloci, 1986).
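The idea of stress as a measure of how well a positioning fits the data can be sketched as follows. This is an illustrative simplification: it uses the stress-1 form, sqrt(sum((d - dhat)^2) / sum(d^2)), and takes the target values dhat directly from the observed dissimilarities, whereas non-metric scaling would first obtain them by monotone regression. The function name and the three-point example are hypothetical.

```python
from itertools import combinations
from math import dist, sqrt

def kruskal_stress(coords, disparities):
    """Stress-1: sqrt(sum((d - dhat)^2) / sum(d^2)) over all object pairs."""
    num = den = 0.0
    for i, j in combinations(range(len(coords)), 2):
        d = dist(coords[i], coords[j])  # distance in the low-dimensional map
        num += (d - disparities[i][j]) ** 2
        den += d ** 2
    return sqrt(num / den)

# Hypothetical target dissimilarities between three objects (a 3-4-5 triangle).
target = [[0, 3, 4],
          [3, 0, 5],
          [4, 5, 0]]

# A positioning that reproduces the targets exactly, and one that does not.
perfect = kruskal_stress([(0, 0), (3, 0), (0, 4)], target)
distorted = kruskal_stress([(0, 0), (3, 0), (0, 3)], target)
print(f"perfect: {perfect:.3f}, distorted: {distorted:.3f}")
```

The positioning sought by the numerical methods is the one that drives this quantity as close to zero as possible.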

Hierarchical Cluster Analysis Findings
Ward technique was chosen for hierarchical cluster analysis and the squared Euclidean distance was used. Also, since variables have different units of measure, all variables are standardized in the range of -1 and +1. To decide on the number of clusters, a line graph consisting of coefficients related to the stages of the hierarchical clustering was used. In Figure 1, the coefficient values corresponding to the clustering stages are shown with a line chart. While the numbers shown on the x axis show the stages in the clustering analysis, the values on the y axis are a coefficient expressing the distance/difference between these boats. As this coefficient on the Y axis increases, the boats differ in terms of the parameters used in cluster analysis.
Considering that dramatic leaps between the coefficients indicate the transition to a new cluster, it can be said that the line graph points to a four-cluster structure. Table 3 shows the four clusters formed as a result of hierarchical cluster analysis. Descriptive statistics for the clusters were calculated and are given in Table 4.
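The procedure described above (rescaling to the range -1 to +1, Ward clustering on squared Euclidean distances, and reading the number of clusters from the largest leap in the stage coefficients) can be sketched in plain Python. This is an illustrative implementation under stated assumptions, not the software routine used in the study: the function names and the six boats are hypothetical, the merge height used here is Ward's distance from the Lance-Williams update rather than the exact coefficient plotted in Figure 1, and no variable may be constant (the rescaling would divide by zero).

```python
def scale_to_unit_range(rows):
    """Rescale every column linearly so its minimum is -1 and its maximum is +1."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [[2 * (x - l) / (h - l) - 1 for x, l, h in zip(r, lo, hi)] for r in rows]

def ward_merge_heights(rows):
    """Agglomerate with Ward's criterion; return the merge height of each stage."""
    pts = scale_to_unit_range(rows)
    size = {i: 1 for i in range(len(pts))}
    # Squared Euclidean distance between every pair of single boats.
    d = {(i, j): sum((a - b) ** 2 for a, b in zip(pts[i], pts[j]))
         for i in size for j in size if i < j}
    heights, next_id = [], len(pts)
    while len(size) > 1:
        (i, j), h = min(d.items(), key=lambda kv: kv[1])
        heights.append(h)
        new = {}
        for k in size:
            if k in (i, j):
                continue
            dki = d[(min(k, i), max(k, i))]
            dkj = d[(min(k, j), max(k, j))]
            n = size[i] + size[j] + size[k]
            # Lance-Williams update for Ward's method.
            new[(k, next_id)] = ((size[i] + size[k]) * dki
                                 + (size[j] + size[k]) * dkj
                                 - size[k] * h) / n
        d = {p: v for p, v in d.items() if i not in p and j not in p}
        d.update(new)
        size[next_id] = size.pop(i) + size.pop(j)
        next_id += 1
    return heights

def suggest_cluster_count(heights):
    """Number of clusters remaining just before the largest leap in height."""
    jumps = [b - a for a, b in zip(heights, heights[1:])]
    stage = jumps.index(max(jumps))
    return len(heights) - stage

# Hypothetical boats: (length m, width m, max power hp); NOT the study's data.
boats = [(4.0, 1.8, 40), (4.2, 1.9, 50), (4.5, 1.9, 60),
         (9.0, 3.0, 400), (9.5, 3.1, 450), (9.9, 3.2, 500)]
heights = ward_merge_heights(boats)
print(heights, suggest_cluster_count(heights))
```

With this toy data the heights stay small while boats of similar size are merged and leap at the final stage, where the small and large boats are forced together, so the sketch suggests two clusters.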
In Table 4, it is seen that the number of boats per cluster is distributed evenly. Thus, it can be said intuitively that the examined vessels are not homogeneous and can be represented by four segments. When the descriptive statistics are analysed, it is seen that Cluster 4 has the highest averages for all variables except underwater depth. In other words, the 14 boats in Cluster 4 constitute, on average, the longest, widest, heaviest and most powerful cluster. In this sense, Cluster 4 is followed by Cluster 1, Cluster 2 and Cluster 3. While Cluster 3 has the highest average underwater depth, it has the lowest values for all other variables. It is seen that the boats are concentrated in Cluster 1 and Cluster 2, which have more medium dimensions. While the total number of boats in these two clusters is 42, the total number of boats in Cluster 3 and Cluster 4, which have the lowest and highest statistics, is 27.

Variance Analysis Findings
For the clusters whose descriptive statistics were calculated, variance analysis was performed to determine whether there was a statistically significant difference between the cluster averages of the variables in question and how important each variable was in forming the clusters. The findings are given in Table 5. The most effective variable in the formation of the clusters was full length, with the highest F value (99.642). The full-length variable is followed by weight (58.077), width (57.147), maximum power (33.036) and underwater depth (19.241). The most important variable creating a significant difference between boat types is therefore the length of the boat, while the effects of weight and width on the clusters are almost the same.
It is no surprise that boat length emerged as the most important variable in clustering. The form coefficients used in boat design, as well as tax exemptions and rates, are determined according to the length of the boat. As a matter of fact, ordinary users refer to their boats by length when describing them in daily life.
The Bonferroni multiple comparison test was performed to determine which cluster pairs lead to the significant differences found in the variance analysis. According to the Bonferroni test, the full length and width variables differ significantly between all clusters (p < 0.01). The significant difference in the underwater depth variable is between Cluster 2 and the other three clusters. Only for the weight and maximum power variables was there no significant difference between Cluster 2 and Cluster 3. Figure 2 shows how the boat characteristics change across the four clusters obtained. All variables follow the same pattern across clusters except underwater depth. Notice that the boats with the highest underwater depth are those in Cluster 3, which are the smallest in size. The reason for this is that a boat with a small area needs to sink deeper to support its total weight. Figure 3 shows how the five boat characteristics examined vary across the four clusters. Accordingly, it is clearly seen that Cluster 4 has the highest values, followed by Cluster 1, Cluster 2 and Cluster 3.
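The logic of the Bonferroni correction for four clusters can be shown with a small example. This is a sketch under stated assumptions: the unadjusted p-values below are hypothetical (they merely mimic the reported pattern for one variable), and the correction is applied in its common form of multiplying each p-value by the number of comparisons.

```python
from itertools import combinations

def bonferroni_adjust(raw_p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(raw_p_values)
    return {pair: min(1.0, p * m) for pair, p in raw_p_values.items()}

clusters = [1, 2, 3, 4]
pairs = list(combinations(clusters, 2))  # 6 pairwise comparisons for 4 clusters

# Hypothetical unadjusted p-values for one variable; NOT the study's results.
raw = dict(zip(pairs, [0.0004, 0.0001, 0.0002, 0.30, 0.0009, 0.0003]))

adjusted = bonferroni_adjust(raw)
significant = [pair for pair, p in adjusted.items() if p < 0.01]
print(significant)
```

In this toy example every pair except Cluster 2 versus Cluster 3 remains significant at the 0.01 level after the correction.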

Multidimensional Scaling Analysis Findings
Multidimensional scaling analysis was performed by creating a similarity matrix from the original values. Squared Euclidean distance was used in the analysis, and the indicators were standardized between -1 and +1 as in the cluster analysis. As a result of the multidimensional scaling analysis, the stress coefficient was found to be 0.06338. Therefore, it can be said that the visualization made with the data fits well with the real situation. In addition, the D.A.F. value, which gives an idea about the goodness of fit, was found to be 0.93662 and Tucker's coefficient was 0.96779. The fact that these values are very close to 1 also shows that the goodness of fit is at the desired level. As a result of the multidimensional scaling analysis, the 69 boats are shown in two-dimensional space according to the five indicators, as in Figure 4. The dimension values specified in the graph refer to the coordinates calculated from the determined parameters of the boats.
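The three fit measures reported above are numerically linked, which can be checked directly. The check below is consistent with the reported stress being a normalized raw stress, for which D.A.F. = 1 - stress and Tucker's coefficient = sqrt(D.A.F.); the text does not state which stress variant or software was used, so that reading is an assumption.

```python
from math import sqrt

# Values exactly as reported in the text.
stress = 0.06338   # stress coefficient
daf = 0.93662      # D.A.F. (dispersion accounted for)
tucker = 0.96779   # Tucker's coefficient

# Assumed identities for normalized raw stress (not stated in the source).
daf_from_stress = 1.0 - stress
tucker_from_daf = sqrt(daf)

print(f"1 - stress   = {daf_from_stress:.5f}")
print(f"sqrt(D.A.F.) = {tucker_from_daf:.5f}")
```

Both derived values agree with the reported ones to five decimal places, so the three figures describe the same fit from different angles rather than providing independent evidence.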
In Figure 4, the boats located close to each other are more similar to each other in terms of these five variables, while the distant ones are less similar. The first reason why Boat40, Boat52 and Boat25 are clearly different from other boats is that, as ANOVA findings support, these boats are the longest boats in their clusters.
Since both hierarchical cluster analysis and multidimensional scaling analysis are based on similarity relationships, it is expected that the dendrogram obtained from hierarchical cluster analysis will match the findings of multidimensional scaling analysis. Based on this understanding, the findings from these two analyses were evaluated together.

Evaluating Multidimensional Scaling and Hierarchical Cluster Analysis Results Together
The coordinates obtained in the multidimensional scaling analysis show the positions of the boats in two-dimensional space. Each boat shown here is assigned to one of four clusters by hierarchical cluster analysis. Therefore, the coordinates of each boat are matched with the cluster to which it belongs, and the clusters are visualized in two-dimensional space. In Figure 5, the cluster structure formed as a result of hierarchical cluster analysis is shown in two-dimensional space with the help of the coordinates obtained from multidimensional scaling analysis. Figure 5 thus shows how the visual map, which was found to be in good agreement with the real situation, gains structure through cluster analysis. In this way, hierarchical clustering analysis and multidimensional scaling analysis are combined as a single analysis, and the boats are seen in two-dimensional space as in Figure 5.
When the relationship between clusters is examined in Figure 5, it is seen that the averages of Cluster 1 and Cluster 2 are quite close to each other. By contrast, Cluster 3 and Cluster 4 are the most distant clusters. This finding is supported by Table 4, where the descriptive statistics of the clusters are analyzed. The most important reason for this positioning of the clusters is the full-length variable, as can be understood from the analysis of variance in Table 5. As a matter of fact, the first reason why Boat40 and Boat52 are positioned differently from other boats is their length of 9.90 m and 9.74 m, respectively. Similarly, Boat25, which is located apart within its own cluster, is the longest and heaviest boat of its cluster and has the greatest underwater depth.
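Matching each boat's two-dimensional coordinates with its cluster label, and then comparing cluster averages, can be sketched as follows. The coordinates and labels are hypothetical and are not the study's output; the function name `cluster_centroids` is likewise only illustrative.

```python
from collections import defaultdict
from math import dist

# Hypothetical MDS coordinates and cluster labels (NOT the study's output).
coords = {"Boat1": (-0.2, 0.1), "Boat2": (-0.1, -0.1), "Boat3": (0.2, 0.0),
          "Boat4": (0.3, 0.2), "Boat5": (-0.9, 0.3), "Boat6": (1.1, -0.2)}
labels = {"Boat1": 2, "Boat2": 2, "Boat3": 1, "Boat4": 1, "Boat5": 3, "Boat6": 4}

def cluster_centroids(coords, labels):
    """Average the 2-D coordinates of the boats assigned to each cluster."""
    members = defaultdict(list)
    for boat, xy in coords.items():
        members[labels[boat]].append(xy)
    return {c: tuple(sum(v) / len(v) for v in zip(*pts))
            for c, pts in members.items()}

centroids = cluster_centroids(coords, labels)
gap_12 = dist(centroids[1], centroids[2])  # distance between Cluster 1 and 2
gap_34 = dist(centroids[3], centroids[4])  # distance between Cluster 3 and 4
print(f"Cluster 1-2: {gap_12:.2f}, Cluster 3-4: {gap_34:.2f}")
```

In this toy configuration the centroids of Clusters 1 and 2 lie close together while those of Clusters 3 and 4 lie far apart, which is the kind of pattern read from Figure 5.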
The average values of the variables of the boats according to the segmentation resulting from the cluster analysis are summarized in Table 6.
The A-B-C-D segments are ordered from small to large in terms of the length, width, weight and maximum power of the boats. This order is disrupted only for the underwater depth variable. Most of the boats in the A segment are relatively economical boats that have the advantage of tax exemption due to their length, have an open deck and no cabin, are used for daily hobby purposes and have rudder control at the stern. Boats in the B segment generally have a small cabin for equipment, with rudder control provided by a steering wheel at the top, and meet power, speed and comfort needs better than the A segment. Boats in the C segment serve the same purpose of use as the B segment and have more enclosed space in the form of half cabins. The D segment is a group of boats that are not only for hobby purposes, but also have basic living equipment and offer overnight accommodation. While boats with a length of less than 5 meters have a tax exemption, the same tax is collected from all boats 5 to 9 meters in length. This situation supports the statistics obtained as a result of segmentation. In this case, it is seen that most of the A segment boats have tax advantages.
The segmentation in Table 6 can be used to determine the arrangement of the boats in the fair. The most important variable of such a segmentation is the length of the boats, as can be seen from the analysis results. In this way, customers can meet boats according to their intended use such as fishing boat, strolling, etc. Customers who want to use tax exemptions for daily hobby purposes can be directed to boats in the A segment. A customer group with a high income and looking for an accommodation alternative on board can be directed to the D segment. Customers who find the boats in the A segment unusable and do not prefer the boats in the D segment due to their high cost can be directed to 42 boats in the B and C segments. As a matter of fact, according to the results of the analysis, most of the boats in the fair consist of the middle segment boats that fit the B and C segments.
Customers want to know what comfort advantage they will get in exchange for each unit of cost they incur when moving between segments. For this purpose, the percentage change in cost is taken as a ratio to the percentage change in the usage area of the boat. Since all the boats are made of composite material, the average weight of the boats in the relevant segment is used as a proxy for boat cost. Table 7 shows the percentage costs that customers have to bear for each 1% increase in area when moving from one segment to another. According to Table 7, if a boat in segment B is preferred instead of a boat in segment A, the customer has to bear a cost of 2.613% for every extra 1% of area in segment B. Interpreted in this way, the most rational transition, the one that maximizes the benefit, is from the C segment to the D segment. Considering the intended use of these segments, it can be said that the transition from segment A to the other three segments has a higher value. The cost of moving from the B segment to the C segment is low compared to the benefit obtained, and the purposes and structural features of these two segments are similar, so a customer choosing between the two is more likely to prefer the C segment. However, moving on to the D segment may also be attractive to customers, as the cost of the transition from the C segment to the D segment is the lowest relative to the benefit gained among all transitions.
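The ratio of percentage cost change to percentage area change can be computed as in the sketch below. The segment averages are hypothetical and are not the values in Table 6, and approximating usage area as length times width is an assumption made only for this illustration; the text does not define how area was measured.

```python
def cost_per_area_percent(weight_from, area_from, weight_to, area_to):
    """Percentage change in the cost proxy (weight) per 1% change in area."""
    pct_cost = 100.0 * (weight_to - weight_from) / weight_from
    pct_area = 100.0 * (area_to - area_from) / area_from
    return pct_cost / pct_area

def area(segment):
    """Assumption: usage area approximated as length x width."""
    return segment["length_m"] * segment["width_m"]

# Hypothetical segment averages; NOT the values reported in Table 6.
segment_a = {"length_m": 4.5, "width_m": 1.8, "weight_kg": 400.0}
segment_b = {"length_m": 5.5, "width_m": 2.2, "weight_kg": 800.0}

ratio = cost_per_area_percent(segment_a["weight_kg"], area(segment_a),
                              segment_b["weight_kg"], area(segment_b))
print(f"Cost increase per 1% extra area: {ratio:.3f}%")
```

A lower ratio means that more extra area is obtained for each additional percent of cost, which is why the transition with the smallest ratio is read as the most advantageous one.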
From the perspective of the producer, these rates can also be used in determining pricing strategies. In addition, such segmentation will help develop advertising and sales strategies depending on the purpose of use. For example, Boat40 and Boat52 can be positioned and marketed differently within the fair as the largest boats in the D segment.
The costs of composite boats in particular are significantly reduced in mass production. For this reason, boat molds can be created in standard sizes according to the segmentation results.

Conclusion and suggestions
Market segmentation is an important issue for marketing and no such study has been found in the literature to date for small recreational boats. In this study, 69 recreational composite boats under 10 meters presented at the boat show were examined by using hierarchical cluster analysis, variance analysis and multi-dimensional scaling analysis according to 5 basic design dimensions; boat length, width, weight, underwater depth and maximum power.
Based on the results of the analysis, the boats are divided into 4 segments, namely A, B, C and D, from small to large. The length of the boats is the most effective variable in segmentation, followed by weight, width, maximum power and underwater depth, respectively.
It is believed that the purpose of use of the boats in the segments obtained differs depending on the basic design dimensions. Based on this understanding, many strategies can be developed from the in-fair arrangement of the boats to the marketing based on the determined segments. The findings of this study are thought to be beneficial in bringing different suggestions to potential customers in terms of benefit / cost in preferences between segments.