Searching for an Optimal Partition of Incomplete Data with Application in Modeling Energy Efficiency of Public Buildings

Authors

  • Rudolf Scitovski, Department of Mathematics, University of Osijek, Croatia
  • Marijana Zekić Sušac, Faculty of Economics in Osijek, J. J. Strossmayer University of Osijek, Croatia
  • Adela Has, Faculty of Economics in Osijek, J. J. Strossmayer University of Osijek, Croatia

Abstract

In this paper, we consider the problem of finding an optimal partition, with the most appropriate number of clusters, of an incomplete data set that may contain several outliers. Special attention is given to the application of the Least Squares distance-like function. We propose a procedure for preparing the incomplete data set and a procedure for eliminating outliers, so that the clustering process gives acceptable solutions, and we justify both procedures with proofs. An incremental algorithm is then applied to the prepared data set to search for optimal partitions with 2, 3, ... clusters. The most appropriate number of clusters is then determined using the Davies-Bouldin and Calinski-Harabasz indexes. The whole procedure is organized as an algorithm given in the paper. To illustrate its applicability, the above steps are applied to a real data set of public buildings and their energy efficiency data, yielding clear clusters that could be used for further modeling procedures.
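
The model-selection step described above (compute partitions with 2, 3, ... clusters, then compare them with the Davies-Bouldin and Calinski-Harabasz indexes) can be illustrated with a minimal sketch. It is not the authors' method: it assumes complete data with no outliers, and it replaces the paper's incremental algorithm with plain k-means started from a farthest-point initialization. All function names and the toy data are hypothetical.

```python
import math

def sq_dist(a, b):
    """Least Squares distance-like function: squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))

def assign(data, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in data]

def kmeans(data, k, max_iter=100):
    """Plain k-means; a stand-in for the paper's incremental algorithm."""
    # Assumption: deterministic farthest-point initialization.
    centers = [data[0]]
    while len(centers) < k:
        centers.append(max(data, key=lambda p: min(sq_dist(p, c) for c in centers)))
    for _ in range(max_iter):
        labels = assign(data, centers)
        groups = [[p for p, l in zip(data, labels) if l == j] for j in range(k)]
        updated = [centroid(g) if g else centers[j] for j, g in enumerate(groups)]
        if updated == centers:
            break
        centers = updated
    return assign(data, centers), centers

def calinski_harabasz(data, labels, centers):
    """Between-cluster over within-cluster dispersion; larger is better."""
    n, k = len(data), len(centers)
    overall = centroid(data)
    within = sum(sq_dist(p, centers[l]) for p, l in zip(data, labels))
    between = sum(labels.count(j) * sq_dist(centers[j], overall) for j in range(k))
    return (between / (k - 1)) / (within / (n - k))

def davies_bouldin(data, labels, centers):
    """Mean worst-case cluster similarity; smaller is better.
    Assumes non-empty clusters with pairwise distinct centers."""
    k = len(centers)
    scatter = []
    for j in range(k):
        members = [p for p, l in zip(data, labels) if l == j]
        scatter.append(sum(math.sqrt(sq_dist(p, centers[j])) for p in members)
                       / len(members))
    worst = [max((scatter[i] + scatter[j]) / math.sqrt(sq_dist(centers[i], centers[j]))
                 for j in range(k) if j != i)
             for i in range(k)]
    return sum(worst) / k

def select_k(data, k_values):
    """Return the k preferred by each index, plus all scores as {k: (DB, CH)}."""
    scores = {}
    for k in k_values:
        labels, centers = kmeans(data, k)
        scores[k] = (davies_bouldin(data, labels, centers),
                     calinski_harabasz(data, labels, centers))
    best_db = min(scores, key=lambda k: scores[k][0])
    best_ch = max(scores, key=lambda k: scores[k][1])
    return best_db, best_ch, scores

# Hypothetical toy data: three well-separated groups of four points each.
toy = [(0, 0), (0, 1), (1, 0), (1, 1),
       (10, 0), (10, 1), (11, 0), (11, 1),
       (5, 10), (5, 11), (6, 10), (6, 11)]
best_db, best_ch, scores = select_k(toy, range(2, 6))
print("Davies-Bouldin prefers k =", best_db)
print("Calinski-Harabasz prefers k =", best_ch)
```

The two indexes are read in opposite directions: a partition is preferred when its Davies-Bouldin value is small and its Calinski-Harabasz value is large, so agreement between them on the same k is a reasonable sign that the number of clusters is appropriate.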

Published

2018-12-11

Section

CRORR Journal Regular Issue