Collaborative Filtering with Temporal Dynamics Using Singular Value Decomposition

Collaborative Filtering (CF) is a widely used recommendation technique. However, traditional CF techniques struggle to make fast and accurate suggestions because user preferences change over time, new products emerge, and the system holds very large numbers of users and products. It therefore becomes more important to make suggestions that are both fast and sensitive to changes over time. In the present study, a new method was developed for providing suggestions customized to users' preferences and tastes as they change over time. By combining these time-dependent changes with SVD (Singular Value Decomposition), a faster suggestion system was obtained, with the aim of enhancing product prediction success. All techniques were evaluated on Netflix data and the results were compared. The results obtained for the accuracy of the predicted ratings were found to be promising.


INTRODUCTION
Recommendation systems implement information discovery techniques in order to make personalized recommendations while users interact with products and services, and they deal with substantial amounts of data. They are used widely in today's e-commerce and web applications. However, because of the great numbers of users and products in the system, it may prove difficult to make instant and accurate suggestions to users, since doing so requires substantial processing time and memory.
Yet, particularly in online systems, users expect their requests to be met instantly. In addition, while recommendations are being made to users, new products and services that change individuals' tastes may emerge.
Seasonal changes or regional factors [1,4] may also change users' perception and perspective over time. For instance, a user may develop a new perspective on an actor or movie genre, or may change the types of movies they like over time. Furthermore, every user may experience a different change. Alternatively, a change in a person's own circumstances or in the person's family may change the person's needs.
Because users' requests vary on the basis of regional or other factors, it becomes hard to make accurate and reliable suggestions for every user. The traditional item-based CF technique makes recommendations by considering users' ratings and the similarities across items [8]. However, the traditional item-based CF technique cannot make recommendations pertinent to the changing requests and tastes of users. Since these systems ignore such changes over time, they recommend the same product every time. If an e-commerce application implements this technique, the client will start losing interest since the application cannot meet the client's changing interests, and as a result company productivity will fall [1].
A system that takes time-dependent changes into consideration and observes the changes in every user's behaviour is needed in order to keep the users' interest alive and to enhance company profitability. With the item-based CF with temporal dynamics using the age of ratings proposed in this study, temporal variations were taken into consideration and an attempt was made to increase prediction success. The longer a product has been on the market, the more users it has been rated by. In other words, older products have been available longer and accordingly have been seen and rated by more users. In prediction systems, the dates on which ratings were given are as important as the ratings themselves. The ratings given in the recent past to a product registered in the system are more valuable than the older ratings that product has received. Accordingly, in the proposed system the ages of the user ratings were calculated, and the weights of newer ratings were increased while the weights of older ratings were decreased. Assessing ratings on the basis of their age increased rating prediction success significantly.
Furthermore, in order to make faster and more accurate recommendations, the traditional item-based CF technique was integrated with SVD. SVD is quite effective in solving the problems faced in CF approaches, such as data sparsity, scalability, synonymy and latency. By means of the SVD method, the dimension of the data was reduced and the ratings users give to products were calculated faster. In order to assess the gain in product prediction success, the success of the temporal CF technique combined with the SVD method was measured. With the SVD dimension reduction method, both the required memory and the processing time were reduced. In this way, more accurate recommendations were made in shorter periods of time.

RELATED WORK
Recommendation systems constitute a subclass of data filtering systems that recommend items suited to a user on the basis of the ratings that user has given to similar items. Today, these systems are used widely in various areas including movies, music, videos, news, books, social networks and web pages.
Recommendation systems generate recommendations in two ways: Collaborative Filtering (CF) and Content-Based Filtering. CF makes recommendations by evaluating users' past behaviours, interactions with other users or product similarities. Content-based filtering, on the other hand, makes recommendations by considering the demographic characteristics of users, such as age, race, occupation or where they live, and the typical characteristics of items that separate them into various categories.
Coi and Suh tried to enhance the quality of the recommendations by using a similarity function for the selected neighbours of the item under consideration [2]. In order to solve the data sparsity and scalability problems of content-based CF and collaborative filtering methods, Liu proposed a hybrid recommendation system. In his study, Liu developed a method that integrates user and item behaviours. In order to resolve data sparsity, the author assigned average ratings to unrated items. As for the scalability problem, he classified users on the basis of their personal characteristics [3].
Xiang proposed a session-based temporal graph method that integrates short-term and long-term temporal variations [4]. Lathia, Hailes, Capra and Amarian compared the results they obtained from item-based CF, the k-nearest neighbour algorithm, and the SVD method (a matrix factorization technique) in terms of rating accuracy. With these methods, the authors also examined the effects of temporal dynamics on user ratings [5].
For the purpose of overcoming the inadequacies of memory-based CF algorithms built on several similarity parameters, Hoffman developed model-based CF algorithms based on dimension reduction methods such as clustering, classification and SVD [20]. Yang and Liu used the MapReduce application in order to generate effective recommendations in short periods of time. In this way, the authors managed to reduce the calculation cost on data sets [7]. Liu, Hu and Zhu developed an effective method with a new similarity model to improve accuracy. This model captured context information on user ratings and increased performance [13].

ITEM BASED COLLABORATIVE FILTERING
In an item-based approach, for the $k$ items $\{i_1, i_2, \ldots, i_k\}$ most similar to the item $i$ rated by the active user, the similarities $\{s_{i,1}, s_{i,2}, \ldots, s_{i,k}\}$ are calculated, where $s_{i,j}$ denotes the similarity of item $i$ to item $j$. The most important step of item-based CF algorithms is to determine similar elements and to group the closest items. The main idea in calculating similarity is to determine the users that rated both item $i$ and item $j$, and to calculate the similarity $s_{i,j}$ over the jointly rated items. The similarity between the items $i$ and $j$ is calculated via Pearson correlation as shown in Eq. (1) [10].

$$s_{i,j} = \frac{\sum_{u} (r_{u,i} - \bar{r}_i)(r_{u,j} - \bar{r}_j)}{\sqrt{\sum_{u} (r_{u,i} - \bar{r}_i)^2}\,\sqrt{\sum_{u} (r_{u,j} - \bar{r}_j)^2}} \qquad (1)$$

where the sums run over the users $u$ who rated both items, $r_{u,i}$ is the rating user $u$ gave to item $i$, $\bar{r}_i$ is the average of the ratings given to item $i$, $r_{u,j}$ is the rating user $u$ gave to item $j$, and $\bar{r}_j$ is the average of the ratings given to item $j$ [2]. In the item-based CF technique, after the similarity between items is calculated, the rating prediction $P_{u,i}$ is made [10].
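The similarity and prediction steps described above can be sketched in Python. This is a minimal illustration rather than the study's implementation: the function names and the toy matrix are hypothetical, missing ratings are assumed to be stored as `np.nan`, and the prediction step assumes the common similarity-weighted average over the k most similar rated items, since the source does not spell out its prediction formula.

```python
import numpy as np

def pearson_item_similarity(R, i, j):
    """Pearson correlation between items i and j over users who rated both.

    R is a users x items array; np.nan marks a missing rating (assumption).
    """
    co_rated = ~np.isnan(R[:, i]) & ~np.isnan(R[:, j])
    if co_rated.sum() < 2:
        return 0.0  # assumption: too few co-ratings counts as no similarity
    mean_i = np.nanmean(R[:, i])  # average of all ratings given to item i
    mean_j = np.nanmean(R[:, j])  # average of all ratings given to item j
    di = R[co_rated, i] - mean_i
    dj = R[co_rated, j] - mean_j
    denom = np.sqrt((di ** 2).sum()) * np.sqrt((dj ** 2).sum())
    return float((di * dj).sum() / denom) if denom > 0 else 0.0

def predict_rating(R, u, i, k=2):
    """Predict user u's rating of item i from the k most similar rated items.

    Uses a similarity-weighted average (an assumed, commonly used form).
    """
    rated = [j for j in range(R.shape[1]) if j != i and not np.isnan(R[u, j])]
    sims = sorted(((pearson_item_similarity(R, i, j), j) for j in rated),
                  reverse=True)[:k]
    num = sum(s * R[u, j] for s, j in sims)
    den = sum(abs(s) for s, _ in sims)
    # Fall back to the item average when no usable neighbour exists.
    return num / den if den > 0 else float(np.nanmean(R[:, i]))

# Toy user-item matrix (hypothetical data): 4 users, 3 items.
R = np.array([[5.0, 4.0, 1.0],
              [3.0, 3.0, 4.0],
              [1.0, 2.0, 5.0],
              [np.nan, 4.0, 2.0]])
prediction = predict_rating(R, u=3, i=0, k=1)
```

With `k=1`, the only neighbour used is the item most similar to item 0, so the prediction equals the rating the user gave to that neighbour.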

SINGULAR VALUE DECOMPOSITION
Singular value decomposition (SVD) is a matrix factorization technique. Given that $A$ is an $m \times n$ matrix of rank $r$, the decomposition of $A$ through SVD is presented in Eq. (3) [15].

$$A = U S V^{T} \qquad (3)$$

The SVD technique provides the best low-rank approximation of the matrix $A$. The entries of the diagonal matrix $S$ are sorted, the $k \ll r$ largest singular values are kept and the remaining values are removed. In this way the similarities and interactions among users that are latent and not visible in the original matrix $A$, yet still significant, are found. The resulting diagonal matrix is referred to as $S_k$, and the reduced matrices $U$ and $V$ are referred to as $U_k$ and $V_k$. $U_k$ is generated by removing $r-k$ columns from the matrix $U$, while $V_k$ is generated by removing $r-k$ columns from the matrix $V$. The matrix $A_k$ is expressed as shown in Eq. (4) [15]:

$$A_k = U_k S_k V_k^{T} \qquad (4)$$
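The truncation described above can be sketched with NumPy. This is a minimal illustration on a small dense matrix; the helper name `truncated_svd` and the example values are hypothetical and not taken from the study.

```python
import numpy as np

def truncated_svd(A, k):
    """Return U_k, S_k and V_k^T, keeping only the k largest singular values."""
    # np.linalg.svd returns the singular values sorted in descending order.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k], np.diag(s[:k]), Vt[:k, :]

# Hypothetical 4 x 3 matrix used only for illustration.
A = np.array([[5.0, 4.0, 1.0],
              [4.0, 5.0, 1.0],
              [1.0, 1.0, 5.0],
              [2.0, 1.0, 4.0]])

U_k, S_k, Vt_k = truncated_svd(A, k=2)
A_k = U_k @ S_k @ Vt_k                  # rank-2 approximation of A
error = np.linalg.norm(A - A_k)         # Frobenius norm of what was discarded
```

For this choice of `k`, the Frobenius error equals the single discarded singular value, which is one way to judge how much information the reduction loses.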

USE OF SVD IN ITEM BASED CF TECHNIQUES
SVD is used in both memory-based and model-based recommendation systems. It is the technique behind Latent Semantic Indexing (LSI) and it is quite effective in solving the problems faced in CF approaches such as data sparsity, scalability, synonymy and latency [20]. The decomposition of the original $m \times n$ user-item matrix $A$ into its singular values is shown in Eq. (5) [14,16,19]:

$$A_{m \times n} = U_{m \times r}\, S_{r \times r}\, V^{T}_{r \times n} \qquad (5)$$

In order to detect the latent relations and similarities between items and make more accurate predictions, the matrix $A$ is reduced to $k$ dimensions. $U_k \sqrt{S_k}$ is the $m \times k$ matrix that represents the users in the $k$-dimensional space, while $\sqrt{S_k}\, V_k^{T}$ is the $k \times n$ matrix that represents the items. In order to make more reliable predictions for the user, the original matrix $A$ is represented in the $k$-dimensional space with $U_k$, $S_k$ and $V_k$. SVD usage in item-based CF techniques is carried out as shown in Eq. (6) [16]:

$$P_{u,i} = \bar{r}_u + \left(U_k \sqrt{S_k}\right)(u) \cdot \left(\sqrt{S_k}\, V_k^{T}\right)(i) \qquad (6)$$

Here $P_{u,i}$ is the predicted rating of user $u$ for item $i$, $\bar{r}_u$ is the average rating user $u$ gave to all items, and $U$, $S$ and $V$ are respectively the left singular vectors, the diagonal matrix of singular values and the right singular vectors obtained by applying SVD to the original matrix $A$.
Uses of item-based CF and SVD in recommendation systems are presented in Eqs. (5), (6) and (7). The row of $U_k \sqrt{S_k}$ that belongs to the active user $u$ is multiplied with the column of $\sqrt{S_k}\, V_k^{T}$ that belongs to the item $i$ whose rating is sought.
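The SVD-based prediction described above can be sketched as follows. This is a hedged illustration, not the study's code: it assumes missing ratings are stored as `np.nan`, that they are filled with item averages before the decomposition, and that each user's mean is subtracted beforehand and added back afterwards. The function name and the toy data are hypothetical.

```python
import numpy as np

def svd_predict(R, k):
    """Predict every user-item rating from a rank-k SVD of the rating matrix."""
    item_means = np.nanmean(R, axis=0)
    filled = np.where(np.isnan(R), item_means, R)      # assumption: fill gaps with item averages
    user_means = np.nanmean(R, axis=1, keepdims=True)  # average rating of each user
    normalized = filled - user_means

    U, s, Vt = np.linalg.svd(normalized, full_matrices=False)
    sqrt_S = np.diag(np.sqrt(s[:k]))
    user_factors = U[:, :k] @ sqrt_S     # m x k: users in the k-dimensional space
    item_factors = sqrt_S @ Vt[:k, :]    # k x n: items in the k-dimensional space

    # Row u of user_factors times column i of item_factors, plus the user mean.
    return user_means + user_factors @ item_factors

# Hypothetical ratings: 4 users, 3 items, one missing entry.
R = np.array([[5.0, 4.0, 1.0],
              [3.0, 3.0, 4.0],
              [1.0, 2.0, 5.0],
              [np.nan, 4.0, 2.0]])
predictions = svd_predict(R, k=2)
```

Choosing `k` trades accuracy against speed: a smaller `k` stores fewer factors per user and item, while a `k` equal to the full rank simply reproduces the filled matrix.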

OUR SOLUTION
Item-based CF with temporal dynamics using the age of ratings is implemented, and prediction performance is enhanced [9]. When calculating the similarity between items in the traditional item-based CF technique, only the ratings given by the users to the products are used [11,12]. The ratings given to items are presented in Tab. 1 in the form of a user-product matrix. In these systems, an attempt is made to predict the ratings of the active user on the basis of the calculated item similarities. In item-based CF with temporal dynamics using the age of ratings, users' ratings are not predicted solely on the basis of the similarities among items, as is the case in the traditional item-based CF method. Unlike the traditional method, the age of each rating is added to the system. In other words, item similarity and the predicted rating change according to the age of the ratings.
In item-based CF with temporal dynamics using the age of ratings, $t_{n,m}$ indicates the age of the rating user $n$ gave to product $m$. Let $r_{a,3}$ be the test rating the active user gave to product number 3. While calculating product similarity, the age of each rating is also needed. For instance, when calculating the age of $r_{1,3}$, the difference between the rating date of $r_{1,3}$ and the date of the test data is taken into consideration. Since the date for which the test data is calculated is 27.01.2005 and the date of $r_{1,3}$ is 27.01.2003, the difference between these two dates is 731 days, or approximately 2 years [18]. This period indicates the age of $r_{1,3}$ on the basis of the present date (test date 27.01.2005). As can be seen in Tab. 1, the ages of the ratings are dynamic pieces of information that change with the date on which they are viewed; they are calculated in days relative to the test date. In the developed system, the ages of these ratings were converted to years and used in that form. The objective was to reduce errors in rating predictions by using the age of the ratings in the item-based CF technique. The following paragraphs outline the results of various tests.
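The age calculation in the example above can be checked with Python's standard `datetime` module. The function names are hypothetical, and dividing by 365 days per year is an assumption, since the text only states that ages were converted to years.

```python
from datetime import date

def rating_age_days(rating_date, test_date):
    """Age of a rating in days, measured back from the test date."""
    return (test_date - rating_date).days

def rating_age_years(rating_date, test_date, days_per_year=365.0):
    """Age of a rating in years (assumed conversion: 365 days per year)."""
    return rating_age_days(rating_date, test_date) / days_per_year

# Dates from the worked example: rating on 27.01.2003, test date 27.01.2005.
age_days = rating_age_days(date(2003, 1, 27), date(2005, 1, 27))
age_years = rating_age_years(date(2003, 1, 27), date(2005, 1, 27))
```

The interval spans the leap day of 2004, which is why it is 731 days rather than 730.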
In the proposed method, the weights of older ratings were decreased and the weights of newer ratings were increased. Various conversion functions were tested for this weighting in order to determine the most suitable one. Fig. 1 presents the conversion functions used to weight the ages of ratings. The conversion function was implemented as shown in Eq. (8) [18].
In this function, $t$ stands for the age of the present rating and $w$ indicates the weighted age of the present rating.
Since the proposed method was carried out on the basis of the ratings Netflix customers gave to a selection of movies in the last two years, $t$ takes a value within the range [0,2]. $w$ takes the values calculated with the conversion function shown in Fig. 1.
Here, $r$ is the current rating and $r_w$ is the rating weighted in terms of time. In the formulas of the conventional item-based CF technique explained in Section 2, $r_w$ is used in place of $r$.
Linear conversion is made through the function $w = m \cdot t + n$. Tab. 2 shows varying conversion ranges for $m$ while $n = 0.25$ is kept fixed.
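A linear age weighting of this kind might look like the sketch below. Several points are assumptions rather than details confirmed by the text: the conversion is taken to have the form w = m·t + n with t the age in years, the weighted rating is taken to be the product of w and the rating, and the slope and intercept are illustrative values chosen so that newer ratings weigh more; they are not the settings of Tab. 2.

```python
def linear_age_weight(age_years, m=-0.3, n=1.1):
    """Weight for a rating of the given age (assumed form: w = m * t + n).

    m and n are illustrative only; a negative slope makes the weight
    fall as the rating gets older.
    """
    return m * age_years + n

def weight_rating(rating, age_years, m=-0.3, n=1.1):
    """Time-weighted rating (assumption: weight multiplied by the rating)."""
    return linear_age_weight(age_years, m, n) * rating

# Ages lie in [0, 2] years, as in the two-year window described in the text.
newest = linear_age_weight(0.0)
oldest = linear_age_weight(2.0)
weighted = weight_rating(4.0, 1.0)
```

Under these illustrative parameters a brand-new rating counts slightly more than its face value and a two-year-old rating counts for about half of it.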

EXPERIMENTAL STUDY
For the purpose of comparatively measuring the success of the proposed system, the ratings Netflix clients gave to various movies during the last two years were taken into consideration. The Netflix dataset consists of the ratings given by approximately half a million clients to 17770 movies from 1999 to 2005 [17]. It contains more than 100 million rating entries. It is a real dataset obtained from a U.S. company operating in the area of movie and video rentals, and it is frequently used in recommendation systems research. The results of the conducted application were evaluated through RMSE (Root Mean Squared Error), which is among the most widely used evaluation measures on the Netflix dataset.

$$RMSE = \sqrt{\frac{1}{n} \sum_{i,j} \left(p_{i,j} - r_{i,j}\right)^2}$$

In this equation, $n$ is the total number of ratings given by all users, $p_{i,j}$ is the predicted rating of user $i$ for item $j$ and $r_{i,j}$ is the actual rating user $i$ gave to item $j$ [1]. Traditional item-based CF results are shown in Tab. 3. Fig. 2 shows the RMSE obtained with temporal dynamics item-based CF. In Fig. 3 the results obtained from the neighborhoods of the classical item-based CF technique and the SVD method were compared to the results obtained from the neighborhoods of the CF technique with temporal dynamics and the SVD method. CF with temporal dynamics combined with the SVD method provided more successful results than the traditional item-based CF technique. In Fig. 4 it can be seen that the results obtained for k = 35 neighborhoods by applying SVD on the [0.5…1.1] range, where the most successful ratings were obtained with the temporal item-based CF technique, are more successful than the results obtained from the traditional item-based CF technique.
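The RMSE measure used for evaluation can be computed as in the sketch below. The function name and the sample values are hypothetical; the inputs are assumed to be equally long sequences of predicted and actual ratings for the same user-item pairs.

```python
import numpy as np

def rmse(predicted, actual):
    """Root mean squared error between predicted and actual ratings."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return float(np.sqrt(np.mean((predicted - actual) ** 2)))

# Hypothetical predictions and ground-truth ratings for four test entries.
score = rmse([3.5, 4.0, 2.0, 5.0], [4.0, 4.0, 1.0, 4.5])
```

A lower value means the predicted ratings lie closer to the actual ones, so the compared techniques are ranked by ascending RMSE.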

CONCLUSION
The traditional item-based CF technique makes recommendations through the sole consideration of user ratings. For this reason, it usually falls short in making recommendations suitable for the changing requests, tastes and interests of users. It also requires a great deal of processing time. In the present study, an attempt was made to rectify these inadequacies of the traditional item-based CF technique. Available ratings were associated with their ages and, through the use of several conversion coefficients, were either increased or decreased in weight. In this way a more accurate, reliable and faster system that generates recommendations suitable for users' changing interests was designed. By developing a method that keeps users' interest alive and considers temporal dynamics, product prediction success was enhanced.