The Integrated Usage of LBP and HOG Transformations and Machine Learning Algorithms for Age Range Prediction from Facial Images

Abstract: Age prediction is an active field of study with applications in many computer vision problems. In this paper, we present extensive experiments and provide an efficient and accurate approach for predicting the age range of people from facial images. First, we resize all images to a common size and apply Histogram Equalization to reduce illumination effects on the facial images taken from the FG-NET and UTD aging databases. Second, Histogram of Oriented Gradients (HOG) and Local Binary Pattern (LBP) are used to extract features from these images, and the HOG and LBP features are then combined to attain better prediction. Finally, Support Vector Machine (SVM) and k-Nearest Neighbour (k-NN) classifiers are used for classification. In addition, k-fold cross-validation, Leave-One-Out (LOO) and the Confusion Matrix (CM) are used to evaluate the performance of the proposed methods. The experiments show that combining HOG and LBP features improves age range prediction accuracy up to 99.87%.


INTRODUCTION
We can recognize and distinguish much information by looking at a human face, such as age, gender, ethnicity, emotions, and skin colour. These abilities, which human beings have, are not naturally available to machines. Therefore, researchers have been trying to design and develop accurate systems that enable machines to meet these challenging tasks. Predicting the age of people from facial images is one of these tasks; it is an interesting field of research that has received increasing attention in recent years. In this paper, an accurate system is proposed to predict the age ranges of people from their facial images.
Predicting a person's age from a face image is not easy because of the large variation in facial appearance, such as differences in ethnicity, pose, and facial expression. Nevertheless, age prediction has been recognized as an important module for many computer vision applications such as demographic profiling, forensic art, age-specific human-computer interfaces, security control, age-oriented advertisement systems, and electronic customer relationship management (ECRM).
In general, age prediction approaches are divided into two groups: Age Group Classification and Actual Age Estimation (cumulative years lived). In Age Group Classification, the age range is divided into classes, and each class covers a range of years (e.g., from 10 to 20 years). In Actual Age Estimation, on the other hand, we need to determine the specific age of a person, which is usually based on regression methods or a hybrid of classification and regression that outputs an exact number for the age. However, this task is very difficult, and to the best of our knowledge, there are few studies on Actual Age Estimation on embedded platforms. Therefore, this study focuses on the Age Group Classification approach and divides the age ranges into 11 classes. The paper is organized as follows: Section 2 reviews the related work on age-range prediction. Section 3 presents the materials and methods followed in the study. Section 4 provides the experimental results, and finally Section 5 gives the conclusion and a discussion of the results.

RELATED WORKS
There are many applications and studies that try to predict the age range of individuals from facial images. The first study related to age prediction was published in 1994 by Kwon and Lobo [1], who studied age classification by using Anthropometric Models (AM) [2] to find the primary features of the face such as the eyes, mouth, nose, and chin. In addition, an Appearance Model (APM) [3] was used to determine the density of wrinkles in each face using snakelets [4]. They created a facial database for their study, which contained 47 faces, and divided the images into three age groups: babies, young adults, and senior adults. To classify these images, they used six ratios computed from the distances between different facial features (eye to nose, mouth to chin, etc.) and combined these ratios with the wrinkle information to obtain an overall success rate of 77%. Lanitis et al. [5] used the Active Appearance Model (AAM) [6] for age prediction. They extracted facial features from 500 images of people aged 0 to 35 years by using PCA. To estimate the age, they adopted a Neural Network for classification and reported 94.42% age prediction performance. Guo et al. [7] also used the AAM to estimate the age in 500 facial images of people aged 0 to 69 years from the FG-NET [8] database. Image features were extracted by using a Locally Adjusted Robust Regressor (LARR). Then, SVM and Support Vector Regression (SVR) methods were investigated to classify the age. The results of their experiments showed a 94.93% successful estimation rate. Kanno et al. [9] used an APM to determine the density of wrinkles and extract information from the faces. An accuracy of 80% was achieved when Artificial Neural Networks (ANN) were employed to classify 110 facial images, selected from the FG-NET aging database, into four age groups. Bauckhage et al. [10] offered an accurate and efficient approach for age prediction from facial images. They combined HOG features with histograms computed from a range filter. They trained their system on the UTD database and the FG-NET database. In a first experiment, they verified whether the algorithm correctly ordered two images by age and measured an accuracy of 77%. In a second experiment, they restricted the candidate images to strictly frontal face poses, which improved the classification accuracy to 85%.
Iga et al. [11] used graph matching with the Gabor Wavelet Transformation (GWT) to extract image information such as skin color, moustache, and hair. This information was classified by applying SVM to 300 facial images taken from the Softopia Japan HOIP database and divided into 6 age groups from 15 to 64 years old. They achieved a 67.4% age estimation rate. Yang and Ai [12] employed the LBP feature extractor and computed the Chi-square distance between the extracted LBP histogram (LBPH) and a reference histogram. Moreover, the real AdaBoost algorithm was used as a strong classifier, which learns a sequence of the best local features. By using 696 images from the PIE [13] database, an age prediction accuracy of 92.12% was achieved. Shirkey and Gupta [14] developed an age recognition algorithm based on a rectangle features method used to describe sub-regions of a human face, so that pixel-wise data can be transformed into component-wise data. They achieved a performance of 85% for age classification. Fazl-Ersi et al. [15] compared three methods, LBP, Color Histogram (CH) and Scale-Invariant Feature Transform (SIFT), to predict the age range. They used an SVM classifier to classify the features extracted from images obtained from the Gallagher facial images database [16]. They obtained the best accuracy, 63.01% age estimation, by combining LBP, CH, and SIFT features. Eidinger et al. [17] used LBP and the related Four Patch LBP codes (FPLBP) to learn and extract the most important properties of the image features. These images were selected from the Gallagher database. By combining LBP and FPLBP features with SVM, they achieved an age estimation accuracy of 80.7%.

MATERIALS AND METHODS
In this study, the proposed system passes through four phases: a pre-processing phase, a feature extraction phase, a classification phase, and an evaluation phase (see Fig. 1). In this part, we first give a general description of the databases used and the pre-processing stages of the proposed system. Second, we present the HOG and LBP algorithms used for feature extraction. Then, we propose to combine the extracted HOG and LBP features of each image and save them in a single vector. Third, details about the SVM and k-NN classification algorithms are provided. Finally, we provide a brief insight into the techniques used to evaluate the performance of the algorithms in this study.
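As a minimal sketch of the feature combination step, the snippet below concatenates two per-image descriptors into a single one-dimensional vector. The function name `combine_features` and the descriptor lengths are assumptions chosen for illustration; the paper does not state the actual vector sizes or whether any scaling is applied before concatenation.

```python
import numpy as np

def combine_features(hog_vec, lbp_vec):
    """Concatenate two per-image descriptors into one 1-D feature vector."""
    hog_vec = np.asarray(hog_vec, dtype=np.float64).ravel()
    lbp_vec = np.asarray(lbp_vec, dtype=np.float64).ravel()
    return np.concatenate([hog_vec, lbp_vec])

# Hypothetical descriptor values and lengths, used only for illustration.
hog_vec = np.arange(6, dtype=np.float64)   # stand-in for a HOG descriptor
lbp_vec = np.array([0.25, 0.75])           # stand-in for an LBP histogram
combined = combine_features(hog_vec, lbp_vec)
print(combined.shape)  # (8,)
```

The combined vector simply has the HOG entries followed by the LBP entries, so its length is the sum of the two descriptor lengths.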

Databases
In this study we used two aging databases: the FG-NET database [8] and the UTD database. The FG-NET aging database (Fig. 2) was released in 2004 to help in understanding the changes in facial appearance caused by age. It is non-commercial and supports researchers in various disciplines such as age progression, age estimation, age-invariant face recognition, and other academic research-related activities. The FG-NET database contains 1,002 facial images of 82 different individuals whose ages range from 0 to 69 years. The UTD database (Fig. 3) is also a non-commercial facial image database, used mainly for age and gender prediction. It contains 580 facial images of people from 18 to 99 years old, of which 352 images show female subjects and 228 show male subjects. UTD can also be used for emotion recognition because all images in the database are annotated with facial expressions such as happy, angry, annoyed, disgusted, grumpy, sad, and surprised.

Pre-Processing Phase
Solving the age prediction problem requires overcoming some main difficulties, such as differing image dimensions and qualities, varying levels of luminosity, and the need for a sufficient number of images in each experiment. Therefore, it is necessary to apply pre-processing techniques to the images before processing them. In this study, the Histogram Equalization (HE) technique and dimensions alignment (image re-sizing) have been used to help solve the age prediction problem.

Illumination Normalization
To normalize the illumination of the facial images, we applied Histogram Equalization (HE) technique, which helps to reduce the effect of light and unify luminosity of all images in the databases, and this positively affects the accuracy and performance of the system.
Histogram Equalization (HE) [18] is a fast, simple and effective technique for enhancing image illumination; it redistributes the intensity values so that details become visible in regions of any intensity (see Fig. 4).
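The following is a minimal sketch of global histogram equalization for an 8-bit grayscale image, using the standard cumulative-histogram mapping. It is an illustration under stated assumptions, not necessarily the exact variant used in this study; the function name `equalize_histogram` and the small test image are hypothetical.

```python
import numpy as np

def equalize_histogram(gray):
    """Equalize an 8-bit grayscale image with the cumulative-histogram mapping."""
    gray = np.asarray(gray, dtype=np.uint8)
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]      # cumulative count at the darkest level present
    total = gray.size
    if total == cdf_min:           # constant image: nothing to stretch
        return gray.copy()
    # Map each gray level so that the output cumulative histogram is close to a ramp.
    lut = np.round((cdf - cdf_min) / (total - cdf_min) * 255.0)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[gray]

# Hypothetical low-contrast patch whose values occupy only 100..130.
low_contrast = np.array([[100, 100, 110],
                         [110, 120, 120],
                         [130, 130, 130]], dtype=np.uint8)
equalized = equalize_histogram(low_contrast)
print(equalized.min(), equalized.max())  # 0 255
```

After the mapping, the four gray levels of the patch are spread over the full 0 to 255 range, which is the contrast stretch that Fig. 4 illustrates.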

Dimensions Alignment (Image Re-sizing)
Dimensions alignment means scaling an input image down from a larger size or up from a smaller size. In this study, we used two databases with a large number of images, and the images have different resolutions (more than 400×400). Therefore, we applied image re-sizing to reduce the size of the images and make all images equal in size (192×128). This helps the feature extraction algorithm to extract the same number of features from every image.
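A re-sizing step of this kind could be sketched as follows. The interpolation method used in the study is not stated, so nearest-neighbour sampling is assumed here purely for simplicity, and 192×128 is interpreted as 192 rows by 128 columns, which is also an assumption. The function name `resize_nearest` is hypothetical.

```python
import numpy as np

def resize_nearest(image, out_h, out_w):
    """Resize an image to (out_h, out_w) by nearest-neighbour sampling."""
    in_h, in_w = image.shape[:2]
    rows = np.arange(out_h) * in_h // out_h   # source row for each output row
    cols = np.arange(out_w) * in_w // out_w   # source column for each output column
    return image[rows[:, None], cols]

# Hypothetical 400x500 input standing in for a database image.
big = np.arange(400 * 500).reshape(400, 500)
small = resize_nearest(big, 192, 128)
print(small.shape)  # (192, 128)
```

Because every image ends up with the same shape, a fixed cell and block layout yields descriptors of identical length for all images.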

Feature Extraction Phase
In this study, two different feature extraction algorithms are used. The details of these algorithms are given below.

Histograms of Oriented Gradients (HOG)
HOG was introduced by Dalal and Triggs in 2005 [19] and later became one of the most widely used local feature descriptors in computer vision and image processing for capturing the distribution of local intensity gradients, or edge directions, of objects. As an appearance-based feature extraction method, it has given promising performance in a variety of computer vision problems related to object detection and recognition. In addition, HOG has several advantages: it is easy to use with discriminative classifiers, and because it captures the shape of an object from its edges (gradients), it gives good results in identifying objects against a cluttered background without using any segmentation algorithm.
The HOG algorithm follows a few main steps to describe objects in an image. First, it divides the input image into blocks and divides each block into smaller connected cells. Then it computes a histogram of gradient directions over all the pixels within each cell: according to its gradient orientation, each pixel is assigned to an angular bin and contributes its gradient magnitude to that bin. Finally, it normalizes the histograms of each group of cells (block); the concatenated normalized histograms form a one-dimensional array called the descriptor [20].
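The steps above can be sketched in a simplified form as shown below. The cell size (8×8 pixels), the number of bins (9), the block size (2×2 cells) and the L2 block normalization are common defaults assumed here for illustration; the paper does not report the parameters actually used, and this sketch omits refinements such as bin interpolation. The function name `hog_descriptor` is hypothetical.

```python
import numpy as np

def hog_descriptor(gray, cell=8, bins=9, block=2, eps=1e-6):
    """Simplified HOG: per-cell orientation histograms, L2-normalized per block."""
    gray = np.asarray(gray, dtype=np.float64)
    # Central-difference gradients (border pixels keep a zero gradient).
    gx = np.zeros_like(gray)
    gy = np.zeros_like(gray)
    gx[:, 1:-1] = gray[:, 2:] - gray[:, :-2]
    gy[1:-1, :] = gray[2:, :] - gray[:-2, :]
    magnitude = np.hypot(gx, gy)
    # Unsigned orientation in [0, 180) degrees, quantized into angular bins.
    angle = np.rad2deg(np.arctan2(gy, gx)) % 180.0
    bin_idx = np.minimum((angle / (180.0 / bins)).astype(int), bins - 1)

    n_cy, n_cx = gray.shape[0] // cell, gray.shape[1] // cell
    cell_hist = np.zeros((n_cy, n_cx, bins))
    for cy in range(n_cy):
        for cx in range(n_cx):
            rows = slice(cy * cell, (cy + 1) * cell)
            cols = slice(cx * cell, (cx + 1) * cell)
            # Each pixel adds its gradient magnitude to its orientation bin.
            cell_hist[cy, cx] = np.bincount(
                bin_idx[rows, cols].ravel(),
                weights=magnitude[rows, cols].ravel(),
                minlength=bins,
            )

    blocks = []
    for by in range(n_cy - block + 1):
        for bx in range(n_cx - block + 1):
            v = cell_hist[by:by + block, bx:bx + block].ravel()
            blocks.append(v / np.sqrt(np.sum(v ** 2) + eps ** 2))
    return np.concatenate(blocks)

# Hypothetical 16x16 image with one vertical edge.
edge = np.zeros((16, 16))
edge[:, 8:] = 255.0
descriptor = hog_descriptor(edge)
print(descriptor.shape)  # (36,)
```

For the vertical edge, all gradient energy falls into the first orientation bin of each cell, which shows how the descriptor encodes edge direction rather than raw intensity.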

LBP Features
The LBP, which was presented by Ojala et al. in 2002 [21], is an efficient and powerful texture descriptor that is widely used as a feature extractor in image processing and computer vision. The LBP algorithm captures the local structure of the image by looking at each pixel's neighbors. In the basic LBP mechanism, the input image is divided into local regions, and each pixel is examined within its 3×3 neighborhood. Each pixel is assigned a label by comparing the intensities of its eight neighbors with its own intensity value, which yields a binary pattern stored as an 8-bit integer. The distribution (histogram) of these binary patterns in each block is then represented as a one-dimensional array that is used as the feature representation [22].
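A minimal sketch of the basic 3×3 LBP operator and a block-wise histogram is given below. The bit ordering (clockwise from the top-left neighbor), the `>=` comparison and the 2×2 block grid are assumptions for illustration; other conventions exist and the paper does not specify which one was used. The function names are hypothetical.

```python
import numpy as np

def lbp_image(gray):
    """Basic 3x3 LBP code (0..255) for every interior pixel."""
    gray = np.asarray(gray, dtype=np.int32)
    h, w = gray.shape
    center = gray[1:-1, 1:-1]
    # Eight neighbors, clockwise from the top-left corner; bit 0 is the top-left.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        neighbor = gray[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes += (neighbor >= center).astype(np.uint8) * np.uint8(1 << bit)
    return codes

def lbp_histogram(gray, grid=(2, 2)):
    """Concatenate normalized 256-bin LBP histograms over a grid of blocks."""
    codes = lbp_image(gray)
    feats = []
    for band in np.array_split(codes, grid[0], axis=0):
        for block in np.array_split(band, grid[1], axis=1):
            hist = np.bincount(block.ravel(), minlength=256).astype(np.float64)
            feats.append(hist / max(hist.sum(), 1.0))
    return np.concatenate(feats)

# Worked example on one hypothetical 3x3 neighborhood (center value 6).
patch = np.array([[6, 5, 2],
                  [7, 6, 1],
                  [9, 8, 7]])
code = int(lbp_image(patch)[0, 0])
print(code)  # 241

# Hypothetical random image, only to show the feature length.
image = np.random.default_rng(0).integers(0, 256, size=(18, 18))
features = lbp_histogram(image)
print(features.shape)  # (1024,)
```

In the worked example the neighbors at the top-left, bottom-right, bottom, bottom-left and left are at least as bright as the center, so bits 0, 4, 5, 6 and 7 are set, giving 1 + 16 + 32 + 64 + 128 = 241.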

Classification Phase
In this study, k-Nearest Neighbour (k-NN) and Support Vector Machine (SVM) are used as classifiers. The details of the classifiers are given below.

k-NN
k-NN is one of the simplest classifiers used in machine learning for predicting the class of a test sample; it classifies a test sample according to the training samples that are closest to it in the feature space [23]. The main idea of the k-NN classifier is to compute the distances between the test object and all training objects, find the k objects in the training set that are nearest to the test object, and assign the test object to the class that is most common among them (see Fig. 5). Although the performance of the k-NN classifier is highly sensitive to the value of k, the classifier is widely used and very easy to implement for many classification problems. However, determining the value of k is not easy, because it depends on factors such as the type of feature extraction algorithm and the number of samples available in the training set [24].
Fig. 5 shows the mechanism of the k-NN classifier. It is based on the value of k and on the distances computed between the training objects (circular shapes) and the test object (star shape). For instance, when k = 3, k-NN finds the 3 training objects closest to the test object and assigns the majority class among them. In this case, the star is classified as a purple circle. Similarly, when k = 6, k-NN finds the 6 training objects closest to the test object and assigns the majority class among them. In this case, the star is classified as a yellow circle.
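A minimal k-NN sketch is shown below. It assumes Euclidean distance and a majority vote among the k nearest training samples, which is the usual rule for classification; the distance metric used in the study is not stated. The toy coordinates are hypothetical and were chosen only so that the outcome changes between k = 3 and k = 6, in the spirit of Fig. 5.

```python
import numpy as np
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k training samples nearest to query."""
    train_x = np.asarray(train_x, dtype=np.float64)
    query = np.asarray(query, dtype=np.float64)
    dists = np.linalg.norm(train_x - query, axis=1)   # Euclidean distances
    nearest = np.argsort(dists, kind="stable")[:k]    # indices of the k closest
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical 2-D training set; the test object sits at the origin.
train_x = [(1.0, 0.0), (0.0, 1.5), (2.0, 0.0), (0.0, 3.0), (3.5, 0.0), (0.0, 4.0)]
train_y = ["purple", "purple", "yellow", "yellow", "yellow", "yellow"]
star = (0.0, 0.0)

print(knn_predict(train_x, train_y, star, k=3))  # purple
print(knn_predict(train_x, train_y, star, k=6))  # yellow
```

With k = 3 the two purple points outvote one yellow point, while with k = 6 the four yellow points outvote the two purple ones, which illustrates the sensitivity to k.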

SVM
SVM was developed by Cortes and Vapnik in 1995 [25] and has been used extensively as a powerful classification algorithm for pattern recognition applications. In addition, it gives promising performance across a range of machine learning tasks, including classification, data separation, regression, and density estimation [26].
The SVM classifier has several advantages that make it an accurate and robust algorithm, such as:
• It gives promising performance even with a small number of images in the training set.
• It is not sensitive to the number of dimensions, which gives it promising performance with any image size.
• It is able to minimize both empirical and structural risk, which leads to better generalization in data classification.
The main task of SVM is to search for the Optimal Separating Hyperplane (OSH), which is the decision boundary that separates the two classes (positive and negative samples) of the training set with the largest possible margin. When the classes cannot be separated in the original space, SVM can map the input data into a high-dimensional feature space where a separating hyperplane may be found. Furthermore, maximizing the margin reduces the structural risk and hence the number of expected errors [24]. The training samples of each class that lie nearest to the OSH, on the border of the margin, are called the "Support Vectors" (see Fig. 6).
Figure 6 Basic Concept of SVM
Fig. 6 shows how the SVM classifier can distinguish between two classes, where Class 1 (star shapes) contains the positive features and Class 2 (circular shapes) contains the negative features. SVM increases the margin between the two classes step by step to find the OSH, which lies midway between the closest points of these classes. These closest points in each class (orange color) are known as the "Support Vectors" and are used by SVM in the classification process.
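The geometry described above can be illustrated with a small worked example. The sketch below does not train an SVM; it takes a hand-picked hyperplane (w, b) for a hypothetical two-dimensional data set, checks that it separates the classes, and computes the margin width 2/||w|| and the support vectors (the samples with y·(w·x + b) = 1). All data values and the hyperplane are assumptions chosen so that the arithmetic is easy to follow.

```python
import numpy as np

# Hypothetical 2-D samples: three positives (+1) and three negatives (-1).
x = np.array([[2.0, 1.0], [3.0, 3.0], [4.0, 0.0],
              [0.0, 1.0], [-1.0, 2.0], [-2.0, 0.0]])
y = np.array([1, 1, 1, -1, -1, -1])

# Hand-picked separating hyperplane w.x + b = 0, i.e. the vertical line x1 = 1.
w = np.array([1.0, 0.0])
b = -1.0

decision = x @ w + b                       # signed score of every sample
functional_margin = y * decision           # >= 1 for all samples if constraints hold
margin_width = 2.0 / np.linalg.norm(w)     # distance between the two margin borders
support_idx = np.flatnonzero(np.isclose(functional_margin, 1.0))

print(np.sign(decision).astype(int).tolist())  # [1, 1, 1, -1, -1, -1]
print(margin_width)                            # 2.0
print(support_idx.tolist())                    # [0, 3]
```

Here the closest points of the two classes, (2, 1) and (0, 1), are exactly 2 units apart, so no other separating hyperplane can give a wider margin; these two samples are the support vectors.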

Evaluation Phase

Cross-Validation Techniques
The mechanism of Cross-Validation techniques, such as Leave-One-Out Cross-Validation (LOO), 2-fold Cross-Validation, or 10-fold Cross-Validation, is simple: the dataset is split into N subsets, where N is 2 for 2-fold, 10 for 10-fold, and equal to the number of samples in the dataset for LOO. Then, the classification process is repeated N times; each time, N-1 of the subsets are used to train the classifier, and the remaining subset is used for evaluation. In this study, the Cross-Validation techniques 2-fold, 10-fold, and LOO have been applied.
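The splitting scheme can be sketched as a small index generator. Setting the number of folds equal to the number of samples gives LOO. The function name `k_fold_indices` is hypothetical, and the sketch assumes contiguous, unshuffled folds; whether the study shuffled or stratified the data is not stated.

```python
def k_fold_indices(n_samples, n_folds):
    """Yield (train_idx, test_idx) pairs; n_folds == n_samples gives Leave-One-Out."""
    indices = list(range(n_samples))
    # Distribute any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // n_folds + (1 if i < n_samples % n_folds else 0)
                  for i in range(n_folds)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size

# Hypothetical dataset of 10 samples.
two_fold = list(k_fold_indices(10, 2))
loo = list(k_fold_indices(10, 10))
print(len(two_fold), [len(test) for _, test in two_fold])  # 2 [5, 5]
print(len(loo), loo[0])  # 10 ([1, 2, 3, 4, 5, 6, 7, 8, 9], [0])
```

Every sample appears in exactly one test subset, so averaging the per-fold results gives an accuracy estimate over the whole dataset.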

Confusion Matrix
The confusion matrix, which is also called an error matrix or a contingency table, provides a simple tabular summary and visualization of the predicted classes against the actual classes produced by a classifier. The system's performance is generally evaluated by using the counts in this matrix.
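As an illustration, the sketch below builds a confusion matrix with actual classes as rows and predicted classes as columns, and derives the overall accuracy from its diagonal. The row and column convention and the three-class toy labels are assumptions for illustration; they are not results from this study.

```python
import numpy as np

def confusion_matrix(actual, predicted, n_classes):
    """Count samples per (actual class, predicted class) pair."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for a, p in zip(actual, predicted):
        cm[a, p] += 1
    return cm

# Hypothetical labels for three classes (0, 1, 2).
actual = [0, 0, 1, 1, 2, 2, 2]
predicted = [0, 1, 1, 1, 2, 1, 2]
cm = confusion_matrix(actual, predicted, 3)
accuracy = np.trace(cm) / cm.sum()   # correct predictions lie on the diagonal

print(cm.tolist())  # [[1, 1, 0], [0, 2, 0], [0, 1, 2]]
```

Off-diagonal entries show which classes are confused with each other; in this toy case, one sample of class 0 and one of class 2 are both mistaken for class 1.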

EXPERIMENTAL CLASSIFICATION RESULTS AND ANALYSIS
As explained in Section 3.1, the FG-NET aging database contains 1,002 facial images, whereas the UTD database contains 580 facial images. We therefore combined both databases into one larger database containing 1,582 facial images, and then separated these images into 11 classes depending on their ages. The experiments carried out to predict the age range of people are reviewed in the following sub-sections.

SVM Based Classification
Here, we trained the linear, polynomial, and RBF kernels of the SVM classifier on 1,582 images from the FG-NET and UTD databases by using Cross-Validation techniques. When the HOG features were used with SVM, the age prediction accuracy was 98.60% (best result: HOG + Linear SVM + LOO), whereas when the LBP features were used with SVM, the accuracy was 98.29% (best result: LBP + Linear SVM + LOO). We therefore proposed to combine the HOG and LBP features extracted from each image and save them in a single vector. With the combined HOG and LBP features, a 99.87% age prediction rate was achieved by using the Linear SVM.
Tab. 2 shows all experimental results achieved by the proposed methods. In addition, we evaluated the classifiers by using the Confusion Matrix (CM), which provides details and a visualization of the predicted and actual classes produced by the SVM classifier with HOG, LBP and combined HOG & LBP features. As can be seen from Table 3, the errors occur in neighbouring classes. For example, the error for the 56-60 age-range class falls in the (up to 60) class. This shows that a person in the age range of 56-60 is predicted to be in the up to 60 age range.

k-NN Based Classification
In these experiments, we trained the k-NN classifier on 1,582 images from the FG-NET and UTD databases by using the Leave-One-Out technique. As explained in Section 3.4.1, the performance of the k-NN classifier is highly sensitive to the value of k. Moreover, determining the value of k is not easy, because it depends on factors such as the number of samples in the training set and the type of feature extraction algorithm used. Therefore, in this study, extensive experiments were carried out to determine the k value that gives the highest performance (see Fig. 7).
As can be seen from Fig. 7, changing the k value from 1 to 30 leads to different performance. We notice that the best performance is achieved when k is equal to 12, 19, and 21 for HOG, LBP, and combined HOG & LBP features, respectively. When the HOG features were used with k-NN, the age prediction accuracy was 98.23% (k = 12). Similarly, when the LBP features were used with k-NN, the accuracy was 88.05% (k = 19). In order to increase the system's performance, we combined the HOG and LBP features and obtained a 99.43% age prediction rate with the k-NN classifier (k = 21). Tab. 4 shows a summary of the performance of the proposed method. In addition, we evaluated the classifiers by using the Confusion Matrix (CM), which provides simple details and a visualization of the predicted and actual classes produced by the k-NN classifier with HOG, LBP and combined HOG & LBP features.
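A search over k of this kind could be sketched as follows: for each candidate k, the Leave-One-Out accuracy of a k-NN classifier is computed, and the k with the highest accuracy is kept. The Euclidean distance, the majority vote, the tie-breaking in favour of the smallest k, and the one-dimensional toy data are all assumptions for illustration; the function names are hypothetical.

```python
import numpy as np
from collections import Counter

def loo_accuracy(x, y, k):
    """Leave-One-Out accuracy of a k-NN classifier with majority voting."""
    x = np.asarray(x, dtype=np.float64)
    correct = 0
    for i in range(len(x)):
        dists = np.linalg.norm(x - x[i], axis=1)
        dists[i] = np.inf                                # leave sample i out
        nearest = np.argsort(dists, kind="stable")[:k]
        vote = Counter(y[j] for j in nearest).most_common(1)[0][0]
        correct += int(vote == y[i])
    return correct / len(x)

def best_k(x, y, k_values):
    """Return the k with the highest LOO accuracy (first one on ties) and all scores."""
    scores = {k: loo_accuracy(x, y, k) for k in k_values}
    return max(scores, key=scores.get), scores

# Hypothetical 1-D features forming two well-separated groups of four samples.
x = [[0.0], [1.0], [2.0], [3.0], [10.0], [11.0], [12.0], [13.0]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
k_star, scores = best_k(x, y, k_values=(1, 3, 5, 7))
print(scores)  # {1: 1.0, 3: 1.0, 5: 1.0, 7: 0.0}
```

In this toy case k = 7 fails because, once a sample is left out, only three neighbors of its own class remain against four of the other class, which mirrors how an overly large k can degrade performance.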

DISCUSSION AND CONCLUSION
In this study, extensive experiments were carried out to predict the age of a person from a face image with high accuracy and performance. We proposed an efficient and accurate approach that combines the HOG and LBP features extracted from each image in our dataset, and we used SVM and k-NN as the classifiers. Moreover, dimensions alignment, which is used to reduce the computation cost, and the Histogram Equalization technique, which is used to minimize the illumination effects in different images, were applied successfully to all images in the FG-NET and UTD databases, and hence we obtained promising and accurate results. Furthermore, our investigations and extensive experiments confirm that using our proposed method (combining HOG and LBP features), together with a suitable k value when the k-NN classifier is used, leads to excellent performance.
In summary, the experimental results show that when the Linear SVM is used, the performance is in the range of 98% for each feature extractor (HOG and LBP). However, when the LBP and HOG features are combined, the performance increases to 99%. This is expected, because the combined approach uses a larger number of features. The same is true for the k-NN based predictions: when LBP and HOG features are combined and used with k-NN, the performance is also in the range of 99%. We therefore conclude that it is necessary to combine LBP and HOG features in order to obtain high performance for age range prediction.
Compared with similar studies such as [7] and [10], which are discussed in Section 2, our results are promising and significant. The best result achieved by [7] was a 94.93% successful age estimation rate, obtained by applying LARR + SVM to 500 images from the FG-NET database. Similarly, the best accuracy achieved by [10] was 85%, obtained by applying HOG + Random-forests to the UTD and FG-NET databases. In contrast, the age prediction accuracy over all 1,582 experimental images reached 99.87% when using our proposed method with the Linear SVM classifier, and 99.43% when using our proposed method with k-NN. Moreover, most of the studies in the literature tried to predict ages based on a small number of age classes, for example, 3 classes such as babies, young adults and senior adults as in [1]. Therefore, the proposed study focused on a more complicated problem than similar studies and still obtained promising results. However, the LOO technique has the disadvantage that it may consume a lot of processing time when applied to a large amount of data, because it evaluates all images in the dataset one by one. Another limitation is that we used facial images directly to test the performance of the proposed methods, so detecting faces in the images is out of the scope of this work. In addition, we still predict the age range of a person, not the exact age. In a future study, we plan to focus on predicting the exact age of people.

Figure 1 General Methodology of Age Prediction System

Figure 2 Examples from FG-NET Database

Figure 3

Figure 4 Histogram Equalization (HE)
Fig. 4 shows the histogram of an image before and after Histogram Equalization. The histogram describes the distribution of the image's intensity values and the range of illumination. It can be seen that histogram equalization improves the image's contrast: because it operates on the histogram of the whole image, each pixel's new value depends on the intensities of all pixels, including distant ones, and the method tries to redistribute these values evenly across the available range. The Cumulative Histogram line indicates the distribution of intensity values; an exactly linear ramp means that the intensity values are equalized. After applying Histogram Equalization to the image, the Cumulative Histogram is almost linear, which means that the intensity values have been distributed almost equally.

Figure 5 Basic Concept of k-NN

Table 2 SVM Based Age-Classification of FG-NET and UTD Databases

Table 5 Confusion Matrix Evaluation of Combined HOG & LBP + k-NN