Facial Expression Recognition System Using Haar Cascades Classifier
https://doi.org/10.22214/ijraset.2022.45141
International Journal for Research in Applied Science & Engineering Technology (IJRASET)
ISSN: 2321-9653; IC Value: 45.98; SJ Impact Factor: 7.538
Volume 10 Issue VI June 2022- Available at www.ijraset.com
Abstract: Facial expressions convey non-verbal cues, which play a crucial role in social relations. Facial expression recognition is a significant yet challenging task, as it can be used to identify the emotions and mental state of an individual. In this system, using image processing and machine learning, we compare the captured image with the trained dataset and then display the emotional state of the image. To design a robust facial feature recognition system, the Local Binary Pattern (LBP) is used. We then assess the performance of the proposed system on a trained database with the help of neural networks. The results show competitive classification accuracy across various emotions.
Keywords: Facial expression recognition, Local Binary Pattern (LBP), Convolutional Neural Networks (CNN), feature extraction, Haar Cascade Classifier.
I. INTRODUCTION
A person's facial expressions serve as a medium of both verbal and nonverbal communication, because they are the outward expression of the person's affective state, line of thinking, intent, and personality. Humans generally use gestures, facial expressions, or even involuntary body language to express their emotional state, rather than verbal communication. Facial expression recognition has been studied for a long time and has made significant progress in recent decades. Despite this progress, identifying and understanding facial expressions accurately remains challenging, since their complexity is high and their variety is large. Facial expression recognition has the potential to become a very useful nonverbal channel for people to communicate with machines in the near future. What matters is how well the system detects or extracts facial expressions from images. It is gaining popularity because of its potential for wide use in fields such as lie detection, medical assessment, and human-computer interaction.
©IJRASET: All Rights are Reserved | SJ Impact Factor 7.538 | ISRA Journal Impact Factor 7.894
III. METHODOLOGY
The proposed system offers a solution that minimizes the resources required to identify emotion from facial features. The trained face detector scans an image with a sub-window at various scales. A cascade classifier made of several stage classifiers tests each sub-window. If the sub-window clearly does not contain part of a face, one of the initial stages in the cascade rejects it. If the sub-window is difficult to classify, a more specific later stage classifies it further.
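The early-rejection behavior of such a cascade can be sketched as below. The stage tests and scores here are toy placeholders, not the actual Viola-Jones Haar-feature stages:

```python
# Sketch of cascade rejection: each stage is a cheap test; a sub-window
# survives only if every stage accepts it. The stages below are toy
# placeholders, not real Haar-feature classifiers.
def cascade_classify(sub_window, stages):
    """Return True if the sub-window passes every stage classifier."""
    for stage in stages:
        if not stage(sub_window):
            return False  # early rejection by a cheap initial stage
    return True

# Toy stages: mean-intensity test first (cheap), contrast test second.
stages = [
    lambda w: sum(w) / len(w) > 0.2,   # reject nearly black windows
    lambda w: max(w) - min(w) > 0.3,   # reject flat, featureless windows
]

print(cascade_classify([0.1, 0.5, 0.9, 0.4], stages))  # passes both stages
print(cascade_classify([0.0, 0.0, 0.1, 0.0], stages))  # rejected by stage 1
```

Because most sub-windows are rejected by the first, cheapest stages, only a small fraction of windows ever reach the expensive later stages.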
The proposed system uses the Haar Cascades (Viola-Jones) technique to determine whether an image contains a face. If a face is present, the areas containing the eyes and mouth are located and cropped out of the image. The Sobel edge detection method is then used to detect edges, followed by feature extraction. We train the classification model on the extracted features using neural networks and then use it to classify the emotions.
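The Sobel step mentioned above can be sketched in pure NumPy. This is a minimal illustration of the 3x3 Sobel operator, not the implementation used in the system:

```python
import numpy as np

# Sobel kernels for horizontal (KX) and vertical (KY) gradients.
KX = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
KY = KX.T

def sobel_magnitude(img):
    """Gradient magnitude via 3x3 Sobel convolution (valid region only)."""
    h, w = img.shape
    gx = np.zeros((h - 2, w - 2))
    gy = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            patch = img[i:i + 3, j:j + 3]
            gx[i, j] = np.sum(patch * KX)
            gy[i, j] = np.sum(patch * KY)
    return np.sqrt(gx ** 2 + gy ** 2)

# A vertical step edge produces a strong response along the boundary.
img = np.zeros((5, 6))
img[:, 3:] = 1.0
edges = sobel_magnitude(img)
```

In practice a library routine (e.g. an optimized 2-D convolution) would replace the explicit loops, but the kernels and the magnitude computation are the same.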
4) Feature Extraction: After preprocessing, the image of the face is used to extract its significant features. These features are identified and extracted using the Local Binary Pattern algorithm, which is described below.
5) Local Binary Pattern: As described in section 1.2, LBP codes are obtained for each pixel. A further extension of LBP is to use uniform patterns. We treat the binary string as circular and count the transitions from 0 to 1 or from 1 to 0. If there are at most two such transitions, the LBP is said to be uniform.
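The uniformity test on a circular LBP code can be sketched as follows (the string representation of the code is an illustrative choice):

```python
def transitions(bits):
    """Count circular 0/1 transitions in an LBP binary string."""
    return sum(bits[i] != bits[(i + 1) % len(bits)] for i in range(len(bits)))

def is_uniform(bits):
    """An LBP code is uniform if its circular pattern has at most two transitions."""
    return transitions(bits) <= 2

print(is_uniform("11110000"))  # True: one 1->0 transition plus the 0->1 wrap-around
print(is_uniform("01010101"))  # False: eight transitions
```

Restricting the histogram to uniform patterns shrinks the feature vector considerably, since uniform codes account for the vast majority of patterns in natural images.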
After labeling the image with the LBP operator, a histogram of the labeled image f_l(x, y) can be defined as

H_i = Σ_{x,y} I{ f_l(x, y) = i },   i = 0, …, n − 1,

where n is the number of distinct labels given as the result by the LBP operator and I{A} is 1 if A is true and 0 otherwise.
This histogram contains information about the distribution of edges, spots, and flat areas over the whole image. For efficient face representation, the extracted features should also retain spatial information. Hence, the image is divided into m small regions R_0, …, R_{m−1}, and we define a spatially enhanced histogram as

H_{i,j} = Σ_{x,y} I{ f_l(x, y) = i } · I{ (x, y) ∈ R_j },   i = 0, …, n − 1,   j = 0, …, m − 1.
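The spatially enhanced histogram can be sketched as below, assuming the LBP label image has already been computed; the grid size and label count are illustrative parameters:

```python
import numpy as np

def spatial_lbp_histogram(labels, grid=(2, 2), n_labels=256):
    """Concatenate per-region histograms of an LBP label image.

    labels: 2-D array of LBP codes in [0, n_labels).
    grid:   (rows, cols) of the region grid; m = rows * cols regions.
    """
    rows, cols = grid
    h_parts = []
    for band in np.array_split(labels, rows, axis=0):
        for region in np.array_split(band, cols, axis=1):
            hist, _ = np.histogram(region, bins=n_labels, range=(0, n_labels))
            h_parts.append(hist)
    return np.concatenate(h_parts)  # feature vector of length n_labels * m

labels = np.random.randint(0, 256, size=(48, 48))  # stand-in LBP label image
feat = spatial_lbp_histogram(labels, grid=(4, 4))
print(feat.shape)  # (4096,) = 256 labels x 16 regions
```

Concatenating the regional histograms keeps the local distributions separate, so the feature vector encodes where on the face each texture pattern occurs.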
6) Classification: The dimensionality of the data obtained from the feature extraction step is very high, so it is reduced during classification. Since the features take different values for objects belonging to different classes, classification is done using a CNN. A typical convolutional neural network architecture contains an input layer, some convolutional layers, some fully connected layers, and an output layer at the end. The CNN is designed with some modifications of the LeNet architecture. The architecture of the convolutional neural network used in the project is shown in the following Figure 3.1.1.
The output layer consists of seven distinct classes. Using the softmax activation function, the probabilities for all seven classes are calculated individually, and the output is obtained. The class with the highest probability is the resulting predicted class.
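The final softmax-and-argmax step can be illustrated as follows; the logit values and the class ordering are hypothetical, not the network's actual outputs:

```python
import numpy as np

# Seven expression classes; the ordering here is an assumption for illustration.
EMOTIONS = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]

def softmax(logits):
    """Numerically stable softmax over the seven class logits."""
    z = logits - np.max(logits)  # shift for stability; probabilities unchanged
    e = np.exp(z)
    return e / e.sum()

logits = np.array([1.2, -0.8, 0.3, 3.1, 0.0, 0.9, 1.5])  # hypothetical scores
probs = softmax(logits)
predicted = EMOTIONS[int(np.argmax(probs))]
print(predicted)  # "happy": the largest logit gets the highest probability
```

Since softmax is monotonic, the predicted class is simply the one with the largest logit; the probabilities are mainly useful for confidence reporting.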
A. Precision
Precision is computed per class and evaluates the predictive power of the algorithm. It is the ratio of true positives to the total number of positives predicted by the model:

Precision = tp / (tp + fp)

where
tp = number of correctly predicted positives,
fp = number of negatives incorrectly predicted as positives.
B. Recall
Recall is a function of the model’s correctly classified examples and its misclassified examples (true positives and false negatives). It is the percentage of correctly assigned expressions relative to the total number of expressions of that class:

Recall = tp / (tp + fn)

where
tp = number of correctly predicted positives,
fn = number of positives incorrectly predicted as negatives.
C. F1 score
The F1 score is the harmonic mean of Precision and Recall:

F1 = 2 × (Precision × Recall) / (Precision + Recall)

It takes both false positives and false negatives into account. It is harder to interpret than accuracy, but it is more useful when the class distribution is uneven.
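The three metrics can be computed directly from the confusion counts; the counts below are toy values, not the paper's actual results:

```python
def precision(tp, fp):
    """Fraction of predicted positives that are correct."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Fraction of actual positives that are recovered."""
    return tp / (tp + fn)

def f1_score(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)

# Toy counts for one expression class (not the system's measured values).
tp, fp, fn = 80, 20, 10
p = precision(tp, fp)  # 0.8
r = recall(tp, fn)     # 80/90, about 0.889
print(round(f1_score(p, r), 3))  # 0.842
```

Because the harmonic mean is dominated by the smaller of the two values, a class with high precision but poor recall (or vice versa) still receives a low F1 score.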
V. RESULT ANALYSIS
The confusion matrix and the normalized confusion matrix for all seven facial expression classes are shown below in Figures 5.1 and 5.2, respectively:
The performance metrics for each emotion are calculated and shown in the table below.
Figure 5.3 below shows the successful detection of the user's face and classification of the expression as happy when the user smiles.
Figure 5.4 below shows the successful recognition of the expression as fear when the user is scared.
Figure 5.5 below shows the successful recognition of the expression as surprise.
Figure 5.6 below shows the successful recognition of the expression as neutral when there is no visible expression on the user's face.
To measure the performance of the proposed algorithms and methods and to check the accuracy of the results, we evaluated the system using the resulting precision, recall, and F1 score. A single dataset was used for both training and testing by splitting it into training samples and testing samples. The accuracy obtained on the Kaggle dataset was 86.7%, precision was 0.87, recall was 0.87, and the F1-score was 0.87.
VI. CONCLUSION
We can say that the model was successfully able to detect and classify various human emotions. The accuracy obtained from the model differs because the landmark positions of the face are known and fixed, i.e., the mouth and both eyes. It can therefore be concluded that this method can be used instead of traditional face detection, which requires locating more landmarks. A good combination of efficiency and accuracy can be achieved by using Haar Cascades for detection and neural networks for identifying the various types of emotions.