Face Detection & Emotion Recognition

This document summarizes a student project on face detection and emotion recognition using Python. The students first built a program to detect faces in images using OpenCV and a Haar cascade classifier. They then created a deep learning model with Keras to classify facial expressions into 7 emotion categories (happy, sad, angry, surprise, fear, neutral, disgust) by training on the FER2013 dataset. The model uses convolutional and max pooling layers to extract features from images, followed by dense layers to perform classification. The trained model was evaluated on a held-out test split to measure its emotion-recognition accuracy.


GUJARAT TECHNOLOGICAL UNIVERSITY

PROJECT REPORT ON
‘FACE DETECTION AND EMOTION RECOGNITION’
SUBMITTED BY
NISHU TIWARI 170420117053

HARDIK VALANKAR 160420117055

AMIN NACHIKET 160420117001

PATEL AAKARSH 160420117035

SUBJECT
Project -2

PROJECT GUIDE
PROF. NANDKISHOR JOSHI

DEPARTMENT
INSTRUMENTATION AND CONTROL
Introduction
• A face is detected and identified within an acquired digital image.
• One or more features of the face are extracted from the digital image: the two eyes (or subsets of features of each eye), the lips or partial lips or other mouth features together with one or both eyes, or both.
• A model comprising multiple shape parameters is applied to these eye and mouth features.
• One or more similarities between the extracted features and a library of reference feature sets are determined.
• A probable facial expression is identified based on the determined similarities (a matching sketch follows this list).
• “Perhaps the most compelling argument for [facial recognition software] is that it can make law enforcement more efficient,” Shannon Togawa Mercer and Ashley Deeks write on Lawfare.
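To make the matching step above concrete, here is a minimal, hypothetical sketch: each face is reduced to a small shape-parameter vector and compared against a library of reference feature sets, with the closest match giving the probable expression. The labels and vector values below are illustrative placeholders, not the project's actual model.

import numpy as np

# Hypothetical library of reference feature sets:
# expression label -> reference shape-parameter vector (made-up values)
reference_library = {
    "happy":    np.array([0.8, 0.1, 0.3]),
    "sad":      np.array([0.2, 0.7, 0.4]),
    "surprise": np.array([0.5, 0.2, 0.9]),
}

def probable_expression(shape_params):
    """Return the expression whose reference vector is closest to shape_params."""
    distances = {label: np.linalg.norm(shape_params - ref)
                 for label, ref in reference_library.items()}
    return min(distances, key=distances.get)

# Example: a (hypothetical) shape-parameter vector extracted from a detected face
print(probable_expression(np.array([0.75, 0.15, 0.35])))   # -> "happy"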
Face Detection

• We have used a Python program to detect human faces through the webcam.
• OpenCV in Python is used for real-time detection from the image stream.
• Hence, after running the code in VS Code or a Python prompt, the webcam feed shows detected human faces.
• We have used the Haar cascade classifier for detection, which is stored as an XML file.
Figure 1: Webcam face detection using the Haar cascade classifier.
LIBRARIES AND APPLICATIONS USED
• Python – an interpreted, high-level, general-purpose programming language.
• OpenCV – an open-source computer vision and machine learning library. It has more than 2500 optimized algorithms that can be used to detect and recognize faces, identify objects, classify human actions in videos, track camera movements, etc.
• OpenCV in Python – a library of Python bindings for solving computer vision problems. It makes use of NumPy, a highly optimized library for numerical operations with a MATLAB-style syntax.
• HAAR CASCADE – a machine-learning-based approach in which many positive and negative images are used to train the classifier. Positive images contain the objects that need to be identified, whereas negative images do not.
Code for face detection
import cv2

# Load the pre-trained frontal-face Haar cascade shipped with OpenCV
face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')

# Open the default webcam
video = cv2.VideoCapture(0)

while True:
    check, frame = video.read()
    faces = face_cascade.detectMultiScale(frame, scaleFactor=1.1, minNeighbors=5)

    # Draw a green rectangle around every detected face
    for x, y, w, h in faces:
        frame = cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 3)

    cv2.imshow('Face Detector', frame)

    # Press 'q' to quit
    key = cv2.waitKey(30)
    if key == ord('q'):
        break

video.release()
cv2.destroyAllWindows()
Output of Code
EMOTION RECOGNITION
(KERAS) 
 
The emotion of a human is detected on the basis of their facial expression. There are 7 basic human emotions: happy, sad, angry, surprise, fear, neutral and disgust. Whenever a human feels an emotion, it can be seen through their expression. Hence, this model detects the expression of a human and gives the corresponding emotion as output.

We have used Keras as the tool for detecting emotions through the webcam using Python.
LIBRARIES AND APPLICATIONS USED
• Keras – a high-level neural networks library written in Python, which makes it simple and intuitive to use. It works as a wrapper around low-level libraries such as TensorFlow or Theano.
• TensorFlow – an open-source library for a wide range of machine learning tasks. TensorFlow provides both high-level and low-level APIs.
• FER2013 – the facial-expression recognition dataset (fer2013.csv), provided as a CSV file and used to train and test the model.
• MTCNN – MTCNN (Multi-task Cascaded Convolutional Neural Networks) is an algorithm consisting of 3 stages, which detects the bounding boxes of faces in an image along with their 5-point facial landmarks (a usage sketch follows this list).
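As a rough illustration of how MTCNN is typically used from Python, here is a minimal sketch based on the commonly available third-party `mtcnn` package; the package, the image path and the `detect_faces` call are assumptions, since the report does not show MTCNN code.

# Minimal sketch (assumes the third-party 'mtcnn' package and OpenCV are installed)
import cv2
from mtcnn import MTCNN

detector = MTCNN()
img = cv2.cvtColor(cv2.imread("face.jpg"), cv2.COLOR_BGR2RGB)   # "face.jpg" is a placeholder path

# Each result contains a bounding box, a confidence score and 5 facial landmarks
for result in detector.detect_faces(img):
    x, y, w, h = result['box']
    print("box:", (x, y, w, h))
    print("landmarks:", result['keypoints'])   # eyes, nose, mouth corners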
EPOCH
• One epoch is when the entire dataset passes forward and backward through the neural network exactly once. Since one epoch is too big to feed to the computer at once, the dataset is divided into several smaller batches (a worked example follows).
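For instance, with the batch size of 64 used in the training script below, the number of weight updates per epoch is simply the dataset size divided by the batch size. The sample count of 28,709 is the size of the FER2013 training split and is used here only as a worked example.

import math

num_samples = 28709   # FER2013 'Training' split size (for illustration)
batch_size = 64
epochs = 30

steps_per_epoch = math.ceil(num_samples / batch_size)
print(steps_per_epoch)            # 449 batches per epoch
print(steps_per_epoch * epochs)   # 13470 weight updates over 30 epochs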
Code for emotion recognition:
Emotionrecognition.py

import sys, os
import pandas as pd
import numpy as np

from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Conv2D, MaxPooling2D, BatchNormalization, AveragePooling2D
from keras.losses import categorical_crossentropy
from keras.optimizers import Adam
from keras.regularizers import l2
from keras.utils import np_utils

# pd.set_option('display.max_rows', 500)
# pd.set_option('display.max_columns', 500)
# pd.set_option('display.width', 1000)

num_features = 64
num_labels = 7
batch_size = 64
epochs = 30
width, height = 48, 48

# Load the FER2013 dataset: each row holds an emotion label, a string of
# space-separated pixel values for a 48x48 grayscale face, and a Usage tag
df = pd.read_csv('fer2013.csv')
# print(df.info())
# print(df["Usage"].value_counts())
# print(df.head())

X_train, train_y, X_test, test_y = [], [], [], []

for index, row in df.iterrows():
    val = row['pixels'].split(" ")
    try:
        if 'Training' in row['Usage']:
            X_train.append(np.array(val, 'float32'))
            train_y.append(row['emotion'])
        elif 'PublicTest' in row['Usage']:
            X_test.append(np.array(val, 'float32'))
            test_y.append(row['emotion'])
    except:
        print(f"error occurred at index: {index} and row: {row}")

X_train = np.array(X_train, 'float32')
train_y = np.array(train_y, 'float32')
X_test = np.array(X_test, 'float32')
test_y = np.array(test_y, 'float32')

# One-hot encode the 7 emotion labels
train_y = np_utils.to_categorical(train_y, num_classes=num_labels)
test_y = np_utils.to_categorical(test_y, num_classes=num_labels)

# Normalizing the data (zero mean, unit standard deviation)
X_train -= np.mean(X_train, axis=0)
X_train /= np.std(X_train, axis=0)
X_test -= np.mean(X_test, axis=0)
X_test /= np.std(X_test, axis=0)

X_train = X_train.reshape(X_train.shape[0], width, height, 1)
X_test = X_test.reshape(X_test.shape[0], width, height, 1)
# print(f"shape: {X_train.shape}")

## Designing the CNN
model = Sequential()

# 1st convolution layer
model.add(Conv2D(64, kernel_size=(3, 3), activation='relu', input_shape=(X_train.shape[1:])))
model.add(Conv2D(64, kernel_size=(3, 3), activation='relu'))
# model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
model.add(Dropout(0.5))

# 2nd convolution layer
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(Conv2D(64, (3, 3), activation='relu'))
# model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
model.add(Dropout(0.5))

# 3rd convolution layer
model.add(Conv2D(128, (3, 3), activation='relu'))
model.add(Conv2D(128, (3, 3), activation='relu'))
# model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
model.add(Flatten())

# Fully connected layers
model.add(Dense(1024, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(1024, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(num_labels, activation='softmax'))
# model.summary()

# Compiling the model
model.compile(loss=categorical_crossentropy,
              optimizer=Adam(),
              metrics=['accuracy'])

# Training the model
model.fit(X_train, train_y,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(X_test, test_y),
          shuffle=True)

# Saving the model to use it later on
fer_json = model.to_json()
with open("fer.json", "w") as json_file:
    json_file.write(fer_json)
model.save_weights("fer.h5")

The real-time webcam code that loads fer.json and fer.h5 and labels each detected face with the predicted emotion is listed separately below as Videotester.py.
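Since the report does not quote a final accuracy figure, one way to obtain it is to evaluate the saved model on the PublicTest split prepared above. This is a small sketch continuing the training script (it reuses X_test and test_y); the resulting number depends on the training run.

# Evaluate on the held-out PublicTest split (X_test, test_y from the script above)
loss, accuracy = model.evaluate(X_test, test_y, verbose=0)
print(f"PublicTest loss: {loss:.3f}, accuracy: {accuracy:.3f}")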
Videotester.py
import os
import cv2
import numpy as np
from keras.models import model_from_json
from keras.preprocessing import image

# Load the model architecture and weights saved by the training script
model = model_from_json(open("fer.json", "r").read())
model.load_weights('fer.h5')

face_haar_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
cap = cv2.VideoCapture(0)

while True:
    ret, test_img = cap.read()   # captures frame and returns boolean value and captured image
    if not ret:
        continue
    gray_img = cv2.cvtColor(test_img, cv2.COLOR_BGR2GRAY)
    faces_detected = face_haar_cascade.detectMultiScale(gray_img, 1.32, 5)

    for (x, y, w, h) in faces_detected:
        cv2.rectangle(test_img, (x, y), (x + w, y + h), (255, 0, 0), thickness=7)
        roi_gray = gray_img[y:y + h, x:x + w]   # cropping region of interest, i.e. face area from image
        roi_gray = cv2.resize(roi_gray, (48, 48))
        img_pixels = image.img_to_array(roi_gray)
        img_pixels = np.expand_dims(img_pixels, axis=0)
        img_pixels /= 255

        predictions = model.predict(img_pixels)

        # find the index of the highest-probability class
        max_index = np.argmax(predictions[0])

        emotions = ('angry', 'disgust', 'fear', 'happy', 'sad', 'surprise', 'neutral')
        predicted_emotion = emotions[max_index]

        cv2.putText(test_img, predicted_emotion, (int(x), int(y)), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 255), 2)

    resized_img = cv2.resize(test_img, (1000, 700))
    cv2.imshow('Facial emotion analysis', resized_img)

    if cv2.waitKey(10) == ord('q'):   # wait until 'q' key is pressed
        break

cap.release()
cv2.destroyAllWindows()
OUTPUT:
LIMITATIONS OF PROJECT:-
1. Poor Image Quality Limits Facial Recognition's Effectiveness
Image quality affects how well facial-recognition algorithms work. The image quality of scanning video is
quite low compared with that of a digital camera. Even high-definition video is, at best, 1080p (progressive
scan); usually, it is 720p. These values are equivalent to about 2MP and 0.9MP, respectively, while an
inexpensive digital camera attains 15MP. The difference is quite noticeable.
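The megapixel figures above follow directly from the frame dimensions; a quick check:

# Pixel counts behind the megapixel figures quoted above
full_hd = 1920 * 1080      # 2,073,600 pixels, about 2.1 MP
hd      = 1280 * 720       #   921,600 pixels, about 0.9 MP
print(full_hd / 1e6, hd / 1e6)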
2. Small Image Sizes Make Facial Recognition More Difficult
When a face-detection algorithm finds a face in an image or in a still from a video capture, the relative size of
that face compared with the enrolled image size affects how well the face will be recognized. An already small
image size, coupled with a target distant from the camera, means that the detected face is only 100 to 200 pixels
on a side. Further, having to scan an image for varying face sizes is a processor-intensive activity. Most
algorithms allow specification of a face-size range to help eliminate false positives on detection and speed up
image processing.
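In OpenCV's Haar-cascade detector, for example, this face-size range corresponds to the optional minSize and maxSize arguments of detectMultiScale; the pixel values and image path below are illustrative.

import cv2

face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
gray = cv2.imread('frame.jpg', cv2.IMREAD_GRAYSCALE)   # 'frame.jpg' is a placeholder path

# Restrict detections to faces between 100x100 and 300x300 pixels to cut
# false positives and reduce the number of scales that must be scanned
faces = face_cascade.detectMultiScale(gray,
                                      scaleFactor=1.1,
                                      minNeighbors=5,
                                      minSize=(100, 100),
                                      maxSize=(300, 300))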
3. Different Face Angles Can Throw Off Facial Recognition's Reliability
The relative angle of the target’s face influences the recognition score profoundly. When a face is enrolled
in the recognition software, usually multiple angles are used (profile, frontal and 45-degree are common).
Anything less than a frontal view affects the algorithm’s capability to generate a template for the face. The more
direct the image (both enrolled and probe image) and the higher its resolution, the higher the score of any
resulting matches.
4. Data Processing and Storage Can Limit Facial Recognition Tech
Even though high-definition video is quite low in resolution when compared with digital camera images, it still
occupies significant amounts of disk space. Processing every frame of video is an enormous undertaking, so
usually only a fraction (10 percent to 25 percent) is actually run through a recognition system. To minimize total
processing time, agencies can use clusters of computers. However, adding computers involves considerable data
transfer over a network, which can be bound by input-output restrictions, further limiting processing speed.
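One simple way to process only a fraction of the video, as described above, is to run recognition on every N-th frame of the capture loop. Below is a sketch along the lines of the project's own webcam loop; the skip factor of 5 is an assumption chosen to land in the 10-25 percent range.

import cv2

cap = cv2.VideoCapture(0)
PROCESS_EVERY = 5          # run recognition on roughly 20% of frames (assumed skip factor)
frame_count = 0

while True:
    ret, frame = cap.read()
    if not ret:
        continue
    frame_count += 1
    if frame_count % PROCESS_EVERY != 0:
        continue           # skip this frame to save processing time
    # ... face detection / emotion recognition on 'frame' goes here ...
    if cv2.waitKey(1) == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()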
OVERCOMING PROBLEMS:-

As technology improves, higher-definition cameras will become available. Computer networks will be able to move more
data, and processors will work faster. Facial-recognition algorithms will be better able to pick out faces from an image and
recognize them in a database of enrolled individuals. The simple mechanisms that defeat today’s algorithms, such as obscuring
parts of the face with sunglasses and masks or changing one’s hairstyle, will be easily overcome.
An immediate way to overcome many of these limitations is to change how images are captured. Using checkpoints, for
example, requires subjects to line up and funnel through a single point. Cameras can then focus on each person closely, yielding
far more useful frontal, higher-resolution probe images. However, wide-scale implementation increases the number of cameras
required.

Evolving biometrics applications are promising. They include not only facial recognition but also gestures, expressions, gait
and vascular patterns, as well as iris, retina, palm print, ear print, voice recognition and scent signatures. A combination of
modalities is superior because it improves a system’s capacity to produce results with a higher degree of confidence. Associated
efforts focus on improving capabilities to collect information from a distance where the target is passive and often unknowing.
Clearly, privacy concerns surround this technology and its use. Finding a balance between national security and individuals’
privacy rights will be the subject of increasing discussion, especially as technology progresses.
CONCLUSION:-
 
 
 
Thus we conclude that successfully implementing this project would be a great achievement for us as a team. There is scope for variations and advancements in the project in the future.
 
 
In a companion disclosure we describe an enhanced face model derived from active appearance model (AAM) techniques, which employs a differential spatial subspace to provide an enhanced real-time depth map. Employing techniques from advanced AAM face model generation and the information available from an enhanced depth map, we can generate a real-time 3D face model.

The next step, based on the 3D face model, is to generate a 3D avatar that can mimic the face of a user in real time. We are currently exploring various approaches to implement such a system using our real-time stereoscopic imaging system.
 
 
We would like to thank all the faculty of our department, our mentors, our parents, and everyone else who has been a part of this project.
THANK YOU
