HUMAN FACE DETECTION
Face recognition technology has become part of our daily lives with the
growing demand for personal authentication in apps and web services.
It is a popular method for automatically verifying a person by matching a
digital image of their face against a database of known images.
A branch of biometrics used to identify users, face recognition helps prevent
misuse or unauthorized use of services and information, countering a growing
number of cyber crimes such as credit card fraud, computer hacking, and
security breaches in organizations.
Used in a wide range of applications in HCI (Human-Computer Interaction),
security, robotics, entertainment, and games, facial recognition brings
significant advantages to companies and end users, helping them enhance
security and track down intruders.
It can replace other security methods such as passwords, PINs, smart cards,
plastic cards, tokens, and keys. Passwords and PINs are hard to remember,
while cards, tokens, and keys can be lost, forgotten, stolen, or duplicated.
Here are the steps involved in human face detection, a deep learning project:
Step 1: Data Collection
1. Gather datasets of images containing human faces (e.g., FDDB, WIDER FACE).
2. Ensure diversity in face orientations, lighting conditions, and backgrounds.
3. Collect around 10,000-50,000 images.
Step 2: Data Preprocessing
1. Resize images to uniform size (e.g., 224x224).
2. Normalize pixel values (e.g., 0-1 or -1 to 1).
3. Apply data augmentation (rotation, flipping, scaling).
4. Split data into training (80%), validation (10%), and testing (10%) sets (see the sketch after this list).
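A minimal preprocessing sketch for the steps above, assuming the images are already loaded as a list of BGR arrays; the random placeholder data and variable names here are illustrative, and the split follows the 80/10/10 scheme:

import numpy as np
import cv2
from sklearn.model_selection import train_test_split

def preprocess(images):
    # Resize every image to 224x224 and normalize pixels to [0, 1].
    resized = [cv2.resize(img, (224, 224)) for img in images]
    return np.array(resized, dtype=np.float32) / 255.0

# Placeholder data standing in for a real face dataset.
images = [np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8) for _ in range(100)]
labels = np.random.randint(0, 2, 100)

X = preprocess(images)
flipped = cv2.flip(X[0], 1)  # horizontal flip, a simple augmentation

# 80/10/10 split: carve off 20%, then halve it into validation and test.
X_train, X_tmp, y_train, y_tmp = train_test_split(X, labels, test_size=0.2, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=42)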
Step 3: Model Selection
1. Choose a deep learning architecture (e.g., CNN, YOLO, SSD, Faster R-CNN).
2. Consider using pre-trained models (e.g., VGGFace, ResNet50).
Step 4: Model Training
1. Define loss function (e.g., cross-entropy, mean squared error).
2. Set hyperparameters (learning rate, batch size, epochs).
3. Train the model using the training set.
4. Monitor performance on the validation set (see the sketch below).
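A minimal training sketch, assuming a compiled Keras model whose input shape matches the preprocessed images (like the CNN built later in this post) and the X_train/X_val arrays from the preprocessing step; the hyperparameter values are illustrative:

history = model.fit(
    X_train, y_train,
    validation_data=(X_val, y_val),
    epochs=20,       # illustrative; tune against the validation set
    batch_size=32,
)
# history.history records per-epoch loss and accuracy for both splits,
# which is what you watch to catch overfitting early.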
Step 5: Model Evaluation
1. Evaluate model on testing set.
2. Metrics: precision, recall, AP (average precision), and IoU (intersection over union); a small IoU helper is sketched after this list.
3. Compare with state-of-the-art models.
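IoU measures the overlap between a predicted box and a ground-truth box. A minimal sketch, assuming boxes in the (x, y, w, h) format that detectMultiScale returns:

def iou(box_a, box_b):
    # Intersection over union for boxes given as (x, y, w, h).
    xa, ya, wa, ha = box_a
    xb, yb, wb, hb = box_b
    x1, y1 = max(xa, xb), max(ya, yb)                      # intersection top-left
    x2, y2 = min(xa + wa, xb + wb), min(ya + ha, yb + hb)  # intersection bottom-right
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    union = wa * ha + wb * hb - inter
    return inter / union if union else 0.0

print(iou((0, 0, 100, 100), (50, 50, 100, 100)))  # ~0.143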
Tools and Frameworks:
1. TensorFlow
2. Keras
3. OpenCV
import cv2
- Computer vision library for image and video processing.
- Functions for image filtering, thresholding, edge detection, feature
extraction, and more.
- Image Processing
1. cv2.imread(): Read an image from file.
2. cv2.imshow(): Display an image.
3. cv2.imwrite(): Save an image to file.
4. cv2.resize(): Resize an image.
5. cv2.cvtColor(): Convert image color space.
- Object Detection
1. cv2.CascadeClassifier.detectMultiScale(): Detect objects using Haar cascades.
2. cv2.dnn.readNet(): Load deep learning models.
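A quick usage example tying these calls together (the file names are placeholders):

import cv2

img = cv2.imread('input.jpg')                   # read from disk (BGR)
small = cv2.resize(img, (224, 224))             # resize to 224x224
gray = cv2.cvtColor(small, cv2.COLOR_BGR2GRAY)  # convert BGR to grayscale
cv2.imwrite('output.jpg', gray)                 # save the result
cv2.imshow('preview', gray)                     # display in a window
cv2.waitKey(0)                                  # wait for a key press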
import numpy as np
- Library for efficient numerical computation.
- Supports multi-dimensional arrays and matrices.
- Essential for scientific computing and data analysis.
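For example:

import numpy as np

a = np.zeros((64, 64, 3), dtype=np.float32)  # a blank 64x64, 3-channel image
print(a.shape, a.mean())                     # (64, 64, 3) 0.0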
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
- Importing the Sequential model and essential layers from TensorFlow Keras for building a CNN.
Layers:
1. Sequential(): A linear stack of layers.
2. Conv2D(): 2D convolutional layer for image feature extraction.
3. MaxPooling2D(): 2D pooling layer for downsampling.
4. Flatten(): Layer to flatten input data for dense layers.
5. Dense(): Fully connected (dense) layer for classification and regression.
face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
- Loading a pre-trained Haar cascade classifier for detecting frontal faces.
1. cv2.CascadeClassifier(): Creates a cascade classifier object.
2. cv2.data.haarcascades: Path to the pre-trained Haar cascade files bundled with OpenCV.
3. + 'haarcascade_frontalface_default.xml': Specific file for frontal face
detection.
Haar Cascade Classifiers:
1. Trained on positive (face) and negative (non-face) images.
2. Uses AdaBoost algorithm to select features.
3. Features are Haar-like features (edge, line, and rectangle patterns).
model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(64, 64, 3)))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
Layer Explanation:
1. Conv2D:
- 32 filters to detect features
- 3x3 kernel size for local connectivity
- ReLU activation for non-linearity
- Input shape: 64x64x3 (RGB images)
2. MaxPooling2D:
- Pool size 2x2 to reduce spatial dimensions
- Stride 2x2 to move pooling window
3. Conv2D (second layer):
- 64 filters to detect more complex features
- 3x3 kernel size
- ReLU activation
4. MaxPooling2D (second layer):
- Pool size 2x2
- Stride 2x2
5. Flatten:
- Flatten output to 1D for dense layers
Output Shapes (batch dimension shown as None):
1. Conv2D: (None, 62, 62, 32)
2. MaxPooling2D: (None, 31, 31, 32)
3. Conv2D: (None, 29, 29, 64)
4. MaxPooling2D: (None, 14, 14, 64)
5. Flatten: (None, 12544)
Each valid 3x3 convolution shrinks height and width by 2 (64 to 62, 31 to 29), and each 2x2 pooling halves them (62 to 31, 29 to 14, rounding down); 14 x 14 x 64 = 12544.
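You can confirm these shapes by printing the model summary:

model.summary()  # lists each layer with its output shape and parameter count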
model.add(Dense(128, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
1. Dense (128 units, ReLU activation):
- 128 units (neurons) for complex feature learning
- ReLU activation for non-linearity
- Output shape: (None, 128)
2. Dense (1 unit, sigmoid activation):
- 1 unit (neuron) for binary classification (0/1, yes/no)
- Sigmoid activation for probability output (0 to 1)
- Output shape: (None, 1)
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
1. Optimizer: Adam (adaptive learning rate, momentum)
2. Loss Function: Binary Cross-Entropy (BCE)
3. Metrics: Accuracy
Binary Cross-Entropy Loss:
1. Suitable for binary classification problems
2. Measures the difference between predicted probabilities and true labels (see the worked example below).
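For intuition, a worked example of BCE for a single prediction, where loss = -[y*log(p) + (1-y)*log(1-p)]:

import numpy as np

def bce(y_true, p_pred):
    # Binary cross-entropy for one example.
    return -(y_true * np.log(p_pred) + (1 - y_true) * np.log(1 - p_pred))

print(bce(1, 0.9))  # ~0.105: confident and correct, small loss
print(bce(1, 0.1))  # ~2.303: confident and wrong, large loss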
img = cv2.imread(img_path)
- Reads an image from the specified path using OpenCV's imread function.
- Returns a BGR (Blue, Green, Red) color image.
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
- Converts the BGR image to grayscale using OpenCV's cvtColor function.
- COLOR_BGR2GRAY is a flag specifying the conversion.
faces = face_cascade.detectMultiScale(gray, 1.3, 5)
1. face_cascade: Pre-trained Haar cascade classifier for face detection.
2. gray: Grayscale image.
3. 1.3: scaleFactor - the image is downscaled by 30% at each detection scale.
4. 5: minNeighbors - minimum number of neighboring rectangles to consider.
cv2.rectangle(img, (x, y), (x+w, y+h), (255, 0, 0), 2)
1. img: Input image.
2. (x, y): Top-left corner coordinates.
3. (x+w, y+h): Bottom-right corner coordinates.
4. (255, 0, 0): Blue color (OpenCV uses BGR channel order).
5. 2: Rectangle thickness.
face_region = img[y:y+h, x:x+w]
1. img: Input image.
2. y:y+h: Row range (vertical slice).
3. x:x+w: Column range (horizontal slice).
face_resized = cv2.resize(face_region, (64, 64))
1. face_region: Input face region.
2. (64, 64): New dimensions.
face_resized = face_resized / 255.0
- Divides pixel values by 255 to normalize them between 0 and 1.
face_resized = np.reshape(face_resized, (1, 64, 64, 3))
Dimensions:
1. 1: Batch size (single image)
2. 64: Height
3. 64: Width
4. 3: Color channels (BGR)
img_path = '/content/[Link]'
[Figures omitted: input image and output image]
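Putting the fragments above together, a minimal end-to-end sketch (the image paths are placeholders):

import cv2
import numpy as np

# Load the pre-trained frontal-face Haar cascade shipped with OpenCV.
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')

img = cv2.imread('/content/input.jpg')  # placeholder path
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
faces = face_cascade.detectMultiScale(gray, 1.3, 5)

for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (255, 0, 0), 2)  # box the face
    face = cv2.resize(img[y:y+h, x:x+w], (64, 64)) / 255.0      # crop + normalize
    face = np.reshape(face, (1, 64, 64, 3))                     # add batch dimension
    # pred = model.predict(face)  # classify the crop with the CNN built above

cv2.imwrite('/content/output.jpg', img)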