Image Sorting Using Object Detection and Face Recognition
ISSN No:-2456-2165
Abstract:- Users generate large amounts of multimedia data, and images are among the most commonly used file types. The average user does not bother to organize these images. This application organizes images based on the objects and the faces of the people present in them. It uses object detection, face detection, and face recognition to sort the images into their respective directories. Object detection uses the YOLO algorithm, face detection uses Haar Cascade, and face recognition uses the LBPH algorithm. Using these algorithms, images are categorized into directories.

Keywords:- Object Detection, Face Detection, Face Recognition, YOLO, Haar cascade, LBPH.

I. INTRODUCTION

Images are among the most widely used multimedia files. Since smartphones also serve as cameras, everyone takes pictures, and trillions of images are stored. Because this trend is not going to slow down, the number of digital images continues to grow. By the end of 2017, 1.2 trillion images were generated, and 4.7 trillion photos will be stored [10]. The average iOS user takes 182 photos per month, whereas the average Android user takes 111 [9]. Because there are so many photos, the average user does not bother to organize them; doing so takes a considerable amount of time.

This application helps you quickly organize unsorted images. The basic idea is to give images as input to the program, which detects the objects or faces in them. Each image is then moved or copied to the folder for its respective class. This allows you to rapidly go through and organize large numbers of pictures.

The input image is fed to the application, which checks whether a face is present. If a face is detected in the image, face recognition is applied to it. The "Haar-Cascade algorithm" is used for face detection. After the face is recognized, the application creates a directory named after the given class label; the "Local Binary Patterns Histogram" (LBPH) algorithm is used for face recognition. If no face is found, object detection is applied to the image using the YOLO algorithm. If any object is detected in the image, the image is moved to the directory named after the class label.

Additionally, the search-by-face option allows users to search the directory for images with similar faces. All images containing the same face are displayed.

II. LITERATURE REVIEW

In this section, we review three research papers related to our system to understand their weaknesses and how we can overcome them.

A. Object Recognition in Images
Object recognition is a field of study in image processing. The process of identifying objects in a video or an image is termed object recognition. It has a large number of applications in activity recognition, robot localization, automation, etc. Objects appear different when seen from different perspectives, so recognition should be robust and invariant to changed viewpoints, occlusion, and object transformations. The technique mainly comprises two stages. In the first stage, the input image is categorized using a classifier. In this paper [1], two types of classifiers are used for classifier optimization: "k-nearest neighbour (kNN)" and "Support Vector Machine (SVM)". The SVM classifier uses GIST features and the kNN classifier uses SIFT features. Various kernels are used in the SVM, such as Gaussian, linear, and polynomial. Feature extraction takes place, forming a similarity matrix, which is given to the kNN classifier. The comparison shows that the SVM classifier is more accurate than the kNN classifier. Coil-20P and Eth80 are the datasets used for the processing.
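The sorting flow outlined in the Introduction (face recognition first, object detection as a fallback, then a move or copy into a directory named after the class label) can be sketched as follows. This is only an illustrative outline, not the application's actual code: `sort_image`, `recognize_face`, and `detect_object` are hypothetical names, and the two callables stand in for the Haar Cascade + LBPH stage and the YOLO stage respectively.

```python
import shutil
from pathlib import Path
from typing import Callable, Optional

# Hypothetical signature: takes an image path, returns a class label or None.
Labeler = Callable[[Path], Optional[str]]


def sort_image(image_path, out_root, recognize_face: Labeler,
               detect_object: Labeler, copy: bool = True) -> Optional[Path]:
    """Place one image in a directory named after its class label.

    Returns the destination path, or None if neither stage produced a label.
    """
    image_path = Path(image_path)

    # Face branch first (stand-in for Haar Cascade detection + LBPH recognition).
    label = recognize_face(image_path)
    if label is None:
        # No face found: fall back to object detection (stand-in for YOLO).
        label = detect_object(image_path)
    if label is None:
        return None  # nothing recognized, leave the image where it is

    # Create the class-label directory on demand.
    target_dir = Path(out_root) / label
    target_dir.mkdir(parents=True, exist_ok=True)
    dest = target_dir / image_path.name

    if copy:
        shutil.copy2(image_path, dest)
    else:
        shutil.move(str(image_path), str(dest))
    return dest
```

Passing the two stages in as callables keeps the file-handling logic independent of any particular detector, so the same function works whether the labels come from a real model or from a stub during testing.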
Find the best threshold value that classifies the image as positive or negative.

Select the features with the minimum error rate.

1) Radius: Usually set to 1 pixel.
2) Neighbors: Usually set to 8 pixels.
3) Grid X: Usually set to 8 cells.
4) Grid Y: Usually set to 8 cells.
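The four parameters listed above (radius, neighbors, grid X, grid Y) are those of the LBPH face recognizer. To show what they control, here is a minimal pure-Python sketch of a local-binary-pattern histogram with the radius fixed at 1 and 8 neighbors. It is an illustration under stated assumptions, not the recognizer used by the application: the function names, the clockwise neighbor ordering, and the `>=` comparison are choices made for this example.

```python
from typing import List

Image = List[List[int]]  # grayscale image as rows of pixel intensities

# The 8 neighbours at radius 1, clockwise from the top-left pixel.
# The ordering is a convention assumed for this sketch.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
           (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img: Image, r: int, c: int) -> int:
    """8-bit code: one bit per neighbour, set when neighbour >= centre."""
    center = img[r][c]
    code = 0
    for dr, dc in OFFSETS:
        code = (code << 1) | (1 if img[r + dr][c + dc] >= center else 0)
    return code


def lbp_image(img: Image) -> List[List[int]]:
    """LBP codes for all interior pixels (the 1-pixel border is skipped)."""
    h, w = len(img), len(img[0])
    return [[lbp_code(img, r, c) for c in range(1, w - 1)]
            for r in range(1, h - 1)]


def lbph(img: Image, grid_x: int = 8, grid_y: int = 8) -> List[int]:
    """Concatenate one 256-bin histogram per grid cell into a feature vector."""
    codes = lbp_image(img)
    h, w = len(codes), len(codes[0])
    features: List[int] = []
    for gy in range(grid_y):
        for gx in range(grid_x):
            hist = [0] * 256
            for r in range(gy * h // grid_y, (gy + 1) * h // grid_y):
                for c in range(gx * w // grid_x, (gx + 1) * w // grid_x):
                    hist[codes[r][c]] += 1
            features.extend(hist)
    return features
```

With the usual 8 x 8 grid, the feature vector has 8 * 8 * 256 entries; two faces can then be compared by measuring the distance between their vectors.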
1) Crop the image to equal width and height, e.g. 416 x 416.
2) Divide the image into an S x S grid.
3) Apply the Convolutional Neural Network.
4) Calculate the bounding-box predictions with confidence scores and the class-probability predictions.
5) Apply non-max suppression to the bounding boxes.
6) Obtain the final image with the bounding boxes, class labels, and confidence scores.

Prediction for Bounding box:

– X, Y = Coordinates representing the center of the box relative to the bounds of the grid cell.
– W, H = Width and height predicted relative to the whole image.
– Confidence: Represents the IoU between the predicted box and the ground truth box. Confidence is given as:

\[ \text{Confidence} = \Pr(\text{Object}) \times \text{IoU} \]

where,
IoU = Intersection over Union.
IoU = Highest bounding box

Prediction of class probabilities:

– Pr = Conditional class probability.
– Class-specific confidence score is given as:

\[ \Pr(\text{class}_i) \times \text{IoU} = \Pr(\text{class}_i \mid \text{Object}) \times (\text{Predicted Confidence}) \]
\[ \Pr(\text{class}_i) \times \text{IoU} = \Pr(\text{class}_i \mid \text{Object}) \times \Pr(\text{Object}) \times \text{IoU} \]
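The quantities above (IoU, the class-specific confidence score, and the non-max suppression step) can be illustrated with a short sketch. It is a minimal example under stated assumptions rather than the detector's actual post-processing: boxes are taken in corner format (x1, y1, x2, y2) instead of the center/width/height format described above, the function names are hypothetical, and the 0.5 IoU threshold is an arbitrary choice for illustration.

```python
from typing import List, Tuple

# Assumed corner format: (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def class_score(p_class_given_object: float, box_confidence: float) -> float:
    """Class-specific score: Pr(class_i | Object) * predicted confidence."""
    return p_class_given_object * box_confidence


def non_max_suppression(boxes: List[Box], scores: List[float],
                        iou_threshold: float = 0.5) -> List[int]:
    """Return indices of the boxes kept, highest score first.

    A box is dropped when it overlaps an already kept, higher-scoring box
    by more than iou_threshold.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep
```

In practice the suppression is usually run separately for each class, using the class-specific scores to rank the boxes.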