
A Project Report on Object Recognition Using Template Matching

Object Recognition using Template Matching


B.Tech (Electronics and Telecommunication)
By

Aaruni Bhugul Pulkit Khandelwal Sanyam Mehndiratta


Under the Guidance of

Prof. Aniket Kulkarni

Department of Electronics and Telecommunication Engineering, SVKM's NMIMS, Mukesh Patel School of Technology Management and Engineering
MAY 2014

CERTIFICATE
This is to certify that this B.Tech (Telecommunication) project report titled "Object Recognition Using Template Matching", prepared under the guidance of Prof. Aniket Kulkarni, is approved by me. It is further certified that, to the best of my knowledge, the report represents work carried out by the students at MPSTME, SVKM's NMIMS, Shirpur campus during the academic year 2013-14 (Project Stage).

Date: Place: Shirpur

________________ (Prof. Aniket Kulkarni) Project Mentor

________________ (Prof. Shashikant S. Patil) Head, EXTC Dept.

Internal Examiner: 1. _______________

External Examiner: 1. ______________

2. ________________

2. _______________

_______________ (Dr. M. V. Deshpande) Associate Dean



Acknowledgement

Apart from our own efforts, the success of this project depends largely on the encouragement and guidance of many others. We take this opportunity to express our gratitude to the people who have been instrumental in its successful completion. We would like to show our greatest appreciation to Prof. Aniket Kulkarni, our project guide, Prof. Shashikant Patil, Head of the EXTC Department, and Dr. M. V. Deshpande, Associate Dean. We cannot thank them enough for their tremendous support and help; without their encouragement and guidance this project would not have materialized. We are also grateful for the guidance, support and help received from the other students who have contributed to this work.

Names of the students: Aaruni Bhugul (702), Pulkit Khandelwal (725), Sanyam Mehndiratta (732)


Table of Contents

Acknowledgement
Abstract
1. Objective
   1.1 Objective
   1.2 Motivation & Justification
2. Introduction
   2.1 General Framework
3. Literature Survey
4. Theoretical Background
   4.1 OpenCV
       4.1.1 Programming Language
       4.1.2 OS Support
   4.2 Background Subtraction
   4.3 Gaussian Filtering
   4.4 Object Detection Approach
   4.5 Motion Tracking and Occlusion Handling
5. Proposed Technique
   5.1 Sliding Window Object Localization
   5.2 Template Matching Methods
       5.2.1 Template Matching by Cross Correlation
       5.2.2 Normalized Cross Correlation
6. Work Done
7. Results
8. Future Work
   8.1 Human Action Analysis
   8.2 Hardware Implementation
   8.3 Modification of Algorithm for PTZ Camera & Multiple Cameras
References

List of Figures

Figure 2.1: General Framework
Figure 4.1.2: OpenCV framework
Figure 4.3: Gaussian Filtering
Figure 5.1(a): Sliding window object localization
Figure 5.1(b): Sliding template image over source image
Figure 5.2(b): Resultant image with maximum match
Figure 7.1(a): Sample 1 original image
Figure 7.1(b): Sample 1 template image 1
Figure 7.1(c): Sample 1 template image 2
Figure 7.1(d): Sample 1 image after applying algorithm
Figure 7.1(e): Sample 1 resultant window 1
Figure 7.1(f): Sample 1 resultant window 2
Figure 7.2(a): Sample 2 original image
Figure 7.2(b): Sample 2 template image 1
Figure 7.2(c): Sample 2 template image 2
Figure 7.2(d): Sample 2 image after applying algorithm
Figure 7.2(e): Sample 2 resultant window 1
Figure 7.2(f): Sample 2 resultant window 2
Figure 7.3(a): Sample 3 original image
Figure 7.3(b): Sample 3 template image 1
Figure 7.3(c): Sample 3 template image 2
Figure 7.3(d): Sample 3 image after applying algorithm
Figure 7.3(e): Sample 3 resultant window 1
Figure 7.3(f): Sample 3 resultant window 2
Figure 7.4(a): Sample 4 original image
Figure 7.4(b): Sample 4 template image 1
Figure 7.4(c): Sample 4 template image 2
Figure 7.4(d): Sample 4 image after applying algorithm
Figure 7.4(e): Sample 4 resultant window 1
Figure 7.4(f): Sample 4 resultant window 2

Abstract

A computer vision system has been developed for real-time motion detection and human motion tracking of 3-D objects, including those with variable internal parameters. A fast algorithm based on several template matching measures, such as the correlation matrix, the absolute difference matrix and their normalized forms, has been implemented along with a template updating technique using a sliding window object localization approach to track the motion of a detected body in the surveillance video. A fast algorithm based on a color-based differentiation technique is also implemented, which tracks the moving object on the basis of its dominant color. Furthermore, a data structure based algorithm has been proposed to reject the non-useful areas of the binary image formed after the various filtering stages. The algorithms implemented provide accurate results for human surveillance. The methods allow for larger frame-to-frame motion and can robustly track models with many degrees of freedom while running on relatively inexpensive hardware. They provide a reasonable compromise between the simplicity of parameterization and the expressive power needed for subsequent scene understanding. A proposed application of the algorithms implemented in this report is human motion analysis in visual surveillance, where the path of a person is required.


Chapter 1 Objective

1.1 Objective

To develop an automated object detection system for analyzing the motion of a target object in a video stream from video surveillance.

1.2 Motivation & Justification

Object detection, path tracking and action recognition are among the most active areas of research in computer vision and image processing. Traditional surveillance systems require human operators to continuously monitor several incoming video feeds. Surveillance cameras are already prevalent in commercial establishments, and camera outputs are usually recorded on tape or stored in video archives. Such systems are prone to human error, which is why an automated intelligent system is needed to detect, classify and track human motion. The major concern is to detect the required object or person in a video, which is essential in most real-life applications such as robotics and defence.

The areas where object detection and human motion analysis systems can be used are:

1. Surveillance and monitoring of people to ensure that they stay within the norms.
2. Military and police surveillance.
3. Robotics, where path tracing and motion analysis are required.
4. Educational and manufacturing industries.


Chapter 2 Introduction
Object detection, classification and tracking are important tasks within the field of computer vision. There are three key steps in video analysis:

1) Detection of moving objects of interest.

2) Tracking of such objects from frame to frame.

3) Analysis of the tracked objects to recognize their behavior.

Object detection in video streams has been a popular topic in the field of computer vision. Tracking is a particularly important issue in human motion analysis since it serves as a means to prepare data for pose estimation and action recognition. In contrast to human detection, human tracking is a higher-level computer vision problem. However, the tracking algorithms used within human motion analysis usually have considerable overlap with motion segmentation during processing.

As one of the most active research areas in computer vision, visual analysis of human motion attempts to detect, track and identify people and, more generally, to interpret human behavior from image sequences involving humans. Human motion analysis has attracted great interest from computer vision researchers due to its promising applications in many areas such as visual surveillance, perceptual user interfaces, content-based image storage and retrieval, video conferencing, athletic performance analysis and virtual reality.

Videos are actually sequences of images, each of which is called a frame, displayed at a frequency high enough that human eyes can perceive the continuity of the content. It is obvious that all image processing techniques can be applied to individual frames. Besides, the contents of two consecutive frames are usually closely related. This project is implemented on OpenCV (Open Source Computer Vision Library), an open-source C++ library for image processing and computer vision originally designed by Intel.
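As a minimal sketch of this frame-by-frame view of a video, the loop below reads and displays frames using OpenCV's C++ API; the file name is only a placeholder, not the actual input used in this project.

```cpp
#include <opencv2/opencv.hpp>

int main() {
    // Open a video file; "surveillance.avi" is only a placeholder name.
    cv::VideoCapture cap("surveillance.avi");
    if (!cap.isOpened()) return -1;

    cv::Mat frame;
    while (cap.read(frame)) {          // each iteration yields one frame (an image)
        cv::imshow("frame", frame);    // consecutive frames are usually closely related
        if (cv::waitKey(30) >= 0) break;
    }
    return 0;
}
```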

2.1 General Framework

Figure 2.1: General Framework


Chapter 3 Literature Survey

A general framework [9] for Object detection analysis involves stages such as motion detection with the help of background subtraction and foreground segmentation, object classification, and motion tracking.

3.1 Object Motion Tracking


Wang [10] classifies object motion analysis into three parts, namely object detection, object tracking and object behavior understanding. The importance and popularity of object motion analysis has led to several previous surveys. Each such survey is discussed in the following in order to put the current review in context.

These surveys focused on three major areas related to interpreting human motion: (a) motion analysis involving human body parts, (b) tracking a moving human from a single view or multiple camera perspectives, and (c) recognizing human activities from image sequences.

Collins et al. [13] classified moving object blobs into four classes, namely single human, vehicle, human group and clutter, using two factors: area and a shape factor.

Bo Wu and Ram Nevatia [14] proposed an approach to automatically track multiple, possibly partially occluded humans in a walking or standing pose from a single camera, which may be stationary or moving. A human body is represented as an assembly of body parts. Part detectors are learned by boosting a number of weak classifiers based on edgelet features. Responses of the part detectors are combined to form a joint likelihood model that includes an analysis of possible occlusions. The combined detection responses and the part detection responses provide the observations used for tracking. An object is tracked by data association and mean-shift methods. This system can track humans with both inter-object and scene occlusions against static and non-static backgrounds. The method yields good results but at the cost of heavy computation. The paper does not explore the interaction between detection and tracking: the proposed system works sequentially, with tracking taking the results of detection as input. However, tracking can also be used to facilitate detection; one of the most straightforward ways is to speed up detection by restricting the search to the neighborhood predicted by the tracker.

Liang Xiao [15] discusses the two types of image sequences formed by a moving target: one with a static background and the other with a varying background. The former usually arises when the camera is in a relatively static state, producing moving image sequences with a static background, while the latter arises when the camera itself is in relative motion with respect to the moving target. The paper describes two methods of moving target detection, namely temporal differencing and background subtraction, stating that temporal differencing can be used for a static background while background subtraction is used for a changing background. It also discusses optical flow methods but criticizes them for their need for specialized hardware.

Recent years have seen consistent improvements in the automated tracking of pedestrians in visual data. The problem of tracking multiple targets can be viewed as a combination of two intertwined tasks: inference of the presence and locations of targets, and data association to infer the most likely tracks. Research in the analysis of objects in general, and of humans in particular, has often attempted to leverage the parts that the objects are composed of. Indeed, the state of the art in human detection has greatly benefited from explicit and implicit detection of body parts [17]. A model of spatial relationships between detected parts is learned in an online fashion so as to split pedestrian tracklets at points of low confidence.

Chapter 4 Theoretical Background


Object detection is an important computer vision building block. Object tracking in videos involves verifying the presence of an object in image sequences and possibly locating it precisely for recognition. The aim of object tracking is to monitor objects for spatial and temporal changes during a video sequence, including their presence, position, size and shape. This is done by solving the temporal correspondence problem: matching the target region in successive frames of a sequence of images taken at closely spaced time intervals. The two processes are closely related because tracking usually starts with detecting objects, while detecting an object repeatedly in the subsequent image sequence is often necessary to help and verify tracking. Analyzing human motion from video imagery has also recently become viable. In the current trend, besides analyzing any video for information retrieval, analyzing live surveillance videos to detect activities that take place in the coverage area has become ever more important.

4.1 OpenCV
OpenCV (Open Source Computer Vision) is a library of programming functions mainly aimed at real-time computer vision, developed by the Intel Russia research center in Nizhny Novgorod and now supported by Willow Garage and Itseez. It is free for use under the open-source BSD license. The library is cross-platform and focuses mainly on real-time image processing. If the library finds Intel's Integrated Performance Primitives on the system, it will use these proprietary optimized routines to accelerate itself.

4.1.1 Programming language


OpenCV is written in C++ and its primary interface is in C++, but it still retains a less comprehensive though extensive older C interface. There are now full interfaces in Python, Java and MATLAB/Octave (as of version 2.5). The API for these interfaces can be found in the online documentation. Wrappers in other languages such as C#, Ch and Ruby have been developed to encourage adoption by a wider audience. All new developments and algorithms in OpenCV are now developed in the C++ interface. A CUDA-based GPU interface has been in progress since September 2010, and an OpenCL-based GPU interface has been in progress since October 2012, documented from version 2.4.5 onwards.

4.1.2 OS support
OpenCV runs on Windows, Android, Maemo, FreeBSD, OpenBSD, iOS, BlackBerry 10, Linux and OS X. The user can get official releases from SourceForge or take the current snapshot under SVN from there. OpenCV uses CMake as its build system.

Figure 4.1.2: openFrameworks running the OpenCV add-on example.


4.2 Background Subtraction


One of the most common methods of preprocessing an image for object detection is background subtraction. It is achieved by building a representation of the scene, called a background model, and then finding deviations from this model for each incoming frame. A significant difference between an image region and the background model signifies a foreground object. The drawback of this approach is its sensitivity to dynamic scene changes due to lighting and extraneous events.
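A minimal sketch of this idea is shown below, assuming the simplest possible background model (a single stored reference frame) and an illustrative grey-level threshold; the project itself may use a different model or threshold.

```cpp
#include <opencv2/opencv.hpp>

// Returns a binary foreground mask: pixels that deviate from the background
// model by more than 'thresh' grey levels are marked as foreground.
cv::Mat subtractBackground(const cv::Mat& frame, const cv::Mat& background,
                           double thresh = 30.0) {
    cv::Mat diff, gray, mask;
    cv::absdiff(frame, background, diff);             // deviation from the background model
    cv::cvtColor(diff, gray, cv::COLOR_BGR2GRAY);     // collapse to a single channel
    cv::threshold(gray, mask, thresh, 255, cv::THRESH_BINARY);
    return mask;                                      // white = candidate foreground object
}
```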

4.3 Gaussian Filtering


Gaussian filtering is used to blur images and remove noise and detail. Gaussian filters have the property of producing no overshoot in response to a step input while minimizing the rise and fall time. This behavior is closely connected to the fact that the Gaussian filter has the minimum possible group delay. It is considered the ideal time domain filter, just as the sinc is the ideal frequency domain filter. Gaussian filtering is useful in at least two different contexts in digital signal processing. One context is low-pass filtering, where we typically wish to attenuate high-frequency noise. For example, when detecting edges or computing the orientation of features in digital images, we might compute partial derivatives of image sample values. Because derivatives amplify high-frequency (often noisy) components of signals, we might also apply a low-pass filter before or after computing those derivatives. The figures below illustrate the effect of smoothing with successively larger Gaussian filters.
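In OpenCV this smoothing is a single call. The sketch below uses the σ = 2.0, 9×9 kernel of Figure 4.3(b); the wrapper function and variable names are illustrative, not taken from the project code.

```cpp
#include <opencv2/opencv.hpp>

// Smooth a frame before further processing; derivatives computed on the
// blurred image are far less sensitive to high-frequency noise.
cv::Mat smooth(const cv::Mat& frame) {
    cv::Mat blurred;
    cv::GaussianBlur(frame, blurred, cv::Size(9, 9), 2.0);  // 9x9 kernel, sigma = 2.0
    return blurred;
}
```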


Figure 4.3(a): Original image

Figure 4.3(b): The effect of filtering with a Gaussian of σ = 2.0 (kernel size 9×9).

Figure 4.3(c): The effect of filtering with a Gaussian of σ = 4.0 (kernel size 15×15).


4.4 Object detection approach


Detection over time typically involves matching objects in consecutive frames using features such as points, lines or blobs. That is to say, detection may be considered to be equivalent to establishing coherent relations of image features between frames with respect to position, color, velocity etc.

Tracking can be divided into various categories according to different criteria. As far as the tracked objects are concerned, tracking may be classified into tracking of human body parts, such as the hand, face and legs, and tracking of the whole body. Tracking can also be grouped according to other criteria such as the dimension of the tracking space (2-D vs. 3-D), the tracking environment (indoors vs. outdoors), the number of tracked humans (single human, multiple humans, human groups), the camera's state (moving vs. stationary), the sensor multiplicity (monocular vs. stereo), etc.


4.4.1 Region based approach


The idea here is to identify a connected region associated with each moving object in an image and then track it over time using a cross-correlation measure. In this approach, a human body was considered as a combination of blobs representing various body parts such as the head, torso and four limbs. Meanwhile, both the human body and the background scene were modeled with Gaussian distributions. Finally, the pixels belonging to the human body were assigned to the different body-part blobs using the log-likelihood measure. Therefore, by tracking each small blob, the moving people could be successfully tracked.

Figure 4.4.1 Region based tracking


4.4.2 Active Contour based approach


Tracking based on active contour models, or snakes [6], aims at directly extracting the shape of the subjects. The idea is to have a representation of the bounding contour of the object and to keep updating it dynamically over time. In contrast to the region-based tracking approach, the advantage of an active contour-based representation is reduced computational complexity. However, it requires a good initial fit: if one could somehow initialize a separate contour for each moving object, then tracking could continue even in the presence of partial occlusion. But initialization is quite difficult, especially for complex articulated objects.

Figure 4.4.2 Active Contour based tracking


4.4.3 Feature based approach


Abandoning the idea of tracking objects as a whole, this tracking method uses sub-features such as distinguishable points or lines on the object to carry out the tracking task. Its benefit is that, even in the presence of partial occlusion, some of the sub-features of the tracked objects remain visible. Feature-based tracking includes feature extraction and feature matching. Low-level features such as points are easier to extract, whereas it is relatively more difficult to track higher-level features such as lines and blobs. So there is usually a trade-off between feature complexity and tracking efficiency.

Figure 4.4.3 Feature based tracking


4.5 Motion tracking and occlusion handling


In instances where the template may not provide a direct match, it may be useful to use eigenspaces: templates that describe the matching object under a number of different conditions, such as varying perspectives, illuminations, color contrasts, or acceptable matching object poses. For example, if the user is looking for a face, the eigenspaces may consist of images (templates) of faces in different positions relative to the camera, in different lighting conditions, or with different expressions. It is also possible for the matching object to be obscured, or occluded, by another object; in these cases, it is unreasonable to provide a multitude of templates to cover each possible occlusion. For example, the search object may be a playing card, and in some of the search images the card is obscured by the fingers of someone holding it, by another card on top of it, or by any other object in front of the camera for that matter. In cases where the object is malleable or poseable, motion also becomes a problem, and problems involving both motion and occlusion become ambiguous. In these cases, one possible solution is to divide the template image into multiple sub-images and perform matching on each subdivision.
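One possible sketch of that last idea is shown below, dividing the template into four quadrants and matching each quadrant independently so that a partially occluded object can still score well on its visible parts. The 2×2 split and the matching metric are illustrative choices, not taken from this project.

```cpp
#include <opencv2/opencv.hpp>
#include <vector>

// Match each quadrant of the template separately and return the best score
// of every quadrant; a partially occluded object still matches on its
// visible quadrants.
std::vector<double> matchSubTemplates(const cv::Mat& image, const cv::Mat& templ) {
    std::vector<double> scores;
    int w = templ.cols / 2, h = templ.rows / 2;
    for (int ty = 0; ty < 2; ++ty) {
        for (int tx = 0; tx < 2; ++tx) {
            cv::Mat sub = templ(cv::Rect(tx * w, ty * h, w, h));   // one quadrant
            cv::Mat result;
            cv::matchTemplate(image, sub, result, cv::TM_CCORR_NORMED);
            double minVal, maxVal;
            cv::minMaxLoc(result, &minVal, &maxVal);
            scores.push_back(maxVal);   // best response of this sub-template
        }
    }
    return scores;
}
```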


Chapter 5 Proposed Technique


In this project, object detection is done using the template matching method. Template matching is a technique for finding areas of an image that match (are similar to) a template image (patch). It is a technique in digital image processing for finding small parts of an image which match a template image. It can be used in manufacturing as part of quality control, as a way to navigate a mobile robot, or as a way to detect edges in images.

The algorithm is implemented in OpenCV, and the approach used for object tracking is as follows:

1. First, a template image is loaded. The template image (T) is the patch image which will be compared to the source image.
2. Next, the video in which detection is to be done is loaded.
3. After loading the video, the matching method is applied to the first frame.
4. The object is then detected in the first frame by drawing a rectangular box around it.
5. Gaussian filters are applied to each consecutive frame of the video.
6. The next objective is to find the object in the image sequence. Foreground detection is done using the sliding window approach followed by template matching, which is described later.


Algorithm flow diagram:

Load Template → Load Video → Read Frame → Apply Matching Method → Localize the Best Match → Create Rectangle → Update Template → Display Resultant Frame → (repeat from Read Frame)
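A condensed sketch of this flow in OpenCV C++ is given below. The file names, the choice of TM_CCORR_NORMED and the window name are illustrative assumptions rather than the project's actual values, and error handling is kept to a minimum.

```cpp
#include <opencv2/opencv.hpp>

int main() {
    cv::Mat templ = cv::imread("template.png");      // load the template image (placeholder name)
    cv::VideoCapture cap("input_video.avi");          // load the video (placeholder name)
    if (templ.empty() || !cap.isOpened()) return -1;

    cv::Mat frame;
    while (cap.read(frame)) {                          // read the next frame
        cv::GaussianBlur(frame, frame, cv::Size(9, 9), 2.0);   // Gaussian filtering

        cv::Mat result;
        cv::matchTemplate(frame, templ, result, cv::TM_CCORR_NORMED);  // apply matching method

        double minVal, maxVal;
        cv::Point minLoc, maxLoc;
        cv::minMaxLoc(result, &minVal, &maxVal, &minLoc, &maxLoc);     // localize the best match

        cv::Rect box(maxLoc, templ.size());            // rectangle around the detected object
        cv::rectangle(frame, box, cv::Scalar(0, 255, 0), 2);

        templ = frame(box).clone();                    // update the template for the next frame
        cv::imshow("tracking", frame);                 // display the resultant frame
        if (cv::waitKey(30) >= 0) break;
    }
    return 0;
}
```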



5.1 Sliding Window Object Localisation


Many different definitions of object localization exist in the literature. Typically, they differ in how the location of an object in the image is represented, e.g. by its centre point, its contour, a bounding box, or a pixel-wise segmentation. In the following we only study localization where the target is to find a bounding box around the object. This is a reasonable compromise between the simplicity of the parameterization and its expressive power for subsequent scene understanding. An additional advantage is that it is much easier to provide ground truth annotation for bounding boxes than, for example, for pixel-wise segmentations.

In sliding-window-based approaches to object detection, sub-images of an input image are tested for whether they contain the object of interest. Potentially, every possible sub-window of an input image might contain the object of interest. However, a VGA image already has 23,507,020,800 possible sub-windows, and the number of possible sub-windows grows as n^4 for images of size n × n. We restrict the search space to a subspace R by employing the following constraints. First, we assume that the object of interest retains its aspect ratio. Furthermore, we introduce margins d_x and d_y between two adjacent sub-windows and set d_x and d_y to 1/10 of the width and height of the original bounding box. In order to search over multiple scales, we use a scaling factor s = 1.2^a, with a taken from a small set of integers (e.g. a ∈ {-10, ..., 10}), applied to the original bounding box of the object of interest. We also consider only sub-windows with a minimum area of 25 pixels. Per scale, the size of the restricted search space is approximately

|R| = \lfloor (n - w)/d_x \rfloor \cdot \lfloor (m - h)/d_y \rfloor,

where w and h denote the size of the initial bounding box and n and m the width and height of the image.
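A sketch of how such a restricted search space could be enumerated is shown below; the margins, scale range and minimum area follow the constraints described above, but the exact values and the function name are assumptions for illustration.

```cpp
#include <opencv2/opencv.hpp>
#include <algorithm>
#include <cmath>
#include <vector>

// Enumerate candidate sub-windows that keep the aspect ratio of the initial
// bounding box, stepping by dx/dy and scanning a small range of scales.
std::vector<cv::Rect> candidateWindows(cv::Size imageSize, cv::Rect initialBox) {
    std::vector<cv::Rect> windows;
    int dx = std::max(1, initialBox.width / 10);    // margin between adjacent windows
    int dy = std::max(1, initialBox.height / 10);

    for (int a = -10; a <= 10; ++a) {               // scales s = 1.2^a
        double s = std::pow(1.2, a);
        int w = static_cast<int>(initialBox.width * s);
        int h = static_cast<int>(initialBox.height * s);
        if (w * h < 25 || w > imageSize.width || h > imageSize.height) continue;

        for (int y = 0; y + h <= imageSize.height; y += dy)
            for (int x = 0; x + w <= imageSize.width; x += dx)
                windows.push_back(cv::Rect(x, y, w, h));
    }
    return windows;
}
```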


For the sliding window we need two primary components:

a. Source image (I): the image in which we expect to find a match to the template image.
b. Template image (T): the patch image which will be compared to the source image.

Our goal is to detect the highest matching area.

Figure 5.1(a): Sliding window object localization


To identify the matching area, we have to compare the template image against the source image by sliding it.

Figure 5.1(b): Sliding template image over source image


By sliding, we mean moving the patch one pixel at a time (left to right, top to bottom). At each location, a metric is calculated that represents how good or bad the match at that location is (or how similar the patch is to that particular area of the source image). For each location of T over I, the metric is stored in the result matrix (R). Each location in R contains the match metric.

Figure 5.2(b): Resultant image with maximum match

The image above is the result R of sliding the patch with the metric TM_CCORR_NORMED. The brightest locations indicate the highest matches. As can be seen, the location marked by the red circle is probably the one with the highest value, so that location (the rectangle formed with that point as a corner and with width and height equal to the patch image) is considered the match. In practice, we use the function minMaxLoc to locate the highest value (or the lowest, depending on the type of matching method) in the R matrix.
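As a small illustration of that last step, the sketch below assumes a result matrix R produced by cv::matchTemplate with TM_CCORR_NORMED (where larger values mean better matches); the function wrapper and variable names are illustrative.

```cpp
#include <opencv2/opencv.hpp>

// Locate the best match in the result matrix R and draw the detection.
void drawBestMatch(cv::Mat& image, const cv::Mat& templ, const cv::Mat& R) {
    double minVal, maxVal;
    cv::Point minLoc, maxLoc;
    cv::minMaxLoc(R, &minVal, &maxVal, &minLoc, &maxLoc);

    // For TM_CCORR_NORMED the maximum is the best match; for the SQDIFF
    // methods the minimum would be used instead.
    cv::Point topLeft = maxLoc;
    cv::rectangle(image, topLeft,
                  cv::Point(topLeft.x + templ.cols, topLeft.y + templ.rows),
                  cv::Scalar(0, 0, 255), 2);
}
```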


5.2 Template Matching Methods


Template matching is a technique for finding areas of an image that match (are similar to) a template image (patch). We need two primary components:

a) Source histogram (I): the histogram of the image in which we expect to find a match to the template image histogram.
b) Template histogram (T): the histogram of the patch image which will be compared against the source image histogram.

The goal is to detect the highest matching area. To identify the matching area, the template image histogram is compared against the source image histogram by sliding it, using the sliding window approach explained in the previous section.

For each location of T over I, the metric is stored in the result matrix (R). We use the following methods [9] for matching:

a. Absolute Sequence Difference method:

R(x,y) = \sum_{x',y'} [T(x',y') - I(x+x', y+y')]^2

b. Normalized Sequence Difference method:

R(x,y) = \frac{\sum_{x',y'} [T(x',y') - I(x+x', y+y')]^2}{\sqrt{\sum_{x',y'} T(x',y')^2 \cdot \sum_{x',y'} I(x+x', y+y')^2}}

c. Absolute Correlation method:

R(x,y) = \sum_{x',y'} T(x',y') \cdot I(x+x', y+y')

d. Normalized Correlation method:

R(x,y) = \frac{\sum_{x',y'} T(x',y') \cdot I(x+x', y+y')}{\sqrt{\sum_{x',y'} T(x',y')^2 \cdot \sum_{x',y'} I(x+x', y+y')^2}}

e. Absolute Coefficient method:

R(x,y) = \sum_{x',y'} T'(x',y') \cdot I'(x+x', y+y')

where T'(x',y') = T(x',y') - \frac{1}{w h}\sum_{x'',y''} T(x'',y'') and I'(x+x', y+y') = I(x+x', y+y') - \frac{1}{w h}\sum_{x'',y''} I(x+x'', y+y''), with w and h the width and height of the template.

f. Normalized Coefficient method:

R(x,y) = \frac{\sum_{x',y'} T'(x',y') \cdot I'(x+x', y+y')}{\sqrt{\sum_{x',y'} T'(x',y')^2 \cdot \sum_{x',y'} I'(x+x', y+y')^2}}

The location with the highest matching probability is then localized, a rectangle is drawn around the area corresponding to the best match, and the object is thereby detected.

5.2.1 Template Matching by Cross Correlation


Correlation is an important tool in image processing, pattern recognition, and other fields. The correlation between two signals (cross correlation) is a standard approach to feature detection [3, 4] as well as a building block for more sophisticated recognition techniques. Textbook presentations of correlation commonly mention the convolution theorem and the attendant possibility of efficiently computing correlation in the frequency domain via the fast Fourier transform. Unfortunately the normalized form of correlation (correlation coefficient) preferred in many applications does not have a correspondingly simple and efficient frequency domain expression, and spatial domain implementation is recommended instead.

Template matching techniques [3] attempt to answer some variation of the following question: does the image contain a specified view of some feature, and if so, where? The use of cross correlation for template matching is motivated by the (squared Euclidean) distance measure between the image and the feature. The resulting correlation term c(u,v) is a measure of the similarity between the image and the feature.

5.2.2 Normalized Cross Correlation


If the image energy f²(x, y) is not constant, however, feature matching by cross correlation can fail. For example, the correlation between the template and an exactly matching region in the image may be less than the correlation between the template and a bright spot. Another drawback of cross correlation is that the range of c(u, v) depends on both the size of the template and the template and image amplitudes. Variation in the image energy under the template can be reduced by high-pass filtering the image before cross correlation. In a transform domain implementation the filtering can conveniently be added to the frequency domain processing, but selection of the cut-off frequency is problematic: a low cut-off may leave significant image energy variations, whereas a high cut-off may remove information useful to the match.

Normalized cross correlation overcomes these difficulties by normalizing the image and template vectors to unit length, yielding a cosine-like correlation coefficient.
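For reference, the normalized cross-correlation coefficient in its commonly used form can be written as

\gamma(u,v) = \frac{\sum_{x,y} [f(x,y) - \bar{f}_{u,v}]\,[t(x-u, y-v) - \bar{t}]}{\sqrt{\sum_{x,y} [f(x,y) - \bar{f}_{u,v}]^2 \; \sum_{x,y} [t(x-u, y-v) - \bar{t}]^2}},

where \bar{t} is the mean of the template and \bar{f}_{u,v} is the mean of f(x, y) in the region under the template. This is the quantity computed by the normalized coefficient method listed in Section 5.2.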


Chapter 6 Work Done


The main aim of this project is to detect the object so that the required object can be tracked. The location of an object in an image can be represented by its center point, its contour, a bounding box, or a pixel-wise segmentation. Here the target is only to find a bounding box around the object. This is a reasonable compromise between the simplicity of the parameterization and its expressive power for subsequent scene understanding. An additional advantage is that it is much easier to provide ground truth annotation for bounding boxes than for pixel-wise segmentations.

In sliding-window-based approaches to object detection, sub-images of an input image are tested for whether they contain the object of interest. Potentially, every possible sub-window of an input image might contain the object of interest.

The template used in a previous iteration soon stops being useful, because with the motion of the moving body the template might not match any area after a few frames have passed.

Moreover, a moving body might change its angle of orientation towards the camera as the next few frames are read.

To overcome these shortcomings, the template update approach comes in quite handy. Whenever the template is matched with a certain area in a frame, the detected area is bounded by a rectangle whose size is the same as the size of the template. This rectangle is then cropped from the frame, and the cropped image becomes the new template in the next iteration. This approach, in which the template is updated at every frame, gives accurate results unless frames are missed or the motion is so rapid that matching the template fails in the very next frame. These conditions are rarely observed in day-to-day footage, so the template matching and update technique tracks the path of a human very accurately.

In the case of multiple human motions, this tracking approach is quite useful as it distinguishes between two blobs directly on the basis of template matching and updating. Various features like orientation, area, color and contrast come into play when template matching is used, as the most similar area obviously gives the minimum difference. This difference is plotted in grey scale and is shown in the results. The color-based approach can be considered a sub-part of this approach, but the reduction in tracking time that it achieves is quite good.


Chapter 7 Results
7.1 Results and Discussions

Template matching & updating


The tracked region based on template matching and updating gives accurate results. The only error occurs when the template is lost in a frame due to rapid motion. The rectangles formed across the faces of the detected humans in the results are exact matches to the faces supplied as templates in the beginning and updated in every frame. Even if the faces are rotated by some angle and the orientation towards the camera changes, the results are not affected, since the templates are updated.

The six approaches to template matching described above give different results in different scenarios: some are more accurate in one setting while others are more accurate in another, so no single method is uniformly best. Four sample results are shown here with the original frame image and the initial templates. The first image is the frame input from the video; the template matching and updating algorithm searches for the templates of the faces provided in the beginning and updated at each frame.


Figure 7.1: Template based detection sample 1

a) Original Image

b) Template 1
c) Template 2

d) Image after applying algorithm



e) Updated template 1

f) Updated template 2

g) Resultant window 1

h) Resultant window 2

Figure 7.2: Template based detection sample 2

a) Original image

b) Template 1
c) Template 2

d) Image after applying algorithm



e) Updated template 1

f) Updated template 2

g) Resultant window 1

h) Resultant window 2

Figure 7.3: Template based detection sample 3

a) Original image

b) Template 1
c) Template 2

d) Image after applying algorithm



e) Updated template 1

f) Updated template 2

g) Resultant window 1

h) Resultant window 2

Figure 7.4: Template based detection sample 4

a) Original image

b) Template 1

c) Template 2

d) Image after applying algorithm



e) Updated template 1

f) Updated template 2

g) Resultant window 1

h) Resultant window 2

Chapter 8 Future Work


8.1 Human Action Analysis
Behavior understanding aims to analyze and recognize human motion patterns and to produce a high-level description of actions and interactions. It may be considered simply as a classification problem over time-varying feature data, i.e., matching an unknown test sequence against a group of labeled reference sequences representing typical human actions.

Different approaches to human behavior analysis will be studied and implemented. Some of these are action recognition, the stick figure model, and 3-D and 2-D contours.

8.2 Hardware Implementation


A complete system can be developed with two core components: a camera and a computer. A complete security system can be simulated by connecting alarms, lights, automated messaging, etc.

8.3 Modification of Algorithm for PTZ camera & multiple cameras


Instead of a normal camera, a PTZ (pan, tilt and zoom) camera can be used. A PTZ camera allows for wider coverage, so that the whole environment can be scanned rather than just a fixed part. Moreover, multiple cameras can be used for 3-D estimation of the area. The algorithm has to be slightly modified for these changes.


References

[1] R. T. Collins, A. J. Lipton, T. Kanade, "Introduction to the special section on video surveillance," IEEE Trans. Pattern Anal. Mach. Intell. 22 (8) (2000) 745-746.

[2] A. R. Francois and G. G. Medioni, "Adaptive colour background modeling for real-time segmentation of video streams," in Proceedings of the International Conference on Imaging Science, Systems, and Technology, 1999, pp. 227-232.

[3] R. O. Duda and P. E. Hart, Pattern Classification and Scene Analysis, New York: Wiley, 1973.

[4] R. C. Gonzalez and R. E. Woods, Digital Image Processing (third edition), Reading, Massachusetts: Addison-Wesley, 1992.

[5] G. R. Bradski and J. Davis, "Motion Segmentation and Pose Recognition with Motion History Gradients," Machine Vision and Applications, 2002.

[6] D. Meyer, J. Denzler, H. Niemann, "Model based extraction of articulated objects in image sequences," Proceedings of the Fourth International Conference on Image Processing, 1997.

[7] R. Brunelli, Template Matching Techniques in Computer Vision: Theory and Practice, Wiley Publishing, 2009.

[8] W. C. Abraham and A. Robins, "Memory retention: the synaptic stability versus plasticity dilemma," Trends in Neurosciences, 28(2):73-78, Feb. 2005.

[9] G. Bradski and A. Kaehler, Learning OpenCV: Computer Vision with the OpenCV Library, O'Reilly Media, 2008.

[10] L. Wang, W. Hu, T. Tan, "Recent developments in human motion analysis," Pattern Recognition 36 (3) (2003) 585-601.

[11] J. K. Aggarwal, Q. Cai, "Human motion analysis: a review," Proceedings of the IEEE Workshop on Motion of Non-Rigid and Articulated Objects, 1997, pp. 90-102.

[13] A. Lipton et al., A System for Video Surveillance and Monitoring, Vol. 2, Pittsburgh: Carnegie Mellon University, the Robotics Institute, 2000.

[14] Bo Wu and Ram Nevatia, "Detection and Tracking of Multiple, Partially Occluded Humans by Bayesian Combination of Edgelet based Part Detectors," Tenth IEEE International Conference on Computer Vision (ICCV 2005), 2005.

[15] Liang Xiao and Tong-qiang Li, "Moving Object Detection and Tracking," 2010.

[16] Sourabh Khire and Jochen Teizer, "Object Detection and Tracking," IEEE International Conference on Information and Automation (ICIA), 2008.

[17] K. Mikolajczyk, C. Schmid, A. Zisserman, "Human detection based on a probabilistic assembly of robust part detectors," in ECCV, 2004.

[18] P. Felzenszwalb, R. Girshick, D. McAllester, D. Ramanan, "Object detection with discriminatively trained part-based models," PAMI 32 (2010) 1627-1645.

[19] L. Bourdev, J. Malik, "Poselets: Body part detectors trained using 3D human pose annotations," in ICCV, 2009.

[20] T. P. Tian, S. Sclaroff, "Fast globally optimal 2D human detection with loopy graph models," in CVPR, 2010.
