Final Year Project Report
CHAPTER 1
INTRODUCTION
With the rapid growth of intelligent traffic systems and road networks, multi-vehicle detection and counting has become an important technique for gathering traffic data and plays a vital role in intelligent traffic management and highway control. With the widespread installation of traffic monitoring cameras, a large database of traffic video can be obtained for analysis.
These traffic management systems not only reduce delays and congestion but also play a significant role in solving major road problems such as identifying accidents and vehicles moving in incorrect lanes, checking that traffic police are properly performing their duty, and presenting traffic flow data. Modern deep learning technologies have strong potential to replace hardware-based systems in a cost-effective manner with less manpower and fewer resources.
While designing such models, one must compare them with previous research on these traffic challenges and understand the accuracy and performance of the various methods under different weather conditions such as heavy rainfall, dusty weather and dense fog. Performance also decreases because of the shadows cast by tall buildings and dense clouds. Keeping all these challenges in mind, an efficient dataset and training algorithm should be selected so that the work contributes to society.
To resolve these difficulties, various models have been put forward that can precisely detect and count the number of vehicles under different conditions, which helps in solving real-time problems in day-to-day life.
1.1 Motivation
Vehicle detection and counting is a major need of modern technology and plays an important role in civilian and military applications. Even though many prediction techniques are available in software engineering, there is still a need for a stable methodology.
With the expansion of road networks and the increase in the number of vehicles such as self-driving cars and electric scooters in recent years, there is a need for modern technologies that can solve traffic problems quickly.
An efficient model is needed for counting vehicles in parking lanes and for collecting tolls and parking charges from vehicles efficiently.
Distractions are very common in traffic videos of urban areas, where vehicles do not follow rules and lanes systematically. This creates confusion when counting vehicles and must be handled to make the model useful in practice.
As a result, the motivation of this project is to propose a solution that effectively detects and counts vehicles by training a dataset with object detection and counting algorithms, increasing both performance and efficiency.
1.2 Problem Statement
Vehicle detection and counting now plays an important role in traffic management and monitoring, and is required for controlling traffic in a cost-effective way with less manpower.
Our problem statement focuses on detecting different kinds of vehicles such as buses, cars, trucks and bikes in a given video and applying a suitable tracking algorithm to count the vehicles in the video frames efficiently.
● The main goal of the project is to detect the different types of vehicles in the given traffic
video using deep learning algorithms.
● To count the different types of vehicles like (car, truck, bus, bike) in the given traffic video
using tracking algorithms.
The scope of the project is that it is useful in verifying the amount collected at tolls, which can be achieved by installing a camera on the roadside. It can also be useful in parking management. We can achieve this scope by:
- Collecting the dataset: Once the dataset is available, it is preprocessed and cleaned. The selection of the right methods is very important, as they improve the overall efficiency of the model.
- Using object detection methods: Once the dataset is preprocessed, an object detection algorithm such as YOLO or SSD is selected and used for training; it helps in classifying the vehicles in the given video.
- Applying the tracking algorithm: After the vehicles are detected by the trained model, a tracking algorithm such as Deep SORT or ORB is applied, which helps in counting the vehicles in the given video. For doing so, certain filters and wrapper methods are applied.
Various experiments were conducted on different datasets, and the outcomes showed clear improvements obtained using object detection and tracking algorithms. Researchers have suggested several methods for vehicle detection and counting in real time.
Yang et al. [12] proposed a vehicle detection method based on background subtraction using a low-rank decomposition technique. It gives favourable results on static scenes, but its performance decreases when the background changes rapidly. The vehicle counting process also remains difficult, and it is important to deal with partial occlusion of objects and variations in the brightness and contrast of the images. In the future, the paper needs to improve the accuracy of object detection.
Abdelwahab et al. [11] proposed a different approach that counts vehicles using an R-CNN detector and tracks them with the KLT (Kanade-Lucas-Tomasi) tracker. Combining these two methods showed better performance on the trained dataset.
Zhe Dai et al. [2] also put forward a vehicle counting framework with three stages: object detection using YOLOv3, object tracking using the KCF algorithm, and trajectory processing using a region encoding method. It achieves an object detection accuracy of 87.6% under heavy traffic and difficult weather conditions.
Adson M. Santos et al. [1] designed a system that uses YOLOv3 for object detection and Deep SORT for multiple-object tracking; it showed an average accuracy of 99.15% in the global count on the GRAM and CD2014 datasets. It can also count the vehicles more efficiently.
Zuraimi et al. [13] also suggested a model using TensorFlow and You Only Look Once (YOLO) for detecting vehicles in real time. Combining these with the other required dependencies, the paper compares previous versions of YOLO and picks YOLOv4 for implementation. Furthermore, the system uses the DeepSORT algorithm to count the number of vehicles passing in the video effectively. The paper concluded that the best of the available YOLO models is YOLOv4, which achieved 82.08% AP50 on their custom dataset.
● Chapter 1 presents the research problem, research objectives, justifying the need for
carrying out the research work and outlines the main contributions arising from the
work undertaken.
● Chapter 2 provides the essential background and context for this thesis.
● Chapter 3 provides the details of the system architectural design and methodology.
● Chapter 4 explains the implementation details and results obtained.
● Chapter 5 summarizes the report and briefs the future aspects.
This chapter is the foundation for the execution of our project. It briefly introduced the research problem, research objectives, scope of the project, previous related work and the proposed solution framework. The next chapter examines the literature most pertinent to our research.
CHAPTER 2
Literature Survey
This chapter focuses on a review of real-time object detection and the various tracking approaches that have already been implemented.
Object detection is a digital technique for recognising and locating items in a video or image. In general, it produces bounding boxes around the objects in an image to locate them in a specified context. Image recognition and object detection are often confused with each other.
For example, image recognition classifies an image containing a dog with the single label "dog", whereas object detection creates a box around each dog and labels it "dog". The detection method predicts the position of each object together with the proper label, and therefore provides extra information about an image.
Object tracking builds on object detection. The overall steps are:
● Object detection: detecting and classifying each object with a suitable algorithm by creating a bounding box around it.
● Assigning each object its own identity by giving it a unique ID.
● Following the labelled item as it moves across frames and storing the essential data.
To deal with the challenge of examining a large number of candidate areas, Ross Girshick et al. suggested R-CNN, an approach that uses a selective search algorithm to extract only 2000 regions from an image, so only 2000 regions need to be considered. These 2000 candidate region proposals are warped into a square and fed into a CNN, which produces a 4096-dimensional feature vector. The CNN acts as the feature extractor, and its output is fed into an SVM that estimates the presence of an object within each candidate region proposal. Selective search works roughly as follows:
1. Add all bounding boxes, irrespective of segmented parts, to the list of region proposals.
2. Group similar segments into larger segments.
3. Repeat from step 1.
Larger segments are produced and added to the list of region proposals with each iteration. As a result, region proposals are generated in a bottom-up manner, starting with smaller areas and working up to larger ones.
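As a concrete illustration of the region-proposal step described above, the short sketch below generates selective search proposals with OpenCV's contrib module (opencv-contrib-python); the image path is a placeholder, and R-CNN would keep roughly the first 2000 boxes.

```python
# Minimal sketch of selective-search region proposals, assuming
# opencv-contrib-python is installed and a placeholder image path.
import cv2

img = cv2.imread("traffic_frame.jpg")              # placeholder input image
ss = cv2.ximgproc.segmentation.createSelectiveSearchSegmentation()
ss.setBaseImage(img)
ss.switchToSelectiveSearchFast()                   # fast mode: fewer, coarser merges
rects = ss.process()                               # array of (x, y, w, h) boxes
proposals = rects[:2000]                           # keep ~2000 candidate regions, as in R-CNN
print(f"{len(rects)} proposals generated, keeping {len(proposals)}")
```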
2.2.2 Problems with R-CNN
● Training time is very long, because each image requires the classification of 2000 region proposals.
● Since each image takes around 47 seconds to process, it is not suitable for real-time detection.
● Since the selective search algorithm is fixed, no learning takes place at that stage. This may result in poor candidate region proposals.
The same author came up with a new model, Fast R-CNN, to overcome the shortcomings of the previous model. The working of the algorithm is quite similar to R-CNN: the input image is given to a CNN to produce a convolutional feature map. The regions of proposals are selected from the convolutional feature map, warped into squares, and then reshaped to a fixed size using a RoI pooling layer so that they can be fed into a fully connected layer, producing a RoI feature vector.
2.3.1 Comparison between R-CNN, Fast R-CNN and SPP net
Fast R-CNN is considerably faster in training and testing than R-CNN, as shown in the graphs above. When examining the performance of Fast R-CNN, however, including region proposals slows it down considerably compared with not using region proposals. Even Fast R-CNN has certain limitations: it still relies on selective search to locate the RoIs, which is a time-consuming approach. It takes about 2 seconds per image to detect objects, which is much faster than R-CNN, but when dealing with big real-world datasets even Fast R-CNN becomes slow.
Another object detection technique, Faster R-CNN, performs better than Fast R-CNN. Both R-CNN and Fast R-CNN use selective search to identify region proposals, and selective search is a time-consuming process that slows the network down.
To address this, Shaoqing Ren et al. created an object detection algorithm, Faster R-CNN, that eliminates the selective search step. The image is fed into a CNN, resulting in a convolutional feature map, just as in Fast R-CNN. Instead of running a selective search algorithm on the feature map, a separate network is used to predict the region proposals. A RoI pooling layer is then used to reshape the predicted region proposals, which are subsequently used to classify the objects within the proposed regions.
The YOLO method uses a CNN to detect objects in real time. As the name suggests, the approach needs only a single forward propagation through the neural network to detect objects. There are several versions of the YOLO algorithm. YOLO approaches object detection differently: it takes the whole image in a single pass and predicts the bounding box coordinates together with the class probabilities.
2.5.2 How YOLO works
The YOLO algorithm works using the following three techniques:
● Residual blocks
● Bounding box regression
● Intersection Over Union (IOU)
1. Residual blocks: The image is first divided into grid cells of dimension S×S. The figure below depicts the grid over an input image; there are a number of grid cells of equal size. Each object is handled by the grid cell into which it falls.
2. Bounding box regression: A bounding box is an outline that highlights an object in an image. The properties of each bounding box are:
● Width (bw)
● Height (bh)
● Class (person, car, cat, etc.), represented by the letter c
● Bounding box centre (bx, by)
3. Intersection over Union (IOU): Intersection over Union describes how well a predicted box overlaps a real box. YOLO uses IOU to produce an output box that properly surrounds the objects. Each grid cell predicts bounding boxes together with their confidence scores. The IOU equals 1 when the predicted bounding box coincides with the real box; bounding boxes that differ too much from the actual box are discarded.
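As a small illustration of the metric, the sketch below computes the IoU of two boxes given as corner coordinates; the example boxes are arbitrary.

```python
# Minimal sketch of IoU between two boxes given as (x1, y1, x2, y2) corners.
def iou(box_a, box_b):
    # intersection rectangle
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

print(iou((50, 50, 150, 150), (100, 100, 200, 200)))  # ~0.14
```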
2.6 SSD (Single Shot Detection)
SSD is a model that detects objects, but what precisely does that imply? Object detection and image classification are often confused. In simple terms, image classification identifies the type of image, whereas object detection identifies the various objects in the image and uses bounding boxes to indicate where they are. The model's name, Single Shot Detector, reveals most of its details: unlike models that traverse the image more than once to produce an output detection, the SSD model identifies the objects in a single pass over the input image.
Object tracking is the task of predicting the position of an object in the upcoming frames of a video once it has been identified and located and its initial position is known. Object detection, on the other hand, creates bounding boxes around the objects and predicts their positions only in the current frame. The target must be visible in the input for object detection to work, and the method is not suitable when the target is hidden by interference. Object tracking has been studied for over two decades, and several methods and ideas have been developed to improve the accuracy and efficiency of tracking models.
2.8.1 MDNet
Multi-Domain Net (MDNet) is an object tracking technique that is trained on enormous amounts of data. Its goal is to learn a wide range of variations and relationships. MDNet is trained to learn a shared representation of targets from numerous annotated videos drawn from diverse domains.
Pretraining: During pretraining the network must learn a multi-domain representation. To accomplish this, the system is trained on many annotated videos to learn representations and dimensional information.
Online visual tracking: After pretraining, the domain-specific layers are removed, leaving only the shared network. A binary classification layer is introduced during inference and trained or fine-tuned online.
2.8.2 GOTURN
GOTURN (Generic Object Tracking Using Regression Networks) is a deep regression network model that is trained offline. It learns a general relationship between object motion and appearance and can be used to track objects that were not part of the training set. Because online tracker algorithms cannot exploit large numbers of videos to increase their efficiency, they tend to be slow and their performance is not up to the mark. GOTURN is a regression-based technique: in essence, it uses only a single feed-forward pass through the network to regress directly to the location of the target object.
2.8.3 DeepSORT
DeepSORT is a widely used object tracking algorithm. It is an extension of SORT (Simple Online and Realtime Tracking).
SORT estimates the location of an object from its previous locations using a Kalman filter, which is fairly robust to occlusions. On top of the principles of SORT, DeepSORT adds deep learning to improve the performance of the algorithm: because a deep neural network can recognise the appearance features of the target, the tracker can associate detections with the correct object with significantly higher accuracy.
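To make the association step concrete, the sketch below shows a SORT-style matching of predicted track boxes to new detections by solving an assignment problem over an IoU cost matrix with SciPy's Hungarian solver; DeepSORT additionally blends in appearance-feature distances, which are omitted here, and the example boxes are arbitrary.

```python
# Minimal sketch of SORT-style track-to-detection association using the
# Hungarian algorithm over an IoU cost matrix. DeepSORT also mixes in
# appearance-feature distances, omitted here for brevity.
import numpy as np
from scipy.optimize import linear_sum_assignment

def iou_matrix(tracks, detections):
    """IoU between every (x1, y1, x2, y2) track box and detection box."""
    m = np.zeros((len(tracks), len(detections)))
    for i, t in enumerate(tracks):
        for j, d in enumerate(detections):
            x1, y1 = max(t[0], d[0]), max(t[1], d[1])
            x2, y2 = min(t[2], d[2]), min(t[3], d[3])
            inter = max(0, x2 - x1) * max(0, y2 - y1)
            union = ((t[2]-t[0])*(t[3]-t[1]) + (d[2]-d[0])*(d[3]-d[1]) - inter)
            m[i, j] = inter / union if union > 0 else 0.0
    return m

def associate(tracks, detections, iou_threshold=0.3):
    """Return matched (track_idx, det_idx) pairs; unmatched detections would start new tracks."""
    cost = 1.0 - iou_matrix(tracks, detections)          # minimise 1 - IoU
    row_idx, col_idx = linear_sum_assignment(cost)
    return [(r, c) for r, c in zip(row_idx, col_idx)
            if 1.0 - cost[r, c] >= iou_threshold]        # reject weak matches

tracks = [(100, 100, 200, 200)]
detections = [(110, 105, 205, 210), (400, 400, 450, 450)]
print(associate(tracks, detections))                     # [(0, 0)]
```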
In paper [1], YOLOv3 was used for the detection of vehicles and DeepSORT for tracking and counting them. These methods are also easy to understand. The paper thus contributes to counting vehicles automatically at higher speed, which is beneficial for obtaining traffic information. It concluded that the methods used for implementation obtained higher accuracy than previously proposed methods.
Paper [2] proposes a vehicle counting framework with three stages: object detection using YOLOv3, tracking using the KCF algorithm, and trajectory processing using a region encoding method. The paper suggests a better tracking algorithm that helps increase its performance in congested areas.
Paper [3] proposes a vehicle counting framework that uses the SSD model for multi-vehicle detection, a correlation-matching algorithm for multi-vehicle tracking, and a trajectory optimisation algorithm based on the least-squares method. The proposed framework solves the problem of occlusion and vehicle scale change in the tracking process.
Paper [4] proposes the use of FPN and Cascade R-CNN for multi-vehicle detection. The framework proposes an architecture that enables precise detection and classification of vehicles; the model achieves a performance of 59.78% for cars.
Paper [5] presents a model of vehicle identification and counting that combines the deep learning recognition method YOLOv4 with the object tracking method DeepSORT. The framework is important in the field of highway and transport infrastructure management and performs much better than traditional methods, although results are not good when it is applied to real-life videos.
Paper [6] takes the object detection algorithm YOLOv4 and optimises it for vehicle detection. There are various other scopes of application, such as IC detection, crack detection and face detection. The final combined model gives benchmark results with an mAP of 67.7%.
Paper [7] proposes a method that combines spatial-visual feature learning and global 3D state estimation to track moving vehicles in a 3D world. The framework is useful for estimating complete 3D bounding boxes, and this 3D tracking approach can achieve competitive results from images alone.
Paper [8] proposed a system to count vehicles using several detection techniques, in order to provide information that assists vehicle counting, traffic flow prediction and vehicle speed measurement. It experiences wrong detections and duplicate counting of vehicles in some cases.
Paper [9] proposed a vehicle detection and tracking method for aerial videos. The approach is capable of handling both static and moving backgrounds. A foreground detector is used for static backgrounds, which can overcome tiny variations in the image by updating the model. For moving backgrounds, image registration is used to calculate the camera motion, which helps vehicle detection over a specific frame.
In paper [10], photographs were first obtained and various operations were performed on them. Haar cascades were then used for object detection, with different Haar cascades employed for car and bus detection. Several pre-trained Haar cascades were used for further object detection.
As can be concluded from the above-mentioned research papers, the efficiency of vehicle detection and tracking varies depending on the datasets chosen. Table 2.1 below summarises the research papers studied in the literature survey, giving detailed information about the methods, the datasets and the conclusions drawn in each paper.
Table 2.1 Summary of the literature survey

1. Counting vehicles with high precision on Brazilian roads using YOLOv3 and Deep SORT (2020)
   Methods: Uses YOLOv3 for object detection and Deep SORT for multiple-object tracking.
   Dataset: GRAM, CD2014.
   Limitations: It can detect and count vehicles but is unable to classify them individually.
   Results: The proposal achieved an accuracy of 99.15% in the global count on the GRAM and CD2014 datasets, and over 90% on real scenes of Brazilian federal highways.

2. Video-Based Vehicle Counting Framework (2019)
   Methods: Proposes a vehicle counting framework with three stages: object detection (using YOLOv3), object tracking (using the KCF algorithm) and trajectory processing (using a region encoding method).
   Dataset: VCD, VDD.
   Limitations: It was unable to detect bikers in the street, and its accuracy decreases near crowded places such as hospitals and commercial centres.
   Results: The obtained results show accuracy reaching 87.6%, even when the traffic conditions are quite complex.

3. Video-Based Vehicle Counting for Expressway Based on Vehicle Detection and Correlation-Matched Tracking (2020)
   Methods: Proposes a vehicle counting framework that uses the SSD model for multi-vehicle detection, a correlation-matching algorithm for multi-vehicle tracking, and a trajectory optimisation algorithm based on the least-squares method.
   Dataset: NOHWY.
   Limitations: The neural network does not generate enough high-level features to make predictions for small objects, so it performs worse on smaller objects.
   Results: The proposed vehicle counting method obtains more than 93% accuracy and 25 FPS on vehicle counting based on vehicle tracking.

4. Vehicle counting and tracking in aerial video feeds using Cascade R-CNN and Feature Pyramid Networks (2021)
   Methods: Proposes the use of FPN and Cascade R-CNN for multi-vehicle detection; tracking is performed simply by measuring the IOU between detected objects in two subsequent frames.
   Dataset: VisDrone 2019.
   Limitations: The lower precision for the other four classes results from the lack of training examples in these categories compared with the car category.
   Results: The model obtained an average accuracy of 59.78% for cars when the IOU with ground truth was greater than 0.5. The precision dropped for other categories such as vans and trucks, resulting in an overall average precision of 20.46%.

5. Real-time vehicle detection and counting based on YOLO and DeepSORT (2020)
   Methods: A model of vehicle identification and counting that combines the deep learning recognition method YOLOv4 with the object tracking method DeepSORT.
   Dataset: COCO, Open Images.
   Limitations: Results are not good on real-life videos with regularly changing brightness and background, and with slow-moving vehicles.
   Results: Good overall performance is achieved in terms of tracking accuracy. The combination of YOLOv4 and DeepSORT outperforms the original YOLOv4 by at least 11% AP and 12% AP50.

6. Refining YOLO v4 for Vehicle Detection (2020)
   Methods: Takes the object detection algorithm YOLOv4 and optimises it for vehicle detection; YOLOv4 provides higher accuracy and faster results, enabling real-time vehicle detection.
   Dataset: UA-DETRAC benchmark dataset.
   Limitations: DIoU with NMS makes the system less open to occlusion, because of the central distance used along with the overlap area.
   Results: An mAP of 67.7% (10 percentage points higher than the base model) on the DETRAC test dataset.

7. Joint Monocular 3D Vehicle Detection and Tracking (2019)
   Methods: Proposes a framework combining visual feature learning and global 3D state estimation to track moving vehicles in a 3D world.
   Dataset: GTA, KITTI.
   Limitations: The monocular 3D tracking approach can reach competitive results from the image stream only.
   Results: The model filters out 6-8% of possible mismatching trajectories.

9. Vehicle Counting Based on Vehicle Detection and Tracking from Aerial Videos (2018)
   Methods: Proposes a system based on a UAV platform consisting of vehicle detection, multi-vehicle tracking, multi-vehicle management and vehicle counting.
   Dataset: Aerial videos.
   Limitations: There are many limitations of using surveillance video cameras, such as occlusion, shadow and limited views.
   Results: Experimental results on 16 aerial videos show that the proposed method produces more than 90% and 85% accuracy on static-background and moving-background videos, respectively.

10. Vehicle detection and tracking based on OpenCV (2020)
    Methods: Uses the moving-tracking function library and the CamShift algorithm to construct a vehicle video analysis system.
    Dataset: CCD images, AVI videos.
    Limitations: Not suitable for a multi-target tracking system.
    Results: It achieves a performance of 85%.
This chapter reviewed the papers that helped us understand the field and reach a position where we could implement different techniques and eventually carry out their comparative analysis.
CHAPTER 3
We build a model that is first trained on a dataset collected with the help of Kaggle together with manually collected images. The dataset is then pre-processed and annotated in YOLO format. This custom dataset is used to train the YOLOv4 model, and the trained weights are used for tracking with DeepSORT in order to count the vehicles.
3.1.2 System Design
This project is developed in Python using Google Colab, TensorFlow and OpenCV. Google Colab is a web application that allows anyone to write and execute arbitrary Python code through the browser and is well suited to machine learning, data analysis and education. TensorFlow is an open-source platform and software library for machine learning and artificial intelligence. With its flexible ecosystem of tools and libraries, it lets developers build and deploy ML-based applications easily.
We have collected images of different classes, including cars, buses, trucks and bikes, with the help of Kaggle, Roboflow and Google.
After collecting the data, we filtered out noisy and blurred images for better training of our model. Furthermore, we adjusted the brightness, hue and contrast.
In the next step, with the help of CVAT (Computer Vision Annotation Tool), we created bounding boxes and annotations and divided our dataset into two parts: a training set and a testing set.
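For reference, the sketch below shows the Darknet/YOLO annotation format produced by this step: one line per object containing a class index and box coordinates normalised to the image size. The class order and example values are illustrative, not taken from our actual label files.

```python
# Minimal sketch of parsing one line of a YOLO-format label file:
#   <class_id> <x_center> <y_center> <width> <height>
# (coordinates are normalised to [0, 1] relative to the image width/height).
CLASS_NAMES = ["car", "truck", "bus", "motorbike"]   # illustrative class order

def parse_yolo_line(line, img_w, img_h):
    cls, xc, yc, w, h = line.split()
    cls = int(cls)
    xc, yc, w, h = (float(v) for v in (xc, yc, w, h))
    # convert normalised centre/size back to pixel corner coordinates
    x1 = (xc - w / 2) * img_w
    y1 = (yc - h / 2) * img_h
    x2 = (xc + w / 2) * img_w
    y2 = (yc + h / 2) * img_h
    return CLASS_NAMES[cls], (x1, y1, x2, y2)

print(parse_yolo_line("0 0.50 0.50 0.20 0.10", 1280, 720))
# ('car', (512.0, 324.0, 768.0, 396.0))
```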
Training our model on a local machine is very time-consuming and requires many dependencies if a powerful GPU is not available. To avoid this, we chose to run our code on Google Colab, since it provides a free GPU and an online environment.
We collected images of different types of vehicles and performed data augmentation on them, such as resizing, brightness adjustment, colour adjustment, rotation (clockwise/anti-clockwise) and cropping, and created bounding boxes and annotations for them.
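The sketch below illustrates, with OpenCV, the kind of augmentation operations listed above (flip, brightness/contrast change, rotation, cropping); file names are placeholders, and in practice the bounding-box annotations must be transformed along with the images.

```python
# Minimal augmentation sketch with OpenCV; paths are placeholders.
import os
import cv2

os.makedirs("augmented", exist_ok=True)
img = cv2.imread("dataset/car_0001.jpg")                      # placeholder image

flipped = cv2.flip(img, 1)                                    # horizontal flip
brighter = cv2.convertScaleAbs(img, alpha=1.2, beta=30)       # contrast x1.2, brightness +30

h, w = img.shape[:2]
rot_mat = cv2.getRotationMatrix2D((w / 2, h / 2), 15, 1.0)    # rotate by 15 degrees
rotated = cv2.warpAffine(img, rot_mat, (w, h))

cropped = img[int(0.1 * h):int(0.9 * h), int(0.1 * w):int(0.9 * w)]  # centre crop

for name, out in [("flip", flipped), ("bright", brighter),
                  ("rot", rotated), ("crop", cropped)]:
    cv2.imwrite(f"augmented/car_0001_{name}.jpg", out)
```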
We split our data into two scenes, day time and night time, and trained eight classes (four classes in each scene: car, truck, bus and motorbike).
Here are some snapshots of the dataset used for training our model.
The adaptations we make to our data prior to passing it to the algorithm are referred to as pre-processing. Data preprocessing is mainly a technique for transforming raw data into a polished dataset. To be more explicit, whenever data is received from many sources it is collected in raw form, which makes analysis impractical. Data must be formatted properly in order to achieve finer results from the applied model in machine learning or deep learning applications. The data preprocessing we performed includes data cleaning, meaning we deleted from the dataset those images that do not contain any of the target objects. Data augmentation is performed to increase the number of images in the dataset; the augmentation techniques include cropping, flipping and rotation of the images, changing the brightness, and adjusting the contrast, hue and saturation.
We used different metrics to measure the performance of the trained YOLOv4 model. The metrics used in our project are:
1. mAP: To evaluate object detection models such as R-CNN and YOLO, the mean average precision (mAP) is used. The mAP compares the ground-truth bounding box to the detected box and returns a score; the higher the score, the more accurate the model's detections.
2. IOU: Intersection over Union is an evaluation metric used to measure the correctness of an object detector on a particular dataset.
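As a brief formal sketch of these two metrics (standard definitions rather than anything specific to our implementation): with $B_p$ the predicted box, $B_{gt}$ the ground-truth box, $p(r)$ the precision at recall $r$ for one class, and $N$ the number of classes (four in our case),

$$\mathrm{IoU}(B_p, B_{gt}) = \frac{|B_p \cap B_{gt}|}{|B_p \cup B_{gt}|}, \qquad \mathrm{AP} = \int_0^1 p(r)\,dr, \qquad \mathrm{mAP} = \frac{1}{N}\sum_{c=1}^{N} \mathrm{AP}_c .$$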
3.7 Chapter Summary
This chapter presented the system design and architecture required for the implementation of the models. It also furnished detailed knowledge of the proposed procedure used in the project.
CHAPTER 4
The project is built in the Python language using Google Colab. In this project, we detect the vehicles in the video dataset, track them, and then count the total number of vehicles in the given video. Several Python libraries were used to obtain the results.
4.1.1 Hardware Requirements:
4.1.2 Software Requirements:
● The vehicles should be moving on the roads, and the vehicle classes should belong to the classes defined in the class-names file used for training.
● There should be enough light present in the testing and training dataset.
● IOU and mAP are used as the metrics for evaluating our model's performance, since in object detection IOU works best for measuring the overlap between a predicted bounding box and the actual bounding box of an object.
● In this section we present the results obtained by implementing the chosen techniques, which justify the use of the proposed object detection and tracking methods.
● The dataset used for training and testing the object detection model consists of 25,000 images. Each image contains objects belonging to the four classes, i.e. motorbike, car, bus and truck. A few images in the dataset did not contain any objects of these four classes, so we deleted those images from the dataset.
● The performance of object detector and tracker is evaluated on IOU and mAP.
● The dataset we prepared is trained with the YOLOv4 model, which we selected among all the available YOLO versions. It uses a CNN with twenty-four convolution layers, four max-pooling layers and two fully connected layers.
● The counting of the objects is implemented using DeepSORT, which is achieved with the help of a Kalman filter.
● Confirmed tracks and new detections are compared in terms of appearance, feature similarity and movement distance. The associations between confirmed tracks and detections are then generated using the Hungarian method.
● The Kalman filter and the motion prediction model are used to update the motion state of the multiple tracks, and new tracks are created for unmatched detections.
The snapshot below shows the code for removing from the dataset the images that contain no object.
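A minimal sketch of this step, assuming images and their YOLO .txt label files live side by side in one directory:

```python
# Delete an image when its YOLO label file is missing or empty,
# i.e. when the image contains no annotated object. Paths are placeholders.
import os

IMAGE_DIR = "data/obj"        # assumed layout: images and .txt labels together

for fname in os.listdir(IMAGE_DIR):
    if not fname.lower().endswith((".jpg", ".jpeg", ".png")):
        continue
    label_path = os.path.join(IMAGE_DIR, os.path.splitext(fname)[0] + ".txt")
    # remove the image if it has no label file, or the label file is empty
    if not os.path.exists(label_path) or os.path.getsize(label_path) == 0:
        os.remove(os.path.join(IMAGE_DIR, fname))
        if os.path.exists(label_path):
            os.remove(label_path)
        print("removed", fname)
```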
The snapshot below shows the code to divide the dataset into 90% for training and 10% for testing.
Figure 4.2 Code for dividing the dataset into the training and testing parts.
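A minimal sketch of such a split: shuffle the image paths and write 90% to train.txt and 10% to test.txt, the list files that Darknet expects. Paths are placeholders.

```python
# 90/10 split of image paths into Darknet-style train/test list files.
import glob
import random

images = glob.glob("data/obj/*.jpg")          # placeholder image directory
random.shuffle(images)

split = int(0.9 * len(images))                # 90% training, 10% testing
with open("data/train.txt", "w") as f:
    f.write("\n".join(images[:split]))
with open("data/test.txt", "w") as f:
    f.write("\n".join(images[split:]))

print(len(images[:split]), "training images,", len(images[split:]), "testing images")
```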
The snapshot below shows the code to start training our YOLOv4 model on the custom dataset, with the weights being saved to Google Drive.
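A minimal Colab-style sketch of this step: mount Google Drive for weight backups and launch Darknet training with the usual command. The data/config file names (obj.data, yolov4-custom.cfg) follow common Darknet conventions and are assumptions here.

```python
# Colab cell sketch: mount Drive and start Darknet YOLOv4 training.
# Assumes darknet is already cloned and compiled in the working directory,
# and that the backup= path in obj.data points at a Drive folder.
from google.colab import drive
drive.mount('/content/drive')                      # Drive stores the backup weights

# yolov4.conv.137 provides the pre-trained convolutional weights
!./darknet detector train data/obj.data cfg/yolov4-custom.cfg yolov4.conv.137 -dont_show -map
```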
The snapshot below shows the code to copy our trained model into the tracking part and to run save_model.py from the command line.
Figure 4.4 Code to copy our trained model to the tracking part.
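A minimal Colab-style sketch of this step: copy the trained weights into the tracking repository and convert them with save_model.py. The paths and flag names follow common YOLOv4 + DeepSORT implementations and are assumptions here.

```python
# Colab cell sketch: copy trained weights into the tracking repo and convert them.
import os
import shutil

# hypothetical paths: weights backed up on Drive, tracking repo cloned in /content
shutil.copy('/content/drive/MyDrive/yolov4-custom_best.weights',
            '/content/yolov4-deepsort/data/yolov4-custom.weights')

os.chdir('/content/yolov4-deepsort')
# convert the Darknet weights to a TensorFlow SavedModel for the DeepSORT tracker;
# the flag names below are assumptions based on common yolov4-deepsort repositories
!python save_model.py --weights ./data/yolov4-custom.weights --output ./checkpoints/yolov4-custom --model yolov4
```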
The snapshot below shows the code for importing the vehicle counting class in object_tracker.py and calling run to start counting. The video is divided into frames, and each object in a frame is assigned a unique ID.
Figure 4.5 Snapshot of the code to start the counting using DeepSORT
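The following hypothetical sketch conveys the counting idea: DeepSORT gives each vehicle a persistent track ID, and a vehicle is counted the first time its track crosses a virtual counting line. The VehicleCounting interface shown is illustrative, not the exact class from our code.

```python
# Illustrative sketch of line-crossing vehicle counting over DeepSORT track IDs.
class VehicleCounting:
    def __init__(self, line_y):
        self.line_y = line_y          # y-coordinate of the virtual counting line
        self.counted_ids = set()      # track IDs that have already been counted
        self.counts = {}              # per-class totals, e.g. {"car": 3}

    def update(self, track_id, class_name, prev_cy, cy):
        """Call once per confirmed track per frame with its previous/current box centre y."""
        crossed = prev_cy < self.line_y <= cy or cy < self.line_y <= prev_cy
        if crossed and track_id not in self.counted_ids:
            self.counted_ids.add(track_id)
            self.counts[class_name] = self.counts.get(class_name, 0) + 1

counter = VehicleCounting(line_y=400)
counter.update(track_id=7, class_name="car", prev_cy=390, cy=405)   # crosses the line
print(counter.counts)                                               # {'car': 1}
```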
4.3.2 Results:
Below is the graph of the loss plotted against the number of iterations. The graph shows two curves, one blue and one red: the blue curve shows the loss, while the red curve shows the mean average precision (mAP) at a 50% Intersection-over-Union (IOU) threshold (mAP@0.5).
Figure 4.6 Graph showing the loss and mAP while training YOLOv4 on the custom dataset
The snapshot below is the output of the tracking code, showing the tracking of objects in each frame of the video dataset.
This chapter introduced the prerequisites required for implementation and presented the results obtained by applying our model.
CHAPTER 5
CONCLUSIONS
5.1 Conclusion
We can successfully detect and count the vehicles in the given video frames containing four classes of vehicles: cars, buses, trucks and bikes. After training our dataset on the YOLOv4 model, we obtain an mAP (mean average precision) of 83.80%, and we are also able to detect and count vehicles in bad weather conditions.
Our results also helped us understand various deep learning models, choose YOLOv4 and DeepSORT for implementation, obtain the desired outcome, and identify the challenges that still need to be addressed in our proposed system.
In the future, we plan to address the limitation that the model is unable to count Indian vehicles such as autos, which are widely used in India. We also hope to train our model on a dataset containing images of bad weather conditions such as heavy rainfall, dusty weather and dense fog, and thereby achieve higher accuracy and performance.