Object Detection Tutorial (YOLO)
Description
In this tutorial we will go step by step through running a state-of-the-art object detection CNN (YOLO) using
open source projects and TensorFlow. YOLO is a single-shot CNN that detects objects and proposes
bounding boxes for them in one forward pass. It comes in many variations and configurations; we will focus on the
Tiny, 80-class COCO model. YOLO is pretty lightweight, and it achieves around 1 FPS on Euclid. Impressive!
Link to the site and paper:
[Link] [Link]
Requirements and setup
First of all, we need to install a couple of packages and download a few more files from GitHub:
1) TensorFlow (install using pip)
2) DarkFlow – a TensorFlow adaptation of the Darknet network runner
3) Some configuration and weight files
1. Installing TensorFlow
Install TensorFlow using PIP : $ sudo pip install tensorflow
Note: if you don’t have pip yet install it using: $ sudo apt-get install python-pip
2. Installing DarkFlow
DarkFlow is a network builder adapted from Darknet. It builds TensorFlow networks from Darknet cfg
files and loads pre-trained weights, and we will use it to run YOLO.
a. Clone DarkFlow from : [Link]
b. Weights and cfg files can be found on : [Link]
c. We will use the Tiny YOLO COCO model (the last one in the list)
d. Download cfg and weights file and copy them to the DarkFlow folder
e. Classes names file [Link] (Found in the tutorial page)
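The setup steps above can be sketched as shell commands. The repository URL assumes the commonly used thtrieu/darkflow project, and the cfg/weights/classes filenames are illustrative — substitute the links and files from the tutorial page:

```shell
# Clone DarkFlow (assumed repository; use the link above if it differs)
git clone https://github.com/thtrieu/darkflow.git
cd darkflow

# Install DarkFlow into the current Python environment
pip install -e .

# Copy the downloaded cfg, weights and classes files into the DarkFlow folder
# (filenames are assumptions based on the Tiny YOLO COCO model)
cp ~/Downloads/tiny-yolo.cfg ~/Downloads/tiny-yolo.weights ~/Downloads/labels.txt .
```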
The Code
1. Yolo demo code over ROS using Euclid’s cameras
#! /usr/bin/env python
import sys
sys.path.insert(0, '/opt/ros/kinetic/lib/python2.7/dist-packages/darkflow')
from darkflow.net.build import TFNet
import cv2
import threading
import time
import rospy
import os
from cv_bridge import CvBridge
from sensor_msgs.msg import Image


class EuclidObjectRecognizer():
    def __init__(self):
        self.image = None
        self.fps = 1
        self.predictions = []
        script_path = os.path.dirname(os.path.realpath(__file__))
        # Assumed filenames: point these at the cfg/weights files you downloaded
        self.options = {"model": os.path.join(script_path, "tiny-yolo.cfg"),
                        "load": os.path.join(script_path, "tiny-yolo.weights"),
                        "threshold": 0.20,
                        "config": script_path}
        self.tfnet = TFNet(self.options)
        self.colors = self.tfnet.meta['colors']
        self.classColors = {}
        # Start ROS
        rospy.init_node("object_recognizer", anonymous=True)
        self.bridge = CvBridge()
        self.imagePublisher = rospy.Publisher("/euclid/object/live", Image,
                                              queue_size=10)
        self.imageSubscriber = rospy.Subscriber(
            "/camera/color/image_raw", Image, self.newColorImage)
        self.rate = rospy.Rate(10)
        # Wait for the first camera frame before starting the threads
        while self.image is None:
            self.rate.sleep()
        self.liveThread = threading.Thread(target=self.liveRecognitionThread)
        self.liveThread.start()
        self.mainThread()

    def newColorImage(self, imageMsg):
        self.image = cv2.cvtColor(self.bridge.imgmsg_to_cv2(imageMsg),
                                  cv2.COLOR_RGB2BGR)

    def getClassColor(self, className):
        if className in self.classColors:
            return self.classColors[className]
        self.classColors[className] = self.colors[len(self.classColors) + 10]
        return self.classColors[className]

    def mainThread(self):
        h, w, _ = self.image.shape
        while not rospy.is_shutdown():
            image = self.image.copy()
            for bbox in self.predictions:
                left, top, right, bot, label = (
                    bbox['topleft']['x'], bbox['topleft']['y'],
                    bbox['bottomright']['x'], bbox['bottomright']['y'],
                    bbox['label'])
                color = self.getClassColor(label)
                cv2.rectangle(image, (left, top), (right, bot), color, 3)
                cv2.putText(image, label, (left, top - 12), 0, 2e-3 * h,
                            color, 1)
            self.imagePublisher.publish(
                self.bridge.cv2_to_imgmsg(image, "bgr8"))
            self.rate.sleep()

    def liveRecognitionThread(self):
        print "Starting live recognition"
        while not rospy.is_shutdown():
            startTime = time.time()
            self.predictions = self.tfnet.return_predict(self.image)
            self.fps = 1.0 / (time.time() - startTime)
        print "Live recognition stopped"


if __name__ == "__main__":
    try:
        recognizer = EuclidObjectRecognizer()
    except Exception as e:
        print e
        rospy.signal_shutdown("error")
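For reference, DarkFlow's return_predict() yields one dict per detection. A minimal standalone sketch of how the drawing loop unpacks such a dict (the sample values here are made up for illustration):

```python
# Extract drawing coordinates from one prediction dict, the same way
# the demo's main loop does.
def bbox_corners(pred):
    tl, br = pred['topleft'], pred['bottomright']
    return tl['x'], tl['y'], br['x'], br['y'], pred['label']

# Illustrative sample shaped like a DarkFlow prediction
sample = {'label': 'person', 'confidence': 0.71,
          'topleft': {'x': 30, 'y': 20},
          'bottomright': {'x': 180, 'y': 260}}

print(bbox_corners(sample))  # (30, 20, 180, 260, 'person')
```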
2. Setup the configuration and weights
1. Download the weights file
2. Download the configuration file
3. Download the classes names file [Link]
Change the options dictionary in the script to point to the configuration and weight files you downloaded.
Note: I recommend saving the weights and configuration files in the same directory as the
script, or changing the script_path variable to point to that directory.
Note: The classes names should be in the same directory as the configuration file.
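A minimal sketch of the options dictionary the script builds, assuming the files were saved next to the script and using illustrative cfg/weights filenames (substitute your own):

```python
import os

# The demo derives this from __file__; os.getcwd() stands in here so the
# snippet runs anywhere.
script_path = os.getcwd()

options = {
    "model": os.path.join(script_path, "tiny-yolo.cfg"),      # assumed filename
    "load": os.path.join(script_path, "tiny-yolo.weights"),   # assumed filename
    "threshold": 0.20,   # minimum confidence for a detection to be returned
    "config": script_path,  # the classes names file should live here too
}

print(options["model"])
```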
3. Run the script
Now we are ready to run the script. It subscribes to the RGB camera topic while YOLO runs in a
background thread predicting bounding boxes. The script also draws these boxes onto an OpenCV image and
publishes the result on a ROS topic.
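When drawing, the script keeps one colour per class so boxes stay visually consistent across frames. A standalone sketch of that caching idea, with a made-up palette:

```python
# Illustrative palette; the demo takes its colours from the loaded network's
# metadata instead.
palette = [(255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 0)]
class_colors = {}

def get_class_color(label):
    # First time we see a label, assign it the next palette entry;
    # afterwards, always reuse the cached colour.
    if label not in class_colors:
        class_colors[label] = palette[len(class_colors) % len(palette)]
    return class_colors[label]

print(get_class_color("person"))  # (255, 0, 0)
print(get_class_color("dog"))     # (0, 255, 0)
print(get_class_color("person"))  # (255, 0, 0) again
```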
$ python yolo_demo.py
Note: The camera node should be running in order to publish new images.
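The background recognition thread can estimate its frame rate by timing a single prediction and inverting the elapsed seconds. A minimal sketch, with time.sleep standing in for the actual YOLO call:

```python
import time

start = time.time()
time.sleep(0.05)  # stand-in for tfnet.return_predict(image)
fps = 1.0 / (time.time() - start)

# With a real ~1 s Tiny YOLO prediction on Euclid, this lands around 1 FPS.
print(round(fps, 1))
```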
4. Time to play
Now you can try different weights and configurations, or simply start building an application using this
amazing capability.
Some ideas:
a. Use depth to do tracking on the bounding boxes to get better results and FPS
b. Use the depth camera to calculate the distance of objects from Euclid
c. Use object detection for scene understanding and reasoning
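The distance-estimation idea can be sketched as follows, assuming a depth frame in millimetres aligned with the colour image and the same (left, top, right, bot) box convention as the demo script:

```python
import numpy as np

def object_distance_m(depth, left, top, right, bot):
    """Median depth inside a bounding box, converted from mm to metres."""
    roi = depth[top:bot, left:right].astype(float)
    roi = roi[roi > 0]  # drop invalid (zero) depth readings
    if roi.size == 0:
        return None
    return float(np.median(roi)) / 1000.0

# Fake depth frame: a flat surface 1.5 m away
depth = np.full((480, 640), 1500, dtype=np.uint16)
print(object_distance_m(depth, 100, 100, 200, 200))  # 1.5
```

Using the median rather than the mean makes the estimate robust to background pixels and dropout inside the box.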