Live object detection using the TensorFlow Object Detection API, with speech output using gTTS and pygame.
This repository is an application of the TensorFlow Object Detection API that detects objects in a webcam feed and gives audible output for each detected object's class name. For audio output, it uses Google Text-to-Speech to generate audio files for the class names and pygame to play them.
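The detection-to-speech flow described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the helper names (`audio_path`, `names_to_announce`, `speak`), the `audio_cache` directory, and the 0.6 score threshold are all assumptions made for the example. It assumes the `gTTS` and `pygame` packages are installed and that network access is available the first time a class name is synthesized.

```python
import os
import re
import time


def audio_path(class_name, cache_dir="audio_cache"):
    """Map a class name such as 'traffic light' to a cached MP3 path."""
    safe = re.sub(r"[^a-z0-9]+", "_", class_name.lower()).strip("_")
    return os.path.join(cache_dir, safe + ".mp3")


def names_to_announce(class_ids, scores, id_to_name, min_score=0.6):
    """Return unique class names whose score passes the threshold, in order."""
    names = []
    for class_id, score in zip(class_ids, scores):
        name = id_to_name.get(int(class_id))
        if name and score >= min_score and name not in names:
            names.append(name)
    return names


def speak(class_name, cache_dir="audio_cache"):
    """Synthesize the class name once with gTTS, then play it with pygame."""
    # Third-party imports are local so the helpers above work without them.
    from gtts import gTTS
    import pygame

    path = audio_path(class_name, cache_dir)
    if not os.path.exists(path):
        os.makedirs(cache_dir, exist_ok=True)
        gTTS(text=class_name, lang="en").save(path)  # requires network access

    if not pygame.mixer.get_init():
        pygame.mixer.init()
    pygame.mixer.music.load(path)
    pygame.mixer.music.play()
    while pygame.mixer.music.get_busy():  # block until playback finishes
        time.sleep(0.05)
```

Caching the MP3 per class name means the text-to-speech service is contacted only once for each label; later detections of the same class just replay the saved file.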
The repository has been tested on Windows 10 and should also work on Windows 7 and 8. The general procedure also applies to Linux, though some minor changes might be required.
The TensorFlow Object Detection API requires the specific directory structure provided in its GitHub repository. It also requires several additional Python packages and a few extra setup commands before an object detection model can be run or trained.
Create a virtual environment with Anaconda and install the packages below in it. You can also install them with pip.
tensorflow >= 2.2
opencv-python >= 4.0
protobuf >= 3.1
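One way to confirm that the active environment meets these minimums is a small check script like the sketch below. The function names are made up for this example, and the version parsing is deliberately simple (it compares only the leading numeric parts). Packages installed through conda under a different distribution name may be reported as not installed.

```python
from importlib import metadata

# Minimum versions taken from the list above.
MINIMUMS = {"tensorflow": "2.2", "opencv-python": "4.0", "protobuf": "3.1"}


def version_tuple(version):
    """Turn '4.5.1.48' or '2.2.0rc1' into a tuple of leading integers."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two version strings by their numeric components."""
    return version_tuple(installed) >= version_tuple(minimum)


def check_environment(minimums=MINIMUMS):
    """Return {package: (installed_version_or_None, ok)}."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = (None, False)
        else:
            report[name] = (installed, meets_minimum(installed, minimum))
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_environment().items():
        status = "OK" if ok else "needs attention"
        print(f"{name}: {installed or 'not installed'} -> {status}")
```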
Clone the TensorFlow Object Detection repository, or download it as a zip and extract it. Go to the directory where the repo was cloned or extracted and navigate to research/. Open a terminal in the research directory and activate the environment created in step 1. Run the command below to generate Python scripts from the protocol buffer definitions in object_detection/protos:
protoc object_detection/protos/*.proto --python_out=.
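If the shell does not expand the `*.proto` wildcard (as can happen in the Windows command prompt), the same compilation can be done one file at a time. The script below is a sketch under that assumption: it is meant to be run from the research directory, `protoc` must be on the PATH, and the function names are made up for this example.

```python
import glob
import os
import subprocess


def build_protoc_commands(research_dir="."):
    """Return one protoc argument list per .proto file in object_detection/protos."""
    pattern = os.path.join(research_dir, "object_detection", "protos", "*.proto")
    commands = []
    for path in sorted(glob.glob(pattern)):
        # Use forward-slash paths relative to the research directory,
        # matching the form used in the command above.
        rel = os.path.relpath(path, research_dir).replace(os.sep, "/")
        commands.append(["protoc", rel, "--python_out=."])
    return commands


def compile_protos(research_dir="."):
    """Run protoc for every .proto file, stopping at the first failure."""
    for command in build_protoc_commands(research_dir):
        subprocess.run(command, cwd=research_dir, check=True)


if __name__ == "__main__":
    compile_protos()
```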
Clone this repository, or download it and extract all the contents, directly into the research/object_detection directory. Replace the object_detection/utils/ file with the one found in Object_detection_and_Speech/utils. The Object_detection_and_Speech/utils/ file contains some modifications required for this project.
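Because this step overwrites a file that ships with the TensorFlow repository, it can be worth keeping a copy of the original. The helper below is a hypothetical sketch of that: `replace_with_backup` and the `.bak` suffix are assumptions for this example, and the two paths must be filled in with the actual file names, which are not spelled out here.

```python
import os
import shutil


def replace_with_backup(original, replacement, backup_suffix=".bak"):
    """Copy `replacement` over `original`, keeping the first original as a backup."""
    backup = original + backup_suffix
    # Only back up once, so re-running never overwrites the pristine copy.
    if os.path.exists(original) and not os.path.exists(backup):
        shutil.copy2(original, backup)
    shutil.copy2(replacement, original)
    return backup
```

To undo the change later, copy the `.bak` file back over the replaced file.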
From the object_detection directory, open a terminal and activate the environment created in step 1. Then run the command below:
python Object_detection_and_Speech/
If you want to run object detection with a distance warning, run:
python Object_detection_and_Speech/
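How the distance warning is computed is not described here. As an illustration only, one common heuristic treats an object as "close" when its bounding box covers a large share of the frame. The sketch below assumes normalized `[ymin, xmin, ymax, xmax]` boxes, which is the format the TensorFlow Object Detection API returns. The function names and both thresholds are assumptions for this example, not values taken from this repository.

```python
def box_area_fraction(box):
    """Fraction of the frame covered by a normalized [ymin, xmin, ymax, xmax] box."""
    ymin, xmin, ymax, xmax = box
    return max(0.0, ymax - ymin) * max(0.0, xmax - xmin)


def too_close(boxes, scores, min_score=0.6, area_threshold=0.4):
    """Return indices of confident detections whose box fills much of the frame."""
    return [
        i
        for i, (box, score) in enumerate(zip(boxes, scores))
        if score >= min_score and box_area_fraction(box) >= area_threshold
    ]
```

A box-area heuristic needs no camera calibration, but it is only a rough proxy: a large object far away and a small object nearby can cover the same share of the frame.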