Computer Vision Introduction
Computer vision draws on our understanding of the human visual system and trains
computer systems to interpret and gain a high-level understanding of digital
images and videos. In the early days, building a machine with human-like visual
intelligence was just a dream, but with advances in artificial intelligence and
machine learning it has become possible. Intelligent systems have now been
developed that can "see" and interpret the world around them, much as human eyes
do. The fiction of yesterday has become the fact of today.
1959: The first experiments related to computer vision began in 1959, when
neurophysiologists showed a cat an array of images. They found that the cat's brain
responded first to hard edges or lines, suggesting that visual processing begins with
simple shapes such as straight edges.
1960: Artificial intelligence emerged as a field of academic study, and researchers
began applying it to the problem of human vision.
1963: In another milestone, researchers developed computer programs that could
transform 2D images into 3D representations.
1974: Optical character recognition (OCR) and intelligent character recognition
(ICR) technologies were introduced. OCR addressed the problem of recognizing text
printed in any font or typeface, whereas ICR can interpret handwritten text. These
technologies underpin document and invoice processing, vehicle number plate
recognition, mobile payments, machine translation, and more.
1982: Algorithms were developed to detect edges, corners, curves, and other basic
shapes. In the same period, scientists developed a network of cells, an early neural
network, that could recognize visual patterns.
2000: Research focus shifted toward the study of object recognition.
2001: The first real-time face detection framework was developed.
2010: The ImageNet dataset became available, with millions of tagged images, and it
can be considered a foundation for the recent wave of Convolutional Neural Network
(CNN) and deep learning models.
2012: A CNN (AlexNet) was used for image recognition in the ImageNet challenge and
sharply reduced the error rate.
2014: The COCO (Common Objects in Context) dataset was released to provide a
benchmark for object detection and support future research.
Firstly, a vast amount of labeled visual data is provided to the machine for
training. This labeled data enables the machine to analyze patterns across all the
data points and relate them to the labels. For example, suppose we provide millions
of labeled dog images. The computer learns from this data, analyzing each photo for
shapes, the distances between shapes, colors, and so on, identifies the patterns
common to dogs, and builds a model. The resulting computer vision model can then
detect, for each new input image, whether it contains a dog.
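To make this concrete, here is a minimal sketch of such a training loop. It assumes PyTorch with torchvision, a pretrained ResNet-18 backbone, and a hypothetical folder layout data/train/dog and data/train/not_dog with labeled images; none of these specifics come from the text above, they are just one plausible way to build a dog / not-dog classifier.

```python
# Sketch: fine-tune a pretrained CNN on labeled dog / not-dog images.
# Assumes torchvision >= 0.13 and hypothetical folders data/train/dog, data/train/not_dog.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms, models

# Standard preprocessing for ImageNet-pretrained backbones
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
train_set = datasets.ImageFolder("data/train", transform=transform)  # hypothetical path
train_loader = DataLoader(train_set, batch_size=32, shuffle=True)

# Start from a pretrained backbone and replace the final layer with a 2-class head
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 2)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

model.train()
for epoch in range(3):  # a few epochs are enough for a sketch
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    print(f"epoch {epoch}: loss {loss.item():.4f}")
```

In practice the labeled data would be split into training and validation sets, and the model would be evaluated on held-out images before being used to classify new ones.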
Computer vision is commonly used for the following tasks:
Object classification
Object identification
Object tracking
Optical character recognition
1. Object classification
The system classifies the objects it finds in an image into predefined categories.
For example, an object classification model can report whether a photo contains a
dog, a person, or a car.
2. Object identification
The system identifies a particular object in a photo or video. For
example, with object identification, the system can not only
distinguish the people in a photo but also analyze their appearance to determine
their identity or traits.
3. Object tracking
The system analyzes a video to follow the location of a moving object over
time. For example, with object tracking, a parking lot surveillance camera could
identify cars and provide information about their location and
movements over time.
4. Optical character recognition
The system identifies letters and numbers in images and converts that text
into machine-encoded text that can be read by other computer
applications or edited by users; a minimal code sketch follows this list.
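As a rough illustration of the OCR task above, the snippet below reads the text out of an image using the open-source Tesseract engine through the pytesseract wrapper. The library choice and the file name invoice.png are assumptions for the sketch, not details given in the text, and Tesseract itself must be installed separately.

```python
# Minimal OCR sketch: extract machine-encoded text from an image.
# Assumes the Tesseract engine is installed and "invoice.png" exists (hypothetical file).
from PIL import Image
import pytesseract

image = Image.open("invoice.png")
text = pytesseract.image_to_string(image)  # recognized text as a plain Python string
print(text)
```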
Healthcare
Smart Cities