Computer Vision

Computer Vision Introduction

1. Computer vision is a subfield of artificial intelligence that deals with acquiring,
processing, analyzing, and making sense of visual data such as digital images and
videos. It is one of the most compelling types of artificial intelligence, and one that we
regularly use in our daily routines.

2. Computer vision helps to understand the complexity of the human vision system and trains
computer systems to interpret and gain a high-level understanding of digital images or videos.

What is Computer Vision?


Computer vision is one of the most important fields of artificial intelligence
(AI) and computer science engineering that makes computer systems capable
of extracting meaningful information from visual data like videos and
images. Further, it also helps to take appropriate actions and make recommendations
based on the extracted information.

History of Computer Vision


Computer vision is not a new technology because scientists and experts have been
trying to develop machines that can see and understand visual data for almost six
decades. The evolution of computer vision is classified as follows:

1959: The first experiment with computer vision was initiated in 1959, when researchers showed
a cat an array of images and recorded its brain's responses. The results suggested that image
processing begins with simple shapes such as straight edges.

1963: This was another great achievement for scientists, who developed
computers that could transform 2D images into 3D forms.

1974: Optical character recognition (OCR) and intelligent character recognition (ICR)
technologies were introduced. OCR recognizes text printed in any font or typeface,
whereas ICR can decipher handwritten text.

1982: Algorithms were developed to detect edges, corners, curves, and other basic
shapes. Further, scientists also developed a network of cells that could recognize
patterns.

2000: In this year, scientists worked on a study of object recognition.

2001: The first real-time face recognition application was developed.

2010: The ImageNet data set became available with millions of tagged images, and it
can be considered the foundation for recent Convolutional Neural Network (CNN)
and deep learning models.

2012: A CNN was entered in an image recognition contest and significantly reduced the
error rate for image recognition.

2014: The COCO dataset was released to support object detection and future
research.

How does Computer Vision Work?


Computer vision is a technique that extracts information from visual data, such as
images and videos. It is often described as working like the human eyes and brain, but
exactly how the human brain solves visual object recognition is still one of the biggest
open questions in the field.

On a certain level, computer vision is all about pattern recognition: machines are
trained to understand visual data such as images and videos.

First, a vast amount of labeled visual data is provided to the machine to train it. This
labeled data enables the machine to analyze patterns across all the data points and
relate them to the labels. For example, suppose we provide millions of dog images.
In that case, the computer learns from this data by analyzing each photo and the shapes in it.
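The idea of learning patterns from labeled examples can be sketched with a deliberately tiny model. The snippet below is only an illustration, not how production systems are built: it assumes images are flat lists of pixel intensities, and the function names `train_centroids` and `predict`, the 2x2 toy images, and the "bright"/"dark" labels are all made up for this example. Real systems typically train neural networks on far larger data sets.

```python
# Tiny "images" are flat lists of pixel intensities in [0, 1].
def train_centroids(images, labels):
    """Average the training images of each label into one prototype image."""
    sums, counts = {}, {}
    for pixels, label in zip(images, labels):
        acc = sums.setdefault(label, [0.0] * len(pixels))
        for i, value in enumerate(pixels):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict(centroids, pixels):
    """Return the label whose prototype is closest (squared Euclidean distance)."""
    def distance(label):
        return sum((a - b) ** 2 for a, b in zip(centroids[label], pixels))
    return min(centroids, key=distance)


# Hypothetical labeled training set of 2x2 images.
train_images = [[0.9, 0.8, 1.0, 0.9], [0.8, 0.9, 0.9, 1.0],
                [0.1, 0.0, 0.2, 0.1], [0.0, 0.1, 0.1, 0.2]]
train_labels = ["bright", "bright", "dark", "dark"]

model = train_centroids(train_images, train_labels)
print(predict(model, [0.7, 0.9, 0.8, 0.8]))  # bright
```

The "model" here is just one average image per label; the point is only that the labels tell the machine which patterns belong together.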

Tasks Associated with Computer Vision

Although computer vision is used in many fields, a few tasks are common to most
computer vision systems. These tasks are given below:

o Object classification: Object classification is a computer vision technique/task used
to classify an image, such as whether an image contains a dog, a person's face, or a
banana. It analyzes the visual content (videos & images) and classifies the object into the
defined category. It means that we can accurately predict the class of an object
present in an image with image classification.
o Object Identification/detection: Object identification or detection uses image
classification to identify and locate the objects in an image or video. With such detection
and identification technique, the system can count objects in a given image or scene and
determine their accurate location and labeling. For example, in a given image, one dog,
one cat, and one duck can be easily detected and classified using the object detection
technique.
o Object Tracking: The system processes videos, finds the objects that match the search
criteria, and tracks their movement.
o Object Landmark Detection: The system defines the key points for the given object in
the image data.
o Image Segmentation: Image segmentation does not just detect the classes in an image, as
image classification does. Instead, it classifies each pixel of the image to specify which
object it belongs to. It tries to determine the role of each pixel in the image.
o Object Recognition: In this, the system recognizes the object and its location with
respect to the image.

Applications of Computer Vision

Computer vision is one of the most advanced innovations of artificial intelligence and
machine learning. With the increasing demand for AI and machine learning
technologies, computer vision has attracted attention across many different
sectors.

Below are some of the most popular applications of computer vision:

o Facial recognition: Computer vision has enabled machines to detect face images of
people to verify their identity. The machines are given input images in which
computer vision algorithms detect facial features and compare them with databases of
face profiles. Popular social media platforms like Facebook also use facial recognition to
detect and tag users. Further, various government agencies employ this
feature to identify criminals in video feeds.
o Healthcare and Medicine: Computer vision plays an important role in the
healthcare and medicine industry. Traditional approaches for evaluating cancerous tumors
are time-consuming and less accurate, whereas computer vision
technology provides faster and more accurate chemotherapy response assessments, so
doctors can identify cancer patients who need faster surgery with life-saving precision.
o Self-driving vehicles: Computer vision helps self-driving vehicles make sense of their
surroundings by capturing video from different angles around the car and feeding it
into the software.
o Optical character recognition (OCR): Optical character recognition helps us extract
printed or handwritten text from visual data such as images. Further, it also enables us
to extract text from documents like invoices, bills, articles, etc.
o Machine inspection: Computer vision is vital in providing image-based automatic
inspection. It detects defects, features, functional flaws, and other irregularities in
manufactured products. Setting up such a system involves determining the inspection
goals and choosing lighting and material-handling techniques.
o Retail (e.g., automated checkouts): Computer vision is also being implemented in the
retail industry to track products, shelves, and wages, record product movements in the
store, etc. This AI-based computer vision technique automatically charges the customer
for the marked products upon checkout from the retail store.
o 3D model building: 3D model building or 3D modeling is a technique to generate a 3D
digital representation of any object or surface using software. Here, too,
computer vision plays a role in constructing 3D computer models from existing objects.
o Medical imaging: Computer vision helps medical professionals make better decisions
regarding treating patients by developing visualization of specific body parts such as
organs and tissues. It helps them get more accurate diagnoses and a better patient care
system.
o Automotive safety: Computer vision has added an important safety feature in the
automotive industry. E.g., if a vehicle is taught to detect objects and dangers, it could
prevent accidents and save lives and property.
o Surveillance: It is one of computer vision technology's most important and beneficial use
cases. Nowadays, CCTV cameras are fitted almost everywhere, such as streets, roads,
highways, shops, stores, etc., to spot suspicious or criminal activities. They
provide live footage of public places to identify suspicious behavior, identify dangerous
objects, and prevent crimes, helping to maintain law and order.
o Fingerprint recognition and biometrics: Computer vision technology detects
fingerprints and biometrics to validate a user's identity. Biometrics deals with recognizing
persons based on physiological characteristics, such as the face, fingerprint, vascular
pattern, or iris, and behavioral traits, such as gait or speech. It combines Computer Vision
with knowledge of human physiology and behavior.

Computer Vision Challenges

There are a few challenges observed while working with computer vision technology.

o Reasoning and analytical issues: All programming languages and technologies require
basic logic behind any task. To become a computer vision expert, you must have
strong reasoning and analytical skills. Without such skills, defining any
attribute in visual content may be a big problem.
o Privacy and security: Privacy and security are among the most important concerns for any
country. Vision-powered surveillance raises serious privacy issues in many
countries. It restricts users from accessing unauthorized content.
o Duplicate and false content: Cyber security is always a big concern for all
organizations, and they always try to protect their data from hackers and cyber fraud. A
data breach can lead to serious problems, such as creating duplicate images and videos
over the internet.

Computer Vision Applications


Computer vision is a subfield of AI (Artificial Intelligence), which enables
machines to derive some meaningful information from any image, video, or
other visual input and perform the required action on that information.

o Computer vision is like eyes for an AI system, which means if AI enables the
machine to think, computer vision enables the machines to see and observe the
visual inputs.

o Computer vision technology is based on the concept of teaching computers to
process an image or a visual input at the pixel level and derive meaningful information
from it.

Common tasks for which computer vision can be used:


o Image Classification: Image classification is a computer vision technique
used to classify an image, such as whether an image contains a dog, a person's
face, or a banana. It means that with image classification, we can
accurately predict the class of an object present in an image.
o Object Detection: Object detection uses image classification to identify and
locate the objects in an image or video. With such detection and identification
technique, the system can count objects in a given image or scene and determine
their accurate location, along with their labelling. For example, in a given image,
there is one person and one cat, which can be easily detected and classified using
the object detection technique.

o Object Tracking: Object tracking is a computer vision technique used to follow a
particular object or multiple items. Generally, object tracking has applications in
videos and real-world interactions, where objects are first detected and then
tracked for observation.
o Semantic Segmentation: Semantic segmentation does not just detect the
classes in an image, as image classification does. Instead, it classifies each pixel of an
image to specify which object it belongs to.
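The "detect first, then track" idea from the object tracking item can be sketched as follows. This is a minimal illustration under stated assumptions, not a production tracker: it assumes a detector has already produced one (x, y) centre point per object in each frame, and the function name `update_tracks` and the `max_distance` threshold are hypothetical choices for this example.

```python
import math


def update_tracks(tracks, detections, max_distance=50.0):
    """Match each detection to the nearest unmatched track, or start a new track.

    tracks: dict mapping track id -> (x, y) position in the previous frame.
    detections: list of (x, y) object centres found in the current frame.
    """
    updated = {}
    unmatched = dict(tracks)                  # tracks not yet claimed in this frame
    next_id = max(tracks, default=-1) + 1     # first free id for brand-new objects
    for point in detections:
        nearest = min(unmatched,
                      key=lambda tid: math.dist(unmatched[tid], point),
                      default=None)
        if nearest is not None and math.dist(unmatched[nearest], point) <= max_distance:
            updated[nearest] = point          # same object, new position
            del unmatched[nearest]
        else:
            updated[next_id] = point          # too far from every track: new object
            next_id += 1
    return updated


frame1 = update_tracks({}, [(10, 10), (100, 100)])
frame2 = update_tracks(frame1, [(104, 98), (12, 11)])
print(frame2)
```

Tracks that receive no detection are simply dropped here, and ids may be reused afterwards; a real tracker would usually keep lost tracks alive for a few frames and use motion models.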

1. Computer Vision in Healthcare


The Healthcare industry is rapidly adopting new technologies and automation solutions,
one of which is computer vision. In the healthcare industry, computer vision has the
following applications:

o X-Ray Analysis
Computer vision can be successfully applied to medical X-ray imaging. With computer
vision, X-ray analysis can be automated with enhanced efficiency and accuracy.
State-of-the-art image recognition algorithms can be used to detect patterns in an X-ray
image that are too subtle for the human eye.
o Cancer Detection
Computer vision is being successfully applied for breast and skin cancer detection. With
image recognition, doctors can identify anomalies by comparing cancerous and non-
cancerous cells in images.
o CT Scan and MRI
Computer vision is now widely applied in CT scan and MRI analysis. AI systems with
computer vision analyse radiology images with a high
level of accuracy, similar to a human doctor, and also reduce the time needed for disease
detection, enhancing the chances of saving a patient's life.

2. Computer Vision in Transportation


Some popular applications of computer vision in the transportation industry:

o Self-driving cars
Computer vision is widely used in self-driving cars. It is used to detect and classify objects
(e.g., road signs or traffic lights), create 3D maps, and estimate motion, and it plays a key
role in making autonomous vehicles a reality.
o Pedestrian detection
Computer vision is widely applied and researched in pedestrian detection. With the help
of cameras, pedestrian detection automatically identifies and locates the pedestrians in
an image or video. Pedestrian detection is very helpful in different fields such as traffic
management, autonomous driving, transit safety, etc.
o Road Condition Monitoring & Defect detection
Computer vision has also been applied to monitoring the condition of road infrastructure by
assessing the variations in concrete and tar. A computer vision-enabled system
automatically senses pavement degradation, which increases the efficiency of road
maintenance allocation and decreases safety risks related to road accidents.

3. Computer Vision in Manufacturing


In the manufacturing industry, the demand for automation is at its peak. Below are some
of the most popular applications:

o Defect Detection
This is perhaps the most common application of computer vision. With computer vision,
we can detect defects such as cracks in metals, paint defects, bad prints, etc.
o Analyzing text and barcodes (OCR)
Nowadays, each product contains a barcode on its packaging, which can be analyzed or
read with the help of the computer vision technique OCR. Optical character recognition or
OCR helps us detect and extract printed or handwritten text from visual data such as
images.
o Fingerprint recognition and Biometrics
Computer vision technology is used to detect fingerprints and biometrics to validate a
user's identity.
Biometrics is the measurement or analysis of the physiological characteristics that
make a person unique, such as the face, fingerprints, iris patterns, etc. It makes use of
computer vision along with knowledge of human physiology and behaviour.
o 3D Model building
3D model building or 3D modelling is a technique to generate a 3D digital representation
of any object or surface using software. Here, too, computer vision plays a role in
constructing 3D computer models from existing objects.

4. Computer Vision in Agriculture


Some popular cases of computer vision applications in Agriculture:

o Crop Monitoring
In the agriculture sector, crop and yield monitoring are among the most important tasks.
Computer vision systems make it possible to monitor crops in real time and to identify
any crop variation caused by disease or nutritional deficiency.
o Automatic Weeding
An automatic weeding machine is an intelligent system enabled with AI and computer
vision that removes unwanted plants or weeds around the crops. Traditional weeding
methods require human labour, which is costly and inefficient compared to automatic
weeding systems.
Computer vision enables the intelligent detection and removal of weeds using robots,
which reduces costs and ensures higher yields.
o Plant Disease Detection
Computer vision is also used in automated plant disease detection, which is important at
an early stage of plant development.

5. Computer Vision in Retail


Some popular applications of computer vision in the retail industry are given below:

o Self-checkout
Self-checkout enables customers to complete their transactions with a retailer without
the need for human staff, and this becomes possible with computer vision. Self-checkouts
are now helping retailers avoid long queues and manage customers.
o Automatic replenishment
Automated stock replenishment is a leading technology innovation in the retail sector.
Automatic replenishment with computer vision systems captures image data and
performs a complete inventory scan to track shelf items at regular intervals.
o People Counting
Nowadays, there are various situations where we may need a count of the people or customers
entering and leaving a store. This foot count or people counting can be done by
computer vision systems that analyze the image or video data captured by the in-store
cameras. People counting is helpful for managing crowds and limiting the number of
people allowed inside, for example to maintain Covid social distancing.

Computer Vision Techniques


What is Computer Vision?
Computer vision is a sub-field of AI and machine learning that enables the machine to
see, understand, and interpret the visuals such as images, video, etc., and extract useful
information from them that can be helpful in the decision-making of AI applications. It
can be considered as an eye for an AI application.

A typical process of computer vision is illustrated in the above image. It mainly performs three steps, which are:

1. Capturing an Image

A computer vision application typically includes a digital camera or CCTV to capture the image. So,
first it captures the image and stores it as a digital file that consists of zeros and ones.

2. Processing the image

In the next step, different CV algorithms are used to process the digital data stored in a file. These algorithms
determine the basic geometric elements and generate the image using the stored digital data.

3. Analyzing and taking required action

Finally, the CV analyses the data, and according to this analysis, the system takes the
required action for which it is designed.
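The three steps above can be sketched end to end on a toy example. This is only an illustration under stated assumptions: the camera is replaced by a hard-coded grayscale grid, "processing" is a simple brightness threshold, and the function names, the threshold of 128, and the `raise_alert` / `do_nothing` actions are all hypothetical.

```python
def capture_image():
    """Step 1 stand-in: return a grayscale image as rows of 0-255 values.

    A real system would read this from a camera; the values here are made up.
    """
    return [
        [10, 12, 11, 10],
        [10, 200, 210, 12],
        [11, 205, 220, 10],
        [10, 11, 12, 10],
    ]


def process_image(image, threshold=128):
    """Step 2: turn raw intensities into a binary mask (1 = bright pixel)."""
    return [[1 if value >= threshold else 0 for value in row] for row in image]


def analyze_and_act(mask, min_fraction=0.2):
    """Step 3: measure how much of the frame is bright and choose an action."""
    total = sum(len(row) for row in mask)
    bright = sum(sum(row) for row in mask)
    return "raise_alert" if bright / total >= min_fraction else "do_nothing"


mask = process_image(capture_image())
print(analyze_and_act(mask))  # raise_alert
```

In this toy frame 4 of the 16 pixels are bright, so the 20% threshold is met and the sketch chooses the alert action.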

Top Computer Vision Techniques


1. Image Classification

Image classification is the simplest technique of computer vision. The main aim of image
classification is to classify the image into one or more different categories. An image
classifier takes an image as input and reports which objects are present in
that image, such as a person, dog, tree, etc.

Image classification is basically of two types: binary classification and multi-class
classification. As the name suggests, binary image classification looks for a single class
in the given image and provides results based on whether the image has that object or not. For
example, an AI system can learn to detect skin cancer in humans, with very high accuracy,
by being trained on both images that show skin cancer and images that do not.
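The difference between the two types can be sketched on the output side of a classifier. The snippet assumes a model has already produced raw scores for an image; using a sigmoid for the binary case and a softmax for the multi-class case is a common convention, not something the text above prescribes, and the function names and scores are illustrative.

```python
import math


def sigmoid(score):
    """Squash one raw score into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-score))


def classify_binary(score, threshold=0.5):
    """Binary: a single score answers 'is the class present?' with yes or no."""
    return sigmoid(score) >= threshold


def softmax(scores):
    """Turn one raw score per class into probabilities that sum to 1."""
    highest = max(scores)                      # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify_multiclass(scores, class_names):
    """Multi-class: one score per class; return the most probable class."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs[best]


print(classify_binary(2.0))
print(classify_multiclass([0.1, 2.5, -1.0], ["person", "dog", "tree"]))
```

A binary classifier for the skin cancer example would use the first form, while a classifier that names the object in a photo would use the second.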

2. Object Detection

Object detection is another popular technique of computer vision that can be performed
after image classification or that uses image classification to detect the objects in
visual data. It is used to locate the objects within bounding boxes and
find the class of each object in the image.

As human beings, whenever we see a visual or look at an image or video, we can
immediately recognize and even locate the objects within a moment. So, the aim of
object detection is to replicate the same human intelligence in machines to identify
and locate the objects.

Object detection has several applications, including object tracking, retrieval, video
surveillance, image captioning, etc.
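A detector's output can be pictured as a list of labelled bounding boxes. The sketch below assumes boxes are given as (x_min, y_min, x_max, y_max) tuples with a confidence score; the detections themselves are made up. It counts objects per class and computes intersection over union (IoU), a standard measure of how well two boxes overlap that is commonly used to compare a predicted box with a reference box.

```python
from collections import Counter


def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))   # overlap width, 0 if disjoint
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))   # overlap height, 0 if disjoint
    intersection = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - intersection
    return intersection / union if union > 0 else 0.0


# Hypothetical detector output: (label, box, confidence score).
detections = [
    ("dog", (10, 20, 110, 140), 0.92),
    ("cat", (150, 30, 230, 120), 0.88),
    ("dog", (240, 40, 330, 150), 0.35),   # low confidence, filtered out below
]

counts = Counter(label for label, box, score in detections if score >= 0.5)
print(counts)
print(iou((0, 0, 10, 10), (5, 5, 15, 15)))
```

An IoU of 1.0 means the boxes coincide and 0.0 means they do not overlap at all.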

3. Semantic Segmentation

Semantic segmentation does not just detect the classes in an image, as image
classification does. Instead, it classifies each pixel of the image to specify which object it
belongs to, determining the role of each pixel. It treats similar objects as a
single class at the pixel level. For example, if an image contains two dogs, then
semantic segmentation will put both dogs under the same label.
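Per-pixel classification can be sketched as picking the best-scoring class at every pixel. The snippet assumes a model has already produced one score map per class; the function name `segment`, the 2x3 score maps, and the class names are illustrative only.

```python
def segment(score_maps, class_names):
    """Return a label map holding the best-scoring class name for every pixel.

    score_maps[c][y][x] is the score of class c at pixel (y, x).
    """
    height, width = len(score_maps[0]), len(score_maps[0][0])
    labels = []
    for y in range(height):
        row = []
        for x in range(width):
            best = max(range(len(class_names)), key=lambda c: score_maps[c][y][x])
            row.append(class_names[best])
        labels.append(row)
    return labels


# Hypothetical scores for a 2x3 image and two classes.
background_scores = [[0.9, 0.2, 0.1],
                     [0.8, 0.3, 0.2]]
dog_scores = [[0.1, 0.8, 0.9],
              [0.2, 0.7, 0.8]]

for row in segment([background_scores, dog_scores], ["background", "dog"]):
    print(row)
```

Every pixel gets a class name, but nothing in the result says how many dogs there are, which is exactly the limitation of semantic segmentation.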

4. Instance Segmentation

Instance segmentation classifies the objects in an image at the pixel level, like
semantic segmentation, but goes a step further: it can separate objects of the same
type into distinct instances. For example, if an image contains several cars, semantic
segmentation can tell us that there are cars, whereas instance segmentation labels each
car separately, so they can be told apart by colour, shape, etc.

Instance segmentation is a comparatively difficult computer vision task because
it needs to analyse visual data containing different overlapping objects
and different backgrounds.

Using the below image, we can analyse the difference between semantic segmentation
and instance segmentation: semantic segmentation classifies all the persons as a
single entity, whereas instance segmentation classifies each person as a different entity,
also considering colours.
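One simple way to see the step from "class per pixel" to "separate instances" is connected-component labelling on the mask of a single class. This is a swapped-in classical technique used purely for illustration, not how learned instance segmentation models work, and it only separates objects that do not touch or overlap. The function name `label_instances` and the toy mask are made up.

```python
from collections import deque


def label_instances(mask):
    """Give each 4-connected group of 1-pixels its own instance id (1, 2, ...)."""
    height, width = len(mask), len(mask[0])
    ids = [[0] * width for _ in range(height)]   # 0 means "no instance"
    count = 0
    for y in range(height):
        for x in range(width):
            if mask[y][x] != 1 or ids[y][x] != 0:
                continue
            count += 1                            # found an unlabelled object pixel
            ids[y][x] = count
            queue = deque([(y, x)])
            while queue:                          # flood-fill the rest of the object
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny][nx] == 1 and ids[ny][nx] == 0):
                        ids[ny][nx] = count
                        queue.append((ny, nx))
    return ids, count


# Hypothetical "person" mask from a semantic segmentation step.
person_mask = [
    [1, 1, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [0, 0, 0, 1, 1],
]
instance_ids, num_people = label_instances(person_mask)
print(num_people)  # 2
```

The semantic mask only says "person" or "not person"; the id map additionally says which person each pixel belongs to.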

5. Panoptic Segmentation = Semantic + Instance Segmentation

Panoptic segmentation is one of the most powerful computer vision techniques, as it
combines the instance and semantic segmentation techniques. It means that with panoptic
segmentation, you can classify image objects at the pixel level and also identify
separate instances of each class.
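The combination can be sketched as pairing, for every pixel, a class label with an instance id. The snippet assumes the two per-pixel maps already exist; the split into countable classes ("things") and uncountable regions such as sky, which get no instance id, is a common convention assumed here rather than something stated in the text, and all names and values are illustrative.

```python
def panoptic(semantic, instance_ids, thing_classes):
    """Pair each pixel's class with an instance id.

    Pixels of countable classes keep their instance id; all other
    classes (for example sky or road) get None instead.
    """
    return [
        [(cls, inst if cls in thing_classes else None)
         for cls, inst in zip(sem_row, inst_row)]
        for sem_row, inst_row in zip(semantic, instance_ids)
    ]


# Hypothetical 2x2 image: sky on top, two different dogs below.
semantic_map = [["sky", "sky"],
                ["dog", "dog"]]
instance_map = [[0, 0],
                [1, 2]]

for row in panoptic(semantic_map, instance_map, thing_classes={"dog"}):
    print(row)
```

Each pixel now answers both questions at once: what class it is, and which individual object of that class it belongs to.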

6. Keypoint Detection

Keypoint detection tries to detect some key points in an image to give more details
about a class of objects. It basically detects people and localizes their key points. There
are mainly two keypoint detection areas, which are Body Keypoint
Detection and Facial Keypoint Detection.

For example, facial keypoint detection includes detecting key parts of the human face
such as the nose, eyes, corners, eyebrows, etc. Keypoint detection is mainly
applied in face detection, pose detection, etc.

With Pose estimation, we can detect what pose people have in a given image, which
usually includes where the head, eyes, nose, arms, shoulders, hands, and legs are in an
image. This can be done for a single person or multiple people as per the need.
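Once body keypoints are available, a pose can be described with simple geometry. The sketch below assumes a keypoint detector has already returned (x, y) pixel coordinates for the shoulder, elbow, and wrist; the coordinates and the function name `joint_angle` are made up for illustration.

```python
import math


def joint_angle(a, b, c):
    """Angle in degrees at keypoint b, between the segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    norm = math.hypot(*v1) * math.hypot(*v2)
    if norm == 0:
        raise ValueError("keypoints must not coincide")
    cosine = (v1[0] * v2[0] + v1[1] * v2[1]) / norm
    cosine = max(-1.0, min(1.0, cosine))   # guard against rounding just outside [-1, 1]
    return math.degrees(math.acos(cosine))


# Hypothetical keypoints for one arm.
shoulder, elbow, wrist = (0, 0), (1, 0), (1, 1)
print(joint_angle(shoulder, elbow, wrist))  # 90.0
```

An elbow angle near 180 degrees would indicate a straight arm, while smaller angles indicate a bent arm.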

7. Person Segmentation

Person segmentation is a type of image segmentation technique which is used to
separate the person from the background within an image. It can be used after pose
estimation, as with this, we can closely identify the exact location of the person in the
image as well as the pose of that person.

8. Depth Perception

Depth perception is a computer vision technique that gives machines the visual ability
to estimate the 3D depth/distance of an object from the source. Depth
perception has wide applications, including the reconstruction of objects in augmented
reality, robotics, self-driving cars, etc.
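One common way to estimate depth, assumed here purely for illustration since the text does not name a method, is a calibrated stereo camera pair. An object appears shifted between the left and right image by a disparity d (in pixels), and its depth is Z = f * B / d, where f is the focal length in pixels and B is the distance between the two cameras. The numbers below are made up.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth in metres of a point seen by a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical rig: 700 px focal length, cameras 10 cm apart.
for disparity in (70.0, 35.0, 7.0):
    print(disparity, depth_from_disparity(disparity, 700.0, 0.10))
```

The smaller the disparity, the farther away the object, which is why distant objects are harder to range precisely with this approach.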

9. Image Captioning

Image captioning, as the name suggests, is about giving the image a suitable caption
that describes it. It makes use of neural networks: when we input an
image, the network generates a caption that describes the image. It is
not only a computer vision task but also an NLP task.

10. 3D Object Reconstruction

As the name suggests, 3D object reconstruction is a technique that can extract 3D
objects from a 2D image.
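A basic building block of many reconstruction approaches is turning a pixel plus its depth into a 3D point. The sketch below assumes a pinhole camera model with known intrinsics (focal lengths fx, fy and principal point cx, cy, all in pixels) and an already estimated depth; these assumptions, the function name `back_project`, and the numbers are illustrative and not taken from the text.

```python
def back_project(u, v, depth, fx, fy, cx, cy):
    """Map pixel (u, v) with known depth to a 3D point (X, Y, Z) in camera coordinates."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


# Hypothetical 640x480 camera with a 500 px focal length.
print(back_project(420, 240, 2.0, fx=500.0, fy=500.0, cx=320.0, cy=240.0))
```

Repeating this for every pixel with a depth estimate yields a point cloud, which is a common starting representation for a reconstructed 3D object.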
