ARTIFICIAL INTELLIGENCE
QUESTION BANK – CLASS 10
UNIT 5: COMPUTER VISION
MCQs:
1. What does Computer Vision enable machines to do?
a) Read text
b) Hear sounds
c) See through images and process visual data
d) Speak
Answer: c) See through images and process visual data
2. Which application of Computer Vision involves recognizing human faces for
security purposes?
a) Face Filters
b) Google Translate App
c) Facial Recognition
d) Medical Imaging
Answer: c) Facial Recognition
3. Which Computer Vision task do self-driving cars rely on to identify objects on the road?
a) Image Classification
b) Object Detection
c) Convolution
d) Medical Imaging
Answer: b) Object Detection
4. In a grayscale image, the darkest pixel has a value of:
a) 255
b) 128
c) 0
d) 1
Answer: c) 0
5. What are the three primary colors used in an RGB image?
a) Red, Yellow, Blue
b) Red, Green, Blue
c) Red, Green, Black
d) Yellow, Blue, White
Answer: b) Red, Green, Blue
6. Which of the following applications uses Computer Vision to help users search for
information using an image?
a) Face Filters
b) Google’s Search by Image
c) Google Translate App
d) Inventory Management
Answer: b) Google’s Search by Image
What type of image does not have color but consists of different shades of gray?
a) RGB Image
b) Grayscale Image
c) Filtered Image
d) Segmented Image
Answer: b) Grayscale Image
The term “pixel” refers to:
a) A picture element
b) A processing unit
c) A color model
d) An algorithm
Answer: a) A picture element
What does a convolution operation do in image processing?
a) Brightens the image
b) Combines multiple images
c) Multiplies image regions by a kernel and sums the results
d) Increases the resolution of the image
Answer: c) Multiplies image regions by a kernel and sums the results
What is the range of pixel values in an 8-bit grayscale image?
a) 0-100
b) 0-500
c) 0-255
d) 0-1024
Answer: c) 0-255
In computer vision, which task involves assigning a label to an image from a fixed set
of categories?
a) Object Detection
b) Instance Segmentation
c) Image Classification
d) Feature Extraction
Answer: c) Image Classification
Which layer of the Convolutional Neural Network (CNN) is responsible for extracting
features such as edges from the input image?
a) Fully Connected Layer
b) Convolution Layer
c) Pooling Layer
d) ReLU Layer
Answer: b) Convolution Layer
What does Max Pooling do in a CNN?
a) Extracts high-level features
b) Returns the maximum value from the image region
c) Classifies the image
d) Reduces noise in the image
Answer: b) Returns the maximum value from the image region
What is the purpose of the Fully Connected Layer in a CNN?
a) Reduce image size
b) Extract features
c) Perform image segmentation
d) Classify the image into labels
Answer: d) Classify the image into labels
Which feature in images is considered the easiest to detect in image processing?
a) Edges
b) Corners
c) Flat surfaces
d) Textures
Answer: b) Corners
What type of pooling reduces the spatial size of the convolved feature while retaining
the important features?
a) Average Pooling
b) Max Pooling
c) Zero Pooling
d) Dynamic Pooling
Answer: b) Max Pooling
How are RGB images stored?
a) As a single grayscale value
b) In three different channels
c) As a string of text
d) Using a single bit
Answer: b) In three different channels
Which computer vision task involves finding instances of real-world objects in images
or videos?
a) Instance Segmentation
b) Object Detection
c) Feature Extraction
d) Classification
Answer: b) Object Detection
In computer systems, pixel data is stored in:
a) Binary format
b) Text format
c) Audio format
d) XML format
Answer: a) Binary format
What is the result when R=G=B=0 in an RGB image?
a) Black
b) White
c) Gray
d) Red
Answer: a) Black
Which application of Computer Vision converts 2D scans into 3D models for medical
professionals?
a) Self-driving cars
b) Medical Imaging
c) Google Translate App
d) Face Filters
Answer: b) Medical Imaging
In which decade was the concept of Computer Vision introduced?
a) 1960s
b) 1970s
c) 1980s
d) 1990s
Answer: b) 1970s
Which component of a CNN helps introduce non-linearity into the feature map?
a) Pooling Layer
b) Convolution Layer
c) Fully Connected Layer
d) ReLU Layer
Answer: d) ReLU Layer
What is an 8-bit image’s maximum pixel value?
a) 100
b) 128
c) 200
d) 255
Answer: d) 255
What feature does Google Translate use to provide real-time text translation through
the camera?
a) Optical Character Recognition
b) Face Detection
c) Image Filters
d) Object Detection
Answer: a) Optical Character Recognition
Which image property determines how much detail the image has?
a) Color depth
b) Pixel value
c) Resolution
d) Filter
Answer: c) Resolution
What is a key challenge in object detection?
a) Identifying edges
b) Locating the objects accurately
c) Applying filters
d) Storing the image
Answer: b) Locating the objects accurately
What does the kernel in convolution help with in image processing?
a) Identifying colors
b) Resizing the image
c) Enhancing certain features
d) Increasing brightness
Answer: c) Enhancing certain features
Which of the following is NOT a layer in Convolutional Neural Networks (CNN)?
a) Fully Connected Layer
b) Pooling Layer
c) ReLU Layer
d) Median Filter Layer
Answer: d) Median Filter Layer
What is the output color when R=255, G=0, and B=0 in the RGB model?
a) Blue
b) Green
c) Red
d) Yellow
Answer: c) Red
Which task detects each object in an image and assigns a label to every pixel of it?
a) Localization
b) Classification
c) Feature extraction
d) Instance segmentation
Answer: d) Instance segmentation
In convolution, which operation occurs after multiplying the image and the kernel?
a) Pooling
b) Summation
c) Filtering
d) Transformation
Answer: b) Summation
The Google Translate App uses Computer Vision for which task?
a) Facial Recognition
b) Image Classification
c) Object Detection
d) Text Translation
Answer: d) Text Translation
The more pixels an image has, the __ it is.
a) Smaller
b) Larger
c) Blurred
d) Brighter
Answer: b) Larger
What is the smallest unit of a digital image?
a) Frame
b) Pixel
c) Color depth
d) Bit
Answer: b) Pixel
In the RGB color model, what happens when all channels (R, G, B) are set to 255?
a) Black color is produced
b) Gray color is produced
c) Red color is produced
d) White color is produced
Answer: d) White color is produced
What type of layer in a CNN is responsible for reducing the size of the image?
a) Pooling Layer
b) Convolution Layer
c) Fully Connected Layer
d) Feature Layer
Answer: a) Pooling Layer
What is the main purpose of image classification in Computer Vision?
a) To extract edges from images
b) To reduce the size of the image
c) To assign an image a label from a set of categories
d) To increase the resolution
Answer: c) To assign an image a label from a set of categories
Which term refers to identifying where an object is located in an image?
a) Object Detection
b) Localization
c) Image Segmentation
d) Feature extraction
Answer: b) Localization
What are grayscale images commonly used for?
a) Detecting faces
b) Simplifying image processing
c) Increasing color intensity
d) Reducing image noise
Answer: b) Simplifying image processing
What technique involves finding the exact pixel location of features in an image?
a) Image Compression
b) Feature Detection
c) Image Denoising
d) Color Segmentation
Answer: b) Feature Detection
In CNNs, what is the benefit of applying multiple convolution layers?
a) Increase image size
b) Extract high-level features
c) Remove color from the image
d) Reduce brightness
Answer: b) Extract high-level features
What is a fundamental use of Computer Vision in retail?
a) Image Classification
b) Tracking customer movements
c) Face recognition
d) Object Segmentation
Answer: b) Tracking customer movements
Which layer of a CNN replaces negative values in the feature map with zero?
a) Pooling Layer
b) ReLU Layer
c) Convolution Layer
d) Fully Connected Layer
Answer: b) ReLU Layer
What is the RGB color model primarily used for?
a) Encoding textures
b) Image classification
c) Color representation in images
d) Denoising images
Answer: c) Color representation in images
Which feature in an image helps identify an edge during image processing?
a) Brightness
b) Contrast
c) Gradient
d) Pixel density
Answer: c) Gradient
What does object detection in self-driving cars help with?
a) Recognizing facial expressions
b) Navigating through environments
c) Classifying textures
d) Translating text
Answer: b) Navigating through environments
A kernel is typically used in which operation of image processing?
a) Image Segmentation
b) Feature Detection
c) Convolution
d) Denoising
Answer: c) Convolution
How does Google’s Search by Image feature operate?
a) Text comparison
b) Feature matching with a database
c) Audio analysis
d) Object segmentation
Answer: b) Feature matching with a database
What is the goal of instance segmentation in Computer Vision?
a) To count the number of objects
b) To assign each object a category and label its pixels
c) To increase image resolution
d) To detect edges in images
Answer: b) To assign each object a category and label its pixels
Assertion-Reasoning based:
Assertion: Computer Vision enables machines to process and analyze visual data
using algorithms.
Reason: Computer Vision allows machines to mimic human intelligence by
interpreting visual information in real time.
Options:
a) Both Assertion and Reason are true, and Reason is the correct explanation of
Assertion.
b) Both Assertion and Reason are true, but Reason is not the correct explanation of
Assertion.
c) Assertion is true, but Reason is false.
d) Assertion is false, but Reason is true.
Answer: a) Both Assertion and Reason are true, and Reason is the correct explanation
of Assertion.
Assertion: Grayscale images are composed of three channels: red, green, and blue.
Reason: Grayscale images use pixel values ranging from 0 (black) to 255 (white) to
represent different shades of gray.
Options:
a) Both Assertion and Reason are true, and Reason is the correct explanation of
Assertion.
b) Both Assertion and Reason are true, but Reason is not the correct explanation of
Assertion.
c) Assertion is true, but Reason is false.
d) Assertion is false, but Reason is true.
Answer: d) Assertion is false, but Reason is true.
Assertion: Object detection involves classifying objects into a fixed set of categories.
Reason: Object detection is focused on identifying and locating multiple objects in an
image.
Options:
a) Both Assertion and Reason are true, and Reason is the correct explanation of
Assertion.
b) Both Assertion and Reason are true, but Reason is not the correct explanation of
Assertion.
c) Assertion is true, but Reason is false.
d) Assertion is false, but Reason is true.
Answer: b) Both Assertion and Reason are true, but Reason is not the correct
explanation of Assertion.
Assertion: The convolution operation in image processing involves the element-wise
multiplication of image regions by a kernel, followed by summation.
Reason: Convolution reduces the spatial size of an image while retaining its important
features.
Options:
a) Both Assertion and Reason are true, and Reason is the correct explanation of
Assertion.
b) Both Assertion and Reason are true, but Reason is not the correct explanation of
Assertion.
c) Assertion is true, but Reason is false.
d) Assertion is false, but Reason is true.
Answer: c) Assertion is true, but Reason is false.
Assertion: Max Pooling in Convolutional Neural Networks (CNN) returns the
minimum value from the portion of the image covered by the kernel.
Reason: Max Pooling reduces the spatial size of the image while retaining the most
important features.
Options:
a) Both Assertion and Reason are true, and Reason is the correct explanation of
Assertion.
b) Both Assertion and Reason are true, but Reason is not the correct explanation of
Assertion.
c) Assertion is true, but Reason is false.
d) Assertion is false, but Reason is true.
Answer: d) Assertion is false, but Reason is true.
Question-Answer:
Question: What is the role of Computer Vision in self-driving cars?
Answer: Computer Vision plays a crucial role in self-driving cars by enabling the
vehicle to perceive its surroundings. It allows the car to detect and identify objects,
such as pedestrians, traffic signs, and other vehicles. It also assists in determining
navigational routes and environmental monitoring, which are essential for the safe
operation of autonomous vehicles.
Question: Explain the concept of object detection in Computer Vision.
Answer: Object detection in Computer Vision is the process of identifying instances of
real-world objects, such as faces, cars, and buildings, in images or videos. It involves
not only detecting the presence of an object but also identifying its location within the
image. This task is used in various applications, including facial recognition, video
surveillance, and automated vehicle systems.
Question: What is a grayscale image and how is it different from an RGB image?
Answer: A grayscale image consists of varying shades of gray, without any color.
Each pixel in a grayscale image has a value between 0 (black) and 255 (white). An
RGB image, on the other hand, consists of three color channels (Red, Green, and
Blue), where each pixel is a combination of values from these three channels,
providing full-color representation.
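The difference can be illustrated with a small Python sketch. The tiny 2×2 images and the helper name `to_grayscale` are made up for this example, and plain channel averaging is an assumption chosen for simplicity; other conversion formulas exist.

```python
# A 2x2 grayscale image: one brightness value (0-255) per pixel.
gray_image = [
    [0, 128],
    [200, 255],
]

# A 2x2 RGB image: three values (R, G, B) per pixel, one per channel.
rgb_image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]

def to_grayscale(image):
    """Convert an RGB image to grayscale by averaging the three channels.

    Averaging is an assumption made for this sketch, not the only method.
    """
    return [[sum(pixel) // 3 for pixel in row] for row in image]

converted = to_grayscale(rgb_image)
print(converted)  # [[85, 85], [85, 255]]
```

Pure red, green, and blue all collapse to the same gray value here, which shows the color information that a grayscale image no longer carries.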
Question: Define and explain the concept of image resolution.
Answer: Image resolution refers to the number of pixels in an image and is typically
expressed as width × height. For example, an image resolution of 1280×1024 means
the image is 1280 pixels wide and 1024 pixels high. Higher resolution means more
pixels, which usually results in more detail and better image quality.
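The arithmetic behind the 1280×1024 example can be checked with a short Python sketch. The function name `pixel_count` is illustrative only.

```python
def pixel_count(width, height):
    """Total number of pixels in an image of the given resolution."""
    return width * height

total = pixel_count(1280, 1024)
megapixels = total / 1_000_000

print(total)       # 1310720
print(megapixels)  # 1.31072
```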
Question: How does the ReLU layer function in a Convolutional Neural Network
(CNN)?
Answer: The ReLU (Rectified Linear Unit) layer introduces non-linearity to the CNN.
It replaces all negative pixel values in the feature map with zero, while positive values
remain unchanged. This helps the CNN model learn complex patterns in the data by
increasing the model’s ability to capture non-linear relationships.
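A minimal Python sketch of this rule, assuming the feature map is a list of rows of numbers (the sample values are made up):

```python
def relu(feature_map):
    """Replace every negative value with zero; keep positive values unchanged."""
    return [[max(0, value) for value in row] for row in feature_map]

feature_map = [
    [-3, 5],
    [2, -1],
]

activated = relu(feature_map)
print(activated)  # [[0, 5], [2, 0]]
```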
Question: What is the significance of the pooling layer in a CNN?
Answer: The pooling layer in a CNN is used to reduce the spatial size of the
convolved features, which helps in decreasing the computational power required for
processing the data. It also helps in retaining the important features of the image while
making the model more robust to distortions and shifts in the input image.
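The size reduction can be estimated with a short Python sketch. It assumes a square feature map, a square pooling window, and no padding; the function name `pooled_size` is illustrative.

```python
def pooled_size(input_size, window, stride):
    """Side length of the output after pooling (no padding assumed)."""
    return (input_size - window) // stride + 1

side = pooled_size(4, window=2, stride=2)
values_before = 4 * 4
values_after = side * side

print(side)                         # 2
print(values_before, values_after)  # 16 4
```

With a 2×2 window and a stride of 2, a 4×4 feature map shrinks to 2×2, so the next layer processes a quarter of the values.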
Question: Describe the application of facial recognition in smart homes and cities.
Answer: In smart homes and cities, facial recognition is used for security and access
control. It can recognize authorized individuals, maintain visitor logs, and trigger
actions such as opening doors or granting access to restricted areas. In schools, it can
be used for automated attendance systems, where the system identifies students using
facial recognition.
Question: Explain how the Google Translate App uses Computer Vision.
Answer: The Google Translate App uses Computer Vision through its Optical
Character Recognition (OCR) feature. The app identifies text in real time using the
phone’s camera. It analyzes the characters in the image, translates the text into the
user’s preferred language, and overlays the translation on the camera view using
augmented reality.
Question: What is the importance of pixel values in digital images?
Answer: Pixel values represent the color or brightness of each individual pixel in a
digital image. In an 8-bit grayscale image, for example, pixel values range from 0
(black) to 255 (white). These values are essential for determining the visual properties
of the image and are used by algorithms in tasks such as filtering, compression, and
feature extraction.
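The 0 to 255 range follows from the number of bits used per pixel, as this short Python sketch shows. The variable names are illustrative.

```python
bits_per_pixel = 8

levels = 2 ** bits_per_pixel   # number of distinct values a pixel can take
max_value = levels - 1         # largest value, since counting starts at 0

# The same pixel values written as 8-bit binary numbers.
black = format(0, "08b")
mid_gray = format(128, "08b")
white = format(255, "08b")

print(levels, max_value)       # 256 255
print(black, mid_gray, white)  # 00000000 10000000 11111111
```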
Question: What are the advantages of using computer vision in medical imaging?
Answer: Computer vision enhances medical imaging by enabling more accurate
analysis of medical scans, such as CT and MRI scans. It can convert 2D scan images
into interactive 3D models, providing medical professionals with detailed insights into
a patient’s condition. This helps improve diagnosis, treatment planning, and surgery
preparation.
Question: What is a kernel in the context of convolution in image processing?
Answer: A kernel, also called a filter, is a small matrix that is used in the convolution
process in image processing. It is slid over the image, where each pixel value is
multiplied by the corresponding value in the kernel, and the results are summed to
produce a new pixel value. This operation helps in extracting features such as edges
and textures from an image.
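The multiply-and-sum step can be sketched in plain Python. This sketch assumes a square kernel, no padding, and a step of one pixel, and like most CNN libraries it does not flip the kernel. The function name `convolve2d`, the sample image, and the vertical-edge kernel are chosen for illustration only.

```python
def convolve2d(image, kernel):
    """Slide a square kernel over the image; multiply and sum at each position."""
    k = len(kernel)
    out_height = len(image) - k + 1
    out_width = len(image[0]) - k + 1
    output = []
    for i in range(out_height):
        row = []
        for j in range(out_width):
            total = 0
            for a in range(k):
                for b in range(k):
                    total += image[i + a][j + b] * kernel[a][b]
            row.append(total)
        output.append(row)
    return output

# A 4x5 image that is dark on the left and bright on the right.
image = [[0, 0, 255, 255, 255] for _ in range(4)]

# A kernel that responds to dark-to-bright changes from left to right.
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

result = convolve2d(image, kernel)
print(result)  # [[765, 765, 0], [765, 765, 0]]
```

The output is large where the window covers the dark-to-bright boundary and zero where the window covers only the uniformly bright region.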
Question: How does the “Google’s Search by Image” feature use Computer Vision?
Answer: Google’s Search by Image feature allows users to upload an image, which is
then processed using Computer Vision techniques. The system analyzes the image and
compares it to a database of images to find similar content. It identifies key features of
the image, such as shapes, colors, and patterns, and returns matching search results
based on these features.
Question: What is the difference between classification and object detection in
Computer Vision?
Answer: Classification in Computer Vision assigns a single label to an entire image
from a set of predefined categories. Object detection, on the other hand, not only
classifies objects but also identifies their locations within the image. Object detection
is typically more complex as it requires detecting multiple objects and their positions.
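The difference shows up in the shape of the output. The Python sketch below uses hand-written results rather than a real model; the `Detection` type, the labels, and the box coordinates are all hypothetical.

```python
from typing import NamedTuple

class Detection(NamedTuple):
    """One detected object: what it is and where it is (hypothetical format)."""
    label: str
    box: tuple  # (x, y, width, height) in pixels

# Classification: a single label for the whole image.
classification_result = "street scene"

# Object detection: one label and one location per object found.
detection_result = [
    Detection("car", (40, 120, 200, 90)),
    Detection("pedestrian", (300, 100, 50, 130)),
]

for detection in detection_result:
    print(detection.label, detection.box)
```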
Question: Why are corners considered good features in images for computer vision
tasks?
Answer: Corners are considered good features because they are distinctive and easy to
locate in images. A small window placed over a corner changes noticeably when it is
shifted in any direction. A window over a flat surface looks the same after any shift,
and a window over an edge looks the same when shifted along the edge. This makes
corners reliable for tasks like object detection and feature extraction.
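This idea can be tested with a simplified Python sketch in the spirit of the Moravec corner detector. It shifts a 3×3 window one pixel up, down, left, and right, measures the squared change each time, and keeps the smallest change. The synthetic image, the window positions, and the function names are assumptions made for this example.

```python
SHIFTS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # right, left, down, up

def window_change(image, top, left, size, d_row, d_col):
    """Sum of squared differences between a window and its shifted copy."""
    total = 0
    for r in range(size):
        for c in range(size):
            original = image[top + r][left + c]
            shifted = image[top + d_row + r][left + d_col + c]
            total += (original - shifted) ** 2
    return total

def min_change(image, top, left, size=3):
    """Smallest change over all shifts; high only if every shift changes the window."""
    return min(window_change(image, top, left, size, dr, dc) for dr, dc in SHIFTS)

# A 10x10 image: black background with a white square in the lower-right part.
image = [[255 if r >= 4 and c >= 4 else 0 for c in range(10)] for r in range(10)]

flat_score = min_change(image, 1, 1)    # window entirely on the black background
edge_score = min_change(image, 6, 3)    # window on the left side of the square
corner_score = min_change(image, 3, 3)  # window on the square's top-left corner

print(flat_score, edge_score, corner_score)
```

The flat and edge windows score zero because at least one shift leaves them unchanged, while the corner window scores above zero because every shift changes it.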
Question: How does max pooling help in a Convolutional Neural Network?
Answer: Max pooling in a Convolutional Neural Network reduces the spatial size of
the convolved features by taking the maximum value from a set of pixels in a region
covered by the kernel. This helps reduce the dimensionality of the data, lowering the
computational cost, and making the model more resistant to variations like shifts and
distortions in the input image.
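A minimal Python sketch of max pooling, assuming a 2×2 window that moves two pixels at a time. The function name `max_pool` and the sample feature map are made up for illustration.

```python
def max_pool(feature_map, window=2, stride=2):
    """Keep only the largest value from each window of the feature map."""
    pooled = []
    for i in range(0, len(feature_map) - window + 1, stride):
        row = []
        for j in range(0, len(feature_map[0]) - window + 1, stride):
            region = [
                feature_map[i + a][j + b]
                for a in range(window)
                for b in range(window)
            ]
            row.append(max(region))
        pooled.append(row)
    return pooled

feature_map = [
    [1, 3, 2, 4],
    [5, 6, 1, 0],
    [7, 2, 9, 8],
    [0, 1, 3, 4],
]

pooled = max_pool(feature_map)
print(pooled)  # [[6, 4], [7, 9]]
```

The 4×4 feature map becomes 2×2, and the strongest value in each region is kept.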