Computer Vision Workshop
Presented by Rania SAOUD
Fundamentals of Computer Vision & Image Processing
● Computer Vision:
○1. Gives computers the ability to see and understand images and videos like humans.
○2. Involves recognizing and comprehending the content of images or videos.
○3. Provides computers with "eyes" to perceive visual data.
● Image Processing:
○1. Advanced editing of images beyond basic adjustments.
○2. Enhances or alters images for clarity, size, or color.
○3. Similar to editing photos on a phone but with more sophisticated tools.
● Working Together:
○1. Image processing prepares images for analysis in computer vision.
○2. Both are essential components for tasks like object recognition and scene understanding.
Tools & Libraries
1. OpenCV:
- Handles tasks from basic operations like resizing to complex tasks like face recognition.
2. PIL/Pillow:
- Go-to library for opening, manipulating, and saving various image file formats in Python.
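As a minimal sketch of the open, manipulate, and save workflow described for Pillow, the example below builds a small image in memory instead of loading a file, so it runs without any input data. It assumes Pillow is installed; the image size and color are arbitrary choices made for illustration.

```python
from io import BytesIO

from PIL import Image

# Create a small solid-red RGB image in memory (width=64, height=48).
original = Image.new("RGB", (64, 48), color=(255, 0, 0))

# Manipulate: halve the size and convert to 8-bit grayscale ("L" mode).
small_gray = original.resize((32, 24)).convert("L")

# Save as PNG into an in-memory buffer, then open it again.
buffer = BytesIO()
small_gray.save(buffer, format="PNG")
buffer.seek(0)
reloaded = Image.open(buffer)

print(reloaded.format, reloaded.size, reloaded.mode)  # PNG (32, 24) L
```

With a real file, `Image.open("photo.jpg")` and `image.save("out.png")` take the place of the in-memory buffer (both file names are hypothetical).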
Pixels, Resolution & Color Models
● Pixels are the smallest units of a digital image; each one stores a single color value.
● Resolution is the number of pixels in an image, usually given as width × height. More pixels mean more detail.
● Color Models:
○ RGB (Red, Green, Blue): Combining different amounts of red, green, and blue produces the colors
we see in digital images. It is the model used by screens.
○ HSV (Hue, Saturation, Value): This model separates the color itself from its intensity and brightness.
Hue is the color itself, saturation is how vivid the color is, and value is how light or dark it is.
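To make these ideas concrete, here is a small sketch that treats an image as a grid of pixels stored in a NumPy array and converts one RGB pixel to HSV with Python's standard `colorsys` module. The tiny 6 × 4 image is an assumption chosen for illustration; real images are simply larger arrays of the same kind.

```python
import colorsys

import numpy as np

# An image is a grid of pixels: here 4 rows (height) x 6 columns (width),
# with 3 channels per pixel (R, G, B), each an 8-bit value in 0..255.
image = np.zeros((4, 6, 3), dtype=np.uint8)
image[:, :, 0] = 255  # full red, no green, no blue

height, width, channels = image.shape
print("resolution:", width, "x", height, "=", width * height, "pixels")

# Read one pixel and express it in HSV. colorsys works on floats in 0..1.
r, g, b = (int(c) / 255.0 for c in image[0, 0])
h, s, v = colorsys.rgb_to_hsv(r, g, b)
print("HSV of pure red:", (h, s, v))  # (0.0, 1.0, 1.0)
```

Pure red has hue 0, full saturation, and full value, which matches the description of hue as the color itself and of saturation and value as vividness and brightness.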
Reading, Displaying & Writing
- Reading an Image:
- Displaying an Image:
- Resizing :
- Changes the size of a picture to fit a specific location or purpose.
- It is akin to zooming in or out, except that it permanently changes the image's dimensions.
- Cropping :
- Removes unwanted parts of an image, such as excess sky or photobombers.
- Focuses on the essential elements by eliminating distractions or irrelevant details.
- Rotation :
- Involves adjusting the orientation of an image, correcting sideways or upside-down views.
- Ensures the image is correctly oriented for optimal viewing, akin to adjusting a picture frame.
Resizing
Importance :
- Resizing standardizes the size of images within a dataset, which is crucial for consistency in
model training.
- It saves computational resources by reducing the image resolution while preserving essential
features, which speeds up processing.
Cropping
Importance :
- Cropping focuses on specific regions of interest within an image, eliminating irrelevant
background or noise.
- It can improve the performance of object detection, recognition, and segmentation algorithms by removing
clutter and emphasizing relevant features.
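Because images are arrays, cropping a region of interest is just array slicing. The sketch below assumes NumPy only; the image contents and the box coordinates are made up for illustration.

```python
import numpy as np

# A black 100x200 image (height x width) with a white "object" inside it.
image = np.zeros((100, 200, 3), dtype=np.uint8)
image[30:70, 50:120] = 255

# Region of interest as (x, y, width, height), an illustrative convention.
x, y, w, h = 50, 30, 70, 40

# Rows are indexed first (y), then columns (x). copy() detaches the crop
# from the original array so later edits do not affect the source image.
crop = image[y:y + h, x:x + w].copy()

print(crop.shape)  # (40, 70, 3)
```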
Rotation
Importance :
- Rotation corrects images that were captured in different orientations, ensuring
uniformity in the dataset.
- It can improve the robustness and accuracy of object detection, classification, and localization models by
presenting images consistently, irrespective of their original orientation, which helps the models
generalize.
➔ RGB to grayscale
➔ RGB to HSV
Image Filtering: Smoothing, Sharpening Filters
➔ Smoothing :
◆ Noise Reduction: Reduces noise, i.e. unwanted variations
in pixel values.
➔ Sharpening :
◆ Detail Enhancement: Enhances the clarity and definition of edges
and fine details in an image.
- Feature Detection:
- Matching:
❏ Examines unique facial features (e.g., eye distance, nose shape) for
differentiation between individuals.
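The matching step can be sketched as a nearest-neighbor search over feature vectors. Everything below is hypothetical and for illustration only: the names, the three-number feature vectors (standing in for measurements such as eye distance or nose shape), and the distance threshold are assumptions, and real systems use learned features with many more dimensions.

```python
import math

# Hypothetical known identities, each described by a small feature vector.
known_faces = {
    "alice": [0.42, 0.31, 0.77],
    "bob": [0.55, 0.28, 0.60],
}


def identify(features, known, max_distance=0.1):
    """Return the closest known identity, or None if nothing is close enough."""
    best_name, best_distance = None, math.inf
    for name, reference in known.items():
        distance = math.dist(features, reference)  # Euclidean distance
        if distance < best_distance:
            best_name, best_distance = name, distance
    return best_name if best_distance <= max_distance else None


print(identify([0.43, 0.30, 0.76], known_faces))  # alice
print(identify([0.90, 0.90, 0.10], known_faces))  # None
```

The threshold matters: without it, every face would be matched to some known person, including strangers.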
How it works?
The model employs a multi-step process for object detection and identification. Initially, it
extracts relevant features from the input image to discern objects from the background.
Subsequently, it localizes these objects by predicting bounding boxes that outline their spatial
extent within the image. Through classification, the model assigns labels to the detected
regions, indicating the type of object present, such as "car" or "person." Post-processing
techniques are then applied to refine predictions and eliminate false positives, ensuring
accurate localization and classification. In certain applications like facial recognition, the
model may further identify specific attributes or instances within detected objects, such as
recognizing individuals by matching detected faces with known identities. Through this
sequence of feature extraction, localization, classification, and post-processing, the
model detects objects and identifies them within images or videos.
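The post-processing stage can be sketched in a few lines. The text does not name a specific technique, so this example assumes two common ones: a confidence threshold to drop weak predictions, and non-maximum suppression to remove overlapping duplicates of the same object. The detections, field names, and thresholds are made up for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    intersection = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def postprocess(detections, score_threshold=0.5, iou_threshold=0.5):
    """Drop low-confidence detections, then suppress overlapping duplicates."""
    confident = [d for d in detections if d["score"] >= score_threshold]
    confident.sort(key=lambda d: d["score"], reverse=True)
    kept = []
    for det in confident:
        # A detection is a duplicate if a higher-scoring box with the same
        # label already covers mostly the same area.
        duplicate = any(
            det["label"] == k["label"] and iou(det["box"], k["box"]) > iou_threshold
            for k in kept
        )
        if not duplicate:
            kept.append(det)
    return kept


# Hypothetical raw model output: box, class label, confidence score.
detections = [
    {"box": (10, 10, 110, 110), "label": "car", "score": 0.92},
    {"box": (12, 14, 112, 108), "label": "car", "score": 0.80},       # duplicate
    {"box": (200, 50, 260, 180), "label": "person", "score": 0.75},
    {"box": (300, 300, 320, 320), "label": "person", "score": 0.20},  # low score
]

for det in postprocess(detections):
    print(det["label"], det["score"], det["box"])
# car 0.92 (10, 10, 110, 110)
# person 0.75 (200, 50, 260, 180)
```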
Data Augmentation & Data Cleaning in Images
➔ Data Augmentation in Computer Vision:
- Increases the diversity of the training data so that models learn better and generalize to
new images.
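A minimal sketch of augmentation with NumPy alone: each input image yields several altered copies that still show the same content. The specific transformations (horizontal flip, 90 degree rotation, random brightness) and the brightness range are assumptions chosen for illustration; the function name `augment` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)

# Stand-in for a training image: height 32, width 48, 3 channels.
image = rng.integers(0, 256, size=(32, 48, 3), dtype=np.uint8)


def augment(img, rng):
    """Return a dict of simple augmented variants of one image."""
    variants = {
        "flip_horizontal": np.fliplr(img),  # mirror left and right
        "rotate_90": np.rot90(img),         # swaps height and width
    }
    # Random brightness: scale pixel values, then clip back to 0..255.
    factor = rng.uniform(0.7, 1.3)
    scaled = np.clip(img.astype(np.float32) * factor, 0, 255)
    variants["brightness"] = scaled.astype(np.uint8)
    return variants


for name, variant in augment(image, rng).items():
    print(name, variant.shape)
```

Augmentations should only be used when they keep the label valid; a horizontal flip is fine for cats and dogs, for example, but not for reading text or digits.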