Advanced Vision Concepts and Applications

1. Semantic Segmentation

Definition

Semantic segmentation is the task of labeling each pixel in an image with a category, such as "sky," "tree," or "car." Unlike object detection, semantic segmentation captures spatial extent, focusing on regions rather than bounding boxes.

Key Tasks in Segmentation

Classification: Assigns labels to entire images (e.g., "cat").
Semantic Segmentation: Assigns labels to pixels.
Instance Segmentation: Labels individual object instances (e.g., different dogs in an image).

Performance Metric: Intersection over Union (IoU)

Definition: IoU measures overlap between the predicted segmentation and the ground truth: IoU=Area of Overlap/Area of Union
High IoU scores indicate better segmentation accuracy.

2. Fully Convolutional Networks (FCNs)

Overview

Introduced in 2015, FCNs are deep learning models designed for end-to-end pixel-wise prediction.

Fully Convolutional Design: Replaces fully connected layers with convolutional ones to preserve spatial dimensions.

Challenges

Effective Receptive Field:
- The receptive field grows linearly with the number of convolutional layers, limiting context captured by the network.
Computational Cost:
- Convolutions on high-resolution images are computationally expensive.