1. Semantic Segmentation
Definition
Semantic segmentation is the task of labeling each pixel in an image with a category, such as "sky," "tree," or "car." Unlike object detection, semantic segmentation captures spatial extent, focusing on regions rather than bounding boxes.
Key Tasks in Segmentation
- Classification: Assigns labels to entire images (e.g., "cat").
- Semantic Segmentation: Assigns labels to pixels.
- Instance Segmentation: Labels individual object instances (e.g., different dogs in an image).
Performance Metric: Intersection over Union (IoU)
- Definition: IoU measures overlap between the predicted segmentation and the ground truth: IoU=Area of Overlap/Area of Union
- High IoU scores indicate better segmentation accuracy.
2. Fully Convolutional Networks (FCNs)
Overview
Introduced in 2015, FCNs are deep learning models designed for end-to-end pixel-wise prediction.
- Fully Convolutional Design: Replaces fully connected layers with convolutional ones to preserve spatial dimensions.
Challenges
- Effective Receptive Field:
- The receptive field grows linearly with the number of convolutional layers, limiting context captured by the network.
- Computational Cost:
- Convolutions on high-resolution images are computationally expensive.
Solutions