1. Dimensionality Reduction (CONTINUOUS, UNSUPERVISED)
    1. Definition:
      1. Reducing the number of dimensions (variables/features) in a dataset while retaining as much important information as possible.
      2. Simplest method: Dropping dimensions
      3. More sophisticated methods often yield better results!
    2. Why it matters:
      1. High-dimensional data can be challenging to analyze and visualize
      2. Helps in simplifying models, improving computational efficiency, and often boosting generalization.
    3. Common Techniques:
      1. PCA (Principal Component Analysis):
        1. Finds the directions of maximal variance in the data (the principal components) and projects onto them, giving the best linear projection.
        2. Among all linear projections of a given dimensionality, PCA minimizes the reconstruction error.
        3. Example pipeline: training face images ⇒ eigenfaces ⇒ reconstruction of in-domain and out-of-domain images (see the sketch at the end of this section)
      2. ICA (Independent Component Analysis):
        1. Finds components that are statistically independent, not merely uncorrelated.
      3. LLE (Locally Linear Embedding) and Isomap:
        1. Non-linear dimensionality reduction techniques.
      4. Autoencoder:
        1. A neural network that learns efficient codings of input data by compressing it through a low-dimensional bottleneck and reconstructing the input.
    4. Cautions:
      1. PCA outputs "bases" (new axes for data). While useful, assigning physical meaning to these is often misleading.
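A minimal PCA sketch (not part of the original notes), assuming plain NumPy: fit the principal directions via SVD of the centered data, project onto them, and reconstruct. If the rows were flattened face images, the fitted components would play the role of eigenfaces; the random data below is purely illustrative.

```python
import numpy as np

def pca_fit(X, k):
    """Fit PCA: return the data mean and the top-k principal directions."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # Principal directions = right singular vectors of the centered data
    # (equivalently, eigenvectors of the covariance matrix).
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return mean, Vt[:k]

def pca_project(X, mean, components):
    """Project data onto the principal components (dimensionality reduction)."""
    return (X - mean) @ components.T

def pca_reconstruct(codes, mean, components):
    """Map low-dimensional codes back to the original space."""
    return codes @ components + mean

# Toy data: 200 samples in 50 dimensions (stand-in for flattened image patches).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))

mean, comps = pca_fit(X, k=10)
codes = pca_project(X, mean, comps)
X_hat = pca_reconstruct(codes, mean, comps)
print("mean squared reconstruction error:", np.mean((X - X_hat) ** 2))
```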
  2. Clustering: group together similar points and represent them with a single token
    1. (DISCRETE, UNSUPERVISED)
    2. Purpose:
      • Break up the image into meaningful or perceptually similar regions
    3. Applications in Vision:
      • Image segmentation for feature support/efficiency: Divide an image into regions based on similarities (e.g., texture, color).
        • Types of Segmentation: Oversegmentation, Undersegmentation, Multiple Segmentations
      • Feature support: Reducing complexity for algorithms by grouping features.
    4. Key Challenges:
        1. What makes two points/images/patches similar?
        2. How do we compute an overall grouping from pairwise similarities?
    5. Key Techniques:
      • K-Means:
        • Iteratively assigns points to the nearest center and recalculates cluster centers.
      • Agglomerative Clustering:
        • Starts with each point as its own cluster and merges clusters step-by-step.
      • Mean-shift Clustering:
        • Finds modes (peaks) in the probability density function.
      • Spectral Clustering:
        • Partitions the nodes of a graph whose edges carry pairwise similarity weights (e.g., by cutting the graph using eigenvectors of its Laplacian).
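As an illustration (not part of the original notes), here is a hedged sketch applying the four techniques above to the same toy point set, assuming scikit-learn's clustering implementations are available:

```python
import numpy as np
from sklearn.cluster import KMeans, AgglomerativeClustering, MeanShift, SpectralClustering
from sklearn.datasets import make_blobs

# Toy 2-D points standing in for per-pixel features (e.g., color + position).
X, _ = make_blobs(n_samples=300, centers=3, random_state=0)

methods = {
    "k-means":       KMeans(n_clusters=3, n_init=10, random_state=0),
    "agglomerative": AgglomerativeClustering(n_clusters=3),
    "mean-shift":    MeanShift(),  # number of clusters comes from the density modes
    "spectral":      SpectralClustering(n_clusters=3, affinity="nearest_neighbors", random_state=0),
}

for name, model in methods.items():
    labels = model.fit_predict(X)
    print(f"{name}: {len(set(labels))} clusters found")
```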
  3. Clustering for Summarization: choose cluster centers and assignments that minimize the variance of the data given the clusters, i.e., the sum of squared distances between each point and its assigned center, so that the clusters preserve as much information as possible

  1. Evaluation:
  2. Challenges:

3. K-Means Algorithm → converges to a local minimum of the within-cluster sum of squared distances

  1. Steps: initialize K cluster centers (e.g., at random); assign each point to its nearest center; recompute each center as the mean of its assigned points; repeat until the assignments stop changing.
  2. Pros: simple, fast, and guaranteed to converge (though only to a local minimum).
  3. Cons: K must be chosen in advance; results depend on initialization; sensitive to outliers; assumes roughly spherical clusters of similar size.
  4. Design Choices: the number of clusters K, the initialization scheme (e.g., random restarts or k-means++), and the distance metric.
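A minimal from-scratch sketch of the two alternating steps, assuming plain NumPy; the reported objective (within-cluster sum of squares) never increases across iterations, which is why the algorithm converges, though only to a local minimum:

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Plain k-means: alternate assignment and center-update steps."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]  # random initialization
    for _ in range(n_iters):
        # Step 1: assign each point to its nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 2: move each center to the mean of its assigned points.
        new_centers = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                                else centers[j] for j in range(k)])
        if np.allclose(new_centers, centers):  # centers stopped changing: converged
            break
        centers = new_centers
    objective = ((X - centers[labels]) ** 2).sum()  # within-cluster sum of squares
    return centers, labels, objective

# Toy data: three 2-D blobs.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(loc=m, scale=0.5, size=(100, 2)) for m in ([0, 0], [5, 5], [0, 5])])
centers, labels, obj = kmeans(X, k=3)
print("objective at the local minimum reached:", obj)
```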


4. Building Visual Dictionaries:

  1. Sample patches from a database
  2. Cluster the patches
  3. Assign a codeword to each new patch according to its nearest cluster center
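A hedged sketch of the three steps above, assuming NumPy plus scikit-learn's KMeans as the clustering step; the random images, patch size, and codebook size are illustrative placeholders:

```python
import numpy as np
from sklearn.cluster import KMeans

def sample_patches(images, patch_size=8, patches_per_image=50, seed=0):
    """Step 1: randomly sample square patches from a set of grayscale images."""
    rng = np.random.default_rng(seed)
    patches = []
    for img in images:
        h, w = img.shape
        for _ in range(patches_per_image):
            y = rng.integers(0, h - patch_size)
            x = rng.integers(0, w - patch_size)
            patches.append(img[y:y + patch_size, x:x + patch_size].ravel())
    return np.array(patches)

# Toy "database" of random grayscale images standing in for real training images.
rng = np.random.default_rng(0)
images = [rng.random((64, 64)) for _ in range(20)]

patches = sample_patches(images)                                           # Step 1
codebook = KMeans(n_clusters=32, n_init=10, random_state=0).fit(patches)  # Step 2

# Step 3: each new patch gets the codeword of its nearest cluster center.
new_patches = sample_patches([rng.random((64, 64))], patches_per_image=10, seed=1)
codewords = codebook.predict(new_patches)
histogram = np.bincount(codewords, minlength=32)  # bag-of-words description of the new image
print(histogram)
```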

5. Classification

  1. Definition: assign each input (e.g., an image or feature vector) a label from a fixed, discrete set of categories; a supervised learning task.
  2. Decision Boundaries: the surfaces in feature space that separate the regions assigned to different classes.
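A minimal illustration of these two ideas, assuming a simple nearest-class-mean classifier (a stand-in, not a method named in these notes); its decision boundary between two classes is the set of points equidistant from the two class means:

```python
import numpy as np

class NearestMeanClassifier:
    """Assigns each input the label of the closest class mean; the decision
    boundary between any two classes is the hyperplane equidistant from their means."""
    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.means_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        return self
    def predict(self, X):
        dists = np.linalg.norm(X[:, None, :] - self.means_[None, :, :], axis=2)
        return self.classes_[dists.argmin(axis=1)]

# Toy 2-D data: two labeled Gaussian blobs (supervised, discrete labels).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([0, 0], 0.5, (50, 2)), rng.normal([3, 3], 0.5, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

clf = NearestMeanClassifier().fit(X, y)
print("training accuracy:", (clf.predict(X) == y).mean())
```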