1. Overview of 3D Point Processing
What Is 3D Point Processing?
3D point processing involves analyzing and making decisions on point clouds, which are sets of 3D points (x, y, z) typically generated by sensors like LiDAR or depth cameras.
Applications:
- Autonomous Vehicles: Detecting and classifying objects like cars, pedestrians, and traffic signs.
- Robotics: Navigation and manipulation in 3D environments.
- Augmented Reality (AR): Scene understanding for overlaying virtual elements.
- 3D Design: Shape recognition and editing.
2. Representations of 3D Data
Point Clouds: usually converted to another representation before being fed to a deep neural network
- Direct representation of raw 3D data.
- Unordered sets of points, each with 3D coordinates and sometimes additional attributes such as intensity or color.
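As a minimal sketch of this representation, a point cloud can be stored as an N×3 (or N×4 with an attribute) array; the array names and sizes below are illustrative, not from any particular dataset:

```python
import numpy as np

# Hypothetical point cloud: N unordered points with (x, y, z)
# coordinates plus one extra attribute (intensity) per point.
rng = np.random.default_rng(0)
n_points = 1000
xyz = rng.uniform(-10.0, 10.0, size=(n_points, 3))     # 3D coordinates
intensity = rng.uniform(0.0, 1.0, size=(n_points, 1))  # per-point attribute
cloud = np.hstack([xyz, intensity])                     # shape (1000, 4)

# "Unordered" means shuffling the rows changes nothing geometrically:
# the same set of points describes the same scene.
shuffled = rng.permutation(cloud)
print(cloud.shape)  # (1000, 4)
```

This row-permutation invariance is exactly why point clouds are hard for standard network layers, which assume a fixed input ordering.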
Other 3D Representations:
- Voxel Grids:
  - Space is divided into a 3D grid of voxels (small cubes).
  - Each voxel contains information about whether it is occupied.
- Projected Views:
  - Render the 3D data into 2D images from multiple viewpoints (e.g., front, top, side views).
- Range Images:
  - Cylindrical 2D images encoding the distance from the sensor in each direction.
- Meshes:
  - Represent surfaces as a collection of connected triangles.
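To make the voxel-grid idea above concrete, here is a minimal occupancy-grid sketch (the function name and parameters are illustrative assumptions; real pipelines typically also aggregate per-voxel features rather than a single occupancy bit):

```python
import numpy as np

def voxelize(points, voxel_size=1.0):
    """Map an (N, 3) point array to a boolean occupancy grid.

    A voxel is marked occupied if at least one point falls inside it.
    This is a simplified sketch of the voxel-grid representation.
    """
    mins = points.min(axis=0)
    # Integer voxel index of each point along each axis.
    idx = np.floor((points - mins) / voxel_size).astype(int)
    dims = idx.max(axis=0) + 1
    grid = np.zeros(dims, dtype=bool)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = True
    return grid

rng = np.random.default_rng(0)
pts = rng.uniform(0.0, 4.0, size=(500, 3))   # synthetic point cloud
occupancy = voxelize(pts, voxel_size=1.0)
print(occupancy.shape, int(occupancy.sum()))
```

Note the trade-off this exposes: the grid has a fixed, regular structure that convolutional networks can consume, but memory grows cubically with resolution and most voxels are typically empty.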
3. Deep Learning on Point Clouds