3D Reconstruction from Video Sequence
Overview
An incremental Structure from Motion (SfM) pipeline that reconstructs a dense 3D scene from a video sequence. The pipeline was implemented from scratch for the Computer Vision course at ENSEIRB-MATMECA, using Bundle Adjustment with the Levenberg-Marquardt algorithm to refine the reconstruction at each step. The result is a dense point cloud with a mean reprojection error typically below 1 pixel across 100+ frames.
Pipeline
The reconstruction follows an incremental SfM approach across five stages:
- Initialization: Essential matrix estimation using RANSAC + five-point algorithm, relative pose recovery with chirality validation, initial 3D point triangulation between the first two views, and track structure creation for multi-view correspondences.
- Incremental camera localization: Robust pose estimation for each new frame from existing 3D–2D correspondences, with pose initialization from the previous camera position and positive depth validation before optimization.
- Intelligent triangulation: Creation of new 3D points from orphan features (matched features not yet attached to any 3D point), parallax angle computation between camera pairs, and a minimum parallax threshold of 5° to ensure geometric stability.
- Track management: Dynamic track structure updates as new points are triangulated, with a global mapping between 3D point keys and indices to maintain multi-view consistency.
- Bundle Adjustment: Global optimization run after each new view, using the Levenberg-Marquardt algorithm with the Schur complement to jointly and efficiently refine all camera poses and 3D points. Scale normalization is applied after each step to keep the reconstruction consistent.
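The parallax test from the triangulation stage can be illustrated with a minimal NumPy sketch. It is not taken from main_reconstruction.py: the function names are hypothetical, and it assumes a world-to-camera convention x_cam = R X + t, so that the camera center is C = -Rᵀ t.

```python
import numpy as np

MIN_PARALLAX_DEG = 5.0  # threshold quoted in the pipeline description


def camera_center(R, t):
    """Camera center in world coordinates, assuming x_cam = R @ X + t."""
    return -np.asarray(R, dtype=float).T @ np.asarray(t, dtype=float)


def parallax_angle_deg(point_3d, center_a, center_b):
    """Angle in degrees between the rays joining two camera centers to a 3D point."""
    point_3d = np.asarray(point_3d, dtype=float)
    ray_a = point_3d - np.asarray(center_a, dtype=float)
    ray_b = point_3d - np.asarray(center_b, dtype=float)
    cos_angle = ray_a @ ray_b / (np.linalg.norm(ray_a) * np.linalg.norm(ray_b))
    # Clipping guards against values slightly outside [-1, 1] due to rounding.
    return float(np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0))))


def has_enough_parallax(point_3d, center_a, center_b, min_deg=MIN_PARALLAX_DEG):
    """True when the candidate point is seen under at least min_deg of parallax."""
    return parallax_angle_deg(point_3d, center_a, center_b) >= min_deg


# Two cameras one unit apart: a nearby point passes, a distant one is rejected.
c0 = camera_center(np.eye(3), np.zeros(3))
c1 = camera_center(np.eye(3), np.array([-1.0, 0.0, 0.0]))  # center at (1, 0, 0)
near_ok = has_enough_parallax([0.5, 0.0, 5.0], c0, c1)
far_ok = has_enough_parallax([0.5, 0.0, 50.0], c0, c1)
```

With a fixed baseline, the parallax angle shrinks as the point moves away from the cameras, which is why distant candidates are the ones filtered out by the threshold.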
Camera Localization Module
A specialized Bundle Adjustment variant (BA_LM_localization.py) was derived from the provided full-BA implementation. It optimizes only the camera pose (rotation and translation) while keeping the 3D points fixed. This gives fast and stable PnP-based localization at each new frame without re-optimizing the entire point cloud.
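The idea of a pose-only refinement can be sketched as follows. This is an illustrative sketch under stated assumptions, not the contents of BA_LM_localization.py: the pose is parameterized as an axis-angle vector plus a translation, the Jacobian is computed by finite differences for brevity, and all function names are hypothetical.

```python
import numpy as np


def rodrigues(rvec):
    """Rotation matrix from an axis-angle vector (Rodrigues formula)."""
    theta = np.linalg.norm(rvec)
    if theta < 1e-12:
        return np.eye(3)
    k = rvec / theta
    Kx = np.array([[0.0, -k[2], k[1]], [k[2], 0.0, -k[0]], [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * Kx + (1.0 - np.cos(theta)) * (Kx @ Kx)


def project(pose, points_3d, K):
    """Project fixed 3D points with pose = [rvec (3), t (3)] and intrinsics K."""
    cam = points_3d @ rodrigues(pose[:3]).T + pose[3:]
    proj = cam @ K.T
    return proj[:, :2] / proj[:, 2:3]


def residuals(pose, points_3d, points_2d, K):
    """Stacked reprojection residuals in pixels."""
    return (project(pose, points_3d, K) - points_2d).ravel()


def numeric_jacobian(pose, points_3d, points_2d, K, eps=1e-6):
    """Forward-difference Jacobian of the residuals w.r.t. the 6 pose parameters."""
    r0 = residuals(pose, points_3d, points_2d, K)
    J = np.empty((r0.size, 6))
    for j in range(6):
        step = np.zeros(6)
        step[j] = eps
        J[:, j] = (residuals(pose + step, points_3d, points_2d, K) - r0) / eps
    return J


def refine_pose_lm(pose0, points_3d, points_2d, K, n_iters=50, lam=1e-3):
    """Levenberg-Marquardt on the camera pose only; the 3D points never change."""
    pose = np.asarray(pose0, dtype=float).copy()
    cost = np.sum(residuals(pose, points_3d, points_2d, K) ** 2)
    for _ in range(n_iters):
        r = residuals(pose, points_3d, points_2d, K)
        J = numeric_jacobian(pose, points_3d, points_2d, K)
        H = J.T @ J
        # Damped normal equations with Marquardt's diagonal scaling.
        delta = np.linalg.solve(H + lam * np.diag(np.diag(H)), -J.T @ r)
        new_cost = np.sum(residuals(pose + delta, points_3d, points_2d, K) ** 2)
        if new_cost < cost:  # accept the step and trust the model more
            pose, cost = pose + delta, new_cost
            lam = max(lam / 10.0, 1e-12)
            if np.linalg.norm(delta) < 1e-10:
                break
        else:  # reject the step and increase the damping
            lam *= 10.0
    return pose, cost
```

Because only six parameters are optimized, the normal equations are a 6×6 system regardless of how many 3D points are observed, which is what makes this step cheap compared with a full Bundle Adjustment. Starting from the previous camera's pose, as the pipeline does, keeps the initial guess close to the solution.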
Results
- Mean reprojection error typically < 1 pixel across all frames.
- Robust camera trajectory estimation across 100+ frames.
- Dense point cloud with multi-view consistency.
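For reference, a mean reprojection error over all frames can be computed along the following lines. This is a sketch under assumptions: the flat observation layout (one camera index, one point index and one pixel measurement per observation) and the function name are hypothetical and may differ from the track structure used in the project.

```python
import numpy as np


def mean_reprojection_error(Rs, ts, K, points_3d, cam_idx, pt_idx, uv_obs):
    """Mean pixel distance between observed and reprojected features.

    Rs: (N, 3, 3) rotations, ts: (N, 3) translations (world-to-camera),
    points_3d: (P, 3), cam_idx / pt_idx: (M,) integer indices, uv_obs: (M, 2).
    """
    X = points_3d[pt_idx]  # 3D point behind each observation
    cam = np.einsum("mij,mj->mi", Rs[cam_idx], X) + ts[cam_idx]
    proj = cam @ K.T
    uv = proj[:, :2] / proj[:, 2:3]
    return float(np.mean(np.linalg.norm(uv - uv_obs, axis=1)))


# Toy example: one point seen by two cameras, with a (3, 4) pixel offset in the first.
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
Rs = np.stack([np.eye(3), np.eye(3)])
ts = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
points_3d = np.array([[0.0, 0.0, 5.0]])
cam_idx = np.array([0, 1])
pt_idx = np.array([0, 0])
uv_obs = np.array([[323.0, 244.0], [420.0, 240.0]])
error = mean_reprojection_error(Rs, ts, K, points_3d, cam_idx, pt_idx, uv_obs)
```

In the toy example the per-observation errors are 5 px and 0 px, so the mean is 2.5 px.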
Project Structure
├── main_reconstruction.py # [OUR CODE] Main SfM pipeline
├── BA_LM_localization.py # [OUR CODE] Camera-only Bundle Adjustment
├── BA_LM_schur.py # [PROVIDED] Full Bundle Adjustment
├── BA_LM_two_views_schur.py # [PROVIDED] Two-view Bundle Adjustment
├── utils.py # [PROVIDED] Utility functions
├── viewer.py # [PROVIDED] 3D visualization (Open3D)
└── main_show_final_reconstruction.py # [PROVIDED] Result visualization
Tech Stack
- Language: Python 3.11+
- Core libraries: NumPy, SciPy, OpenCV
- Visualization: Open3D, Matplotlib
- Environment: Conda