3D Reconstruction from Video Sequence

2024 – 2025 — ENSEIRB-MATMECA, Computer Vision (S9) (with Maxime Hurtubise)


Tags: Python, Computer Vision, SfM, NumPy, OpenCV, Open3D

Overview

An incremental Structure from Motion (SfM) pipeline that reconstructs a dense 3D point cloud from a video sequence. We wrote the main pipeline and the camera-only localization module for the Computer Vision course at ENSEIRB-MATMECA, building on the course-provided Bundle Adjustment code. Bundle Adjustment with the Levenberg-Marquardt algorithm refines the reconstruction after each new view. The result is a dense point cloud with a mean reprojection error typically below 1 pixel across 100+ frames.

Pipeline

The reconstruction follows an incremental SfM approach across five stages:

Initialization: Essential matrix estimation using RANSAC + five-point algorithm, relative pose recovery with chirality validation, initial 3D point triangulation between the first two views, and track structure creation for multi-view correspondences.
Incremental camera localization: Robust pose estimation for each new frame from existing 3D–2D correspondences, with pose initialization from the previous camera position and positive depth validation before optimization.
Intelligent triangulation: Triangulation of new 3D points from orphan features (features not yet attached to a 3D point), parallax angle computation between camera pairs, and rejection of candidates below a minimum parallax of 5° to ensure geometric stability.
Track management: Dynamic track structure updates as new points are triangulated, with a global mapping between 3D point keys and indices to maintain multi-view consistency.
Bundle Adjustment: Global optimization run after each new view, using the Levenberg-Marquardt algorithm with the Schur complement to jointly refine all camera poses and 3D points. The scale is renormalized after each step to keep the reconstruction consistent.
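The parallax test in the triangulation stage can be sketched as follows. This is a minimal illustration, not the code in main_reconstruction.py: the function names are hypothetical, and it assumes the pose convention x_cam = R X + t, so that the camera center is C = -Rᵀ t. Only the 5° threshold comes from the description above.

```python
import numpy as np

MIN_PARALLAX_DEG = 5.0  # threshold stated in the pipeline description


def camera_center(R, t):
    """Camera center in world coordinates, assuming x_cam = R @ X + t."""
    return -np.asarray(R, dtype=float).T @ np.asarray(t, dtype=float)


def parallax_angle_deg(point_3d, center_a, center_b):
    """Angle in degrees between the rays joining two camera centers to a 3D point."""
    point_3d = np.asarray(point_3d, dtype=float)
    ray_a = point_3d - np.asarray(center_a, dtype=float)
    ray_b = point_3d - np.asarray(center_b, dtype=float)
    cos_angle = ray_a @ ray_b / (np.linalg.norm(ray_a) * np.linalg.norm(ray_b))
    # Clip to guard against values slightly outside [-1, 1] due to rounding.
    return float(np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0))))


def has_enough_parallax(point_3d, center_a, center_b, min_deg=MIN_PARALLAX_DEG):
    """True if the candidate point is seen under at least min_deg of parallax."""
    return parallax_angle_deg(point_3d, center_a, center_b) >= min_deg
```

A point 10 units away seen from two cameras 1 unit apart subtends about 5.7° and would be kept. With a 0.5 unit baseline the angle drops to about 2.9° and the candidate would be rejected, since a small angle makes the triangulated depth very sensitive to pixel noise.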

Camera Localization Module

A specialized Bundle Adjustment variant (BA_LM_localization.py) was derived from the provided full-BA implementation. It optimizes only the camera pose (rotation and translation) while keeping the 3D points fixed, which gives fast and stable PnP-based localization of each new frame without re-optimizing the entire point cloud.
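To illustrate the idea of pose-only refinement, here is a small NumPy sketch of a Levenberg-Marquardt loop over six pose parameters with the 3D points held fixed. It does not reproduce BA_LM_localization.py. The function names, the axis-angle parametrization, the projection model x = K (R X + t), the finite-difference Jacobian, and the damping schedule are all assumptions chosen for brevity. An actual implementation would more likely use analytic Jacobians.

```python
import numpy as np


def rodrigues(rvec):
    """Rotation matrix from an axis-angle vector (Rodrigues' formula)."""
    theta = np.linalg.norm(rvec)
    if theta < 1e-12:
        return np.eye(3)
    k = rvec / theta
    S = np.array([[0.0, -k[2], k[1]], [k[2], 0.0, -k[0]], [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * S + (1.0 - np.cos(theta)) * (S @ S)


def project(pose, points_3d, K_mat):
    """Pixel coordinates of fixed 3D points for pose = [rvec (3), t (3)]."""
    cam = points_3d @ rodrigues(pose[:3]).T + pose[3:]
    proj = cam @ K_mat.T
    return proj[:, :2] / proj[:, 2:3]


def residuals(pose, points_3d, points_2d, K_mat):
    """Stacked reprojection residuals (predicted minus observed)."""
    return (project(pose, points_3d, K_mat) - points_2d).ravel()


def numeric_jacobian(f, x, eps=1e-6):
    """Central-difference Jacobian; a simplification for this sketch."""
    J = np.empty((f(x).size, x.size))
    for i in range(x.size):
        dx = np.zeros_like(x)
        dx[i] = eps
        J[:, i] = (f(x + dx) - f(x - dx)) / (2.0 * eps)
    return J


def refine_pose_lm(pose0, points_3d, points_2d, K_mat, n_iters=50, lam=1e-3):
    """Refine a camera pose with damped Gauss-Newton steps; points stay fixed."""
    def f(p):
        return residuals(p, points_3d, points_2d, K_mat)

    pose = np.asarray(pose0, dtype=float).copy()
    cost = 0.5 * np.sum(f(pose) ** 2)
    for _ in range(n_iters):
        J = numeric_jacobian(f, pose)
        H = J.T @ J                      # 6x6 Gauss-Newton approximation
        g = J.T @ f(pose)                # gradient of the cost
        step = np.linalg.solve(H + lam * np.eye(6), -g)
        if np.linalg.norm(step) < 1e-10:
            break
        candidate = pose + step
        new_cost = 0.5 * np.sum(f(candidate) ** 2)
        if new_cost < cost:              # accept the step and relax the damping
            pose, cost = candidate, new_cost
            lam = max(lam / 10.0, 1e-12)
        else:                            # reject the step and increase the damping
            lam *= 10.0
    return pose, cost
```

Because only six parameters are optimized, the normal equations reduce to a single 6×6 system per iteration, so no Schur complement is needed. Starting from the previous camera's pose, as the pipeline does, keeps the initial guess close to the solution.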

Results

Mean reprojection error typically < 1 pixel across all frames.
Robust camera trajectory estimation across 100+ frames.
Dense point cloud with multi-view consistency.

Project Structure

├── main_reconstruction.py       # [OUR CODE] Main SfM pipeline
├── BA_LM_localization.py        # [OUR CODE] Camera-only Bundle Adjustment
├── BA_LM_schur.py               # [PROVIDED] Full Bundle Adjustment
├── BA_LM_two_views_schur.py     # [PROVIDED] Two-view Bundle Adjustment
├── utils.py                     # [PROVIDED] Utility functions
├── viewer.py                    # [PROVIDED] 3D visualization (Open3D)
└── main_show_final_reconstruction.py  # [PROVIDED] Result visualization

Tech Stack

Language: Python 3.11+
Core libraries: NumPy, SciPy, OpenCV
Visualization: Open3D, Matplotlib
Environment: Conda
