VHDL EVT3 Decoder — NimbleAI / IKERLAN

May – July 2024 — Internship at IKERLAN, Basque Country, Spain

VHDL, FPGA, Neuromorphic, Embedded Systems

Context

IKERLAN is a prominent R&D technology center based in the Basque Country, part of the Mondragon Corporation. During this internship I worked within the NimbleAI project — a Horizon Europe-funded initiative developing ultra-energy-efficient neuromorphic vision systems that mimic the human brain's visual processing using event-based sensing.

The Problem

The NimbleAI architecture consumes decompressed event data (EVT2 format, 47 bits per event), but the Prophesee Metavision event-based camera streams compressed data in the EVT3 format (16-bit words with vectorized encoding). My task was to design a VHDL module that decodes EVT3 data to EVT2 in real time while meeting strict performance specifications: up to 60 million raw events per second at a clock frequency of up to 200 MHz.
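To make the specification concrete, the short calculation below works out the clock-cycle budget implied by those two figures. It reads "raw events" as incoming EVT3 words, which is an assumption. The `min_decoders` helper and its `cycles_per_word` argument are hypothetical: the number of cycles a decoder actually spends per word is not stated here, so the helper only gives a worst-case estimate in which every word occupies a decoder.

```python
# Cycle budget implied by the specification (both figures come from the text above).
CLOCK_HZ = 200_000_000        # maximum clock frequency
RAW_EVENT_RATE = 60_000_000   # maximum raw events per second (assumed to mean EVT3 words)

cycles_per_raw_event = CLOCK_HZ / RAW_EVENT_RATE
print(f"{cycles_per_raw_event:.2f} clock cycles available per raw event")  # prints 3.33


def min_decoders(cycles_per_word: int) -> int:
    """Worst-case number of parallel decoders needed to sustain the raw rate.

    cycles_per_word is a hypothetical figure: how long one decoder stays busy
    with a single word. The result is ceil(cycles_per_word * rate / clock),
    computed with integer arithmetic to avoid rounding surprises.
    """
    return -(-(cycles_per_word * RAW_EVENT_RATE) // CLOCK_HZ)


# Example: if a decoder were busy for 13 cycles per word, 4 decoders would be needed.
print(min_decoders(13))  # prints 4
```

With roughly 3.3 cycles available per incoming word, a single decoder that spends more than three cycles on a word cannot keep up alone, which is one way to see why the design runs several decoders in parallel.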

Architecture

The final decompressor is a pipelined architecture with four main stages:

FIFO: Buffers incoming EVT3 events from the sensor, with an "almost full" mechanism that prevents data loss by forcing consumption of status-update events even when decoders are busy.
Dispatcher & Status Manager: Routes EVT3 events to available decoders and maintains the current decoding state (timestamp, Y coordinate, X base, polarity).
Decoders: The core bottleneck layer. Each decoder extracts up to 12 EVT2 events from a single EVT3 event using an FSM-free design for minimal clock-cycle overhead.
Areas: Combinational blocks that route decoded EVT2 events to the correct downstream processing unit based on the event's origin coordinates on the sensor grid.
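The interplay between the decoding state and the vector expansion can be illustrated with a small software model. This is a behavioral sketch in Python, not the VHDL design: it merges the status manager and a decoder into one function, and the word-type codes and bit layout are written from my reading of Prophesee's public EVT 3.0 description, so they should be treated as assumptions. The output tuple `(t, x, y, p)` is a placeholder and does not reproduce the 47-bit EVT2 layout used by NimbleAI. Timestamp wrap-around handling is omitted.

```python
from dataclasses import dataclass

# EVT3 word types, taken from the upper 4 bits of each 16-bit word.
# Values are assumed from the public EVT 3.0 format description.
EVT_ADDR_Y, EVT_ADDR_X, VECT_BASE_X = 0x0, 0x2, 0x3
VECT_12, VECT_8 = 0x4, 0x5
EVT_TIME_LOW, EVT_TIME_HIGH = 0x6, 0x8


@dataclass
class DecodeState:
    """Decoding state: timestamp parts, Y coordinate, X base, polarity."""
    time_high: int = 0
    time_low: int = 0
    y: int = 0
    x_base: int = 0
    polarity: int = 0

    @property
    def timestamp(self) -> int:
        # Two 12-bit fields concatenated into one timestamp value.
        return (self.time_high << 12) | self.time_low


def decode_word(word: int, state: DecodeState) -> list:
    """Apply one 16-bit EVT3 word; return the decoded events as (t, x, y, p)."""
    kind = (word >> 12) & 0xF
    payload = word & 0x0FFF
    if kind == EVT_TIME_HIGH:
        state.time_high = payload
    elif kind == EVT_TIME_LOW:
        state.time_low = payload
    elif kind == EVT_ADDR_Y:
        state.y = payload & 0x7FF
    elif kind == VECT_BASE_X:
        state.x_base = payload & 0x7FF
        state.polarity = payload >> 11
    elif kind == EVT_ADDR_X:
        # Single event: carries its own X coordinate and polarity.
        return [(state.timestamp, payload & 0x7FF, state.y, payload >> 11)]
    elif kind in (VECT_12, VECT_8):
        # Vector word: each set bit i is one event at x_base + i.
        width = 12 if kind == VECT_12 else 8
        events = [(state.timestamp, state.x_base + i, state.y, state.polarity)
                  for i in range(width) if (payload >> i) & 1]
        state.x_base += width  # the next vector word continues from here
        return events
    return []  # status updates and unhandled types emit nothing


# Usage: time high = 1, time low = 16, y = 5, base x = 100 with polarity 1,
# then a 12-bit vector with bits 0, 2 and 11 set.
state = DecodeState()
decoded = []
for w in (0x8001, 0x6010, 0x0005, 0x3864, 0x4805):
    decoded.extend(decode_word(w, state))
print(decoded)  # [(4112, 100, 5, 1), (4112, 102, 5, 1), (4112, 111, 5, 1)]
```

The example shows why status-update words (time, Y, X base) are cheap to consume while vector words are the expensive case: only the latter can fan out into as many as 12 output events.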

The entire architecture is fully parameterized: the number of decoders, the number of areas, the FIFO depth, and the area edge coordinates are all configurable, which allows different configurations to be benchmarked.
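As an illustration of how such parameters could drive the area routing, the sketch below models a configuration record and a lookup that maps an event's coordinates to an area index. Everything named here is hypothetical: the `DecompressorConfig` fields, the 2 x 2 grid layout, and the edge values 640 and 360 (which assume a 1280 x 720 sensor) are chosen for illustration only and are not taken from the actual design. The defaults for decoder count and FIFO depth mirror one configuration reported in the results.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class DecompressorConfig:
    """Hypothetical software stand-in for the design's configurable parameters."""
    n_decoders: int = 10
    fifo_depth: int = 150
    x_edges: tuple = (640,)   # assumed column boundaries on the sensor grid
    y_edges: tuple = (360,)   # assumed row boundaries on the sensor grid

    @property
    def n_areas(self) -> int:
        # k edges along an axis split it into k + 1 strips.
        return (len(self.x_edges) + 1) * (len(self.y_edges) + 1)


def area_index(cfg: DecompressorConfig, x: int, y: int) -> int:
    """Return the row-major index of the area containing pixel (x, y).

    An edge coordinate belongs to the area on its right (or below it).
    """
    col = bisect_right(cfg.x_edges, x)
    row = bisect_right(cfg.y_edges, y)
    return row * (len(cfg.x_edges) + 1) + col


cfg = DecompressorConfig()
print(cfg.n_areas)                 # 4
print(area_index(cfg, 100, 100))   # 0 (top-left)
print(area_index(cfg, 640, 100))   # 1 (top-right)
print(area_index(cfg, 100, 500))   # 2 (bottom-left)
print(area_index(cfg, 1279, 719))  # 3 (bottom-right)
```

In hardware, the same decision reduces to a handful of comparators against constant edge values, which is consistent with the areas being described as purely combinational blocks.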

Results

100% decoding accuracy on real Prophesee datasets (laser and driving) with a sufficiently large configuration.
The largest tested configuration (10 decoders, 4 areas, FIFO depth 150) met timing constraints up to 285 MHz.
A 10-decoder, 600-depth FIFO configuration decoded 94.2% of events from an artificially overloaded dataset averaging 150.9 M events/second — well above the 60 M events/second specification.

Stack

VHDL, Xilinx Vivado, AXI-Stream protocol, C++ (for test comparison with Prophesee's reference decoder), Prophesee Metavision SDK.
