Research + Deployment Project

3D Scene Flow: Auto-Labeling & Production Deployment

An unsupervised point- and occupancy-level 3D scene-flow auto-labeling system, two deployable flow networks, an ultra-light production head, and full ONNX → TensorRT / Horizon J6E deployment.

Timeline: 2023.08–2023.12
Context: PhiGent Robotics
Role: 3D Scene Flow Algorithm Engineer
Stage: Deployment-oriented research

Overview

What is this project about?

A 3D motion-estimation stack for autonomous driving: an unsupervised auto-labeling system that assigns a 3D scene-flow vector to every LiDAR point and every occupancy cell, validated by lifting the accuracy of existing flow estimators, distilled into an ultra-light production head, and deployed end-to-end through ONNX, TensorRT (Orin) and the Horizon J6E toolchain.

research 3d-4d scene-flow deployment


Give every point — and every occupancy cell — a motion vector, with no human labels

3D scene flow is the dense per-point 3D motion field between two LiDAR sweeps — the geometric backbone for dynamic-object reasoning, velocity estimation and occupancy-flow prediction. Hand-labeling it is effectively impossible. This project built an unsupervised auto-labeling system that assigns a motion vector to every point and every occupancy (occ) cell, used those labels to sharpen existing flow estimators, distilled the result into an ultra-light production head, and shipped the whole stack through ONNX → TensorRT (NVIDIA Orin) and the Horizon J6E toolchain.

Unsupervised auto-labeling | Point-level + Occ-level flow | Graph-conv + math constraints | ONNX · TensorRT · Horizon J6E

Logic map

From unsupervised labels to on-vehicle occupancy flow

One label engine feeds a tiny production head; a single ONNX source fans out to two silicon targets.

01 · The core IP · Auto-Flow labeling

An unsupervised 3D scene-flow & occ-flow auto-labeler

The heart of the project: a self-supervised system that takes raw LiDAR sweeps and produces a dense 3D motion label for every point and every occupancy cell — no manual annotation in the loop. These auto-labels become the training signal for every downstream flow model.

3D scene-flow and occupancy-flow auto-labeling architecture
Auto-labeling architecture. From consecutive LiDAR sweeps, the system jointly estimates a per-point 3D scene-flow field and a per-occ 3D motion field, supervised only by geometric and temporal consistency — so high-quality flow labels are generated automatically, at scale.
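To make the relation between the per-point field and the per-occ field concrete, here is a minimal sketch that pools point flow into occupancy cells by averaging. This is an illustrative assumption, not the system's actual mechanism (the source describes the two fields as jointly estimated); the function name, voxel size and grid extent are hypothetical.

```python
import numpy as np

def pool_point_flow_to_occ(points, flow, voxel_size=0.5,
                           grid_min=(-40.0, -40.0, -3.0),
                           grid_shape=(160, 160, 16)):
    """Average per-point flow inside each occupied voxel (illustrative only).

    Returns (occupied_idx, occ_flow): integer voxel coordinates (M, 3)
    and the mean flow vector of the points falling in each voxel (M, 3).
    """
    points = np.asarray(points, dtype=np.float64)
    flow = np.asarray(flow, dtype=np.float64)

    # Integer voxel coordinate of every point.
    ijk = np.floor((points - np.asarray(grid_min)) / voxel_size).astype(np.int64)

    # Drop points that fall outside the grid.
    inside = np.all((ijk >= 0) & (ijk < np.asarray(grid_shape)), axis=1)
    ijk, flow = ijk[inside], flow[inside]

    # Group points by flattened voxel index and average their flow.
    flat = np.ravel_multi_index(tuple(ijk.T), grid_shape)
    uniq, inverse, counts = np.unique(flat, return_inverse=True, return_counts=True)
    sums = np.zeros((uniq.size, 3))
    np.add.at(sums, inverse, flow)
    occ_flow = sums / counts[:, None]

    occupied_idx = np.stack(np.unravel_index(uniq, grid_shape), axis=1)
    return occupied_idx, occ_flow
```

Only occupied cells are returned, which keeps the label sparse; empty cells carry no motion label in this sketch.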
Step 1 · Input

Consecutive LiDAR sweeps

  • Paired point clouds P_t and P_{t+1} plus ego-pose; no flow ground truth required
Step 2 · Estimate

Dense motion field

  • Predict a 3D vector per point and per occ cell; ego-motion compensated so only true object motion remains
Step 3 · Self-supervise

Consistency objectives

  • Nearest-neighbour / cycle / smoothness constraints replace human labels with geometry
Step 4 · Emit

Point- & occ-flow labels

  • A reusable label bank that trains and stress-tests every downstream estimator
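Steps 1 to 3 can be sketched roughly as follows: compensate ego-motion, then score a candidate flow by how well the warped sweep lands on the next one. This is a sketch under assumptions, not the project's actual objective: poses are taken to be 4x4 sensor-to-world matrices, the nearest-neighbour search is brute force, and all function names are hypothetical.

```python
import numpy as np

def compensate_ego_motion(points_t, pose_t, pose_t1):
    """Express sweep t in the sensor frame of sweep t+1.

    Assumes pose_t and pose_t1 are 4x4 sensor-to-world transforms.
    After this step, static points should overlap with sweep t+1.
    """
    rel = np.linalg.inv(pose_t1) @ pose_t
    return points_t @ rel[:3, :3].T + rel[:3, 3]

def nearest_neighbour_residual(source, target):
    """Distance from each source point to its nearest target point (brute force)."""
    d2 = ((source[:, None, :] - target[None, :, :]) ** 2).sum(-1)
    return np.sqrt(d2.min(axis=1))

def nn_consistency_loss(points_t, flow, points_t1):
    """Symmetric nearest-neighbour (Chamfer-style) loss on the warped sweep."""
    warped = points_t + flow
    forward = nearest_neighbour_residual(warped, points_t1)
    backward = nearest_neighbour_residual(points_t1, warped)
    return forward.mean() + backward.mean()
```

For a static scene, the compensated sweep with zero flow already scores near zero, so any remaining residual points at true object motion. A real pipeline would replace the O(N²) distance matrix with a spatial index.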
Why unsupervised
Dense 3D flow has no scalable human-labeling route — a single sweep holds 100k+ points. Driving geometry (rigid ego-motion, locally smooth object motion, cross-frame correspondence) supplies the supervision instead, so labels scale with raw mileage rather than with annotation budget.
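The "locally smooth object motion" prior can be written as a penalty on flow differences between neighbouring points. The sketch below is one common formulation, assumed for illustration; the neighbourhood size k and the function names are hypothetical.

```python
import numpy as np

def knn_indices(points, k):
    """Indices of the k nearest neighbours of every point (brute force, self excluded)."""
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)
    return np.argsort(d2, axis=1)[:, :k]

def smoothness_loss(points, flow, k=4):
    """Mean norm of the flow difference between each point and its k neighbours."""
    idx = knn_indices(points, k)               # (N, k)
    diff = flow[:, None, :] - flow[idx]        # (N, k, 3)
    return np.linalg.norm(diff, axis=-1).mean()
```

A rigidly translating cluster has identical flow on every point and scores zero; independent per-point noise is penalised.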
02 · Does it actually help? · Validation

Auto-labels lift existing flow estimators

The acid test for a labeling system is whether its labels make other models better. Retraining established 3D scene-flow estimators with our auto-labels improved their prediction accuracy across the methods tested, which is evidence that the generated supervision is accurate enough to be useful.

Baseline estimators

Original supervision

Trained on their native, limited flow signals

Weaker on fast, distant and sparse objects

+ our auto-labels

Substantially sharper

Same architectures, richer dense supervision

Consistent accuracy gains across methods

Accuracy improvement after applying our auto-labels to existing scene-flow methods
Result. Existing 3D scene-flow estimation methods, retrained with labels from our auto-labeling system, show a clear, consistent jump in prediction accuracy — the labeling system pays for itself across architectures.
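For reference, accuracy in the scene-flow literature is commonly reported as end-point error (EPE) plus strict and relaxed accuracy ratios. The source does not state which metrics were used here, so the sketch below only illustrates those conventional definitions; the thresholds (0.05 m or 5 %, 0.10 m or 10 %) are the customary ones, not values from this project.

```python
import numpy as np

def scene_flow_metrics(pred, gt, strict=0.05, relaxed=0.10):
    """Conventional scene-flow metrics on (N, 3) predicted and reference flow."""
    err = np.linalg.norm(pred - gt, axis=1)                     # per-point EPE
    rel = err / np.maximum(np.linalg.norm(gt, axis=1), 1e-8)    # relative error
    return {
        "epe": float(err.mean()),
        "acc_strict": float(np.mean((err < strict) | (rel < strict))),
        "acc_relaxed": float(np.mean((err < relaxed) | (rel < relaxed))),
    }
```

Comparing these numbers for the same architecture before and after retraining on the auto-labels is the kind of check the result above refers to.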
03 · From research to silicon · Production head

An ultra-light 3D scene-flow head — two design routes

For mass production the flow predictor must be tiny and embedded-friendly. We explored two routes for the production head and compared them head-to-head: a graph-convolution + mathematical-constraint design, and a pure point-cloud deep-learning design fit directly to supervision.

Ultra-light 3D scene-flow production head — graph-convolution variant (top) and pure deep-learning variant (bottom)
Two production heads. Top — graph convolution with explicit mathematical (rigidity / smoothness) constraints. Bottom — a pure point-cloud network that regresses flow directly from supervision. Both are budgeted for on-vehicle compute.
Route A · Graph-conv + math

Geometry-guided

Graph convolution over local neighbourhoods

Explicit rigidity / smoothness constraints

Robust & interpretable on structured motion
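One way an explicit rigidity constraint can be imposed is to project the flow of a point cluster onto the closest rigid motion with the Kabsch algorithm. This is an assumed illustration of the idea, not a description of the head's actual constraint; the function names are hypothetical.

```python
import numpy as np

def fit_rigid_transform(src, dst):
    """Least-squares rotation R and translation t with dst ≈ src @ R.T + t (Kabsch)."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)          # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection: force det(R) = +1.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t

def rigidify_flow(points, flow):
    """Replace a cluster's flow by the flow of its best-fit rigid motion."""
    R, t = fit_rigid_transform(points, points + flow)
    return points @ R.T + t - points
```

Applied per object cluster, this removes per-point jitter that no rigid body could produce, which is where the robustness on structured motion comes from.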

Route B · Pure deep-learning

Data-driven

Point-cloud network regresses flow end-to-end

Fit directly to the auto-generated labels

Simplest graph to export and quantize
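To illustrate why Route B tends to yield a simple graph, here is a minimal PointNet-style forward pass: a shared per-point layer, a global max-pool, and a linear regressor. The layer sizes, feature dimension and names are hypothetical and the weights are random; the real head's architecture is not described in the source.

```python
import numpy as np

def init_params(rng, in_dim=6, hidden=16):
    """Random weights for the sketch; a trained head would load learned values."""
    return {
        "w1": rng.normal(0.0, 0.1, (in_dim, hidden)), "b1": np.zeros(hidden),
        "w2": rng.normal(0.0, 0.1, (2 * hidden, 3)), "b2": np.zeros(3),
    }

def flow_head(features, params):
    """Map (N, in_dim) per-point features to (N, 3) flow vectors."""
    # Shared per-point layer with ReLU.
    h = np.maximum(features @ params["w1"] + params["b1"], 0.0)
    # Global context: max over points, broadcast back to every point.
    g = h.max(axis=0, keepdims=True)
    fused = np.concatenate([h, np.repeat(g, h.shape[0], axis=0)], axis=1)
    # Linear regression to a 3D vector per point.
    return fused @ params["w2"] + params["b2"]
```

The whole graph consists of matrix multiplies, ReLU, a max reduction and a concat, with no neighbour search inside the network, which is the property that makes this route straightforward to export and quantize.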

04 · On-vehicle · Deployment

ONNX → TensorRT / Horizon J6E, with occ-flow inference

The trained head was exported to ONNX, optimized with TensorRT for NVIDIA Orin, and converted with the Horizon SDK toolchain for J6E. In comparison with existing production options the solution was competitive, and the same exported graph produces the occupancy-flow inference shown below.

Production-solution comparison and occupancy-flow prediction visualized from ONNX inference
Deployment evidence. Our production flow solution compared against other existing options, alongside an occupancy-flow prediction visualized directly from inference on the exported ONNX graph, indicating that the exported graph reproduces the behaviour seen in training.
① Export

Trained model → ONNX

  • Floating-point or quantized graph exported to a portable .onnx
②A · NVIDIA

TensorRT on Orin

  • Graph & precision optimization for the Orin runtime
②B · Horizon

SDK on J6E

  • Convert & quantize through the Horizon J6E toolchain
③ Runtime

On-vehicle inference loop

  • Point-cloud preprocess → cache init → normalize → inference → output parsing → perf stats
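The runtime stages in ③ can be sketched as a small per-frame loop with stage timing. Everything here is a hypothetical stand-in: `infer_fn` replaces the deployed engine, "cache init" is interpreted as seeding a previous-sweep cache on the first frame, and the fixed point budget and range are made-up values.

```python
import time
import numpy as np

class FlowRuntime:
    """Hypothetical per-frame loop; infer_fn(prev, cur) stands in for the engine."""

    def __init__(self, infer_fn, max_points=1024, point_range=50.0):
        self.infer_fn = infer_fn
        self.max_points = max_points
        self.point_range = point_range
        self.prev = None      # cached previous sweep (preprocessed)
        self.timings = []     # one {stage: seconds} dict per frame

    def _preprocess(self, points):
        # Crop to range, then truncate / zero-pad to a fixed-size tensor.
        pts = np.asarray(points, dtype=np.float32)
        keep = np.all(np.abs(pts) <= self.point_range, axis=1)
        pts = pts[keep][: self.max_points]
        padded = np.zeros((self.max_points, 3), dtype=np.float32)
        padded[: len(pts)] = pts
        return padded, len(pts)

    def step(self, points):
        stage, tic = {}, time.perf_counter()

        def lap(name):
            nonlocal tic
            now = time.perf_counter()
            stage[name], tic = now - tic, now

        cur, n_valid = self._preprocess(points)
        lap("preprocess")
        if self.prev is None:            # first frame: pair the sweep with itself
            self.prev = cur.copy()
        lap("cache_init")
        prev_n, cur_n = self.prev / self.point_range, cur / self.point_range
        lap("normalize")
        raw = self.infer_fn(prev_n, cur_n)
        lap("inference")
        # Drop padding rows and undo the normalization.
        flow = np.asarray(raw)[:n_valid] * self.point_range
        self.prev = cur
        lap("parse")
        self.timings.append(stage)
        return flow

    def stats(self):
        """Mean time per stage in milliseconds."""
        if not self.timings:
            return {}
        return {k: 1e3 * float(np.mean([t[k] for t in self.timings]))
                for k in self.timings[0]}
```

Timing each stage separately makes it visible whether preprocessing or the engine call dominates the frame budget on a given target.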
Deployment targets
One trained head, two silicon paths: TensorRT / Orin for the high-compute tier and Horizon J6E for the cost-optimized tier — sharing a single ONNX source of truth so training and on-vehicle behaviour stay aligned.
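Keeping training and on-vehicle behaviour aligned is usually verified numerically: run the same input through the reference model and the converted engine and compare outputs. The sketch below shows such a parity report; the simulated int8 rounding is only a stand-in for a quantized backend and is not the TensorRT or Horizon quantizer.

```python
import numpy as np

def fake_int8(x, scale):
    """Simulated symmetric int8 quantize-dequantize (stand-in for a real backend)."""
    return np.clip(np.round(np.asarray(x) / scale), -127, 127) * scale

def parity_report(reference, candidate):
    """Compare two backends' outputs on the same input."""
    ref = np.asarray(reference, dtype=np.float64).ravel()
    cand = np.asarray(candidate, dtype=np.float64).ravel()
    denom = np.linalg.norm(ref) * np.linalg.norm(cand) + 1e-12
    return {
        "max_abs_err": float(np.max(np.abs(ref - cand))),
        "cosine": float(ref @ cand / denom),
    }
```

In practice `reference` would come from the training framework and `candidate` from each target's runtime, with thresholds on both numbers chosen per deployment tier.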

Visualization

Auto-Flow, running

The end-to-end result: dense 3D scene flow auto-labeled and predicted on real driving sequences.

Auto-Flow demo. Per-point 3D motion estimated across a full sequence — the colour field encodes the predicted scene-flow direction and magnitude.
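A colour field like this is typically produced by mapping flow direction to hue and flow magnitude to brightness. The exact mapping used in the demo is not given, so the sketch below is an assumed convention: it uses only the horizontal (x, y) components, and `max_speed` is a hypothetical saturation value.

```python
import colorsys
import math

def flow_to_rgb(fx, fy, max_speed=10.0):
    """Map a horizontal flow vector to an RGB triple in [0, 1].

    Hue encodes heading (angle in the x-y plane), value encodes speed,
    saturating at max_speed. Static points come out black.
    """
    hue = (math.atan2(fy, fx) / (2.0 * math.pi)) % 1.0
    value = min(math.hypot(fx, fy) / max_speed, 1.0)
    return colorsys.hsv_to_rgb(hue, 1.0, value)
```

With this convention, motion along +x renders red and motion along +y renders yellow-green, so opposing traffic streams are separable at a glance.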
Confidentiality note. Only the general pipeline and deployment concepts are shown. Internal data, exact metrics, model parameters, and hardware-specific optimization details are omitted; any figures are illustrative.