CVPR 2024 · Pseudo Auto-Labelling

Boosting 3D Scene Flow Estimation by Pseudo Auto-Labelling

A label-free framework that synthesizes realistic 3D scene-flow labels for raw LiDAR. It outperforms every prior supervised and unsupervised baseline and achieves a 10× lower EPE3D on LiDAR KITTI.

Abstract

Scene flow, learned from zero manual labels

Learning 3D scene flow from LiDAR point clouds is hard for three reasons: models trained on synthetic data generalize poorly to real scenes, real-world 3D labels are scarce, and existing methods perform poorly on sparse LiDAR. We approach the problem through auto-labelling: generating large amounts of 3D scene-flow pseudo-labels for real LiDAR point clouds. Under a rigid-body-motion assumption, multiple anchor boxes with distinct motion attributes decompose the scene into per-object rigid movements. A novel global and local motion augmentation then synthesizes target point clouds from the augmented motion parameters, yielding abundant labels that are highly consistent with real scenes. On LiDAR KITTI, nuScenes and Argoverse, our method surpasses every prior supervised and unsupervised approach without any manual labelling, cutting EPE3D on LiDAR KITTI tenfold, from 0.190 m to 0.008 m.
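
For readers unfamiliar with the metric, EPE3D (3D end-point error) is commonly defined as the mean Euclidean distance, in metres, between predicted and reference flow vectors. The sketch below is a dependency-free illustration of that definition; the function name `epe3d` and the toy values are assumptions for illustration, not the authors' evaluation code.

```python
import math

def epe3d(pred_flow, ref_flow):
    """Mean Euclidean distance between predicted and reference 3D flow vectors."""
    if len(pred_flow) != len(ref_flow) or not pred_flow:
        raise ValueError("flows must be non-empty and of equal length")
    # math.dist gives the Euclidean distance between two points (Python 3.8+).
    return sum(math.dist(p, r) for p, r in zip(pred_flow, ref_flow)) / len(pred_flow)

# Toy example (hypothetical values): one exact prediction, one off by 0.5 m in z.
pred = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
ref = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.5)]
print(epe3d(pred, ref))  # 0.25
```

An EPE3D of 0.008 m therefore means the predicted flow vectors miss the reference by 8 mm on average.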

0.008 m EPE3D · LiDAR KITTI
10× lower error vs. 0.190 m
0 manual labels required
Method

From anchor boxes to millions of flow labels

Optimize per-object rigid motion, then augment it into diverse, realistic supervision.

3DSFLabelling pipeline overview
Figure 1. Automatic pseudo 3D scene-flow labelling and model learning. Inputs are 3D anchor boxes, a pair of point clouds and coarse normals. Motion optimization updates box, global-motion, local-motion and box-motion-probability parameters via six objective functions. The optimized parameters drive a global-local augmentation module that simulates K motion types, producing many label candidates from a single source frame to supervise point-wise motion.
01 — INPUT

Boxes & point pairs

3D anchor boxes, a source/target point-cloud pair and their coarse normal vectors enter the optimizer.

02 — OPTIMIZE

Rigid motion decomposition

Six objective functions optimize the box, global-motion and local-motion parameters, decomposing the scene into per-object rigid movements.

03 — AUGMENT

Generate flow labels

Global-local augmentation samples K motion sets, synthesizing target clouds and abundant scene-flow labels.
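
The three steps above end with a target cloud synthesized from motion parameters, so the flow label for each point is simply its displacement. The sketch below illustrates that idea under loud simplifications: boxes are axis-aligned, each rigid motion is a yaw rotation plus a translation, and local motion is applied before global motion. The helper names (`rigid_yaw`, `inside`, `synthesize_labels`) and the parameter layout are hypothetical, not the authors' implementation.

```python
import math

def rigid_yaw(point, yaw, translation, pivot=(0.0, 0.0, 0.0)):
    """Rotate `point` by `yaw` radians about the z-axis through `pivot`, then translate."""
    x, y, z = (point[i] - pivot[i] for i in range(3))
    c, s = math.cos(yaw), math.sin(yaw)
    return (c * x - s * y + pivot[0] + translation[0],
            s * x + c * y + pivot[1] + translation[1],
            z + pivot[2] + translation[2])

def inside(point, box):
    """Axis-aligned containment test (assumption); box = (centre, half_sizes)."""
    centre, half = box
    return all(abs(point[i] - centre[i]) <= half[i] for i in range(3))

def synthesize_labels(source, boxes, local_motions, global_motion):
    """Return (target_cloud, flow_labels) for one set of motion parameters."""
    g_yaw, g_t = global_motion
    target, flow = [], []
    for p in source:
        q = p
        for box, (yaw, t) in zip(boxes, local_motions):
            if inside(p, box):
                q = rigid_yaw(q, yaw, t, pivot=box[0])  # object moves about its own centre
                break
        q = rigid_yaw(q, g_yaw, g_t)  # whole scene moves with the simulated global motion
        target.append(q)
        flow.append(tuple(q[i] - p[i] for i in range(3)))  # label = displacement
    return target, flow

# Toy scene (hypothetical values): one point inside a box, one background point.
source = [(10.0, 0.0, 0.0), (0.0, 5.0, 0.0)]
boxes = [((10.0, 0.0, 0.0), (1.0, 1.0, 1.0))]
target, flow = synthesize_labels(source, boxes,
                                 local_motions=[(0.0, (1.0, 0.0, 0.0))],
                                 global_motion=(0.0, (0.0, 0.5, 0.0)))
print(flow)
```

Because the target cloud is generated from the source, every point has an exact flow label by construction; no correspondence search or manual annotation is involved.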

Pseudo auto-labelling framework
The pseudo auto-labelling framework. Given point clouds and initial bounding boxes, global and local motion parameters are iteratively optimized. Randomly perturbing the optimized parameters then produces diverse, realistic motion patterns for training 3D scene-flow estimators.
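
One way to picture the random adjustment of motion parameters is to draw several noisy copies of the optimized values. The sketch below does this with Gaussian noise; the noise model, the scales `yaw_sigma` and `trans_sigma`, and the `(yaw, (tx, ty, tz))` parameter layout are all assumptions for illustration, since the page does not specify how the parameters are perturbed.

```python
import random

def augment_motions(optimized, k, yaw_sigma=0.05, trans_sigma=0.5, seed=0):
    """Draw k perturbed copies of optimized (yaw, (tx, ty, tz)) motion parameters."""
    rng = random.Random(seed)  # seeded so the augmentation is reproducible
    sets = []
    for _ in range(k):
        sets.append([
            (yaw + rng.gauss(0.0, yaw_sigma),
             tuple(c + rng.gauss(0.0, trans_sigma) for c in t))
            for yaw, t in optimized
        ])
    return sets

# Two optimized motions (hypothetical values), expanded into K = 3 motion sets.
optimized = [(0.10, (1.0, 0.0, 0.0)), (0.00, (0.0, 2.0, 0.0))]
sets = augment_motions(optimized, k=3)
print(len(sets), len(sets[0]))  # 3 2
```

Each motion set can then be used to synthesize its own target cloud, so a single source frame yields K differently labelled training pairs.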
Qualitative Results

See the alignment, in interactive 3D

Each method is shown next to its baseline from the same viewing angle, so the two can be compared directly. Higher overlap between the warped source cloud and the target means lower scene-flow error.

Registration visualization on LiDAR KITTI and Argoverse

Registration results of our method (GMSF+3DSFLabelling) vs. baselines on LiDAR KITTI and Argoverse. The source cloud is warped toward the target by the predicted scene flow, giving PC_SW; the larger the overlap between PC_SW and the target cloud PC_T, the higher the accuracy.
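
The overlap shown in these visualizations can be approximated numerically by warping the source cloud with the predicted flow and measuring how far each warped point lies from its nearest target point. The sketch below uses a brute-force nearest-neighbour search; this proxy and the helper names `warp` and `mean_nn_distance` are assumptions for illustration, not the metric reported on this page (which is EPE3D).

```python
import math

def warp(source, flow):
    """Add each flow vector to its source point."""
    return [tuple(p[i] + f[i] for i in range(3)) for p, f in zip(source, flow)]

def mean_nn_distance(warped, target):
    """Average distance from each warped point to its nearest target point (O(N*M))."""
    return sum(min(math.dist(w, t) for t in target) for w in warped) / len(warped)

# Toy example (hypothetical values): the whole cloud moves 1 m along z.
source = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
target = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
good_flow = [(0.0, 0.0, 1.0)] * 2
zero_flow = [(0.0, 0.0, 0.0)] * 2
print(mean_nn_distance(warp(source, good_flow), target))  # 0.0
print(mean_nn_distance(warp(source, zero_flow), target))  # 1.0
```

A smaller value corresponds to tighter visual overlap between the warped source and the target.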

Comparison panels: FLOT vs. FLOT+3DSFLabelling; GMSF (GT-trained) vs. GMSF+3DSFLabelling; MSBRN vs. MSBRN+3DSFLabelling. Each panel shows the target-frame point cloud together with the source frame warped by the predicted scene flow. Where a colour scale is shown, it encodes EPE3D from low error to high error.

Video

Motion augmentation, in motion

Predicted target frames — FLOT vs. FLOT+3DSFLabelling
Global Motion Augmentation
Local Motion Augmentation
Global-Local Motion Augmentation
Citation

Cite 3DSFLabelling

@InProceedings{Jiang_2024_CVPR,
  author    = {Jiang, Chaokang and Wang, Guangming and Liu, Jiuming and
               Wang, Hesheng and Ma, Zhuang and Liu, Zhenqiang and
               Liang, Zhujin and Shan, Yi and Du, Dalong},
  title     = {3DSFLabelling: Boosting 3D Scene Flow Estimation by Pseudo Auto-labelling},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer
               Vision and Pattern Recognition (CVPR)},
  month     = {June},
  year      = {2024},
  pages     = {15173-15183}
}