3DSFLabelling: Boosting 3D Scene Flow Estimation by Pseudo Auto-labelling


Figure 1. Automatic pseudo 3D scene flow labelling and model learning. The input comprises 3D anchor boxes, a pair of point clouds, and their corresponding coarse normal vectors. The motion-parameter optimization updates the bounding-box parameters, the global motion parameters, the local motion parameters, and each box's motion probability; the parameters needed for the box updates are refined by optimizing six objective functions. Once optimized, the motion parameters simulate K types of motion via a global-local data augmentation module. A single source-frame point cloud, combined with the K generated sets of motion parameters, produces multiple 3D scene flow label candidates, which supervise the neural network to learn point-wise motion.

Abstract


Learning 3D scene flow from LiDAR point clouds presents significant difficulties, including poor generalization from synthetic datasets to real scenes, scarcity of real-world 3D labels, and poor performance on real sparse LiDAR point clouds. We present a novel approach from the perspective of auto-labelling, aiming to generate a large number of 3D scene flow pseudo labels for real-world LiDAR point clouds. Specifically, we employ the assumption of rigid body motion to simulate potential object-level rigid movements in autonomous driving scenarios. By updating different motion attributes for multiple anchor boxes, we obtain a rigid motion decomposition for the whole scene. Furthermore, we develop a novel 3D scene flow data augmentation method for global and local motion. By synthesizing target point clouds exactly from the augmented motion parameters, we easily obtain abundant 3D scene flow labels that are highly consistent with real scenarios. On multiple real-world datasets including LiDAR KITTI, nuScenes, and Argoverse, our method outperforms all previous supervised and unsupervised methods without requiring manual labelling. Impressively, our method achieves a tenfold reduction in the EPE3D metric on the LiDAR KITTI dataset, reducing it from $0.190m$ to a mere $0.008m$ error.
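The rigid-motion decomposition above can be sketched as follows: background points follow a single global (ego-motion) transform, while points inside each moving anchor box additionally follow that box's local rigid motion. This is a minimal illustrative sketch, not the paper's optimization; the function names and the `(mask, R, t)` box representation are our own assumptions.

```python
import numpy as np

def rigid_flow(points, R, t):
    """Scene flow induced by a rigid transform: f = (R p + t) - p."""
    return points @ R.T + t - points

def scene_flow_labels(points, boxes, ego_R, ego_t):
    """Pseudo scene-flow labels from per-box rigid motions (sketch only).

    points       : (N, 3) source-frame point cloud
    boxes        : list of (mask, R, t) — boolean mask of points inside an
                   anchor box, plus that box's (already optimized) local motion
    ego_R, ego_t : global rigid transform (ego-motion)
    """
    # Background points follow the global motion only.
    flow = rigid_flow(points, ego_R, ego_t)
    # Points inside a moving box compose the local box motion with the
    # global motion before the flow is read off.
    for mask, R, t in boxes:
        p = points[mask]
        moved = (p @ R.T + t) @ ego_R.T + ego_t
        flow[mask] = moved - p
    return flow
```

With an identity ego-motion and one box translated by `t`, the box points receive flow `t` and the background flow is zero, which matches the per-object rigidity assumption.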

The proposed 3D scene flow pseudo-auto-labelling framework. Given point clouds and initial bounding boxes, both global and local motion parameters are iteratively optimized. Randomly perturbing these optimized motion parameters augments diverse motion patterns, creating a varied and realistic set of motion labels for training 3D scene flow estimation models.
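The augmentation step can be sketched as below: from one optimized motion set, K perturbed sets are sampled, and each synthesizes a target cloud whose flow labels are exact by construction. This is a hedged sketch under our own simplifications (yaw-only local rotation, Gaussian jitter); the noise scales `sigma_t` and `sigma_yaw` are hypothetical, not the paper's values.

```python
import numpy as np

def yaw_matrix(theta):
    """Rotation about the vertical (z) axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.], [s, c, 0.], [0., 0., 1.]])

def augment_motions(points, ego_t, box_motions, K=8,
                    sigma_t=0.1, sigma_yaw=0.05, rng=None):
    """Global-local motion augmentation (illustrative sketch).

    points      : (N, 3) single source-frame point cloud
    ego_t       : optimized global translation
    box_motions : list of (mask, t) — points inside a box and its
                  optimized local translation
    Returns K (target_cloud, flow_label) pairs.
    """
    rng = np.random.default_rng(rng)
    samples = []
    for _ in range(K):
        # Jitter the global translation.
        g_t = ego_t + rng.normal(0., sigma_t, 3)
        flow = np.tile(g_t, (len(points), 1))
        for mask, t in box_motions:
            # Jitter each box's local yaw and translation.
            R = yaw_matrix(rng.normal(0., sigma_yaw))
            l_t = t + rng.normal(0., sigma_t, 3)
            p = points[mask]
            flow[mask] = (p @ R.T + l_t + g_t) - p
        # The target frame is synthesized, so the labels are exact.
        target = points + flow
        samples.append((target, flow))
    return samples
```

Because the target frame is generated from the motion parameters rather than observed, every augmented sample comes with a perfectly consistent scene flow label, which is the key property the framework exploits.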

Qualitative results


Visualization
Registration visualization results of our method (GMSF+3DSFLabelling) and baselines on the LiDAR KITTI and Argoverse datasets. The estimated target point cloud $PC_{sw}$ is derived by warping the source point cloud $PC_{S}$ toward the target point cloud via the 3D scene flow. The larger the overlap between $PC_{sw}$ (blue) and the target point cloud $PC_T$ (green), the higher the accuracy of the predicted scene flow. Local areas are zoomed in for better visibility. Our pseudo-labelling notably improves 3D scene flow estimation.
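The warping and the EPE3D metric used throughout these visualizations reduce to two one-liners; a minimal sketch (function names are ours):

```python
import numpy as np

def warp(pc_s, flow):
    """Warped cloud PC_sw = PC_S + predicted flow."""
    return pc_s + flow

def epe3d(pred_flow, gt_flow):
    """End-point error: mean Euclidean distance between flow vectors."""
    return np.linalg.norm(pred_flow - gt_flow, axis=1).mean()
```

In the figures, a large overlap between `warp(pc_s, pred_flow)` (blue) and the observed target cloud (green) corresponds to a low EPE3D.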

LiDAR KITTI - scene flow (Registration visualization)

Green points represent the target frame point cloud.

Pink points: Source frame point cloud + scene flow → target frame.

FLOT+3DSFLabelling

FLOT

LiDAR KITTI - EPE3D (3D scene flow EPE3D error visualization)

Error Map

FLOT+3DSFLabelling

FLOT

WaymoOpen - scene flow (Registration visualization)


GMSF+3DSFLabelling

GMSF (trained on Waymo with scene flow GT)

WaymoOpen - EPE3D (3D scene flow EPE3D error visualization)

Error Map

GMSF+3DSFLabelling

GMSF (trained on Waymo with scene flow GT)

Argoverse - scene flow (Registration visualization)


MSBRN+3DSFLabelling

MSBRN

nuScenes - scene flow (Registration visualization)


MSBRN+3DSFLabelling

MSBRN

Visual Comparison of the Predicted Target Frames (FLOT and FLOT+3DSFLabelling)

Citation


@InProceedings{Jiang_2024_CVPR,
    author    = {Jiang, Chaokang and Wang, Guangming and Liu, Jiuming and Wang, Hesheng and Ma, Zhuang and Liu, Zhenqiang and Liang, Zhujin and Shan, Yi and Du, Dalong},
    title     = {3DSFLabelling: Boosting 3D Scene Flow Estimation by Pseudo Auto-labelling},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {15173-15183}
}