Projects

Research Platform 2025.03–Present

Generative Autonomous-Driving Simulation Platform

Bosch (XC-CN) · World Models Algorithm Engineer

Built a Cosmos-Transfer2.5-based generative simulation platform with six parts: a 7V surround world model validated on internal data; real-map scenario generation (Ingolstadt OSM → layout → 7V); a gRPC semantic bridge between WorldSim and the world model; the first 4-step distillation of 7V surround video (rCM + DMD2), for up to ~13.9× speedup; an editable platform for rare interaction data; and an all-in-one OneModel that serves layout generation, Gaussian-Splatting fix, and harmonization from a single denoiser.

research world-model generative e2e

Research Platform 2025.05–Present

Vector Traffic Generation & Sensor-Level Closed-Loop Simulation

Bosch (XC-CN) · World Models Algorithm Engineer

Built a two-level controllable driving simulator: a structure-aware temporal vector VAE (STAR-AE) that compresses sparse, variable agents and lanes into fixed latents, a conditional latent-diffusion generator (STRIDENet) that produces history-consistent future traffic, and a sensor-level closed-loop WorldSim that fuses Gaussian-Splatting reconstruction, traffic-flow generation, and a mask-guided DiT video editor (built on MagicDrive-V2) into photorealistic surround rollouts.

research world-model generative e2e

POC Project 2025.12–2026.04

One-Stage End-to-End Driving — 8V Pure Vision

Bosch (XC-CN) · World Models Algorithm Engineer

A one-stage, pure-vision end-to-end driving POC that lifts 8 surround cameras into a single BEV feature, reads three structured perception heads (3D detection, HD map, occupancy) from it, predicts the next-frame BEV under generative scoring, and tokenizes everything into a Diffusion-Flow planner that emits the ego trajectory and neighbouring-agent states — perception, prediction, and planning optimised jointly.

research e2e world-model perception

Production Project 2025.04–2026.03

J6M Static & Dynamic Perception

Bosch (XC-CN) · Perception Algorithm Engineer

End-to-end production perception on a mid-trim (J6E / J6M) platform, organised around three shipped deliverables: a multi-task static OneModel that drives every static element from one shared BEV feature, a 4D-sparse dynamic model that unifies detection and tracking, and an on-board latency-compression effort that cut inference latency from ~42.65 ms to ~13.88 ms (roughly a 3.1× speedup). My work spans architecture, a unified data pipeline, heterogeneous multi-task training, release engineering, and quantization-aware deployment.

production perception bev deployment quantization

POC Project 2024 · Collaboration

End-to-End Driving: 11V + LiDAR Fusion

Lantu · BEV Fusion & AI-Planner Engineer

An end-to-end autonomous-driving system that fuses 11 surround cameras (7 pinhole + 4 fisheye) with LiDAR under a sparse-centric (SparseDrive-style) paradigm. My two core deliverables: a fused CUDA operator for BEV fusion that aligns 11-camera and LiDAR features in a single kernel, and a trained AI planner that outputs motion and planning in parallel from a shared query decoder.

research e2e perception 3d-4d

Research Project 2023.05–2024

Controllable Surround-View Driving Generation

PhiGent Robotics · Generative Driving Algorithm Engineer

Built a controllable surround-view driving generator that compresses 3D boxes and maps into spatial conditions, encodes text / reference frames / lanes / camera calibration into condition tokens, and injects them into a UNet diffusion backbone — producing cross-camera-consistent 4V / 7V / 11V images and video for data augmentation and open-loop simulation, evolving from OpenSora 1.0 + SD 3.5 to a MagicDrive-fused in-house model.

research world-model generative e2e

Production + Research Project 2022.11–2024

4D Auto-Labeling & Pure LiDAR 3D Detection

Hozon Auto × SJTU IRMV · PhiGent Robotics · Perception Team Leader · 3D Perception Algorithm Engineer

A two-phase journey in autonomous-driving auto-labeling: first a Tesla-AI-Day-inspired vision-only 4D auto-labeling pipeline with Hozon Auto and SJTU IRMV, then a multi-modal 4D auto-labeling and production pure-LiDAR 3D detection system at PhiGent Robotics — optimized at the data, model, and loss levels.

production research 3d-4d perception deployment

Research + Deployment Project 2023.08–2023.12

3D Scene Flow: Auto-Labeling & Production Deployment

PhiGent Robotics · 3D Scene Flow Algorithm Engineer

A 3D motion-estimation stack for autonomous driving, built around an unsupervised auto-labeling system that assigns a 3D scene-flow vector to every LiDAR point and every occupancy cell. The labels were validated by showing that they improve the accuracy of existing flow estimators, then distilled into an ultra-light production head and deployed end-to-end through ONNX, TensorRT (Orin), and the Horizon J6E toolchain.

research 3d-4d scene-flow deployment

Production Project 2023.05–2023.11

Road Preview: Surface-Element Segmentation

BYD · Perception Model Optimization Engineer

A road-surface perception project for the road-preview ('magic-carpet') suspension feature: segment safety-critical small road elements — manhole covers and speed bumps — reliably under hard real-world conditions (tiny targets, water and oil stains, textureless surfaces), then compress and quantize the model to INT8 for efficient TDA4 edge inference, reaching an initial mass-production quality bar.

production perception deployment

Profile Video 2023.06

Master's Graduation — Introduction Film

CUMT × SJTU · Joint Master's Program (2020–2023) · Author & Presenter

A short introduction film I made for my 2023 master's graduation — a compact tour of my research focus, the labs and mentors I worked with, and the perception and 3D/4D systems I built along the way.

profile

Production + Research Project 2021.09–2023.03

Autonomous Lawn-Mower Robot Perception

SJTU IRMV × Positec Technology · Perception Algorithm Developer · 2D–3D Fusion Researcher

The robot has to drive itself off a transport vehicle, reach the lawn, mow, and return, so I built its safety-critical perception stack across four modules: ramp detection for self-loading and unloading, 3D grass-obstacle detection (geometry first, then camera–LiDAR fusion), an MCU-deployed 2D BEV safety detector, and a dual-attention LiDAR–vision fusion study.

production research robotics 3d-4d deployment

Research Project 2021.08–2022.10

Integrated Perception, Planning, and Decision-Making Network

The Future Laboratory of the Second Aerospace Academy · Perception and Simulation Developer

A unified multi-task framework that fuses multi-modal sensors (RGB, LiDAR, infrared) through attention-based feature fusion — jointly solving geometric–semantic mapping, unsupervised depth and odometry, multi-object detection and tracking, and closed-loop behavior decisions inside one end-to-end trainable network.

research e2e perception