Production + Research Project

Autonomous Lawn-Mower Robot Perception

A safety-critical LiDAR and multimodal perception stack for autonomous loading/unloading, slope traversal, grass-obstacle detection, and embedded deployment.

Timeline: 2021.09–2023.03
Context: SJTU IRMV × Positec Technology
Role: Perception Algorithm Developer · 2D–3D Fusion Researcher
Stage: Production-oriented + Pre-research

Overview

What is this project about?

The robot has to drive itself off a transport vehicle, reach the lawn, mow, and return — so I built its safety-critical perception stack across four modules: ramp detection for self loading/unloading, 3D grass-obstacle detection (geometry first, then camera–LiDAR fusion), an MCU-deployed 2D BEV safety detector, and a dual-attention LiDAR–vision fusion study.

Tags: production · research · robotics · 3d-4d · deployment

System Architecture

Four perception modules, one safety stack

From a transport vehicle to the lawn and back, the robot leans on a layered perception stack. Each of the four modules below covers what the module does, the tools behind it, the algorithm pipeline, and the on-vehicle result.

Logic map

From the transport vehicle to the lawn and back — one safety stack

Every mission stage hangs off the same sensors and feeds one outcome: safe autonomous mowing.

Module A · PCL + Eigen + OpenCV

Up / down ramp detection

Detect a drive-on ramp and its four boundary lines from a solid-state LiDAR point cloud, so the robot can load itself onto a transport vehicle, and unload itself from it, autonomously and stably.

PCL · RANSAC / normals / filtering
Eigen3 · SVD / matrix transforms
OpenCV · 2D line post-processing
01 · Input

Raw point cloud

  • Solid-state LiDAR stream, vehicle-front field of view
02 · Preprocessing — PCL

Crop, downsample, denoise

  • ROI crop: x∈[0,20] · y∈[−5,5] · z∈[−2,2] m
  • VoxelGrid downsample, leaf = 0.05 m
  • StatisticalOutlierRemoval denoising
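The crop and downsample steps above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not the PCL code used on the robot: the function names are hypothetical, the ROI limits and leaf size are the illustrative values from the list, and the statistical outlier filter is left out.

```python
from collections import defaultdict
import math

ROI = ((0.0, 20.0), (-5.0, 5.0), (-2.0, 2.0))  # x, y, z limits in metres (values from the text)
LEAF = 0.05                                     # voxel edge length in metres (value from the text)

def crop_roi(points, roi=ROI):
    """Keep only points whose x, y and z all fall inside the ROI box."""
    return [p for p in points
            if all(lo <= c <= hi for c, (lo, hi) in zip(p, roi))]

def voxel_downsample(points, leaf=LEAF):
    """Replace all points that share a voxel with their centroid."""
    voxels = defaultdict(list)
    for p in points:
        key = tuple(math.floor(c / leaf) for c in p)  # integer voxel coordinates
        voxels[key].append(p)
    return [tuple(sum(axis) / len(pts) for axis in zip(*pts))
            for pts in voxels.values()]
```

Cropping first keeps the voxel map small, which is presumably why the pipeline orders the steps this way.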
03 · Segmentation — PCL

Normals + candidate tilted planes

  • KdTree radius search r = 0.3 m → per-point normals
  • Tilt constraint: normal–Z angle θ∈[5°,35°]
  • RegionGrowing (normal diff <5°, point–plane <0.05 m) → N candidate clusters
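The tilt constraint is simple to state in code. The sketch below assumes a non-zero normal vector and a sensor frame whose Z axis points up; the helper names are hypothetical, and the 5° to 35° window is the illustrative range listed above.

```python
import math

def tilt_deg(normal):
    """Angle in degrees between a surface normal and the vertical Z axis."""
    nx, ny, nz = normal
    norm = math.sqrt(nx * nx + ny * ny + nz * nz)
    # abs() makes the result independent of whether the normal points up or down.
    cos_t = min(1.0, abs(nz) / norm)
    return math.degrees(math.acos(cos_t))

def is_ramp_candidate(normal, lo=5.0, hi=35.0):
    """True when the surface is tilted enough to be a ramp but not a wall."""
    return lo <= tilt_deg(normal) <= hi
```

Flat ground (tilt near 0°) and vertical structures (tilt near 90°) both fall outside the window, so only ramp-like surfaces reach region growing.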
04 · Fitting — PCL SAC + Eigen SVD

RANSAC-like plane refinement

  • ×100 iters: weighted 3-point sampling → plane; validity θ∈[5°,20°]; count inliers (<0.05 m); keep max set
  • SVD least-squares on inliers → precise plane {n, d}
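A minimal sketch of the sampling loop, under stated simplifications: sampling is uniform rather than weighted (the weighting scheme is not described here), the final SVD refinement is omitted, and all function names are hypothetical. The iteration count, inlier tolerance, and tilt window are the illustrative values from the list.

```python
import math
import random

def plane_from_points(p1, p2, p3):
    """Return (unit normal n, offset d) with n.p + d = 0, or None if the points are collinear."""
    u = [b - a for a, b in zip(p1, p2)]
    v = [b - a for a, b in zip(p1, p3)]
    n = (u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0])
    length = math.sqrt(sum(c * c for c in n))
    if length < 1e-9:
        return None
    n = tuple(c / length for c in n)
    if n[2] < 0:                       # orient the normal upward
        n = tuple(-c for c in n)
    return n, -sum(a * b for a, b in zip(n, p1))

def ransac_plane(points, iters=100, tol=0.05, tilt=(5.0, 20.0), seed=0):
    """Keep the tilt-valid plane hypothesis with the largest inlier set."""
    rng = random.Random(seed)
    best_plane, best_inliers = None, []
    for _ in range(iters):
        plane = plane_from_points(*rng.sample(points, 3))
        if plane is None:
            continue
        n, d = plane
        theta = math.degrees(math.acos(min(1.0, n[2])))
        if not tilt[0] <= theta <= tilt[1]:
            continue                   # reject hypotheses that are too flat or too steep
        inliers = [p for p in points
                   if abs(sum(a * b for a, b in zip(n, p)) + d) < tol]
        if len(inliers) > len(best_inliers):
            best_plane, best_inliers = plane, inliers
    return best_plane, best_inliers
```

Rejecting a hypothesis on tilt before counting inliers skips the most expensive step for planes that could never be a ramp.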
05 · Selection

Multi-rule scoring → best ramp

  • Weighted score Score = Σ wᵢ·sᵢ → highest candidate is the target ramp (Score < 0.4 → “not detected”)
Criterion             Weight   Rule
Slope plausibility    0.30     θ∈[5°,20°] scores full
Projected area        0.20     larger is better (cap 15 m²)
Position / heading    0.20     front, 2–15 m
Planarity (RMSE)      0.15     smaller residual is better
Shape compactness     0.10     reasonable aspect ratio
Ground connection     0.05     low end joins the ground
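The selection rule reduces to a weighted sum and a threshold. In the sketch below the weights and the 0.4 cutoff come from the table; the dictionary keys, function names, and the assumption that each sub-score is already normalised to [0, 1] are illustrative choices.

```python
WEIGHTS = {
    "slope": 0.30, "area": 0.20, "position": 0.20,
    "planarity": 0.15, "compactness": 0.10, "ground": 0.05,
}
MIN_SCORE = 0.4  # below this the ramp is reported as "not detected"

def ramp_score(sub_scores):
    """Weighted sum of per-criterion scores; a missing criterion counts as 0."""
    return sum(w * sub_scores.get(name, 0.0) for name, w in WEIGHTS.items())

def select_ramp(candidates):
    """Return (index, score) of the best candidate, or None if none is good enough."""
    if not candidates:
        return None
    score, idx = max((ramp_score(s), i) for i, s in enumerate(candidates))
    return (idx, score) if score >= MIN_SCORE else None
```

Because the weights sum to 1, the final score stays in [0, 1] and the 0.4 cutoff can be read as a fraction of a perfect candidate.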
06 · Geometry — Eigen

Local frame + boundary points

  • Frame: n = Z-axis, downhill = X-axis; project inliers → 2D (u, v)
  • Longitudinal 20 strips → 2 long edges; lateral 10 strips → 2 short edges → 4 boundary sets
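One way to build the local frame is sketched below. It assumes that "downhill" means the direction of steepest descent, obtained by projecting gravity onto the plane, and that the plane is not horizontal (the tilt constraint guarantees this). The helper names are hypothetical, and the strip-based edge extraction is not shown.

```python
import math

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _unit(a):
    length = math.sqrt(_dot(a, a))
    return tuple(c / length for c in a)

def ramp_frame(normal):
    """Return (x, y, z) axes: z = plane normal, x = steepest-descent direction."""
    z = _unit(normal)
    g = (0.0, 0.0, -1.0)                                   # gravity direction
    x = _unit(tuple(gi - _dot(g, z) * zi for gi, zi in zip(g, z)))
    y = _cross(z, x)                                       # completes a right-handed frame
    return x, y, z

def project_uv(points, origin, normal):
    """Express 3D inlier points as 2D (u, v) coordinates in the ramp plane."""
    x, y, _ = ramp_frame(normal)
    uv = []
    for p in points:
        r = tuple(pi - oi for pi, oi in zip(p, origin))
        uv.append((_dot(r, x), _dot(r, y)))
    return uv
```

With u running down the slope and v across it, the long edges become nearly constant-v curves and the short edges nearly constant-u curves, which is what makes strip-wise extraction straightforward.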
07 · Cleanup — PCL + custom

Outlier removal

  • 1D spacing-jump: gap > 3× mean → drop isolated points
  • Sliding-window median (win = 5): deviation > 0.1 m → drop → 4 clean edge sets
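Both cleanup filters can be sketched on a list of (s, t) pairs, where s is the position along the edge and t the offset across it. That representation, the function names, and the reading of "isolated" as "far from both neighbours" are assumptions; the factor 3, window of 5, and 0.1 m limit are the illustrative values above.

```python
from statistics import mean, median

def drop_isolated(points, factor=3.0):
    """Drop points whose gap to every neighbour exceeds factor x the mean gap."""
    pts = sorted(points)
    if len(pts) < 3:
        return pts
    gaps = [b[0] - a[0] for a, b in zip(pts, pts[1:])]
    limit = factor * mean(gaps)
    kept = []
    for i, p in enumerate(pts):
        before = gaps[i - 1] if i > 0 else float("inf")
        after = gaps[i] if i < len(gaps) else float("inf")
        if min(before, after) <= limit:    # at least one close neighbour
            kept.append(p)
    return kept

def median_filter(points, win=5, max_dev=0.1):
    """Drop points whose offset t deviates from the local median by more than max_dev."""
    pts = sorted(points)
    half = win // 2
    kept = []
    for i, (s, t) in enumerate(pts):
        window = [q[1] for q in pts[max(0, i - half): i + half + 1]]
        if abs(t - median(window)) <= max_dev:
            kept.append((s, t))
    return kept
```

The first filter removes points that are far away along the edge; the second removes points that sit on the edge but bulge sideways, so the two are complementary.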
08 · Lines — Eigen SVD + RANSAC

Four boundary-line fitting

  • RANSAC (30 iters) 2-point line + inliers
  • TLS total least squares via SVD → precise line; endpoints by projection → 4 directed segments
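The two-stage line fit can be illustrated in 2D without a linear-algebra library. For a 2D point set the total-least-squares direction is the principal axis of the scatter matrix, which has the closed form used below and matches the direction an SVD would return. The function names and the 0.05 m inlier tolerance are assumptions; the 30 iterations come from the list.

```python
import math
import random

def tls_line(points):
    """Total-least-squares line through 2D points: (centroid, unit direction)."""
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    sxx = sum((p[0] - cx) ** 2 for p in points)
    syy = sum((p[1] - cy) ** 2 for p in points)
    sxy = sum((p[0] - cx) * (p[1] - cy) for p in points)
    ang = 0.5 * math.atan2(2.0 * sxy, sxx - syy)   # principal-axis angle
    return (cx, cy), (math.cos(ang), math.sin(ang))

def point_line_dist(p, c, d):
    """Perpendicular distance from p to the line through c with unit direction d."""
    return abs((p[0] - c[0]) * d[1] - (p[1] - c[1]) * d[0])

def fit_edge(points, iters=30, tol=0.05, seed=0):
    """RANSAC on 2-point lines, then TLS on the inliers; returns (start, end) or None."""
    rng = random.Random(seed)
    best = []
    for _ in range(iters):
        a, b = rng.sample(points, 2)
        length = math.hypot(b[0] - a[0], b[1] - a[1])
        if length < 1e-9:
            continue
        d = ((b[0] - a[0]) / length, (b[1] - a[1]) / length)
        inliers = [p for p in points if point_line_dist(p, a, d) < tol]
        if len(inliers) > len(best):
            best = inliers
    if len(best) < 2:
        return None
    c, d = tls_line(best)
    # Endpoints: project inliers onto the line and take the extreme positions.
    ts = [(p[0] - c[0]) * d[0] + (p[1] - c[1]) * d[1] for p in best]
    start = (c[0] + min(ts) * d[0], c[1] + min(ts) * d[1])
    end = (c[0] + max(ts) * d[0], c[1] + max(ts) * d[1])
    return start, end
```

RANSAC only has to find a clean inlier set; the TLS step then treats errors in both coordinates symmetrically, which suits edges of arbitrary orientation.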
09 · Output

Post-processing + RampInfo

  • Pairwise intersection → 4 corners; validate (opposite edges ∥ <5°, width [1.5,5] m, length [2,20] m); EMA smoothing α = 0.3
  • Output RampInfo { slope_angle, width, length, corners[4], boundary_lines[4], plane, confidence }
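The corner, validation, and smoothing steps can be sketched as follows. The function names are hypothetical, lines are represented as a point plus a direction in the 2D ramp plane, and the EMA is written with α weighting the newest measurement, which is a convention assumed here rather than stated in the text.

```python
import math

def intersect(p1, d1, p2, d2):
    """Intersection of lines p1 + t*d1 and p2 + s*d2, or None if nearly parallel."""
    det = d1[0] * d2[1] - d1[1] * d2[0]
    if abs(det) < 1e-9:
        return None
    t = ((p2[0] - p1[0]) * d2[1] - (p2[1] - p1[1]) * d2[0]) / det
    return (p1[0] + t * d1[0], p1[1] + t * d1[1])

def angle_between_deg(d1, d2):
    """Unsigned angle between two line directions, folded into [0, 90] degrees."""
    dot = abs(d1[0] * d2[0] + d1[1] * d2[1])
    norm = math.hypot(*d1) * math.hypot(*d2)
    return math.degrees(math.acos(min(1.0, dot / norm)))

def is_valid_ramp(width, length, long_edges_angle, short_edges_angle):
    """Apply the parallelism and size limits listed above."""
    return (long_edges_angle < 5.0 and short_edges_angle < 5.0
            and 1.5 <= width <= 5.0 and 2.0 <= length <= 20.0)

def ema(prev, new, alpha=0.3):
    """Exponential moving average of a coordinate tuple across frames."""
    if prev is None:
        return new
    return tuple(alpha * n + (1.0 - alpha) * p for p, n in zip(prev, new))
```

Smoothing each corner with `ema` across frames trades a little latency for corners that do not jitter while the robot is driving onto the ramp.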
On-vehicle result
Autonomous up/down slope traversal for loading and unloading.
Ramp-boundary and obstacle perception on ramp-like scenes.

Module B · two versions

Grass obstacle detection

Detect 3D obstacles on grass — first with a pure point-cloud geometric pipeline (V1), then upgraded with camera–LiDAR semantic fusion for class-aware, more robust detection (V2).

V1

Geometry-only pipeline — PCL

PCL · filtering / RANSAC / clustering / PCA
01 · Input

Raw point cloud

  • Solid-state LiDAR
02 · Preprocessing

Filter chain

  • PassThrough crop → VoxelGrid downsample → StatisticalOutlierRemoval
03 · Ground

Ground segmentation

  • PMF morphological filter → RANSAC plane (normal∠Z < 15°); ground points discarded
04 · Clustering

Euclidean clustering

  • KD-Tree neighbor search r = 0.4 m · min 20 · max 5000 pts
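Euclidean clustering is a flood fill over "closer than r" neighbours. The sketch below swaps the KD-tree for a brute-force neighbour scan to stay dependency-free, so it illustrates the logic rather than the performance; the function name is hypothetical and the radius and size limits are the illustrative values above.

```python
from collections import deque
import math

def euclidean_clusters(points, radius=0.4, min_pts=20, max_pts=5000):
    """Group point indices whose chains of neighbours stay within `radius`."""
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        queue, members = deque([seed]), [seed]
        while queue:
            i = queue.popleft()
            # Brute-force radius search; a KD-tree would replace this line.
            near = [j for j in unvisited
                    if math.dist(points[i], points[j]) <= radius]
            for j in near:
                unvisited.remove(j)
                queue.append(j)
                members.append(j)
        if min_pts <= len(members) <= max_pts:   # size gate from the text
            clusters.append(sorted(members))
    return clusters
```

The minimum size suppresses sparse noise such as stray grass returns, while the maximum size rejects clusters that are too large to be a single obstacle.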
05 · Filter

PCA bounding box + rules

  • PCA → OBB; constraints h∈[0.1,2] · w∈[0.1,3] · l∈[0.1,5] m · h/w < 5
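A simplified version of the box-and-rules step is sketched below. It assumes a yaw-only oriented box, with PCA applied in the XY plane and height taken along Z, which is a common shortcut for ground robots and not necessarily the full 3D PCA used in the project. Function names are hypothetical; the size limits are the illustrative ones above.

```python
import math

def obb_dims(points):
    """Yaw-only oriented box: returns (length, width, height) in metres."""
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    sxx = sum((p[0] - cx) ** 2 for p in points)
    syy = sum((p[1] - cy) ** 2 for p in points)
    sxy = sum((p[0] - cx) * (p[1] - cy) for p in points)
    yaw = 0.5 * math.atan2(2.0 * sxy, sxx - syy)   # principal axis in the XY plane
    c, s = math.cos(yaw), math.sin(yaw)
    a = [(p[0] - cx) * c + (p[1] - cy) * s for p in points]    # along the axis
    b = [-(p[0] - cx) * s + (p[1] - cy) * c for p in points]   # across the axis
    ext_a, ext_b = max(a) - min(a), max(b) - min(b)
    zs = [p[2] for p in points]
    return max(ext_a, ext_b), min(ext_a, ext_b), max(zs) - min(zs)

def passes_rules(l, w, h):
    """Size and aspect constraints; the width check runs first so h / w is safe."""
    return (0.1 <= w <= 3.0 and 0.1 <= l <= 5.0
            and 0.1 <= h <= 2.0 and h / w < 5.0)
```

The h/w limit is what rejects thin vertical artefacts, while the absolute limits reject both tiny clutter and wall-sized structures.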
06 · Output

3D obstacles

  • Position, size, and planar distance d = √(cx² + cy²), where (cx, cy) is the obstacle center
V1 result
Rule-based 3D obstacle detection from point-cloud clustering.
V2

Camera–LiDAR semantic fusion — TensorRT

YOLOv5 · 2D detection
SegFormer · semantic segmentation
TensorRT · inference
Camera–LiDAR extrinsics
① Input

Point cloud

  • Solid-state LiDAR
② Input

Image

  • RGB camera
Image → two TensorRT models
YOLOv5

2D bounding boxes

  • Per-frame 2D BBox list
SegFormer

Semantic mask

  • Ground / obstacle pixel mask
Projection

Extrinsic point ↔ pixel

  • Project each 3D point into the image using the camera–LiDAR extrinsics and the camera intrinsics, then attach the semantic label of the pixel it lands on
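The point-to-pixel step can be sketched with an ideal pinhole model. Everything named here is an assumption for illustration: `R` and `t` stand for the LiDAR-to-camera extrinsics, `K` for the intrinsics as (fx, fy, cx, cy), lens distortion is ignored, and the mask is a plain list of rows.

```python
def project_point(p_lidar, R, t, K):
    """Map a LiDAR-frame point to pixel (u, v); None if it lies behind the camera."""
    pc = [sum(R[i][j] * p_lidar[j] for j in range(3)) + t[i] for i in range(3)]
    if pc[2] <= 0.0:
        return None
    fx, fy, cx, cy = K
    return (fx * pc[0] / pc[2] + cx, fy * pc[1] / pc[2] + cy)

def label_points(points, R, t, K, mask):
    """Attach the mask class of the nearest pixel to every point inside the image."""
    h, w = len(mask), len(mask[0])
    labelled = []
    for p in points:
        uv = project_point(p, R, t, K)
        if uv is None:
            continue
        u, v = int(round(uv[0])), int(round(uv[1]))
        if 0 <= u < w and 0 <= v < h:
            labelled.append((p, mask[v][u]))   # mask is indexed row (v) first
    return labelled
```

Points that fall outside the image or behind the camera receive no label, so a real system needs a fallback for them, for example the geometric V1 pipeline.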
Semantic ground

Mask-based segmentation

  • mask = ground → discard; obstacle → keep (replaces the V1 geometric ground segmentation)
ROI clustering

2D-guided clustering

  • Cluster points inside each BBox; associate 2D detection ↔ 3D cluster
Filter

Geometric filter

  • PCA → OBB with the same V1 rule constraints
Fusion

Triple-score fusion

  • 0.4 × geometry + 0.4 × 2D IoU + 0.2 × semantic consistency
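The fusion rule is a fixed-weight sum of three scores. In the sketch below the weights come from the line above; the function names, the (x1, y1, x2, y2) box format, and the assumption that the geometry and semantic scores are already in [0, 1] are illustrative.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def fused_confidence(geometry, box_iou, semantic):
    """0.4 x geometry + 0.4 x 2D IoU + 0.2 x semantic consistency."""
    return 0.4 * geometry + 0.4 * box_iou + 0.2 * semantic
```

One plausible reading is that the IoU compares the 2D detection box with the image-plane bounding box of the projected 3D cluster, so a cluster that fills its detection box scores high.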
Output

Class-aware 3D obstacles

  • 3D obstacles with category label and confidence
V2 result
Image segmentation and LiDAR clustering fusion.
2D–3D fusion for class-aware obstacle perception.

Module C · STM32H7 MCU

Embedded 2D BEV safety detection

A lightweight 2D BEV obstacle detector deployed on an STM32H7 microcontroller — static memory plus integer optimization deliver 110 fps real-time detection that passes functional-safety testing.

STM32H7 · bare-metal C
Static memory allocation
Integer / fixed-point optimization
Input

Raw point cloud

  • Solid-state LiDAR stream
Step 1 · Preprocess

Point-cloud preprocessing

  • ROI crop (5 m × 5 m); ground removal z < −0.1 m
  • Invalid filter (range = 0 / NaN); height clamp 0.05–2.5 m
Step 2 · BEV

BEV projection

  • Cell index idx = ⌊(x − origin) / reso⌋ (integer division, one index per axis)
  • Per-cell point count; height diff max_z − min_z
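The BEV step can be sketched with integer-only arithmetic, in the spirit of the fixed-point MCU code, although the firmware itself is bare-metal C and this Python version is only an illustration. Coordinates are assumed to arrive in millimetres; the 0.1 m cell size, the grid origin, and all names are hypothetical, while the 5 m × 5 m extent matches the ROI above.

```python
RESO_MM = 100         # hypothetical cell size (0.1 m)
GRID = 50             # 5 m / 0.1 m cells per side
ORIGIN_X_MM = 0       # hypothetical: grid starts at the robot front
ORIGIN_Y_MM = -2500   # hypothetical: grid is centred laterally

def build_bev(points_mm):
    """Return per-cell point counts and height differences (max_z - min_z, in mm)."""
    count = [[0] * GRID for _ in range(GRID)]
    min_z = [[0] * GRID for _ in range(GRID)]
    max_z = [[0] * GRID for _ in range(GRID)]
    for x, y, z in points_mm:
        ix = (x - ORIGIN_X_MM) // RESO_MM      # integer floor division
        iy = (y - ORIGIN_Y_MM) // RESO_MM
        if not (0 <= ix < GRID and 0 <= iy < GRID):
            continue
        if count[iy][ix] == 0:
            min_z[iy][ix] = max_z[iy][ix] = z  # first point initialises the cell
        else:
            min_z[iy][ix] = min(min_z[iy][ix], z)
            max_z[iy][ix] = max(max_z[iy][ix], z)
        count[iy][ix] += 1
    diff = [[max_z[r][c] - min_z[r][c] for c in range(GRID)] for r in range(GRID)]
    return count, diff
```

All three grids have a fixed size known at build time, which is the property that lets the embedded version live entirely in static memory.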
Step 3 · Decide

Obstacle decision

  • Density threshold count > 3; height diff > 0.1 m
  • Connected components (8-neighbor merge)
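The decision and merge steps can be sketched on the two BEV grids. The code assumes both thresholds must hold for a cell to count as an obstacle, which is one reading of the line above; the function names are hypothetical and heights are assumed to be in millimetres.

```python
from collections import deque

def occupied_cells(count, diff_mm, min_count=3, min_diff_mm=100):
    """Mark cells with more than min_count points AND a height spread above 0.1 m."""
    return [[count[r][c] > min_count and diff_mm[r][c] > min_diff_mm
             for c in range(len(count[0]))] for r in range(len(count))]

def connected_components(occ):
    """Merge occupied cells that touch by edge or corner (8-neighbourhood)."""
    rows, cols = len(occ), len(occ[0])
    seen = [[False] * cols for _ in range(rows)]
    comps = []
    for r in range(rows):
        for c in range(cols):
            if not occ[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue, comp = deque([(r, c)]), []
            while queue:
                y, x = queue.popleft()
                comp.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and occ[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            comps.append(comp)
    return comps
```

On a microcontroller the queue would be a fixed-size array rather than a growable deque, since the grid size bounds the number of cells that can ever be queued.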
Step 4 · Output

BEV obstacle list

  • Danger level + nearest-obstacle BEV coordinate
110 fps · real-time on MCU
Static · fixed memory, no heap
Integer · fixed-point math
Safety · passed functional-safety test
Safety-test result
Lightweight 2D BEV detector running on STM32H7 for safety-standard testing.

Module D · pre-research

Dual-attention LiDAR–vision fusion

A pre-research study (toward a company paper KPI) on dual-attention fusion that correlates scene geometry and texture features for stronger LiDAR–vision 3D detection.

Dual-attention fusion
Soft + hard association
LiDAR–camera 3D detection
Geometry branch

LiDAR features

  • 3D scene-geometry encoding
Texture branch

Image features

  • 2D appearance / texture encoding
Pre-built index

Fast correspondence

  • Pre-compute point ↔ pixel index so fusion stays cheap at runtime
Hard association

Explicit correspondence

  • Geometric point ↔ pixel matching
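The pre-built index and the hard association can be pictured as a lookup table followed by a gather. This is a conceptual sketch only: the function names, the flat row-major feature layout, and the zero vector used for unmatched points are all assumptions, not details of the study.

```python
import math

def build_point_to_pixel_index(pixels, width, height):
    """Per point: flat pixel index v * width + u, or -1 if it has no valid pixel."""
    index = []
    for uv in pixels:
        if uv is None:
            index.append(-1)
            continue
        u, v = math.floor(uv[0]), math.floor(uv[1])
        inside = 0 <= u < width and 0 <= v < height
        index.append(v * width + u if inside else -1)
    return index

def gather_image_features(feature_map_flat, index, dim):
    """Fetch the image feature for each point; unmatched points get zeros."""
    zero = [0.0] * dim
    return [feature_map_flat[i] if i >= 0 else zero for i in index]
```

Because the index depends only on calibration and point geometry, it can be computed once per frame and reused by every fusion layer, which is what keeps the runtime cost low.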
Soft association

Query-style attention

  • Learned cross-modal attention weights
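Soft association can be illustrated with plain scaled dot-product attention, where one LiDAR feature queries a set of image features. This is the generic mechanism, not the architecture of the study: the learned query, key, and value projections are omitted and the function names are hypothetical.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    total = sum(es)
    return [e / total for e in es]

def cross_attention(query, keys, values):
    """Weight image features (values) by their similarity to a LiDAR query feature."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    fused = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return fused, weights
```

Unlike the hard point-to-pixel match, every image feature contributes with a learned weight, which gives the model some tolerance to calibration error and sparse points.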
Dual attention

Geometry × texture interaction

  • Two attention streams couple structure and appearance into a shared representation
Output

Robust 3D detection

  • Stronger on small objects, sparse LiDAR, and degraded images
Reference. Connected to FFPA-Net and soft/hard bi-modality fusion research.
Research visualization
Soft- and hard-association idea for bi-modality fusion.
Confidentiality note. Only high-level algorithmic information is shown. Thresholds listed here are illustrative; product parameters, calibration data, and deployment details are sanitized.