Skip to content

autoware_shape_estimation#

Purpose#

This node calculates a refined object shape (bounding box, cylinder, convex hull) in which a pointcloud cluster fits according to a label.

Inner-workings / Algorithms#

Fitting algorithms#

  • bounding box

    • L-shape fitting: See reference below for details
    • ML based shape fitting: See ML Based Shape Fitting Implementation section below for details
  • cylinder

    cv::minEnclosingCircle

  • convex hull

    cv::convexHull

Inputs / Outputs#

Input#

Name Type Description
input tier4_perception_msgs::msg::DetectedObjectsWithFeature detected objects with labeled cluster

Output#

Name Type Description
output/objects autoware_perception_msgs::msg::DetectedObjects detected objects with refined shape

Parameters#

Name Type Description Default Range
use_corrector boolean The flag to apply rule-based corrector. true N/A
use_filter boolean The flag to apply rule-based filter true N/A
use_vehicle_reference_yaw boolean The flag to use vehicle reference yaw for corrector false N/A
use_vehicle_reference_shape_size boolean The flag to use vehicle reference shape size false N/A
use_boost_bbox_optimizer boolean The flag to use boost bbox optimizer false N/A
model_params.use_ml_shape_estimator boolean The flag to apply use ml bounding box estimator. true N/A
model_params.minimum_points integer The minimum number of points to fit a bounding box. 16 N/A
model_params.precision string The precision of the model. fp32 N/A
model_params.batch_size integer The batch size of the model. 32 N/A
model_params.build_only boolean The flag to build the model only. false N/A

ML Based Shape Implementation#

The model takes a point cloud and object label(provided by camera detections/Apollo instance segmentation) as an input and outputs the 3D bounding box of the object.

ML based shape estimation algorithm uses a PointNet model as a backbone to estimate the 3D bounding box of the object. The model is trained on the NuScenes dataset with vehicle labels (Car, Truck, Bus, Trailer).

The implemented model is concatenated with STN (Spatial Transformer Network) to learn the transformation of the input point cloud to the canonical space and PointNet to predict the 3D bounding box of the object. Bounding box estimation part of Frustum PointNets for 3D Object Detection from RGB-D Data paper used as a reference.

The model predicts the following outputs for each object:

  • x,y,z coordinates of the object center
  • object heading angle classification result(Uses 12 bins for angle classification - 30 degrees each)
  • object heading angle residuals
  • object size classification result
  • object size residuals

Training ML Based Shape Estimation Model#

To train the model, you need ground truth 3D bounding box annotations. When using the mmdetection3d repository for training a 3D object detection algorithm, these ground truth annotations are saved and utilized for data augmentation. These annotations are used as an essential dataset for training the shape estimation model effectively.

Preparing the Dataset#

Install MMDetection3D prerequisites#

Step 1. Download and install Miniconda from the official website.

Step 2. Create a conda virtual environment and activate it

conda create --name train-shape-estimation python=3.8 -y
conda activate train-shape-estimation

Step 3. Install PyTorch

conda install pytorch torchvision -c pytorch

Install mmdetection3d#

Step 1. Install MMEngine, MMCV, and MMDetection using MIM

pip install -U openmim
mim install mmengine
mim install 'mmcv>=2.0.0rc4'
mim install 'mmdet>=3.0.0rc5, <3.3.0'

Step 2. Install Autoware's MMDetection3D fork

git clone https://github.com/autowarefoundation/mmdetection3d.git
cd mmdetection3d
pip install -v -e .

Preparing NuScenes dataset for training#

Step 1. Download the NuScenes dataset from the official website and extract the dataset to a folder of your choice.

Note: The NuScenes dataset is large and requires significant disk space. Ensure you have enough storage available before proceeding.

Step 2. Create a symbolic link to the dataset folder

ln -s /path/to/nuscenes/dataset/ /path/to/mmdetection3d/data/nuscenes/

Step 3. Prepare the NuScenes data by running:

cd mmdetection3d
python tools/create_data.py nuscenes --root-path ./data/nuscenes --out-dir ./data/nuscenes --extra-tag nuscenes --only-gt-database True

Clone Bounding Box Estimator model#

git clone https://github.com/autowarefoundation/bbox_estimator.git

Split the dataset into training and validation sets#

cd bbox_estimator
python3 utils/split_dbinfos.py --dataset_path /path/to/mmdetection3d/data/nuscenes --classes 'car' 'truck' 'bus' 'trailer'  --train_ratio 0.8

Training and Deploying the model#

Training the model#

# Detailed training options can be found in the training script
# For more details, run `python3 train.py --help`
python3 train.py --dataset_path /path/to/mmdetection3d/data/nuscenes

Deploying the model#

# Convert the trained model to ONNX format
python3 onnx_converter.py --weight_path /path/to/best_checkpoint.pth --output_path /path/to/output.onnx

Give the output path of the ONNX model to the model_path parameter in the shape_estimation node launch file.

Assumptions / Known limits#

TBD

L-shape fitting implementation of the paper:

@conference{Zhang-2017-26536,
author = {Xiao Zhang and Wenda Xu and Chiyu Dong and John M. Dolan},
title = {Efficient L-Shape Fitting for Vehicle Detection Using Laser Scanners},
booktitle = {2017 IEEE Intelligent Vehicles Symposium},
year = {2017},
month = {June},
keywords = {autonomous driving, laser scanner, perception, segmentation},
}

Frustum PointNets for 3D Object Detection from RGB-D Data:

@inproceedings{qi2018frustum,
title={Frustum pointnets for 3d object detection from rgb-d data},
author={Qi, Charles R and Liu, Wei and Wu, Chenxia and Su, Hao and Guibas, Leonidas J},
booktitle={Proceedings of the IEEE conference on computer vision and pattern recognition},
pages={918--927},
year={2018}
}```