# autoware_shape_estimation

## Purpose
This node calculates a refined object shape (bounding box, cylinder, or convex hull) that fits a pointcloud cluster according to its label.
## Inner-workings / Algorithms

### Fitting algorithms

- bounding box
  - L-shape fitting: see the reference below for details
  - ML-based shape fitting: see the ML Based Shape Implementation section below for details
- cylinder
  - `cv::minEnclosingCircle`
- convex hull
  - `cv::convexHull`
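The node's fitting is implemented in C++, but the cylinder and convex hull cases map directly onto OpenCV calls. The following Python sketch only illustrates the idea using the `cv2` bindings; `cluster_xyz` is a hypothetical stand-in for a labeled pointcloud cluster, not the node's actual data structure.

```python
import cv2
import numpy as np

# Hypothetical (N, 3) cluster; in the node this comes from the input message.
cluster_xyz = np.random.rand(100, 3).astype(np.float32)

# Both fits operate on the cluster projected onto the ground (x-y) plane.
points_2d = cluster_xyz[:, :2]

# Cylinder: the smallest circle enclosing all projected points.
(center_x, center_y), radius = cv2.minEnclosingCircle(points_2d)

# Convex hull: the polygonal footprint of the projected points.
hull = cv2.convexHull(points_2d)  # (M, 1, 2) array of hull vertices

# In both cases the shape height comes from the cluster's vertical extent.
height = float(cluster_xyz[:, 2].max() - cluster_xyz[:, 2].min())
```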
## Inputs / Outputs

### Input

| Name    | Type                                                      | Description                            |
| ------- | --------------------------------------------------------- | -------------------------------------- |
| `input` | `tier4_perception_msgs::msg::DetectedObjectsWithFeature` | detected objects with labeled cluster |
### Output

| Name             | Type                                             | Description                         |
| ---------------- | ------------------------------------------------ | ----------------------------------- |
| `output/objects` | `autoware_perception_msgs::msg::DetectedObjects` | detected objects with refined shape |
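For quick inspection of the refined shapes, a minimal `rclpy` subscriber like the sketch below can be used. Note that `output/objects` is the node-relative name and is typically remapped in the launch configuration; this script is illustrative, not part of the package.

```python
import rclpy
from rclpy.node import Node
from autoware_perception_msgs.msg import DetectedObjects


class ShapeListener(Node):
    """Print the refined shape type of each detected object."""

    def __init__(self):
        super().__init__("shape_listener")
        # Remap "output/objects" to the resolved topic name in your setup.
        self.create_subscription(DetectedObjects, "output/objects", self.on_objects, 1)

    def on_objects(self, msg: DetectedObjects) -> None:
        for obj in msg.objects:
            # shape.type encodes BOUNDING_BOX / CYLINDER / POLYGON
            self.get_logger().info(f"shape type: {obj.shape.type}")


def main():
    rclpy.init()
    rclpy.spin(ShapeListener())


if __name__ == "__main__":
    main()
```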
## Parameters

| Name                                  | Type    | Description                                                  | Default | Range |
| ------------------------------------- | ------- | ------------------------------------------------------------ | ------- | ----- |
| `use_corrector`                        | boolean | The flag to apply the rule-based corrector.                  | true    | N/A   |
| `use_filter`                           | boolean | The flag to apply the rule-based filter.                     | true    | N/A   |
| `use_vehicle_reference_yaw`            | boolean | The flag to use the vehicle reference yaw for the corrector. | false   | N/A   |
| `use_vehicle_reference_shape_size`     | boolean | The flag to use the vehicle reference shape size.            | false   | N/A   |
| `use_boost_bbox_optimizer`             | boolean | The flag to use the boost bounding box optimizer.            | false   | N/A   |
| `model_params.use_ml_shape_estimator`  | boolean | The flag to use the ML-based bounding box estimator.         | true    | N/A   |
| `model_params.minimum_points`          | integer | The minimum number of points to fit a bounding box.          | 16      | N/A   |
| `model_params.precision`               | string  | The precision of the model.                                  | fp32    | N/A   |
| `model_params.batch_size`              | integer | The batch size of the model.                                 | 32      | N/A   |
| `model_params.build_only`              | boolean | The flag to build the model only.                            | false   | N/A   |
## ML Based Shape Implementation

The model takes a point cloud and an object label (provided by camera detections or Apollo instance segmentation) as input and outputs the 3D bounding box of the object.

The ML-based shape estimation algorithm uses a PointNet model as a backbone to estimate the 3D bounding box of the object. The model is trained on the NuScenes dataset with vehicle labels (Car, Truck, Bus, Trailer).

The implemented model combines an STN (Spatial Transformer Network), which learns to transform the input point cloud into a canonical space, with a PointNet that predicts the 3D bounding box of the object. The bounding box estimation part of the Frustum PointNets for 3D Object Detection from RGB-D Data paper is used as a reference.
The model predicts the following outputs for each object:

- x, y, z coordinates of the object center
- object heading angle classification result (12 bins of 30 degrees each)
- object heading angle residuals
- object size classification result
- object size residuals
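The exact decoding conventions are internal to the model, but a typical bin-plus-residual decode, following the Frustum PointNets scheme the implementation references, looks like the sketch below. The tensor shapes, mean sizes, and output values here are placeholder assumptions, not taken from the model.

```python
import numpy as np

NUM_HEADING_BINS = 12                        # 30 degrees per bin, as listed above
BIN_WIDTH = 2.0 * np.pi / NUM_HEADING_BINS

# Placeholder raw outputs for a single object (assumed shapes).
heading_scores = np.random.rand(NUM_HEADING_BINS)           # heading bin logits
heading_residuals = np.random.rand(NUM_HEADING_BINS) * 0.1  # per-bin residuals [rad]
size_scores = np.random.rand(4)                             # car/truck/bus/trailer logits
size_residuals = np.random.rand(4, 3) * 0.1                 # per-class (l, w, h) residuals [m]

# Assumed per-class mean sizes (l, w, h); illustrative numbers only.
MEAN_SIZES = np.array([
    [4.6, 1.9, 1.7],   # car
    [6.9, 2.5, 2.8],   # truck
    [11.0, 2.9, 3.5],  # bus
    [12.3, 2.9, 3.9],  # trailer
])

# Heading: pick the most likely bin, then refine with that bin's residual.
bin_idx = int(np.argmax(heading_scores))
heading = bin_idx * BIN_WIDTH + heading_residuals[bin_idx]

# Size: pick the most likely class template, then refine with its residual.
size_idx = int(np.argmax(size_scores))
length, width, height = MEAN_SIZES[size_idx] + size_residuals[size_idx]
```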
### Training ML Based Shape Estimation Model

To train the model, you need ground truth 3D bounding box annotations. When using the mmdetection3d repository for training a 3D object detection algorithm, these ground truth annotations are saved and reused for data augmentation; the same annotations serve as the dataset for training the shape estimation model.
#### Preparing the Dataset

##### Install MMDetection3D prerequisites

Step 1. Download and install Miniconda from the official website.

Step 2. Create a conda virtual environment and activate it:

```bash
conda create --name train-shape-estimation python=3.8 -y
conda activate train-shape-estimation
```

Step 3. Install PyTorch:

```bash
conda install pytorch torchvision -c pytorch
```
##### Install mmdetection3d

Step 1. Install MMEngine, MMCV, and MMDetection using MIM:

```bash
pip install -U openmim
mim install mmengine
mim install 'mmcv>=2.0.0rc4'
mim install 'mmdet>=3.0.0rc5, <3.3.0'
```

Step 2. Install Autoware's MMDetection3D fork:

```bash
git clone https://github.com/autowarefoundation/mmdetection3d.git
cd mmdetection3d
pip install -v -e .
```
##### Preparing NuScenes dataset for training

Step 1. Download the NuScenes dataset from the official website and extract it to a folder of your choice.

Note: The NuScenes dataset is large and requires significant disk space. Ensure you have enough storage available before proceeding.

Step 2. Create a symbolic link to the dataset folder:

```bash
ln -s /path/to/nuscenes/dataset/ /path/to/mmdetection3d/data/nuscenes/
```

Step 3. Prepare the NuScenes data by running:

```bash
cd mmdetection3d
python tools/create_data.py nuscenes --root-path ./data/nuscenes --out-dir ./data/nuscenes --extra-tag nuscenes --only-gt-database True
```
#### Clone Bounding Box Estimator model

```bash
git clone https://github.com/autowarefoundation/bbox_estimator.git
```

#### Split the dataset into training and validation sets

```bash
cd bbox_estimator
python3 utils/split_dbinfos.py --dataset_path /path/to/mmdetection3d/data/nuscenes --classes 'car' 'truck' 'bus' 'trailer' --train_ratio 0.8
```
#### Training and Deploying the model

##### Training the model

```bash
# Detailed training options can be found in the training script
# For more details, run `python3 train.py --help`
python3 train.py --dataset_path /path/to/mmdetection3d/data/nuscenes
```

##### Deploying the model

```bash
# Convert the trained model to ONNX format
python3 onnx_converter.py --weight_path /path/to/best_checkpoint.pth --output_path /path/to/output.onnx
```

Give the output path of the ONNX model to the `model_path` parameter in the `shape_estimation` node launch file.
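Before wiring the file into the launch configuration, it can be worth sanity-checking the export. A minimal sketch, assuming the `onnx` Python package is installed and reusing the placeholder output path from the command above:

```python
import onnx

# Load the exported graph and validate its structure.
model = onnx.load("/path/to/output.onnx")
onnx.checker.check_model(model)  # raises if the graph is malformed

# Inspect the expected input names before configuring the node.
print([i.name for i in model.graph.input])
```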
## Assumptions / Known limits

TBD
## References/External links

L-shape fitting implementation of the paper:

```bibtex
@conference{Zhang-2017-26536,
  author    = {Xiao Zhang and Wenda Xu and Chiyu Dong and John M. Dolan},
  title     = {Efficient L-Shape Fitting for Vehicle Detection Using Laser Scanners},
  booktitle = {2017 IEEE Intelligent Vehicles Symposium},
  year      = {2017},
  month     = {June},
  keywords  = {autonomous driving, laser scanner, perception, segmentation},
}
```

Frustum PointNets for 3D Object Detection from RGB-D Data:

```bibtex
@inproceedings{qi2018frustum,
  title     = {Frustum pointnets for 3d object detection from rgb-d data},
  author    = {Qi, Charles R and Liu, Wei and Wu, Chenxia and Su, Hao and Guibas, Leonidas J},
  booktitle = {Proceedings of the IEEE conference on computer vision and pattern recognition},
  pages     = {918--927},
  year      = {2018}
}
```