# autoware_ptv3
## Purpose

The `autoware_ptv3` package is used for 3D lidar segmentation.
## Inner-workings / Algorithms

This package implements a TensorRT-powered inference node for Point Transformer V3 (PTv3) [1]. The sparse convolution backend is spconv, which Autoware installs automatically in its setup script. If needed, the user can also build and install it manually by following these instructions.
## Inputs / Outputs

### Input

| Name                 | Type                            | Description             |
| -------------------- | ------------------------------- | ----------------------- |
| `~/input/pointcloud` | `sensor_msgs::msg::PointCloud2` | Input pointcloud topic. |
### Output

| Name                                    | Type                                                 | Description                               |
| --------------------------------------- | ---------------------------------------------------- | ----------------------------------------- |
| `~/output/segmented/pointcloud`         | `sensor_msgs::msg::PointCloud2`                      | RGB segmented pointcloud.                 |
| `~/output/ground_segmented/pointcloud`  | `sensor_msgs::msg::PointCloud2`                      | Pointcloud with the ground segmented out. |
| `~/output/probs/pointcloud`             | `sensor_msgs::msg::PointCloud2`                      | Pointcloud with per-class probabilities.  |
| `debug/cyclic_time_ms`                  | `autoware_internal_debug_msgs::msg::Float64Stamped`  | Cyclic time (ms).                         |
| `debug/pipeline_latency_ms`             | `autoware_internal_debug_msgs::msg::Float64Stamped`  | Pipeline latency time (ms).               |
| `debug/processing_time/preprocess_ms`   | `autoware_internal_debug_msgs::msg::Float64Stamped`  | Preprocess time (ms).                     |
| `debug/processing_time/inference_ms`    | `autoware_internal_debug_msgs::msg::Float64Stamped`  | Inference time (ms).                      |
| `debug/processing_time/postprocess_ms`  | `autoware_internal_debug_msgs::msg::Float64Stamped`  | Postprocess time (ms).                    |
| `debug/processing_time/total_ms`        | `autoware_internal_debug_msgs::msg::Float64Stamped`  | Total processing time (ms).               |
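The segmented and probability outputs are plain `sensor_msgs::msg::PointCloud2` messages, so they can be inspected with standard ROS 2 tooling. The sketch below shows a minimal `rclpy` subscriber; the topic name it uses is only a placeholder, since the actual name depends on the remappings in your launch configuration.

```python
# Minimal inspection sketch, assuming the segmented output has been remapped to the
# placeholder topic name below; adjust it to match your launch configuration.
import rclpy
from rclpy.node import Node
from sensor_msgs.msg import PointCloud2


class SegmentedCloudListener(Node):
    def __init__(self) -> None:
        super().__init__("segmented_cloud_listener")
        self.create_subscription(
            PointCloud2,
            "/perception/ptv3/segmented/pointcloud",  # placeholder, not a node default
            self.on_cloud,
            10,
        )

    def on_cloud(self, msg: PointCloud2) -> None:
        # width * height is the number of points regardless of cloud organization.
        self.get_logger().info(f"segmented cloud: {msg.width * msg.height} points")


def main() -> None:
    rclpy.init()
    rclpy.spin(SegmentedCloudListener())


if __name__ == "__main__":
    main()
```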
## Parameters

### PTv3Node node

| Name | Type | Description | Default | Range |
| ---- | ---- | ----------- | ------- | ----- |
| `plugins_path` | string | Path to the TensorRT plugins library. | `$(find-pkg-share autoware_ptv3)/plugins/libautoware_tensorrt_plugins.so` | N/A |
| `trt_precision` | string | Precision of the TensorRT engine. | `fp16` | [`fp16`, `fp32`] |
| `cloud_capacity` | integer | Capacity of the point cloud buffer (should be set to at least the maximum theoretical number of points). | 2000000 | ≥ 1 |
| `onnx_path` | string | Path to the ONNX model file. | `$(var model_path)/ptv3.onnx` | N/A |
| `engine_path` | string | Path to the TensorRT engine file. | `$(var model_path)/ptv3.engine` | N/A |
| `colors_red` | array | Red component of the class-to-RGB map for the segmented point cloud. Must have the same number of elements as the number of classes. | [255, 0, 255, 64, 0, 255] | N/A |
| `colors_green` | array | Green component of the class-to-RGB map for the segmented point cloud. Must have the same number of elements as the number of classes. | [0, 0, 0, 64, 255, 255] | N/A |
| `colors_blue` | array | Blue component of the class-to-RGB map for the segmented point cloud. Must have the same number of elements as the number of classes. | [0, 255, 255, 64, 0, 255] | N/A |
| `ground_prob_threshold` | float | Probability threshold above which a point is classified as ground. | 0.3 | ≥ 0.0, ≤ 1.0 |
### PTv3Node model

| Name | Type | Description | Default | Range |
| ---- | ---- | ----------- | ------- | ----- |
| `class_names` | array | Names of the predicted classes. | ['CAR', 'TRUCK', 'BUS', 'BICYCLE', 'PEDESTRIAN'] | N/A |
| `voxels_num` | array | Voxel count bounds used during inference [min, opt, max]. | [5000, 30000, 60000] | N/A |
| `point_cloud_range` | array | Range of the pointcloud in meters [min_x, min_y, min_z, max_x, max_y, max_z]. | [-76.8, -76.8, -3.0, 76.8, 76.8, 5.0] | N/A |
| `voxel_size` | array | Voxel size [x, y, z] in meters. | [0.3, 0.3, 8.0] | N/A |
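The parameters above are regular ROS 2 parameters loaded from the package's config files. The snippet below is only an illustrative sketch of how a subset of them could appear in a parameter YAML; the grouping into a single file is an assumption, and the config files shipped with the package remain authoritative.

```yaml
# Illustrative sketch only; node and model parameters may live in separate files.
/**:
  ros__parameters:
    trt_precision: fp16
    cloud_capacity: 2000000
    ground_prob_threshold: 0.3
    class_names: [CAR, TRUCK, BUS, BICYCLE, PEDESTRIAN]
    voxels_num: [5000, 30000, 60000]
    point_cloud_range: [-76.8, -76.8, -3.0, 76.8, 76.8, 5.0]
    voxel_size: [0.3, 0.3, 8.0]
```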
## The `build_only` option

The `autoware_ptv3` node has a `build_only` option to build the TensorRT engine file from the specified ONNX file, after which the program exits.

```bash
ros2 launch autoware_ptv3 ptv3.launch.xml build_only:=true
```
## The `log_level` option

The default logging severity level for `autoware_ptv3` is `info`. For debugging purposes, the developer may decrease the severity level using the `log_level` parameter:

```bash
ros2 launch autoware_ptv3 ptv3.launch.xml log_level:=debug
```
## Assumptions / Known limits

This node assumes that the input pointcloud follows the `PointXYZIRC` layout defined in `autoware_point_types`.
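As a rough reference, the sketch below lists the fields such a point cloud is expected to carry. The offsets and point step are assumptions based on a packed 16-byte point; the definition in `autoware_point_types` is authoritative.

```python
from sensor_msgs.msg import PointField

# Assumed PointXYZIRC field layout (verify against autoware_point_types):
# x/y/z as float32, intensity and return_type as uint8, channel as uint16.
POINT_XYZIRC_FIELDS = [
    PointField(name="x", offset=0, datatype=PointField.FLOAT32, count=1),
    PointField(name="y", offset=4, datatype=PointField.FLOAT32, count=1),
    PointField(name="z", offset=8, datatype=PointField.FLOAT32, count=1),
    PointField(name="intensity", offset=12, datatype=PointField.UINT8, count=1),
    PointField(name="return_type", offset=13, datatype=PointField.UINT8, count=1),
    PointField(name="channel", offset=14, datatype=PointField.UINT16, count=1),
]
POINT_STEP = 16  # bytes per point under the assumed packing
```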
## Trained Models

- v1 – First model release, trained with PoC pseudo labels for the internal T4 dataset.
## Changelog
## References/External links

[1] Xiaoyang Wu, Li Jiang, Peng-Shuai Wang, Zhijian Liu, Xihui Liu, Yu Qiao, Wanli Ouyang, Tong He, and Hengshuang Zhao. "Point Transformer V3: Simpler, Faster, Stronger." 2024 Conference on Computer Vision and Pattern Recognition (CVPR).