Evaluation and Testing#

The evaluation results for each model are available in the PoV GitHub repository:

SceneSeg: link
DomainSeg: link
Scene3D: link
EgoPath: link

System Evaluation Results#

This section presents reference benchmark results for the VisionPilot models on different hardware environments. There are two procedures for running the benchmark:

Just-based: link
Make-based: link

Two types of computing platform are used to benchmark the pipeline:

x86-based computer: link
ARM-based computer: link

ADLINK ADM-AL30#

Hardware Spec#

CPU: Intel 12th Gen Core i7-12700E (12 cores / 20 threads)
GPU: NVIDIA RTX 4000 SFF Ada (20 GB VRAM)
Memory: 128 GB DDR5 ECC
Driver: NVIDIA driver 580.95.05, CUDA 13.0
ROS: ROS Humble & Zenoh
Runtime: TensorRT
OS: Ubuntu 22.04.5

link to AL30 (Autonomous Driving Solutions)

Benchmark Result#

Zenoh:

Model	CPU Utilization	GPU Utilization	Peak Memory Usage	Frame Rate
SceneSeg	7%	37.9%	3.25 GB	50
DomainSeg	7%	38.4%	3.28 GB	55
Scene3D	7%	42.4%	3.21 GB	55.56
EgoSpace

SceneSeg

* Current FPS: 50.00
--- Per-frame Timing (microseconds) ---
* Total processing time: 23450 us
* Preprocessing time: 670 us
* Inference time: 22460 us
* Output time: 319 us

DomainSeg

* Current FPS: 55.00
--- Per-frame Timing (microseconds) ---
* Total processing time: 14539 us
* Preprocessing time: 330 us
* Inference time: 14007 us
* Output time: 201 us

Scene3D

* Current FPS: 55.56
--- Per-frame Timing (microseconds) ---
* Total processing time: 18802 us
* Preprocessing time: 726 us
* Inference time: 16102 us
* Output time: 1973 us
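
The per-frame timing blocks above share a fixed text layout, so they are straightforward to post-process. The following is a minimal Python sketch, not part of the VisionPilot tooling, that parses one block, checks that the three stages add up to the reported total, and derives the frame rate implied by the total latency alone. The function name `parse_timing` and the sample string are illustrative assumptions; the numbers are copied from the SceneSeg block above.

```python
import re

# Sample text copied from the SceneSeg (Zenoh) timing block above.
LOG = """\
* Current FPS: 50.00
--- Per-frame Timing (microseconds) ---
* Total processing time: 23450 us
* Preprocessing time: 670 us
* Inference time: 22460 us
* Output time: 319 us
"""


def parse_timing(log: str) -> dict:
    """Parse one timing block into a dict (hypothetical helper).

    Keys: 'fps' (float) plus 'total', 'preprocessing', 'inference',
    'output' (integers, in microseconds).
    """
    fields = {}
    fps = re.search(r"Current FPS:\s*([\d.]+)", log)
    if fps:
        fields["fps"] = float(fps.group(1))
    # Matches "* <Name> time: <N> us" and "* Total processing time: <N> us".
    pattern = r"\*\s*(\w+)(?: processing)? time:\s*(\d+)\s*us"
    for name, value in re.findall(pattern, log):
        fields[name.lower()] = int(value)
    return fields


timing = parse_timing(LOG)

# Sum of the three stages, to compare against the reported total.
stage_sum = timing["preprocessing"] + timing["inference"] + timing["output"]

# Fraction of the per-frame time spent in inference.
inference_share = timing["inference"] / timing["total"]

# Frame rate implied by the total latency if frames were processed
# strictly one after another (1 second = 1,000,000 microseconds).
latency_implied_fps = 1_000_000 / timing["total"]

print(f"stage sum: {stage_sum} us, reported total: {timing['total']} us")
print(f"inference share: {inference_share:.1%}")
print(f"latency-implied rate: {latency_implied_fps:.1f} FPS")
```

For this sample the stages sum to 23449 us against a reported total of 23450 us, and inference accounts for roughly 96% of the frame time. The latency-implied rate (about 42.6 FPS) differs from the reported 50 FPS; how the node computes "Current FPS" is not stated here, so the two figures should not be expected to match.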

ROS 2:

Model	CPU Utilization	GPU Utilization	Peak Memory Usage	Frame Rate
SceneSeg	7%	39%	2.89 GB	46.67
DomainSeg	7%	41%	2.98 GB	53.85
Scene3D	6%	40%	2.76 GB	50
EgoSpace

SceneSeg

* Current FPS: 46.67
--- Per-frame Timing (microseconds) --- 
* Total processing time: 23795 us
* Preprocessing time: 739 us
* Inference time: 22058 us
* Output time: 996 us

DomainSeg

* Current FPS: 53.85
--- Per-frame Timing (microseconds) --- 
* Total processing time: 17951 us
* Preprocessing time: 746 us
* Inference time: 16390 us
* Output time: 814 us

Scene3D

* Current FPS: 50.00
--- Per-frame Timing (microseconds) ---
* Total processing time: 12981 us
* Preprocessing time: 211 us
* Inference time: 11026 us
* Output time: 1743 us
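
To compare the two transports on this machine, the frame rates from the Zenoh and ROS 2 tables can be put side by side. The following is a minimal sketch with values copied from the tables above; `relative_gain_percent` is a hypothetical helper name, not part of the benchmark scripts.

```python
# Frame rates (FPS) copied from the Zenoh and ROS 2 tables above.
zenoh_fps = {"SceneSeg": 50.0, "DomainSeg": 55.0, "Scene3D": 55.56}
ros2_fps = {"SceneSeg": 46.67, "DomainSeg": 53.85, "Scene3D": 50.0}


def relative_gain_percent(candidate: float, baseline: float) -> float:
    """Return how much faster `candidate` is than `baseline`, in percent."""
    return (candidate - baseline) / baseline * 100.0


# Zenoh frame rate relative to ROS 2, per model, rounded to one decimal.
gains = {
    model: round(relative_gain_percent(zenoh_fps[model], ros2_fps[model]), 1)
    for model in zenoh_fps
}

for model, gain in gains.items():
    print(f"{model}: Zenoh is {gain:+.1f}% relative to ROS 2")
```

With these figures the Zenoh runs are about 7.1%, 2.1%, and 11.1% faster for SceneSeg, DomainSeg, and Scene3D respectively. Each table entry is a single reported value, so small differences may fall within run-to-run variation.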

ARM processors and NVIDIA AGX Orin#

Hardware Spec#

CPU: 12-core ARM Cortex-A78AE at 2.2 GHz
GPU: NVIDIA Ampere GPU with 2048 CUDA cores
Memory: 64 GB LPDDR5, shared between the system and the GPU
Driver: NVIDIA JetPack 6.0 (based on Ubuntu 22.04 LTS)
ROS: ROS Humble with the Cyclone DDS settings recommended by Autoware
Runtime: ONNX Runtime 1.19.0 or TensorRT

link to NVIDIA Jetson AGX Orin

Benchmark Result#

Model	CPU Utilization	GPU Utilization	Peak Memory Usage	Frame Rate
SceneSeg (ONNX Runtime)	91 ~ 99 %	99 %	45 GB, including the network model (~30 GB) and other processes (15 GB)	8
SceneSeg (TensorRT runtime - FP16)	57 ~ 66 %	74 %	0.8 % (~0.50 GB)	29.12
DomainSeg (TensorRT runtime - FP16)	56 ~ 60 %	88 %	0.8 % (~0.50 GB)	29.85
Scene3D (TensorRT runtime - FP16)	53 ~ 56 %	82 %	0.6 % (~0.38 GB)	29.90
SceneSeg (TensorRT runtime - FP32)	42 ~ 49 %	99 %	0.6 % (~0.38 GB)	17.10
DomainSeg (TensorRT runtime - FP32)	43 ~ 47 %	99 %	0.6 % (~0.38 GB)	17.07
Scene3D (TensorRT runtime - FP32)	44 ~ 46 %	99 %	0.6 % (~0.38 GB)	17.03
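
The table above lends itself to two quick checks: the FP16 speedup over FP32 for each model, and the conversion of the memory percentages into gigabytes of the 64 GB shared memory. The following is a minimal sketch with values copied from the table; the helper name `percent_to_gb` is an assumption made for illustration.

```python
# Shared system/GPU memory of the AGX Orin, from the hardware spec above.
TOTAL_MEMORY_GB = 64

# TensorRT frame rates (FPS) copied from the table above.
fp16_fps = {"SceneSeg": 29.12, "DomainSeg": 29.85, "Scene3D": 29.90}
fp32_fps = {"SceneSeg": 17.10, "DomainSeg": 17.07, "Scene3D": 17.03}

# FP16 frame rate divided by FP32 frame rate, rounded to two decimals.
speedups = {
    model: round(fp16_fps[model] / fp32_fps[model], 2) for model in fp16_fps
}


def percent_to_gb(percent: float, total_gb: float = TOTAL_MEMORY_GB) -> float:
    """Convert a memory usage percentage into gigabytes of `total_gb`."""
    return percent / 100.0 * total_gb


for model, factor in speedups.items():
    print(f"{model}: FP16 is {factor:.2f}x the FP32 frame rate")

print(f"0.8 % of {TOTAL_MEMORY_GB} GB = {percent_to_gb(0.8):.3f} GB")
print(f"0.6 % of {TOTAL_MEMORY_GB} GB = {percent_to_gb(0.6):.3f} GB")
```

With these figures FP16 runs at roughly 1.70x to 1.76x the FP32 frame rate, and 0.8 % and 0.6 % of 64 GB come to about 0.51 GB and 0.38 GB, in line with the approximate values given in the table.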

link to the instructions and complete results.

Demo Video: link

Advantech AFE-R750#

Hardware Spec#

CPU: 8-core Arm® Cortex-A78AE v8.2
GPU: 1792-core NVIDIA Ampere GPU with 56 Tensor Cores
Memory: 32 GB LPDDR5
Driver: NVIDIA JetPack 6.1
ROS: ROS Humble
Runtime: TensorRT

link to Advantech AFE-R750

Benchmark Result#

Model	CPU Utilization	GPU Utilization	Peak Memory Usage	Frame Rate
SceneSeg (TensorRT runtime - FP16)	45 ~ 50 %	80 %	< 0.50 GB	21
DomainSeg (TensorRT runtime - FP16)	55 ~ 60 %	90 %	< 0.50 GB	21
Scene3D (TensorRT runtime - FP16)	40 ~ 45 %	85 %	< 0.40 GB	22
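
Because the benchmark tables are tab-separated and report utilization either as a single value or as a range, a small loader helps when collecting them for further analysis. The following is a minimal sketch using only the Python standard library; the rows are copied from the table above, and `parse_percent_range` and the record keys are hypothetical names chosen for illustration.

```python
import csv
import io
import re

# Rows copied from the AFE-R750 table above (columns separated by tabs).
TABLE = (
    "Model\tCPU Utilization\tGPU Utilization\tPeak Memory Usage\tFrame Rate\n"
    "SceneSeg (TensorRT runtime - FP16)\t45 ~ 50 %\t80 %\t< 0.50 GB\t21\n"
    "DomainSeg (TensorRT runtime - FP16)\t55 ~ 60 %\t90 %\t< 0.50 GB\t21\n"
    "Scene3D (TensorRT runtime - FP16)\t40 ~ 45 %\t85 %\t< 0.40 GB\t22\n"
)


def parse_percent_range(text: str) -> tuple:
    """Turn '45 ~ 50 %' into (45.0, 50.0) and '80 %' into (80.0, 80.0)."""
    numbers = [float(n) for n in re.findall(r"\d+(?:\.\d+)?", text)]
    return (min(numbers), max(numbers))


records = []
for row in csv.DictReader(io.StringIO(TABLE), delimiter="\t"):
    records.append(
        {
            "model": row["Model"],
            "cpu_percent": parse_percent_range(row["CPU Utilization"]),
            "gpu_percent": parse_percent_range(row["GPU Utilization"]),
            # Kept as text because the table gives an upper bound ("< 0.50 GB").
            "peak_memory": row["Peak Memory Usage"],
            "fps": float(row["Frame Rate"]),
        }
    )

for record in records:
    print(record)
```

Keeping the memory column as text avoids turning an upper bound such as "< 0.50 GB" into a number that looks like an exact measurement.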