Evaluation and Testing#
Evaluation results for each model are available in the GitHub repo of the PoV.
System Evaluation Results#
This section reports benchmark results for the VisionPilot models on different hardware environments, for use as a reference. The following hardware configurations were used to benchmark the pipeline:
ADLINK ADM-AL30#
Hardware Spec#
- CPU: Intel 12th Gen Core i7-12700E (12 cores / 20 threads)
- GPU: NVIDIA RTX 4000 SFF Ada (20 GB VRAM)
- Memory: 128 GB DDR5 ECC
- Driver: version 580.95.05, CUDA version 13.0
- ROS: ROS Humble & Zenoh
- Runtime: TensorRT
- OS: Ubuntu 22.04.5
Link to AL30 (Autonomous Driving Solutions)
Benchmark Result#
- Zenoh:

  | Model | CPU Utilization | GPU Utilization | Peak Memory Usage | Frame Rate |
  |---|---|---|---|---|
  | SceneSeg | 7% | 37.9% | 3.25 GB | 50 |
  | DomainSeg | 7% | 38.4% | 3.28 GB | 55 |
  | Scene3D | 7% | 42.4% | 3.21 GB | 55.56 |
  | EgoSpace | | | | |

  - SceneSeg
    - Current FPS: 50.00
    - Per-frame timing (microseconds):
      - Total processing time: 23450 us
      - Preprocessing time: 670 us
      - Inference time: 22460 us
      - Output time: 319 us
  - DomainSeg
    - Current FPS: 55.00
    - Per-frame timing (microseconds):
      - Total processing time: 14539 us
      - Preprocessing time: 330 us
      - Inference time: 14007 us
      - Output time: 201 us
  - Scene3D
    - Current FPS: 55.56
    - Per-frame timing (microseconds):
      - Total processing time: 18802 us
      - Preprocessing time: 726 us
      - Inference time: 16102 us
      - Output time: 1973 us

- ROS 2:

  | Model | CPU Utilization | GPU Utilization | Peak Memory Usage | Frame Rate |
  |---|---|---|---|---|
  | SceneSeg | 7% | 39% | 2.89 GB | 46.67 |
  | DomainSeg | 7% | 41% | 2.98 GB | 53.85 |
  | Scene3D | 6% | 40% | 2.76 GB | 50 |
  | EgoSpace | | | | |

  - SceneSeg
    - Current FPS: 46.67
    - Per-frame timing (microseconds):
      - Total processing time: 23795 us
      - Preprocessing time: 739 us
      - Inference time: 22058 us
      - Output time: 996 us
  - DomainSeg
    - Current FPS: 53.85
    - Per-frame timing (microseconds):
      - Total processing time: 17951 us
      - Preprocessing time: 746 us
      - Inference time: 16390 us
      - Output time: 814 us
  - Scene3D
    - Current FPS: 50.00
    - Per-frame timing (microseconds):
      - Total processing time: 12981 us
      - Preprocessing time: 211 us
      - Inference time: 11026 us
      - Output time: 1743 us
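
The per-frame timings above can be read as a simple breakdown: how much of each frame is spent in inference, and what frame rate the total processing time alone would imply. The following is a minimal sketch of that arithmetic, not part of the benchmark tooling; the dictionary layout and the `summarize` helper are hypothetical names introduced for illustration. The implied rate is only a rough per-frame figure and does not have to match the logged Current FPS, since the source does not say how that value is computed.

```python
# Per-frame timings in microseconds, copied from the Zenoh and ROS 2 results above.
# The dictionary layout and key names are illustrative assumptions.
TIMINGS_US = {
    ("Zenoh", "SceneSeg"):  {"total": 23450, "pre": 670, "infer": 22460, "out": 319},
    ("Zenoh", "DomainSeg"): {"total": 14539, "pre": 330, "infer": 14007, "out": 201},
    ("Zenoh", "Scene3D"):   {"total": 18802, "pre": 726, "infer": 16102, "out": 1973},
    ("ROS 2", "SceneSeg"):  {"total": 23795, "pre": 739, "infer": 22058, "out": 996},
    ("ROS 2", "DomainSeg"): {"total": 17951, "pre": 746, "infer": 16390, "out": 814},
    ("ROS 2", "Scene3D"):   {"total": 12981, "pre": 211, "infer": 11026, "out": 1743},
}


def summarize(timing):
    """Return (inference share of total time, frame rate implied by total time)."""
    inference_share = timing["infer"] / timing["total"]
    implied_fps = 1_000_000 / timing["total"]  # one second is 1,000,000 microseconds
    return inference_share, implied_fps


for (transport, model), timing in TIMINGS_US.items():
    share, fps = summarize(timing)
    print(f"{transport:6s} {model:10s} inference {share:6.1%}  implied {fps:6.2f} FPS")
```

In every row, inference accounts for most of the per-frame time, and the three stage times add up to the logged total to within 2 us.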
ARM processors and NVIDIA AGX Orin#
Hardware spec:#
- CPU: 12-core ARM Cortex-A78AE CPU at 2.2 GHz.
- GPU: NVIDIA Ampere GPU with 2048 CUDA Cores.
- Memory: 64 GB LPDDR5. The system and GPU memories are shared.
- Driver: NVIDIA JetPack 6.0 (based on Ubuntu 22.04 LTS).
- ROS: ROS Humble with Autoware recommended Cyclone DDS settings.
- Runtime: ONNX Runtime 1.19.0 or TensorRT
Link to NVIDIA Jetson AGX Orin
Benchmark results:#
| Model | CPU Utilization | GPU Utilization | Peak Memory Usage | Frame Rate |
|---|---|---|---|---|
| SceneSeg (ONNX runtime) | 91% ~ 99% | 99% | 45 GB, including network model (~30 GB) + other processes (15 GB) | 8 |
| SceneSeg (TensorRT runtime - FP16) | 57 ~ 66% | 74% | 0.8% (~0.50 GB) | 29.12 |
| DomainSeg (TensorRT runtime - FP16) | 56 ~ 60% | 88% | 0.8% (~0.50 GB) | 29.85 |
| Scene3D (TensorRT runtime - FP16) | 53 ~ 56% | 82% | 0.6% (~0.38 GB) | 29.90 |
| SceneSeg (TensorRT runtime - FP32) | 42 ~ 49% | 99% | 0.6% (~0.38 GB) | 17.10 |
| DomainSeg (TensorRT runtime - FP32) | 43 ~ 47% | 99% | 0.6% (~0.38 GB) | 17.07 |
| Scene3D (TensorRT runtime - FP32) | 44 ~ 46% | 99% | 0.6% (~0.38 GB) | 17.03 |
Link to the instructions and complete results.
- Demo Video: link
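
Two figures in the AGX Orin table lend themselves to a quick cross-check: the FP16 versus FP32 frame rates, and the memory percentages quoted next to their approximate size in GB. The sketch below works through both using only values from the table and the 64 GB shared memory from the hardware spec. The helper names `fp16_speedup` and `percent_to_gb` are hypothetical, chosen for illustration.

```python
TOTAL_MEMORY_GB = 64  # shared system/GPU memory on the AGX Orin, per the hardware spec

# Frame rates from the TensorRT rows of the benchmark table.
FRAME_RATES = {
    "SceneSeg":  {"fp16": 29.12, "fp32": 17.10},
    "DomainSeg": {"fp16": 29.85, "fp32": 17.07},
    "Scene3D":   {"fp16": 29.90, "fp32": 17.03},
}


def fp16_speedup(model):
    """Ratio of the FP16 frame rate to the FP32 frame rate for one model."""
    rates = FRAME_RATES[model]
    return rates["fp16"] / rates["fp32"]


def percent_to_gb(percent, total_gb=TOTAL_MEMORY_GB):
    """Convert a memory usage percentage into GB of the shared memory pool."""
    return total_gb * percent / 100


for model in FRAME_RATES:
    print(f"{model:10s} FP16 is {fp16_speedup(model):.2f}x the FP32 frame rate")

print(f"0.8% of {TOTAL_MEMORY_GB} GB = {percent_to_gb(0.8):.3f} GB")
print(f"0.6% of {TOTAL_MEMORY_GB} GB = {percent_to_gb(0.6):.3f} GB")
```

With these numbers, FP16 runs at roughly 1.7x the FP32 frame rate for all three models, and the percentages convert to about 0.51 GB and 0.38 GB, in line with the approximate values in the table.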
Advantech AFE-R750#
Hardware spec:#
- CPU: 8-core NVIDIA Arm® Cortex A78AE v8.2.
- GPU: 1792-core NVIDIA Ampere GPU with 56 Tensor Cores.
- Memory: 32 GB LPDDR5.
- Driver: NVIDIA JetPack 6.1.
- ROS: ROS Humble.
- Runtime: TensorRT
Benchmark results:#
| Model | CPU Utilization | GPU Utilization | Peak Memory Usage | Frame Rate |
|---|---|---|---|---|
| SceneSeg (TensorRT runtime - FP16) | 45 ~ 50% | 80% | < 0.50 GB | 21 |
| DomainSeg (TensorRT runtime - FP16) | 55 ~ 60% | 90% | < 0.50 GB | 21 |
| Scene3D (TensorRT runtime - FP16) | 40 ~ 45% | 85% | < 0.40 GB | 22 |
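
For a rough sense of scale, the AFE-R750 FP16 frame rates can be set against the AGX Orin FP16 results reported earlier. The sketch below computes that ratio alongside the ratio of GPU core counts from the two hardware specs. It is an illustrative comparison only: the two systems also differ in CPU core count, memory size, and JetPack version, so the ratios should not be read as a controlled measurement. The helper name `relative_rate` is hypothetical.

```python
# FP16 TensorRT frame rates copied from the two benchmark tables.
AGX_ORIN_FPS = {"SceneSeg": 29.12, "DomainSeg": 29.85, "Scene3D": 29.90}
AFE_R750_FPS = {"SceneSeg": 21, "DomainSeg": 21, "Scene3D": 22}

# GPU core counts from the two hardware specs.
AGX_ORIN_GPU_CORES = 2048
AFE_R750_GPU_CORES = 1792


def relative_rate(model):
    """AFE-R750 frame rate as a fraction of the AGX Orin frame rate."""
    return AFE_R750_FPS[model] / AGX_ORIN_FPS[model]


core_ratio = AFE_R750_GPU_CORES / AGX_ORIN_GPU_CORES
print(f"GPU core ratio: {core_ratio:.3f}")
for model in AFE_R750_FPS:
    print(f"{model:10s} frame rate ratio: {relative_rate(model):.3f}")
```

On these figures the AFE-R750 reaches roughly 70% to 74% of the AGX Orin frame rate, while having 87.5% of its GPU cores.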