Evaluation and Testing#

The evaluation of each model is available in the PoV GitHub repository.

System Evaluation Results#

This section reports benchmark results for the VisionPilot models on different hardware environments, for reference. The pipeline was benchmarked on two classes of compute platform, each with its own benchmark procedure:

  • X86-based Computer: link
  • ARM-based Computer: link

Hardware Spec#

  • CPU: Intel 12th Gen Core i7-12700E (12 cores / 20 threads)
  • GPU: NVIDIA RTX 4000 SFF Ada (20 GB VRAM)
  • Memory: 128 GB DDR5 ECC
  • Driver: NVIDIA driver 580.95.05, CUDA 13.0
  • ROS: ROS Humble and Zenoh
  • Runtime: TensorRT
  • OS: Ubuntu 22.04.5

link to AL30 (Autonomous Driving Solutions)

Benchmark Result#

  • Zenoh:

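    Model     | CPU Utilization | GPU Utilization | Peak Memory Usage | Frame Rate (FPS)
    SceneSeg  | 7%              | 37.9%           | 3.25 GB           | 50
    DomainSeg | 7%              | 38.4%           | 3.28 GB           | 55
    Scene3D   | 7%              | 42.4%           | 3.21 GB           | 55.56
    EgoSpace  | not reported    | not reported    | not reported      | not reported
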
    • SceneSeg
    * Current FPS: 50.00
    --- Per-frame Timing (microseconds) ---
    * Total processing time: 23450 us
    * Preprocessing time: 670 us
    * Inference time: 22460 us
    * Output time: 319 us
    
    • DomainSeg
    * Current FPS: 55.00
    --- Per-frame Timing (microseconds) ---
    * Total processing time: 14539 us
    * Preprocessing time: 330 us
    * Inference time: 14007 us
    * Output time: 201 us
    
    • Scene3D
    * Current FPS: 55.56
    --- Per-frame Timing (microseconds) ---
    * Total processing time: 18802 us
    * Preprocessing time: 726 us
    * Inference time: 16102 us
    * Output time: 1973 us
    
  • ROS 2:

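    Model     | CPU Utilization | GPU Utilization | Peak Memory Usage | Frame Rate (FPS)
    SceneSeg  | 7%              | 39%             | 2.89 GB           | 46.67
    DomainSeg | 7%              | 41%             | 2.98 GB           | 53.85
    Scene3D   | 6%              | 40%             | 2.76 GB           | 50
    EgoSpace  | not reported    | not reported    | not reported      | not reported
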
    • SceneSeg
    * Current FPS: 46.67
    --- Per-frame Timing (microseconds) --- 
    * Total processing time: 23795 us
    * Preprocessing time: 739 us
    * Inference time: 22058 us
    * Output time: 996 us
    
    • DomainSeg
    * Current FPS: 53.85
    --- Per-frame Timing (microseconds) --- 
    * Total processing time: 17951 us
    * Preprocessing time: 746 us
    * Inference time: 16390 us
    * Output time: 814 us
    
    • Scene3D
    * Current FPS: 50.00
    --- Per-frame Timing (microseconds) ---
    * Total processing time: 12981 us
    * Preprocessing time: 211 us
    * Inference time: 11026 us
    * Output time: 1743 us
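
The per-frame timings above can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of the benchmark tooling: the dictionary layout and the `summarize` helper are assumptions made for illustration, and only the numbers are copied from the logs. It sums the three stages, computes the share of time spent in inference (roughly 85% to 96% for these runs), and derives the frame rate that a strictly serial pipeline would reach from the total processing time. That derived rate does not have to match the logged "Current FPS", which the tool measures itself and which may depend on the input rate or the measurement window.

```python
# Per-frame timings in microseconds, copied from the logs above.
# Keys are (middleware, model); this layout is an assumption made for the
# sketch, not a format produced by the benchmark tool.
TIMINGS_US = {
    ("Zenoh", "SceneSeg"): {"total": 23450, "pre": 670, "infer": 22460, "out": 319},
    ("Zenoh", "DomainSeg"): {"total": 14539, "pre": 330, "infer": 14007, "out": 201},
    ("Zenoh", "Scene3D"): {"total": 18802, "pre": 726, "infer": 16102, "out": 1973},
    ("ROS 2", "SceneSeg"): {"total": 23795, "pre": 739, "infer": 22058, "out": 996},
    ("ROS 2", "DomainSeg"): {"total": 17951, "pre": 746, "infer": 16390, "out": 814},
    ("ROS 2", "Scene3D"): {"total": 12981, "pre": 211, "infer": 11026, "out": 1743},
}


def summarize(timing):
    """Derive simple figures from one per-frame timing record (hypothetical helper)."""
    stage_sum = timing["pre"] + timing["infer"] + timing["out"]
    return {
        # Sum of the three logged stages; should be close to the logged total.
        "stage_sum_us": stage_sum,
        # Fraction of the total processing time spent in inference.
        "inference_share": timing["infer"] / timing["total"],
        # Frame rate if frames were processed strictly one after another.
        "serial_fps": 1_000_000 / timing["total"],
    }


if __name__ == "__main__":
    for (middleware, model), timing in TIMINGS_US.items():
        s = summarize(timing)
        print(
            f"{middleware:6s} {model:10s} "
            f"stages={s['stage_sum_us']} us  "
            f"inference={s['inference_share']:.1%}  "
            f"serial FPS={s['serial_fps']:.2f}"
        )
```

In every run the stage sum lands within 2 us of the logged total, so the three stages account for essentially all of the measured per-frame time.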
    

ARM processors and NVIDIA AGX Orin#

Hardware spec:#

  • CPU: 12-core ARM Cortex-A78AE at 2.2 GHz.
  • GPU: NVIDIA Ampere GPU with 2048 CUDA cores.
  • Memory: 64 GB LPDDR5, shared between the system and the GPU.
  • Driver: NVIDIA JetPack 6.0 (based on Ubuntu 22.04 LTS).
  • ROS: ROS Humble with the Autoware-recommended Cyclone DDS settings.
  • Runtime: ONNX Runtime 1.19.0 or TensorRT

link to NVIDIA Jetson AGX Orin

Benchmark results:#

Model (runtime)           | CPU Utilization | GPU Utilization | Peak Memory Usage                                         | Frame Rate (FPS)
SceneSeg (ONNX Runtime)   | 91-99%          | 99%             | 45 GB: network model (~30 GB) + other processes (15 GB)   | 8
SceneSeg (TensorRT FP16)  | 57-66%          | 74%             | 0.8% (~0.50 GB)                                           | 29.12
DomainSeg (TensorRT FP16) | 56-60%          | 88%             | 0.8% (~0.50 GB)                                           | 29.85
Scene3D (TensorRT FP16)   | 53-56%          | 82%             | 0.6% (~0.38 GB)                                           | 29.90
SceneSeg (TensorRT FP32)  | 42-49%          | 99%             | 0.6% (~0.38 GB)                                           | 17.10
DomainSeg (TensorRT FP32) | 43-47%          | 99%             | 0.6% (~0.38 GB)                                           | 17.07
Scene3D (TensorRT FP32)   | 44-46%          | 99%             | 0.6% (~0.38 GB)                                           | 17.03
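
Two things in these results are easy to verify by hand: the FP16 speedup over FP32 and the conversion of memory percentages to gigabytes. The sketch below is illustrative only; the `FPS` dictionary and the helper names are assumptions for this example, while the figures come from the table above and the 64 GB total from the hardware spec. For these runs FP16 delivers roughly 1.7x the FP32 frame rate, and 0.8% of 64 GB is about 0.51 GB, consistent with the "~0.50 GB" reported.

```python
# Total shared memory of the AGX Orin, from the hardware spec above.
TOTAL_MEMORY_GB = 64

# TensorRT frame rates (FPS) copied from the benchmark results above.
# The dictionary layout is an assumption made for this sketch.
FPS = {
    "SceneSeg": {"fp16": 29.12, "fp32": 17.10},
    "DomainSeg": {"fp16": 29.85, "fp32": 17.07},
    "Scene3D": {"fp16": 29.90, "fp32": 17.03},
}


def fp16_speedup(model):
    """Ratio of FP16 to FP32 frame rate for one model (hypothetical helper)."""
    return FPS[model]["fp16"] / FPS[model]["fp32"]


def memory_gb(percent):
    """Convert a percentage of total memory into gigabytes (hypothetical helper)."""
    return TOTAL_MEMORY_GB * percent / 100


if __name__ == "__main__":
    for model in FPS:
        print(f"{model:10s} FP16/FP32 speedup: {fp16_speedup(model):.2f}x")
    for percent in (0.8, 0.6):
        print(f"{percent}% of {TOTAL_MEMORY_GB} GB = {memory_gb(percent):.2f} GB")
```

The speedup is a ratio of end-to-end frame rates, so it includes preprocessing and output overhead and should not be read as a pure inference-kernel speedup.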

link to the instructions and complete results.

Advantech AFE-R750#

Hardware spec:#

  • CPU: 8-core NVIDIA Arm® Cortex-A78AE v8.2.
  • GPU: 1792-core NVIDIA Ampere GPU with 56 Tensor Cores.
  • Memory: 32 GB LPDDR5.
  • Driver: NVIDIA JetPack 6.1.
  • ROS: ROS Humble.
  • Runtime: TensorRT

link to Advantech AFE-R750

Benchmark results:#

Model (runtime)           | CPU Utilization | GPU Utilization | Peak Memory Usage | Frame Rate (FPS)
SceneSeg (TensorRT FP16)  | 45-50%          | 80%             | < 0.50 GB         | 21
DomainSeg (TensorRT FP16) | 55-60%          | 90%             | < 0.50 GB         | 21
Scene3D (TensorRT FP16)   | 40-45%          | 85%             | < 0.40 GB         | 22