EPDMS Metrics
This document summarizes the Autoware-compatible migrated EPDMS metrics written by the open-loop evaluator. The metric is migrated from the original NAVSIM EPDMS definition in https://github.com/autonomousvision/navsim/blob/main/docs/metrics.md.
It first describes the synthetic (aggregate) EPDMS score, then the migrated Autoware subscore equations used by this package.
This Autoware-compatible migration keeps the final single-trajectory EPDMS composition, but adapts the inputs to Autoware ROS topics, Autoware lanelet maps, and one selected planner trajectory. The score is local to each synchronized evaluation sample; it does not implement NAVSIM's later pseudo-closed-loop scene aggregation stage.
Common Notation
Let the selected trajectory be:
For trajectory sample \(q_t\):
- \(T_t\) is the relative time from `time_from_start`.
- \(p_t\) is the ego pose.
- \(\vec{v}_t\) is the 2D world-frame velocity vector.
- \(P_t\) is the ego footprint polygon generated from `VehicleInfo` at \(p_t\).
- \(X_{t,k}\) is an ego footprint corner, with \(k\) ranging over the four corners.
- \(R_t\) is the route/lanelet context available from `RouteHandler`.
Each subscore is reported with an availability flag. The synthetic EPDMS score is available only when every required raw subscore is available.
The EPDMS subscores are written as uppercase symbols in equations. Each item is detailed in the corresponding subscore logic section:

- `NC`: no at-fault collision
- `DAC`: drivable area compliance
- `DDC`: driving direction compliance
- `TLC`: traffic light compliance
- `TTC`: time-to-collision within bound
- `LK`: lane keeping
- `HC`: history comfort
- `EC`: extended comfort
- `EP`: ego progress
Parameters
This section lists the main constants used by the implemented EPDMS equations.
| Symbol / value | Default | Used by | Meaning |
|---|---|---|---|
| \(w_{EP}\) | 5 | EPDMS | Ego-progress weight |
| \(w_{TTC}\) | 5 | EPDMS | Time-to-collision within bound weight |
| \(w_{LK}\) | 2 | EPDMS | Lane-keeping weight |
| \(w_{HC}\) | 2 | EPDMS | History-comfort weight |
| \(w_{EC}\) | 2 | EPDMS | Extended-comfort weight |
| \(\epsilon_H\) | 1.0e-9 | Human filter | Human-reference zero threshold |
| \(\tau_{stop}\) | 0.05 m/s | NC | Stopped ego / stopped track threshold |
| \(\theta_{behind}\) | 150 deg | NC, TTC | Object-behind angle threshold |
| \(r_{semantic}\) | 15 m | DAC, shared area context | Expanded search range for semantic drivable-area polygons |
| \(r_{border}\) | 5 m | DAC, shared area context | Expanded search range for `road_border` line strings |
| \(\rho\) | {0.3, 0.6, 1.0, 1.5, 2.0, 2.5, 3.0, 4.0} m | DAC | Road-border side-probe distances |
| \(d^{max}_{corner,border}\) | 3.0 m | DAC | Maximum corner-to-road-border distance for fallback acceptance |
| \(d^{max}_{semantic,border}\) | 4.0 m | DAC | Maximum semantic-boundary-to-road-border distance for fallback acceptance |
| \(T_{DDC}\) | 1.0 s | DDC | Rolling window for oncoming-progress accumulation |
| \(C_{minor}\) | 2.0 m | DDC | Lower threshold for partial DDC penalty |
| \(C_{major}\) | 6.0 m | DDC | Threshold for full DDC penalty |
| \(\tau_{TTC,stop}\) | 0.005 m/s | TTC | Stopped ego threshold for skipping TTC projection |
| \(\delta\) | {0.0, 0.3, 0.6, 0.9} s | TTC | Future projection offsets |
| \(\theta_{ahead}\) | 30 deg | TTC | Object-ahead angle threshold |
| \(D_{max}\) | 0.5 m | LK | Maximum accepted lateral deviation from route centerline |
| \(T_{LK}\) | 2.0 s | LK | Maximum continuous lane-keeping violation duration |
| \(T^{pre}_{LC}\) | 1.0 s | LK | Lane-change pre-grace duration |
| \(T^{post}_{LC}\) | 1.0 s | LK | Lane-change post-grace duration |
| \(v_{queue}\) | 1.0 m/s | LK | Queue low-speed threshold |
| \(T_{queue}\) | 1.0 s | LK | Queue progress-check window |
| \(d_{queue}\) | 1.5 m | LK | Maximum progress for queue exemption |
| \(T_{release}\) | 1.5 s | LK | Queue-release grace duration |
| \(T_{past}\) | 1.5 s | HC | Past motion-history horizon |
| \(\Delta T_{HC}\) | 0.1 s | HC | History-comfort sample interval |
| \(T_{future}\) | 4.0 s | HC | Future trajectory horizon used for HC |
| \(a_x^{min}\) | -4.05 m/s^2 | HC | Minimum longitudinal acceleration |
| \(a_x^{max}\) | 2.40 m/s^2 | HC | Maximum longitudinal acceleration |
| \(\lvert a_y \rvert^{max}\) | 4.89 m/s^2 | HC | Maximum lateral acceleration magnitude |
| \(\lvert j \rvert^{max}\) | 8.37 m/s^3 | HC | Maximum jerk magnitude |
| \(\lvert j_x \rvert^{max}\) | 4.13 m/s^3 | HC | Maximum longitudinal jerk magnitude |
| \(\lvert \dot{\psi} \rvert^{max}\) | 0.95 rad/s | HC | Maximum yaw-rate magnitude |
| \(\lvert \ddot{\psi} \rvert^{max}\) | 1.93 rad/s^2 | HC | Maximum yaw-acceleration magnitude |
| \(\tau_a\) | 0.7 | EC | Maximum acceleration RMS discrepancy |
| \(\tau_j\) | 0.5 | EC | Maximum jerk RMS discrepancy |
| \(\tau_{\dot{\psi}}\) | 0.1 | EC | Maximum yaw-rate RMS discrepancy |
| \(\tau_{\ddot{\psi}}\) | 0.1 | EC | Maximum yaw-acceleration RMS discrepancy |
| \(\tau_G\) | 5.0 m | EP | Progress denominator threshold for fallback |
EPDMS Definition
The Extended Predictive Driver Model Score (EPDMS) is a rule-based evaluation metric used to benchmark the performance and safety of autonomous driving systems and trajectory planners.
EPDMS scores one selected trajectory by multiplying safety and rule-compliance subscores with a weighted quality term. The human filter is applied inside the aggregation so that failures also observed in the human reference are suppressed.
| Metric | Weight | Range |
|---|---|---|
| No at-fault collision (NC) | multiplier | {0, 0.5, 1} |
| Drivable area compliance (DAC) | multiplier | {0, 1} |
| Driving direction compliance (DDC) | multiplier | {0, 0.5, 1} |
| Traffic light compliance (TLC) | multiplier | {0, 1} |
| Ego progress (EP) | 5 | [0, 1] |
| Time-to-collision within bound (TTC) | 5 | {0, 1} |
| Lane keeping (LK) | 2 | {0, 1} |
| History comfort (HC) | 2 | {0, 1} |
| Extended comfort (EC) | 2 | {0, 1} |
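The full EPDMS is defined as:

\[
EPDMS = \left( \prod_{m \in \{NC, DAC, DDC, TLC\}} filter_m(agent, human) \right) \cdot \frac{\sum_{m \in \{EP, TTC, LK, HC, EC\}} w_m \, filter_m(agent, human)}{\sum_{m \in \{EP, TTC, LK, HC, EC\}} w_m}
\]

where, for every metric except EC:

\[
filter_m(agent, human) =
\begin{cases}
1 & \text{if } m(human) \le \epsilon_H \\
m(agent) & \text{otherwise}
\end{cases}
\]

and \(filter_{EC}(agent, human) = EC(agent)\).

Here: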
- Multiplicative penalty gate: The left product term combines `NC`, `DAC`, `DDC`, and `TLC`. These are safety and rule-compliance checks. A score of `0` collapses EPDMS to `0`, while `0.5` partially penalizes it. Therefore, this term controls whether the weighted quality score should pass through fully, partially, or not at all.
- Weighted quality term: The right fraction combines `EP`, `TTC`, `LK`, `HC`, and `EC` using their configured weights. It is normalized by the total weight, so it remains in `[0, 1]`. Larger weights make `EP` and `TTC` more influential than lane keeping and comfort.
- Human filter: `filter_m(agent, human)` returns `1` when the human reference also failed the same metric, suppressing shared human-reference failures. Otherwise, it keeps the agent score. `EC` bypasses this filter because it measures consistency between consecutive agent trajectories.
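The aggregation described above can be sketched in a few lines of Python. This is a minimal illustration, not the package's code: the function names `human_filter` and `epdms` are hypothetical, and treating a human score at or below \(\epsilon_H\) as "failed" is an assumption based on the parameter table.

```python
from typing import Dict, Optional

MULTIPLIERS = ("NC", "DAC", "DDC", "TLC")
WEIGHTS = {"EP": 5.0, "TTC": 5.0, "LK": 2.0, "HC": 2.0, "EC": 2.0}
EPS_H = 1.0e-9  # human-reference zero threshold from the parameter table


def human_filter(agent: float, human: Optional[float]) -> float:
    """Return 1.0 when the human reference also failed, else the agent score."""
    # Assumption: "failed" means the human score is zero within EPS_H.
    if human is not None and human <= EPS_H:
        return 1.0
    return agent


def epdms(agent: Dict[str, float],
          human: Optional[Dict[str, float]] = None) -> float:
    """Multiply the penalty gate by the normalized weighted quality term."""
    human = human or {}

    def score(name: str) -> float:
        if name == "EC":  # EC bypasses the human filter
            return agent[name]
        return human_filter(agent[name], human.get(name))

    gate = 1.0
    for name in MULTIPLIERS:
        gate *= score(name)
    weighted = sum(w * score(name) for name, w in WEIGHTS.items())
    return gate * weighted / sum(WEIGHTS.values())
```

For example, an agent with `DDC = 0.5` and `LK = 0` (all other subscores `1`) scores `0.5 * 14 / 16 = 0.4375` without a human reference, and `0.5` when the human reference also failed `LK`.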
Shared Map and Area Sources
Several subscores reuse the same Autoware footprint and lanelet-map helper outputs. This section defines the area scopes used later in the subscore equations.
For each trajectory sample, evaluate_trajectory_footprints() builds the ego
footprint \(P_t\) and, when a route handler is available, computes
EgoAreaEvaluation.
The semantic drivable-area search uses the ego footprint bounding box expanded
by 15m. It collects:
- route-designated road lanelets collected along the selected trajectory
- all road lanelets from the expanded local map search
- road lanelets at the current pose if they intersect the ego footprint
- road-shoulder lanelets from the expanded local map search
- `intersection_area` polygons from the expanded local map search
- `hatched_road_markings` polygons from the expanded local map search
- `parking_lot` polygons from the expanded local map search
The road-border fallback uses road_border line strings from the ego footprint
bounding box expanded by 5m. Road borders are never converted into a broad
drivable polygon; they are used only as local one-corner boundary evidence after
semantic drivable-area containment fails.
The driving-direction local context uses a center-point search, not the full ego
footprint. It searches lanelets and intersection_area polygons within a 5m
point bounding box. Local route-consistent lanelets include route lanelets and
same-direction nearby road or shoulder lanelets, where same direction means the
lanelet yaw differs from the route reference yaw by at most 45deg. A 0.35m
margin around these local lanelets is accepted for route-lane containment.
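The same-direction test above reduces to a wrapped yaw comparison. The sketch below shows one way to write it; the helper names are hypothetical and yaw angles are assumed to be in radians.

```python
import math

SAME_DIRECTION_MAX = math.radians(45.0)  # 45 deg tolerance from the text


def yaw_difference(a: float, b: float) -> float:
    """Smallest absolute difference between two yaw angles, in radians."""
    wrapped = (a - b + math.pi) % (2.0 * math.pi) - math.pi
    return abs(wrapped)


def is_same_direction(lanelet_yaw: float, route_yaw: float) -> bool:
    """True when the lanelet yaw is within 45 deg of the route reference yaw."""
    return yaw_difference(lanelet_yaw, route_yaw) <= SAME_DIRECTION_MAX
```

Wrapping matters near the \(\pm\pi\) seam: yaws of 350 deg and 10 deg differ by 20 deg, not 340 deg, so the lanelet still counts as same-direction.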
The traffic-light and ego-progress subscores use route-relevant lanelets collected along the selected trajectory from road lanelets at each trajectory pose that are also route lanelets. They do not use the 15 m global semantic drivable-area search.
NC: No At-Fault Collision
NC checks whether the ego trajectory overlaps a recorded tracked object in an ego-responsible way.
NC uses the ego footprint \(P_t\) from the
shared map and area sources. Its lateral-fault
bad-area term reuses the shared MultipleLanes_t and NonDrivableArea_t flags
from the semantic drivable-area evaluation.
Therefore, lateral at-fault classification depends on the 15 m semantic
drivable-area search and the 5 m road-border fallback, while front and stopped
track collisions do not depend on map-area polygons.
For object track \(o\), let \(O_{t,o}\) be the interpolated object polygon at the trajectory query time. Candidate contact is:
Objects whose highest-probability classification is UNKNOWN are ignored. This
suppresses short-lived Autoware tracking artifacts.
The ego stopped flag is:
For a tracked object:
An object is behind ego when the relative angle from ego heading to the ego-to-object vector exceeds the behind threshold used by the helper:
The front-bumper hit flag is:
where \(F_t\) is the front-bumper line segment of the ego footprint.
Collision type is evaluated in this priority order:
- If \(StoppedEgo_t\) is true, then \(CollisionType_{t,o}=STOPPED\_EGO\).
- Else if \(StoppedTrack_{t,o}\) is true, then \(CollisionType_{t,o}=STOPPED\_TRACK\).
- Else if \(Behind_{t,o}\) is true, then \(CollisionType_{t,o}=ACTIVE\_REAR\).
- Else if \(FrontHit_{t,o}\) is true, then \(CollisionType_{t,o}=ACTIVE\_FRONT\).
- Otherwise, \(CollisionType_{t,o}=ACTIVE\_LATERAL\).
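The priority order above can be sketched as a simple cascade. The names below are hypothetical, and the use of `<=` for the stopped thresholds and `>` for the behind threshold is an assumption about boundary handling.

```python
import math
from enum import Enum

TAU_STOP = 0.05                     # m/s, stopped ego / stopped track threshold
THETA_BEHIND = math.radians(150.0)  # object-behind angle threshold


class CollisionType(Enum):
    STOPPED_EGO = "STOPPED_EGO"
    STOPPED_TRACK = "STOPPED_TRACK"
    ACTIVE_REAR = "ACTIVE_REAR"
    ACTIVE_FRONT = "ACTIVE_FRONT"
    ACTIVE_LATERAL = "ACTIVE_LATERAL"


def relative_angle(ego_xy, ego_yaw, obj_xy) -> float:
    """Absolute angle from the ego heading to the ego-to-object vector."""
    bearing = math.atan2(obj_xy[1] - ego_xy[1], obj_xy[0] - ego_xy[0]) - ego_yaw
    return abs((bearing + math.pi) % (2.0 * math.pi) - math.pi)


def is_behind(ego_xy, ego_yaw, obj_xy) -> bool:
    return relative_angle(ego_xy, ego_yaw, obj_xy) > THETA_BEHIND


def classify(ego_speed: float, track_speed: float,
             behind: bool, front_hit: bool) -> CollisionType:
    """Apply the checks in the documented priority order."""
    if ego_speed <= TAU_STOP:
        return CollisionType.STOPPED_EGO
    if track_speed <= TAU_STOP:
        return CollisionType.STOPPED_TRACK
    if behind:
        return CollisionType.ACTIVE_REAR
    if front_hit:
        return CollisionType.ACTIVE_FRONT
    return CollisionType.ACTIVE_LATERAL
```

Because the checks are ordered, a stopped ego is classified as `STOPPED_EGO` even when the front bumper is in contact.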
The bad-area flag used for lateral fault judgement is:
The at-fault predicate is:
Per-contact score is:
The final subscore is the minimum at-fault contact score:
When there is no at-fault contact, \(NC=1.0\).
DAC: Drivable Area Compliance
DAC checks whether all ego footprint corners remain inside the semantic drivable area, with a conservative road-border fallback.
DAC is the direct consumer of the shared semantic drivable-area evaluation. It uses the full ego footprint, not only the ego center. At each trajectory sample, all four ego corners must be accepted by either a semantic drivable polygon or the conservative road-border fallback.
For each footprint, the semantic drivable union is:
The road-lanelet set is intentionally not narrowed to only ego-route direction. Opposite-direction lanelets are still physically road surface for DAC; DDC is responsible for wrong-way progress.
The detailed area scope is:
- `RoadLanelets_t`: route-designated road lanelets along the trajectory, plus all road lanelets from the ego-footprint bounding box expanded by `15m`, plus road lanelets at the pose that intersect the footprint
- `ShoulderLanelets_t`: road-shoulder lanelets from the ego-footprint bounding box expanded by `15m`
- `IntersectionAreas_t`: `intersection_area` polygons from the ego-footprint bounding box expanded by `15m`
- `HatchedRoadMarkings_t`: `hatched_road_markings` polygons from the ego-footprint bounding box expanded by `15m`
- `ParkingLots_t`: `parking_lot` polygons from the ego-footprint bounding box expanded by `15m`
The semantic corner predicate is:
If a corner is not semantically drivable, the road-border fallback may accept it. Let:
- \(S_{t,k}\) be the closest point from \(X_{t,k}\) to the boundary of \(U_t\).
- \(B_{t,k}\) be the closest point from \(X_{t,k}\) to a candidate `road_border` line segment.
- \(n_{t,k}\) be the unit normal of that border segment.
- \(Y^+_{t,k}(\rho)=B_{t,k}+\rho n_{t,k}\).
- \(Y^-_{t,k}(\rho)=B_{t,k}-\rho n_{t,k}\).
The implementation probes:

\[
\rho \in \{0.3, 0.6, 1.0, 1.5, 2.0, 2.5, 3.0, 4.0\}\ \text{m}
\]
The road side is valid only when exactly one of \(Y^+\) and \(Y^-\) is inside \(U_t\). The corner is accepted by the road-border fallback only if all of these conditions hold:
- exactly one side sample is semantically drivable
- \(X_{t,k}\) lies on the same border half-plane as that drivable side
- \(X_{t,k}\) lies in the bounded gap from \(S_{t,k}\) to \(B_{t,k}\)
- the vector from \(S_{t,k}\) to \(B_{t,k}\) is across the border, not along it
- the corner-to-border distance is at most `3.0m`
- the semantic-boundary-to-border distance is at most `4.0m`
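The fallback conditions listed above, together with the all-corners rule for DAC, can be sketched as follows. The geometry (side probes, closest points, half-plane tests) is assumed to be computed elsewhere; `BorderEvidence` and the function names are hypothetical containers for those precomputed results, not the package's types.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

D_MAX_CORNER_BORDER = 3.0    # m, corner-to-road-border limit
D_MAX_SEMANTIC_BORDER = 4.0  # m, semantic-boundary-to-road-border limit


@dataclass
class BorderEvidence:
    """Hypothetical precomputed geometry for one corner and one road border."""
    plus_side_drivable: bool   # Y+ probe lies inside the semantic union
    minus_side_drivable: bool  # Y- probe lies inside the semantic union
    corner_on_plus_side: bool  # corner lies on the +n half-plane
    corner_in_gap: bool        # corner lies in the bounded gap from S to B
    gap_crosses_border: bool   # S-to-B vector is across the border, not along it
    corner_to_border: float    # m
    semantic_to_border: float  # m


def fallback_accepts(e: BorderEvidence) -> bool:
    """Accept a corner only if every listed fallback condition holds."""
    if e.plus_side_drivable == e.minus_side_drivable:
        return False  # the road side is valid only when exactly one side is drivable
    return (
        e.corner_on_plus_side == e.plus_side_drivable
        and e.corner_in_gap
        and e.gap_crosses_border
        and e.corner_to_border <= D_MAX_CORNER_BORDER
        and e.semantic_to_border <= D_MAX_SEMANTIC_BORDER
    )


Corner = Tuple[bool, Optional[BorderEvidence]]  # (semantic_ok, border evidence)


def corner_accepted(corner: Corner) -> bool:
    semantic_ok, evidence = corner
    return semantic_ok or (evidence is not None and fallback_accepts(evidence))


def dac_score(samples: List[List[Corner]]) -> float:
    """1.0 when every corner of every trajectory sample is accepted, else 0.0."""
    ok = all(corner_accepted(c) for sample in samples for c in sample)
    return 1.0 if ok else 0.0
```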
In equation form:
The final corner predicate is:
A trajectory sample is non-drivable if any corner fails:
The final DAC score is:
DDC: Driving Direction Compliance
DDC measures wrong-way or oncoming progress over a rolling horizon.
DDC uses the
driving-direction local context, not the DAC
semantic drivable union. The context is computed from the ego center point with
a 5m local lanelet and intersection search. A sample is treated as not on the
route direction when the center point is outside local route-consistent
road/shoulder lanelets, including the 0.35m lane margin.
For sample \(t\), let:
- \(g_t\) be positive ego progress since the previous sample.
- \(Oncoming_t\) be true when ego is judged to be in oncoming traffic.
- \(Intersection_t\) be true when ego is in an intersection context.
The oncoming predicate used by the implementation is:
The route-lane predicate includes exact containment and the lane-margin fallback:
The intersection predicate is true when the ego center is inside a local
intersection_area polygon, or inside a route-consistent lanelet tagged as an
intersection lanelet:
Only non-intersection oncoming progress contributes. When the sample is in oncoming traffic and not in an intersection:
Otherwise:
The rolling one-second oncoming progress is:
The maximum rolling oncoming progress is:
The score is:
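A minimal sketch of the rolling accumulation and the three-level score, using the defaults \(T_{DDC}\), \(C_{minor}\), and \(C_{major}\) from the parameter table. The function name, the tuple layout of `samples`, the half-open window \((T_t - T_{DDC}, T_t]\), and the strict `<` comparisons at the thresholds are assumptions.

```python
from typing import List, Tuple

T_DDC = 1.0    # s, rolling window
C_MINOR = 2.0  # m, lower threshold for the partial penalty
C_MAJOR = 6.0  # m, threshold for the full penalty

# (time_s, progress_m, oncoming, intersection) per trajectory sample
Sample = Tuple[float, float, bool, bool]


def ddc_score(samples: List[Sample]) -> float:
    # Only non-intersection oncoming progress contributes.
    contributions = [
        (t, g if oncoming and not intersection else 0.0)
        for t, g, oncoming, intersection in samples
    ]
    worst = 0.0
    for t_end, _ in contributions:
        rolling = sum(c for t, c in contributions if t_end - T_DDC < t <= t_end)
        worst = max(worst, rolling)
    if worst < C_MINOR:
        return 1.0
    if worst < C_MAJOR:
        return 0.5
    return 0.0
```

With ten samples at 0.1 s spacing and 0.5 m of oncoming progress each, the rolling sum reaches 5.0 m, which falls between the two thresholds and yields `0.5`.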
TLC: Traffic Light Compliance
TLC checks whether the ego footprint crosses a relevant stop line while the traffic signal requires stopping.
TLC does not use red-light pseudo polygons. It uses route-relevant lanelets, traffic-light regulatory elements, traffic-light group messages, and regulatory stop-line line strings. Ego interaction is tested by the ego footprint intersecting the stop line.
The evaluator first collects traffic-light regulatory element groups from route lanelets relevant to the selected trajectory. For each group \(r\):
- \(StopLine_r\) is the regulatory stop line.
- \(SelectedLanelets_r\) are route lanelets matching the inferred intended turn direction when available.
- \(Signal_r\) is the matching `TrafficLightGroup` message by regulatory element ID.
Turn intent is inferred from the current turn indicator when it is left or right; otherwise it is inferred from route lanelet turn-direction attributes when unambiguous.
Stop-line crossing is:
Stop-required state is:
The violation predicate is:
The final score is:
If no relevant traffic-light groups exist, the metric is available and returns \(1.0\).
TTC: Time-to-Collision Within Bound
TTC projects the current ego state to short future offsets and checks overlap with recorded tracked objects at matching future query times.
TTC uses two different geometry sources. For map context at the current
trajectory sample, it reuses
shared footprint evaluation flags:
MultipleLanes_t, NonDrivableArea_t, and Intersection_t. These flags come
from the same 15 m semantic drivable-area search and 5 m road-border fallback
used by DAC, with intersection context enabled.
For collision geometry, TTC builds a separate projected ego footprint at each
offset \(\delta\) and compares it against recorded tracked-object polygons; the
projected footprint is not reclassified against map polygons at \(t+\delta\).
The checked offsets are:

\[
\delta \in \{0.0, 0.3, 0.6, 0.9\}\ \text{s}
\]
For sample \(t\) and offset \(\delta\), ego pose is projected with the current velocity:
The projected ego footprint is:
The object polygon is interpolated from recorded tracks at time \(T_t+\delta\):
Overlap is:
Stopped ego samples are skipped when:
Previously collided object IDs are skipped, and UNKNOWN classified objects are
ignored.
The bad-or-intersection predicate is:
The relative-position tests are nuPlan-style angle checks:
TTC fails when the object is ahead, or when ego is in a bad/intersection area and the object is not behind:
The final score is:
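The projection and failure rule described in this section can be sketched as below. This is an illustration under stated assumptions: the function names are hypothetical, the heading is assumed to stay constant during projection, and the ahead/behind tests compare the absolute relative angle against \(\theta_{ahead}\) and \(\theta_{behind}\) with strict inequalities.

```python
import math
from typing import Tuple

TTC_OFFSETS = (0.0, 0.3, 0.6, 0.9)  # s, future projection offsets
TAU_TTC_STOP = 0.005                # m/s, stopped-ego skip threshold
THETA_AHEAD = math.radians(30.0)
THETA_BEHIND = math.radians(150.0)


def project_position(x: float, y: float, vx: float, vy: float,
                     delta: float) -> Tuple[float, float]:
    """Constant-velocity projection of the ego position by delta seconds."""
    return (x + vx * delta, y + vy * delta)


def is_skipped(vx: float, vy: float) -> bool:
    """Stopped ego samples are not projected."""
    return math.hypot(vx, vy) < TAU_TTC_STOP


def ttc_fails(overlap: bool, rel_angle: float, bad_or_intersection: bool) -> bool:
    """rel_angle is the absolute angle from ego heading to the object, in radians."""
    if not overlap:
        return False
    ahead = rel_angle < THETA_AHEAD
    behind = rel_angle > THETA_BEHIND
    return ahead or (bad_or_intersection and not behind)
```

An overlapping object at 90 deg to the side therefore fails TTC only when the ego is in a bad or intersection area; in a normal lane, only objects ahead count.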
LK: Lane Keeping
LK evaluates sustained route-centerline deviation outside intersections, queues, and explicit lane-change intent windows.
LK uses route centerline geometry, not the DAC drivable-area polygon union. For each trajectory sample, the reference lanelet is the closest lanelet within the route when available; otherwise, a route lanelet at the pose is used. The intersection exemption uses the same 5 m center-point driving-direction local context as DDC.
For each sample:
The over-threshold flag is:
where the configured default is:

\[
D_{max} = 0.5\ \text{m}
\]
Lane-change exemption windows are built from left/right turn indicators and hazard lights. Each active signal interval \([a,b]\) is expanded to:

\[
[a - T^{pre}_{LC},\ b + T^{post}_{LC}]
\]
The queue exemption checks low speed and low progress over a one-second window:
After a queue sample, queue-release grace is active for 1.5s.
The per-sample violation flag is:
Let \(\mathcal{R}\) be the set of continuous runs where \(LKViolation_t\) is true. The score is:
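A minimal sketch of the run-based score, assuming the per-sample exemptions (intersection, lane-change window, queue, queue-release grace) have already been folded into one `exempt` flag per sample. The function names are hypothetical, and measuring run duration as last time minus first time with a strict `>` against \(T_{LK}\) is an assumption.

```python
from typing import List, Sequence, Tuple

D_MAX = 0.5       # m, maximum accepted lateral deviation
T_LK = 2.0        # s, maximum continuous violation duration
T_PRE_LC = 1.0    # s, lane-change pre-grace
T_POST_LC = 1.0   # s, lane-change post-grace


def expand_windows(intervals: Sequence[Tuple[float, float]]) -> List[Tuple[float, float]]:
    """Expand each active signal interval [a, b] by the pre/post grace."""
    return [(a - T_PRE_LC, b + T_POST_LC) for a, b in intervals]


def in_any_window(t: float, windows: Sequence[Tuple[float, float]]) -> bool:
    return any(a <= t <= b for a, b in windows)


def lk_score(times: Sequence[float], deviations: Sequence[float],
             exempt: Sequence[bool]) -> float:
    """Return 0.0 when any continuous violation run lasts longer than T_LK."""
    run_start = None
    for t, d, ex in zip(times, deviations, exempt):
        if abs(d) > D_MAX and not ex:
            if run_start is None:
                run_start = t
            if t - run_start > T_LK:
                return 0.0
        else:
            run_start = None  # an exempt or in-lane sample breaks the run
    return 1.0
```

An exempt sample in the middle of a long deviation splits it into two shorter runs, each of which is checked against \(T_{LK}\) separately.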
HC: History Comfort
HC evaluates whether the padded past-plus-future ego motion stays within NAVSIM comfort thresholds.
The padded state sequence is:
The past segment is sampled from recorded odometry and acceleration history from
-1.5s to -0.1s relative to the trajectory stamp. The future segment uses the
selected trajectory from 0.0s through 4.0s.
The comfort signal helper computes:
The six checks are:
The final score is:
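A minimal sketch of the six bound checks, using the HC limits from the parameter table. It assumes the comfort signals over the padded sequence are already computed; the dictionary keys and the function name are hypothetical, and inclusive bounds are an assumption.

```python
from typing import Dict, Sequence

# (lower, upper) bounds per signal, taken from the parameter table
HC_BOUNDS = {
    "lon_accel": (-4.05, 2.40),  # m/s^2
    "lat_accel": (-4.89, 4.89),  # m/s^2
    "jerk": (-8.37, 8.37),       # m/s^3, jerk magnitude
    "lon_jerk": (-4.13, 4.13),   # m/s^3
    "yaw_rate": (-0.95, 0.95),   # rad/s
    "yaw_accel": (-1.93, 1.93),  # rad/s^2
}


def hc_score(signals: Dict[str, Sequence[float]]) -> float:
    """Return 1.0 only if every sample of every signal stays within bounds."""
    for name, (lower, upper) in HC_BOUNDS.items():
        if any(value < lower or value > upper for value in signals[name]):
            return 0.0
    return 1.0
```

Only the longitudinal acceleration has asymmetric bounds, so a braking sample of -4.0 m/s^2 passes while an acceleration sample of 2.5 m/s^2 fails.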
EC: Extended Comfort
EC compares consecutive planned trajectories and penalizes large dynamic-signal changes over their time-overlap.
Let:
- \(Q^{prev}\) be the previous selected trajectory.
- \(Q^{curr}\) be the current selected trajectory.
- \(\Delta T\) be the header-stamp interval.
- \(\Delta q\) be the trajectory sample interval.
The overlap shift is:
The compared overlap sequences are:
For signal:
the RMS discrepancy is:
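The thresholds are:

\[
\tau_a = 0.7, \quad \tau_j = 0.5, \quad \tau_{\dot{\psi}} = 0.1, \quad \tau_{\ddot{\psi}} = 0.1
\]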
The final score is:
The first trajectory has no previous trajectory and is therefore unavailable for EC.
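The overlap alignment and RMS comparison can be sketched as follows. This is illustrative only: the function names and dictionary keys are hypothetical, rounding \(\Delta T / \Delta q\) to the nearest sample is an assumption, and so is the inclusive comparison against the thresholds.

```python
import math
from typing import Dict, List, Sequence, Tuple

EC_THRESHOLDS = {"accel": 0.7, "jerk": 0.5, "yaw_rate": 0.1, "yaw_accel": 0.1}


def overlap(prev: Sequence[float], curr: Sequence[float],
            dt: float, dq: float) -> Tuple[List[float], List[float]]:
    """Align the previous trajectory with the current one over their time overlap."""
    shift = int(round(dt / dq))  # header-stamp interval in samples (assumed rounding)
    a = list(prev[shift:])
    b = list(curr[:len(a)])
    return a[:len(b)], b


def rms_discrepancy(a: Sequence[float], b: Sequence[float]) -> float:
    """Root-mean-square difference between two equally long sequences."""
    if not a:
        raise ValueError("no overlap between trajectories")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def ec_score(prev: Dict[str, Sequence[float]], curr: Dict[str, Sequence[float]],
             dt: float, dq: float) -> float:
    """Return 1.0 when every signal's RMS discrepancy is within its threshold."""
    for name, limit in EC_THRESHOLDS.items():
        a, b = overlap(prev[name], curr[name], dt, dq)
        if rms_discrepancy(a, b) > limit:
            return 0.0
    return 1.0
```

With `dt == dq`, the previous trajectory is shifted by one sample, so `prev[1:]` is compared against the start of the current trajectory.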
EP: Ego Progress
EP measures route progress of the selected trajectory. This implementation uses the current single selected trajectory topic, not a multi-candidate proposal batch.
EP uses route-relevant lanelets collected along the selected trajectory. It does not use DAC's semantic drivable-area union, and it does not use a search over opposite-direction or shoulder lanelets except when those lanelets are part of the route-relevant centerline context returned by the route handler.
Let:
Raw progress is:
The multiplicative safety mask is:
The denominator used by the single-proposal NAVSIM-style ratio is:
With the progress threshold:

\[
\tau_G = 5.0\ \text{m}
\]
the score is:
Otherwise:
Because this implementation currently has only one selected trajectory proposal,
EP is effectively always 1.0 when available. Multi-candidate proposal-batch
progress remains a future faithfulness improvement.
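The single-proposal behaviour can be illustrated with a NAVSIM-style ratio. This is a hedged sketch: the function name and arguments are hypothetical, the safety mask is omitted, and the clipping to `[0, 1]` and the strict comparison against \(\tau_G\) are assumptions.

```python
TAU_G = 5.0  # m, progress denominator threshold for fallback


def ego_progress(raw_progress: float, best_progress: float) -> float:
    """Ratio of raw progress to the best available progress, with fallback to 1.0."""
    if best_progress > TAU_G:
        return min(1.0, max(0.0, raw_progress / best_progress))
    return 1.0  # denominator too small to form a meaningful ratio
```

With one proposal, `best_progress` equals `raw_progress`, so the ratio is `1.0` above the threshold and the fallback is `1.0` below it, which matches the "effectively always 1.0" remark. A hypothetical second proposal with twice the progress would reduce the score to `0.5`.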
Output Topics
EPDMS metric topics are written under:
`/open_loop/metrics/epdms/*`
The main score topics are:
- `/open_loop/metrics/epdms/no_at_fault_collision`
- `/open_loop/metrics/epdms/drivable_area_compliance`
- `/open_loop/metrics/epdms/driving_direction_compliance`
- `/open_loop/metrics/epdms/traffic_light_compliance`
- `/open_loop/metrics/epdms/time_to_collision_within_bound`
- `/open_loop/metrics/epdms/lane_keeping`
- `/open_loop/metrics/epdms/history_comfort`
- `/open_loop/metrics/epdms/extended_comfort`
- `/open_loop/metrics/epdms/ego_progress`
- `/open_loop/metrics/epdms/synthetic_epdms_raw`
- `/open_loop/metrics/epdms/synthetic_epdms_human_filtered`
Availability and reason topics are published with matching metric-specific suffixes. Diagnostic raw arrays such as acceleration, jerk, TTC values, lateral deviation, and travel distance intentionally remain outside the EPDMS metric namespace.