tayalmanan/SafeVLA-HJ-Checkpoints


SafeVLA HJ-Reachability Checkpoints

Feasibility-gated PPO checkpoints with a Hamilton-Jacobi (HJ) reachability cost critic, trained on the Safety-CHORES benchmark. The tables report evaluation success rate (SR) and cumulative cost (CC).

Checkpoints

Checkpoint	Task	Cost Type	Steps	Eval SR	Eval CC
hj_binary_pickup_204K.pt	PickupType	Binary (+25/-1)	204K	0.906	0.25
hj_vlm_rawadv_pickup_462K.pt	PickupType	VLM (rubrics)	462K	0.818	0.52
hj_vlm_fetch_310K.pt	FetchType	VLM (rubrics)	310K	0.515	4.79
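
A minimal sketch of fetching one of the checkpoints listed above. It assumes the repository id is tayalmanan/SafeVLA-HJ-Checkpoints, that the .pt files sit at the repository root, and that `huggingface_hub` and `torch` are installed. The `CHECKPOINTS` mapping and the `load_checkpoint` helper are hypothetical names introduced here. The layout of the loaded object (state-dict keys, optimizer state, and so on) is not documented in this card, so inspect it before use.

```python
# Assumed repository id and file layout; adjust if the files live elsewhere.
REPO_ID = "tayalmanan/SafeVLA-HJ-Checkpoints"

# (task, cost type) -> checkpoint filename, copied from the table above.
CHECKPOINTS = {
    ("PickupType", "binary"): "hj_binary_pickup_204K.pt",
    ("PickupType", "vlm"): "hj_vlm_rawadv_pickup_462K.pt",
    ("FetchType", "vlm"): "hj_vlm_fetch_310K.pt",
}


def load_checkpoint(task: str, cost_type: str):
    """Download a checkpoint file and load it on CPU (hypothetical helper)."""
    # Imported lazily so the mapping above can be used without these packages.
    from huggingface_hub import hf_hub_download
    import torch

    filename = CHECKPOINTS[(task, cost_type)]
    path = hf_hub_download(repo_id=REPO_ID, filename=filename)
    # Recent torch versions default to weights_only=True; if the file holds
    # more than plain tensors, loading may need different arguments.
    return torch.load(path, map_location="cpu")
```

For example, `load_checkpoint("FetchType", "vlm")` would request hj_vlm_fetch_310K.pt under these assumptions.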

Comparison with Baselines

Method	Pickup SR	Pickup CC	Fetch SR	Fetch CC
Lagrangian (ISA, paper)	0.875	0.25	0.637	8.08
HJ-Binary	0.906	0.25	-	-
HJ-VLM (ours)	0.818	0.52	0.515	4.79
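
The relative differences between HJ-VLM and the Lagrangian baseline can be recomputed from the table above. The sketch below uses only the tabulated values; the helper name `relative_change` is introduced here for illustration.

```python
def relative_change(baseline: float, value: float) -> float:
    """Signed relative change of `value` with respect to `baseline`."""
    return (value - baseline) / baseline


# Values copied from the comparison table (Lagrangian vs. HJ-VLM).
lagrangian = {"pickup_sr": 0.875, "pickup_cc": 0.25, "fetch_sr": 0.637, "fetch_cc": 8.08}
hj_vlm = {"pickup_sr": 0.818, "pickup_cc": 0.52, "fetch_sr": 0.515, "fetch_cc": 4.79}

changes = {key: relative_change(lagrangian[key], hj_vlm[key]) for key in lagrangian}

for key, change in changes.items():
    print(f"{key}: {change:+.1%}")
```

A negative change in CC is an improvement, while a negative change in SR is a regression, so the two columns should be read together.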

Architecture

Base model: SPOC-DINOv2 (56M trainable params)
Cost critic: Separate transformer with HJ max-based Bellman backup
VLM cost scorer: Qwen3-VL-2B-Instruct (rubrics-based, 5 safety dimensions)
Feasibility gate: Hard binary constraint via cost value function
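
To make the max-based backup and the gate concrete, here is a minimal sketch, not the training code behind these checkpoints. It assumes one common discounted form from the safe RL literature, V_c(s_t) = (1 - gamma) * c_t + gamma * max(c_t, V_c(s_{t+1})), and assumes the gate treats a state as feasible when the predicted V_c is at or below a threshold of 0. The function names, the threshold, and the discount are all illustrative assumptions.

```python
from typing import List


def hj_cost_targets(costs: List[float], bootstrap: float, gamma: float = 0.99) -> List[float]:
    """Backward pass of a discounted max-based (HJ-style) backup.

    Unlike a sum-based return, the target tracks the worst cost still to come:
        target_t = (1 - gamma) * c_t + gamma * max(c_t, target_{t+1})
    """
    targets = [0.0] * len(costs)
    next_value = bootstrap  # critic estimate for the state after the last step
    for t in reversed(range(len(costs))):
        next_value = (1.0 - gamma) * costs[t] + gamma * max(costs[t], next_value)
        targets[t] = next_value
    return targets


def gate_is_open(v_c: float, threshold: float = 0.0) -> bool:
    """Hard binary gate: feasible when predicted worst-case cost is within the threshold."""
    return v_c <= threshold


# Safe steps cost -1; a single unsafe step costs +25 (binary cost convention).
targets = hj_cost_targets([-1.0, -1.0, 25.0, -1.0], bootstrap=-1.0, gamma=1.0)
gates = [gate_is_open(v) for v in targets]
```

With gamma = 1.0 the target at each step is simply the maximum cost from that step onward, so every step before the unsafe one inherits the +25 and its gate closes, while the final safe step keeps a target of -1 and stays open.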

Key Findings

Binary +25/-1 costs cause extreme V_c predictions, which lead to aggressive gate closure
VLM-calibrated costs (safe=-1, unsafe=[0,25]) provide a smoother cost landscape
FetchType benefits most from HJ gating (CC reduced 41% vs Lagrangian, 4.79 vs 8.08, with lower SR, 0.515 vs 0.637)
Cost advantage normalization must be removed for correct safety recovery gradients
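
One plausible reading of the last finding is that batch-wise normalization discards the sign and scale of the cost advantages. The sketch below illustrates that effect on toy numbers. It is an interpretation, not the authors' code: the `normalize` helper and the sample values are assumptions made for illustration.

```python
from statistics import mean, pstdev
from typing import List


def normalize(advantages: List[float], eps: float = 1e-8) -> List[float]:
    """Standard batch normalization of advantages: zero mean, unit variance."""
    mu = mean(advantages)
    sigma = pstdev(advantages)
    return [(a - mu) / (sigma + eps) for a in advantages]


# Toy batch in which every sampled action increases the expected cost.
raw_cost_adv = [2.0, 4.0, 6.0]
norm_cost_adv = normalize(raw_cost_adv)

# After normalization the batch is centered on zero, so the least harmful
# action now carries a negative cost advantage and looks cost-reducing.
flipped = [raw > 0 and norm < 0 for raw, norm in zip(raw_cost_adv, norm_cost_adv)]
```

In a batch collected entirely in unsafe states, raw advantages keep the signal that every action is costly, whereas normalized advantages only rank the actions relative to each other.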
