GR00T-N1.5-RL-Rheo Assemble Trocar


GR00T-N1.5-RL-Rheo-AssembleTrocar is a vision-language-action (VLA) model fine-tuned for surgical instrument handling in the Isaac for Healthcare Rheo workflow. Using a G1 embodiment, it performs trocar assembly: it retrieves the trocar components (obturator and cannula) from a surgical tray on the left, assembles them, and places the assembled trocar on a Mayo stand on the right. The model is intended for Rheo simulation workflows only, not for real-world clinical deployment. It is released under the NVIDIA License, with Apache-2.0 applying to Qwen2.5-7B-Instruct and SigLIP2-SO400M, and is ready for commercial and non-commercial use.

Property	Details
Model size	3B parameters (GR00T N1.5)
Model type	Vision Language Action (VLA); PyTorch 2.8.0; GR00T N1.5. Input: vision (3×480×640 RGB; one head camera and two wrist cameras), state (1×28), language. Output: 16×28 action tensor. OS: Ubuntu Linux 22.04/24.04. Supported GPU architectures: Ampere, Blackwell, Hopper.
Performance	NVIDIA RTX 5880 Ada: 54.2 ± 8.5 ms latency, 8 GB VRAM. Trained on 59 simulation samples (manual teleoperation).
Workflow	Rheo
Hugging Face	nvidia/GR00T-N1.5-RL-Rheo-AssembleTrocar
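
The input and output shapes listed in the table can be turned into a small pre-flight check before observations are passed to a policy. The following is a minimal sketch, not the model's actual API: the key names (`video.head`, `video.wrist_left`, `video.wrist_right`, `state`) and the helper functions are hypothetical, and reading the 16×28 output as 16 steps of a 28-dimensional action is an assumption. Only the shapes themselves come from the table.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]

# Shapes taken from the property table. All key names are hypothetical.
IMAGE_SHAPE: Shape = (3, 480, 640)   # RGB image: channels x height x width
STATE_SHAPE: Shape = (1, 28)         # state input
ACTION_SHAPE: Shape = (16, 28)       # output; assumed to be 16 steps x 28 dims
CAMERA_KEYS = ("head", "wrist_left", "wrist_right")  # 1 head + 2 wrist cameras


def expected_observation_shapes() -> Dict[str, Shape]:
    """Return the expected shape for every non-language input."""
    shapes = {f"video.{name}": IMAGE_SHAPE for name in CAMERA_KEYS}
    shapes["state"] = STATE_SHAPE
    return shapes


def validate_observation(shapes: Dict[str, Shape], instruction: str) -> List[str]:
    """Compare provided shapes against the table; return a list of problems."""
    problems = []
    for key, want in expected_observation_shapes().items():
        got = shapes.get(key)
        if got is None:
            problems.append(f"missing input: {key}")
        elif tuple(got) != want:
            problems.append(f"{key}: expected {want}, got {tuple(got)}")
    if not instruction.strip():
        problems.append("language instruction is empty")
    return problems


def validate_action(shape: Shape) -> bool:
    """Check that a predicted action tensor has the documented 16x28 shape."""
    return tuple(shape) == ACTION_SHAPE


if __name__ == "__main__":
    obs = expected_observation_shapes()
    print(validate_observation(obs, "assemble the trocar"))
    print(validate_action((16, 28)))
```

With array libraries such as NumPy or PyTorch, the `.shape` attribute of each array can be passed in directly, since the checks convert shapes to plain tuples before comparing.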
