GR00T-N1.5-RL-Rheo Assemble Trocar
GR00T-N1.5-RL-Rheo-AssembleTrocar is a vision-language-action (VLA) model fine-tuned for surgical instrument handling in the Isaac for Healthcare Rheo workflow. Using a G1 embodiment, it performs trocar assembly: it retrieves the trocar parts (obturator and cannula) from a surgical tray on the left, assembles them, and places the assembled trocar on a Mayo stand on the right. The model is intended for Rheo simulation workflows only and is not for real-world clinical deployment. It is released under the NVIDIA License, with Apache-2.0 applying to Qwen2.5-7B-Instruct and SigLIP2-SO400M, and is ready for commercial and non-commercial use.
| Property | Details |
|---|---|
| Model size | 3B parameters (GR00T N1.5) |
| Model type | Vision Language Action (VLA), GR00T N1.5 |
| Runtime | PyTorch 2.8.0 |
| Input | Vision (3×480×640 RGB from a head camera and 2 wrist cameras), state (1×28), language |
| Output | 16×28 action tensor |
| Operating system | Linux (Ubuntu 22.04, 24.04) |
| Supported GPU architectures | Ampere, Blackwell, Hopper |
| Performance | 54.2 ± 8.5 ms latency and 8 GB VRAM, measured on an NVIDIA RTX 5880 Ada |
| Training data | 59 simulation samples collected by manual teleoperation |
| Workflow | Rheo |
| Hugging Face | nvidia/GR00T-N1.5-RL-Rheo-AssembleTrocar |
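
The input and output shapes in the table can be written down as a small shape check, which may help when wiring a simulation loop to the model. This is a minimal sketch in plain Python, not the model's actual API: the key names (`video.head`, `video.left_wrist`, `video.right_wrist`, `state`) and the left/right split of the two wrist cameras are assumptions made for illustration, and reading the 16×28 output as 16 steps of a 28-dimensional action is also an assumption. Only the shape tuples come from the table.

```python
from math import prod

# Shapes taken from the table above.
IMAGE_SHAPE = (3, 480, 640)   # RGB channels, height, width
STATE_SHAPE = (1, 28)
ACTION_SHAPE = (16, 28)       # assumed here to mean 16 steps x 28 action dims

# Hypothetical camera names: the card only says "head + 2 wrist cameras".
CAMERAS = ("head", "left_wrist", "right_wrist")


def expected_input_shapes():
    """Map each (hypothetical) observation key to the shape the card states."""
    shapes = {f"video.{name}": IMAGE_SHAPE for name in CAMERAS}
    shapes["state"] = STATE_SHAPE
    return shapes


def check_shapes(observed):
    """Compare observed shapes against the expected ones.

    `observed` maps key -> shape tuple. Returns a list of human-readable
    problems; an empty list means every expected key matched.
    """
    problems = []
    for key, want in expected_input_shapes().items():
        got = observed.get(key)
        if got is None:
            problems.append(f"{key}: missing")
        elif tuple(got) != want:
            problems.append(f"{key}: expected {want}, got {tuple(got)}")
    return problems


if __name__ == "__main__":
    print(check_shapes(expected_input_shapes()))             # no mismatches
    print(prod(IMAGE_SHAPE), "values per camera frame")
    print(prod(ACTION_SHAPE), "values per action tensor")
```

With array libraries, the same check applies by passing each array's `.shape` instead of a literal tuple.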