SCALE: Self-uncertainty Conditioned Adaptive Looking and Execution for Vision-Language-Action Models
Hyeonbeom Choi, Daechul Ahn, Youhan Lee +3 more
Vision-Language-Action (VLA) models have emerged as a promising paradigm for general-purpose robotic control, and test-time scaling (TTS) is gaining attention as a way to improve their robustness beyond what training alone provides. However, existing TTS methods for VLAs require additional training, verifiers, and multiple forward pass...