Large Language Models

LoopFormer: Elastic-Depth Looped Transformers for Latent Reasoning via Shortcut Modulation

Ahmadreza Jeddi, Marco Ciccone, Babak Taati
Published: February 11, 2026
Authors: 3
Word Count: 8,915
Code: Includes code

LoopFormer makes looped transformers adaptable to variable compute budgets through time-conditioned training.

Abstract

Looped Transformers have emerged as an efficient and powerful class of models for reasoning in the language domain. Recent studies show that these models achieve strong performance on algorithmic and reasoning tasks, suggesting that looped architectures possess an inductive bias toward latent reasoning. However, prior approaches fix the number of loop iterations during training and inference, leaving open the question of whether these models can flexibly adapt their computational depth under variable compute budgets. We introduce LoopFormer, a looped Transformer trained on variable-length trajectories to enable budget-conditioned reasoning. Our core contribution is a shortcut-consistency training scheme that aligns trajectories of different lengths, ensuring that shorter loops yield informative representations while longer loops continue to refine them. LoopFormer conditions each loop on the current time and step size, enabling representations to evolve consistently across trajectories of varying length rather than drifting or stagnating. Empirically, LoopFormer demonstrates robust performance on language modeling and reasoning benchmarks even under aggressive compute constraints, while scaling gracefully with additional budget. These results show that looped Transformers are inherently suited for adaptive language modeling, opening a path toward controllable and budget-aware large language models.
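To make the core idea concrete, here is a minimal sketch of budget-conditioned looped inference: a single shared block is applied repeatedly, with each application conditioned on the normalized time and step size, so the same parameters serve any loop count. All names and the toy block structure here are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 16

# Shared weights reused at every loop iteration (weight tying across depth).
W = rng.standard_normal((DIM, DIM)) * 0.1
W_cond = rng.standard_normal((2, DIM)) * 0.1  # projects (t, d) conditioning


def looped_block(h, t, d):
    """One shared block, modulated by normalized time t and step size d."""
    c = np.array([t, d]) @ W_cond          # conditioning vector from (t, d)
    return h + np.tanh((h + c) @ W)        # residual update


def run_loops(h, n_loops):
    """Apply the shared block n_loops times; the compute budget sets n_loops."""
    d = 1.0 / n_loops                      # step size implied by the budget
    for i in range(n_loops):
        h = looped_block(h, i * d, d)      # t advances from 0 toward 1
    return h


x = rng.standard_normal((4, DIM))
y_small = run_loops(x, 2)   # tight budget: 2 loops
y_large = run_loops(x, 8)   # larger budget: 8 loops, same parameters
```

The key point is that `run_loops` can be called with any `n_loops` at inference time; conditioning on `(t, d)` is what lets representations evolve consistently across trajectories of different lengths rather than drifting.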

Key Takeaways

  1. LoopFormer enables looped transformers to adapt to variable compute budgets by training on variable-length trajectories.

  2. Conditioning on normalized time and step size allows models to handle different loop counts at inference.

  3. This approach enables parameter-efficient reasoning models that gracefully degrade with reduced computational resources.
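The shortcut-consistency idea behind takeaway 2 can be sketched as a self-consistency penalty: one loop taken with step size 2d should land near the result of two consecutive loops of step size d, so short and long trajectories stay aligned. This is a hypothetical rendering in the spirit of shortcut models; the paper's actual loss and block may differ.

```python
import numpy as np

rng = np.random.default_rng(1)
DIM = 8
W = rng.standard_normal((DIM, DIM)) * 0.1
W_cond = rng.standard_normal((2, DIM)) * 0.1


def step(h, t, d):
    """One loop of a shared block conditioned on (t, d); the update is
    scaled by d so step size has a direct geometric meaning."""
    c = np.array([t, d]) @ W_cond
    return h + d * np.tanh((h + c) @ W)


def shortcut_consistency_loss(h, t, d):
    """Penalize disagreement between one big step of size 2d and two
    consecutive small steps of size d starting from the same state."""
    one_big = step(h, t, 2 * d)
    two_small = step(step(h, t, d), t + d, d)
    return float(np.mean((one_big - two_small) ** 2))


h = rng.standard_normal((4, DIM))
loss = shortcut_consistency_loss(h, t=0.0, d=0.25)
```

Minimizing a term like this across sampled `(t, d)` pairs is one way trajectories of different lengths could be aligned, so that short loops already yield informative representations while longer loops keep refining them.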

Limitations

  • Standard looped transformers fail when inference loops differ from training loops due to representation drift.

  • Previous looped transformer designs fixed loop counts during training, preventing flexible compute adaptation.

Keywords

looped Transformers, reasoning, inductive bias, loop iterations, variable compute budgets, LoopFormer, shortcut-consistency training, trajectory alignment, computational depth, adaptive language modeling
