AI Agents

TodoEvolve: Learning to Architect Agent Planning Systems

Jiaxi Liu, Yanzuo Jiang, Guibin Zhang, Zihan Zhang, Heng Chang, Zhenfei Yin, Qibing Ren, Junchi Yan
Published: February 8, 2026
Authors: 8
Word Count: 9,894

AI learns to design adaptive planning architectures tailored to specific task requirements.

Abstract

Planning has become a central capability for contemporary agent systems in navigating complex, long-horizon tasks, yet existing approaches predominantly rely on fixed, hand-crafted planning structures that lack the flexibility to adapt to the structural diversity of open-ended problems. To address this limitation, we introduce TodoEvolve, a meta-planning paradigm that autonomously synthesizes and dynamically revises task-specific planning architectures. Specifically, we first construct PlanFactory, a modular design space that standardizes diverse planning paradigms within a unified codebase encompassing topology, initialization, adaptation, and navigation, thereby providing a common interface for heterogeneous planning patterns. Leveraging PlanFactory, we collect high-quality planning trajectories and train Todo-14B via Impedance-Guided Preference Optimization (IGPO), a multi-objective reinforcement learning objective that encourages the generation of planning systems that are performant, stable, and token-efficient across arbitrary tasks and agent backbones. Empirical evaluations on five agentic benchmarks demonstrate that TodoEvolve consistently surpasses carefully engineered planning modules while maintaining economical API costs and runtime overhead.

Key Takeaways

  1. TodoEvolve learns to design task-specific planning architectures rather than relying on a single fixed planning system.

  2. PlanFactory unifies diverse planning paradigms along four dimensions: topology, initialization, adaptation, and navigation.

  3. Meta-planners can synthesize planning structures on the fly, tailored to the demands of each task.
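
To make the four PlanFactory dimensions concrete, here is a minimal Python sketch of how a planning system might be decomposed into topology, initialization, adaptation, and navigation. Everything in it is an illustrative assumption: the names `Plan`, `PlanningSystem`, `linear_init`, `mark_done`, and `ready_steps` are hypothetical and do not come from the PlanFactory codebase, and the dependency-graph topology is just one simple choice among many.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set

# Hypothetical sketch only: none of these names are the PlanFactory API.


@dataclass
class Plan:
    # Topology (assumed): a dependency graph, step -> prerequisite steps.
    deps: Dict[str, List[str]] = field(default_factory=dict)
    done: Set[str] = field(default_factory=set)


@dataclass
class PlanningSystem:
    initialize: Callable[[str], Plan]      # initialization: task -> first plan
    adapt: Callable[[Plan, str], Plan]     # adaptation: revise plan from feedback
    navigate: Callable[[Plan], List[str]]  # navigation: pick the next step(s)


def linear_init(task: str) -> Plan:
    """Split a ';'-separated task into a chain where each step needs the previous one."""
    steps = [s.strip() for s in task.split(";") if s.strip()]
    deps = {s: ([steps[i - 1]] if i > 0 else []) for i, s in enumerate(steps)}
    return Plan(deps=deps)


def mark_done(plan: Plan, feedback: str) -> Plan:
    """Treat feedback as the name of a finished step and record it."""
    if feedback in plan.deps:
        plan.done.add(feedback)
    return plan


def ready_steps(plan: Plan) -> List[str]:
    """Return unfinished steps whose prerequisites are all finished."""
    return [
        step
        for step, prereqs in plan.deps.items()
        if step not in plan.done and all(p in plan.done for p in prereqs)
    ]


if __name__ == "__main__":
    system = PlanningSystem(linear_init, mark_done, ready_steps)
    plan = system.initialize("search; read; answer")
    print(system.navigate(plan))         # ['search']
    plan = system.adapt(plan, "search")
    print(system.navigate(plan))         # ['read']
```

Under this decomposition, swapping `linear_init` for a function that builds a branching graph changes the topology without touching adaptation or navigation, which is the kind of modularity a shared interface is meant to allow.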

Limitations

  • No single planning architecture optimally handles all task types, requiring constant architectural adaptation.

  • Complex multimodal tasks and highly conflicted environments need continuous topological restructuring for effectiveness.

Keywords

meta-planning, planning architectures, PlanFactory, Impedance-Guided Preference Optimization, IGPO, reinforcement learning, task-specific planning, modular design space
