
LongCat-Flash-Thinking-2601 Technical Report

Meituan LongCat Team: Anchun Gui, Bei Li, Bingyang Tao, Bole Zhou, Borun Chen, Chao Zhang, Chao Zhang, Chen Gao, Chen Zhang, Chengcheng Han, Chenhui Yang, Chuyu Zhang, Cong Chen, Cunguang Wang, Daoru Pan, Defei Bu, Dengchang Zhao, Di Xiu, Dishan Liu, Dongyu Ru, Dunwei Tu, Fan Wu, Fengcheng Yuan, Fengcun Li, Gang Xu, Guanyu Wu, Guoyuan Lin, Haibin Wang, Hansi Yang, Hao Yang, Haonan Yan, Haoxiang Ma, Haoxing Wen, Hongyan Hao, Hongyin Tang, Hongyu Zang, Hongzhi Ni, Hui Su, Jiacheng Zhang, Jiahong Zhou, Jiahuan Li, Jiaming Wang, Jian Yang, Jianfei Zhang, Jianhao Xu, Jianing Wang, Jiapeng Zhu, Jiaqi Sun, Jiarong Shi, Jiarui Zhao, Jingang Wang, Jinluan Yang, Jinrui Ding, Jinwei Xiao, Jiyuan He, Juncan Xu, Kefeng Zhang, Keheng Wang, Li Wei, Lianhui Ma, Lin Qiu, Lingbing Kong, Lingchuan Liu, Linsen Guo, Mengshen Zhu, Mengxia Shen, Mingyang Zhu, Peiguang Li, Peng Pei, Pengcheng Jia, Pengtao Zhang, Peng Zhao, Qi Gu, Qiong Huang, Qiyuan Duan, Quanchi Weng, Rongxiang Weng, Rongzhi Zhang, Rumei Li, Shanglin Lei, Shengnan An, Shijun Dai, Shuaikang Liu, Shuang Zhou, Shuo Wang, Songyuan Zhao, Tao Liang, Tianhao Hu, Tianze Chen, Wei Liu, Wei Shi, Wei Wang, Weifeng Tang, Wenjie Shi, Wenlong Zhu, Wentao Chen, Wentao Shi, Xi Su, Xiangcheng Liu, Xiandi Ma, Xiangyu Xi, Xiangyuan Liu, Xiangzhou Huang, Xiao Liu, Xiaodong Cai, Xiaolong Chen, Xiaowei Shi, Xiaoyu Li, Xin Chen, Xingchen Liu, Xuan Huang, Xuezhi Cao, Xunliang Cai, Yan Chen, Yang Bai, Yang Liu, Yang Yang, Yang Zheng, Yaoming Wang, Yaoming Zhu, Yaqi Huo, Yanyu Chen, Yaorui Shi, Yerui Sun, Yi Zhang, Yihao Chen, Yi-Kai Zhang, Yifan Lu, Yifan Zhao, Yitao Zhai, Yongjing Yin, Yongwei Zhou, Youshao Xiao, Yuchuan Dai, Yuchen Xie, Yuchen Yu, Yufei Zhang, Yuhuai Wei, Yulei Qian, Yunfan Liang, Yunke Zhao, Yuwei Jiang, Yuxin Bian, Yuxin Chen, Yuxin Liu, Yue Xu, Yueqing Sun, Zeyang Yu, Zhao Yang, Zhengsheng Huang, Zhengyu Chen, Zhijian Liu, Zhikang Xia, Zhimin Lin, Zhiyuan Yao, Zhuofan Chen, Zhuowen Han, Zijian Zhang, Ziran Li, Ziwen Wang, Ziyuan Zhuang
arXiv ID
2601.16725
Published
January 23, 2026
Authors
162

Abstract

We introduce LongCat-Flash-Thinking-2601, a 560-billion-parameter open-source Mixture-of-Experts (MoE) reasoning model with superior agentic reasoning capability. LongCat-Flash-Thinking-2601 achieves state-of-the-art performance among open-source models on a wide range of agentic benchmarks, including agentic search, agentic tool use, and tool-integrated reasoning. Beyond benchmark performance, the model demonstrates strong generalization to complex tool interactions and robust behavior under noisy real-world environments. Its advanced capability stems from a unified training framework that combines domain-parallel expert training with subsequent fusion, together with an end-to-end co-design of data construction, environments, algorithms, and infrastructure spanning pre-training to post-training. In particular, the model's strong generalization in complex tool use is driven by our in-depth exploration of environment scaling and principled task construction. To optimize long-tailed, skewed generation and multi-turn agentic interactions, and to enable stable training across over 10,000 environments spanning more than 20 domains, we systematically extend our asynchronous reinforcement learning framework, DORA, for stable and efficient large-scale multi-environment training. Furthermore, recognizing that real-world tasks are inherently noisy, we conduct a systematic analysis and decomposition of real-world noise patterns, and design targeted training procedures that explicitly incorporate such imperfections into the training process, resulting in improved robustness for real-world applications. To further enhance performance on complex reasoning tasks, we introduce a Heavy Thinking mode that enables effective test-time scaling by jointly expanding reasoning depth and width through intensive parallel thinking.
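The Heavy Thinking mode mentioned in the abstract scales test-time compute along two axes: depth (a longer reasoning budget per trace) and width (many traces explored in parallel). The report's own implementation is not reproduced here; the Python sketch below only illustrates the general depth-and-width idea under assumed names. `generate` is a hypothetical stand-in for any inference client, and majority voting over final answers is just one possible aggregation strategy.

```python
# Illustrative sketch, not the paper's implementation: scale test-time compute
# by width (parallel reasoning traces) and depth (per-trace token budget),
# then aggregate the traces' final answers by majority vote.
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def generate(prompt: str, max_tokens: int, temperature: float = 0.8) -> str:
    """Hypothetical model call; replace with a real inference client."""
    raise NotImplementedError


def heavy_thinking(prompt: str, width: int = 8, depth_tokens: int = 32768) -> str:
    # Width: launch several independent reasoning traces in parallel.
    with ThreadPoolExecutor(max_workers=width) as pool:
        traces = list(pool.map(
            lambda _: generate(prompt, max_tokens=depth_tokens),  # Depth: long budget per trace.
            range(width),
        ))
    # Aggregation: take each trace's last line as its answer and pick the most common one.
    answers = [t.strip().splitlines()[-1] for t in traces if t.strip()]
    if not answers:
        raise RuntimeError("no usable traces were produced")
    return Counter(answers).most_common(1)[0][0]
```

Under these assumptions, increasing `width` trades extra parallel compute for answer consensus, while `depth_tokens` controls how much sequential reasoning each trace is allowed.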

Keywords

Mixture-of-Experts, agentic reasoning, domain-parallel expert training, fusion, asynchronous reinforcement learning, DORA, long-tailed generation, multi-turn interactions, real-world noise patterns, test-time scaling, reasoning depth, reasoning width, parallel thinking
