Advancing Open-source World Models

Robbyant Team, Zelin Gao, Qiuyu Wang, Yanhong Zeng, Jiapeng Zhu, Ka Leong Cheng, Yixuan Li, Hanlin Wang, Yinghao Xu, Shuailei Ma, Yihang Chen, Jie Liu, Yansong Cheng, Yao Yao, Jiayi Zhu, Yihao Meng, Kecheng Zheng, Qingyan Bai, Jingye Chen, Zehong Shen, Yue Yu, Xing Zhu, Yujun Shen, Hao Ouyang
Published: January 28, 2026
Authors: 24

Abstract

We present LingBot-World, an open-source world simulator built on video generation. Positioned as a top-tier world model, LingBot-World offers the following features. (1) It maintains high fidelity and robust dynamics across a broad spectrum of environments, including realistic scenes, scientific contexts, cartoon styles, and beyond. (2) It supports a minute-level horizon while preserving contextual consistency over time, a property also known as "long-term memory". (3) It supports real-time interactivity, achieving a latency of under 1 second when producing 16 frames per second. We provide public access to the code and model in an effort to narrow the divide between open-source and closed-source technologies. We believe our release will empower the community with practical applications in areas such as content creation, gaming, and robot learning.

Keywords

world simulator, video generation, world model, long-term memory, real-time interactivity
