The quest for AI that can generate and simulate consistent, interactive worlds in real-time has taken a monumental leap forward. On December 17, 2025, Tencent's Hunyuan team open-sourced HY-World 1.5, codenamed WorldPlay. This isn't just an incremental update; it's a comprehensive framework that claims to resolve the fundamental trade-off between speed, memory, and long-term consistency in world modeling.
In short, WorldPlay enables the generation of long-horizon, interactive streaming video at a stunning 24 FPS, all while maintaining geometric consistency over time. Let's dive into what makes this model so revolutionary.
## The Core Problem: Speed vs. Consistency
Previous world models, including the team's own HY-World 1.0, often faced a critical limitation. They could generate impressive 3D worlds but typically through a slow, offline process. Achieving real-time interaction meant sacrificing the long-term consistency of the environment—objects would morph, textures would flicker, and the geometry would drift over time. WorldPlay aims to shatter this compromise.
## The Four Pillars of WorldPlay's Architecture
The breakthrough is powered by four key technical innovations:
- Dual Action Representation: This is the "controller" of the model. It translates user inputs (such as keyboard and mouse movements) into a robust, model-understandable action space that allows precise and responsive control over the generated world's viewpoint (see the interactive-loop sketch after this list).
- Reconstituted Context Memory: This is the core of long-term consistency. To prevent the model from "forgetting" the past, this module dynamically rebuilds context from previously generated video chunks. It uses a technique called temporal reframing to keep geometrically important frames from the distant past accessible, effectively mitigating memory attenuation.
- WorldCompass, a novel RL post-training framework: After initial training, the model undergoes a reinforcement learning (RL) phase designed specifically for long-horizon tasks. WorldCompass directly optimizes the model for better action-following and higher visual quality over extended sequences, keeping the output stable and coherent.
- Context Forcing, memory-aware distillation: To achieve real-time speeds, a smaller, faster "student" model is distilled from a larger "teacher" model. Standard distillation, however, can cause the student to lose its ability to use long-range context. Context Forcing is a distillation method that aligns the memory context between teacher and student, preserving the student's capacity for long-term reasoning while enabling 24 FPS generation (a toy distillation sketch also follows below).
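To make the first two pillars concrete, here is a minimal, hypothetical sketch of an interactive generation loop that wires a dual action representation and a reconstituted context memory together. All names here (`Action`, `ContextMemory`, `generate_chunk`) are illustrative assumptions, not the actual WorldPlay API:

```python
# Hypothetical sketch of an interactive streaming loop with a dual action
# representation and a reconstituted context memory. Names and logic are
# illustrative assumptions, not the WorldPlay implementation.
from dataclasses import dataclass, field


@dataclass
class Action:
    """Toy dual action representation: discrete movement plus continuous camera deltas."""
    move: tuple     # e.g. (1, 0) for "forward", (0, -1) for "strafe left"
    yaw: float      # mouse delta around the vertical axis
    pitch: float    # mouse delta around the horizontal axis


@dataclass
class ContextMemory:
    """Short window of recent frames plus sparse anchor frames from the distant past."""
    recent: list = field(default_factory=list)
    anchors: list = field(default_factory=list)
    window: int = 16
    anchor_every: int = 64

    def update(self, frames, step):
        self.recent = (self.recent + frames)[-self.window:]
        if step % self.anchor_every == 0 and frames:
            self.anchors.append(frames[-1])   # keep a geometrically important frame

    def rebuild(self):
        # Reconstitute the conditioning context: distant anchors first, then recent frames.
        return self.anchors + self.recent


def generate_chunk(context, action):
    """Placeholder for the video model's next-chunk generation call."""
    return [f"frame(move={action.move}, ctx={len(context)})"]


def play(actions):
    memory = ContextMemory()
    for step, action in enumerate(actions):
        context = memory.rebuild()            # long- and short-range conditioning frames
        frames = generate_chunk(context, action)
        memory.update(frames, step)
        yield from frames                     # streamed to the user


if __name__ == "__main__":
    demo = [Action(move=(1, 0), yaw=0.0, pitch=0.0) for _ in range(4)]
    for frame in play(demo):
        print(frame)
```

The point of the sketch is that the conditioning context is rebuilt at every step rather than growing without bound: sparse anchor frames keep the geometry of the distant past pinned down, while a short window of recent frames keeps motion smooth.

Memory-aware distillation can be pictured in a similarly simplified way: the student is trained to match a frozen teacher while both condition on the same reconstituted context, so the student never learns to ignore distant frames. The PyTorch snippet below is a toy formulation under that assumption, not the paper's actual Context Forcing loss:

```python
# Toy, assumed formulation of memory-aware distillation: the student is
# supervised by a frozen teacher while both condition on the SAME long-range
# context. Not the paper's actual Context Forcing objective.
import torch
import torch.nn as nn

DIM = 64

teacher = nn.Sequential(nn.Linear(DIM * 2, 256), nn.ReLU(), nn.Linear(256, DIM)).eval()
student = nn.Sequential(nn.Linear(DIM * 2, 128), nn.ReLU(), nn.Linear(128, DIM))
optimizer = torch.optim.AdamW(student.parameters(), lr=1e-4)

for _ in range(100):                               # toy training loop
    context = torch.randn(8, DIM)                  # pooled long-range context features
    current = torch.randn(8, DIM)                  # current chunk latents
    inputs = torch.cat([context, current], dim=-1)

    with torch.no_grad():
        target = teacher(inputs)                   # teacher prediction with full context

    pred = student(inputs)                         # student sees the same context
    loss = nn.functional.mse_loss(pred, target)    # match teacher outputs

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```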
## Key Features and Capabilities
- Real-Time and Interactive: Generates video streams at 24 FPS, allowing for live interaction based on user input.
- Long-Term Geometric Consistency: Maintains the stability and coherence of the world's structure over long generation horizons.
- Versatile Applications: Supports both first-person and third-person perspectives in real-world and stylized environments. Potential applications include interactive 3D reconstruction, promptable events (e.g., "make it rain"), and infinite world extension.
- Comprehensive Open-Source Release: The team has open-sourced not just the model weights but a full-stack framework covering data, training, and inference deployment.
## Quantitative Superiority
The model's performance is backed by extensive evaluations. As shown in the table below, the full WorldPlay model ("Ours (full)") outperforms existing state-of-the-art methods on PSNR, SSIM, and LPIPS, especially in long-term scenarios, while still running in real time; of the listed baselines, only Matrix-Game-2.0 also operates in real time, and it trails substantially on every quality metric.
| Model | Real-time | Short-term PSNR↑ / SSIM↑ / LPIPS↓ | Long-term PSNR↑ / SSIM↑ / LPIPS↓ |
|---|---|---|---|
| CameraCtrl | ❌ | 17.93 / 0.569 / 0.298 | 10.09 / 0.241 / 0.549 |
| Gen3C | ❌ | 21.68 / 0.635 / 0.278 | 15.37 / 0.431 / 0.483 |
| Matrix-Game-2.0 | ✅ | 17.26 / 0.505 / 0.383 | 9.57 / 0.205 / 0.631 |
| Ours (full) | ✅ | 21.92 / 0.702 / 0.247 | 18.94 / 0.585 / 0.371 |
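For readers who want to run this kind of comparison on their own clips, a minimal per-frame PSNR/SSIM/LPIPS loop might look like the following. It assumes the `scikit-image` and `lpips` packages and uses random arrays as stand-ins for generated and ground-truth frames; it is not the authors' evaluation code:

```python
# Minimal sketch of per-frame PSNR/SSIM/LPIPS evaluation between generated
# and ground-truth frames. Uses random data as a stand-in; not the paper's
# evaluation pipeline.
import numpy as np
import torch
import lpips
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

lpips_fn = lpips.LPIPS(net="alex")                 # perceptual metric (lower is better)


def evaluate(generated, reference):
    """generated, reference: lists of HxWx3 uint8 frames of the same size."""
    psnr, ssim, lp = [], [], []
    for gen, ref in zip(generated, reference):
        psnr.append(peak_signal_noise_ratio(ref, gen, data_range=255))
        ssim.append(structural_similarity(ref, gen, channel_axis=-1, data_range=255))
        # LPIPS expects NCHW tensors scaled to [-1, 1]
        to_tensor = lambda x: torch.from_numpy(x).permute(2, 0, 1).unsqueeze(0).float() / 127.5 - 1.0
        with torch.no_grad():
            lp.append(lpips_fn(to_tensor(gen), to_tensor(ref)).item())
    return float(np.mean(psnr)), float(np.mean(ssim)), float(np.mean(lp))


if __name__ == "__main__":
    frames_a = [np.random.randint(0, 256, (128, 128, 3), dtype=np.uint8) for _ in range(8)]
    frames_b = [np.random.randint(0, 256, (128, 128, 3), dtype=np.uint8) for _ in range(8)]
    print(evaluate(frames_a, frames_b))
```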
## Getting Started with WorldPlay
For developers eager to experiment, the repository provides a clear quick-start path. The model is built on top of the powerful HunyuanVideo-1.5 base model. The setup involves:
- Creating a Python 3.10 environment and installing dependencies.
- Installing Flash Attention for optimized performance.
- Downloading the pre-trained HunyuanVideo-1.5 model and the specific WorldPlay checkpoints.
- Running the provided inference scripts (`generate.py`, or `generate_custom_trajectory.py` for custom camera paths).
The code supports inference with different model variants: bidirectional, autoregressive, and the distilled autoregressive model for maximum speed.
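As one way to script the checkpoint-download step, the `huggingface_hub` client can fetch a repository snapshot before calling the inference script. The repo IDs, local paths, and script flags below are placeholders; check the HY-WorldPlay README for the actual names and the exact arguments `generate.py` accepts:

```python
# Hedged example of fetching checkpoints and launching inference.
# Repo IDs, local paths, and script flags are placeholders; consult the
# HY-WorldPlay README for the real values.
import subprocess

from huggingface_hub import snapshot_download

base_dir = snapshot_download(
    repo_id="tencent/HunyuanVideo-1.5",          # placeholder repo id
    local_dir="./ckpts/hunyuanvideo-1.5",
)
worldplay_dir = snapshot_download(
    repo_id="tencent/HY-WorldPlay",              # placeholder repo id
    local_dir="./ckpts/worldplay",
)

# Invoke the provided inference script (flag names are illustrative only).
subprocess.run(
    ["python", "generate.py",
     "--base-model", base_dir,
     "--checkpoint", worldplay_dir,
     "--variant", "distilled-autoregressive"],
    check=True,
)
```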
## Conclusion and Future Work
HY-World 1.5 (WorldPlay) represents a significant milestone in AI-driven content creation and simulation. By systematically addressing the bottlenecks of speed and consistency, it opens up new possibilities for real-time, interactive applications in gaming, virtual reality, and architectural visualization.
The team has indicated that the training code is still on the open-sourcing TODO list; its release will be a crucial next step for the research community to build on this work. For now, the release of the model weights and inference code is a major contribution that lets everyone experience and benchmark this state-of-the-art interactive world model.
Learn More:
- GitHub Repository: https://github.com/Tencent-Hunyuan/HY-WorldPlay
- Technical Report & Paper: Check the repository for links to the detailed technical report and research papers.



