Next-Gen AI Multimodal Video Generation Tool
Hunyuan Custom is Tencent's state-of-the-art multimodal video generation solution that allows users to create customized, subject-consistent videos using AI. Upload an image, type a prompt, or add audio/video input to generate cinematic-quality content in seconds.

Explore more AI models from the same provider
Hunyuan Motion is a cutting-edge text-to-3D human motion generation suite that turns natural language into high-quality, skeleton-based character animation. Built on a billion-parameter Diffusion Transformer and Flow Matching, Hunyuan Motion delivers state-of-the-art instruction following, smooth motion, and production-ready outputs through a simple prompt-to-animation workflow, available from both a CLI and a Gradio interface. Learn more and get started via the official repository on [github.com](https://github.com/Tencent-Hunyuan/HY-Motion-1.0).
Transform your ideas and images into stunning, production-ready 3D assets with Tencent's revolutionary Hunyuan 3D. Featuring advanced diffusion models, professional texture synthesis, and seamless workflow integration for game development, product design, and digital art.
Hunyuan Image 3.0 transforms your ideas into stunning, photorealistic images with unprecedented prompt adherence and intelligent reasoning. Powered by an 80B-parameter Mixture-of-Experts (MoE) architecture with 64 experts, it delivers exceptional semantic accuracy and visual excellence. Experience the future of AI image generation with native multimodal understanding.
Hunyuan Video transforms your text descriptions into stunning, high-quality videos with exceptional physical accuracy and temporal consistency. Powered by a 13B-parameter Unified Diffusion Transformer architecture, it generates videos of up to 5 seconds at 720p resolution with superior motion dynamics and visual fidelity. Experience the future of video creation with advanced Flow Matching schedulers and parallel inference capabilities.
Transform text & images into high-quality 3D models. Unleash your creative potential.
Bring portraits to life. Create expressive talking-head videos from a single image and audio.