Sana video: Efficient Text-to-Video and Image-to-Video by NVIDIA NVLabs
Sana video brings efficient, high-quality text-to-video and image-to-video generation to your browser. Create coherent 720p, 16 fps clips up to one minute with research-backed performance. Try Sana video on Story321 and ship polished motion content fast.
Why choose Sana video on Story321
Story321 pairs Sana video with a streamlined UI, consistent defaults, and versioned settings so you can focus on creative direction, not plumbing or GPU micro-tuning.
Coherent motion and ‘World Sim’
Enjoy stable subjects, realistic physics cues, and scene continuity for believable motion and camera moves (nvlabs.github.io).
Right-size output for speed
720p, 16 fps, up to 1 minute—an ideal balance of quality and iteration speed for most creative workflows (nvlabs.github.io).
Workflow-first integration
Batch runs, preset templates, safe defaults, and quick retries reduce friction from idea to export.
Sana video on Story321 is built for creators who want fast, predictable, high-quality motion results.
Meet Sana video
Sana video is NVIDIA NVLabs’ efficient diffusion-based video generator for text-to-video (T2V) and image-to-video (I2V), supporting up to 720p resolution, 16 fps, and durations up to one minute, with research-backed fidelity and coherent motion (nvlabs.github.io).
Text-to-Video (T2V)
Turn natural language into vivid motion. Sana video supports multi-style narratives, smooth transitions, and consistent subjects, producing high-quality 720p sequences at 16 fps (nvlabs.github.io).
Image-to-Video (I2V)
Animate a single frame into a dynamic clip. Preserve identity and composition while adding realistic motion, camera moves, and scene depth (nvlabs.github.io).
Efficient, practical runtime
Generate a 5-second clip in about 60 s, or about 29 s on an RTX 5090 with NVFP4 optimizations, which is fast enough for iteration loops (youtube.com).
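To put those runtime figures in per-frame terms, here is a small back-of-the-envelope sketch. It uses only the numbers quoted on this page (a 5-second clip, 16 fps, roughly 60 s and 29 s of generation time); the function name is illustrative, and real timings will vary with hardware and settings.

```python
FPS = 16  # frame rate quoted on this page

def seconds_per_frame(clip_seconds: float, wall_seconds: float, fps: int = FPS) -> float:
    """Wall-clock seconds of generation time per output frame."""
    frames = clip_seconds * fps
    return wall_seconds / frames

baseline = seconds_per_frame(5, 60)   # 80 frames in about 60 s
optimized = seconds_per_frame(5, 29)  # 80 frames in about 29 s (RTX 5090 + NVFP4 figure)
speedup = 60 / 29

print(f"{baseline:.2f} s/frame, {optimized:.2f} s/frame, {speedup:.1f}x faster")
```

This only restates the quoted figures; it does not predict how generation time scales for longer clips.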
Open-source and research-backed
Built on the SANA family (Linear Diffusion Transformer) with ICLR 2025 recognition, plus open-source code for exploration and extensions (nvlabs.github.io • research.nvidia.com • github.com).
What you can create with Sana video
From brand teasers to tutorial loops, Sana video accelerates concepting and production-grade motion.
Launch teasers
Cut 5–10s hero shots with controlled camera moves and consistent branding.
Product explainers
Demonstrate features with readable motion beats and legible close-ups.
Character moments
Animate mascot gestures, expressions, and micro-acting from a single image.
Cinematic b‑roll
Generate stylized transitions, establishing shots, and ambient loops.
Social trends
Prototype punchy, loopable clips that match platform pacing.
Education & how‑tos
Show step-by-step motion with camera clarity and temporal structure.
Prompting Sana video like a pro
Clear intent and temporal cues help Sana video deliver consistent motion and style.
Key elements of a strong prompt
Subject + art direction
Define who/what, plus aesthetics. Name character traits, materials, and style anchors.
Action + camera
Describe verbs and camera language to lock motion and framing.
Environment + mood
Specify space, light, and atmosphere to stabilize look across frames.
Temporal beats
Add start/middle/end pacing to guide progression in short clips.
Reference-first I2V
For image-to-video, say what to preserve vs. what to animate.
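One way to keep these elements consistent across runs is to assemble the prompt from named parts. The sketch below is a hypothetical helper, not part of Sana video or Story321: the `PromptParts` fields and the `build_prompt` function are assumptions made for illustration, and the output is just a plain text prompt.

```python
from dataclasses import dataclass

@dataclass
class PromptParts:
    subject: str         # who/what plus art direction
    action_camera: str   # verbs and camera language
    environment: str     # space, light, atmosphere
    beats: str           # start/middle/end pacing
    preserve: str = ""   # I2V only: what to keep from the reference image
    animate: str = ""    # I2V only: what should move

def build_prompt(p: PromptParts) -> str:
    """Join the non-empty parts into one short, explicit prompt string."""
    parts = [p.subject, p.action_camera, p.environment, p.beats]
    if p.preserve:
        parts.append(f"preserve {p.preserve}")
    if p.animate:
        parts.append(f"animate {p.animate}")
    return "; ".join(s.strip() for s in parts if s.strip())

fox = PromptParts(
    subject="A red fox",
    action_camera="dashes along a mossy path, steady cam at fox height",
    environment="morning mist, sunbeams through pines",
    beats="start wide, mid chase, end close-up",
)
prompt = build_prompt(fox)
print(prompt)
```

For image-to-video, filling `preserve` and `animate` appends the "what to keep vs. what to move" instruction to the same structure.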
Pro tips
Be explicit, not verbose
Short, concrete phrasing outperforms long, poetic text for motion control.
Tie motion to time
Use seconds (“hold 1s”, “ramp over 2s”) so timing maps to clip length.
Iterate in short clips
Refine in 3–5s; upscale or extend after Sana video matches your intention.
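The timing tip can be sanity-checked before you write the prompt. The following is a minimal sketch, assuming the 16 fps output quoted on this page; `beats_to_frames` is a hypothetical planning helper and not something the model or Story321 exposes.

```python
FPS = 16  # frame rate quoted on this page

def beats_to_frames(beats, clip_seconds, fps=FPS):
    """Map (label, seconds) beats to frame ranges; fail if they overrun the clip."""
    total = sum(seconds for _, seconds in beats)
    if total > clip_seconds:
        raise ValueError(f"beats total {total}s but the clip is {clip_seconds}s")
    ranges = []
    start = 0
    for label, seconds in beats:
        end = start + round(seconds * fps)
        ranges.append((label, start, end))  # end frame is exclusive
        start = end
    return ranges

plan = beats_to_frames([("hold", 1), ("ramp", 2), ("settle", 2)], clip_seconds=5)
for label, start, end in plan:
    print(f"{label}: frames {start}-{end - 1}")
```

If the beats add up to more than the clip length, the helper raises an error, which is a cue to shorten a beat or lengthen the clip before generating.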
Prompt refinement examples
Before: "A fox running in a forest"
After: "A red fox dashes along a mossy path; steady cam at fox height; morning mist; sunbeams through pines; start wide, mid chase, end close-up" (Sana video holds framing and motion cues)
Before: "A sports car on a coastal road"
After: "Vintage red sports car, low tracking shot, lens flare, ocean cliffs; smooth roll; pass two bends; end on cliff vista" (Sana video maintains speed and composition)
How to use on Story321
Follow these steps to produce consistent results with Sana video.
Pick the model
Choose Sana video from the model list.
Select mode
Use Text-to-Video for prompts, or Image-to-Video to animate a reference.
Write the prompt / set reference
Describe subject, motion, camera, time; upload an image for I2V.
Set duration, resolution, fps
Choose up to 60s, 720p, and 16 fps for balanced quality.
Tune controls
Adjust motion strength, camera jitter, aspect ratio, and seed for reproducibility.
Generate and refine
Preview, trim, and iterate in short clips; extend once locked.
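The steps above can be captured as a small settings record that you check before each run. This is a sketch under stated assumptions: the limits (60 s, 720p, 16 fps) come from this page, while `RunSettings`, its field names, and the `"t2v"`/`"i2v"` mode strings are hypothetical and do not describe an actual Story321 or Sana video API.

```python
from dataclasses import dataclass, replace

MAX_SECONDS = 60   # "up to 60s" on this page
MAX_HEIGHT = 720   # 720p
FPS = 16           # 16 fps

@dataclass(frozen=True)
class RunSettings:
    mode: str            # hypothetical labels: "t2v" or "i2v"
    seconds: float
    height: int = MAX_HEIGHT
    fps: int = FPS
    seed: int = 0        # fixed seed so a run can be repeated

    def problems(self) -> list:
        """Return human-readable issues; an empty list means the settings fit the limits."""
        issues = []
        if self.mode not in ("t2v", "i2v"):
            issues.append(f"unknown mode {self.mode!r}")
        if not 0 < self.seconds <= MAX_SECONDS:
            issues.append(f"duration {self.seconds}s is outside 0-{MAX_SECONDS}s")
        if self.height > MAX_HEIGHT:
            issues.append(f"height {self.height} exceeds {MAX_HEIGHT}")
        if self.fps != FPS:
            issues.append(f"fps {self.fps} differs from {FPS}")
        return issues

draft = RunSettings(mode="t2v", seconds=4, seed=1234)  # short iteration clip
final = replace(draft, seconds=30)                     # extend once the look is locked
print(draft.problems(), final.problems())
```

Copying the draft with `replace` carries the mode and seed over to the longer run. Whether a longer clip with the same seed matches the short draft is up to the model; the sketch only keeps the recorded settings consistent.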
Tips
- Iterate at 3–5s lengths before extending to 30–60s.
- Keep subject names, styles, and lens terms consistent across runs.
- Use time cues like “hold 1s” to stabilize beats.
- For I2V identity, upload crisp, evenly lit references.
- Organize winning prompts as templates for Sana video.
Specs such as 720p, 16 fps, and up to 1 minute reflect current public research notes; see the project pages for updates ([nvlabs.github.io](https://nvlabs.github.io/Sana/Video/) • [github.com](https://github.com/NVlabs/Sana)).
Start creating with Sana video
Prototype, iterate, and publish compelling motion content—Sana video on Story321 gives you speed, coherence, and research-grade quality.
Performance and specs are based on public materials and may evolve with new releases ([nvlabs.github.io](https://nvlabs.github.io/Sana/Video/)).