
Video generation

Nodes can generate video from text prompts on your own GPU, alongside text and images. It runs entirely on your hardware through stable-diffusion.cpp; there's no cloud path, so it never touches credits. Generation is asynchronous (a clip takes minutes): you start a job and poll it until the video is ready.

1. Deploy a video model

Video needs a video model loaded on a node. On the Nodes page, open a node, go to Media, and deploy one. The node downloads the weights, starts its media runtime, and reports ready — watch it with wai status.

model | needs | notes
Wan 2.2 TI2V 5B | 12GB+ VRAM | The lightweight default. Short clips (~2s) in a few minutes on a 12GB card.
LTX-2.3 (22B) | 14GB+ VRAM, ~12GB free RAM | Higher quality, larger. Its Gemma-12B text encoder is streamed from RAM so the diffusion model fits a 16GB GPU; slower than Wan.

The dashboard greys out a model a node can't fit. One media model runs per node at a time; deploying a new one replaces it. Text inference on the same node is unaffected.
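The requirements in the table can be read as a simple fit check. The sketch below is illustrative only: the thresholds are copied from the table, while the `NodeSpec` type, the `fitting_models` function, and the comparison rule are hypothetical and are not the dashboard's actual logic.

```python
from dataclasses import dataclass

# Thresholds copied from the table above, in GB. The LTX-2.3 RAM figure is
# approximate in the source (~12GB), so treat it as a rough floor.
REQUIREMENTS = {
    "Wan 2.2 TI2V 5B": {"vram_gb": 12, "free_ram_gb": 0},
    "LTX-2.3 (22B)": {"vram_gb": 14, "free_ram_gb": 12},
}


@dataclass
class NodeSpec:
    """Hypothetical description of a node, for illustration only."""
    vram_gb: float
    free_ram_gb: float


def fitting_models(node: NodeSpec) -> list[str]:
    """Return the models whose table requirements the node meets."""
    return [
        name
        for name, need in REQUIREMENTS.items()
        if node.vram_gb >= need["vram_gb"]
        and node.free_ram_gb >= need["free_ram_gb"]
    ]


# A 12GB card with plenty of RAM only meets the Wan row of the table.
print(fitting_models(NodeSpec(vram_gb=12, free_ram_gb=32)))
```

A real node may report memory differently, so use this only as a way to reason about the table, not as a predictor of what the dashboard will grey out.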

2. Test it from the node

Once a video model reports ready, the fastest check is wai test-video on the node itself. It generates a short clip straight against the local media server (bypassing the gateway), saves it to your Desktop, and opens it — a quick way to confirm the GPU actually renders before wiring it into an app.

windows · powershell
wai status                       # confirm the media model is "ready"
wai test-video                   # default prompt
wai test-video "a red kite over a green field, slow motion"

macos / linux
wai status
wai test-video
wai test-video "a red kite over a green field, slow motion"

It prints the job id and polls until done, then writes wai-test-video.webm. A first run also confirms the heavier bits work end to end — for LTX-2.3 that's the Gemma encoder offload and the temporal-tiled VAE decode. (wai test-image does the same for image models.)

Prefer raw HTTP? The node's media server speaks an async API on 127.0.0.1:8081 (loopback only): POST /sdcpp/v1/vid_gen returns a job id, then GET /sdcpp/v1/jobs/{id} returns the base64 clip when complete.
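As a rough sketch of the raw HTTP flow, the Python helpers below build the two requests using only the standard library. The host, port, and routes come from the paragraph above. The request body shape (a JSON object with a `prompt` field) is an assumption, since this page does not document the schema, and the helper names are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

BASE = "http://127.0.0.1:8081"  # loopback only, so run this on the node itself


def build_start_request(prompt: str) -> urllib.request.Request:
    """Build the POST that starts a video job."""
    # ASSUMPTION: the body is JSON with a "prompt" field. The page above
    # names the route but not the request schema.
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE}/sdcpp/v1/vid_gen",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_poll_request(job_id: str) -> urllib.request.Request:
    """Build the GET that fetches a job's current state."""
    return urllib.request.Request(
        f"{BASE}/sdcpp/v1/jobs/{quote(job_id, safe='')}", method="GET"
    )


def send(req: urllib.request.Request) -> dict:
    """Send a request and decode the JSON response."""
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Usage on a node with a ready video model (not executed here):
#   started = send(build_start_request("a red kite over a green field"))
#   print(started)  # inspect the response to find the job id field
```

Printing the first response before writing a polling loop is deliberate: the name of the job id field is not stated on this page, so it is safer to inspect it than to guess.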

3. Use it

The everyday way is the web UI. In Chat or the playground, pick a video model — they appear as model@node (e.g. ltx-2.3@gpu01) — type a prompt, and the clip renders inline when the job finishes.
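If your own tooling needs to take a `model@node` label apart, a small helper like the one below is enough. It is a sketch: the function name is hypothetical, and splitting on the last `@` is an assumption about how the label is formed.

```python
def split_model_ref(ref: str) -> tuple[str, str]:
    """Split a label such as "ltx-2.3@gpu01" into (model, node)."""
    # ASSUMPTION: the node name follows the last "@" in the label.
    model, sep, node = ref.rpartition("@")
    if not sep or not model or not node:
        raise ValueError(f"expected model@node, got {ref!r}")
    return model, node


model, node = split_model_ref("ltx-2.3@gpu01")
print(model, node)
```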

To drive it from your own app, use the playground video endpoint. It's session-authenticated (a signed-in WAI session, not a wai_sk_… gateway key — video has no OpenAI-compatible /v1 route yet). Start a job, then poll until status: "done":

start a job
POST /api/playground/video
{ "prompt": "a red kite over a green field", "node": "gpu01" }   // node optional

-> 202  { "jobId": "…", "nodeId": "…", "node": "gpu01" }

poll until done
GET /api/playground/video?jobId=…&nodeId=…

-> { "status": "pending" | "running", "progress": … }
-> { "status": "done", "video": "<base64>", "contentType": "video/webm" }
-> { "status": "failed", "error": "…" }
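The start-then-poll flow above can be sketched as a small Python loop. The routes, field names, and status values are taken from the examples above. Everything else is an assumption: `start` and `poll` are hypothetical callables that you supply to perform the session-authenticated HTTP calls (this page does not say how the session is attached to a request), and the polling interval and timeout are arbitrary defaults.

```python
import time
from typing import Callable, Optional
from urllib.parse import urlencode


def poll_url(job_id: str, node_id: str) -> str:
    """Build the polling path shown above, with both ids URL-encoded."""
    return "/api/playground/video?" + urlencode({"jobId": job_id, "nodeId": node_id})


def wait_for_video(
    start: Callable[[dict], dict],   # hypothetical: POSTs JSON to /api/playground/video
    poll: Callable[[str], dict],     # hypothetical: GETs the given path, returns JSON
    prompt: str,
    node: Optional[str] = None,
    interval_s: float = 5.0,         # assumed polling interval
    timeout_s: float = 900.0,        # assumed overall limit
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Start a job and poll until it is done; raise on failure or timeout."""
    payload = {"prompt": prompt}
    if node is not None:             # "node" is optional in the request
        payload["node"] = node
    job = start(payload)
    url = poll_url(job["jobId"], job["nodeId"])
    deadline = time.monotonic() + timeout_s
    while True:
        state = poll(url)
        status = state.get("status")
        if status == "done":
            return state             # carries "video" and "contentType"
        if status == "failed":
            raise RuntimeError(state.get("error", "video job failed"))
        if time.monotonic() >= deadline:
            raise TimeoutError("video job did not finish in time")
        sleep(interval_s)            # "pending" or "running": wait and retry
```

Passing the HTTP calls in as functions keeps the loop independent of any particular client library and makes it easy to exercise with fakes before pointing it at a real session.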

Good to know

  • It takes minutes, not seconds. Clips are short by design; expect a few minutes per generation depending on the model, resolution, and frame count.
  • Output is WebM. The node returns a base64 WebM clip; the UI renders it inline.
  • Node-only, free. There is no cloud failover for video — if no node is ready, the request returns a 503 telling you to deploy a model. Nothing is billed.
  • Storage. Model weights (Wan is ~10GB, LTX-2.3 ~24GB) land in the node's models folder; move it to another disk with wai models-dir.
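Because the clip arrives as base64, a client has to decode it before it can be played. The helper below is a minimal sketch of that step for a "done" response from the playground endpoint. The `save_clip` name, the default file stem, and the fallback extension for unknown content types are assumptions made for illustration.

```python
import base64
from pathlib import Path

# Only WebM is documented above; anything else falls back to ".bin".
EXTENSIONS = {"video/webm": ".webm"}


def save_clip(result: dict, stem: str = "clip", out_dir: str = ".") -> Path:
    """Decode the base64 "video" field of a finished job and write it to disk."""
    if result.get("status") != "done":
        raise ValueError("job is not done, so there is no video to save")
    ext = EXTENSIONS.get(result.get("contentType", ""), ".bin")
    path = Path(out_dir) / f"{stem}{ext}"
    path.write_bytes(base64.b64decode(result["video"]))
    return path
```

Call it with the final payload returned by polling, for example `save_clip(result, stem="kite")`, and open the resulting file in any player that supports WebM.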

See also the wai CLI reference for the full command list and node states.