
How to Generate AI Video with Get3W

A complete guide to creating AI videos from text prompts and reference images using Get3W's unified API with Kling, Wan, Sora, and more.

Get3W Team

If you are researching how to generate AI video for marketing, storytelling, or product demos, the fastest path is usually a hosted API that routes your request to a strong video model, handles queues and scaling, and returns a shareable file. This guide explains what that involves, how to pick a workflow on Get3W, and how to call the API from Python.

What AI video generation actually is

AI video generation turns a specification—most often a text prompt, sometimes a reference image or clip—into a short video file. Modern models predict motion, lighting, and (on some models) audio over a few seconds of footage. Outputs are rarely “final broadcast masters” in one click; they are best treated as director’s dailies: strong starting points you refine with better prompts, a different model, or post-production.

Quality depends on the model, your prompt, resolution and duration settings, and whether you constrain the scene (single subject, simple camera move) or ask for chaos (crowds, complex physics, unreadable text).

Most teams choose a mode based on where their creative control starts.

Text-to-video

You describe the scene in words; the model invents both appearance and motion. Best when you want maximum flexibility or you do not yet have key art. Explore options on the text-to-video model type page on Get3W.

Image-to-video

You supply a still (product shot, character sheet, storyboard frame) and a prompt that describes how it should move. Motion tends to feel more anchored because composition and identity are fixed. See image-to-video listings—popular entries include Wan 2.6 Image-to-Video.
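To make the difference concrete, an image-to-video request body might look like the sketch below. The field names (`image_url`, `duration`) are illustrative assumptions, not a documented schema; the exact parameters come from the model's detail page.

```python
# Hypothetical request body for an image-to-video model such as
# Wan 2.6 Image-to-Video. Field names ("image_url", "duration") are
# illustrative -- check the model page for the real schema.
body = {
    "image_url": "https://example.com/product-shot.png",  # the still to animate
    "prompt": "slow 360-degree turntable, soft studio lighting, steam rising",
    "duration": 5,  # seconds; duration caps vary by model
}
```

Note that the prompt describes only motion and atmosphere; composition and subject identity are already fixed by the image.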

First-frame and reference workflows

Some providers treat “image as first frame” or “reference clip” as distinct run types (for example first-to-video or reference-to-video). They are useful when you need continuity between shots or tighter control than pure text-to-video. Browse the models catalog and open a model page to see supported inputs for that exact endpoint.

Models worth comparing on Get3W

You are not locked to a single vendor. On Get3W, comparable families include:

  • Kling — e.g. Kling 3 Text-to-Video for general-purpose generation.
  • Wan — e.g. Wan 2.6 Image-to-Video when you are starting from a still.
  • Sora — OpenAI's video model family, available through the same unified API.

Always read the parameters on the model page: duration caps, aspect ratios, negative prompts, and whether sound is generated vary by model.
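As an illustration of how much schemas can drift between families, compare two hypothetical request bodies. Every parameter name and limit here is an assumption for the sake of the example; only the model page defines the real fields.

```python
# Illustrative only: parameter names and limits differ per model family,
# so treat both bodies as sketches, not real schemas.
kling_body = {
    "prompt": "a ceramic cup on a walnut table, slow dolly-in",
    "duration": 5,            # some models cap duration at 5 or 10 seconds
    "aspect_ratio": "16:9",
    "negative_prompt": "text, watermark, extra hands",  # not universally supported
}

wan_body = {
    "prompt": "steam rises from the cup, camera static",
    "resolution": "720p",     # some families expose resolution instead of ratio
    # no negative_prompt: not every model accepts one
}

# Often only the prompt field is common ground between two model families.
shared = set(kling_body) & set(wan_body)
```

The safe pattern is to build each request body from the parameters that model's page documents, rather than reusing one dict across models.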

Writing prompts that survive contact with reality

Good video prompts read like shot notes, not tag clouds.

Structure that works

  1. Subject — who or what is on screen.
  2. Setting — environment, time of day, materials.
  3. Action — one clear verb phrase (“walks toward camera,” “steam rises from the cup”).
  4. Camera — lens feel, movement (slow dolly-in, handheld documentary).
  5. Lighting and style — “soft window light,” “high-contrast noir,” “clean product lighting.”

Common failure modes

  • Too many events in five seconds: pick one primary motion.
  • Illegible text in-frame: most models still struggle with crisp typography.
  • Ambiguous physics: “explodes” or “melts” without style cues often look messy—add references (“practical FX,” “cartoon squash,” “slow-motion liquid”).

Step-by-step: submit a task with the Get3W API

  1. Create an API key in the Get3W dashboard.
  2. Pick a model id from the model catalog (the same path segment you see in the URL is typically what the API expects).
  3. POST to https://api.get3w.com/api/v3/{model-id} with JSON parameters for that model.
  4. Note the task_id (prediction id) from the response.
  5. Poll GET https://api.get3w.com/api/v3/predictions/{task_id} until status indicates completion, then read outputs for the video URL.

Exact JSON fields differ per model; the model detail page is the source of truth.

Python example (submit + poll)

import os
import time
import requests

API_KEY = os.environ["GET3W_API_KEY"]
BASE = "https://api.get3w.com/api/v3"

# Example: Kling 3 text-to-video — replace with any model id from https://get3w.com/models
MODEL_ID = "kling/kling-3/text-to-video"

headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
}

body = {
    "prompt": (
        "Single continuous shot, slow dolly-in: a ceramic coffee cup on a walnut table, "
        "morning window light, soft steam rising, shallow depth of field, 35mm photographic look"
    ),
}

r = requests.post(f"{BASE}/{MODEL_ID}", headers=headers, json=body, timeout=60)
r.raise_for_status()
task = r.json()["data"]
task_id = task["id"]
print("Submitted:", task_id)

while True:
    s = requests.get(f"{BASE}/predictions/{task_id}", headers=headers, timeout=30)
    s.raise_for_status()
    data = s.json()["data"]
    status = data["status"]
    print("Status:", status)
    if status in ("succeeded", "failed", "canceled"):
        print(data.get("outputs"))
        break
    time.sleep(2)

Set GET3W_API_KEY in your environment before running. If you prefer fewer lines of code, the official get3w Python SDK wraps the same flow—see the Python SDK documentation.

Tips for reliably better results

  • Prototype short, then increase duration or resolution once the motion looks right.
  • Reuse seeds when a shot is almost perfect and you only want a small tweak (if the model exposes seed).
  • Match prompt to mode: image-to-video prompts should emphasize motion, not re-describe the whole frame.
  • Use webhooks for production pipelines so you are not holding open HTTP connections on long renders (see Get3W webhook docs).
  • Budget for iteration: professional-looking clips are usually the second or fifth generation, not the first.
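For the webhook route, the core logic is a small handler that inspects the callback payload. The sketch below assumes the payload mirrors the polling response used earlier in this guide (keys `status` and `outputs`); your actual webhook body may differ, so verify the field names against the Get3W webhook docs.

```python
def handle_webhook(payload: dict):
    """Return the video URL on success, None while pending, raise on failure.

    Assumes the webhook payload carries the same "status"/"outputs" fields
    as the polling response; this is an assumption, not a documented schema.
    """
    status = payload.get("status")
    if status == "succeeded":
        outputs = payload.get("outputs") or []
        return outputs[0] if outputs else None
    if status in ("failed", "canceled"):
        raise RuntimeError(f"Render ended with status: {status}")
    return None  # still queued or processing; nothing to do yet

# Example callback payload after a successful render:
url = handle_webhook(
    {"status": "succeeded", "outputs": ["https://cdn.example/video.mp4"]}
)
```

Mount this behind whatever HTTP framework your pipeline already uses; the framework receives the POST, parses the JSON, and hands the dict to `handle_webhook`.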

Next steps

Open the Get3W model catalog, filter by video run types, and pin two models: one tuned for speed (faster iteration) and one tuned for final quality. Run the same prompt through both; you will quickly learn which family matches your brand’s look.