How to Generate AI Video with Get3W
A complete guide to creating AI videos from text prompts and reference images using Get3W's unified API with Kling, Wan, Sora, and more.
If you are researching how to generate AI video for marketing, storytelling, or product demos, the fastest path is usually a hosted API that routes your request to a strong video model, handles queues and scaling, and returns a shareable file. This guide explains what that involves, how to pick a workflow on Get3W, and how to call the API from Python.
What AI video generation actually is
AI video generation turns a specification—most often a text prompt, sometimes a reference image or clip—into a short video file. Modern models predict motion, lighting, and (on some models) audio over a few seconds of footage. Outputs are rarely “final broadcast masters” in one click; they are best treated as director’s dailies: strong starting points you refine with better prompts, a different model, or post-production.
Quality depends on the model, your prompt, resolution and duration settings, and whether you constrain the scene (single subject, simple camera move) or ask for chaos (crowds, complex physics, unreadable text).
Text-to-video vs image-to-video (and related modes)
Most teams choose a mode based on where their creative control starts.
Text-to-video
You describe the scene in words; the model invents both appearance and motion. Best when you want maximum flexibility or you do not yet have key art. Explore options on the text-to-video model type page on Get3W.
Image-to-video
You supply a still (product shot, character sheet, storyboard frame) and a prompt that describes how it should move. Motion tends to feel more anchored because composition and identity are fixed. See image-to-video listings—popular entries include Wan 2.6 Image-to-Video.
First-frame and reference workflows
Some providers treat “image as first frame” or “reference clip” as distinct run types (for example first-to-video or reference-to-video). They are useful when you need continuity between shots or tighter control than pure text-to-video. Browse the models catalog and open a model page to see supported inputs for that exact endpoint.
Models worth comparing on Get3W
You are not locked to a single vendor. On Get3W, comparable families include:
- Kling 3 (text-to-video) — strong general-purpose cinematic generations.
- OpenAI Sora 2 (text-to-video) — high-end motion and scene coherence when you need flagship quality.
- Wan 2.6 (text-to-video) — multi-shot style workflows with audio-oriented features on supported modes.
- Google Veo 3.1 Fast (text-to-video) — useful when you want a faster iteration loop while staying in the Veo family.
Always read the parameters on the model page: duration caps, aspect ratios, negative-prompt support, and whether sound is generated all vary by model.
Writing prompts that survive contact with reality
Good video prompts read like shot notes, not tag clouds.
Structure that works
- Subject — who or what is on screen.
- Setting — environment, time of day, materials.
- Action — one clear verb phrase (“walks toward camera,” “steam rises from the cup”).
- Camera — lens feel, movement (slow dolly-in, handheld documentary).
- Lighting and style — “soft window light,” “high-contrast noir,” “clean product lighting.”
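One way to keep that structure honest is to assemble prompts from explicit shot-note fields instead of writing them freehand. The helper below is a sketch of one such convention — the field names are ours, not anything the Get3W API requires:

```python
def build_video_prompt(subject, setting, action, camera, lighting_style):
    """Assemble shot notes into a single comma-joined prompt string.

    The five-field breakdown mirrors the checklist above; nothing here
    is a Get3W requirement, just a way to keep prompts structured.
    """
    parts = [subject, setting, action, camera, lighting_style]
    # Drop empty fields so optional notes can be omitted.
    return ", ".join(p.strip() for p in parts if p and p.strip())

prompt = build_video_prompt(
    subject="a ceramic coffee cup",
    setting="walnut table, morning window light",
    action="soft steam rises from the cup",
    camera="slow dolly-in, shallow depth of field",
    lighting_style="35mm photographic look",
)
```

Keeping one field per concern also makes the failure modes below easier to spot: if the action field contains more than one verb phrase, you are probably asking for too much in one clip.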
Common failure modes
- Too many events in five seconds: pick one primary motion.
- Illegible text in-frame: most models still struggle with crisp typography.
- Ambiguous physics: "explodes" or "melts" without style cues often looks messy—add style references ("practical FX," "cartoon squash," "slow-motion liquid").
Step-by-step: submit a task with the Get3W API
- Create an API key in the Get3W dashboard.
- Pick a model id from the model catalog (the same path segment you see in the URL is typically what the API expects).
- POST to https://api.get3w.com/api/v3/{model-id} with JSON parameters for that model.
- Note the task_id (prediction id) from the response.
- Poll GET https://api.get3w.com/api/v3/predictions/{task_id} until status indicates completion, then read outputs for the video URL.
Exact JSON fields differ per model; the model detail page is the source of truth.
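Because the model id typically matches the path segment of the catalog URL, you can derive it programmatically when building tooling. This helper is illustrative and assumes catalog pages live under a /models/ prefix, as the URLs in this guide suggest:

```python
from urllib.parse import urlparse

def model_id_from_url(url):
    """Extract the model id from a Get3W catalog URL.

    Assumes catalog pages live under /models/<model-id>, e.g.
    https://get3w.com/models/kling/kling-3/text-to-video — an assumption
    based on the pattern described above, not a documented guarantee.
    """
    path = urlparse(url).path.strip("/")
    prefix = "models/"
    if not path.startswith(prefix):
        raise ValueError(f"not a catalog URL: {url}")
    return path[len(prefix):]
```

For example, `model_id_from_url("https://get3w.com/models/kling/kling-3/text-to-video")` yields `kling/kling-3/text-to-video`, the same id used in the example below.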
Python example (submit + poll)
```python
import os
import time

import requests

API_KEY = os.environ["GET3W_API_KEY"]
BASE = "https://api.get3w.com/api/v3"

# Example: Kling 3 text-to-video — replace with any model id from https://get3w.com/models
MODEL_ID = "kling/kling-3/text-to-video"

headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
}

body = {
    "prompt": (
        "Single continuous shot, slow dolly-in: a ceramic coffee cup on a walnut table, "
        "morning window light, soft steam rising, shallow depth of field, 35mm photographic look"
    ),
}

# Submit the task
r = requests.post(f"{BASE}/{MODEL_ID}", headers=headers, json=body)
r.raise_for_status()
task = r.json()["data"]
task_id = task["id"]
print("Submitted:", task_id)

# Poll until the task reaches a terminal status
while True:
    s = requests.get(f"{BASE}/predictions/{task_id}", headers=headers)
    s.raise_for_status()
    data = s.json()["data"]
    status = data["status"]
    print("Status:", status)
    if status in ("succeeded", "failed", "canceled"):
        print(data.get("outputs"))
        break
    time.sleep(2)
```
Set GET3W_API_KEY in your environment before running. If you prefer fewer lines of code, the official get3w Python SDK wraps the same flow—see the Python SDK documentation.
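For longer renders you may want the poll loop wrapped with a timeout and gentle backoff rather than a fixed two-second sleep. The sketch below accepts any status-fetching callable so it can be tested without the network; the terminal status names mirror the example above and remain an assumption about the API:

```python
import time

TERMINAL = ("succeeded", "failed", "canceled")

def wait_for_task(fetch_status, timeout=600, initial_delay=2, max_delay=30):
    """Poll fetch_status() until a terminal status or the timeout elapses.

    fetch_status should return the prediction's data dict — the same
    shape read from s.json()["data"] in the example above (an assumed
    response shape; verify against the model's API docs).
    """
    deadline = time.monotonic() + timeout
    delay = initial_delay
    while True:
        data = fetch_status()
        if data["status"] in TERMINAL:
            return data
        if time.monotonic() > deadline:
            raise TimeoutError("task did not finish in time")
        time.sleep(delay)
        delay = min(delay * 2, max_delay)  # exponential backoff, capped
```

In the walkthrough above you would call it as `wait_for_task(lambda: requests.get(f"{BASE}/predictions/{task_id}", headers=headers).json()["data"])`.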
Tips for reliably better results
- Prototype short, then increase duration or resolution once the motion looks right.
- Reuse seeds when a shot is almost perfect and you only want a small tweak (if the model exposes seed).
- Match prompt to mode: image-to-video prompts should emphasize motion, not re-describe the whole frame.
- Use webhooks for production pipelines so you are not holding open HTTP connections on long renders (see Get3W webhook docs).
- Budget for iteration: professional-looking clips are usually the second or fifth generation, not the first.
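If you take the webhook route, your endpoint receives the task result as a JSON body. The exact payload shape is defined in the Get3W webhook docs; the field names in this parsing sketch (data.id, data.status, data.outputs) are assumptions that simply mirror the polling response shown earlier:

```python
import json

def parse_webhook_payload(raw_body: bytes):
    """Decode a webhook delivery into (task_id, status, outputs).

    Field names are assumed to match the polling response shown
    earlier; confirm against the Get3W webhook documentation before
    relying on them.
    """
    payload = json.loads(raw_body)
    data = payload.get("data", payload)  # tolerate a flat payload too
    return data["id"], data["status"], data.get("outputs")

task_id, status, outputs = parse_webhook_payload(
    b'{"data": {"id": "abc123", "status": "succeeded", '
    b'"outputs": ["https://example.com/video.mp4"]}}'
)
```

Whatever framework serves the endpoint, keep the handler fast — acknowledge the delivery, store the result, and do any downstream work (downloading, transcoding) outside the request cycle.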
Next steps
Open the Get3W model catalog, filter by video run types, and pin two models: one tuned for speed (faster iteration) and one tuned for final quality. Run the same prompt through both; you will quickly learn which family matches your brand’s look.