ProductMay 8, 2026

Qingwave API: one gateway, every model

Qingwave API aggregates 500+ AI models behind one endpoint — OpenAI-compatible, billed by usage, with the routing and ledger work already done.

Qingwave API is our unified AI model gateway. One endpoint, one API key, and you can call GPT-4o, Claude, Gemini, FLUX, Sora, and 500+ other models without re-doing auth, routing, billing, and retry logic for each provider.

It's the infrastructure layer Qingwave Studio runs on. It's also a standalone product — you can build your own AI application on top.

What's actually in it

Three things, working together:

A model catalog of 500+ entries spanning text, image, video, and audio. Each model has a stable ID, a documented input shape, and per-call pricing.
A request router that takes your call, picks the right upstream provider, handles auth on your behalf, and streams the response back.
A usage ledger that records every call with model, tokens / pixels / seconds, cost, and timestamp. You can pull it as a CSV or query it from the dashboard.

The catalog page (Explore) is where you browse what's available. Each model card shows the price tier, supported features, and a one-click "try it" interface.

OpenAI-compatible by default

If you have code that talks to OpenAI's chat completions API, it talks to Qingwave API by changing one thing — the base URL. Same request shape, same response shape, same streaming protocol. Switch from gpt-4o to claude-opus-4-1 by changing the model field. No SDK rewrite.

We also speak the Anthropic Messages API natively and the Gemini generateContent API natively, for code already structured around those.

One API key, multiple protocols. Build once, switch models freely.

Routing and fallback

Most platforms have one upstream per model. We route per call across multiple upstreams when a model has more than one viable provider. If gpt-4o is slow at the official endpoint, the gateway tries an alternate channel before timing out. You see the same model output; the upstream selection is invisible.

This matters in production. Single-upstream services break when their upstream breaks. Multi-channel routing is how a model layer stays up while individual providers don't.

Billing, the same way as everywhere else

Qingwave API runs on the API sub-wallet in your Qingwave account. Top up the master wallet, buy an API plan, plan credits drop into the sub-wallet. Every model call debits the sub-wallet at the published per-token / per-image / per-second rate. The ledger logs every charge with its source.

The exact rationale for the two-layer structure is in Two-layer wallet, on purpose. The short version: usage and payment are different ledgers, and we don't pretend they're the same number.

Async tasks, callbacks, retries

Long-running generations (image, video) submit as tasks and complete asynchronously. You either poll GET /v1/tasks/{task_id} or hand us a callback_url and we POST back when the task is done.

Failed calls don't bill. Retries against transient upstream errors are automatic up to the gateway's policy; persistent failures are surfaced to you with the upstream error message intact, so you can debug.

Who builds on it

Three patterns we see:

Solo builders — one developer assembling a product around a couple of models. The OpenAI compatibility means most off-the-shelf code works.
Agencies — agencies bundling AI features into client work. They use Qingwave API as the model layer and the partner program to get part of their client billing back as commission.
Other AI products — including our own Qingwave Studio. The same gateway powers Studio's shot generation. When we improve routing, both Studio users and external builders benefit.

How to start

The model catalog and pricing are public — open qingbo.dev (no login required for browsing). To make calls, sign in to your Qingwave account, top up the master wallet, and grab an API key from the Console.

Daily free quota covers small experiments. Pay-as-you-go after that, with the plan tier only relevant if you want bulk credits at a discount.