Rate Limits
Request-rate limiting and concurrent task admission rules for the public API.
Request-rate limiting
The public API applies per-IP request-rate limits enforced at the edge via Google Cloud Armor. The current buckets are:
| Bucket | Limit | Applies to |
|---|---|---|
| General | 300 requests/min | Most API requests |
| Upload | 300 requests/min | POST /media/upload |
| Account creation | 15 requests/hour | POST /auth/temporary |
When the rate limit is exceeded, the API returns a 429 Too Many Requests response.
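A client can treat a 429 as a signal to slow down and retry. The following is a minimal sketch of one way to do that, not the API's prescribed behavior. The helper `call_with_backoff`, its `send` callback, and its parameters are hypothetical names introduced for illustration. The sketch also assumes the response may carry a `Retry-After` header in seconds, which this page does not confirm. When the header is absent it falls back to exponential backoff with jitter.

```python
import random
import time
from typing import Callable, Mapping, Tuple

# (status code, response headers). Hypothetical shape used only for this sketch.
Response = Tuple[int, Mapping[str, str]]


def call_with_backoff(
    send: Callable[[], Response],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send() and retry while it returns 429, up to max_attempts calls."""
    for attempt in range(max_attempts):
        status, headers = send()
        if status != 429:
            return status, headers
        if attempt == max_attempts - 1:
            break  # out of attempts; hand the 429 back to the caller

        # Assumption: the server may send Retry-After in whole seconds.
        # The lookup is case-sensitive because headers is a plain mapping here.
        retry_after = headers.get("Retry-After")
        if retry_after is not None and retry_after.isdigit():
            delay = float(retry_after)
        else:
            # Exponential backoff capped at max_delay, with full jitter so
            # that many clients do not retry at the same instant.
            delay = random.uniform(0, min(max_delay, base_delay * 2 ** attempt))
        sleep(delay)
    return status, headers
```

In real use, `send` would wrap the actual HTTP call and return its status code and headers. Passing `sleep` as a parameter keeps the helper easy to test without waiting.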
Concurrent task admission
Some heavy routes are not subject to request-rate limiting. Instead, they are admitted through a separate per-account concurrency and queue policy. This policy governs blueprint runs and direct generation and editing workloads.
| Plan | Concurrent limit | Queue limit |
|---|---|---|
| Free | 2 | 25 |
| Basic | 5 | 150 |
| Wonda | 25 | 100 |
| Pro | 15 | 500 |
| Absolute | 50 | 2000 |
When the queue limit is exceeded, the API rejects the request with a concurrency error that includes the active and queued counts. When the queue still has space, the request may be accepted and held in the queue, then started later rather than immediately.
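The admission rule can be modelled roughly as follows. This is an illustrative sketch, not the service's implementation: the function `admit` and the `Admission` type are hypothetical. It assumes that a request starts immediately whenever a concurrent slot is free, and that a request is rejected once the queued count has reached the plan's queue limit. The limits are copied from the plan table.

```python
from dataclasses import dataclass

# (concurrent limit, queue limit) per plan, copied from the plan table.
PLAN_LIMITS = {
    "Free": (2, 25),
    "Basic": (5, 150),
    "Wonda": (25, 100),
    "Pro": (15, 500),
    "Absolute": (50, 2000),
}


@dataclass
class Admission:
    outcome: str  # "started", "queued", or "rejected"
    active: int   # active task count after the decision
    queued: int   # queued task count after the decision


def admit(plan: str, active: int, queued: int) -> Admission:
    """Decide what happens to one new task for an account on the given plan."""
    concurrent_limit, queue_limit = PLAN_LIMITS[plan]
    if active < concurrent_limit:
        # Assumption: a free concurrent slot means the task runs right away.
        return Admission("started", active + 1, queued)
    if queued < queue_limit:
        # All slots are busy but the queue has room: accepted, started later.
        return Admission("queued", active, queued + 1)
    # Queue is full: rejected, reporting the current counts unchanged.
    return Admission("rejected", active, queued)
```

For example, under these assumptions a Free account with 2 active tasks and 25 queued tasks would have its next request rejected, while the same account with 2 active and 0 queued would have it queued.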
What this means in practice
- Simple request bursts are governed by the request-rate buckets.
- Long-running generation, editing, and blueprint-run workloads are governed by the concurrency policy instead.
- The two systems are separate. A route can be exempt from request-rate limiting and still be subject to concurrency limits.