Rate limits reference

Four rate-limit tiers — starter, professional, enterprise, custom — with per-minute, per-day, and burst caps. What 429 responses mean, how the sliding window works, and when to upgrade tiers.

Updated May 18, 2026 · 4 min read

Every API key is assigned a rate-limit tier. The tier sets three limits: requests per minute, requests per day, and a burst allowance for short spikes.

The four tiers

Tier — Requests per minute — Requests per day — Burst allowance
starter — 60 — 10,000 — 10
professional — 300 — 100,000 — 50
enterprise — 1,000 — 1,000,000 — 200
custom — configurable — configurable — configurable

starter is the default for new keys. professional and enterprise are upgrades available based on plan. custom is for tenants with negotiated limits — admins can request specific values.

How the limits work

Per-minute limit (sliding window)

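Atender enforces the per-minute limit with a sliding window: at any moment, the number of requests in the trailing 60 seconds must be at or below the per-minute cap (plus the burst allowance described under "Burst allowance"). A starter key's baseline is therefore 60 requests in any rolling 60-second window, not 60 at the start of one calendar minute followed by 60 more at the start of the next.

This prevents the boundary burst that fixed windows allow, where a client sends a full quota at the end of one minute and another full quota at the start of the next, doubling its effective rate for a few seconds.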
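
To make the sliding-window behaviour concrete, here is a minimal sketch of a sliding-window log. It is not Atender's implementation: the class name is hypothetical, and it models only the per-minute cap (the burst allowance is left out).

```python
from collections import deque


class SlidingWindowLimiter:
    """Sliding-window log: admit a request only if fewer than `limit`
    requests were admitted in the trailing `window` seconds."""

    def __init__(self, limit, window=60.0):
        self.limit = limit
        self.window = window
        self._admitted = deque()  # timestamps of admitted requests, oldest first

    def allow(self, now):
        # Drop timestamps that have slid out of the trailing window.
        while self._admitted and self._admitted[0] <= now - self.window:
            self._admitted.popleft()
        if len(self._admitted) >= self.limit:
            return False
        self._admitted.append(now)
        return True


limiter = SlidingWindowLimiter(limit=60)  # starter per-minute cap

# 60 requests in the last second of one calendar minute...
first = [limiter.allow(59.0 + i / 100) for i in range(60)]
# ...then 60 more in the first second of the next minute.
second = [limiter.allow(60.0 + i / 100) for i in range(60)]

print(sum(first), sum(second))  # 60 0
```

A fixed calendar-minute counter would have admitted all 120 requests. The sliding window rejects the second batch because the first 60 are still inside the trailing 60 seconds.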

Burst allowance

The burst allowance permits short spikes above the per-minute cap without rejection. For a starter key (60/min, burst 10), up to 70 requests can land in any 60-second window before the key is throttled. The burst is replenished as the window slides.

Bursts are useful for paginated reads (page 1, page 2, page 3 in quick succession) or batch creates (creating 10 contacts at once). Don’t rely on burst for sustained throughput — over a few minutes, the per-minute cap is what matters.
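
The starter example (60 + 10 = 70) can be simulated with a short sketch. It assumes the burst allowance simply raises the trailing-window ceiling to the per-minute cap plus the burst value. That is a reading of the example above, not a statement of how Atender enforces it, and the function name is hypothetical.

```python
from collections import deque


def admit(timestamps, per_minute, burst, window=60.0):
    """Return one True/False decision per request timestamp (seconds, ascending).

    Assumption: the burst allowance raises the trailing-window ceiling to
    per_minute + burst, matching the starter example (60 + 10 = 70).
    """
    ceiling = per_minute + burst
    admitted = deque()
    decisions = []
    for now in timestamps:
        # Forget admitted requests older than the trailing window.
        while admitted and admitted[0] <= now - window:
            admitted.popleft()
        ok = len(admitted) < ceiling
        if ok:
            admitted.append(now)
        decisions.append(ok)
    return decisions


# 80 requests fired half a second apart against a starter key (60/min, burst 10).
decisions = admit([i * 0.5 for i in range(80)], per_minute=60, burst=10)
print(decisions.count(True), decisions.count(False))  # 70 10
```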

Per-day limit

A hard cap that resets at midnight UTC. A starter key sending steady traffic at its per-minute cap would attempt 60 × 60 × 24 = 86,400 requests in a day, far more than its 10,000 per-day cap, and would exhaust the daily cap in under three hours (10,000 ÷ 60 ≈ 167 minutes). For steady-state, high-volume usage, the per-day limit is the binding constraint.
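
The same arithmetic can be run for every standard tier. The sketch below uses the caps from the tier table; the function names are hypothetical helpers, and `seconds_until_reset` expects a timezone-aware datetime.

```python
from datetime import datetime, timedelta, timezone

# (requests per minute, requests per day) from the tier table.
TIERS = {
    "starter": (60, 10_000),
    "professional": (300, 100_000),
    "enterprise": (1_000, 1_000_000),
}


def minutes_to_daily_cap(per_minute, per_day):
    """Minutes of sustained traffic at the per-minute cap before the daily cap is hit."""
    return per_day / per_minute


def seconds_until_reset(now):
    """Seconds from `now` (timezone-aware) until the next midnight UTC."""
    now = now.astimezone(timezone.utc)
    next_midnight = (now + timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return (next_midnight - now).total_seconds()


for name, (per_minute, per_day) in TIERS.items():
    print(f"{name}: {minutes_to_daily_cap(per_minute, per_day):.0f} min")
# starter: 167 min
# professional: 333 min
# enterprise: 1000 min
```

On every standard tier, a key running flat out at its per-minute cap uses up the daily cap well before the 1,440 minutes in a day have passed.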

What happens when you exceed a limit

The API returns a 429 Too Many Requests response. The response includes:

Retry-After — Seconds until you can retry
X-RateLimit-Limit — Your current per-minute cap
X-RateLimit-Remaining — Requests remaining in the current window
X-RateLimit-Reset — Unix timestamp when the window resets
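
A client can read these four headers into one structure before deciding what to do. This is a minimal sketch: the `RateLimitInfo` type and function names are hypothetical, the sample values are made up, and it assumes each header carries a plain integer.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitInfo:
    limit: Optional[int]        # X-RateLimit-Limit
    remaining: Optional[int]    # X-RateLimit-Remaining
    reset_at: Optional[int]     # X-RateLimit-Reset (Unix timestamp)
    retry_after: Optional[int]  # Retry-After (seconds)


def _as_int(headers: Mapping[str, str], name: str) -> Optional[int]:
    # Header names are case-insensitive, so compare in lowercase.
    for key, value in headers.items():
        if key.lower() == name.lower():
            try:
                return int(value)
            except ValueError:
                return None
    return None


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitInfo:
    return RateLimitInfo(
        limit=_as_int(headers, "X-RateLimit-Limit"),
        remaining=_as_int(headers, "X-RateLimit-Remaining"),
        reset_at=_as_int(headers, "X-RateLimit-Reset"),
        retry_after=_as_int(headers, "Retry-After"),
    )


info = parse_rate_limit_headers({
    "Retry-After": "12",
    "X-RateLimit-Limit": "60",
    "X-RateLimit-Remaining": "0",
    "X-RateLimit-Reset": "1779062400",
})
print(info.retry_after, info.remaining)  # 12 0
```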

Your client should:

Parse Retry-After and wait that many seconds
Retry the request

Most HTTP libraries support automatic retry with backoff for 429s. For an example, see the doc-as-code script: KB Articles/_scripts/push_kb.py uses urllib3.Retry with respect_retry_after_header=True.
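
If your HTTP library does not handle this for you, the two steps above can be hand-rolled. The sketch below is not the code in push_kb.py. It assumes a hypothetical `send` callable that performs one request and returns a `(status, headers, body)` tuple, and it assumes Retry-After is a whole number of seconds.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], bytes]  # (status, headers, body)


def call_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 5,
    default_wait: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call `send` until it returns something other than 429 or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status != 429 or attempt == max_attempts:
            return status, headers, body
        # Wait for the number of seconds the server asked for; fall back to
        # default_wait if the header is missing or not a plain integer.
        try:
            wait = float(int(headers.get("Retry-After", "")))
        except ValueError:
            wait = default_wait
        sleep(wait)
    raise ValueError("max_attempts must be at least 1")


# Fake transport: two 429s, then success. Sleeps are recorded, not performed.
replies = iter([
    (429, {"Retry-After": "2"}, b""),
    (429, {"Retry-After": "1"}, b""),
    (200, {}, b"ok"),
])
waits = []
status, _, body = call_with_retry(lambda: next(replies), sleep=waits.append)
print(status, body, waits)  # 200 b'ok' [2.0, 1.0]
```

Capping the number of attempts keeps a client from looping forever once it has hit the per-day cap, where waiting a few seconds will not help.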

Picking the right tier

One-off scripts (KB sync, occasional bulk imports) — starter
Daily integrations with steady volume — professional
Real-time integrations (chat sync, live dashboards) — enterprise
Anything outside these patterns — custom (talk to your account manager)

Move up a tier when you’re consistently hitting 429s — it’s the visible signal that the limit isn’t the right fit for your traffic.
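
If you have traffic measurements, a rough numeric check can complement the pattern-based guidance. The sketch below picks the smallest standard tier whose caps cover a measured peak minute and daily total. The function name and the idea of sizing purely by these two numbers are assumptions, not an official sizing rule.

```python
# (name, requests per minute, requests per day), smallest tier first.
TIERS = [
    ("starter", 60, 10_000),
    ("professional", 300, 100_000),
    ("enterprise", 1_000, 1_000_000),
]


def smallest_fitting_tier(peak_per_minute, per_day):
    """Return the first standard tier whose caps cover the measured traffic."""
    for name, minute_cap, day_cap in TIERS:
        if peak_per_minute <= minute_cap and per_day <= day_cap:
            return name
    return "custom"  # nothing standard fits; negotiated limits are needed


print(smallest_fitting_tier(40, 8_000))       # starter
print(smallest_fitting_tier(40, 50_000))      # professional
print(smallest_fitting_tier(2_500, 900_000))  # custom
```

The second call shows why the daily total matters: a peak of 40 requests per minute fits starter, but 50,000 requests per day does not.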

Per-key vs per-tenant

Limits are per-key, not per-tenant. Two keys on a starter tier each get 60 requests per minute, totalling 120/min for the tenant. This means:

Splitting traffic across keys does increase total throughput
But it also makes monitoring fragmented (you’d track each key separately)
The cleaner approach is one key per logical integration, on whatever tier suits that integration
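
Because limits are per-key, a tenant's combined per-minute ceiling is the sum over its keys. A small sketch, using a made-up key inventory with one key per integration:

```python
PER_MINUTE = {"starter": 60, "professional": 300, "enterprise": 1_000}

# Hypothetical key inventory: one key per logical integration.
keys = {
    "kb-sync": "starter",
    "crm-import": "starter",
    "chat-sync": "enterprise",
}

tenant_total = sum(PER_MINUTE[tier] for tier in keys.values())
print(tenant_total)  # 1120
```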

Endpoints not subject to rate limits

A few endpoints are not governed by the per-key tiers and do not return the standard rate-limit headers:

Health checks (/healthz)
Public KB read endpoints (/api/public/kb/:tenantId/*) — these have separate per-IP limits
Webhook delivery callbacks (Atender → your system) — those use different mechanics

For everything else under /api/v1/*, the rate-limit tier on the calling key applies.
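
A client or proxy that wants to know which mechanics apply to a request path could classify it as in the sketch below. The function name and return labels are hypothetical, and /api/v1/contacts is only an illustrative path.

```python
def limit_scope(path):
    """Classify a request path by which rate-limit mechanics apply to it."""
    if path == "/healthz":
        return "exempt"    # health check
    if path.startswith("/api/public/kb/"):
        return "per-ip"    # public KB reads have separate per-IP limits
    if path.startswith("/api/v1/"):
        return "key-tier"  # the calling key's tier applies
    return "unknown"       # not covered by this article


print(limit_scope("/api/v1/contacts"))         # key-tier
print(limit_scope("/api/public/kb/42/intro"))  # per-ip
```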