# Rate Limits
The Customer API applies per-customer rate limits to keep the service stable and fair for every user. Limits are tied to your Customer Code, not your IP address — calls from any client using your credentials count towards the same bucket.
## Default limits
| Window | Limit |
|---|---|
| Per minute | 60 requests |
| Per hour | 1,000 requests |
Both limits apply simultaneously — a burst that exceeds 60 requests inside a minute will be throttled, even if you are well below the hourly limit, and vice versa.
If your integration legitimately needs higher limits, contact your account manager.
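Because both windows apply at once, a client-side guard has to track two sliding windows, not one. The sketch below is a minimal illustration of that idea, assuming the default limits of 60/minute and 1,000/hour; it is not an official SDK component.

```python
import time
from collections import deque

class DualWindowThrottle:
    """Client-side guard that checks both the per-minute and the
    per-hour window before a request is sent. Limits default to the
    documented values (60/minute, 1,000/hour)."""

    def __init__(self, per_minute=60, per_hour=1000):
        # Each entry: (window length in seconds, limit, sent timestamps).
        self.windows = [(60.0, per_minute, deque()),
                        (3600.0, per_hour, deque())]

    def try_acquire(self, now=None):
        """Return True and record a send, or False if either window is full."""
        now = time.monotonic() if now is None else now
        # Drop timestamps that have aged out of each window.
        for length, _, stamps in self.windows:
            while stamps and now - stamps[0] >= length:
                stamps.popleft()
        # Both windows must have headroom, mirroring the dual limit.
        if any(len(stamps) >= limit for _, limit, stamps in self.windows):
            return False
        for _, _, stamps in self.windows:
            stamps.append(now)
        return True
```

Call `try_acquire()` before each request and delay the request whenever it returns `False`; the server-side count remains authoritative, so still handle `429` responses as described below.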
## What counts as a request?
Every HTTP request you make is counted. Successful, rate-limited, and unauthenticated requests all count towards your quota.
The unauthenticated endpoints `/health` and `/api/version` are also
metered — when called without an `Authorization` header they are
bucketed under a shared anonymous identifier, and when called with a
valid token they count against the calling customer's bucket.
## What happens when you hit the limit

The API responds with:

```http
HTTP/1.1 429 Too Many Requests
Retry-After: 35
X-Rate-Limit-Limit: 1m
X-Rate-Limit-Remaining: 0
X-Rate-Limit-Reset: 2026-05-14T10:32:00Z
Content-Type: application/json

{ "message": "API calls quota exceeded! maximum admitted 60 per 1m." }
```
| Header | Meaning |
|---|---|
| `Retry-After` | The number of seconds to wait before retrying. |
| `X-Rate-Limit-Limit` | The window the limit applies to (e.g. `1m`, `1h`). |
| `X-Rate-Limit-Remaining` | How many calls are left in the current window. |
| `X-Rate-Limit-Reset` | When the window resets, as an ISO 8601 UTC timestamp. |
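The headers above are plain strings, so a client typically normalises them once into usable types. A minimal sketch, assuming the header names and formats shown in the example response (the helper name is ours, not part of the API):

```python
from datetime import datetime, timezone

def parse_rate_limit_headers(headers):
    """Extract throttling hints from a 429 response.
    `headers` is a plain dict of header name -> string value."""
    return {
        # Seconds to wait before retrying; default to 1 if absent.
        "retry_after_s": int(headers.get("Retry-After", "1")),
        # Which window was exceeded, e.g. "1m" or "1h".
        "window": headers.get("X-Rate-Limit-Limit"),
        # Calls left in the current window (0 on a 429).
        "remaining": int(headers.get("X-Rate-Limit-Remaining", "0")),
        # ISO 8601 UTC timestamp of the window reset.
        "reset_at": datetime.strptime(
            headers["X-Rate-Limit-Reset"], "%Y-%m-%dT%H:%M:%SZ"
        ).replace(tzinfo=timezone.utc),
    }
```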
## Handling 429 in your client

The recommended pattern is:

- Read the `Retry-After` header. Wait at least that many seconds.
- Use exponential back-off for repeated failures. A common pattern is to start at the `Retry-After` value, then double the wait time on each subsequent failure, up to a sensible ceiling.
- Log the event. Repeated `429` responses usually indicate a bug in your scheduling — for example, multiple instances polling the same endpoint concurrently.
- Do not retry immediately. Retrying without waiting will continue to fail and burn through what would otherwise be your next window's quota.
A simple pseudocode example:

```text
response = http.GET("/products", headers={Authorization: ...})
if response.status == 429:
    wait_seconds = int(response.headers["Retry-After"])
    sleep(wait_seconds + 1)
    retry()
```
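The exponential back-off described above can be sketched in Python. This is an illustrative helper, not part of any official client: the `fetch` callable and its `(status, headers, body)` return shape are assumptions for the example.

```python
import time

def get_with_backoff(fetch, max_retries=5, ceiling_s=300, sleeper=time.sleep):
    """Retry a request, doubling the wait after each consecutive 429
    up to `ceiling_s`. The first wait honours Retry-After when the
    server supplies it; `fetch` returns (status, headers, body)."""
    wait = None
    for _ in range(max_retries + 1):
        status, headers, body = fetch()
        if status != 429:
            return status, headers, body
        if wait is None:
            # First failure: start from the server's hint.
            wait = int(headers.get("Retry-After", "1"))
        else:
            # Subsequent failures: double, but cap at the ceiling.
            wait = min(wait * 2, ceiling_s)
        sleeper(wait)
    raise RuntimeError("still rate-limited after %d retries" % max_retries)
```

The injectable `sleeper` keeps the helper testable; in production the default `time.sleep` applies.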
## Designing a polling strategy
Most integrations don't need to be anywhere near the rate limit. A sensible schedule looks like:
| Data | Frequency | Calls per hour |
|---|---|---|
| Products | Once a day | Fewer than 1 |
| Stock | Every 15 minutes | 4 |
| Prices | Once a day or on demand | Fewer than 1 |
That's well inside the default quota. Even at every 5 minutes for stock and every hour for products, you would consume only about 13 requests per hour out of 1,000.
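The arithmetic behind that estimate is simple enough to sanity-check in code. A small sketch (the function name and the interval-in-minutes convention are ours, chosen for the example):

```python
def hourly_call_budget(schedule):
    """Total calls per hour for a polling schedule.
    `schedule` maps endpoint -> polling interval in minutes."""
    return sum(60 / interval_min for interval_min in schedule.values())

# Stock every 5 minutes plus products every hour:
# 12 + 1 = 13 calls per hour, against a 1,000/hour quota.
```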
## Common patterns to avoid

- Polling per SKU. There is no per-SKU customer endpoint — fetch `/products` or `/stock` once and split the response client-side.
- Concurrent fan-out. If your client launches many parallel calls to the same endpoint, you'll hit the per-minute limit quickly. Serial polling on a schedule is almost always sufficient.
- Hammering on failure. A `429` or `500` should be followed by a pause, not an immediate retry.
## Public endpoints

The following endpoints do not require authentication:

- `GET /health` — liveness probe.
- `GET /api/version` — current API profile.
They are still rate-limited as described above. A polling interval of
30–60 seconds is sensible for monitoring uses; tighter intervals risk
sharing the anonymous bucket with other callers. If you need to poll
these endpoints frequently from a process that has API credentials,
send the `Authorization` header — your requests will then be counted
against your own bucket rather than the shared anonymous one.
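Attaching the header is a one-liner in most HTTP clients. A minimal sketch using the standard library (the base URL and bearer-token format are illustrative assumptions; check your credentials documentation for the exact scheme):

```python
import urllib.request

def health_request(base_url, token=None):
    """Build a GET request for /health. Supplying a token moves the
    call from the shared anonymous bucket into your own bucket."""
    req = urllib.request.Request(base_url.rstrip("/") + "/health", method="GET")
    if token:
        # Bearer format is an assumption for this example.
        req.add_header("Authorization", "Bearer " + token)
    return req
```

Pass the result to `urllib.request.urlopen` (or translate the same idea into your HTTP client of choice).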