Rate Limiting and How It Affects API Calls

Learn what rate limiting is, what 429 Too Many Requests means, how Retry-After works, and how to handle API limits safely with backoff and caching.

By CheckDomainHealth Editorial Team | Reviewed by Dionis Ceban | Updated Jun 28, 2026 | 10 min read | Beginner

Introduction

Rate limiting controls how many requests a client, user, IP address, API key or application can make within a specific time period. APIs use rate limits to protect availability, prevent abuse, control costs and keep services stable for all users.

When a client sends too many requests too quickly, the API may slow responses, reject requests or return a 429 Too Many Requests status. Understanding rate limits helps prevent broken integrations, failed forms, delayed dashboards, blocked automation and unnecessary server load.

Quick answer

Rate limiting restricts request volume over time. If an API returns 429 Too Many Requests, the client should slow down, respect Retry-After headers, use exponential backoff, avoid retry loops, cache responses where possible and reduce unnecessary API calls.

Rate limiting

Rate limiting is a control that limits how many requests are allowed during a defined time window.

A limit may apply to:

  • IP address
  • API key
  • user account
  • session
  • route or endpoint
  • domain
  • token
  • organization
  • server region
  • application

Rate limiting is not always an error. It is often an intentional protection mechanism.

Why APIs use limits

APIs use rate limits to keep services stable and fair.

Common reasons:

  • prevent overload
  • stop abusive traffic
  • reduce scraping
  • protect expensive endpoints
  • control infrastructure cost
  • prevent accidental request loops
  • protect login and checkout flows
  • limit brute-force attempts
  • keep service fair for all users
  • enforce plan-based quotas

Without rate limits, one client or script can consume too many resources and affect everyone else.

Limit types

Fixed window

Allows a set number of requests during a fixed period, such as 100 requests per minute.
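
A fixed window can be sketched as a counter that resets whenever a new window begins. This is a minimal single-process illustration, not a production limiter; the class name `FixedWindowLimiter` and the injectable `clock` parameter are assumptions made for the example.

```python
import time


class FixedWindowLimiter:
    """Allow at most `limit` requests in each fixed window of `window_seconds`."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the behaviour can be checked without sleeping
        self.window_index = None
        self.count = 0

    def allow(self):
        # Identify the current window by integer division of the clock value.
        index = int(self.clock() // self.window_seconds)
        if index != self.window_index:
            # A new window has started, so the counter resets.
            self.window_index = index
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False
```

Because the counter resets at the window boundary, a client can send a full burst at the end of one window and another at the start of the next.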

Sliding window

Tracks requests over a rolling time period for smoother control.
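
One way to implement a rolling period is to keep the timestamps of recent requests and discard those older than the window. The sketch below assumes a single process and uses hypothetical names (`SlidingWindowLimiter`, `clock`).

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests in any rolling span of `window_seconds`."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        self.timestamps = deque()  # times of the requests that were allowed

    def allow(self):
        now = self.clock()
        # Drop timestamps that have fallen out of the rolling window.
        while self.timestamps and now - self.timestamps[0] >= self.window_seconds:
            self.timestamps.popleft()
        if len(self.timestamps) < self.limit:
            self.timestamps.append(now)
            return True
        return False
```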

Token bucket

Requests consume tokens. Tokens refill over time.
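
A token bucket can be sketched in a few lines: the bucket starts full, each request spends a token, and tokens are added back in proportion to elapsed time up to the capacity. The names `TokenBucket`, `refill_per_second` and `clock` are illustrative assumptions.

```python
import time


class TokenBucket:
    """Permit bursts up to `capacity`, refilled at `refill_per_second`."""

    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.clock = clock
        self.tokens = float(capacity)  # start with a full bucket
        self.updated = clock()

    def allow(self, cost=1):
        now = self.clock()
        # Add the tokens earned since the last call, never exceeding capacity.
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```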

Leaky bucket

Requests are processed at a steady rate, smoothing bursts.
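
The sketch below illustrates the queue-style variant of a leaky bucket: each request is assigned the next free processing slot, and requests are rejected once too many are already waiting. The class name and the `max_queue` parameter are assumptions made for the example.

```python
import time


class LeakyBucket:
    """Space requests evenly at `rate_per_second`, with at most `max_queue` waiting."""

    def __init__(self, rate_per_second, max_queue, clock=time.monotonic):
        self.interval = 1.0 / rate_per_second
        self.max_queue = max_queue
        self.clock = clock
        self.next_slot = 0.0  # earliest time the next request may be processed

    def schedule(self):
        """Return the seconds to wait before sending, or None if the queue is full."""
        now = self.clock()
        slot = max(now, self.next_slot)
        delay = slot - now
        if delay > self.max_queue * self.interval:
            return None  # too many requests already waiting
        self.next_slot = slot + self.interval
        return delay
```

A caller would sleep for the returned delay before sending, so a burst of requests leaves the client at a steady pace.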

Concurrent request limit

Limits how many requests can run at the same time.
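
On the client side, a concurrency cap can be approximated with a semaphore. The following sketch assumes a threaded client; `ConcurrencyLimiter` and `fake_api_call` are hypothetical stand-ins, and the latter replaces a real HTTP request.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class ConcurrencyLimiter:
    """Allow at most `max_in_flight` calls to run at the same time."""

    def __init__(self, max_in_flight):
        self._slots = threading.BoundedSemaphore(max_in_flight)
        self._lock = threading.Lock()
        self.in_flight = 0
        self.peak = 0  # highest concurrency observed, useful for monitoring

    def run(self, func, *args):
        with self._slots:  # blocks while all slots are taken
            with self._lock:
                self.in_flight += 1
                self.peak = max(self.peak, self.in_flight)
            try:
                return func(*args)
            finally:
                with self._lock:
                    self.in_flight -= 1


def fake_api_call(n):
    # Stand-in for a real HTTP request.
    time.sleep(0.01)
    return n * 2


limiter = ConcurrencyLimiter(max_in_flight=3)
with ThreadPoolExecutor(max_workers=10) as pool:
    results = list(pool.map(lambda n: limiter.run(fake_api_call, n), range(10)))
```

Ten worker threads are started, but the limiter keeps no more than three calls in flight at once.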

Daily or monthly quota

Limits total usage over a longer billing or plan period.

Endpoint-specific limit

Certain routes have stricter limits because they are expensive or sensitive.

429 responses

A 429 status means the client has sent too many requests in a given time period.

The response may include headers such as:

  • Retry-After
  • X-RateLimit-Limit
  • X-RateLimit-Remaining
  • X-RateLimit-Reset
  • RateLimit-Limit
  • RateLimit-Remaining
  • RateLimit-Reset

A 429 response does not by itself mean the API is down. It usually means the client needs to slow down or wait until the limit resets.
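
A client can read these headers defensively, since providers differ in which family they send. The sketch below assumes plain integer values; some providers attach extra parameters, and the reset value may be either seconds remaining or a Unix timestamp, so check the provider's documentation. The function name `read_rate_limit` is hypothetical.

```python
def read_rate_limit(headers):
    """Return (limit, remaining, reset) as ints, or None where a value is absent."""
    # Header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}

    def pick(suffix):
        # Prefer the unprefixed name, then fall back to the X- prefixed one.
        for name in ("ratelimit-" + suffix, "x-ratelimit-" + suffix):
            value = lowered.get(name, "").strip()
            if value.isdigit():
                return int(value)
        return None

    return pick("limit"), pick("remaining"), pick("reset")
```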

Retry-After

When an API sends a Retry-After header, the client should wait at least that long before trying again. The value is either a number of seconds or an HTTP date.

Good retry behavior:

  • respect Retry-After
  • use exponential backoff
  • add jitter to avoid many clients retrying at once
  • limit maximum retries
  • avoid retrying non-idempotent actions blindly
  • log repeated 429 responses
  • alert when rate limits are repeatedly reached

Bad retry behavior can make rate limiting worse. A script that retries immediately can create a request storm.
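
Because Retry-After can carry either a number of seconds or an HTTP date, a client needs to handle both forms. This is a minimal sketch using only the Python standard library; the function name `retry_after_seconds` and the `now` parameter are assumptions made for the example.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(value, now=None):
    """Convert a Retry-After value (seconds or HTTP date) to seconds to wait."""
    value = value.strip()
    if value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None  # unparseable: fall back to the client's own backoff
    if when.tzinfo is None:
        # Treat a date without zone information as UTC.
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # A date in the past means there is no need to wait.
    return max(0.0, (when - now).total_seconds())
```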

Why this matters

Rate limiting matters because it affects reliability. A website or app may look broken if API calls are blocked, delayed or retried too aggressively. Forms may fail, dashboards may load slowly, integrations may miss updates and automation may stop working.

Good rate-limit handling protects both the API provider and the client application.

How to check limits

Use Website Status Checker, HTTP Header Checker, application logs and API responses to identify rate-limit behavior.

Check:

  1. HTTP status — look for 429 Too Many Requests.
  2. Response headers — check Retry-After and rate-limit headers.
  3. API key or IP usage — confirm which identity is being limited.
  4. Endpoint — check whether only one route has strict limits.
  5. Request frequency — look for bursts, loops or repeated polling.
  6. Retry behavior — confirm the client is not retrying immediately.
  7. Logs — review server, API gateway, CDN or application logs.
  8. User impact — identify which forms, dashboards, jobs or integrations fail.

Check API response headers

Use HTTP Header Checker to inspect status codes, Retry-After and rate-limit headers on API responses.

Useful checks

Check response headers only (curl -I sends a HEAD request, which some APIs reject or handle differently from GET):
curl -I https://api.example.com/status

Send a sample GET request and show the response headers along with the body:
curl -i https://api.example.com/status

Look for:
HTTP/1.1 429 (or HTTP/2 429)
Retry-After
RateLimit-Limit
RateLimit-Remaining
RateLimit-Reset
X-RateLimit-Limit
X-RateLimit-Remaining
X-RateLimit-Reset

Check repeated timing carefully:
Do not run aggressive loops against APIs you do not own.

Avoid load testing or repeated requests against third-party APIs without permission. Use provider dashboards, logs and documented rate-limit headers.

Common problems

API returns 429

Medium

The client exceeded the allowed request rate.

Next step: Respect Retry-After, reduce request frequency and review rate-limit headers.

Retry loop creates more traffic

High

The client retries immediately after failure and makes the limit worse.

Next step: Add exponential backoff, jitter and retry limits.

Too much polling

Medium

The application repeatedly checks for updates when no change has happened.

Next step: Increase polling interval, use webhooks or cache results.

No client-side throttling

Medium

The app sends requests as fast as users or scripts trigger them.

Next step: Add client-side queueing or request throttling.

Shared API key hits limit

High

Many users or servers use one key and exhaust the quota.

Next step: Separate keys by environment, app or customer where appropriate.

Expensive endpoint overloaded

Medium

A heavy endpoint has stricter limits or slower responses.

Next step: Reduce calls, paginate properly and cache results.

Concurrent requests too high

Medium

Too many requests run at the same time.

Next step: Limit concurrency and queue background jobs.

Missing caching

Medium

The same data is requested repeatedly.

Next step: Cache stable responses and avoid duplicate calls.

Rate limit differs by plan

Low

The account plan does not allow the needed request volume.

Next step: Review API plan, quota and usage pattern.

CDN/WAF rate limit blocks traffic

Medium

Security rules block repeated requests before they reach the application.

Next step: Review CDN/WAF logs and tune rules carefully.

How to handle limits

  1. Identify the limit

    Find whether the limit applies to IP, API key, user, route, token or account.

  2. Read response headers

    Use Retry-After and rate-limit headers to understand when requests can resume.

  3. Reduce request volume

    Remove duplicate calls, reduce polling and batch requests where possible.

  4. Add caching

    Cache stable API responses instead of requesting the same data repeatedly.

  5. Use backoff

    Retry gradually with exponential backoff and jitter.

  6. Limit concurrency

    Queue background jobs and avoid sending too many requests at once.

  7. Use webhooks where possible

    Replace frequent polling with event-based updates.

  8. Monitor usage

    Track request volume, 429 responses, quota usage and affected endpoints.

  9. Upgrade quota only if needed

    If usage is legitimate and optimized, consider a higher API plan or provider limit.
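
The caching step above can be sketched as a small time-to-live cache placed in front of the API call. This is an in-memory, single-process illustration; `TTLCache`, `get_or_fetch` and the `fetch` callable are hypothetical names, and a suitable lifetime depends on how quickly the data changes.

```python
import time


class TTLCache:
    """Reuse a fetched value until it is older than `ttl_seconds`."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get_or_fetch(self, key, fetch):
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: no API call needed
        value = fetch()  # expired or missing: call the API once
        self._entries[key] = (now + self.ttl_seconds, value)
        return value
```

With a 30 second lifetime, a dashboard that refreshes every 2 seconds would call the API about once per 30 seconds instead of 15 times.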

Rate limiting examples

Example 1: API client receives 429

Response:
HTTP/1.1 429 Too Many Requests
Retry-After: 60
RateLimit-Limit: 100
RateLimit-Remaining: 0
RateLimit-Reset: 60

Meaning:
The client should wait before sending more requests.

Fix:
Pause requests for the suggested time, then resume gradually.

Example 2: Bad retry loop

Problem:
Client sends 50 requests.
API returns 429.
Client retries all 50 immediately.
API returns more 429 responses.

Fix:
Use exponential backoff, jitter and retry limits.

Example 3: Too much polling

Problem:
Dashboard polls every 2 seconds for every user.

Fix:
Increase interval, cache results or use webhooks.

Examples are illustrative. Real headers and limits vary by API provider, gateway, CDN and application.

Retry strategy

A retry strategy should protect reliability without creating more load.

Good retry strategy:

  • retry only safe requests automatically
  • respect Retry-After
  • use exponential backoff
  • add random jitter
  • cap the maximum wait time
  • cap the maximum retry count
  • log repeated failures
  • stop retrying when the request is not safe to repeat
  • alert when rate limits affect users

Example logic: First retry after 2 seconds. Second retry after 4 seconds. Third retry after 8 seconds. Add small random jitter. Stop after a defined limit.

Do not retry payment, order creation or destructive actions blindly unless the API supports idempotency keys.
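
The retry logic described in this section could look roughly like the sketch below. It assumes a hypothetical `send` callable that returns a status code and an optional Retry-After value in seconds; the default delays (2, 4, 8 seconds, up to half a second of jitter, five retries) are illustrative rather than recommended values.

```python
import random
import time


def backoff_delays(base=2.0, factor=2.0, max_delay=60.0, max_retries=5,
                   jitter=0.5, rng=random.random):
    """Yield the wait before each retry: base, base*factor, ... capped, plus jitter."""
    for attempt in range(max_retries):
        delay = min(max_delay, base * factor ** attempt)
        yield delay + rng() * jitter  # jitter spreads out clients that retry together


def call_with_retries(send, sleep=time.sleep, **backoff_options):
    """Call send() until it returns a non-429 status or the retries run out.

    send() is a hypothetical callable returning (status, retry_after_or_None).
    """
    status, retry_after = send()
    for delay in backoff_delays(**backoff_options):
        if status != 429:
            break
        # Wait for whichever is longer: our backoff or the server's Retry-After.
        sleep(max(delay, retry_after or 0.0))
        status, retry_after = send()
    return status
```

The function returns the last status it saw, so the caller can log or alert when the result is still 429 after the final attempt.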

Idempotency

Some API calls are safe to retry, while others can create duplicates.

Usually safer to retry

GET requests, status checks, read-only lookups, idempotent updates, requests with idempotency keys.

Riskier to retry blindly

Payment creation, order creation, account creation, sending emails, sending SMS, deleting data, submitting forms.

For actions that create something, use idempotency keys if the API supports them.
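
The important detail with idempotency keys is that the key is generated once per logical operation and reused on every retry. The sketch below assumes the provider accepts an `Idempotency-Key` request header; the header name varies between providers, and the `send` callable is a hypothetical stand-in for the real HTTP call.

```python
import time
import uuid


def create_with_idempotency_key(send, payload, max_attempts=3, wait=time.sleep):
    """Retry a create-style request while reusing a single idempotency key.

    send(payload, headers) is a hypothetical callable returning an HTTP status.
    """
    # One key per logical operation, generated before the first attempt,
    # so the server can recognise a retry as a duplicate.
    headers = {"Idempotency-Key": str(uuid.uuid4())}
    status = None
    for attempt in range(max_attempts):
        status = send(payload, headers)
        if status != 429:
            break
        if attempt < max_attempts - 1:
            wait(2 ** (attempt + 1))  # 2s, then 4s, between attempts
    return status
```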

Your own API

If you operate an API, rate limiting protects your infrastructure and users.

Consider limits for:

  • login attempts
  • password reset
  • contact forms
  • search endpoints
  • expensive reports
  • public API routes
  • write actions
  • file uploads
  • checkout actions
  • admin endpoints

Good API responses should include:

  • clear 429 status
  • useful error message
  • Retry-After when possible
  • rate-limit headers where appropriate
  • documentation explaining limits

Do not use rate limiting as the only security control. Combine it with authentication, validation, logging and abuse monitoring.
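
As an illustration of the response checklist above, a handler might build its 429 reply along these lines. The sketch is framework-independent and returns a plain tuple; the function name, the JSON field names and the choice to send the same value in Retry-After and RateLimit-Reset are assumptions made for the example.

```python
import json
import math


def too_many_requests(limit, reset_in_seconds):
    """Build (status, headers, body) for a request that exceeded its limit."""
    # Round up so clients never retry slightly too early.
    wait = max(1, math.ceil(reset_in_seconds))
    headers = {
        "Content-Type": "application/json",
        "Retry-After": str(wait),
        "RateLimit-Limit": str(limit),
        "RateLimit-Remaining": "0",
        "RateLimit-Reset": str(wait),
    }
    body = json.dumps({
        "error": "rate_limited",
        "message": f"Too many requests. Retry in {wait} seconds.",
    })
    return 429, headers, body
```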

Rate limiting can happen at different layers.

CDN

Blocks or challenges high-volume traffic before it reaches the origin.

WAF

Limits suspicious patterns, bots or attack-like behavior.

API gateway

Applies plan, key, user or route-based quotas.

Application

Applies business-specific limits, such as form submissions or login attempts.

When investigating 429 errors, identify which layer returned the response.

Frequently asked questions

What is rate limiting?

Rate limiting controls how many requests a client can make in a defined time period.

What does 429 mean?

429 Too Many Requests means the client exceeded the allowed request rate.

Is a 429 response the same as downtime?

No. The API may be working, but the client is sending too many requests.

What is Retry-After?

Retry-After tells the client how long to wait before trying again.

Should I retry immediately after 429?

No. Wait, use backoff and avoid retry storms.

How can I reduce API calls?

Cache responses, batch requests, reduce polling, use webhooks and remove duplicate calls.

Can rate limiting protect my own website?

Yes. It can reduce abuse on login, forms, APIs, search and expensive endpoints.
