🔌

API Rate Limiting with Token Bucket and Quotas

Design and implement a multi-tier rate limiting system with a token bucket algorithm, per-user quotas, and clear error responses.

prompt template
You are a senior API platform engineer. Design a rate limiting system for the following API:

**API type:** [TYPE — e.g., "REST API / GraphQL / gRPC"]
**Framework:** [FRAMEWORK — e.g., "Express.js / FastAPI / Go Chi / Spring Boot"]
**Expected traffic:** [TRAFFIC — e.g., "10K RPM average, 50K RPM peak"]
**User tiers:** [TIERS — e.g., "Free (100 req/min), Pro (1000 req/min), Enterprise (10000 req/min)"]
**Storage:** [STORAGE — e.g., "Redis / in-memory / DynamoDB"]

Implement:

**1. Token Bucket Algorithm**
- Complete implementation with configurable rate and burst size
- In-memory sliding window counter as a fallback when the shared store (e.g., Redis) is unavailable
- Atomic operations to prevent race conditions in distributed setup

**2. Multi-Level Rate Limits**
- Per-IP rate limit (prevent abuse from unauthenticated requests)
- Per-user rate limit (tied to API key or JWT)
- Per-endpoint rate limit (expensive endpoints get lower limits)
- Global rate limit (protect infrastructure from traffic spikes)

**3. Quota System**
- Daily/monthly request quotas per tier
- Quota tracking with efficient counter storage
- Quota reset logic (calendar-based vs rolling window)
- Overage handling (hard block vs soft limit with surcharge)

**4. Response Headers & Error Handling**
- Conventional rate limit headers: X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset
- Retry-After header on 429 responses
- JSON error body with human-readable message and docs link
- Distinct responses for rate limit exceeded vs. quota exceeded

**5. Middleware Implementation**
- Complete middleware code that plugs into the framework specified above
- Configuration file format for per-route overrides
- Bypass mechanism for internal services and health checks
- Unit tests for edge cases (burst, exact limit, concurrent requests)

Output production-ready code with inline documentation.

How to Use This Prompt

  1. Copy the prompt template above
  2. Paste into Claude, ChatGPT, or Cursor
  3. Replace [bracketed placeholders] with your specific project details
  4. Iterate on the AI output to refine and customize the results
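
When iterating on the AI output, it can help to have a small reference point for what a token bucket and a 429 response look like. The following is a minimal single-process sketch, not a production implementation: the names `TokenBucket`, `try_consume`, and `too_many_requests` are hypothetical, the error body fields and docs URL are placeholders, and `X-RateLimit-Reset` is assumed here to mean "seconds until the bucket is full again" (some APIs use an epoch timestamp instead). A distributed setup would need the refill-and-consume step to run atomically in the shared store rather than in local memory.

```python
import math
import time
from typing import Callable, Dict, Tuple


class TokenBucket:
    """In-memory token bucket: refills at `rate` tokens/second up to `capacity`."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate            # sustained requests per second
        self.capacity = capacity    # maximum burst size
        self.clock = clock          # injectable so tests can control time
        self.tokens = capacity      # start full so an initial burst is allowed
        self.updated = clock()

    def _refill(self) -> None:
        # Add tokens for the time elapsed since the last call, capped at capacity.
        now = self.clock()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.updated) * self.rate)
        self.updated = now

    def try_consume(self, cost: float = 1.0) -> Tuple[bool, float]:
        """Return (allowed, retry_after_seconds)."""
        self._refill()
        if self.tokens >= cost:
            self.tokens -= cost
            return True, 0.0
        # Time needed to accumulate the missing tokens at the refill rate.
        return False, (cost - self.tokens) / self.rate


def too_many_requests(bucket: TokenBucket,
                      retry_after: float) -> Tuple[int, Dict[str, str], Dict[str, str]]:
    """Build a (status, headers, body) triple for a rate-limited request."""
    seconds_until_full = (bucket.capacity - bucket.tokens) / bucket.rate
    headers = {
        "X-RateLimit-Limit": str(int(bucket.capacity)),
        "X-RateLimit-Remaining": str(int(bucket.tokens)),
        # Assumption: reset is expressed as a delta in seconds, not an epoch time.
        "X-RateLimit-Reset": str(math.ceil(seconds_until_full)),
        "Retry-After": str(math.ceil(retry_after)),
    }
    body = {
        "error": "rate_limit_exceeded",  # hypothetical error code
        "message": "Too many requests. Retry after the indicated delay.",
        "docs": "https://example.com/docs/rate-limits",  # placeholder URL
    }
    return 429, headers, body
```

A middleware would call `try_consume()` once per request and, when it returns `False`, send the result of `too_many_requests()` instead of invoking the handler. Passing a fake `clock` makes the burst and exact-limit edge cases from the prompt straightforward to unit test.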