Rate Limits
Storm MCP Gateway implements rate limiting to ensure fair usage and maintain service quality for all users. This guide explains how rate limiting works and how to handle it effectively.
Overview
Rate limits control how many API requests you can make within a specific time window. They help:
- Prevent abuse - Protect against malicious activity
- Ensure fairness - Equal access for all users
- Maintain performance - Keep the service fast
- Control costs - Manage infrastructure expenses
Rate Limit Tiers
Different tiers have different limits:
| Tier | Requests/Hour | Requests/Minute | Burst Limit | Concurrent |
|---|---|---|---|---|
| Free | 1,000 | 20 | 10 | 5 |
| Starter | 5,000 | 100 | 50 | 10 |
| Professional | 20,000 | 500 | 200 | 25 |
| Enterprise | 100,000 | 2,000 | 1,000 | 100 |
| Custom | Unlimited* | Custom | Custom | Custom |
*Subject to fair use policy
Rate Limit Headers
Every API response includes rate limit information:
HTTP/1.1 200 OK
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 999
X-RateLimit-Reset: 1706441445
X-RateLimit-Reset-After: 3600
X-RateLimit-Bucket: api
Header Definitions
| Header | Description |
|---|---|
| X-RateLimit-Limit | Maximum requests allowed in window |
| X-RateLimit-Remaining | Requests remaining in current window |
| X-RateLimit-Reset | Unix timestamp when limit resets |
| X-RateLimit-Reset-After | Seconds until limit resets |
| X-RateLimit-Bucket | Rate limit bucket identifier |
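A minimal sketch of reading these headers client-side and pausing proactively when the quota runs low (the threshold of 10 is illustrative, not a documented value):
// Read rate limit headers from a response and throttle proactively.
async function fetchWithQuotaCheck(url, options = {}) {
  const response = await fetch(url, options);
  const remaining = response.headers.get('X-RateLimit-Remaining');
  const resetAfter = response.headers.get('X-RateLimit-Reset-After');
  // If the quota is nearly exhausted, pause until the window resets
  // rather than risking a 429 on the next call.
  if (remaining !== null && Number(remaining) < 10) {
    const waitMs = (Number(resetAfter) || 0) * 1000;
    console.warn(`Only ${remaining} requests left; pausing ${waitMs}ms`);
    await new Promise(resolve => setTimeout(resolve, waitMs));
  }
  return response;
}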
Rate Limit Buckets
Different endpoints have separate rate limits:
Global Bucket
Default rate limit for most endpoints:
GET /gateways → Global bucket
GET /apps → Global bucket
GET /logs → Global bucket
Write Bucket
Lower limits for write operations:
POST /gateways → Write bucket (50% of global)
PATCH /gateways/:id → Write bucket
DELETE /gateways/:id → Write bucket
Analytics Bucket
Higher limits for read-heavy operations:
GET /metrics → Analytics bucket (200% of global)
GET /logs/analytics → Analytics bucket
Webhook Bucket
Separate limits for webhook management:
POST /webhooks → Webhook bucket
GET /webhooks/:id/deliveries → Webhook bucket
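Because each response names its bucket in X-RateLimit-Bucket, a client can track remaining quota per bucket rather than globally. A small sketch of that idea (the bucket names mirror the examples above):
// Track remaining quota per rate limit bucket.
const bucketQuota = new Map();
function recordQuota(response) {
  const bucket = response.headers.get('X-RateLimit-Bucket');
  if (bucket) {
    bucketQuota.set(bucket, {
      remaining: Number(response.headers.get('X-RateLimit-Remaining')),
      reset: Number(response.headers.get('X-RateLimit-Reset'))
    });
  }
}
// e.g. after a write request: recordQuota(response);
// bucketQuota.get('write') → { remaining: 9456, reset: 1706441445 }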
Handling Rate Limits
Checking Current Status
Check your current rate limit status:
GET /rate-limit
Authorization: Bearer YOUR_API_KEY
Response:
{
  "data": {
    "tier": "professional",
    "buckets": {
      "global": {
        "limit": 20000,
        "remaining": 18234,
        "reset": 1706441445
      },
      "write": {
        "limit": 10000,
        "remaining": 9456,
        "reset": 1706441445
      }
    }
  }
}
Rate Limit Exceeded Response
When you exceed the rate limit:
HTTP/1.1 429 Too Many Requests
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1706441445
X-RateLimit-Reset-After: 234
Retry-After: 234
{
  "error": {
    "code": "RATE_LIMITED",
    "message": "Rate limit exceeded. Please retry after 234 seconds.",
    "retry_after": 234
  }
}
Retry Strategy
Implement exponential backoff with jitter:
async function makeRequestWithRetry(url, options, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      const response = await fetch(url, options);
      if (response.status === 429) {
        // Prefer the server-provided Retry-After header; otherwise
        // back off exponentially with jitter.
        const retryAfter = response.headers.get('Retry-After');
        const delay = retryAfter
          ? parseInt(retryAfter, 10) * 1000
          : Math.pow(2, i) * 1000 + Math.random() * 1000;
        console.log(`Rate limited. Retrying after ${delay}ms`);
        await new Promise(resolve => setTimeout(resolve, delay));
        continue;
      }
      return response;
    } catch (error) {
      if (i === maxRetries - 1) throw error;
    }
  }
  throw new Error('Max retries exceeded');
}
Python Example:
import time
import random
import requests
def make_request_with_retry(url, headers, max_retries=3):
    for i in range(max_retries):
        response = requests.get(url, headers=headers)
        if response.status_code == 429:
            # Prefer the server-provided Retry-After header; otherwise
            # back off exponentially with jitter.
            retry_after = response.headers.get('Retry-After')
            if retry_after:
                delay = int(retry_after)
            else:
                delay = (2 ** i) + random.random()
            print(f"Rate limited. Retrying after {delay} seconds")
            time.sleep(delay)
            continue
        return response
    raise Exception("Max retries exceeded")
Optimizing API Usage
Batch Operations
Combine multiple operations into single requests:
POST /batch
Authorization: Bearer YOUR_API_KEY
{
  "operations": [
    { "method": "GET", "path": "/gateways/gw_1" },
    { "method": "GET", "path": "/gateways/gw_2" },
    { "method": "GET", "path": "/gateways/gw_3" }
  ]
}
This counts as 1 request instead of 3.
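If you have more operations than fit comfortably in one call, a small helper can chunk them into batch requests. A sketch (the chunk size of 20 and the /api/v1/batch base path are assumptions; check your tier's actual batch limits):
// Send operations through POST /batch in chunks.
async function sendBatched(operations, apiKey, chunkSize = 20) {
  const results = [];
  for (let i = 0; i < operations.length; i += chunkSize) {
    const chunk = operations.slice(i, i + chunkSize);
    const response = await fetch('/api/v1/batch', {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${apiKey}`,
        'Content-Type': 'application/json'
      },
      body: JSON.stringify({ operations: chunk })
    });
    results.push(await response.json());
  }
  return results;
}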
Caching Strategies
Implement client-side caching:
class APIClient {
  constructor() {
    this.cache = new Map();
  }
  // Return a cached response if it is still fresh; otherwise fetch
  // and cache the result. TTL defaults to 60 seconds.
  async get(url, cacheTTL = 60000) {
    const cached = this.cache.get(url);
    if (cached && Date.now() - cached.timestamp < cacheTTL) {
      return cached.data;
    }
    const response = await fetch(url);
    const data = await response.json();
    this.cache.set(url, {
      data,
      timestamp: Date.now()
    });
    return data;
  }
}
Pagination
Use pagination efficiently:
async function getAllItems() {
  const items = [];
  let page = 1;
  let hasMore = true;
  while (hasMore) {
    const response = await fetch(
      `/api/v1/items?page=${page}&limit=100`
    );
    const data = await response.json();
    items.push(...data.items);
    hasMore = data.pagination.has_next;
    page++;
    // Respect rate limits
    await new Promise(resolve => setTimeout(resolve, 100));
  }
  return items;
}
Webhook Usage
Use webhooks instead of polling:
// Bad: Polling ❌
setInterval(async () => {
  const response = await fetch('/api/v1/gateways');
  // Check for changes
}, 5000);

// Good: Webhooks ✅
// Configure webhook for gateway.updated events
// Receive real-time updates without polling
Rate Limit Bypass
Service Accounts
Service accounts have higher limits:
POST /service-accounts
Authorization: Bearer ADMIN_API_KEY
{
  "name": "CI/CD Pipeline",
  "rate_limit_multiplier": 5
}
IP Whitelisting
Whitelisted IPs get increased limits:
POST /account/whitelist
Authorization: Bearer ADMIN_API_KEY
{
  "ip": "203.0.113.45",
  "rate_limit_multiplier": 2,
  "description": "Production server"
}
Monitoring Rate Limits
Usage Dashboard
Monitor your rate limit usage:
GET /metrics/rate-limits
Authorization: Bearer YOUR_API_KEY
Response:
{
  "data": {
    "period": "last_24h",
    "usage": {
      "total_requests": 15234,
      "rate_limited_requests": 23,
      "average_remaining": 4234,
      "peak_usage": {
        "timestamp": "2025-01-28T14:30:00Z",
        "requests": 987
      }
    },
    "by_bucket": {
      "global": { "used": 12345, "limit": 20000 },
      "write": { "used": 2345, "limit": 10000 }
    }
  }
}
Alerts
Set up rate limit alerts:
POST /alerts
Authorization: Bearer YOUR_API_KEY
{
  "type": "rate_limit",
  "threshold": 80,
  "notification": {
    "email": "admin@example.com",
    "slack": "#alerts"
  }
}
Burst Limits
Burst limits let you absorb temporary traffic spikes.
Burst Allowance
Short bursts are allowed above the rate limit:
Normal rate: 100 req/min
Burst allowance: 200 req/min for 10 seconds
Token Bucket Algorithm
Rate limiting uses a token bucket algorithm:
class TokenBucket {
  constructor(capacity, refillRate) {
    this.capacity = capacity;     // maximum tokens the bucket holds
    this.tokens = capacity;       // start full
    this.refillRate = refillRate; // tokens added per second
    this.lastRefill = Date.now();
  }
  // Try to spend tokens; returns false if the bucket is empty.
  consume(tokens = 1) {
    this.refill();
    if (this.tokens >= tokens) {
      this.tokens -= tokens;
      return true;
    }
    return false;
  }
  // Add tokens for the time elapsed since the last refill,
  // capped at the bucket's capacity.
  refill() {
    const now = Date.now();
    const elapsed = (now - this.lastRefill) / 1000;
    const tokensToAdd = elapsed * this.refillRate;
    this.tokens = Math.min(
      this.capacity,
      this.tokens + tokensToAdd
    );
    this.lastRefill = now;
  }
}
Cost-Based Rate Limiting
Some operations cost more than others:
| Operation | Cost | Description |
|---|---|---|
| GET request | 1 | Standard read |
| POST request | 2 | Create operation |
| DELETE request | 3 | Delete operation |
| Analytics query | 5 | Complex computation |
| Batch operation | 1 + n*0.5 | Batch discount |
Example:
Rate limit: 1000 points/hour
GET /gateways → 1 point
POST /gateways → 2 points
GET /analytics/complex → 5 points
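Under this model, batching also lowers cost: a batch of three GETs costs 1 + 3 × 0.5 = 2.5 points instead of 3. A small sketch of estimating spend client-side (the cost values are hard-coded from the table above):
// Point costs per the table above.
const COSTS = { GET: 1, POST: 2, DELETE: 3, ANALYTICS: 5 };
// A batch costs a flat point plus half a point per operation.
function batchCost(n) {
  return 1 + n * 0.5;
}
console.log(COSTS.GET * 3); // 3 points for three separate GETs
console.log(batchCost(3));  // 2.5 points for the same GETs batched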
Best Practices
Client Implementation
- Always check headers - Monitor remaining quota
- Implement retries - Handle 429 gracefully
- Use caching - Reduce unnecessary requests
- Batch when possible - Combine operations
- Use webhooks - Avoid polling
Rate Limit Planning
- Monitor usage patterns - Understand your needs
- Plan for growth - Upgrade before hitting limits
- Implement circuit breakers - Fail gracefully (see the sketch after this list)
- Use multiple API keys - Distribute load
- Cache aggressively - Minimize API calls
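A circuit breaker stops calling the API for a cooldown period after repeated failures instead of piling on more requests. A minimal sketch, with illustrative thresholds:
// Minimal circuit breaker: open after `threshold` consecutive
// failures, reject calls until `cooldownMs` has passed, then retry.
class CircuitBreaker {
  constructor(threshold = 5, cooldownMs = 30000) {
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
    this.failures = 0;
    this.openedAt = null;
  }
  async call(fn) {
    if (this.openedAt && Date.now() - this.openedAt < this.cooldownMs) {
      throw new Error('Circuit open; skipping request');
    }
    try {
      const result = await fn();
      this.failures = 0;  // success closes the circuit
      this.openedAt = null;
      return result;
    } catch (error) {
      this.failures++;
      if (this.failures >= this.threshold) {
        this.openedAt = Date.now(); // open the circuit
      }
      throw error;
    }
  }
}
Wrapping calls as breaker.call(() => fetch(url)) means sustained failures pause traffic instead of burning through your remaining quota.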
Error Handling
class RateLimitHandler {
  async handleRequest(fn) {
    try {
      return await fn();
    } catch (error) {
      if (error.status === 429) {
        // Log rate limit hit
        console.warn('Rate limit hit', {
          reset: error.headers['X-RateLimit-Reset'],
          bucket: error.headers['X-RateLimit-Bucket']
        });
        // Notify monitoring
        await this.notifyMonitoring({
          type: 'rate_limit',
          timestamp: Date.now()
        });
        // Return cached data if available; getCachedResponse() and
        // notifyMonitoring() are assumed to be implemented elsewhere.
        return this.getCachedResponse() || {
          error: 'Rate limited',
          retry_after: error.headers['Retry-After']
        };
      }
      throw error;
    }
  }
}
Troubleshooting
Common Issues
Consistently hitting rate limits:
- Review API usage patterns
- Implement caching
- Consider upgrading tier
- Use batch operations
Sudden rate limit errors:
- Check for code loops
- Review recent changes
- Monitor for attacks
- Verify API key not shared
Reset time not updating:
- Check system clock
- Verify timezone settings
- Clear local cache
FAQ
Q: Are rate limits per API key or per account?
A: Rate limits are per API key. Each key has its own quota.
Q: Do failed requests count against rate limits?
A: Yes, all requests count, including 4xx and 5xx responses.
Q: Can I get a temporary rate limit increase?
A: Contact support for temporary increases for migrations or special events.
Q: How are websocket connections rate limited?
A: Websockets have separate connection limits, not request-based limits.
Q: Do webhook deliveries count against my rate limit?
A: No, incoming webhooks don't count against your API rate limits.