Rate Limits
Storm MCP Gateway implements rate limiting to ensure fair usage and maintain service quality for all users. This guide explains how rate limiting works and how to handle it effectively.
Overview
Rate limits control how many API requests you can make within a specific time window. They help:
- Prevent abuse - Protect against malicious activity
- Ensure fairness - Equal access for all users
- Maintain performance - Keep the service fast
- Control costs - Manage infrastructure expenses
Rate Limit Tiers
Different tiers have different limits:
| Tier | Requests/Hour | Requests/Minute | Burst Limit | Concurrent |
|---|---|---|---|---|
| Free | 1,000 | 20 | 10 | 5 |
| Starter | 5,000 | 100 | 50 | 10 |
| Professional | 20,000 | 500 | 200 | 25 |
| Enterprise | 100,000 | 2,000 | 1,000 | 100 |
| Custom | Unlimited* | Custom | Custom | Custom |
*Subject to fair use policy
Rate Limit Headers
Every API response includes rate limit information:
HTTP/1.1 200 OK
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 999
X-RateLimit-Reset: 1706441445
X-RateLimit-Reset-After: 3600
X-RateLimit-Bucket: api
Header Definitions
| Header | Description |
|---|---|
| X-RateLimit-Limit | Maximum requests allowed in window |
| X-RateLimit-Remaining | Requests remaining in current window |
| X-RateLimit-Reset | Unix timestamp when limit resets |
| X-RateLimit-Reset-After | Seconds until limit resets |
| X-RateLimit-Bucket | Rate limit bucket identifier |
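A minimal sketch of reading these headers client-side and pausing proactively when the quota runs low (the threshold of 10 is illustrative, not a documented value):
// Read rate limit headers from a response and throttle proactively.
async function fetchWithQuotaCheck(url, options = {}) {
  const response = await fetch(url, options);
  const remaining = response.headers.get('X-RateLimit-Remaining');
  const resetAfter = response.headers.get('X-RateLimit-Reset-After');
  // If the quota is nearly exhausted, pause until the window resets
  // rather than risking a 429 on the next call.
  if (remaining !== null && Number(remaining) < 10) {
    const waitMs = (Number(resetAfter) || 0) * 1000;
    console.warn(`Only ${remaining} requests left; pausing ${waitMs}ms`);
    await new Promise(resolve => setTimeout(resolve, waitMs));
  }
  return response;
}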
Rate Limit Buckets
Different endpoints have separate rate limits:
Global Bucket
Default rate limit for most endpoints:
GET /gateways → Global bucket
GET /apps → Global bucket
GET /logs → Global bucket
Write Bucket
Lower limits for write operations:
POST /gateways → Write bucket (50% of global)
PATCH /gateways/:id → Write bucket
DELETE /gateways/:id → Write bucket
Analytics Bucket
Higher limits for read-heavy operations:
GET /metrics → Analytics bucket (200% of global)
GET /logs/analytics → Analytics bucket
Webhook Bucket
Separate limits for webhook management:
POST /webhooks → Webhook bucket
GET /webhooks/:id/deliveries → Webhook bucket
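Because each response names its bucket in X-RateLimit-Bucket, a client can track remaining quota per bucket rather than globally. A small sketch of that idea (the bucket names mirror the examples above):
// Track remaining quota per rate limit bucket.
const bucketQuota = new Map();
function recordQuota(response) {
  const bucket = response.headers.get('X-RateLimit-Bucket');
  if (bucket) {
    bucketQuota.set(bucket, {
      remaining: Number(response.headers.get('X-RateLimit-Remaining')),
      reset: Number(response.headers.get('X-RateLimit-Reset'))
    });
  }
}
// e.g. after a write request: recordQuota(response);
// bucketQuota.get('write') → { remaining: 9456, reset: 1706441445 }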
Handling Rate Limits
Checking Current Status
Check your current rate limit status:
GET /rate-limit
Authorization: Bearer YOUR_API_KEY
Response:
{
  "data": {
    "tier": "professional",
    "buckets": {
      "global": {
        "limit": 20000,
        "remaining": 18234,
        "reset": 1706441445
      },
      "write": {
        "limit": 10000,
        "remaining": 9456,
        "reset": 1706441445
      }
    }
  }
}
Rate Limit Exceeded Response
When you exceed the rate limit:
HTTP/1.1 429 Too Many Requests
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1706441445
X-RateLimit-Reset-After: 234
Retry-After: 234
{
  "error": {
    "code": "RATE_LIMITED",
    "message": "Rate limit exceeded. Please retry after 234 seconds.",
    "retry_after": 234
  }
}
Retry Strategy
Implement exponential backoff with jitter:
async function makeRequestWithRetry(url, options, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      const response = await fetch(url, options);
      if (response.status === 429) {
        // Prefer the server-provided Retry-After header; otherwise
        // back off exponentially with jitter.
        const retryAfter = response.headers.get('Retry-After');
        const delay = retryAfter
          ? parseInt(retryAfter, 10) * 1000
          : Math.pow(2, i) * 1000 + Math.random() * 1000;
        console.log(`Rate limited. Retrying after ${delay}ms`);
        await new Promise(resolve => setTimeout(resolve, delay));
        continue;
      }
      return response;
    } catch (error) {
      if (i === maxRetries - 1) throw error;
    }
  }
  throw new Error('Max retries exceeded');
}
Python Example:
import time
import random
import requests
def make_request_with_retry(url, headers, max_retries=3):
    for i in range(max_retries):
        response = requests.get(url, headers=headers)
        if response.status_code == 429:
            # Prefer the server-provided Retry-After header; otherwise
            # back off exponentially with jitter.
            retry_after = response.headers.get('Retry-After')
            if retry_after:
                delay = int(retry_after)
            else:
                delay = (2 ** i) + random.random()
            print(f"Rate limited. Retrying after {delay} seconds")
            time.sleep(delay)
            continue
        return response
    raise Exception("Max retries exceeded")
Optimizing API Usage
Batch Operations
Combine multiple operations into single requests:
POST /batch
Authorization: Bearer YOUR_API_KEY
{
  "operations": [
    { "method": "GET", "path": "/gateways/gw_1" },
    { "method": "GET", "path": "/gateways/gw_2" },
    { "method": "GET", "path": "/gateways/gw_3" }
  ]
}
This counts as 1 request instead of 3.
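If you have more operations than fit comfortably in one call, a small helper can chunk them into batch requests. A sketch (the chunk size of 20 and the /api/v1/batch base path are assumptions; check your tier's actual batch limits):
// Send operations through POST /batch in chunks.
async function sendBatched(operations, apiKey, chunkSize = 20) {
  const results = [];
  for (let i = 0; i < operations.length; i += chunkSize) {
    const chunk = operations.slice(i, i + chunkSize);
    const response = await fetch('/api/v1/batch', {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${apiKey}`,
        'Content-Type': 'application/json'
      },
      body: JSON.stringify({ operations: chunk })
    });
    results.push(await response.json());
  }
  return results;
}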
Caching Strategies
Implement client-side caching:
class APIClient {
  constructor() {
    this.cache = new Map();
  }
  // Return a cached response if it is still fresh; otherwise fetch
  // and cache the result. TTL defaults to 60 seconds.
  async get(url, cacheTTL = 60000) {
    const cached = this.cache.get(url);
    if (cached && Date.now() - cached.timestamp < cacheTTL) {
      return cached.data;
    }
    const response = await fetch(url);
    const data = await response.json();
    this.cache.set(url, {
      data,
      timestamp: Date.now()
    });
    return data;
  }
}
Pagination
Use pagination efficiently:
async function getAllItems() {
  const items = [];
  let page = 1;
  let hasMore = true;
  while (hasMore) {
    const response = await fetch(
      `/api/v1/items?page=${page}&limit=100`
    );
    const data = await response.json();
    items.push(...data.items);
    hasMore = data.pagination.has_next;
    page++;
    // Respect rate limits
    await new Promise(resolve => setTimeout(resolve, 100));
  }
  return items;
}
Webhook Usage
Use webhooks instead of polling:
// Bad: Polling ❌
setInterval(async () => {
  const response = await fetch('/api/v1/gateways');
  // Check for changes
}, 5000);

// Good: Webhooks ✅
// Configure webhook for gateway.updated events
// Receive real-time updates without polling
Rate Limit Bypass
Service Accounts
Service accounts have higher limits:
POST /service-accounts
Authorization: Bearer ADMIN_API_KEY
{
  "name": "CI/CD Pipeline",
  "rate_limit_multiplier": 5
}
IP Whitelisting
Whitelisted IPs get increased limits:
POST /account/whitelist
Authorization: Bearer ADMIN_API_KEY
{
  "ip": "203.0.113.45",
  "rate_limit_multiplier": 2,
  "description": "Production server"
}
Monitoring Rate Limits
Usage Dashboard
Monitor your rate limit usage:
GET /metrics/rate-limits
Authorization: Bearer YOUR_API_KEY
Response:
{
  "data": {
    "period": "last_24h",
    "usage": {
      "total_requests": 15234,
      "rate_limited_requests": 23,
      "average_remaining": 4234,
      "peak_usage": {
        "timestamp": "2025-01-28T14:30:00Z",
        "requests": 987
      }
    },
    "by_bucket": {
      "global": { "used": 12345, "limit": 20000 },
      "write": { "used": 2345, "limit": 10000 }
    }
  }
}
Alerts
Set up rate limit alerts:
POST /alerts
Authorization: Bearer YOUR_API_KEY
{
  "type": "rate_limit",
  "threshold": 80,
  "notification": {
    "email": "admin@example.com",
    "slack": "#alerts"
  }
}
Burst Limits
Burst limits let you absorb temporary traffic spikes.
Burst Allowance
Short bursts are allowed above the rate limit:
Normal rate: 100 req/min
Burst allowance: 200 req/min for 10 seconds
Token Bucket Algorithm
Rate limiting uses a token bucket algorithm:
class TokenBucket {
  constructor(capacity, refillRate) {
    this.capacity = capacity;     // maximum tokens the bucket holds
    this.tokens = capacity;       // start full
    this.refillRate = refillRate; // tokens added per second
    this.lastRefill = Date.now();
  }
  // Try to spend tokens; returns false if the bucket is empty.
  consume(tokens = 1) {
    this.refill();
    if (this.tokens >= tokens) {
      this.tokens -= tokens;
      return true;
    }
    return false;
  }
  // Add tokens for the time elapsed since the last refill,
  // capped at the bucket's capacity.
  refill() {
    const now = Date.now();
    const elapsed = (now - this.lastRefill) / 1000;
    const tokensToAdd = elapsed * this.refillRate;
    this.tokens = Math.min(
      this.capacity,
      this.tokens + tokensToAdd
    );
    this.lastRefill = now;
  }
}
Cost-Based Rate Limiting
Some operations cost more than others:
| Operation | Cost | Description |
|---|---|---|
| GET request | 1 | Standard read |
| POST request | 2 | Create operation |
| DELETE request | 3 | Delete operation |
| Analytics query | 5 | Complex computation |
| Batch operation | 1 + n*0.5 | Batch discount |
Example:
Rate limit: 1000 points/hour
GET /gateways → 1 point
POST /gateways → 2 points
GET /analytics/complex → 5 points
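Under this model, batching also lowers cost: a batch of three GETs costs 1 + 3 × 0.5 = 2.5 points instead of 3. A small sketch of estimating spend client-side (the cost values are hard-coded from the table above):
// Point costs per the table above.
const COSTS = { GET: 1, POST: 2, DELETE: 3, ANALYTICS: 5 };
// A batch costs a flat point plus half a point per operation.
function batchCost(n) {
  return 1 + n * 0.5;
}
console.log(COSTS.GET * 3); // 3 points for three separate GETs
console.log(batchCost(3));  // 2.5 points for the same GETs batched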
Best Practices
Client Implementation
- Always check headers - Monitor remaining quota
- Implement retries - Handle 429 gracefully
- Use caching - Reduce unnecessary requests
- Batch when possible - Combine operations
- Use webhooks - Avoid polling
Rate Limit Planning
- Monitor usage patterns - Understand your needs
- Plan for growth - Upgrade before hitting limits
- Implement circuit breakers - Fail gracefully (see the sketch after this list)
- Use multiple API keys - Distribute load
- Cache aggressively - Minimize API calls
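A circuit breaker stops calling the API for a cooldown period after repeated failures instead of piling on more requests. A minimal sketch, with illustrative thresholds:
// Minimal circuit breaker: open after `threshold` consecutive
// failures, reject calls until `cooldownMs` has passed, then retry.
class CircuitBreaker {
  constructor(threshold = 5, cooldownMs = 30000) {
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
    this.failures = 0;
    this.openedAt = null;
  }
  async call(fn) {
    if (this.openedAt && Date.now() - this.openedAt < this.cooldownMs) {
      throw new Error('Circuit open; skipping request');
    }
    try {
      const result = await fn();
      this.failures = 0;  // success closes the circuit
      this.openedAt = null;
      return result;
    } catch (error) {
      this.failures++;
      if (this.failures >= this.threshold) {
        this.openedAt = Date.now(); // open the circuit
      }
      throw error;
    }
  }
}
Wrapping calls as breaker.call(() => fetch(url)) means sustained failures pause traffic instead of burning through your remaining quota.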
Error Handling
class RateLimitHandler {
  async handleRequest(fn) {
    try {
      return await fn();
    } catch (error) {
      if (error.status === 429) {
        // Log rate limit hit
        console.warn('Rate limit hit', {
          reset: error.headers['X-RateLimit-Reset'],
          bucket: error.headers['X-RateLimit-Bucket']
        });
        // Notify monitoring
        await this.notifyMonitoring({
          type: 'rate_limit',
          timestamp: Date.now()
        });
        // Return cached data if available; getCachedResponse() and
        // notifyMonitoring() are assumed to be implemented elsewhere.
        return this.getCachedResponse() || {
          error: 'Rate limited',
          retry_after: error.headers['Retry-After']
        };
      }
      throw error;
    }
  }
}
Troubleshooting
Common Issues
Consistently hitting rate limits:
- Review API usage patterns
- Implement caching
- Consider upgrading tier
- Use batch operations
Sudden rate limit errors:
- Check for code loops
- Review recent changes
- Monitor for attacks
- Verify API key not shared
Reset time not updating:
- Check system clock
- Verify timezone settings
- Clear local cache
FAQ
Q: Are rate limits per API key or per account?
A: Rate limits are per API key. Each key has its own quota.
Q: Do failed requests count against rate limits?
A: Yes, all requests count, including 4xx and 5xx responses.
Q: Can I get a temporary rate limit increase?
A: Contact support for temporary increases for migrations or special events.
Q: How are websocket connections rate limited?
A: Websockets have separate connection limits, not request-based limits.
Q: Do webhook deliveries count against my rate limit?
A: No, incoming webhooks don't count against your API rate limits.