Saturday, December 13, 2025

How does rate limiting work in Redis?

Excellent question. You're now touching on one of the key operational controls for multi-tenant architectures:

Rate Limiting using Redis for per-tenant request throttling.

Let's go step by step 👇


🧩 What is Rate Limiting?

Rate limiting is the process of controlling how many requests a user, API client, or tenant can make within a defined period (e.g., 100 requests per minute).

It helps:

  • Prevent abuse (e.g., DDoS or excessive API use)

  • Avoid noisy-neighbor problems in multi-tenant systems

  • Ensure fair resource sharing among tenants


⚙️ Why Redis?

Redis is ideal for rate limiting because:

  • It’s fast (in-memory)

  • Supports atomic counters (INCR, INCRBY) and key expiry (EXPIRE)

  • Works well in distributed environments — multiple API servers can share the same Redis backend to coordinate limits.


๐Ÿข Multi-Tenant Rate Limiting

In a multi-tenant system, you want to enforce rate limits per tenant (or even per user), not globally.

So you maintain per-tenant counters in Redis:

rate_limit:<tenant_id>:<time_window>

๐Ÿ” Key Rate Limiting Algorithms

1️⃣ Fixed Window

Simplest approach:

  • Each tenant gets a counter per time window (e.g., per minute).

  • Reset at the start of every window.

Example key:

rate_limit:tenant_123:2025-12-14T08:05

2️⃣ Sliding Window (or Rolling Window)

More accurate — uses multiple smaller windows or timestamps to avoid burstiness at window boundaries.
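For reference, here is a minimal sliding-window-log sketch in Node.js. It assumes the same ioredis client (redis) created in the fixed-window example below; isAllowedSliding, SLIDING_LIMIT, and SLIDING_WINDOW_MS are illustrative names, not library APIs.

// Sliding window log: one sorted-set member per request, scored by its timestamp.
// Sketch only; assumes the same `redis` (ioredis) client created further below.
const SLIDING_LIMIT = 100;       // max requests per window
const SLIDING_WINDOW_MS = 60000; // window length in milliseconds

async function isAllowedSliding(tenantId) {
  const key = `rate_limit:sliding:${tenantId}`;
  const now = Date.now();

  // Drop entries that have fallen out of the window, then count what remains.
  await redis.zremrangebyscore(key, 0, now - SLIDING_WINDOW_MS);
  const count = await redis.zcard(key);
  if (count >= SLIDING_LIMIT) return false;

  // Record this request and keep the key from living forever if the tenant goes idle.
  await redis.zadd(key, now, `${now}:${Math.random()}`);
  await redis.pexpire(key, SLIDING_WINDOW_MS);
  return true;
}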

3️⃣ Token Bucket / Leaky Bucket

More advanced — allows bursts but enforces an average rate over time.
Used by APIs that tolerate short bursts (e.g., an average of 5 requests/sec with bursts of up to 10).


🚀 Redis-Based Implementation (Fixed Window Example)

Example (Node.js + Express + ioredis)

import Redis from "ioredis";
import express from "express";

const redis = new Redis();
const app = express();

const LIMIT = 100; // max requests
const WINDOW = 60; // seconds

app.use(async (req, res, next) => {
  const tenantId = req.headers["x-tenant-id"];
  if (!tenantId) return res.status(400).json({ error: "Missing tenant ID" });

  const key = `rate_limit:${tenantId}:${Math.floor(Date.now() / (WINDOW * 1000))}`;

  const count = await redis.incr(key);
  if (count === 1) {
    await redis.expire(key, WINDOW); // expire after window ends
  }

  if (count > LIMIT) {
    const ttl = await redis.ttl(key);
    return res.status(429).json({
      error: "Rate limit exceeded",
      retry_after: ttl,
    });
  }

  next();
});

app.get("/api/data", (req, res) => {
  res.json({ message: "Success" });
});

app.listen(3000, () => console.log("API server running on port 3000"));

🧠 How it works:

  1. Each request increments a counter in Redis.

  2. Redis key expires after the time window.

  3. If the count exceeds the limit, you return HTTP 429 Too Many Requests.


🧠 Token Bucket Version (more flexible)

const RATE = 10;     // tokens per second
const CAPACITY = 100; // max burst

async function isAllowed(tenantId) {
  const key = `bucket:${tenantId}`;
  const now = Date.now() / 1000; // seconds

  // Load the bucket state (empty object if the key doesn't exist yet).
  const bucket = await redis.hgetall(key);
  let tokens = bucket.tokens ? parseFloat(bucket.tokens) : CAPACITY;
  const lastRefill = bucket.lastRefill ? parseFloat(bucket.lastRefill) : now;

  // Refill tokens for the time elapsed since the last request, capped at CAPACITY.
  const elapsed = now - lastRefill;
  tokens = Math.min(CAPACITY, tokens + elapsed * RATE);

  if (tokens < 1) {
    return false; // bucket empty: reject this request
  }

  // Consume one token and persist the new state.
  // Note: this read-modify-write isn't atomic; under heavy concurrency,
  // wrap it in a Lua script (see Scaling Tips below).
  tokens -= 1;
  await redis.hmset(key, { tokens, lastRefill: now });
  return true;
}

This allows smoother traffic handling with bursts but keeps the overall rate steady.


🧮 Typical Redis Key Schema

Key Pattern                          Meaning
rate_limit:<tenant_id>:<epoch>       Counter for that tenant and window
bucket:<tenant_id>                   Token bucket state (tokens, last refill time)
user_limit:<tenant_id>:<user_id>     Optional per-user key inside a tenant
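
To use the optional per-user key, you can layer a second fixed-window counter on top of the tenant limit. A minimal sketch, reusing redis and WINDOW from the example above; USER_LIMIT and the appended window suffix are illustrative assumptions, not part of the schema above.

// Per-user limit inside a tenant, checked in addition to the tenant-wide limit.
// Sketch only; USER_LIMIT is a hypothetical per-user quota.
const USER_LIMIT = 20; // example: 20 requests per user per window

async function isUserAllowed(tenantId, userId) {
  // Append the current window number so each window gets a fresh counter.
  const window = Math.floor(Date.now() / (WINDOW * 1000));
  const key = `user_limit:${tenantId}:${userId}:${window}`;

  const count = await redis.incr(key);
  if (count === 1) await redis.expire(key, WINDOW);

  return count <= USER_LIMIT;
}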

📊 Monitoring and Metrics

  • Track per-tenant counter values and key expirations in Redis to build usage metrics.

  • Expose rate limit headers in API responses:

    X-RateLimit-Limit: 100
    X-RateLimit-Remaining: 25
    X-RateLimit-Reset: 30
    

    These help clients know how close they are to the limit.
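
    A sketch of how the fixed-window middleware above could populate those headers, placed after the count check and before calling next(); it reuses LIMIT, count, and key from that middleware.

    // Inside the fixed-window middleware, after reading `count`:
    const ttl = await redis.ttl(key); // seconds until the current window resets

    res.set("X-RateLimit-Limit", String(LIMIT));
    res.set("X-RateLimit-Remaining", String(Math.max(0, LIMIT - count)));
    res.set("X-RateLimit-Reset", String(ttl));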


⚖️ Scaling Tips

✅ Use Redis Cluster for horizontal scalability.
✅ Use Lua scripts for atomic multi-step operations (see the sketch after this list).
✅ Tune key expiry to match window size to avoid memory bloat.
✅ Consider different rate limits per tenant tier (e.g., “Free” = 100/min, “Enterprise” = 1000/min).
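
For example, the INCR + EXPIRE pair from the fixed-window middleware can be collapsed into one atomic step with a small Lua script. A sketch using ioredis's eval; incrementWindow is an illustrative helper, not a library function.

// Atomically increment the window counter and set its TTL on first use.
const FIXED_WINDOW_LUA = `
  local count = redis.call("INCR", KEYS[1])
  if count == 1 then
    redis.call("EXPIRE", KEYS[1], ARGV[1])
  end
  return count
`;

async function incrementWindow(key) {
  // eval(script, numKeys, key1, ..., arg1, ...)
  return redis.eval(FIXED_WINDOW_LUA, 1, key, WINDOW);
}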


🔒 Integration with AuthN/AuthZ

  • The tenant ID comes from your JWT claim (tenant).

  • Middleware extracts tenant_id from the token and applies rate limits accordingly.

Example:

const tenantId = decodedJWT.tenant;

This way, every request from a tenant's users counts against the same tenant-level limit, so throttling is applied consistently across the tenant.
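
A minimal sketch of that extraction step, assuming the jsonwebtoken package and an RS256 public key loaded elsewhere (publicKey is not defined in this post); the rate-limit middleware would then read req.tenantId instead of the x-tenant-id header.

import jwt from "jsonwebtoken";

// Verify the bearer token and pull the tenant claim before applying rate limits.
app.use((req, res, next) => {
  const token = (req.headers.authorization || "").replace(/^Bearer /, "");
  try {
    // `publicKey` is assumed to be the RS256 public key loaded at startup.
    const decodedJWT = jwt.verify(token, publicKey, { algorithms: ["RS256"] });
    req.tenantId = decodedJWT.tenant; // consumed by the rate-limit middleware
    next();
  } catch (err) {
    res.status(401).json({ error: "Invalid token" });
  }
});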


✅ Summary

Aspect         Description
Purpose        Prevent abuse & ensure fair usage per tenant
Backend        Redis (fast atomic counters)
Granularity    Per tenant (can extend to per user)
Algorithms     Fixed window, sliding window, token bucket
Integration    Works with JWT (use the tenant claim)
Response       429 Too Many Requests + Retry-After header
Scalability    Use Redis Cluster or a sharded setup

Would you like me to show how to combine this rate limiter with JWT authentication (RS256) — so it automatically throttles based on the tenant claim inside the token?
