Hey, I'm Marco and welcome to my newsletter!
As a software engineer, I created this newsletter to share my first-hand knowledge of the development world. Each topic we explore will provide valuable insights, with the goal of inspiring and helping you on your journey.
In this episode I want to show you how to create a rate limiter using Redis and its atomic, transactional commands.
You can download all the code shown directly from my GitHub repository: https://github.com/marcomoauro/redis-rate-limiter
👋 Introduction
Rate limiting is a technique employed to restrict the number of requests a client can make to a server, ensuring the security and stability of the application. By implementing rate limiting, we can prevent overload scenarios that may impact essential resources such as the database.
This technique is crucial in countering various cyber attacks, including:
Brute force attacks: By limiting the number of requests, we prevent attackers from using server resources to systematically attempt to guess user passwords and gain unauthorised access to the application.
Denial of Service attacks (DoS): Rate limiting prevents attackers from flooding the server with a barrage of requests, thus limiting attempts to overload and disrupt the server's operations.
Scraping: Limiting the number of requests per IP address prevents attackers from scraping sensitive data from our application by restricting their ability to make excessive requests.
Possible solutions
We can use various strategies to implement a rate-limiting mechanism in our application. They are not mutually exclusive; in fact, mixed systems are usually employed:
Fixed-window: counts how many calls the client makes within a fixed window, such as 1 minute; once the limit is exceeded, all subsequent calls are rejected until the start of the next window.
Sliding-window: differs from fixed-window in that the window moves with time: for example, we could count the calls made in the last 5 minutes. This approach adapts better to traffic spikes.
Token bucket: a bucket holds tokens, each representing a call that can be made, and every invocation consumes a token. Once the tokens run out, the client cannot call again until the bucket is refilled, so the server can manage the rate of calls. This approach is memory-efficient but may be susceptible to race conditions in multi-threaded environments or when processes share the same resources.
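As an illustration of the token bucket strategy, here is a minimal in-memory sketch; the class and its parameters are my own invention, not taken from the repository:

```javascript
// Minimal in-memory token bucket: `capacity` tokens, refilled at `refillPerSecond`.
class TokenBucket {
  constructor({capacity, refillPerSecond}) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.tokens = capacity;
    this.lastRefill = Date.now();
  }

  // Returns true if the call is allowed, false if the client must wait for a refill.
  tryRemoveToken() {
    const now = Date.now();
    const elapsedSeconds = (now - this.lastRefill) / 1000;
    // Refill proportionally to the elapsed time, capped at the bucket capacity.
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefill = now;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}

const bucket = new TokenBucket({capacity: 3, refillPerSecond: 1});
const results = [1, 2, 3, 4].map(() => bucket.tryRemoveToken());
console.log(results); // → [ true, true, true, false ]
```

Note that the refill happens lazily on each call, so no background timer is needed.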
In our case, we are going to implement the fixed-window mechanism using Redis, via the atomic INCR command and a TTL.
✅ Fixed-window strategy
The client calls the server API identifying itself by the API key.
The server queries the database to verify the validity of the API key.
The database returns the result of the key validity check to the server.
The server verifies the use of the API key in the cache.
The cache increments the API key usage counter within the time window.
The cache checks whether the API key usage is within the limits allowed by the time window.
If the API key usage is within limits, the cache informs the server that the usage is within limits.
If the API key usage is within limits, the server tells the client that it can continue with processing.
If the API key usage exceeds the allowed limits, the cache informs the server that the limits have been exceeded.
If the API key usage exceeds the allowed limits, the server stops API execution, responding to the client that API key usage has exceeded the limits.
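The steps above can be sketched end to end in code; the stub database and in-memory cache below are simplified placeholders I invented for illustration (Redis plays the cache role in the real implementation):

```javascript
// Sketch of the fixed-window flow: API-key lookup, counter increment, limit check.
const MAX_CALLS_PER_MINUTE = 10;

const db = {apiKeys: new Set(['my-api-key'])}; // stub database
const cache = new Map();                       // stub cache (Redis in the article)

function handleRequest(apiKey) {
  // The server validates the API key against the database.
  if (!db.apiKeys.has(apiKey)) return {status: 401};

  // The cache increments the usage counter for the current minute window.
  const minute = new Date().getMinutes();
  const key = `rate_limit/${apiKey}/m/${minute}`;
  const invocations = (cache.get(key) ?? 0) + 1;
  cache.set(key, invocations);

  // Within limits -> the client may proceed; over limits -> rejected.
  return invocations <= MAX_CALLS_PER_MINUTE ? {status: 200} : {status: 429};
}

const statuses = Array.from({length: 11}, () => handleRequest('my-api-key').status);
console.log(statuses); // the first 10 calls pass, the 11th gets 429
```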
⚡️ Redis: MULTI, INCR and EXPIRE commands
I used Redis in this way to limit requests within a fixed window of one minute:
First, I created a key in this format: "rate_limit/{api-key-id}/{time-format}/{time}", where:
api-key-id: is the ID of the API key in our database.
time-format: is the format used; in our case it is minutes ("m").
time: represents the specific minute when a client makes a call.
If a call is made at minute 3 with the API key with ID 1, we would have the following Redis key:
rate_limit/1/m/3
Every time a call is made within the same minute with the same API key, we use the same Redis key and increase the count by 1 using the INCR operation.
To prevent keeping unnecessary keys in memory, we make the key expire after a minute using the EXPIRE operation.
To ensure that both the INCR and EXPIRE operations happen together and don't leave any issues if something goes wrong in between, we use a transaction with the MULTI operation. This ensures everything stays consistent and reliable.
Using Redis commands, we can represent the operation as follows:
MULTI
INCR rate_limit/1/m/3
EXPIRE rate_limit/1/m/3 60
EXEC
👨‍💻 Let’s get down to practice
You can download all the code shown directly from my GitHub repository: https://github.com/marcomoauro/redis-rate-limiter
We are going to create an API whose purpose is to show how the rate limit works: it will return the detail of an API key that we may have previously provided to clients.
I started from the backend template I made for Node.js; you can retrieve it from here:
For the sake of brevity, I'll leave the links to the ApiKey.js model and api_key.js controller. They have been implemented in the standard format that I use for all my implementations.
I defined a new API in the router.js file:
import Router from '@koa/router';
import {healthcheck} from "./api/healthcheck.js";
import {authenticate, rateLimit, routeToFunction} from "./middlewares.js";
import {getApiKey} from "./controllers/api_keys.js";
const router = new Router();
router.get('/healthcheck', routeToFunction(healthcheck));
router.get('/v1/api-keys/:api_key_code', authenticate, rateLimit, routeToFunction(getApiKey));
export default router;
I have attached two middlewares to the route:
authenticate: verifies the existence of the API key in the database.
rateLimit: verifies whether the client respects the number of calls configured for our application; I have chosen a maximum of 10 calls per minute.
Implementation of authenticate middleware:
export const authenticate = async (ctx, next) => {
  const token = ctx.headers['x-token'] ?? ctx.request.params.api_key_code;
  if (!token) throw new APIError401();

  let api_key;
  try {
    api_key = await ApiKey.get({code: token});
  } catch (error) {
    if (error instanceof APIError404) {
      throw new APIError401();
    } else {
      throw error;
    }
  }

  asyncStorage.enterWith({...asyncStorage.getStore(), api_key_id: api_key.id});

  await next();
};
It retrieves the API key from the headers or the route, checks that it exists in the database, and saves its ID in an async storage that can be used throughout the execution context of the API call.
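The asyncStorage above is presumably built on Node's AsyncLocalStorage; here is a minimal sketch of the pattern, with invented names, showing how a value set in one middleware becomes visible to downstream code in the same request context without being passed around explicitly:

```javascript
import {AsyncLocalStorage} from 'node:async_hooks';

const asyncStorage = new AsyncLocalStorage();

let api_key_id_seen_downstream;

// `run` opens a fresh store for a "request"; everything executed inside it
// shares that store.
asyncStorage.run({}, () => {
  // What the authenticate middleware does: merge the ID into the current store.
  asyncStorage.enterWith({...asyncStorage.getStore(), api_key_id: 1});

  // What a downstream consumer (e.g. the rateLimit middleware) does:
  api_key_id_seen_downstream = asyncStorage.getStore()?.api_key_id;
});

console.log(api_key_id_seen_downstream); // → 1
```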
Implementation of rateLimit middleware:
export const rateLimit = async (ctx, next) => {
  const api_key_id = asyncStorage.getStore()?.api_key_id

  const rate_limit = new RateLimit({code: api_key_id});
  await rate_limit.validateWithinMinute()
  await rate_limit.validateWithinHour()

  await next();
};
It creates an instance of the RateLimit class, which checks the number of calls in the current minute (maximum 10) and in the current hour (maximum 500).
Here is the RateLimit.js class that takes care of checking the number of calls:
import {DateTime} from "luxon";

import {APIError429} from "../errors.js";
import Cache from "../cache.js";

export default class RateLimit {
  #code

  #THRESHOLD_KEYS = {
    MINUTE: 'minute',
    HOUR: 'hour'
  }

  #THRESHOLDS = {
    [this.#THRESHOLD_KEYS.MINUTE]: 10,
    [this.#THRESHOLD_KEYS.HOUR]: 500
  }

  constructor({code}) {
    this.#code = code
  }

  validateWithinMinute = async () => {
    const minute = DateTime.local().toFormat('m')
    const key = `rate_limit/${this.#code}/m/${minute}`
    const threshold_key = this.#THRESHOLD_KEYS.MINUTE

    await this.#checkByThresholdAndIncrement({key, threshold_key, ttl: 60})
  }

  validateWithinHour = async () => {
    const hour = DateTime.local().toFormat('h')
    const key = `rate_limit/${this.#code}/h/${hour}`
    const threshold_key = this.#THRESHOLD_KEYS.HOUR

    await this.#checkByThresholdAndIncrement({key, threshold_key, ttl: 60 * 60})
  }

  #checkByThresholdAndIncrement = async ({key, threshold_key, ttl}) => {
    // MULTI transaction: INCR and EXPIRE are queued and executed atomically by EXEC.
    const cache_tx = Cache.getClient().multi();
    cache_tx.incr(key);
    cache_tx.expire(key, ttl);
    const [[, invocations]] = await cache_tx.exec();

    const threshold_value = this.#THRESHOLDS[threshold_key]
    if (invocations > threshold_value) {
      throw new APIError429(`Rate limit exceeded, max ${threshold_value} calls per ${threshold_key}.`)
    }
  }
}
The validateWithinMinute and validateWithinHour methods verify the usage in the two fixed time windows of 1m and 1h by defining the Redis key, the maximum threshold of calls (10 and 500 respectively), and the key expiration in seconds (TTL): 1m (60s) and 1h (60 * 60s).
The private #checkByThresholdAndIncrement method starts a Redis transaction, increments the key, sets its expiration, and commits the transaction. It then checks whether the counter value is above the configured threshold; if so, it throws an HTTP 429 error.
By using the return value of the cache_tx.incr operation, and since this operation is atomic, we avoid race conditions caused by simultaneous calls, a very important issue in concurrent contexts.
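To make the race concrete, here is a hand-interleaved sketch (my own illustration, not code from the repository) of why a non-atomic GET-then-SET counter loses updates, while a single atomic increment does not:

```javascript
// A non-atomic GET-then-SET counter can lose updates when two requests interleave.
// Below, the unlucky interleaving is written out by hand.
const kv = new Map();
const get = (key) => kv.get(key) ?? 0;
const set = (key, value) => kv.set(key, value);

// Request A and request B both read the counter before either writes it back:
const seenByA = get('calls');      // 0
const seenByB = get('calls');      // 0
set('calls', seenByA + 1);         // A writes 1
set('calls', seenByB + 1);         // B also writes 1 -> A's update is lost
const racyResult = get('calls');   // 1 instead of 2

// An atomic increment cannot interleave: the read and write happen as one step.
// (Here it is atomic because nothing can run between them in this synchronous
// sketch; Redis guarantees the same server-side for INCR.)
kv.clear();
const incr = (key) => { const value = get(key) + 1; set(key, value); return value; };
incr('calls');
incr('calls');
const atomicResult = get('calls'); // 2

console.log({racyResult, atomicResult}); // → { racyResult: 1, atomicResult: 2 }
```

This is why the implementation relies on INCR's return value instead of reading the counter and writing it back in separate commands.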
⭐️ Bonus: Try Rate Limiter!
I've deployed my version of the application. You can test it by calling the API more than 10 times; once you surpass the threshold, the API will respond with an HTTP 429 error.
Try one of the 5 links; for each one I configured a different API key:
🌟 Top reads of the week
Database Intermediate Series: Change Data Capture(I)
How Dropbox Scaled to 100 Thousand Users in a Year After Launch
How To Reduce and Report Uncertainty In Features
And that’s it for today! If you are finding this newsletter valuable, consider doing any of these:
🍻 Read with your friends — Implementing lives thanks to word of mouth. Share the article with someone who would like it.
📣 Provide your feedback — We welcome your thoughts! Please share your opinions or suggestions for improving the newsletter; your input helps us adapt the content to your tastes.
💬 Chat with me — If you have any doubts or curiosities, please write to me; I will be happy to answer you!
I wish you a great day! ☀️
Marco
Over time, I'm finding out how crucial knowledge of Redis is for a software engineer.