How to Implement Rate Limiting in Spring Applications using Redis

In this series of mini-tutorials we'll explore several approaches to implement rate limiting in Spring applications using Redis. We’ll start with the most basic of Redis recipes and we’ll slowly increase the complexity of our implementations.

What is Rate Limiting?

Rate Limiting entails techniques to regulate the number of requests a particular client can make against a networked service. It caps the total number and/or the frequency of requests.

Why do we need Rate Limiting?

There are many reasons why you would want to add a rate limiter to your APIs, whether it is to prevent intentional or accidental API abuse, a rate limiter can stop the invaders at the gate. Let’s think of some scenarios where a rate limiter could save your bacon:

  • If you ever worked at an API-based startup, you know that to get anywhere you need a “free” tier. A free tier will get potential customers to try your service and spread the word. But without limiting the free tier users you could risk losing the few paid customers your startup has.

  • Programmatic integrations with your API could have bugs. Sometimes resource starvation is not caused by a malicious attack. These FFDoS (Friendly-Fire Denial of Service) attacks happen more often than you can imagine.

  • Finally, there are malicious players recruiting bots on a daily basis to make API providers’ lives miserable. Being able to detect and curtail those attacks before they impact your users could mean the life of our business.

Rate limiting is typically implemented on the server-side but if you have control of the clients you can also preempt certain types of access at that point. It relies on three particular pieces of information:

  1. Who’s making the request: Identifying the source of the attack or abuse is the most important part of the equation. If the offending requests cannot be grouped and associated with a single entity you’ll be fighting in the dark.

  2. What’s the cost of the request: Not all requests are created equal, for example, a request that’s bound to a single account’s data, likely can only cause localized havoc, while a requests that spans multiple accounts, and/or broad spans of time (like multiple years) are much more expensive

  3. What is their allotted quota: How many total requests and/or what’s the rate of requests permitted for the user. For example, in the case of the "free tier" you might have a smaller allotment/bucket of requests they can make, or you migth reduce them during certain peek times.

Why Redis for Rate Limiting?

Redis is especially positioned as a platform to implement rate limiting for several reasons:

  • Speed: The checks and calculations required by a rate limiting implementation will add the total request-response times of your API, you want those operations to happen as fast as possible.

  • Centralization and distribution: Redis can seamlessly scale your single server/instance setup to hundreds of nodes without sacrificing performance or reliability.

  • The Right Abstractions: Redis provides optimized data structures to support several of the most common rate limiter implementations and with i’s built-in TTL (time-to-live controls) it allows for efficient management of memory. Counting things is a built-in feature in Redis and one of the many areas where Redis shines above the competition.

Now, let’s get started with our first implementation; the simple “Fixed Window” implementation.