Rate Limiting in the Era of Infinite IPs
Rate limiting faces new challenges with IPv6, where the massive number of IP addresses renders traditional methods ineffective.
What is Rate Limiting?
Rate limiting is a critical part of network and application security, used to control the flow of requests sent to a server or service within a given timeframe. By capping the number of requests a single user can make, rate limiting prevents abuse and protects systems from being overwhelmed. It is especially important when preventing attacks like denial of service (DoS) attacks, where attackers flood a server with requests to take it down. Additionally, rate limiting can help combat things like web scraping, API abuse, and other forms of malicious or excessive traffic.
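At its core, this "cap the number of requests per user per timeframe" idea is just a counter keyed by some client identifier. Here is a minimal fixed-window sketch in Python (class and method names are my own, purely illustrative):

```python
import time
from collections import defaultdict

class FixedWindowLimiter:
    """Minimal fixed-window rate limiter keyed by a client identifier."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        # (key, window index) -> request count; unbounded in this sketch,
        # a real implementation would expire old windows.
        self.counts = defaultdict(int)

    def allow(self, key, now=None):
        """Return True if this request fits within the current window."""
        now = time.monotonic() if now is None else now
        bucket = (key, int(now // self.window))
        if self.counts[bucket] >= self.limit:
            return False
        self.counts[bucket] += 1
        return True
```

With `limit=3, window_seconds=60`, a fourth request from the same key inside the same minute is rejected while other keys are unaffected. Production systems typically prefer sliding windows or token buckets to avoid bursts at window boundaries, but the keying question (what is a "user"?) is the same.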
Traditionally, rate limiting is implemented per IP address, leveraging the assumption that each IP represents a unique user or device. This approach works effectively in IPv4 networks, where the relatively small address space of around 4.3 billion IPs imposes natural constraints. However, IPv4 also relies heavily on Network Address Translation (NAT), which allows multiple devices to share a single public IP address. While this can lead to some bycatch, where legitimate users behind the same NAT as an abusive user are rate limited, the issue is generally manageable. In practice, the number of devices behind a single NAT is often limited, and NAT itself imposes constraints that prevent excessive abuse. As a result, the per IP rate limiting model has proven sufficient in most cases.
As the internet very slowly transitions to IPv6, the foundational assumptions of IPv4 rate limiting no longer hold. To understand why, we should first look at how IPv6 allocation and subnetting work.
IPv6 Allocation and Subnetting
IPv6, the successor to IPv4, was introduced to address the growing scarcity of IPv4 addresses. With its 128-bit address space, IPv6 offers an incomprehensibly vast number of unique addresses. This abundance eliminates the need for NAT and provides enough addresses for every device on Earth to have its own public IP, with a little room to spare.
Unlike IPv4, where individual IPs are often assigned, IPv6 addresses are typically distributed in blocks, or subnets, to users. Subnetting in IPv6 is based on a slash notation (CIDR), such as /64 or /48, which defines how many bits are fixed and how many are available for devices within the subnet. For example:
- A /64 block, commonly assigned to end-users, provides 2⁶⁴ (about 18 quintillion) unique addresses. This size is mandated for Stateless Address Auto-Configuration (SLAAC), a mechanism for devices to configure their own addresses within a subnet without DHCP.
- Larger organizations may receive a /48 block, allowing them to divide it into 65,536 /64 subnets for internal use.
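These sizes are easy to verify with Python's standard `ipaddress` module:

```python
import ipaddress

# A typical end-user /64 allocation (2001:db8::/32 is a documentation prefix).
subnet = ipaddress.ip_network("2001:db8:abcd:12::/64")
print(subnet.num_addresses)  # 18446744073709551616, i.e. 2**64

# A /48 splits into 65,536 /64 subnets.
org = ipaddress.ip_network("2001:db8:abcd::/48")
print(sum(1 for _ in org.subnets(new_prefix=64)))  # 65536
```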
IPv6 allocation follows a hierarchical structure. The Internet Assigned Numbers Authority (IANA) delegates large blocks to Regional Internet Registries (RIRs), which then distribute smaller subnets to Internet Service Providers (ISPs) and enterprises. ISPs often assign residential users a /64 block or, in some cases, multiple /64 blocks.
Challenges in IPv6 Rate Limiting
In IPv4, an attacker’s ability to cycle through IPs is largely constrained by the scarcity of addresses. In IPv6, a single user with a /64 block has access to an astronomical number of addresses, rendering traditional per IP rate limiting ineffective.
Many widely-used servers and providers still default to per IP rate limiting, a method that assumes address scarcity. Some providers and tools have begun addressing this by rate limiting at the subnet level rather than per individual IP. Cloudflare, which has some really good documentation on this stuff, has implemented a default rate limit at the /64 block level. When used properly, this method effectively mitigates the risk of attackers cycling through millions of addresses within a single subnet while not being overly restrictive to legitimate users.
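In application code, subnet-level limiting amounts to collapsing each IPv6 client to its /64 prefix before using it as a rate-limit key. A small sketch (the function name is my own):

```python
import ipaddress

def rate_limit_key(client_ip):
    """Collapse IPv6 clients to their /64 prefix; key IPv4 clients per address."""
    addr = ipaddress.ip_address(client_ip)
    if addr.version == 6:
        # strict=False masks off the host bits, yielding the /64 network.
        return str(ipaddress.ip_network(f"{addr}/64", strict=False))
    return str(addr)
```

Every address an attacker cycles through within their /64 now maps to the same key, so the limiter sees one client instead of quintillions.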
However, many open-source and enterprise solutions lag behind. Popular servers like NGINX, Apache, and HAProxy still default to per IP limiting, though most web servers can be configured to behave differently. For instance, in NGINX, modules like ngx_http_limit_req_module can be used to rate limit based on subnet masks.
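As an illustration, NGINX's `limit_req_zone` accepts an arbitrary key, so a `map` can derive a rough /64 key from `$remote_addr`. This is a simplified sketch, not a drop-in config: the regex only matches fully written-out prefixes, so addresses with a compressed zero run (e.g. `2001:db8::1`) fall through to ordinary per-address keying.

```nginx
# Key IPv6 clients on their first four hextets (the /64 prefix);
# IPv4 clients and compressed IPv6 forms fall back to per-address keying.
map $remote_addr $limit_key {
    default                    $remote_addr;
    "~^(?<p>([0-9a-f]+:){4})"  $p;
}

limit_req_zone $limit_key zone=per_subnet:10m rate=10r/s;

server {
    listen 80;
    location / {
        limit_req zone=per_subnet burst=20 nodelay;
    }
}
```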
Subnet Blocking and Bycatch
While subnet-based rate limiting is a step forward, it introduces a new problem: bycatch. In many cases, attackers leveraging larger subnets are using blocks assigned by legitimate ISPs. Blocking an entire subnet to mitigate abuse could inadvertently disrupt legitimate users who share the same allocation.
This bycatch issue isn't unique to IPv6. In IPv4, Carrier-Grade NAT (CGNAT) creates similar challenges. With CGNAT, multiple users share a single public IPv4 address, and rate limiting at the IP level can impact all users behind the same NAT.
Rate Limiting Beyond IP
An alternative approach to rate limiting focuses on characteristics beyond IP addresses. Cloudflare's advanced rate limiting, for example, builds rules based on other parameters such as HTTP headers, request methods, URLs, and JA3 fingerprints.
Rules can be defined for specific endpoints or types of requests rather than for all traffic to a particular domain. Rate limits can be applied to User-Agent headers commonly associated with bot traffic, or to API endpoints vulnerable to abuse, such as login pages. This level of granularity helps block specific attack patterns without affecting legitimate users.
Authenticated users can be subject to rate limits tied to their account, which remains consistent regardless of their IP address. This is particularly useful for applications where users are likely to share IP space, such as those behind corporate firewalls or using mobile networks. Additionally, this approach can help mitigate abuse from web scrapers, as it ties limits to identifiable accounts rather than IPs, which can often be rotated more easily.
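Combining these ideas, a rate-limit key can scope by endpoint and prefer account identity, falling back to the network address only for anonymous traffic. A sketch under those assumptions (names are hypothetical):

```python
import ipaddress

def request_key(path, user_id, client_ip):
    """Scope by endpoint; prefer the account identifier, otherwise
    fall back to the client address (collapsed to /64 for IPv6)."""
    if user_id is not None:
        ident = f"user:{user_id}"
    else:
        addr = ipaddress.ip_address(client_ip)
        if addr.version == 6:
            ident = f"net:{ipaddress.ip_network(f'{addr}/64', strict=False)}"
        else:
            ident = f"ip:{addr}"
    return f"{path}|{ident}"
```

An authenticated user keeps the same key as they move between networks, while anonymous IPv6 clients are still grouped by subnet rather than by individual address.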
Limitations of Characteristic-Based Rate Limiting
I am a little skeptical about the practical effectiveness of rate limiting based on request characteristics. Attackers can easily modify or diversify the characteristics of their HTTP requests to bypass many of these rules. Even JA3 fingerprints can be spoofed to evade detection. I plan to write more about TLS fingerprinting in a blog post soon, so stay tuned if that sounds interesting.
While this blog focuses on rate limiting, more sophisticated anti-bot solutions, such as those provided by Akamai, represent another layer to this story, which I plan to explore in future posts.
Anyway, the internet still works, so while each of these approaches has its limitations, they seem to hold up well enough to keep things running smoothly—for now.