Every server on the public internet gets hammered. Credential-stuffing bots trying username/password combinations against your login. Scrapers pulling your entire site at 200 requests per second. Vulnerability scanners testing for last week’s WordPress CVE. Most of this traffic is automated, none of it pays you, and all of it eats CPU you could use serving real visitors. NGINX rate limiting is the cheap, simple, and remarkably effective first line of defence.
This guide covers the full NGINX rate limit configuration stack: limit_req_zone and limit_req, burst handling, per-endpoint limits, per-IP and per-user keys, connection limits, the WordPress-specific patterns that protect wp-login.php and xmlrpc.php, and Redis-backed cross-server limiting for multi-NGINX deployments. Tested on Debian and Ubuntu with the myguard packaged NGINX.
How NGINX Rate Limiting Actually Works
NGINX rate limiting is built on the leaky bucket algorithm. You declare a shared memory zone that tracks request rates per key (usually the client IP). When a request arrives, NGINX checks whether the key has exceeded its allowed rate. If it has, NGINX rejects the request immediately, without touching your backend. The default rejection status is 503 Service Unavailable; set limit_req_status 429 to return 429 Too Many Requests instead.
The key insight: the rate limit is enforced at the NGINX layer, before PHP-FPM, before your database, before any application code. A bot can hammer you at 10,000 req/s and your PHP workers never wake up. That is the difference between “site is slow” and “site is fine, bots are 429ing.”
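To make the leaky bucket concrete, here is a minimal Python sketch of the per-key accounting. It is a simplified model and not NGINX's actual C code: the class name LeakyBucket, its method names, and the float arithmetic are assumptions made for illustration (NGINX tracks the same quantity, which it calls the excess, in integer thousandths of a request). With burst left at 0, any request arriving faster than the rate is rejected.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LeakyBucket:
    """Simplified per-key state, loosely modelled on limit_req accounting."""
    rate: float                   # sustained rate in requests per second
    burst: int = 0                # extra requests tolerated above the rate
    excess: float = 0.0           # requests currently "in the bucket"
    last: Optional[float] = None  # time of the last accepted request

    def allow(self, now: float) -> bool:
        """Return True if a request arriving at time `now` is accepted."""
        if self.last is None:     # first request seen for this key
            self.last = now
            return True
        # The bucket leaks at `rate` per second; each request adds one unit.
        excess = max(0.0, self.excess - self.rate * (now - self.last) + 1.0)
        if excess > self.burst:
            return False          # rejected; stored state is left untouched
        self.excess = excess
        self.last = now
        return True


# 20 requests arriving at the same instant against rate=5r/s burst=10
bucket = LeakyBucket(rate=5, burst=10)
accepted = sum(bucket.allow(0.0) for _ in range(20))
print(accepted)  # 11: the first request plus a burst of 10
```

Because a rejected request never updates the stored state, a client that keeps hammering gains nothing: the bucket drains only with elapsed time.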
Basic NGINX Rate Limit Configuration
Two directives do the work: limit_req_zone (declare the zone) and limit_req (apply it):
http {
    # Declare the zone: 10MB of shared memory, 5 requests per second per IP
    limit_req_zone $binary_remote_addr zone=general:10m rate=5r/s;

    server {
        # Reject with 429 instead of the default 503
        limit_req_status 429;

        location / {
            # Apply the zone, allow short bursts of up to 10 requests
            limit_req zone=general burst=10 nodelay;
            # ... your usual config
        }
    }
}
Decoding it:
- $binary_remote_addr: the client IP in a compact 4-byte (IPv4) or 16-byte (IPv6) form. Use this rather than $remote_addr; it costs less memory.
- zone=general:10m: name the zone “general” and give it 10MB of shared memory. Per the NGINX documentation, a stored state takes 64 bytes on 32-bit platforms and 128 bytes on 64-bit platforms, so 10MB tracks roughly 160,000 or 80,000 unique IPs respectively.
- rate=5r/s: five requests per second sustained, per IP. Comfortably above what a person browsing sustains, and well below the rate most bots hammer at.
- burst=10: allow up to 10 requests beyond the rate to queue up if the IP briefly spikes above 5r/s. Without burst, every spike is rejected immediately, which breaks legitimate users.
- nodelay: process burst requests immediately instead of evenly spreading them. Without nodelay, NGINX delays responses to enforce the average rate, which feels broken to users on legitimate traffic spikes.
Burst vs nodelay: The Setting Most People Get Wrong
Without nodelay, NGINX delays burst requests to smooth the rate. With nodelay, NGINX processes burst requests immediately but counts them against the bucket. You almost always want nodelay — delayed responses look like a hanging server to the client browser, and modern HTTP clients give up.
# Bad (delays user requests, looks like a hang):
limit_req zone=general burst=10;
# Good (allows bursts, returns 429 if the bucket overflows):
limit_req zone=general burst=10 nodelay;
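The difference is easier to see with numbers. The sketch below is a simplified model (the function name burst_delays and the assumption that all requests arrive at the same instant against an empty bucket are mine, not NGINX's): it returns the delay each request would see, or None where the request is rejected.

```python
from typing import List, Optional


def burst_delays(rate: float, burst: int, n: int,
                 nodelay: bool) -> List[Optional[float]]:
    """Delay in seconds applied to each of n simultaneous requests from one key.

    None means the request is rejected. Simplified model: all n requests
    arrive at the same instant against an empty bucket.
    """
    delays: List[Optional[float]] = []
    for excess in range(n):               # the i-th request sees an excess of i
        if excess > burst:
            delays.append(None)           # bucket overflow: rejected
        elif nodelay:
            delays.append(0.0)            # served immediately, still counted
        else:
            delays.append(excess / rate)  # held back to honour the rate
    return delays


print(burst_delays(rate=5, burst=10, n=12, nodelay=False))
print(burst_delays(rate=5, burst=10, n=12, nodelay=True))
```

In this model, with rate=5r/s and burst=10 but no nodelay, the eleventh request waits a full two seconds before it is even passed to the backend; with nodelay the same eleven requests are served at once, and the twelfth is rejected either way.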
Per-Endpoint Rate Limits
One zone for the whole site is fine for general protection. The bigger win is different rates for different endpoints. Login pages, search, and admin panels each get their own bucket:
http {
    # General browsing: generous limit
    limit_req_zone $binary_remote_addr zone=general:10m rate=10r/s;
    # Login: very strict, 5 attempts per minute
    limit_req_zone $binary_remote_addr zone=login:10m rate=5r/m;
    # Search: medium strict, search is expensive.
    # location blocks match only the URI path, never the query string,
    # so key the search zone on the IP only when ?s= is present.
    map $arg_s $search_key {
        ""      "";                   # no search term: not counted
        default $binary_remote_addr;
    }
    limit_req_zone $search_key zone=search:10m rate=2r/s;
    # API: separate bucket so the website is unaffected
    limit_req_zone $binary_remote_addr zone=api:10m rate=30r/s;

    server {
        # Return 429 rather than the default 503 for every zone
        limit_req_status 429;

        # Default for all locations, plus the search bucket for /?s=...
        location / {
            limit_req zone=general burst=20 nodelay;
            limit_req zone=search burst=5 nodelay;
        }
        # Strict on login
        location = /wp-login.php {
            limit_req zone=login burst=5 nodelay;
            # ...PHP-FPM config
        }
        # API gets its own pool
        location /api/ {
            limit_req zone=api burst=50 nodelay;
        }
    }
}
Protecting WordPress: wp-login.php and xmlrpc.php
This is the configuration every WordPress site should run. WordPress’s wp-login.php and xmlrpc.php are among the most-attacked URLs on the internet. Rate limiting them takes the speed out of credential-stuffing attacks:
http {
    limit_req_zone $binary_remote_addr zone=wp_login:10m rate=5r/m;

    server {
        # Hard limit on wp-login.php
        location = /wp-login.php {
            limit_req zone=wp_login burst=3 nodelay;
            limit_req_status 429;
            try_files $uri =404;
            fastcgi_pass unix:/run/php/php8.4-fpm.sock;
            include fastcgi_params;
            fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        }
        # xmlrpc.php is almost never legitimate; block entirely if you don't use Jetpack
        location = /xmlrpc.php {
            deny all;
            access_log off;
        }
        # Or, if you DO use xmlrpc.php, rate-limit instead of block:
        # location = /xmlrpc.php {
        #     limit_req zone=wp_login burst=2 nodelay;
        #     ... PHP-FPM config
        # }
    }
}
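5 requests per minute is brutal, yet a legitimate user logging in once (one GET for the form, one POST to submit it) never hits it. A credential-stuffing bot firing 50 passwords per second gets its first four attempts through (one plus burst=3), is rejected from the fifth onwards, and after that lands roughly one attempt every 12 seconds for as long as it keeps hammering. This single change typically cuts bot traffic by 80-95% overnight.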
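A quick back-of-the-envelope calculation shows how little gets through. The helper below is a rough upper bound under stated assumptions (the function name max_accepted is made up for this sketch, and it ignores timing jitter): one initial request, the burst allowance, and whatever the sustained rate refills over the period.

```python
def max_accepted(rate_per_min: int, burst: int, seconds: int) -> int:
    """Rough upper bound on requests one IP gets through in `seconds`.

    One initial request, plus the burst allowance, plus whatever the
    sustained rate refills over the period.
    """
    return 1 + burst + (rate_per_min * seconds) // 60


attempts_per_hour = 50 * 3600  # a bot firing 50 passwords per second
through = max_accepted(rate_per_min=5, burst=3, seconds=3600)
print(f"{through} of {attempts_per_hour} attempts reach PHP")
# prints "304 of 180000 attempts reach PHP"
```

At that pace a dictionary of a million passwords would take months from a single IP, which is usually enough to make the bot move on.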
Connection Limits Versus Request Limits
Rate limiting limits requests per second. The other useful primitive is connection limiting — capping how many concurrent connections one IP can hold open. The two work well together against different attack shapes:
http {
    # Rate (requests per second)
    limit_req_zone $binary_remote_addr zone=general:10m rate=10r/s;
    # Concurrent connections per IP
    limit_conn_zone $binary_remote_addr zone=conn_per_ip:10m;

    server {
        location / {
            limit_req zone=general burst=20 nodelay;
            limit_conn conn_per_ip 20;  # max 20 simultaneous connections per IP
        }
    }
}
20 concurrent connections is generous (most real browsers use 6 per host). A scraper opening 200 connections from one IP gets cut off at 20. Without this, a single attacker can exhaust your worker_connections pool.
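Where a request limit measures a rate over time, a connection limit is just a counter of what is open right now. The toy model below illustrates that difference; the class ConnLimiter and its methods are hypothetical names for this sketch and say nothing about how NGINX implements limit_conn internally.

```python
from collections import Counter


class ConnLimiter:
    """Toy model of a per-key cap on concurrent connections."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.active: Counter = Counter()

    def open(self, ip: str) -> bool:
        """Register a new connection; False means it would be refused."""
        if self.active[ip] >= self.limit:
            return False
        self.active[ip] += 1
        return True

    def close(self, ip: str) -> None:
        """Release a slot when a connection finishes."""
        if self.active[ip] > 0:
            self.active[ip] -= 1


limiter = ConnLimiter(limit=20)
opened = sum(limiter.open("198.51.100.7") for _ in range(200))
print(opened)  # 20: the other 180 connection attempts are refused
```

Unlike the leaky bucket, nothing here drains with time: a slot is freed only when a connection closes, which is why the two limits catch different attack shapes.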
Whitelisting Trusted IPs
Monitoring tools, your office IP, your CDN — these should bypass rate limits. Use the geo directive plus a map to build a whitelist:
http {
    geo $limit_exempt {
        default          0;
        10.0.0.0/8       1;  # Internal network
        203.0.113.45/32  1;  # Office IP
        2001:db8::/32    1;  # Office IPv6
    }
    # Use the geo result to choose the rate-limit key
    map $limit_exempt $limit_key {
        0 $binary_remote_addr;
        1 "";  # empty key, no rate limit
    }
    limit_req_zone $limit_key zone=general:10m rate=10r/s;

    server {
        location / {
            limit_req zone=general burst=20 nodelay;
        }
    }
}
Requests from whitelisted IPs use an empty key, which NGINX treats as “no limit.” Everyone else is rate-limited normally.
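The geo-plus-map pair is simply a lookup that yields either the packed client address or an empty string. As a sketch of that logic only (the function limit_key and the constant EXEMPT are illustrative names, and the networks are copied from the example config), the same decision in Python looks like this:

```python
import ipaddress

# Same networks as the geo block in the example config
EXEMPT = [ipaddress.ip_network(n) for n in
          ("10.0.0.0/8", "203.0.113.45/32", "2001:db8::/32")]


def limit_key(client_ip: str) -> bytes:
    """Return the rate-limit key: packed address, or b"" if exempt."""
    addr = ipaddress.ip_address(client_ip)
    if any(addr in net for net in EXEMPT):
        return b""       # empty key: the request is not counted
    return addr.packed   # 4 bytes for IPv4, 16 bytes for IPv6


print(limit_key("10.1.2.3"))      # exempt internal address
print(limit_key("198.51.100.7"))  # ordinary client, limited per IP
```

A sketch like this is a handy way to sanity-check a whitelist before deploying it: feed it the addresses of your monitoring hosts and confirm that each one comes back with an empty key.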
Customising the 429 Response
By default NGINX answers a rate-limited request with its bare built-in error page. For a friendlier user experience (and a response that costs almost nothing to serve), use a custom static 429 page:
server {
    error_page 429 /429.html;

    location = /429.html {
        root /var/www/error-pages;
        internal;
        add_header Retry-After 60 always;
    }
    location / {
        limit_req zone=general burst=20 nodelay;
        limit_req_status 429;
    }
}
The Retry-After header tells well-behaved clients (Googlebot, modern HTTP libraries) when to come back. Bots tend to ignore it but legitimate scrapers respect it, which keeps SEO crawlers happy.
Logging Rate-Limited Requests
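NGINX records 429 responses in the regular access log, and writes each rejection to the error log together with the name of the zone that triggered it. For dedicated rate-limit visibility, add a separate access log with a focused format: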
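http {
    map $status $loggable_429 {
        429     1;
        default 0;
    }
    # $limit_req_status (NGINX 1.17.6+) is PASSED, DELAYED or REJECTED.
    # There is no variable holding the zone name; the error log records it.
    log_format ratelimit '$remote_addr - $time_iso8601 "$request" '
                         'status=$status limit=$limit_req_status '
                         'ua="$http_user_agent"';

    server {
        # An access_log here replaces any inherited one, so re-declare the
        # main log too (path assumed; adjust to your setup).
        access_log /var/log/nginx/access.log;
        access_log /var/log/nginx/ratelimit.log ratelimit if=$loggable_429;
    }
}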
Tail /var/log/nginx/ratelimit.log and you have a live feed of attacking IPs and their user agents. Combine with fail2ban for automated longer-term IP banning.
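For a quick summary rather than a live tail, a few lines of Python can rank the offenders. This is a minimal sketch: the function name top_offenders is an assumption, the regular expression relies only on the leading fields of the ratelimit format ($remote_addr, a dash, $time_iso8601, the quoted request, then status=429), and the sample lines are invented for illustration with their trailing fields omitted.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

# Matches the start of each line written with the "ratelimit" log_format
LINE = re.compile(r'^(?P<ip>\S+) - (?P<time>\S+) "(?P<request>[^"]*)" status=429\b')


def top_offenders(lines: Iterable[str], n: int = 10) -> List[Tuple[str, int]]:
    """Count rejected requests per client IP, most frequent first."""
    hits: Counter = Counter()
    for line in lines:
        match = LINE.match(line)
        if match:
            hits[match.group("ip")] += 1
    return hits.most_common(n)


# Invented sample lines (trailing fields omitted)
sample = [
    '198.51.100.7 - 2025-01-01T12:00:00+00:00 "POST /wp-login.php HTTP/1.1" status=429',
    '203.0.113.9 - 2025-01-01T12:00:01+00:00 "GET /?s=test HTTP/1.1" status=429',
    '198.51.100.7 - 2025-01-01T12:00:02+00:00 "POST /wp-login.php HTTP/1.1" status=429',
]
print(top_offenders(sample))
```

To run it against the real log, pass an open file handle instead of the sample list, for example `with open("/var/log/nginx/ratelimit.log") as fh: print(top_offenders(fh))`.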
fail2ban Integration: From Rate Limit to IP Ban
NGINX rate limiting returns 429s. fail2ban watches the log, counts repeated 429s from one IP, and adds an iptables rule blocking that IP entirely for a configurable duration. Two layers, much harder to attack:
# /etc/fail2ban/jail.d/nginx-ratelimit.conf
[nginx-ratelimit]
enabled = true
filter = nginx-ratelimit
action = iptables-multiport[name=NginxRatelimit, port="http,https"]
logpath = /var/log/nginx/ratelimit.log
maxretry = 50
# 10-minute window
findtime = 600
# ban for 1 hour
bantime = 3600

# /etc/fail2ban/filter.d/nginx-ratelimit.conf
[Definition]
failregex = ^<HOST> - .* status=429
ignoreregex =
Fifty rate-limit hits in 10 minutes earn an IP a one-hour iptables ban. Tune the numbers to your traffic. Combine with a cloud-level WAF (Cloudflare, AWS WAF) for the heaviest attackers.
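When tuning maxretry and findtime it helps to replay a traffic pattern and see whether it would trip the ban. The sketch below models the threshold as a sliding window; it is a simplification written for this guide (the function first_ban_time is a made-up name) and not fail2ban's actual implementation.

```python
from collections import deque
from typing import Iterable, Optional


def first_ban_time(hit_times: Iterable[float], maxretry: int = 50,
                   findtime: float = 600.0) -> Optional[float]:
    """Return the time at which an IP would be banned, or None.

    hit_times: ascending timestamps (seconds) of 429s logged for one IP.
    A ban triggers once `maxretry` hits fall within a `findtime` window.
    """
    window: deque = deque()
    for t in hit_times:
        window.append(t)
        while window and t - window[0] > findtime:
            window.popleft()  # forget hits older than the window
        if len(window) >= maxretry:
            return t
    return None


print(first_ban_time(range(0, 100)))          # one 429 per second
print(first_ban_time(range(0, 100_000, 60)))  # one 429 per minute
```

In this model a client collecting one 429 per second is banned within the first minute, while one that trips the limit once a minute never accumulates 50 hits inside a 10-minute window and stays unbanned.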
Sizing the Shared Memory Zone
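The shared memory zone holds one record per tracked key. The records are tiny: according to the NGINX documentation, a state takes 64 bytes on 32-bit platforms and 128 bytes on 64-bit platforms. Rough sizing:
- 10MB tracks roughly 160,000 unique IPs at 64 bytes per state, or about 80,000 at 128 bytes: fine for almost any site.
- 50MB tracks roughly 800,000 unique IPs at 64 bytes, or about 400,000 at 128 bytes: for very high-traffic public sites.
- 1MB tracks roughly 16,000 IPs at 64 bytes, or about 8,000 at 128 bytes: fine for internal services.
If the zone fills up, NGINX evicts the least recently used record to make room. Only if a new record still cannot be created is the request terminated with an error, so in practice the worst case is a brief tracking gap during an attack.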
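The sizing figures are plain division, sketched below so you can plug in your own zone size. The function name states_per_zone is an assumption, and the result ignores shared-memory allocator overhead, so treat it as an upper bound. The NGINX documentation gives 64 bytes per state on 32-bit platforms and 128 bytes on 64-bit platforms, so both are shown.

```python
def states_per_zone(zone_mb: int, state_bytes: int) -> int:
    """Approximate number of keys a zone holds, ignoring allocator overhead."""
    return zone_mb * 1024 * 1024 // state_bytes


for mb in (1, 10, 50):
    print(mb, states_per_zone(mb, 64), states_per_zone(mb, 128))
```

For a 10MB zone this gives 163,840 states at 64 bytes and 81,920 at 128 bytes, which is where the rounded figures come from.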
Related Posts
- WordPress NGINX + PHP-FPM Configuration Guide — the full WordPress server config rate limiting fits inside.
- How to Install ModSecurity and OWASP CRS on NGINX — the WAF that pairs perfectly with rate limiting.
- NGINX Reverse Proxy Configuration Guide — proxying with rate limits in front of backends.
- NGINX Load Balancing — protecting each upstream from abuse.
- PHP-Snuffleupagus — interpreter-level security for the layer behind NGINX.