NGINX Rate Limiting: Protect Your Server from Bots, Scrapers and Brute Force

Every server on the public internet gets hammered. Credential-stuffing bots trying username/password combinations against your login. Scrapers pulling your entire site at 200 requests per second. Vulnerability scanners testing for last week’s WordPress CVE. Most of this traffic is automated, none of it pays you, and all of it eats CPU you could use serving real visitors. NGINX rate limiting is the cheap, simple, and remarkably effective first line of defence.

This guide covers the full NGINX rate limit configuration stack: limit_req_zone and limit_req, burst handling, per-endpoint limits, per-IP and per-user keys, connection limits, the WordPress-specific patterns that protect wp-login.php and xmlrpc.php, and Redis-backed cross-server limiting for multi-NGINX deployments. Tested on Debian and Ubuntu with the myguard packaged NGINX.

How NGINX Rate Limiting Actually Works

NGINX rate limiting is built on the leaky bucket algorithm. You declare a shared memory zone that tracks request rates per key (usually the client IP). When a request arrives, NGINX checks whether the key has exceeded its allowed rate. If it has, NGINX rejects the request immediately, without touching your backend. The rejection status is 503 Service Unavailable by default; set limit_req_status 429 to return 429 Too Many Requests instead, which is what the rest of this guide assumes.

The key insight: the rate limit is enforced at the NGINX layer, before PHP-FPM, before your database, before any application code. A bot can hammer you at 10,000 req/s and your PHP workers only ever see the handful of requests the limit lets through. That is the difference between “site is slow” and “site is fine, bots are 429ing.”

Basic NGINX Rate Limit Configuration

Two directives do the work: limit_req_zone (declare the zone) and limit_req (apply it):

http {
    # Declare the zone: 10MB of shared memory, 5 requests per second per IP
    limit_req_zone $binary_remote_addr zone=general:10m rate=5r/s;

    # Reject with 429 Too Many Requests instead of the default 503
    limit_req_status 429;

    server {
        location / {
            # Apply the zone, allow short bursts of up to 10 requests
            limit_req zone=general burst=10 nodelay;

            # ... your usual config
        }
    }
}

Decoding it:

  • $binary_remote_addr: the client IP in a compact 4-byte (IPv4) or 16-byte (IPv6) form. Use this rather than $remote_addr; it costs less memory.
  • zone=general:10m: name the zone “general”, give it 10MB of shared memory. Per the NGINX documentation each state takes 64 bytes on 32-bit platforms and 128 bytes on 64-bit platforms, so 10MB tracks roughly 80,000 unique IPs on a typical 64-bit server (about 160,000 on 32-bit).
  • rate=5r/s: five requests per second sustained, per IP. NGINX tracks this at millisecond granularity, so it really means one request every 200ms. That is more than a person clicking through pages needs and far below the rate most bots hammer at; page loads that fetch many assets at once are what burst is for.
  • burst=10: allow up to 10 requests above the rate to queue if the IP briefly spikes. Without burst, every request that arrives less than 200ms after the previous one is rejected immediately, which breaks legitimate users.
  • nodelay: process burst requests immediately instead of spacing them out. Without nodelay, NGINX delays queued requests to enforce the average rate, which makes legitimate traffic spikes feel slow.

Burst vs nodelay: The Setting Most People Get Wrong

Without nodelay, NGINX delays burst requests to smooth the rate. With nodelay, NGINX processes burst requests immediately but counts them against the bucket. You almost always want nodelay: with rate=5r/s and burst=10, the last request in a full burst waits about two seconds, and at per-minute rates the wait grows long enough that the page looks hung and clients time out.

# Bad (delays user requests, looks like a hang):
limit_req zone=general burst=10;

# Good (allows bursts, returns 429 if the bucket overflows):
limit_req zone=general burst=10 nodelay;
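
There is also a middle ground between the two settings above. The following is a minimal sketch of two-stage limiting with the delay parameter, assuming the 5r/s “general” zone from the basic example and NGINX 1.15.7 or later; the numbers 12 and 8 are illustrative, not a recommendation:

```nginx
# Two-stage limiting, assuming the 5r/s "general" zone declared earlier:
#   - the first 8 requests over the rate are served immediately
#   - the next 4 are delayed to hold the 5r/s rate
#   - anything beyond burst=12 is rejected
limit_req zone=general burst=12 delay=8;
```

This suits pages that legitimately fetch a handful of assets at once: the initial burst stays fast, while a client that keeps pushing is slowed down before it is rejected.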

Per-Endpoint Rate Limits

One zone for the whole site is fine for general protection. The bigger win is different rates for different endpoints. Login pages, search, and admin panels each get their own bucket:

http {
    # General browsing: generous limit
    limit_req_zone $binary_remote_addr zone=general:10m rate=10r/s;

    # Login: very strict, 5 attempts per minute
    limit_req_zone $binary_remote_addr zone=login:10m rate=5r/m;

    # Search: location blocks never see the query string, so key the zone
    # on a variable that is empty unless the request carries ?s=...
    map $arg_s $search_key {
        ""       "";                    # not a search: empty key, not counted
        default  $binary_remote_addr;   # search request: count per IP
    }
    limit_req_zone $search_key zone=search:10m rate=2r/s;

    # API: separate bucket so the website is unaffected
    limit_req_zone $binary_remote_addr zone=api:10m rate=30r/s;

    # Reject with 429 instead of the default 503, in every location
    limit_req_status 429;

    server {
        # Default for all locations; searches (/?s=...) also hit the search zone
        location / {
            limit_req zone=general burst=20 nodelay;
            limit_req zone=search burst=5 nodelay;
        }

        # Strict on login
        location = /wp-login.php {
            limit_req zone=login burst=5 nodelay;
            # ...PHP-FPM config
        }

        # API gets its own pool
        location /api/ {
            limit_req zone=api burst=50 nodelay;
        }
    }
}

Protecting WordPress: wp-login.php and xmlrpc.php

This is the configuration every WordPress site should run. WordPress’s wp-login.php and xmlrpc.php are among the most-attacked URLs on the internet. Rate limiting them blunts credential-stuffing attacks from any single IP:

# In the http {} context (for example nginx.conf or a conf.d file):
limit_req_zone $binary_remote_addr zone=wp_login:10m rate=5r/m;

# In the site's server {} block:
server {
    # Hard limit on wp-login.php
    location = /wp-login.php {
        limit_req zone=wp_login burst=3 nodelay;
        limit_req_status 429;
        try_files $uri =404;
        fastcgi_pass unix:/run/php/php8.4-fpm.sock;
        include fastcgi_params;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    }

    # xmlrpc.php is almost never legitimate — block entirely if you don't use Jetpack
    location = /xmlrpc.php {
        deny all;
        access_log off;
    }

    # Or, if you DO use xmlrpc.php, rate-limit instead of block:
    # location = /xmlrpc.php {
    #     limit_req zone=wp_login burst=2 nodelay;
    #     ... PHP-FPM config
    # }
}

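5 requests per minute is strict, but a legitimate user logging in once never hits it. A credential-stuffing bot hitting 50 passwords per second gets its first four attempts through (one plus the burst of 3) and is then rejected until the bucket drains, which at 5r/m means one further attempt every 12 seconds. This single change typically cuts bot traffic by 80-95% overnight, although a botnet spread across many IPs still gets a separate allowance per address.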
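
Before enforcing a limit this strict on a live site, it can help to observe what it would reject. The sketch below assumes NGINX 1.17.1 or later and the wp_login zone declared above; it uses limit_req_dry_run so that excess requests are counted and logged but still served:

```nginx
location = /wp-login.php {
    limit_req zone=wp_login burst=3 nodelay;
    limit_req_status 429;

    # Dry run: excess requests are accounted for and logged,
    # but nothing is actually rejected. Remove once the numbers look right.
    limit_req_dry_run on;

    # ... same PHP-FPM config as above
}
```

Once the log shows that only abusive clients would have been rejected, delete the limit_req_dry_run line and reload NGINX.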

Connection Limits Versus Request Limits

Rate limiting limits requests per second. The other useful primitive is connection limiting — capping how many concurrent connections one IP can hold open. The two work well together against different attack shapes:

http {
    # Rate (requests per second)
    limit_req_zone $binary_remote_addr zone=general:10m rate=10r/s;

    # Concurrent connections per IP
    limit_conn_zone $binary_remote_addr zone=conn_per_ip:10m;

    server {
        location / {
            limit_req zone=general burst=20 nodelay;
            limit_conn conn_per_ip 20;  # max 20 simultaneous connections per IP
        }
    }
}

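20 concurrent connections is generous for HTTP/1.1, where most browsers open about 6 connections per host. Under HTTP/2 and HTTP/3, NGINX counts each concurrent request as a separate connection, so a browser multiplexing a page with many assets can reach the limit sooner; raise it if real users see rejections. A scraper opening 200 connections from one IP gets cut off at 20. Without this, a single attacker can tie up a large share of your worker_connections pool with in-flight requests.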
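
Like limit_req, limit_conn rejects with 503 unless told otherwise. A minimal sketch of aligning it with the 429 used elsewhere in this guide, reusing the conn_per_ip zone from the example above (the log level is an illustrative choice):

```nginx
http {
    limit_conn_zone $binary_remote_addr zone=conn_per_ip:10m;

    # limit_conn rejects with 503 by default; match the 429 used for limit_req
    limit_conn_status 429;

    # Log rejected connections at "warn" instead of the default "error"
    limit_conn_log_level warn;

    server {
        location / {
            limit_conn conn_per_ip 20;
        }
    }
}
```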

Whitelisting Trusted IPs

Monitoring tools, your office IP, your CDN — these should bypass rate limits. Use the geo directive plus a map to build a whitelist:

http {
    geo $limit_exempt {
        default        0;
        10.0.0.0/8     1;   # Internal network
        203.0.113.45/32 1;  # Office IP
        2001:db8::/32   1;  # Office IPv6
    }

    # Use the geo result to choose the rate-limit key
    map $limit_exempt $limit_key {
        0  $binary_remote_addr;
        1  "";   # empty key — no rate limit
    }

    limit_req_zone $limit_key zone=general:10m rate=10r/s;

    server {
        location / {
            limit_req zone=general burst=20 nodelay;
        }
    }
}

Requests from whitelisted IPs use an empty key, which NGINX treats as “no limit.” Everyone else is rate-limited normally.

Customising the 429 Response

By default NGINX rejects rate-limited requests with a bare 503 page, or a bare 429 page once limit_req_status 429 is set. For a friendlier user experience, serve a custom static 429 page:

server {
    error_page 429 /429.html;

    location = /429.html {
        root /var/www/error-pages;
        internal;
        add_header Retry-After 60 always;
    }

    location / {
        limit_req zone=general burst=20 nodelay;
        limit_req_status 429;
    }
}

The Retry-After header tells well-behaved clients, such as search-engine crawlers and HTTP libraries with retry support, when to come back. Abusive bots tend to ignore it, but legitimate crawlers can honour it, which keeps SEO crawlers happy.

Logging Rate-Limited Requests

NGINX logs 429 responses in the regular access log. For dedicated rate-limit visibility, add a separate access log with a focused format:

http {
    map $status $loggable_429 {
        429  1;
        default  0;
    }

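    # $limit_req_status (NGINX 1.17.6+) is PASSED, DELAYED or REJECTED.
    # The zone name is not exposed as a variable; it appears in the error log.
    log_format ratelimit '$remote_addr - $time_iso8601 "$request" '
                        'status=$status limit=$limit_req_status '
                        'ua="$http_user_agent"';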

    server {
        access_log /var/log/nginx/ratelimit.log ratelimit if=$loggable_429;
    }
}

Tail /var/log/nginx/ratelimit.log and you have a live feed of attacking IPs and their user agents. Combine with fail2ban for automated longer-term IP banning.
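
NGINX also records each rejection in the error log, and that entry names the zone that triggered it. A minimal sketch of tuning its severity, assuming the general zone declared earlier; the choice of warn is illustrative:

```nginx
server {
    # Rejections are logged at "error" by default; delays one level lower.
    # "warn" keeps them visible without raising error-level alerts.
    limit_req_log_level warn;

    location / {
        limit_req zone=general burst=20 nodelay;
    }
}
```

Error log entries for rejected requests contain the phrase “limiting requests” together with the zone name and client address, which makes them easy to grep when several zones are active.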

fail2ban Integration: From Rate Limit to IP Ban

NGINX rate limiting returns 429s. fail2ban watches the log, counts repeated 429s from one IP, and adds an iptables rule blocking that IP entirely for a configurable duration. Two layers, much harder to attack:

# /etc/fail2ban/jail.d/nginx-ratelimit.conf
[nginx-ratelimit]
enabled  = true
filter   = nginx-ratelimit
action   = iptables-multiport[name=NginxRatelimit, port="http,https"]
logpath  = /var/log/nginx/ratelimit.log
maxretry = 50
# 10-minute window
findtime = 600
# ban for 1 hour
bantime  = 3600

# /etc/fail2ban/filter.d/nginx-ratelimit.conf
[Definition]
failregex = ^<HOST> - .* status=429
ignoreregex =

50 rate-limit hits in 10 minutes earns an IP a one-hour iptables ban. Tune the numbers to your traffic. Combine with cloud-level WAF (Cloudflare, AWS WAF) for the heaviest attackers.

Sizing the Shared Memory Zone

The shared memory zone holds one record per tracked key. The records are small: per the NGINX documentation, 64 bytes each on 32-bit platforms and 128 bytes on 64-bit platforms, so 1MB holds about 16,000 or 8,000 states respectively. Rough sizing on a 64-bit server:

  • 10MB tracks roughly 80,000 unique IPs simultaneously, which is fine for almost any site.
  • 50MB tracks roughly 400,000 unique IPs, for very high-traffic public sites.
  • 1MB tracks roughly 8,000 IPs, which is fine for internal services.

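If the zone fills up, NGINX evicts the least recently used record. Only if it still cannot create a new record after eviction is the request terminated with an error, so in practice the worst case is a brief tracking gap during an attack.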

Frequently Asked Questions

What rate limit should I set for wp-login.php?
5 requests per minute per IP with burst=3 nodelay is a robust default. A legitimate user logging in once never hits it; credential-stuffing bots are 429’d within seconds. If you also need WordPress REST API auth, give /wp-json/wp/v2/users/me a separate, slightly more relaxed bucket.
Does NGINX rate limiting work behind Cloudflare?
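Yes, but NGINX must see the real client IP. Otherwise every request looks like it came from a Cloudflare edge IP and your limits trigger on Cloudflare itself. Set real_ip_header CF-Connecting-IP and trust the Cloudflare IP ranges with set_real_ip_from; $binary_remote_addr then holds the visitor’s address and the zones in this guide work unchanged. Avoid keying directly on $http_cf_connecting_ip unless the origin only accepts connections from Cloudflare, because anyone who reaches the origin directly can forge that header.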
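
A minimal sketch of that setup follows. It assumes NGINX was built with the realip module, and the three ranges shown are only an illustrative subset; use the full, current list that Cloudflare publishes:

```nginx
http {
    # Illustrative subset only: replace with the full, current list of
    # ranges that Cloudflare publishes.
    set_real_ip_from 173.245.48.0/20;
    set_real_ip_from 103.21.244.0/22;
    set_real_ip_from 2400:cb00::/32;

    # Take the client address from Cloudflare's header for trusted peers
    real_ip_header CF-Connecting-IP;

    # $binary_remote_addr now reflects the visitor, not the Cloudflare edge
    limit_req_zone $binary_remote_addr zone=general:10m rate=10r/s;
}
```
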
Will rate limiting block Googlebot?
Not at sensible thresholds. Googlebot respects 429 responses and the Retry-After header. Set generous limits on /, strict limits on /wp-login.php. Whitelisting Googlebot by user agent is not recommended (any bot can lie about its UA) — verify Googlebot via reverse DNS if you really need it bypassed.
Should I use burst with delay or burst with nodelay?
Almost always nodelay. Without nodelay, NGINX holds burst requests open and serves them at the configured rate — that looks like a hung server to the user’s browser. With nodelay, burst requests are processed immediately and only excess returns 429.
How do I rate-limit by user (logged-in cookie) instead of IP?
Use a custom variable. Extract the user from $http_cookie or a session header with a map directive, fall back to $binary_remote_addr when the user variable is empty. The map output becomes your limit_req_zone key.
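
As a rough sketch of that approach: the cookie name session_id, the variable $user_limit_key and the zone name per_user are all hypothetical, so substitute whatever your application uses.

```nginx
http {
    # "session_id" is a hypothetical cookie name; substitute your application's.
    map $cookie_session_id $user_limit_key {
        ""       $binary_remote_addr;   # no cookie: fall back to per-IP
        default  $cookie_session_id;    # cookie present: one bucket per session
    }

    limit_req_zone $user_limit_key zone=per_user:10m rate=10r/s;

    server {
        location / {
            limit_req zone=per_user burst=20 nodelay;
        }
    }
}
```

Bear in mind that a cookie value is supplied by the client, so an attacker can rotate it to obtain a fresh bucket on every request. Pairing a per-user zone with a per-IP zone in the same location closes that gap.
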
Can NGINX rate limiting protect against full DDoS?
No — by the time traffic reaches your NGINX, it has already consumed bandwidth. Volumetric DDoS protection happens at the cloud / CDN layer (Cloudflare, AWS Shield, etc.). NGINX rate limiting protects against application-layer abuse: credential stuffing, scraping, scanner traffic, slow-burn brute force.
Does rate limiting work for HTTP/2 and HTTP/3?
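Yes. limit_req counts logical requests, not TCP connections, so each multiplexed HTTP/2 or QUIC stream counts individually, exactly as a separate HTTP/1.1 request would. limit_conn is the one to watch: under HTTP/2 and HTTP/3, NGINX treats each concurrent request as a separate connection, so a limit_conn value tuned for HTTP/1.1 browsers may need to be higher.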
