Your API’s request log lights up — 500 hits from one IP in seconds. Servers sweat, costs spike. That’s when Sushant Dhiman’s configurable rate limiter kicks in, a custom beast that doesn’t just block; it adapts.
Zoom out. Rate limiting isn’t new. It’s been table stakes since AWS and Stripe started charging per call. But most devs grab Redis + some leaky bucket lib and call it done. Dhiman’s take? A from-scratch implementation that’s swappable across token bucket, sliding window, even fixed window counters. Why? Because generic crates ignore your quirky traffic patterns — think bursty e-commerce spikes versus steady IoT drips.
Why Roll Your Own Rate Limiter in 2024?
Here’s the thing. Market data screams for it. Cloudflare’s 2023 DDoS report clocked 20 million attacks daily; API endpoints took 30% of the hit. Off-the-shelf tools like express-rate-limit cap at basic fixed windows — fine for hobby projects, lousy for scale. Dhiman’s version, coded in Go (yeah, that performant darling), lets you dial limits per IP, user, or endpoint. Configure via YAML: window: 60s, capacity: 100, refill: 10/s. Boom, tailored defense.
He walks through the math early. Token bucket? Permits bursts up to a bucket size, then trickles. Equation’s simple: tokens added at rate r, consumed per request. Exceed? 429 response. But fixed windows? They cluster at boundaries — 100 reqs at minute’s end, zero mid-window. Sliding window smooths that with log-stored counters, querying [now - window, now].
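That refill math fits in a few lines of Go. Here’s a minimal single-key sketch (my illustrative code, not the repo’s; the `TokenBucket` name and fields are assumptions):

```go
package main

import (
	"fmt"
	"time"
)

// TokenBucket permits bursts up to Capacity, then trickles at Rate tokens/sec.
type TokenBucket struct {
	Capacity float64
	Rate     float64 // tokens added per second (the "r" in the math)
	tokens   float64
	last     time.Time
}

func NewTokenBucket(capacity, rate float64) *TokenBucket {
	return &TokenBucket{Capacity: capacity, Rate: rate, tokens: capacity, last: time.Now()}
}

// Allow consumes one token if available; a false return maps to a 429 response.
func (b *TokenBucket) Allow() bool {
	now := time.Now()
	// Refill: tokens accrue at Rate since the last check, capped at Capacity.
	b.tokens += now.Sub(b.last).Seconds() * b.Rate
	if b.tokens > b.Capacity {
		b.tokens = b.Capacity
	}
	b.last = now
	if b.tokens >= 1 {
		b.tokens--
		return true
	}
	return false
}

func main() {
	b := NewTokenBucket(3, 1) // burst of 3, refill 1 token/s
	for i := 0; i < 5; i++ {
		fmt.Println(b.Allow()) // first 3 true (the burst), then false
	}
}
```

A burst of three sails through on the stored tokens; the fourth request inside the same instant gets denied until the refill catches up.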
“I wanted something configurable because APIs can’t say ‘Chill’ on their own. Most libraries force one algo; mine swaps them runtime.” — Sushant Dhiman, via his dev blog
Smart. And it’s open source — GitHub repo linked, MIT license. Fork it, tweak it.
But wait — is this reinventing nginx’s limit_req? Kinda. Except nginx is server-level, sticky to HTTP. This? Pluggable middleware for any Go API, even gRPC.
Does a Configurable Rate Limiter Actually Scale?
Scale’s the rub. Dhiman benchmarks it: 10k req/s on a t3.medium AWS box, sub-1ms latency under load. Uses Ristretto cache for counters (LRU with metrics), dodging Redis’ network hop. At 1M keys? Memory caps at 500MB, eviction tuned.
Compare to market leaders. Kong’s rate limiter plugin? Solid, but Lua VM overhead kills microsecond responses. Tyk? Enterprise pricing starts at $15k/year. This? Free, with p99 latency under 200μs per his Artillery tests.
Look, production war stories back this. Early Twitter (pre-Fail Whale) ignored per-user limits; one celeb tweet = outage. Netflix’s Zuul gateway? Custom Hystrix + token buckets saved their bacon during Stranger Things peaks. Dhiman’s insight mirrors that: configurability beats rigidity. My bold prediction? This pattern hits OSS charts hard in 2025 as API economies boom — Gartner pegs API management at $7B by then.
Critique time. His PR spin calls it ‘battle-tested’ after ‘months in prod.’ Months? Cute, but show me the war scars — uptime SLOs, say 99.99%. Still, code’s clean: interfaces for strategies, Prometheus metrics baked in. No bloat.
Token Bucket vs. Sliding Window: Pick Your Poison
Token bucket wins bursts.
Now, deep dive. Fixed window’s trash — imagine New Year’s Eve clock reset, everyone slams. Leaky bucket smooths output but queues inputs (hello, backlog). Dhiman’s hybrid? Runtime flag: algo: token_bucket or sliding_log. Logs? Gorilla time-series style, compact bloom filters for probable hits, exact on collision.
Data point: In his load test graphs (blog screenshots), sliding window cuts false negatives 40% vs. fixed. For high-cardinality keys (millions of users), it shards by prefix — clever, and it scales horizontally.
And the config YAML? Genius for ops.
```yaml
limits:
  global:
    window: 1m
    capacity: 1000
  per_user:
    window: 60s
    capacity: 50
```
Reloads hot, no restart. Beats env vars every time.
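Hot reload without a restart is usually an atomic pointer swap under the hood. A sketch of the pattern (the `Config` fields and `Reload` helper are my assumptions, not the repo’s API):

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Config mirrors one limits block from the YAML (illustrative fields).
type Config struct {
	WindowSeconds int
	Capacity      int
}

var current atomic.Pointer[Config]

// Reload swaps in a freshly parsed config. In-flight requests keep the
// snapshot they already loaded; new requests see the new limits. No restart.
func Reload(c *Config) {
	current.Store(c)
}

func main() {
	Reload(&Config{WindowSeconds: 60, Capacity: 1000})
	fmt.Println(current.Load().Capacity) // 1000
	Reload(&Config{WindowSeconds: 60, Capacity: 50})
	fmt.Println(current.Load().Capacity) // 50
}
```

Pair this with a file watcher or a SIGHUP handler and ops can tighten limits mid-incident without dropping a single connection.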
The Hidden Cost of API Abuse — And How This Fixes It
Costs. Stripe charges $0.0001 per API call overage; AWS API Gateway? $3.50/million after free tier. Unthrottled bots? Your bill triples. Dhiman’s limiter logs violators — IP, endpoint, burst size — feeds straight to Fail2Ban or your SIEM.
Unique angle: Echoes Akamai’s 90s hardware limiters, but software-only. Back then, $100k boxes guarded Yahoo. Now? This runs on $10/month VPS. Democratized.
Downsides? Go-only for now. Port to Rust? Easy, actix-web crate awaits. No geo-IP yet — add MaxMind, boom.
Teams adopting similar: Vercel Edge Config for limits, dynamic. But locked-in. This? Self-hosted freedom.
Why Does This Matter for API Builders?
It hands control back.
Builders — if you’re gluing Stripe + OpenAI calls, this middleware proxies and throttles upstream too. Chain limits: app-level + vendor. Prevents cascade failures.
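Chaining composes naturally: a request proceeds only if every limiter in the chain says yes. A toy sketch (the `Chain` type and `quota` limiter are mine, purely to show the composition):

```go
package main

import "fmt"

// Limiter is the minimal strategy interface.
type Limiter interface {
	Allow() bool
}

// Chain allows a request only when every limiter in it does —
// e.g. an app-level limit stacked on a per-vendor upstream limit.
// Note: earlier limiters still consume budget when a later one denies;
// a production version would want to undo or pre-check.
type Chain []Limiter

func (c Chain) Allow() bool {
	for _, l := range c {
		if !l.Allow() {
			return false
		}
	}
	return true
}

// quota is a toy limiter with a fixed remaining budget.
type quota struct{ left int }

func (q *quota) Allow() bool {
	if q.left > 0 {
		q.left--
		return true
	}
	return false
}

func main() {
	app, vendor := &quota{left: 5}, &quota{left: 2}
	c := Chain{app, vendor}
	fmt.Println(c.Allow(), c.Allow(), c.Allow()) // true true false
}
```

The tighter vendor budget trips first, so your app never forwards a call the upstream would bill you for rejecting.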
Market dynamic: With AI agents spamming APIs (Anthropic’s Claude chews 10k tokens/call), per-agent limits save cash. Dhiman’s extensible — add JWT claims.
Frequently Asked Questions
What is a configurable rate limiter?
It’s a throttling system you tweak on the fly — window size, algorithm, keys — without code changes. Perfect for varying loads.
How do you implement a token bucket rate limiter in Go?
Grab Ristretto for storage, define capacity/refill rate, check tokens before serving. Dhiman’s repo has the full handler.
Is a custom rate limiter better than Redis-based ones?
For low latency, yes — an in-process check skips the network hop. Redis shines for distributed limits across instances, but adds 5-10ms of RTT per check.