Redactions crumble.
Cross-reference a dozen government dumps from the Epstein saga, and hidden text bubbles up—algorithmically, no guesses. That’s Unobfuscator’s magic, spitting out a SQLite database crammed with 1.4 million documents. Fine for coders. Useless for journalists chasing leads. Enter TEREDACTA: a web interface that flips this into something explorable, fast, fearless.
And here’s the kicker—no React bloat, no npm hell. Just FastAPI, HTMX, Jinja2, and SQLite. A stack so lean it feels like 2015 all over again, but sharper.
Why Ditch the JS Overlords?
Look, we all drank the SPA Kool-Aid. Client-side rendering promised snappiness; delivered bundle sizes from hell and debug nightmares. This tool laughs at that. HTMX swaps HTML chunks over HTTP—partial updates, no state hoarding in some virtual DOM.
The entire frontend is HTMX with vendored JS and server-rendered templates.
Boom. That’s the original builder’s mic drop. Server-sent events pipe progress on chunky ops. Feels reactive. Ships zero app JS.
But does it scale to investigative drudgery? Boolean search across millions. Entity graphs linking names, emails, orgs. Doc viewers highlighting recoveries. Progress bars on de-redaction runs. HTMX nails it with hx-get, hx-swap. One request, one HTML snippet. Browser network tab? Your crystal ball.
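The hx-get and hx-swap pattern is easier to see from the server's side. What follows is a minimal sketch, not TEREDACTA's code: the /search path, the #results target, and the render_results helper are all hypothetical, and a real app would return the fragment from a FastAPI route through a Jinja2 template rather than build strings by hand.

```python
from html import escape

# Hypothetical search box: HTMX issues GET /search as the user types and
# swaps the returned fragment into #results. Path and ids are assumptions.
SEARCH_FORM = """
<input type="search" name="q"
       hx-get="/search"
       hx-trigger="keyup changed delay:300ms"
       hx-target="#results"
       hx-swap="innerHTML">
<ul id="results"></ul>
"""


def render_results(hits: list[dict]) -> str:
    """Return only the <li> rows, not a full page; HTMX swaps them in."""
    if not hits:
        return "<li>No matches.</li>"
    return "\n".join(
        f'<li><a href="/doc/{int(h["id"])}">{escape(h["title"])}</a></li>'
        for h in hits
    )
```

The response body is the whole contract: whatever render_results returns is exactly what shows up in the network tab and in the page.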
Debugging’s a dream—no Redux rivers to trace, no component re-renders glitching out. HTTP in, HTML out. When a search flakes, refresh the tab. Done.
How SQLite Swallows 1.4 Million Docs Whole
SQLite. Single-writer darling. Concurrent reads fly on static data, which is perfect for FOIA spelunking. But cold starts? Ten seconds of yawn. Fix: ruthless indexes, PRAGMA tweaks. mmap_size raises the ceiling on memory-mapped I/O; cache_size enlarges the page cache.
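A rough sketch of that tuning with Python's built-in sqlite3 module. The sizes mirror the figures quoted later in the piece (about 400MB of cache, 1GB of mmap); the documents table, its columns, and the index name are assumptions made up for illustration, not the project's schema.

```python
import sqlite3


def open_tuned(path: str) -> sqlite3.Connection:
    """Open a SQLite file with a larger page cache and memory-mapped I/O."""
    conn = sqlite3.connect(path)
    # Negative cache_size is in KiB: -400000 is roughly 400 MB of page cache.
    conn.execute("PRAGMA cache_size = -400000")
    # mmap_size is in bytes; SQLite may cap it at its compile-time maximum.
    conn.execute("PRAGMA mmap_size = 1073741824")  # request 1 GiB
    return conn


def add_lookup_index(conn: sqlite3.Connection) -> None:
    """Index a hypothetical documents(source, doc_date) lookup path."""
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_docs_source_date "
        "ON documents (source, doc_date)"
    )
```

Both PRAGMAs are per-connection settings, so they have to be applied every time a connection is opened rather than once at install time.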
No Redis crutch. Why? The app runs with no external services, and a caching layer would be architectural graffiti on a minimalist canvas. The entity index splits off instead: people, phones, locations in their own DB. Query isolation keeps doc searches from poisoning graph traversals.
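One way to picture the split, assuming two SQLite files and a connection for each. The Stores class, the mentions and documents tables, and their columns are hypothetical; the source only says the entity index lives in its own database.

```python
import sqlite3


class Stores:
    """Keep document search and entity lookups on separate SQLite files."""

    def __init__(self, docs_path: str, entities_path: str) -> None:
        # Each connection has its own page cache, so a large graph traversal
        # on the entity file does not evict pages the document search uses.
        self.docs = sqlite3.connect(docs_path)
        self.entities = sqlite3.connect(entities_path)

    def docs_mentioning(self, name: str) -> list[int]:
        # Hypothetical schema: mentions(entity_name, doc_id) in the entity DB.
        rows = self.entities.execute(
            "SELECT doc_id FROM mentions WHERE entity_name = ? ORDER BY doc_id",
            (name,),
        )
        return [doc_id for (doc_id,) in rows]

    def titles(self, doc_ids: list[int]) -> list[str]:
        # Hypothetical schema: documents(id, title) in the document DB.
        if not doc_ids:
            return []
        marks = ",".join("?" * len(doc_ids))
        rows = self.docs.execute(
            f"SELECT title FROM documents WHERE id IN ({marks}) ORDER BY id",
            doc_ids,
        )
        return [title for (title,) in rows]
```

The cost of the split is that cross-database joins happen in application code, as in the two-step lookup above.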
Sub-2-second searches. Often under 500ms. Keeps investigators in flow, not twiddling thumbs.
Boolean search across 1.4 million documents needs to return results fast enough that investigators don’t lose their train of thought. “Fast enough” in this context means under 2 seconds, and ideally under half a second.
That’s the raw truth from the trenches.
Separate read workloads. Surgical indexes. Done. No vaporware queues.
Security: Because Targets Love This Stuff
Public docs, sure. But Epstein bait draws script kiddies. Signed-cookie auth. CSRF protection everywhere. Unobfuscator DB? Read-only mount. Regex inputs? Guarded against catastrophic backtracking, so no ReDoS crashes.
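The read-only mount is presumably enforced at the filesystem level; the same guarantee can also be requested per connection. A small sketch using SQLite's URI syntax from Python's sqlite3 module, where open_read_only is a hypothetical helper name:

```python
import sqlite3
from pathlib import Path


def open_read_only(path: str) -> sqlite3.Connection:
    """Open an existing SQLite file so that writes are rejected."""
    # mode=ro makes SQLite refuse writes on this connection and fail
    # outright if the file does not already exist.
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)
```

Any INSERT, UPDATE, or DELETE on such a connection raises sqlite3.OperationalError, so a bug in a request handler cannot alter the recovered data.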
Caddy fronts it, TLS handled automatically. SSE for progress? It took three rewrites to survive dropped connections and proxies that buffer or time out streams (Cloudflare among them). Proxy timeouts, client reconnections: hairy, but tamed.
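For a sense of what the SSE plumbing involves, here is a sketch of the wire format with resumable progress. The function names, the progress and complete event names, and the 3-second retry are assumptions; the builder's actual reconnection logic is not described in the source.

```python
import json
from typing import Iterator, Optional


def sse_event(data: dict, event: str = "progress",
              event_id: Optional[int] = None) -> str:
    """Format one Server-Sent Event frame."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    lines.append(f"event: {event}")
    lines.append(f"data: {json.dumps(data)}")
    # A blank line terminates the frame.
    return "\n".join(lines) + "\n\n"


def progress_stream(total: int, start_after: int = 0) -> Iterator[str]:
    """Yield progress frames, resuming after the last id the client saw."""
    yield "retry: 3000\n\n"  # ask the browser to wait 3 s before reconnecting
    for done in range(start_after + 1, total + 1):
        yield sse_event({"done": done, "total": total}, event_id=done)
    yield sse_event({"done": total, "total": total}, event="complete")
```

In a FastAPI app a generator like this would typically be wrapped in a StreamingResponse with media type text/event-stream. On reconnect the browser sends the last id it received in the Last-Event-ID header, which is what start_after is meant to carry.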
MIT licensed. Transparent algo. No AI hallucinations—just math on public dumps.
The Architectural Shift: Back to HTTP Basics
This isn’t retro. It’s rebellion. Frontend frameworks ballooned—why ship a runtime when servers render free? FastAPI’s async shines on Python; HTMX hijacks hypermedia. SQLite scales quiet for reads.
My take? Parallels WikiLeaks’ early dumps. Raw archives, simple search overlays. No Angular cruft. Journalists thrived. Today, we’d React-ify it into oblivion. TEREDACTA revives that: lightweight, auditable, deployable anywhere.
Prediction: investigative tools swing this way. Heavily redacted corpora? FOIA floods? Ditch the JS tax. Server-first wins for truth-digging.
Corporate spin calls this “lightweight.” Nah—it’s efficient warfare on bloat.
And the recoveries? BOP emails on Epstein’s MCC stay. Staff rosters. FBI logs—113 passages revived. Maxwell’s PR drafts. Gold for watchdogs.
Can HTMX Really Replace React for Big Data UIs?
Short answer: for this workload, hell yes.
Long? Feature parity’s there—searches, graphs, viewers. No webpack deploys. Vendored Tailwind, Alpine if needed (wasn’t). Iteration speed? Warp drive.
Tradeoffs? Heavy writes kill SQLite—batch updates only. Real-time collab? Bolt WebSockets later. But for solo sleuths or small teams? Untouchable.
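"Batch updates only" can be as simple as one transaction per batch. A sketch, assuming a hypothetical recovered(doc_id, passage) table; nothing here comes from the project's actual schema.

```python
import sqlite3


def apply_batch(conn: sqlite3.Connection, rows: list[tuple[int, str]]) -> int:
    """Write a whole batch in one transaction; returns rows written."""
    with conn:  # commits on success, rolls back if anything raises
        conn.executemany(
            "INSERT OR REPLACE INTO recovered (doc_id, passage) VALUES (?, ?)",
            rows,
        )
    return len(rows)
```

One commit per batch means one brief write lock per batch rather than one per row.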
Builder admits SSE headaches. Document your reconnection logic early, because proxies butcher streams.
Why SQLite Over Postgres for Doc Dumps?
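Postgres tempts with extensions. But overhead. Migrations. This? One binary. PRAGMA cache_size=-400000 and boom, roughly 400MB of cache (negative values are KiB; a positive 1000000 would mean a million pages, about 4GB at the default 4KB page size). mmap_size=1073741824, which is 1GB in bytes. Cold queries? Vaporized.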
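SQLite reads cache_size by its sign: a positive value counts pages, a negative value counts KiB. A quick sketch for checking what a given setting costs in memory; cache_bytes is a hypothetical helper, not part of the project.

```python
import sqlite3


def cache_bytes(conn: sqlite3.Connection) -> int:
    """Return the page-cache ceiling in bytes for this connection."""
    (cache_size,) = conn.execute("PRAGMA cache_size").fetchone()
    (page_size,) = conn.execute("PRAGMA page_size").fetchone()
    if cache_size < 0:
        return -cache_size * 1024      # negative: size in KiB
    return cache_size * page_size      # positive: number of pages
```

With 4096-byte pages, cache_size=1000000 works out to about 4.1GB, while cache_size=-400000 is about 410MB.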
Static dataset. Read-only bliss. Entities segregated—no cache thrash.
Vandalism avoided: Redis would’ve prettied benchmarks, wrecked purity.
Deploy and Iterate: Zero-Frills Victory
Caddy reverse proxy. Docker optional. Run python app.py, uvicorn comes up, go. HTMX script? Vendored here, though a CDN works too. Tailwind CSS minified.
No CI/CD ritual. Git push, restart. Fixes land instant—no bundle rebuilds.
That’s the how. Server owns logic. Client? Dumb terminal with flair.
Recovered Gems: What’s TEREDACTA Unearthing?
Internal MCC emails—Epstein’s jail saga unmasked. FBI logs detailing evidence chains. Maxwell’s damage control scribbles.
5,600+ passages. 15k match groups. Congressional DOJ dumps aligned.
Journalists: search “MCC shift”—staff lists emerge. “Ghislaine PR”—drafts surface.
Frequently Asked Questions
What is TEREDACTA and Unobfuscator?
TEREDACTA’s the web UI; Unobfuscator recovers redacted text by cross-referencing gov releases. Handles Epstein/Maxwell docs—1.4M files.
How to build a FastAPI HTMX SQLite app like this?
FastAPI for API/routes, HTMX for dynamic HTML swaps, SQLite with indexes/PRAGMAs. Server-render Jinja2. Add SSE for progress.
Does HTMX work for large-scale search tools?
Yes—for read-heavy, static data. Sub-second queries via smart SQLite tuning. Skip for high-write apps.