Fingers hammering keys at 2 AM, cache invalidation blowing up again. Seed AI — my CLI AI coding assistant — just ate a file edit, spat out stale code. Third time this week.
Zoom out. Several months ago, I tore into Claude Code’s guts. Excellent tool. Locked to Anthropic’s API. No local models. No memory. Tools run serially, like it’s 1995. I built Seed AI in TypeScript. Fourteen fixes. Here’s the brain dump — minus the coffee stains.
Claude Code shines. But those constraints? Product choices, sure. They scream opportunity.
Claude’s Serial Tool Hell
Picture this: LLM wants three file reads. Boom — three permission pops, one after another. Latency? Triples. Naive parallelism? Permission spam confuses everyone.
The split’s elegant. Permissions serial. Execution parallel.
```typescript
// Permissions: serial (user reviews one at a time)
const approvedCalls: ToolCall[] = [];
for (const call of toolCalls) {
  const approved = await askPermission(call);
  if (approved) approvedCalls.push(call);
}

// Execution: parallel (no UX interaction needed)
const results = await Promise.allSettled(
  approvedCalls.map(call => tools.execute(call.name, call.input))
);
```
Latency plummets: N×T becomes roughly 1.2×T. Dry? Brutally effective.
LLMs reread files obsessively. file_read, glob, grep — idempotent drudgery. Cache it, idiot.
Trick? Writes invalidate before execution. Miss that, and you serve old content post-edit. Costly.
Cache keys? Tool name plus JSON-stringified input. Hit rates? 20-40% per session. Obvious now. Wasn’t then.
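A minimal sketch of that cache, with the write-invalidation ordering from above. The class and tool names are illustrative, not Seed AI's actual code:

```typescript
// Cache for idempotent read tools, keyed on tool name + JSON-stringified input.
type ToolInput = Record<string, unknown>;

const READ_ONLY_TOOLS = new Set(["file_read", "glob", "grep"]);

class ToolCache {
  private store = new Map<string, unknown>();

  private key(name: string, input: ToolInput): string {
    return `${name}:${JSON.stringify(input)}`;
  }

  get(name: string, input: ToolInput): unknown | undefined {
    if (!READ_ONLY_TOOLS.has(name)) return undefined;
    return this.store.get(this.key(name, input));
  }

  set(name: string, input: ToolInput, result: unknown): void {
    if (READ_ONLY_TOOLS.has(name)) {
      this.store.set(this.key(name, input), result);
    }
  }

  // Call BEFORE executing any write tool — even a failed write may have
  // touched the file, so stale entries must never survive the attempt.
  invalidate(): void {
    this.store.clear();
  }
}
```

Clearing the whole cache on any write is the blunt-but-safe version; per-path invalidation is an optimization.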
Memory That Actually Remembers
Long chats? Claude truncates. Poof — history gone.
Seed AI compresses with Haiku. Pennies per summary ($0.0002). Injects into system prompt:
```
Earlier conversation summary (compressed)
[Completed: refactored auth module, fixed token expiry bug]
[Decided: use refresh tokens over session extension]
[Current state: tests passing, ready for PR]
```
Successive compression layers stack, joined with `—` separators. Main model stays clued in.
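The layering can be sketched as a pure prompt-builder. Function name, separator placement, and header text here are assumptions, not Seed AI's exact implementation:

```typescript
// Sketch: stack compressed-history layers into the system prompt,
// oldest first, joined with the "—" separator.
const LAYER_SEPARATOR = "\n—\n";

function buildSystemPrompt(basePrompt: string, summaryLayers: string[]): string {
  if (summaryLayers.length === 0) return basePrompt;
  const compressed = summaryLayers.join(LAYER_SEPARATOR);
  return `${basePrompt}\n\nEarlier conversation summary (compressed):\n${compressed}`;
}
```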
Long-term? Three-tier memory in ~/.seed/memory/. User profiles. Project contexts via SHA1 path fingerprints. Decisions, learnings. Haiku extracts durable bits. Embeddings pull top-8 chunks. 800 tokens flat, no matter the bloat.
No embeddings? TF-IDF fallback. Smart.
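The per-project context lookup hinges on those SHA1 path fingerprints. A sketch of how the memory directory might be derived (the exact layout under `~/.seed/memory/` is an assumption):

```typescript
import { createHash } from "node:crypto";
import { homedir } from "node:os";
import { join } from "node:path";

// Sketch: map a project path to a stable memory directory via SHA1.
// Same path → same fingerprint, so context survives across sessions.
function projectMemoryDir(projectPath: string): string {
  const fingerprint = createHash("sha1").update(projectPath).digest("hex");
  return join(homedir(), ".seed", "memory", "projects", fingerprint);
}
```

Hashing the path (instead of sanitizing it into a directory name) sidesteps collisions from slashes, spaces, and case-insensitive filesystems.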
This isn’t just persistence. It’s the anti-forgetful-AI we’ve begged for since ChatGPT day one.
Docker Sandbox Without the Drama
Bash tools? Fresh containers. Auto-removed. Strict mode: read-only FS, no network, 512MB cap, no privileges.
```typescript
["run", "--rm",
 "-v", `${mountPath}:/workspace`,
 "-w", "/workspace",
 "--network", "none",   // strict mode
 "--memory", "512m",
 "--cpus", "1",
 "--security-opt", "no-new-privileges",
 "alpine:latest", "sh", "-c", command]
```
Docker down? Probes first. Falls back to host with a big warning. No crashes. Users know the mode.
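The probe-then-fallback step can be sketched like this. The function names and the 3-second timeout are my assumptions; `docker info` exits non-zero whenever the daemon is unreachable, which is what the probe keys on:

```typescript
import { execFile } from "node:child_process";

// Sketch: probe for a usable Docker daemon before picking the sandbox mode.
function dockerAvailable(): Promise<boolean> {
  return new Promise((resolve) => {
    execFile("docker", ["info"], { timeout: 3000 }, (err) => resolve(!err));
  });
}

async function chooseSandbox(): Promise<"docker" | "host"> {
  if (await dockerAvailable()) return "docker";
  console.warn("⚠ Docker unavailable — falling back to host execution (no isolation)");
  return "host";
}
```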
Isolation levels: strict, standard, permissive. Pick your poison.
Why Does Ollama Suck at Tool Calls?
Ollama? No endpoint to query tool support. So: probe with a dummy request. 400 with 'does not support tools' in the body? Fall back to prompt-embedded tool calls — JSON wrapped in XML-style tags:
```json
{"name": "file_read", "parameters": {"path": "src/index.ts"}}
```
One registry handles both. qwen2.5-coder nails native. DeepSeek? XML or bust — and even then, flaky.
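Extracting those prompt-embedded calls might look like this. The `<tool_call>` tag convention is an assumption — the point is tolerating malformed JSON, since the flaky models mentioned above produce plenty of it:

```typescript
// Sketch: parse tool calls a non-native model embedded in its text output.
interface ParsedToolCall {
  name: string;
  parameters: Record<string, unknown>;
}

function parsePromptToolCalls(text: string): ParsedToolCall[] {
  const calls: ParsedToolCall[] = [];
  const re = /<tool_call>([\s\S]*?)<\/tool_call>/g;
  for (const match of text.matchAll(re)) {
    try {
      calls.push(JSON.parse(match[1]) as ParsedToolCall);
    } catch {
      // Flaky models emit malformed JSON; skip the call rather than crash.
    }
  }
  return calls;
}
```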
Took sessions to nail. Ollama’s opacity? Criminal.
One more rendering fix: Ink redraws on every stream delta. Seed AI buffers them. Terminal stays sane.
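A minimal buffering sketch (class name and 50ms interval are illustrative): coalesce deltas and flush on a timer, so the UI repaints a few times per second instead of per token.

```typescript
// Sketch: coalesce stream deltas so the terminal redraws at most once
// per interval instead of on every token.
class StreamBuffer {
  private pending = "";
  private timer: ReturnType<typeof setTimeout> | null = null;

  constructor(
    private flushFn: (chunk: string) => void,
    private intervalMs = 50,
  ) {}

  push(delta: string): void {
    this.pending += delta;
    if (!this.timer) {
      this.timer = setTimeout(() => this.flush(), this.intervalMs);
    }
  }

  // Also called once at stream end so no trailing text is dropped.
  flush(): void {
    if (this.timer) {
      clearTimeout(this.timer);
      this.timer = null;
    }
    if (this.pending) {
      this.flushFn(this.pending);
      this.pending = "";
    }
  }
}
```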
Why Bother Building Your Own CLI AI Coder?
Claude Code’s hype machine rolls on. But vendor lock? Serial slowness? No local runs? It’s a toy for Anthropic fans.
Seed AI breaks free. DeepSeek. Ollama. Whatever. Caching slashes redundancy. Memory builds wisdom. Docker keeps it safe.
Here’s my unique jab: This echoes the ’90s Linux fork from Unix — proprietary pain birthed open chaos. Prediction? CLI AI coders fragment into a dozen Seeds. Innovation explodes. Anthropic scrambles with ‘Claude Code Pro: Now With Caching!’™. Corporate spin incoming.
Devs, you’re not powerless. Build your own. Or fork this.
But wait — is it prod-ready? Nah. Edge cases lurk. Docker probes can flake on wonky setups. Embeddings need tuning. Still, for solo hacks? Gold.
Claude’s PR would call these ‘features.’ I call ‘em handcuffs. Seed AI? Keys.
A single sentence: Freedom tastes better than lock-in.
We’ve got parallelism slicing latency. Caching at up to 40% hits. Summaries for pennies. Memory that spans projects. Sandboxes that don’t explode.
Fourteen improvements. Each solves a real itch. Claude Code set the bar. Seed AI vaults it.
Skeptical? Run it. See the cache warm up. Watch permissions flow smooth. Feel the speed.
Or stick with the mothership. Your funeral.
Will Seed AI Replace Claude Code for Good?
Not tomorrow. Claude iterates fast. But trends? Local models cheapen. Open tools win. Seed AI proves you don’t need a $100B corp for killer CLI AI coding.
DevTools Feed verdict: Grab it. Tinker. The future’s local, parallel, remembered.
Frequently Asked Questions
What is Seed AI CLI?
TypeScript CLI AI coding assistant. Fixes Claude Code’s locks: local models, caching, parallel tools, persistent memory, Docker safety.
How does Seed AI improve latency over Claude Code?
Serial permissions, parallel execution. Cache hits 20-40%. Summaries keep context cheap.
Can I run Seed AI without Docker?
Yes. Auto-fallback to host with warnings. Three isolation modes.