Vitalik Buterin is running his AI in a cage. Not literally—but close enough that it matters.
The Ethereum co-founder just published a detailed breakdown of his personal AI setup, and buried in the technical specifics is something that cuts to the heart of a much bigger problem: we’ve normalized feeding our entire lives to cloud-based AI services, and almost nobody seems worried about it. Buterin is. He’s built walls.
He runs everything locally. The model—Qwen3.5:35B, open-source, nothing proprietary—sits on his laptop with an Nvidia 5090 GPU, hitting 90 tokens per second. That’s fast enough to feel responsive. He’s cached an entire Wikipedia dump on his machine to avoid pinging external search engines (which he treats as privacy leaks). And here’s the kicker: when his AI agent wants to send a message or move money, it can’t. Not without him explicitly approving it first.
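The shape of that local-only loop is simple to sketch. llama-server exposes an OpenAI-compatible chat endpoint, so a client can talk to the model without any traffic leaving the machine; the port, model tag, and helper names below are illustrative assumptions, not Buterin's actual configuration:

```python
import json
import urllib.request

# Assumed: llama-server running locally with its OpenAI-compatible API.
# Everything below stays on localhost -- no cloud round-trip.
LOCAL_ENDPOINT = "http://127.0.0.1:8080/v1/chat/completions"

def build_request(prompt: str) -> dict:
    """Build a chat request payload for the local server."""
    return {
        "model": "qwen3.5-35b",  # hypothetical local model tag
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }

def local_chat(prompt: str) -> str:
    """Send the prompt to the local model and return its reply."""
    req = urllib.request.Request(
        LOCAL_ENDPOINT,
        data=json.dumps(build_request(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:  # localhost only
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

At 90 tokens per second, a few hundred tokens of reply come back in seconds, which is why the setup feels responsive rather than academic.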
“The new two-factor authentication is the human and the LLM,” he wrote.
The 15% Problem Nobody’s Talking About
Buterin opened his post with a statistic that should’ve gotten more attention. Security researchers found that roughly 15% of skills built for OpenClaw—the fastest-growing GitHub repository in history—contained malicious code. Some of it silently exfiltrated user data. Users had no idea.
That’s not a rounding error. That’s a feature of how AI agents are being deployed right now: fast, loose, and almost entirely trust-based. You point an LLM at a third-party tool and hope nothing goes wrong. What could possibly go wrong?
Everything. And Buterin knows it.
Is This Actually Practical, or Just Crypto Theater?
Here’s where it gets interesting. Buterin isn’t just preaching—he’s implemented it. The architecture is almost boring in its simplicity: his AI can read from Signal and email freely. It can analyze, summarize, draft responses. But when it comes to write-level operations? Hard stop. Manual review only.
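A gate like that is almost trivially small in code. The sketch below is hypothetical, not Buterin's daemon: read-level operations pass through freely, write-level operations hard-stop unless a human has explicitly approved them:

```python
from enum import Enum, auto

class Action(Enum):
    READ = auto()   # analyze, summarize, draft -- always allowed
    WRITE = auto()  # send a message, move funds -- gated

class ApprovalRequired(Exception):
    """Raised when the agent attempts a write without human sign-off."""

def gate(action: Action, approved: bool = False) -> bool:
    """Reads pass; writes require the human approval flag."""
    if action is Action.WRITE and not approved:
        raise ApprovalRequired("write-level operation needs manual review")
    return True
```

The asymmetry is the whole design: the model can see everything, but it can change nothing on its own.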
For Ethereum wallet teams, he’s proposing a tiered approach. Transactions under $100 per day? Autonomous. Anything above that? Human sign-off required.
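The tiered policy reduces to a single comparison. The $100-per-day figure is from his post; the function shape is an assumption for illustration:

```python
def needs_human_signoff(amount_usd: float,
                        spent_today_usd: float,
                        daily_limit_usd: float = 100.0) -> bool:
    """True if this transaction would push the day's autonomous
    spending past the limit, so a human must confirm it."""
    return spent_today_usd + amount_usd > daily_limit_usd
```

A $30 transfer with $50 already spent today clears on its own; a $60 one trips the gate and waits for a signature.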
It’s not revolutionary. It’s not even that clever. But it’s the exact opposite of what every mainstream AI company is doing right now. OpenAI, Google, Anthropic—they’re all racing toward agentic autonomy without guardrails, betting that rate-limiting and safety training will handle the edge cases. Buterin’s betting that they won’t.
He’s probably right. And he’s probably alone in actually building for that assumption.
Why This Matters Beyond the Crypto Bubble
Look, there’s something genuinely unsettling about Buterin’s framing here. He’s not angry at AI companies. He’s scared—genuinely scared—that we’re about to undo a decade of privacy progress.
End-to-end encryption became normal. Local-first software started gaining momentum. And then—just as privacy architecture was finally winning—AI arrived and said: “Actually, let’s send everything to our cloud servers, okay?” No opt-out. No choice. Just normalized extraction.
Buterin saw it coming. In February, he published a four-quadrant roadmap for Ethereum-AI alignment. That was theory. This blog post is practice. It’s him saying: here’s what actual privacy-preserving AI looks like. It’s slower. It’s less convenient. It requires you to approve things manually. And maybe—just maybe—that’s the price of not living in someone else’s data center.
The crypto angle here is almost secondary. Yes, he’s worried about AI agents autonomously moving funds. Yes, he’s built his setup to mirror how he already manages his crypto (90% in a multisig Safe wallet with distributed keys). That’s smart architecture. But the real insight is simpler: if you’re letting an AI system do anything important without human approval gates, you’ve lost.
The Architectural Shift Nobody Predicted
What Buterin’s actually describing is a return to something older—the principle that proximity equals control. Cloud AI looked like the future because it was fast and cheap. Local AI was supposed to be too slow to matter, too weak to compete. That was five years ago. Inference speeds have exploded. Model sizes have been optimized. An Nvidia 5090 can now run serious language models at conversation speed.
Suddenly, the tradeoff looks different. You can have privacy and capability now. The only thing you lose is the extraction play—the ability for companies to aggregate everyone’s queries and retrain on them.
But that extraction was never a feature. It was always the business model wearing a fake smile. Buterin just showed what happens when you build for users instead of datasets.
Frequently Asked Questions
What model does Vitalik Buterin use for his AI setup? He runs Qwen3.5:35B, an open-source model, locally on a laptop with an Nvidia 5090 GPU via llama-server. It achieves around 90 tokens per second, which he finds usable for real-time tasks.
How does Vitalik prevent his AI from making unauthorized transactions? He built a custom messaging daemon that allows the AI to read incoming messages and emails freely, but requires manual human approval before the AI can send outbound messages or move cryptocurrency. Transactions above $100 per day require confirmation.
Why did Vitalik switch to local AI instead of using OpenAI or other cloud services? He’s concerned about privacy risks from cloud-based AI services and cited research finding that roughly 15% of third-party skills built for OpenClaw contained malicious code that could silently exfiltrate data. He treats cloud infrastructure, including external search engines, as a privacy leak and prefers local-first control.