Replace Vector DBs: Google's Memory Agent for Obsidian

Forget vector databases for your Obsidian notes. Google's Memory Agent Pattern crams months of memories into Claude's 250K window, skipping embeddings entirely.

[Figure: Google's Memory Agent replacing vector DBs in an Obsidian notes workflow]

Key Takeaways

  • Google's Memory Agent skips vector DBs by stuffing 650 structured memories into 250K token windows.
  • SQLite + LLM reasoning beats embeddings for personal notes—faster setup, better semantics.
  • Trend alert: Exploding context windows make direct memory the default for solo AI agents.

Vector DBs? Overkill.

And here’s why that hits different now. Claude Haiku 4.5 packs a 250K token context window—enough for 650 structured memories at 300 tokens each. No more Pinecone sign-ups, no Redis sprawl, just SQLite and raw LLM smarts. The original hack ditches embeddings because modern models reason over semantics better than cosine similarity ever could. It’s a pattern borrowed from Google’s always-on-memory agent, tested on AWS Bedrock, and tuned for Obsidian vaults.

Look, I’ve seen this before. Back in 2010, when Solr ruled search, devs obsessed over inverted indexes for tiny RAM limits. Then hardware leaped—SSD prices crashed, cores multiplied—and suddenly, full-text search on the whole corpus beat fancy retrieval every time. Same math here. Vector DBs fixed 4K-8K token poverty; today’s windows flip the script.

Why Does Google’s Memory Agent Crush Embeddings for Notes?

The setup’s dead simple. Ingest a raw note (“Alice approved Q3 budget, $2.4M”, say) and feed it to Haiku. It spits out structured gold:

```json
{
  "id": "a3f1c9d2-…",
  "summary": "Alice confirmed Q3 budget approval of $2.4M",
  "entities": ["Alice", "Q3 budget"],
  "topics": ["finance", "meetings"],
  "importance": 0.82,
  …
}
```
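
In code, that ingest step is one model call plus json.loads. A minimal sketch, assuming a hypothetical call_llm() wrapper around whichever client you run (Anthropic SDK, Bedrock via boto3); the prompt wording is mine, not the repo’s:

```python
import json
import uuid

INGEST_PROMPT = """Extract a structured memory from this note.
Return only JSON with keys: summary, entities, topics, importance (0-1).

Note: {note}"""

def ingest_note(note: str, call_llm) -> dict:
    """Turn one raw note into a structured memory dict."""
    raw = call_llm(INGEST_PROMPT.format(note=note))  # call_llm: your model client (assumed)
    memory = json.loads(raw)
    memory["id"] = str(uuid.uuid4())  # stable ID so answers can cite this memory later
    return memory
```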

That’s your memory row in SQLite. No vectors, no embedding API keys. Query time? Grab the last 50 memories (or all 650 if you’re bold), plus consolidations, and shove them into the prompt. Haiku synthesizes with citations. Boom: “What happened Feb 1?” or “Recap my Alice meetings” without RAG fuzziness.
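
Concretely, the whole “database layer” can be one table and one SELECT. A minimal sketch under my own schema assumptions (the repo’s column names may differ):

```python
import sqlite3

conn = sqlite3.connect("memories.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS memories (
        id TEXT PRIMARY KEY,
        summary TEXT NOT NULL,
        entities TEXT,      -- JSON array stored as text
        topics TEXT,        -- JSON array stored as text
        importance REAL,
        created_at TEXT DEFAULT (datetime('now')),
        consolidated INTEGER DEFAULT 0
    )
""")

def recent_memories(limit: int = 50) -> list[tuple[str, str]]:
    """Grab the newest N memories to inline into the query prompt."""
    return conn.execute(
        "SELECT id, summary FROM memories ORDER BY created_at DESC LIMIT ?",
        (limit,),
    ).fetchall()

def build_prompt(question: str, limit: int = 50) -> str:
    """Paste every memory into the context, tagged with IDs for citations."""
    rows = "\n".join(f"[{mid}] {summary}" for mid, summary in recent_memories(limit))
    return f"Memories:\n{rows}\n\nQuestion: {question}\nAnswer and cite memory IDs."
```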

But wait—consolidations add spice. A separate agent batches unprocessed memories daily, hunts cross-links, generates insights. Think: “Q3 budget ties to Alice’s Feb approval and vendor delays.” Lands in its own table. Scheduling’s smart—triggers on thresholds or boot-up. For personal notes, that’s months of context without a hiccup.
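
In code, the consolidation agent is the same trick run in batch, reusing the conn from the sketch above. The insights table, threshold, and prompt wording are my assumptions:

```python
CONSOLIDATE_THRESHOLD = 20  # assumed trigger; tune it, or fire on boot-up instead

conn.execute("""
    CREATE TABLE IF NOT EXISTS insights (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        body TEXT NOT NULL,
        created_at TEXT DEFAULT (datetime('now'))
    )
""")

def maybe_consolidate(call_llm) -> None:
    """When enough unprocessed memories pile up, ask the model for cross-links."""
    rows = conn.execute(
        "SELECT id, summary FROM memories WHERE consolidated = 0"
    ).fetchall()
    if len(rows) < CONSOLIDATE_THRESHOLD:
        return
    batch = "\n".join(f"[{mid}] {s}" for mid, s in rows)
    insight = call_llm(f"Find cross-links and recurring themes in these memories:\n{batch}")
    conn.execute("INSERT INTO insights (body) VALUES (?)", (insight,))
    conn.execute("UPDATE memories SET consolidated = 1 WHERE consolidated = 0")
    conn.commit()
```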

Does This Kill Vector DBs for Personal AI?

Short answer: For solo devs and note-takers, yeah. Market data backs it. Pinecone’s grown 10x since 2022 on RAG hype, but context windows exploded—Gemini 1.5 hit 1M tokens, Claude’s at 200K+. Embeddings shine for billion-scale corpora (enterprise CRM, say), but for 10K notes? LLM direct-read wins on accuracy. Embeddings miss nuances—dates, entities, sequences. Haiku groks “last meeting with Alice” natively.

My bold call: This pattern halves dev time for 80% of agent builds. Vector stacks demand pipeline glue—ingest, chunk, embed, index, hybrid search. Here? Python class in FastAPI. Three agents: ingest, consolidate, query. GitHub repo’s live; fork it tomorrow.
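
If you want a feel for the shape of it, here’s a sketch of that surface, reusing the helpers above. Route names and payloads are my guesses, not the repo’s actual API:

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class Note(BaseModel):
    text: str

class Query(BaseModel):
    question: str

@app.post("/ingest")
def ingest(note: Note):
    # Agent 1: structure the raw note.
    # (For a real app, open the SQLite conn with check_same_thread=False.)
    memory = ingest_note(note.text, call_llm)
    conn.execute(
        "INSERT INTO memories (id, summary, importance) VALUES (?, ?, ?)",
        (memory["id"], memory["summary"], memory.get("importance", 0.5)),
    )
    conn.commit()
    maybe_consolidate(call_llm)  # agent 2: batch cross-linking
    return {"id": memory["id"]}

@app.post("/query")
def query(q: Query):
    answer = call_llm(build_prompt(q.question))  # agent 3: synthesis with citations
    return {"answer": answer}
```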

Obsidian fits perfectly. Vaults are Markdown goldmines: personal journals, work recaps. Old flow: MCP search on query, slow and spotty. Now? The memory agent owns it. Claude Code for life notes, Kiro-CLI for work. Roll-ups for bosses? Tracked goals? Auto-reports? All memory-fueled.

Risks? Context cliffs. Hit 650 memories and you have to evict old ones by recency or importance. But that’s tunable: prioritize high-score items. No free lunch on token burn, though Haiku’s cheap at $1/M input.
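
Eviction itself can be one DELETE. A sketch of the importance-first policy; the exact ordering blend is my choice, not the repo’s:

```python
MAX_MEMORIES = 650  # stay under the context budget

def evict() -> None:
    """Keep the top-N memories by importance, breaking ties by recency."""
    conn.execute(
        """
        DELETE FROM memories WHERE id NOT IN (
            SELECT id FROM memories
            ORDER BY importance DESC, created_at DESC
            LIMIT ?
        )
        """,
        (MAX_MEMORIES,),
    )
    conn.commit()
```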

Scale questions linger. Personal? Nailed. Enterprise? SQLite bottlenecks at 1M rows; swap for Postgres. But vectors still rule there—legal docs, customer 360s dwarf 250K. Still, hybrid’s coming: memory for hot data, vectors for cold.

Here’s the overlooked parallel—email search. Gmail ditched keyword hell for semantic AI years ago. No one embeds their inbox; models scan it live. Notes are inboxes 2.0. This hack just localizes that magic.

Obsidian’s ecosystem explodes with this. Plugins for Claude integration already hum; the memory agent slots right in. Devs, test it: your vault’s amnesia ends.

PR spin check: Google’s repo pitches “always-on,” but it’s no silver bullet. Works because context grew; credit hardware, not just smarts. Original post nails the shift, though—“Vector search was mostly a workaround.”

Numbers don’t lie. 300 tokens/memory × 650 = 195K tokens. Leaves room for prompt (5K), query (1K), output (10K). Daily consolidations keep it fresh—cross-memory insights compound value.
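
As a quick sanity check in code, using the article’s own figures:

```python
# Token budget for a 250K window, per the numbers above.
WINDOW = 250_000
memories = 650 * 300                  # 195,000 tokens of memories
overhead = 5_000 + 1_000 + 10_000     # prompt + query + output
assert memories + overhead <= WINDOW  # ~39K tokens of headroom remain
```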

What About Costs and Speed?

SQLite’s free, local. Haiku ingest: ~500 tokens/text block, pennies. Query: 50 memories = 15K tokens, sub-second. Vs. Pinecone: upsert latency, query API roundtrips, embedding costs. For 1K notes/month? Memory agent saves $10-50.
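
A back-of-envelope version of that math; the monthly query count and the per-note token sizes are assumptions, not measurements:

```python
INPUT_RATE = 1.00 / 1_000_000         # $/input token for Haiku (rate quoted above)
notes_per_month = 1_000
ingest_tokens = notes_per_month * 500  # ~500 tokens per ingested note
query_tokens = 30 * 15_000             # assume 30 queries, 50 memories each
monthly = (ingest_tokens + query_tokens) * INPUT_RATE
print(f"~${monthly:.2f}/month in input tokens")  # ≈ $0.95
```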

Edge over RAG? Citations point straight to memory IDs, so every answer traces back to a source note. Embeddings? Opaque chunks.

Pushback: What if notes explode? Cap at 200 memories, auto-consolidate aggressively. Or shard by topic. Flexible.

This isn’t hype—it’s economics. As windows hit 1M+, vector DBs niche to mega-scale. Personal AI? Direct memory reigns.


Frequently Asked Questions

Will Google’s Memory Agent replace vector DBs entirely?
No, not for enterprise-scale data. But for notes, meetings, and personal vaults, it’s an absolutely viable alternative.

How do I set up Memory Agent in Obsidian?
Grab the GitHub repo, hook up Claude via Bedrock, and point it at your vault. Python + FastAPI; runs locally.

Does it work with other LLMs?
Yes, any model with a 100K+ window. Tested on Haiku; Gemini and GPT-4o swap in easily.

Written by Priya Sundaram
Hardware and infrastructure reporter. Tracks GPU wars, chip design, and the compute economy.



Originally reported by Towards Data Science
