RepoProver: AI Formalizes Math Textbooks in Lean

A single open-source tool just turned a 500-page grad textbook into verifiable Lean proofs, all hands-off. RepoProver's multi-agent swarm is rewriting how we formalize math.

[Image: RepoProver agents collaborating on a Lean git repo with PRs and issues]

Key Takeaways

  • RepoProver uses git-orchestrated LLM agents to scale textbook formalization in Lean beyond solo attempts.
  • Key insight: Treats math proofs like software dev, with PRs, issues, and merge queues for reliability.
  • Future: Unlocks massive formalized math libraries, but needs dev setup and strong LLMs.

RepoProver formalized every theorem in Algebraic Combinatorics, Darij Grinberg’s beast of a graduate textbook—without a single human tweaking the code.

And here’s the kicker: it did this using nothing but LLM agents collaborating over a git repo. Sketchers translate LaTeX into Lean sketches. Provers bash out the proofs. Reviewers nitpick via pull requests. The main branch? Always builds clean.

Look, we’ve seen LLMs spit out toy proofs before. But scaling to a full textbook? That’s new territory.

What Makes RepoProver Tick?

Picture a Lean project directory. You’ve got your tex/ folder stuffed with read-only LaTeX chapters—split by topic, because agents work best on bite-sized chunks. A manifest.json lists targets: theorem IDs, def names, the works. Empty issues/ dir for YAML-flagged blockers. CONTENTS.md tracks the evolving structure.
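To make that layout concrete, here is a minimal Python sketch that scaffolds the four pieces named above. The directory and file names (tex/, manifest.json, issues/, CONTENTS.md) come from the description; the manifest fields (`targets`, `id`, `kind`, `source`), the chapter file name, and the `scaffold_project` helper are assumptions made up for illustration, so check the repo for the real schema before relying on them.

```python
import json
import tempfile
from pathlib import Path


def scaffold_project(root: Path, chapters: dict[str, str], targets: list[dict]) -> None:
    """Create the tex/, manifest.json, issues/ and CONTENTS.md layout under `root`."""
    tex_dir = root / "tex"
    tex_dir.mkdir(parents=True, exist_ok=True)
    for name, latex_source in chapters.items():
        # One LaTeX file per chapter chunk; agents are meant to treat these as read-only.
        (tex_dir / name).write_text(latex_source, encoding="utf-8")

    # ASSUMPTION: hypothetical manifest schema, the real field names may differ.
    manifest = {"targets": targets}
    (root / "manifest.json").write_text(json.dumps(manifest, indent=2), encoding="utf-8")

    # Starts empty; agents later drop YAML files here to flag blockers.
    (root / "issues").mkdir(exist_ok=True)

    # Placeholder for the structure overview that evolves during a run.
    (root / "CONTENTS.md").write_text("# Contents\n", encoding="utf-8")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "toy_project"
        scaffold_project(
            root,
            chapters={"ch01_doubling.tex": "\\section{Doubling}\n"},
            targets=[{"id": "thm:double_zero", "kind": "theorem", "source": "tex/ch01_doubling.tex"}],
        )
        for path in sorted(root.rglob("*")):
            print(path.relative_to(root))
```

The sketch only covers the files this article mentions; a real project also needs the Lean side (a Lake project with Mathlib) and a git repository.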

Fire up python -m repoprover run /path/to/project --pool-size 10. Boom—coordinator spins up agents. They push branches, open PRs, merge through a queue. Git handles the chaos; file-system issues keep coordination lightweight.

“RepoProver is a multi-agent scaffold for large-scale formalization of mathematics textbooks in Lean. It orchestrates multiple LLM agents that collaborate on a shared git repository with the Lean project: sketcher agents translate definitions and theorem statements, prover agents fill in proofs, and reviewer agents enforce quality via pull request reviews.”

That’s straight from the repo. Dead simple. Brutally effective.

But wait—distributed mode? SLURM jobs across nodes. --pool-size dials Lean REPLs per machine. Ranks pull tasks; rank zero coordinates. It’s built for clusters, not your laptop.

Why Git? The Hidden Genius

Git isn’t just version control here—it’s the nervous system. Agents don’t chat in some bloated API. They branch, commit, PR, review. Merge queue blocks bad code. Main always compiles.
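The merge-queue idea can be sketched in a few lines of Python. This is a toy model, not RepoProver's implementation: the repository is a plain dict, `toy_build` stands in for a real Lean build, and the names `PullRequest`, `run_merge_queue`, and the `SYNTAX_ERROR` marker are all invented for illustration. What it shows is the invariant: a PR lands only if main would still build with it applied.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable

Repo = dict[str, str]  # file path -> file content


@dataclass
class PullRequest:
    branch: str
    files: Repo  # files the branch adds or overwrites


def toy_build(repo: Repo) -> bool:
    """ASSUMPTION: stand-in for a real Lean build; fails on a made-up error marker."""
    return all("SYNTAX_ERROR" not in content for content in repo.values())


def run_merge_queue(
    main: Repo,
    queue: list[PullRequest],
    builds: Callable[[Repo], bool] = toy_build,
) -> tuple[Repo, list[str], list[str]]:
    """Merge PRs one at a time, keeping only those that leave main building."""
    merged: list[str] = []
    rejected: list[str] = []
    pending = deque(queue)
    while pending:
        pr = pending.popleft()
        candidate = {**main, **pr.files}  # main with this PR applied on top
        if builds(candidate):
            main = candidate  # advance main to the verified state
            merged.append(pr.branch)
        else:
            rejected.append(pr.branch)  # main is left untouched
    return main, merged, rejected


if __name__ == "__main__":
    main = {"Toy/Basic.lean": "def double (n : Nat) : Nat := n + n"}
    queue = [
        PullRequest("prover/double-zero", {"Toy/Zero.lean": "theorem double_zero : double 0 = 0 := rfl"}),
        PullRequest("prover/broken", {"Toy/Broken.lean": "SYNTAX_ERROR"}),
    ]
    main, merged, rejected = run_merge_queue(main, queue)
    print("merged:", merged)
    print("rejected:", rejected)
```

Testing each PR against the current main, rather than against the commit it branched from, is what keeps two individually valid branches from breaking the build once combined.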

This mirrors software dev exactly. Why reinvent wheels for math? Lean projects are codebases too. RepoProver treats formalization like open-source engineering: iterative, collaborative, auditable.

Skeptical? Try the toy project. Four trivial theorems on doubling naturals. Setup in minutes, agents humming.
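For a sense of scale, theorems about doubling naturals might look like the Lean 4 sketch below. These are illustrative statements written for this article, not the toy project's actual targets; the definition `double` and all four theorem names are assumptions.

```lean
/-- Hypothetical definition; the actual toy project may define doubling differently. -/
def double (n : Nat) : Nat := n + n

theorem double_zero : double 0 = 0 := rfl

theorem double_eq_two_mul (n : Nat) : double n = 2 * n := by
  unfold double
  omega

theorem double_add (m n : Nat) : double (m + n) = double m + double n := by
  unfold double
  omega

theorem double_succ (n : Nat) : double (n + 1) = double n + 2 := by
  unfold double
  omega
```

Each proof unfolds the definition and hands the resulting linear arithmetic goal to `omega`, the kind of step a prover agent would be expected to find quickly.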

Now scale that to Algebraic Combinatorics. Hundreds of defs, lemmas chaining across chapters. Agents flag inter-chapter deps via issues. Reviewers catch sketch-prover mismatches. It’s not magic—it’s architecture.

Can AI Agents Actually Scale Proofs?

My take: RepoProver echoes the Coq community’s early days, when hand-formalizing texts took teams years. Remember Software Foundations? Humans slaved over it. Now? Agents parallelize the grind.

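But don’t buy the hype wholesale. LLMs hallucinate, and hard. Provers appear to fail most first attempts (roughly 70% is an estimate inferred from token counts that hint at retries, not a published figure). Reviewers cull the garbage. Without this scaffold? I’d expect a solo GPT-4o to crumble around page 50.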

Data backs it: run scripts/count_tokens.py. Sketchers guzzle most upfront; provers iterate. Plots show efficiency climbing—agents learn from reviews.

Yet, corporate spin (if any) would call this ‘revolutionary.’ Nah. It’s pragmatic. Lean + Mathlib already crushes informal math. RepoProver just automates the drudgery, unlocking libraries of formalized texts.

Three words: Proof engineering.

Agents wander—create issues/refactor.yaml, beg for help on poset defs bleeding into combinatorics. Maintainer agents refactor. It’s messy, human-like teamwork. Emergent coordination without a central brain.
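Because issues are just files, an agent "opening an issue" can be as simple as writing YAML into issues/. Here is a minimal sketch under stated assumptions: the field names (`title`, `status`, `blocked_targets`, `note`) and the `open_issue` helper are hypothetical, and only the issues/ directory and the refactor.yaml file name come from the description above.

```python
import json
import tempfile
from pathlib import Path


def open_issue(issues_dir: Path, slug: str, title: str, blocked_targets: list[str], note: str) -> Path:
    """Write a small YAML issue file and return its path.

    ASSUMPTION: the field names below are invented for illustration.
    json.dumps is used for quoting because its output for a string is
    also a valid double-quoted YAML scalar.
    """
    issues_dir.mkdir(parents=True, exist_ok=True)
    lines = [f"title: {json.dumps(title)}", "status: open"]
    if blocked_targets:
        lines.append("blocked_targets:")
        lines.extend(f"  - {json.dumps(target)}" for target in blocked_targets)
    else:
        lines.append("blocked_targets: []")
    lines.append(f"note: {json.dumps(note)}")

    path = issues_dir / f"{slug}.yaml"
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        issue_path = open_issue(
            Path(tmp) / "issues",
            slug="refactor",
            title="Poset definitions are needed by the combinatorics chapters",
            blocked_targets=["def:poset"],
            note="Move the shared definitions into a common file.",
        )
        print(issue_path.read_text(encoding="utf-8"))
```

Keeping issues on the file system means every blocker is versioned alongside the proofs, so it can be reviewed in the same git history.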

Why Does This Matter for Formal Math Nerds?

Formalization lags wildly behind papers. Thousands of theorems remain unformalized in Lean/Coq/Isabelle. Textbooks? Barely touched.

RepoProver flips that. Feed it any LaTeX math book—split chapters, manifest targets. Agents chew through. Output: a building Lean lib, git history intact for audits.

Prediction: In two years, we’ll see forks formalizing undergrad texts en masse. Mathlib swells 10x. AI-assisted verification hits curricula. (Bold? Maybe. But clusters are cheap.)

Critique time: Setup’s fiddly. lake init, manifests, git init—devs only. No plug-and-play for theorists. And LLMs? Still choke on heavy category theory. But for combinatorics, algebra? Gold.

Token viewer spins up a local server: watch trajectories, failed PRs, issue threads. Debugging an agent’s brain-fart feels like pair-programming a junior dev.

The Roadblocks—and Fixes

Blocker one: cross-chapter deps. Agents issue-track ‘em.

Two: Long proofs. Larger pool sizes and more parallelism.

Three: Quality. Reviewers reject 60%+ of PRs in early runs (implied by efficiency plots).

It works because it’s not one LLM. It’s a swarm, git-glued.

Distributed runs are covered too: the stool launcher snapshots the code and symlinks heavy directories. SLURM-ready.



Frequently Asked Questions

What is RepoProver?

RepoProver’s a multi-agent LLM system that auto-formalizes LaTeX math textbooks into Lean code using git for collaboration.

How do I run RepoProver on my own project?

Set up a Lean project with Mathlib, add tex/ dirs and manifest.json, git init, then python -m repoprover run /path --pool-size 10.

Can RepoProver handle real graduate textbooks?

Yes—it fully formalized Algebraic Combinatorics, a full grad text, with building proofs and all.

Written by Priya Sundaram

Hardware and infrastructure reporter. Tracks GPU wars, chip design, and the compute economy.



Originally reported by Hacker News
