Staring at my terminal last week, I watched four AI agents chew through a simple request for an AWS app server — and spiral into chaos over a security group rule.
That’s InfraSquad in action, folks. This open-source multi-agent setup, built on LangGraph, promises to ditch the endless meetings where architects, DevOps folks, security teams, and diagrammers all yell about 0.0.0.0/0 CIDRs. You type plain English requirements. Boom: deployable Terraform HCL, a security audit with fixes, and a pretty Mermaid diagram. No drama.
Or so they thought.
What the Hell is InfraSquad?
Four agents, one pipeline, shared state. Product Architect dreams up a numbered AWS plan from your words, pondering scale, costs, compliance. DevOps Engineer cranks out Terraform code — and iterates on security gripes. Security Auditor scans with tfsec or Checkov, spits JSON reports, flags horrors. Visualizer renders the final diagram as PNG.
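To make the auditor's job concrete, here's a rough sketch of turning a scanner's JSON report into a pass/fail signal. Big assumptions: the report has a top-level `results` list whose items carry `rule_id`, `severity` and `description` (roughly what tfsec-style output looks like, but check your scanner's real output), and the `BLOCKING` set, `parse_report` and `security_passed` are names I made up, not InfraSquad's.

```python
import json
from typing import TypedDict

# Assumption: which severities should block the pipeline. Tune to taste.
BLOCKING = {"HIGH", "CRITICAL"}


class Finding(TypedDict):
    rule_id: str
    severity: str
    description: str


def parse_report(raw: str) -> list[Finding]:
    """Extract findings from a JSON report with an assumed tfsec-like shape."""
    data = json.loads(raw)
    findings: list[Finding] = []
    # Tolerate a missing or null "results" list (a clean scan).
    for item in data.get("results") or []:
        findings.append({
            "rule_id": item.get("rule_id", "UNKNOWN"),
            "severity": str(item.get("severity", "")).upper(),
            "description": item.get("description", ""),
        })
    return findings


def security_passed(findings: list[Finding]) -> bool:
    """Pass only when no finding has a blocking severity."""
    return not any(f["severity"] in BLOCKING for f in findings)


sample = json.dumps({
    "results": [{
        "rule_id": "AVD-AWS-0107",
        "severity": "HIGH",
        "description": "Security group rule allows ingress from 0.0.0.0/0",
    }]
})
findings = parse_report(sample)
print(security_passed(findings))
```

Note what this can't do: it judges by severity alone, with no idea whether the exposure is on purpose.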
The kicker? Loops. Security doesn’t just nag; it kicks code back for rewrites. Up to three cycles, then mercy — proceed anyway, warnings and all.
Here’s the raw truth from the builders:
“The critical word in that table is ‘sent back.’ The Security Auditor does not just generate a report and hand it off. It can send the DevOps Engineer back to fix its own code. That feedback loop is the most interesting design decision in the system.”
Smart. Risky.
But.
Integration test, day two. Request an internet-facing Application Load Balancer. Security flags AVD-AWS-0107: HIGH, unrestricted ingress from 0.0.0.0/0. DevOps fixes. Scan again. Same flag. Fix. Flag. Infinite hell.
Why? Public ALB needs open ingress. That’s the point. LLM can’t grok “intentional risk” vs. “bonehead mistake.” No cap? Eternal loop. They added one — after the fact.
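Here's a toy reproduction of that failure mode, not the project's code. The `audit` and `remediate` functions are stand-ins I invented: the fake auditor flags any HCL containing 0.0.0.0/0, and the fake DevOps agent can't remove it when the load balancer is meant to be public. `MAX_CYCLES` is a hypothetical name for the three-cycle cap.

```python
MAX_CYCLES = 3  # hypothetical name for the remediation cap


def audit(hcl: str) -> list[str]:
    """Stand-in auditor: flag anything that still exposes 0.0.0.0/0."""
    return ["AVD-AWS-0107"] if "0.0.0.0/0" in hcl else []


def remediate(hcl: str, public_alb: bool) -> str:
    """Stand-in DevOps agent: a public ALB has to keep its open ingress."""
    if public_alb:
        return hcl  # nothing to fix, the exposure is intentional
    return hcl.replace("0.0.0.0/0", "10.0.0.0/16")


def run(hcl: str, public_alb: bool) -> tuple[int, list[str]]:
    """Audit, remediate, re-audit, until clean or the cap is hit."""
    cycles = 0
    findings = audit(hcl)
    while findings and cycles < MAX_CYCLES:
        hcl = remediate(hcl, public_alb)
        cycles += 1
        findings = audit(hcl)
    return cycles, findings


rule = 'cidr_blocks = ["0.0.0.0/0"]'
print(run(rule, public_alb=True))   # (3, ['AVD-AWS-0107'])
print(run(rule, public_alb=False))  # (1, [])
```

Drop the `cycles < MAX_CYCLES` check and the public-ALB case never exits, because the finding can never clear.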
Why Can’t AI Stop Hallucinating 0.0.0.0/0?
Every model spits out the same open 0.0.0.0/0 ingress rule. Weeks of prompt tweaks, examples, counter-examples. Nada. It nods at rules, then ignores ‘em.
This ain’t new. Flashback to 2014, Ansible’s YAML heyday. Ops teams thought declarative configs would end misconfigs forever. Ha. ‘Hallucinations’ were just bugs back then — human ones. Now LLMs dress ‘em in suits, charge premium, and call it ‘agents.’ Same trap: tools amplify our blind spots, don’t erase ‘em.
Here’s my hot take, absent from the original post: InfraSquad’s loop cap is a band-aid on AI’s core flaw — zero real-world judgment. It’ll churn safe cookie-cutter VPCs fine. Throw in edge cases like hybrid clouds or regulated finance? Humans still rule. And who’s cashing in? Andela’s bootcamp crew open-sources it, sure, but watch for the enterprise pivot: ‘Pay us to tweak your prompts.’ Classic Valley grift.
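The shared state is typed: a TypedDict declared with total=False, so every key is optional. Each agent writes only its own fields and reads the rest with .get() defaults, which dodges crashes on missing or None values. Input gets validated first with cheap checks. The HCL validation loop is capped too.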
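A minimal sketch of that state pattern, with caveats: only `security_passed` and `remediation_count` appear in the project's routing code, so the other field names are my guesses, and the `merge` helper is a stand-in for whatever the graph runtime does when a node returns a partial update.

```python
from typing import TypedDict


class AgentState(TypedDict, total=False):
    # total=False makes every key optional, so partial states type-check.
    requirements: str        # hypothetical field name
    plan: str                # hypothetical field name
    terraform_hcl: str       # hypothetical field name
    security_passed: bool
    remediation_count: int


def architect(state: AgentState) -> AgentState:
    """Writes only its own field and returns a partial update."""
    return {"plan": f"1. VPC for: {state.get('requirements', '')}"}


def devops(state: AgentState) -> AgentState:
    """Reads the plan with a default instead of assuming it exists."""
    plan = state.get("plan", "")
    return {"terraform_hcl": f"# generated from plan: {plan}"}


def merge(state: AgentState, update: AgentState) -> AgentState:
    """Stand-in for the runtime merging a node's update into shared state."""
    merged: AgentState = {**state, **update}
    return merged


state: AgentState = {"requirements": "app server"}
state = merge(state, architect(state))
state = merge(state, devops(state))
print(sorted(state))
print(state.get("remediation_count", 0))  # never set, so the default applies
```

The payoff is in the last line: a key nobody has written yet reads as a sane default instead of blowing up.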
Routing logic? Crystal:
    from typing import Literal

    def route_after_security(state: AgentState) -> Literal["visualizer", "devops"]:
        # Clean scan: go draw the diagram.
        if state.get("security_passed", False):
            return "visualizer"
        if state.get("remediation_count", 0) >= settings.max_remediation_cycles:
            return "visualizer"  # move on regardless
        return "devops"
After three tries, ship it. Unfixed issues? Advisory only. Pragmatic.
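If you want to poke at the router without the rest of the pipeline, something like this works. The `AgentState` and `Settings` stubs are my own minimal stand-ins (the real ones surely carry more fields), and the default of 3 is taken from the three-cycle cap described above.

```python
from dataclasses import dataclass
from typing import Literal, TypedDict


class AgentState(TypedDict, total=False):
    # Stub: only the two keys the router reads.
    security_passed: bool
    remediation_count: int


@dataclass(frozen=True)
class Settings:
    # Stub config; 3 mirrors the cap the article describes.
    max_remediation_cycles: int = 3


settings = Settings()


def route_after_security(state: AgentState) -> Literal["visualizer", "devops"]:
    if state.get("security_passed", False):
        return "visualizer"
    if state.get("remediation_count", 0) >= settings.max_remediation_cycles:
        return "visualizer"  # move on regardless
    return "devops"


cases: list[AgentState] = [
    {"security_passed": True},                           # clean scan
    {"security_passed": False, "remediation_count": 1},  # retry
    {"security_passed": False, "remediation_count": 3},  # cap reached
    {},                                                  # nothing set yet
]
for case in cases:
    print(case, "->", route_after_security(case))
```

The empty-state case is worth a look: with both keys missing, the defaults send the code back to DevOps instead of crashing.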
Is This Ready for Prime Time, or Dev Playground Toy?
Tested on AWS, Terraform-focused. Scale it? Unknown. Multi-cloud? Dream on. But for solo devs or small teams grinding VPCs, it’s a time-suck killer — if you babysit.
Broke badly without caps. Silent failures from state mismatches. Models too literal on security dogma.
Prediction: By 2025, we’ll see forks with human-in-loop overrides, maybe RAG on your org’s compliance docs. But full autonomy? Nah. Too much liability when that ‘fixed’ SG nukes prod.
Who wins? Open-source hackers iterating fast. Andela gets cred. You? Fewer meetings, maybe. But don’t fire your SecOps guy yet.
The code’s out there: Andela-AI-Engineering-Bootcamp/infrasquad. Fork it. Break it. Fix it.
Short version: Add cycle caps day zero. Prompt-fight less, state-share more. AI infra? Promising hack. Total replacement? Laughable hype.
Frequently Asked Questions
What is InfraSquad and how does it work?
A multi-agent LangGraph pipeline that turns plain-English requirements into Terraform HCL, a security audit, and a diagram, with a capped security remediation loop.
Can AI agents replace DevOps meetings for Terraform?
Partially — handles boilerplate, but edge cases and judgment calls still need humans.
Why do AI Terraform tools keep generating 0.0.0.0/0?
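Models keep reproducing the open-ingress pattern despite prompting, and the auditor can’t tell intentional exposure (a public ALB) from a mistake. Cap the loop and add overrides.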