AI's Step-by-Step Lies: Chain-of-Thought's Dirty Secret
Chain-of-Thought reasoning was supposed to make AI transparent. Turns out, it's often just post-hoc BS from models that already know their answer.
Picture this: an AI agent that chats about data pipelines, pulls from its 'memory,' then learns from its own BS. Sounds agentic. But does it deliver—or just hallucinate smarter?
LLMs choke on long prompts, dropping accuracy by 40% past 50K tokens. Recursive language models promise an escape: let the model call itself on pieces of its own context. Smart, or just recursive madness?
Google just dropped Gemini 3.1 Pro, claiming doubled scores on tough reasoning benchmarks. It's rolling out everywhere. Is this the intelligence upgrade we've been waiting for, or just another incremental tweak?
OpenAI's o3 didn't just scale: it poured 10x more compute into reinforcement learning for reasoning and smashed benchmarks. Meanwhile, GPT-4.5's yawn of a debut shows that scaling pretraining alone is tapped out.
Spending more compute at inference — not training — unlocks LLM reasoning gains that rival model upgrades. Here's the categorized playbook from recent papers.
Sebastian dangles Chapter 1 like catnip for AI nerds. But does 'reasoning from scratch' crack the code — or just repackage old tricks?