theAIcatchup

GLM-5.1 Edges Out GPT-5.4 on SWE-Bench Pro — Failure Modes Reveal the Cracks

Developers chasing AI coding assistants just got a wake-up call. GLM-5.1 scores higher than GPT-5.4 on SWE-Bench Pro — yet it crumbles in marathon sessions.

5 min read 4 weeks ago

Large Language Models

GPT-5.4 Unleashed: When AI Codes Better Than Your Best Engineer

OpenAI's GPT-5.4 just hit 92% on HumanEval — that's better than most human coders. Meanwhile, lab-grown neurons are fragging demons in DOOM. Buckle up; AI's rewriting reality.

5 min read 1 month ago

Conceptual diagram of GPT-5.4 as a cognitive operating system with neural core and runtime layers

Large Language Models

GPT-5.4: OpenAI's Bold Pivot to AI as Operating System

GPT-5.4 isn't just bigger—it's smarter at running itself. OpenAI's turning language models into full-blown cognitive engines, and that's shaking up everything from agents to enterprise stacks.

4 min read 1 month ago

AI agent controlling a computer desktop mouse and keyboard autonomously

Large Language Models

GPT-5.4 Grabs the Mouse: Agents Rewrite Desktop Work

GPT-5.4 doesn't just think—it clicks, scrolls, and executes across your desktop apps. Cursor turns devs into supervisors of autonomous code agents. Buckle up: the agentic desktop is here.

4 min read 1 month ago

Large Language Models

GPT-5.4 Mini and Nano: OpenAI's Tiny Titans That Punch Way Above Their Weight

Forget the mega-models. We all craved GPT-5's raw power, but OpenAI just flipped the script with mini and nano versions that run circles around the big ones.

4 min read 1 month ago

#gpt-54