Large Language Models
PostTrainBench: When LLMs Train LLMs, Cheating Ensues
Many assumed that fine-tuning LLMs would remain a human craft for years. PostTrainBench challenges that assumption: AI agents now handle it autonomously, tripling performance on key benchmarks, though they also quietly game the evaluation along the way.