Fixed. At last.
Kubernetes folks — and OCI runtimes — just rolled out a sharper conversion from cgroup v1 CPU shares to v2 CPU weight. No more pods getting bullied by system daemons. Or sub-cgroups choking on granularity crumbs.
It’s about time. cgroup v1’s simple shares (2 to 262144) never meshed with v2’s tidy weights (1 to 10000). The old linear map? A hack from KEP-2254. Pumped out puny numbers. Like turning a one-CPU request (shares=1024) into a measly weight of 39. Default system weight’s 100. Pods suddenly second-class citizens.
Why the Old Formula Was a Dumpster Fire
Picture this: your container begs for 1 CPU. v1 hands it shares=1024, matching daemon defaults. Fair fight. v2? Slaps it with 39. Pods starve while host processes feast. Especially nasty in daemon-heavy clusters. Resource crunch hits? Kubernetes workloads fold first.
And granularity? Laughable. 100m CPU request — shares=102 — becomes weight=4. Try splitting that among sub-cgroups. Impossible. KEP-5474 dreams of writable cgroups for unprivileged pods? Dead on arrival with numbers that tiny.
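For the curious, the old KEP-2254-style linear map can be sketched in a few lines of Python (integer truncation, as runtimes do it; an illustration, not the exact runtime source):

```python
def shares_to_weight_linear(shares: int) -> int:
    """Old linear map: [2, 262144] shares -> [1, 10000] weight."""
    return 1 + ((shares - 2) * 9999) // 262142

# A one-CPU request (shares=1024) lands far below the systemd default of 100:
print(shares_to_weight_linear(1024))  # 39
# A 100m request (shares=102) is nearly indistinguishable from the minimum:
print(shares_to_weight_linear(102))   # 4
```

Run those two numbers against a host daemon sitting at weight 100 and the priority problem is obvious.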
The current conversion formula creates two major issues: 1. Reduced priority against non-Kubernetes workloads… 2. Unmanageable granularity.
That’s straight from the announcement. No sugarcoating.
But here’s my hot take, one the post glosses over: this mess echoes the systemd-cgroup wars of 2015. Remember? Everyone screamed as v2 loomed, breaking toys left and right. Kubernetes lagged, naturally. Now, years later, they’re quadratic-curving their way out. Bold prediction: this quiets the last big cgroup gripes and smooths the path to distro-wide v2 adoption. No more hybrid cluster headaches.
Does the New Quadratic Formula Deliver?
Spoiler: yeah. Mostly.
It nails three anchors: (2,1) mins, (1024,100) defaults, (262144,10000) maxes. Close-to-linear in the sweet spot, but curved to preserve priorities. 1 CPU? Now weight=100. Bang on the system default. 100m? Jumps to 17. Sub-cgroups breathe easy.
Visually? Imagine a graph hugging the axes just right — no wild swings, no penny-pinching lows. OCI layer handles it: runc 1.3.2+, crun 1.23+. Kubernetes itself? Untouched. Runtimes dictate rollout.
Smart move, dodging kube’s glacial release cycle. But — em-dash alert — don’t pop champagne yet.
Gotcha: Your Deployments Might Explode
Existing setups assuming linear math? Busted. Monitoring dashboards spitting wrong weights. Custom managers hardcoding old formulas. Update ‘em, or watch priorities flip.
It’s not hype — the post warns: “Some consumers may be affected.” Understatement. If you’re scripting cgroup tweaks (why?), rewrite. And those sub-cgroup dreams? Test now. Unprivileged writable cgroups loom; granularity matters.
Look, OCI’s fix is solid. No PR spin here — just math doing its job. But Kubernetes’ v1 baggage? Lingers like a bad sequel. Clusters still straddle hierarchies. Full v2? When devs stop whining.
Short version: upgrade runtimes. Profit.
This isn’t revolutionary. It’s cleanup. Long-overdue. Systemd fans nod knowingly — v2’s superior, always was. Pods deserved equal footing. Now they’ve got it. Granularity bonus? Icing.
But wander with me here: what if this sparks a cgroup v3? Nah. Too soon. Kernel’s iterating elsewhere.
Why Does cgroup v2 Matter for Your Clusters?
Devs on cgroup v1? You’re dinosaurs. v2’s unified hierarchy slays delegation woes. Proportional weights? Predictable under load. No shares lottery.
Kubernetes has pushed v2 hard since 1.25, when cgroup v2 support went GA; v1 has sat in maintenance mode since 1.31. This conversion seals the deal: no priority cliffs. Daemon storms? Pods hold ground.
Historical parallel: think iptables to nftables. Painful shift, but nft won. cgroup v2’s nft. Suck it up.
Prod tip: audit runtimes. containerd? Pulls runc/crun. Docker? Same. Version check scripts, stat.
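A tiny helper for those audit scripts: compare a reported runtime version against the minimums above. Purely illustrative; it assumes plain dotted version strings like the ones runc and crun print.

```python
def version_at_least(version: str, minimum: str) -> bool:
    """True when a dotted version string meets a minimum, e.g. '1.3.2'."""
    def parse(v: str) -> tuple:
        return tuple(int(part) for part in v.split("."))
    a, b = parse(version), parse(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so '1.23' compares like '1.23.0':
    return a + (0,) * (width - len(a)) >= b + (0,) * (width - len(b))

print(version_at_least("1.3.2", "1.3.2"))  # True: runc minimum met
print(version_at_least("1.3.1", "1.3.2"))  # False: one patch short
```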
And for the tinkerers — parenthetical: yeah, you — sub-cgroups just viable. Spin up tests. 17 weight splits nicer than 4.
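To see why 17 beats 4, try squeezing an intended 6:3:1 CPU split onto a small integer weight scale. A toy illustration, not a real cgroup API; it just shows the rounding loss:

```python
def scaled_weights(fractions: list, scale: int) -> list:
    """Round intended CPU fractions onto an integer weight scale (min weight 1)."""
    return [max(1, round(f * scale)) for f in fractions]

intended = [0.6, 0.3, 0.1]
print(scaled_weights(intended, 4))   # [2, 1, 1]: 6:3:1 collapses to 2:1:1
print(scaled_weights(intended, 17))  # [10, 5, 2]: close to the intended ratio
```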
Rollout Realities: Runtimes Lead, Kube Follows
runc 1.3.2. crun 1.23. Drop-ins. No kube patches needed. Beautiful.
But clusters laggy on updates? Pray. Or force it.
Critique time: announcement’s chipper. “Excited to announce.” After years of breakage? Try “Sorry for the screwup.” Dry humor aside — they fixed it. Credit where due.
Impact? High for v2 migrants. Low if you’re v1 holdout. (Don’t be.)
Frequently Asked Questions
What’s the new cgroup v1 to v2 CPU conversion formula?
It’s a quadratic mapping hitting (2,1), (1024,100), (262144,10000). Turns a one-CPU request (shares=1024) into weight 100 instead of 39. Fixes priority and granularity.
Will this break my Kubernetes monitoring?
If it assumes old linear math, yes. Update tools expecting cpu.weight from shares.
Do I need to upgrade Kubernetes for cgroup v2 CPU fix?
Nope. OCI runtimes only: runc 1.3.2+, crun 1.23+.