cgroup v1 to v2 CPU Shares Conversion Fix

Picture your Kubernetes pods gasping for CPU cycles, starved by a botched conversion formula. OCI runtimes just dropped a quadratic fix – better, but not without pitfalls.

Kubernetes' cgroup v2 CPU Fix: Quadratic Magic or Half-Measure? — The AI Catchup

Key Takeaways

  • New quadratic formula fixes CPU priority loss and granularity issues in cgroup v2 transitions.
  • Implemented at OCI runtime level (runc/crun), not Kubernetes core – fast adoption possible.
  • Watch for breakage in tools hardcoding old linear weights; test thoroughly.

Your container’s begging for that 1 CPU. But under the old cgroup v2 rules, it gets shafted to a measly weight of 39. Ouch.

Now zoom out: Kubernetes, built in the cgroup v1 era, stumbled hard into v2 territory. CPU shares? Simple back then – a 1-CPU request meant 1024 shares, matching system daemons pound for pound. v2 flips the script with weights from 1 to 10,000. The original linear map? Disaster. Pods lose priority to random host processes. Granularity vanishes for tiny requests.
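For reference, the v1-era request-to-shares step is a one-liner: millicores scale by 1024/1000, floored at the kernel minimum of 2. A quick Python sketch mirroring the kubelet's MilliCPUToShares helper:

```python
MIN_SHARES = 2         # kernel-imposed floor for cpu.shares
SHARES_PER_CPU = 1024  # cgroup v1 shares for one full CPU

def milli_cpu_to_shares(milli_cpu: int) -> int:
    """Convert a CPU request in millicores to cgroup v1 cpu.shares."""
    if milli_cpu == 0:
        return MIN_SHARES
    return max(MIN_SHARES, milli_cpu * SHARES_PER_CPU // 1000)

print(milli_cpu_to_shares(1000))  # 1 CPU request -> 1024 shares
print(milli_cpu_to_shares(100))   # 100m request  -> 102 shares
```

That 1024-for-1-CPU mapping is what made v1 priorities line up with system daemons out of the box.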

Here’s the rub.

The current conversion formula creates two major issues: 1. Reduced priority against non-Kubernetes workloads… 2. Unmanageable granularity.

That’s straight from the announcement. Spot on. But let’s not pat backs yet.

Why Did Kubernetes Screw Up cgroup v2 So Badly?

Blame history. K8s launched when cgroup v1 ruled Linux kernels – think 2014 vibes, sysadmins still nursing Upstart hangovers. KEP-2254 slapped in a linear formula: cpu.weight = 1 + ((cpu.shares - 2) * 9999) / 262142. Neat on paper. Maps shares of 2 to 262144 onto weights of 1 to 10000.

Reality? A 1 CPU pod (1024 shares) drops to weight 39. Default system weight’s 100. Your workloads – deprioritized against kubelet, CRI-O, whatever daemon’s lurking outside the cluster fence. Starvation city during peaks.

And small requests? 100m CPU: shares 102, weight 4. Try subdividing that for sub-cgroups. Good luck – it’s like splitting a Tic Tac.
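The arithmetic above is easy to reproduce. A minimal Python sketch of the KEP-2254 linear map (integer division, since the kernel-facing value is an integer):

```python
def linear_shares_to_weight(shares: int) -> int:
    """KEP-2254 linear map: cgroup v1 cpu.shares (2..262144) -> v2 cpu.weight (1..10000)."""
    return 1 + (shares - 2) * 9999 // 262142

print(linear_shares_to_weight(1024))    # 1 CPU pod  -> 39 (vs. system default weight 100)
print(linear_shares_to_weight(102))     # 100m pod   -> 4 (almost no room to subdivide)
print(linear_shares_to_weight(262144))  # max shares -> 10000
```

Run it and both pain points fall straight out of the numbers.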

But wait, OCI runtimes – runc 1.3.2, crun 1.23 – roll out a quadratic savior. No K8s core change needed. Upgrade the node's runtime? Fixed.

The New Formula: Quadratic Wizardry or Overkill?

It pins three anchors: (2 shares, 1 weight), (1024, 100), (262144, 10000). Close-to-linear curve, zoomed in on the meaty bits.

1 CPU pod? Now weight 100. Bang on default. 100m? 17 – subdividable. Solves both pains dead-on.

Smart. But complicated. Here’s my hot take, absent from the post: this echoes the systemd cgroup v1-to-v2 bloodbath circa 2019. Remember? Schedulability tanked, workloads throttled weirdly till kernels caught up. Kubernetes dodged a bullet by punting to OCI – but expect the same chaos if runtimes lag. Bold prediction: by Q2 2025, 20% of enterprise clusters hit snags from stale monitoring assuming linear math.

Dry humor aside, it’s progress. Corporate spin? None here – this is raw engineering, no “revolutionary” fluff. Still, they admit breakage: tools hardcoding old weights? Update or bust.

Adoption’s runtime-tied. Docker? Via runc. Podman? Crun. K3s, kind? Check your versions. Existing deploys – mostly fine, unless you’re scripting weights directly.

Does This Actually Matter for Your Cluster?

Short answer: yes, if you’re on cgroup v2 – which is most modern distros now. Fedora, Ubuntu 22.04+, RHEL 9. Boot with systemd.unified_cgroup_hierarchy=1? You’re live.
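Not sure which mode a node booted in? On a unified (v2) hierarchy, the cgroup root exposes a cgroup.controllers file. A hypothetical helper sketching that check (the path parameter exists only to make it testable; on a real node you'd call it with the default):

```python
import os

def is_cgroup_v2(cgroup_root: str = "/sys/fs/cgroup") -> bool:
    """On cgroup v2, the root of the unified hierarchy exposes cgroup.controllers."""
    return os.path.isfile(os.path.join(cgroup_root, "cgroup.controllers"))
```

On a v1 host, controllers are mounted as separate subdirectories instead, so the file is absent at the root.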

Pain points hit mixed workloads hard. Heavy on system services? Old formula starved K8s. Now, parity restored. Granularity boost tees up KEP-5474’s writable cgroups – unprivileged pods slicing CPU like pros.

Skepticism time. Is quadratic perfect? Nah. Weights top at 10k; shares explode to 2^18. Edge cases – massive requests – might clip funny. Test it. Always test.

And that visual graph? Gold. Linear’s a straight slash. Quadratic hugs the defaults, bends smartly. If only all fixes looked this pretty.

Look, Kubernetes PR machine loves clean wins. This? Underdog fix at OCI level. Kudos to whoever crunched the math – probably some runc dev nursing coffee at 3 AM.

But don’t sleep. Monitor your weights post-upgrade. Tools like Prometheus scraping old formulas? They’ll scream false alarms.
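One cheap sanity check after upgrading: read a container's cpu.weight from its cgroup and see which formula it matches. A hypothetical sketch, reimplementing both conversions so it stands alone (for tiny requests near 2 shares the two formulas converge, so the answer can be ambiguous there):

```python
import math

def classify_weight(shares: int, observed_weight: int) -> str:
    """Guess whether a node applied the old linear or the new quadratic conversion."""
    linear = 1 + (shares - 2) * 9999 // 262142
    l = math.log2(shares)
    quadratic = round(10 ** ((l * l + 125 * l) / 612 - 7 / 34))
    if observed_weight == quadratic:
        return "new quadratic formula"
    if observed_weight == linear:
        return "old linear formula"
    return "neither -- something else set this weight"

print(classify_weight(1024, 39))   # old linear formula
print(classify_weight(1024, 100))  # new quadratic formula
```

Wire that into a post-upgrade check and stale nodes show themselves immediately.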

The Long Game: cgroup v2 Maturity

This patch papers a transition wound. Full native v2 support? Coming, but slow. History parallels: Upstart to systemd took years; v1 to v2’s no different. Kubernetes’ll get there – KEP-2254 was step one.

Unique angle: it’s like fixing IPv4-to-IPv6 NAT hacks. Works, but screams for native stacks. OCI’s move buys time, exposes K8s’ v1 baggage.

Punchy truth. Upgrade runtimes. Watch for breakage. Profit.



Frequently Asked Questions

What is the new cgroup v2 CPU weight formula for Kubernetes?

It’s a quadratic map hitting (2,1), (1024,100), (262144,10000) – implemented in runc 1.3.2+ and crun 1.23+.

Will cgroup v1 to v2 conversion break my Kubernetes cluster?

Unlikely for most – but monitoring/tools assuming linear math need fixes. Test post-upgrade.

How do I check if my runtime uses the new formula?

runc --version or crun --version; look for 1.3.2+ / 1.23+. Inspect pod cgroups via crictl.

Written by James Kowalski

Investigative tech reporter focused on AI ethics, regulation, and societal impact.



Originally reported by Kubernetes Blog
