Picture this: you’re tweaking that home LLM server late at night. Suddenly, everything crawls. SSH sessions hang like drunks at last call. Web apps time out. The box is thrashing memory, begging for mercy — but the kernel’s OOM killer? It’s asleep at the wheel, or worse, picking the wrong victim.
That’s the systemd-oomd reality check millions of Linux users dodge daily. Not some abstract kernel tweak. Real pain for devs, sysadmins, anyone shoving AI workloads or batch jobs onto modest hardware.
And here's the kicker: systemd-oomd doesn't wait for the funeral. It sniffs out memory pressure via PSI (that's Pressure Stall Information, for the uninitiated) and cgroup memory stats, then mercy-kills the offending cgroup subtree. Predictable failures. No more finger-crossing.
Why Bother with systemd-oomd When OOM Killer Exists?
But — hold up. We’ve had OOM killers forever. Why this userspace interloper?
Old-school kernel OOM? It's a crapshoot: badness heuristics that haven't aged well, firing only after memory is already gone. The biggest process eats the bullet, guilty or not. Your database dies while the runaway script lives on. Nightmare.
systemd-oomd? Smarter. Proactive. Watches cgroup v2 hierarchies — unified, finally — and PSI metrics. Those telltale stalls: ‘some’ tasks waiting, or god forbid, ‘full’ where everything non-idle freezes.
systemd-oomd is a userspace OOM killer that uses cgroups v2 and pressure stall information (PSI) to take corrective action before a kernel-space OOM occurs.
Pulled straight from the man page. No hype. Just facts.
I call BS on distros dragging feet here. Containers nailed this years ago with Kubernetes evictions. Fedora flipped systemd-oomd on by default back in Fedora 34; Ubuntu followed on the 22.04 desktop. The rest of the distro world is still playing catch-up.
First, gut check your rig.
stat -fc %T /sys/fs/cgroup
cgroup2fs? Good. Else, boot with systemd.unified_cgroup_hierarchy=1.
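Need it to survive reboots? A minimal sketch for GRUB setups (assumes a stock /etc/default/grub; adapt for other bootloaders):

sudo sed -i 's/^GRUB_CMDLINE_LINUX="/&systemd.unified_cgroup_hierarchy=1 /' /etc/default/grub
sudo update-grub   # Fedora/RHEL: grub2-mkconfig -o /boot/grub2/grub.cfg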
ls /proc/pressure
See memory? You've got PSI (kernel 4.20+). No memory file? Your kernel is too old, or PSI was built in but disabled; try booting with psi=1. Otherwise, upgrade.
cat /proc/pressure/memory
Stalls piling up? That’s your canary.
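Two lines come back, roughly this shape (numbers here are made up):

some avg10=0.35 avg60=0.12 avg300=0.04 total=987654
full avg10=0.00 avg60=0.02 avg300=0.01 total=123456

avg10/avg60/avg300 are the percent of time stalled over 10s, 60s, and 300s windows. 'some' means at least one task was stuck waiting on memory; 'full' means all non-idle tasks were.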
Can Your Distro Even Run This Thing?
Packaging roulette.
systemctl list-unit-files 'systemd-oomd*'
Nada? Hunt packages. Debian and Ubuntu: sudo apt install systemd-oomd. RHEL-land: dnf whatprovides '*/systemd-oomd' to find the owning package. Arch? It's baked into systemd.
Enable: sudo systemctl enable --now systemd-oomd.service
Status: systemctl status systemd-oomd.service --no-pager
Active? Sweet. Now memory accounting — don’t skip.
systemctl show --property=DefaultMemoryAccounting
No? Drop-in time.
sudo mkdir -p /etc/systemd/system.conf.d
sudo tee /etc/systemd/system.conf.d/60-memory-accounting.conf <<EOF
[Manager]
DefaultMemoryAccounting=yes
EOF
sudo systemctl daemon-reexec
Verify. Swap? Crank it on. oomd needs breathing room.
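Check what you've got, and if the answer is nothing, a minimal swapfile sketch (the 4G size is arbitrary):

swapon --show                # no output means no swap
sudo fallocate -l 4G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile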
Defaults: SwapUsedLimit=90%, DefaultMemoryPressureLimit=60% over a 30s window. Sane for batch boxes. Latency hawks? Tweak toward 40% at 10s. Experiment, or regret.
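Those globals live in oomd.conf. A latency-leaning drop-in sketch (the file name is my pick; the [OOM] keys are the real ones from oomd.conf(5)):

sudo mkdir -p /etc/systemd/oomd.conf.d
sudo tee /etc/systemd/oomd.conf.d/50-latency.conf <<EOF
[OOM]
DefaultMemoryPressureLimit=40%
DefaultMemoryPressureDurationSec=10s
EOF
sudo systemctl restart systemd-oomd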
Policy: Slice It Right, or Die Trying
Don’t spray ManagedOOM= everywhere. That’s amateur hour.
Target slices. system.slice for servers — bundles daemons, workers. Your runaway PostgreSQL or tensor job? Descendant cgroup gets the axe.
sudo systemctl edit system.slice
[Slice]
ManagedOOMMemoryPressure=kill
ManagedOOMMemoryPressureLimit=50%
ManagedOOMMemoryPressureDurationSec=20s
Then: sudo systemctl daemon-reload
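Sanity-check it took:

systemctl show system.slice -p ManagedOOMMemoryPressure -p ManagedOOMMemoryPressureLimit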
Why 50/20? My twist: mirrors cloud bursting. AWS kills at similar thresholds. Less aggressive than kernel panic, but bites before users rage-quit.
user.slice for desktops. Or slices for AI: sudo systemctl edit ai.slice — if you made one.
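Haven't made one? A minimal sketch; the slice name, description, and thresholds are all my picks:

sudo tee /etc/systemd/system/ai.slice <<EOF
[Unit]
Description=Self-hosted AI workloads

[Slice]
ManagedOOMMemoryPressure=kill
ManagedOOMMemoryPressureLimit=50%
ManagedOOMMemoryPressureDurationSec=20s
EOF
sudo systemctl daemon-reload

Then launch the model server inside it, e.g. sudo systemd-run --slice=ai.slice --pty ./llama-server -m model.gguf (binary and flags depend on your llama.cpp build).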
Pro tip: ManagedOOMSwap=kill too. Dual threat.
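Roughly what Fedora's systemd-oomd-defaults package ships on the root slice; a sketch with my own file name:

sudo mkdir -p /etc/systemd/system/-.slice.d
sudo tee /etc/systemd/system/-.slice.d/10-oomd-swap.conf <<EOF
[Slice]
ManagedOOMSwap=kill
EOF
sudo systemctl daemon-reload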
Inspect: oomctl dump
Monitored cgroups listed. Test? stress-ng --vm 4 --vm-bytes 80% -t 60s
Watch logs: journalctl -u systemd-oomd -f
Boom. Victim selected. Predictable.
The Hidden Gotcha: Why This Isn’t Bulletproof
Swap recommended? Understatement. No swap, oomd races kernel OOM. Livelock city.
cgroup v2 only. Hybrids? Mess. Migrate.
And PSI overhead — negligible, but penny-pinchers whine.
Historical parallel: early Android OOM adj scores. Same chaos, fixed by groups. systemd-oomd? Linux’s low-memory killer 2.0. But corporate spin (Lennart’s crew) paints it flawless. Nah. Tune wrong, kill innocents.
For self-hosted AI? Gold. Llama.cpp spiking? oomd prunes. No host swap hell.
Batch? Jenkins gone wild? Slices save the day.
Desktops? Risky. Gaming rig? Maybe not.
Testing Without Production Poker
Mock it.
Create test.slice.
sudo systemctl edit test.slice
Same policy.
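Spelled out:

[Slice]
ManagedOOMMemoryPressure=kill
ManagedOOMMemoryPressureLimit=50%
ManagedOOMMemoryPressureDurationSec=20s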
sudo systemd-run --slice=test.slice --pty stress-ng --vm 2 --vm-bytes 2G -t 30s
oomd watches. Kills.
Logs spill: which cgroup, why.
Scale up. Real workloads.
Tuning for Your Mess
Interactive? Drop limit to 30%, duration 10s.
Batch? 70%, 60s.
Per-slice: override.
[Slice]
ManagedOOMMemoryPressureLimit=40%
Reload: sudo systemctl daemon-reload
Metrics: oomctl dump shows current pressure and swap use for every monitored cgroup.
Want raw PSI per slice? cat /sys/fs/cgroup/system.slice/memory.pressure. Same some/full lines, scoped to that subtree.
Critique: docs bury this. systemd maze. Newbies bail.
Frequently Asked Questions
What is systemd-oomd and does it replace OOM killer?
systemd-oomd preempts the kernel OOM killer by watching PSI and cgroups v2, killing the stressed cgroup early so failures stay predictable.
How do I install systemd-oomd on Ubuntu?
sudo apt install systemd-oomd; sudo systemctl enable --now systemd-oomd.service
Will systemd-oomd kill my important services?
Set policy on slices like system.slice; it picks stressed descendants, not the whole group.