Terminal frozen. ‘Segmentation fault (core dumped).’ Sixty nodes in a linked list, and PHP just… dies.
I’ve seen this movie before—20 years chasing Silicon Valley’s wildest crashes. But this one’s fresh: php-ext-deepclone, a week-old C extension meant to serialize PHP object graphs into arrays without the usual deep-clone headaches. Developer Jérôme Tamarelle hits a wall; Claude Code (Anthropic’s AI) steps in, instruments the code, and we hunt the beast together. No C mastery required, they say. Yeah, right.
Look, extensions are PHP’s dark underbelly. You write C, compile to .so, load it up—blazing fast, until you scribble outside the lines. Then? No friendly exception. Just segfault city.
The Recursion Trap That Wasn’t
The function deepclone_to_array() chews through objects recursively. Hit a ‘next’ property on your list node? Boom, dive deeper. Forty-seven nodes? That’s 47 stack frames in C’s call stack—not PHP’s wimpy userland one.
They’d already bumped the recursion depth limit from 12 to 512. Still crashes. Quick math with otool: 480 bytes per frame times 47 frames? About 22 KB (22,560 bytes). The process stack laughs at that; the default on macOS and Linux is 8 MB. So, not overflow.
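The stack arithmetic is easy to sanity-check. Here is a minimal sketch using the article's figures (480 bytes per frame, 47 frames) and an assumed 8 MB stack limit; the helper names are hypothetical and not part of the extension.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: estimated stack use for `depth` nested calls,
 * assuming every frame has the same size. */
static size_t stack_bytes_needed(size_t frame_bytes, size_t depth)
{
    return frame_bytes * depth;
}

/* Returns 1 if the estimate fits within the given stack limit. */
static int fits_in_stack(size_t frame_bytes, size_t depth, size_t limit_bytes)
{
    return stack_bytes_needed(frame_bytes, depth) <= limit_bytes;
}

/* Prints the budget for the numbers quoted in the article. */
static void print_stack_budget(void)
{
    const size_t frame = 480;                 /* bytes per frame, per otool */
    const size_t depth = 47;                  /* frames quoted in the article */
    const size_t limit = 8u * 1024u * 1024u;  /* assumed 8 MB default stack */

    printf("need %zu bytes of %zu available\n",
           stack_bytes_needed(frame, depth), limit);
}
```

Even at the raised limit of 512 frames, the same estimate comes to 245,760 bytes, still a small fraction of 8 MB.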
But here’s the cynical truth: C programmers always blame the stack first. It’s the easy out. Real killer? Memory corruption. You overwrite some innocent struct way back, program chugs along smiling… until it reads the poison. Crash miles from the crime scene.
Why a Simple Array Read Killed Everything
No debug symbols in Homebrew PHP. No GDB joy. Old-school fix: fprintf everywhere, C’s var_dump.
They pepper dc_build_output()—the cleanup function post-recursion—with prints:
fprintf(stderr, "before cidx\n");
uint32_t cidx = e->cidx; // ← the crash happens here
fprintf(stderr, "after cidx\n");
First message fires. Second? Silence. e->cidx derefs garbage. e itself? 0x11. Seventeen decimal. Not a pointer, genius—a random int.
e hails from ctx->entries[], an array tracking every object visited. More prints:
WRITE entries[33] = 0x10468e4d0 ← correct address at write time
…
READ entries[33] = 0x11 ← corrupted value at read time
Entry 33: pristine on write, trash on read. Classic overwrite. Something clobbered the array between recursion and assembly.
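One way to automate that write-versus-read comparison is a debug-only shadow copy of the array. This is a sketch under assumptions: the real layout of ctx->entries[] is not shown in the article, so `trace_ctx`, `traced_write`, and `traced_read_ok` are made-up names standing in for it.

```c
#include <stdint.h>
#include <stdio.h>

#define MAX_ENTRIES 64

/* Toy stand-in for the extension's context; the real struct is an
 * unknown, so this layout is an assumption. */
typedef struct {
    void *entries[MAX_ENTRIES];  /* the array under suspicion */
    void *shadow[MAX_ENTRIES];   /* debug-only copy of what was written */
} trace_ctx;

/* Record a write in both arrays and log it. Out-of-range indexes are
 * reported and ignored. */
static void traced_write(trace_ctx *ctx, uint32_t idx, void *ptr)
{
    if (idx >= MAX_ENTRIES) {
        fprintf(stderr, "WRITE out of range: %u\n", (unsigned)idx);
        return;
    }
    ctx->entries[idx] = ptr;
    ctx->shadow[idx] = ptr;
    fprintf(stderr, "WRITE entries[%u] = %p\n", (unsigned)idx, ptr);
}

/* Log a read. Returns 1 if the slot still holds what was written,
 * 0 if it was overwritten in between or the index is out of range. */
static int traced_read_ok(const trace_ctx *ctx, uint32_t idx)
{
    if (idx >= MAX_ENTRIES) {
        fprintf(stderr, "READ out of range: %u\n", (unsigned)idx);
        return 0;
    }
    void *now = ctx->entries[idx];
    fprintf(stderr, "READ  entries[%u] = %p\n", (unsigned)idx, now);
    if (now != ctx->shadow[idx]) {
        fprintf(stderr, "CORRUPT entries[%u]: expected %p\n",
                (unsigned)idx, ctx->shadow[idx]);
        return 0;
    }
    return 1;
}
```

The shadow array only helps if the stray write misses it too, so treat a clean result as a hint rather than proof.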
Ghost pointer.
But wait: why 0x11 specifically? Dig deeper (pun intended). Turns out, during recursion, a property scan mistakes a PHP integer for a pointer. In PHP’s Zend engine, every property is a zval: a union of possible values plus a tag saying which one is live. Pointer to the next node? Fine. But a scalar slips in, gets treated as a pointer without a look at the tag, and boom: 17 (0x11) lands on top of entry 33’s pointer.
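To see how an integer can masquerade as a pointer, here is a toy tagged value. It is deliberately NOT the real zval layout from the Zend headers; every name in it (`toy_val`, `toy_obj_checked`, and so on) is invented for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified model of a tagged value, not the real zval. */
typedef enum { TOY_NULL, TOY_LONG, TOY_OBJECT } toy_type;

typedef struct {
    union {
        intptr_t lval;  /* integer payload */
        void    *obj;   /* pointer payload, sharing the same bytes */
    } value;
    toy_type type;      /* says which union member is valid */
} toy_val;

/* Unsafe: ignores the tag and treats the payload as a pointer. */
static void *toy_obj_unchecked(const toy_val *v)
{
    return v->value.obj;
}

/* Safe: returns the pointer only when the tag says "object". */
static void *toy_obj_checked(const toy_val *v)
{
    return v->type == TOY_OBJECT ? v->value.obj : NULL;
}

/* Builds an integer-valued toy_val, e.g. a property holding 17. */
static toy_val toy_from_long(intptr_t n)
{
    toy_val v;
    v.value.lval = n;
    v.type = TOY_LONG;
    return v;
}
```

On mainstream 64-bit platforms, `toy_obj_unchecked()` on a value built with `toy_from_long(17)` hands back the address 0x11, while the checked accessor returns NULL. Whether the extension's bug followed exactly this path is not confirmed by the article; the sketch only shows the general mechanism.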
They type-check harder: ensure next is object or null, not int. Fixed. Tests pass. List serializes clean.
How Do You Debug PHP C Extensions Without Losing Your Mind?
Homebrew PHP? Stripped. Valgrind? Sloooow on macOS. AddressSanitizer? Recompile PHP—hours.
Printf wins. Stamp every array access, every deref. Run the crashing input. Last print before death? Ground zero.
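Hand-typed fprintf calls get tedious fast. A small wrapper that stamps file and line on every message makes "last print before death" easier to locate. This is a sketch; `DC_TRACE` and `dc_trace_at` are hypothetical names, not something the extension ships.

```c
#include <stdarg.h>
#include <stdio.h>

/* Writes "[file:line] message\n" to stderr and flushes, so the text is
 * out before any crash that follows. Returns the character count of
 * the formatted message itself (negative on output error). */
static int dc_trace_at(const char *file, int line, const char *fmt, ...)
{
    va_list ap;
    int written;

    fprintf(stderr, "[%s:%d] ", file, line);
    va_start(ap, fmt);
    written = vfprintf(stderr, fmt, ap);
    va_end(ap);
    fputc('\n', stderr);
    fflush(stderr);
    return written;
}

/* Hypothetical convenience macro: fills in the call site automatically. */
#define DC_TRACE(...) dc_trace_at(__FILE__, __LINE__, __VA_ARGS__)
```

Usage looks like `DC_TRACE("before cidx, e=%p", (void *)e);` on the line before a suspicious dereference. stderr is normally unbuffered already, so the explicit flush is a belt-and-braces choice.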
I’ve printf’d my way through worse—remember PHP 4’s Zend bugs? Buffer overflows everywhere, remote exploits galore. Back then, no AI sidekicks. Now Claude Code parses Jérôme’s panic, adds the traces, spots the pattern. It’s like pair-programming with a memory whisperer.
Here’s my unique hot take, unspun by PR: This isn’t just a bug hunt; it’s a flashback to why PHP extensions scare off devs. C’s freedom is poison without guardrails. AI changes that—tools like Claude Code (or whatever Anthropic spins next) could auto-instrument low-level code, flag ghost pointers before they haunt. Predict this: In two years, we’ll laugh at manual debugs. But until then? Stack overflows will keep claiming noobs.
Cynical? Sure. Who profits? Open source purists get free fixes; Anthropic demos their agent’s chops. Win-win, if you’re not the one segfaulting at 2 AM.
Test your extensions. Always.
The extension’s job: deep-clone object graphs into arrays. PHP’s built-in clone? Shallow. serialize()? Bloated strings. This zips through properties, recurses smartly, spits out JSON-ready arrays. Nifty for APIs, configs. But untested on real graphs? Recipe for ghosts.
They upped recursion limits, ruled out stack. Traced the array. Found the bad cast. Patch:
if (Z_TYPE_P(next_zval) != IS_OBJECT && Z_TYPE_P(next_zval) != IS_NULL) {
// skip or error
}
Clean. But imagine 100-node graphs. Or cycles. That’s next week’s crash.
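Long chains and cycles can both be defused by walking with a loop and remembering what has been visited. The sketch below uses a toy node type and a caller-supplied visited buffer; it is not the extension's traversal, and all names in it are assumptions.

```c
#include <stddef.h>

/* Toy list node; the extension's real object layout is not shown. */
typedef struct toy_node {
    int value;
    struct toy_node *next;
} toy_node;

/* Returns 1 if `n` already appears in seen[0..count). */
static int already_seen(toy_node *const *seen, size_t count, const toy_node *n)
{
    for (size_t i = 0; i < count; i++) {
        if (seen[i] == n) return 1;
    }
    return 0;
}

/* Walks the list with a loop instead of recursion, so depth costs no
 * extra stack. Stops at NULL, at a repeated node (a cycle), or when
 * the `seen` buffer of capacity `cap` is full.
 * Returns the number of distinct nodes visited and sets *cyclic. */
static size_t walk_list(toy_node *head, toy_node **seen, size_t cap, int *cyclic)
{
    size_t count = 0;
    *cyclic = 0;
    for (toy_node *n = head; n != NULL && count < cap; n = n->next) {
        if (already_seen(seen, count, n)) {
            *cyclic = 1;
            break;
        }
        seen[count++] = n;
    }
    return count;
}
```

The linear scan in `already_seen()` makes the walk quadratic, which is fine for a few hundred nodes; a hash set keyed on the pointer would be the usual upgrade for larger graphs.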
Is PHP C Extension Development Still Worth the Pain?
Twenty years in, I say maybe—if you’re masochistic. Speed trumps all for hot paths. But costs? Crashes, exploits (hello, historical CVEs). Rise of Rust crates for PHP? Looming. Or WASM. C’s throne wobbles.
Yet deepclone proves value: Serialize complex structures fast, no userland loops. Skeptical me asks: Who’s cashing in? Hobbyists? Sure. Production? Tread light.
We’ve instrumented writes and reads, but what about frees? Double-free? Use-after-free? That’s ghost-pointer hell proper. This bug was tame: an overwrite from a bad cast. Real phantoms linger post-free, pointing at reclaimed heap. Valgrind eats those for breakfast. Run it.
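A cheap first defense against the free-related ghosts is to null a pointer the moment it is freed. The macro below is a generic C sketch with a made-up name; extension code would normally go through Zend's efree() rather than plain free(), and this trick does nothing for other copies of the same pointer.

```c
#include <stdlib.h>

/* Hypothetical helper: frees the pointer and resets it to NULL.
 * A second call is then a harmless no-op, because free(NULL) does
 * nothing, and a stale use hits NULL instead of reclaimed heap. */
#define FREE_AND_NULL(p) \
    do {                 \
        free(p);         \
        (p) = NULL;      \
    } while (0)
```

Aliases are the gap: if two variables hold the same address, nulling one leaves the other dangling, which is where Valgrind or AddressSanitizer still earn their keep.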
And yeah, AI helped here. Claude didn’t just suggest prints; it simulated paths, predicted the 0x11 origin (PHP int tag?). Human + AI > human alone. Buzzword-free truth.
Frequently Asked Questions
What causes segfaults in PHP C extensions?
Usually memory corruption: bad pointers, overflows, or type mismatches that write over structs. The crash often surfaces far from the bug.
How do you debug a segfault without GDB on macOS PHP?
fprintf traces everywhere. Print addresses before and after each operation. Find the mismatch.
Is php-ext-deepclone safe for production?
The bug described here is fixed, and acyclic graphs now serialize cleanly. Before trusting it in production, test at your own depths and watch for cycles.