E2E Tests Signals vs Guarantees Mismatch

Your e2e tests pass, but users rage-quit. That's the signal-guarantee trap. One dev's Playwright rethink could fix it.

E2E Tests: Why Green Lights Don't Mean Your App Works — The AI Catchup

Key Takeaways

  • E2E tests often check UI signals (visibility, text) instead of true app guarantees, leading to brittle suites.
  • Distinguish signals, state, and promises for quieter, resilient tests using Playwright helpers.
  • The shift mirrors unit testing's evolution; promise-focused tools will dominate by 2026.

E2E tests lie.

A month back, one dev open-sourced a Playwright helper. Downloaded 300 times—all by him. Ouch. But the r/Playwright crowd called it: too bloated. He stripped it down, boiled it to essence. Now? A sharp matcher that flips e2e tests from brittle UI spies to resilient behavior guardians. Here’s the core gripe, straight from his post:

tldr: Most e2e tests encode the current UI representation of behavior, not behavior itself. They check signals (visibility, text content, enabled states) instead of the facts the test is actually promising to protect.

That’s the mismatch. Signals are not guarantees. Your tests scan the rendered DOM (visible? text matches? button enabled?) but miss the app’s true state. Login works? They check that the welcome banner pops up, not that you’re actually logged in.

What Even Are ‘Signals’ in E2E Land?

Look, e2e tests—end-to-end, Playwright-style—crawl your app like a user on caffeine. Click here, type there, assert outcomes. But those assertions? They’re hooked to today’s UI dress-up. Button says ‘Submit’? Cool, test passes. Dev team swaps it to ‘Send’? Boom, false failure. That’s a signal: superficial, fickle.

State’s deeper. User’s authenticated? Cart totals correct? Data persisted? Those are facts, promises your app vows to keep. The dev—call him Abel—spots three layers: signals (UI blips), state (app guts), promises (what tests swear by). His helper lets you assert promises directly. No more chasing DOM ghosts.
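To make the layers concrete, here’s a minimal sketch in TypeScript. It is not Abel’s code: the AppSnapshot shape, its field names, and both check functions are assumptions invented for illustration. It models one moment after a login attempt and shows how a signal and the underlying fact can disagree in both directions.

```typescript
// Hypothetical snapshot of what a test could observe after a login attempt.
// Every field name here is illustrative, not taken from Abel's helper.
interface AppSnapshot {
  bannerText: string | null;    // signal: what the UI currently renders
  sessionToken: string | null;  // state: what the app actually holds
  sessionUserId: number | null; // state: who the session belongs to
}

// Signal-level check: passes whenever the banner looks right.
function welcomeBannerVisible(snap: AppSnapshot): boolean {
  return snap.bannerText !== null && snap.bannerText.includes("Welcome");
}

// Promise-level check: the fact the test is meant to protect.
function userIsLoggedIn(snap: AppSnapshot, userId: number): boolean {
  return snap.sessionToken !== null && snap.sessionUserId === userId;
}

// A buggy build: the banner renders from stale data, but no session exists.
const staleBanner: AppSnapshot = {
  bannerText: "Welcome back, Ada",
  sessionToken: null,
  sessionUserId: null,
};

// A healthy build after a redesign that removed the banner entirely.
const redesigned: AppSnapshot = {
  bannerText: null,
  sessionToken: "abc123",
  sessionUserId: 123,
};

console.log(welcomeBannerVisible(staleBanner), userIsLoggedIn(staleBanner, 123)); // true false
console.log(welcomeBannerVisible(redesigned), userIsLoggedIn(redesigned, 123));   // false true
```

First line: green signal, broken login. Second line: red signal, working login. A suite that only asserts the banner gets both cases wrong.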

And here’s my twist, the one Abel doesn’t hit: this echoes unit testing’s messy youth. Back in the 2000s, JUnit zealots mocked every dependency, testing isolated fakes. Result? Green tests, broken deploys. Then behavior-driven dev (BDD) rose—Cucumber, anyone?—pushing for intent over implementation. E2E’s having its BDD moment now. Signals? Mockery by another name. Time to evolve.

But why’s this baked in? Playwright’s power (speed, reliability) lures devs to snapshot the page. Easy asserts: await expect(page.locator('h1')).toHaveText('Welcome'). Feels solid. Until a redesign hits. Suddenly, the h1’s a div. The test breaks, even though nothing a user cares about changed. Teams add getByRole, getByTestId: band-aids. Abel’s matcher skips the hunt: define the promise (e.g., ‘user sees dashboard after login’), let it probe multiple signals. It fails only if the promise breaks.

Resilient.

Why Do E2E Tests Obsess Over UI Noise?

Blame history. E2E tests rose as black-box saviors after the frontend boom: React, Vue, infinite re-renders. Unit tests couldn’t touch integration hell. So e2e became king: Puppeteer, Cypress, Playwright. But they inherited web fragility. Pages load async, CSS flips, A/B tests meddle. Assert a className? Tomorrow’s Tailwind purge nukes it.

Devs know this—flaky suites are hell. Retries pile up, CI burns cash. Abel’s been there: his first lib was overkill because he chased every edge. Now? Minimal. Gist in comments shows it: a custom matcher wrapping expect, querying smarter.

Here’s the architecture shift I see brewing, and my bold callout on the hype. Companies like Microsoft (Playwright’s home) tout ‘reliable e2e’ in blogs, but their docs still push locator hunts. PR spin: ‘Trace viewer fixes flakes!’ Nah. Real fix? Decouple from UI. This helper’s no silver bullet (apps vary), but it’s the contract-testing vibe for UIs. My prediction: a coming Playwright release bakes in similar primitives. Or someone forks it.

A quick tangent: remember Selenium’s death spiral? XPath hell, 10-second waits. Playwright killed that. Killing signal-coupled assertions is next.

How Does This Matcher Actually Work?

Curious? Abel drops the gist. It’s a Playwright expect extension: await expect(page).toHavePromise('userIsLoggedIn', {userId: 123}). Under the hood? It maps the promise to signal checks (text, URL, storage) and ORs them. UI swaps one signal? Still passes. All fail? Real bug.
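The gist itself isn’t reproduced here, so treat the following as a rough sketch of the OR idea, not Abel’s implementation. The PageLike interface, the promises registry, and checkPromise are all hypothetical names; PageLike stands in for real Playwright page calls so the example runs without a browser. The return value uses the { pass, message } shape that custom matchers registered through Playwright’s expect.extend are expected to return.

```typescript
// Hypothetical stand-in for the parts of a page a signal might read.
// In a real suite these would be backed by Playwright page calls.
interface PageLike {
  url(): string;
  textContent(selector: string): Promise<string | null>;
  localStorageItem(key: string): Promise<string | null>;
}

type Params = { userId: number };
type SignalCheck = (page: PageLike, params: Params) => Promise<boolean>;

// Registry mapping a promise name to the signals that can evidence it.
const promises: Record<string, SignalCheck[]> = {
  userIsLoggedIn: [
    async (page) => page.url().includes("/dashboard"),
    async (page) => ((await page.textContent("h1")) ?? "").includes("Welcome"),
    async (page, { userId }) =>
      (await page.localStorageItem("userId")) === String(userId),
  ],
};

// OR semantics: the promise holds if at least one signal holds.
// A signal that throws is counted as "did not hold".
async function checkPromise(page: PageLike, name: string, params: Params) {
  const checks = promises[name];
  if (!checks) throw new Error(`Unknown promise: ${name}`);
  const results = await Promise.all(
    checks.map((check) => check(page, params).catch(() => false)),
  );
  const held = results.filter(Boolean).length;
  return {
    pass: held > 0,
    message: () => `promise "${name}": ${held}/${results.length} signals held`,
  };
}

// Redesigned UI: new URL, no h1, but the stored user id is still there.
const afterRedesign: PageLike = {
  url: () => "https://example.test/home",
  textContent: async () => null,
  localStorageItem: async (key) => (key === "userId" ? "123" : null),
};

// Actually logged out: every signal fails.
const loggedOut: PageLike = {
  url: () => "https://example.test/login",
  textContent: async () => "Sign in",
  localStorageItem: async () => null,
};

checkPromise(afterRedesign, "userIsLoggedIn", { userId: 123 })
  .then((r) => console.log(r.pass)); // true
checkPromise(loggedOut, "userIsLoggedIn", { userId: 123 })
  .then((r) => console.log(r.pass)); // false
```

One design caveat worth flagging: a pure OR is only as trustworthy as its weakest signal. If a stale banner alone can satisfy the promise, the false green comes right back, so at least one check per promise should read state rather than presentation.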

Setup’s an npm install and one import. Tests slim down: no 20-line locators. Why resilient? The abstraction layer. Like GraphQL over REST, where the schema shields clients from backend churn. Here, the promise shields tests from frontend churn.

Genius, simple.

But critique time: Abel’s candid and admits the first version was bloated. Good. No corporate gloss. Open Source Beat loves that skepticism.

Why Does This Matter for Developers?

You’re shipping JS apps daily. E2e suite’s 50% of CI time, 80% of gripes. This? Cuts noise. Faster greens, fewer false alarms. Teams focus on the bugs that bite users.

Broader: testing’s architecture war. From pyramid (units base) to ice cream cone (e2e heavy). Shift back? Maybe. Promises scale—define once, assert anywhere.

Imagine CI where e2e runs in parallel and signals are fuzzy-matched via ML (wild? Nah, Vercel experiments). Or WebAssembly apps, where signals blur across runtimes. Abel’s idea ports.

Future-proof.

Teams ignoring this? Risk test debt explosion. We’ve seen Netflix kill e2e for chaos engineering. Others? Cling to signals, suffer.

Can You Ditch Signals Entirely in E2E Tests?

Not yet. Hybrids rule. Use promises for critical paths: auth, payments. Keep signals for legacy flows. But migrate.

Abel’s post invites feedback. r/programming’s buzzing—300 downloads were him, but now? Real traction possible.


Frequently Asked Questions

What are signals vs guarantees in e2e tests?

Signals are UI indicators like text or visibility; guarantees are core app behaviors like ‘user is logged in’ that tests should protect.

How to make Playwright e2e tests more resilient?

Use promise-based matchers like Abel’s helper—assert behaviors, not DOM details, to survive UI changes.

Why do e2e tests fail on UI updates?

They couple to current UI representations (signals), breaking when designs or locators shift, ignoring true app state.

Written by James Kowalski

Investigative tech reporter focused on AI ethics, regulation, and societal impact.

Originally reported by Reddit r/programming
