A multi-browser (Chromium, Firefox, WebKit) testing tool with zero persisted selectors

Every cross-browser guide on the SERP tells you the same thing: add a `projects` array in playwright.config.ts with chromium, firefox, and webkit. That step is trivial. What they never discuss is what happens to the locator strings those three runs share. Assrt is a multi-browser tool where the plan is one Markdown file, and every target element is re-resolved from the live accessibility tree per engine per step. Three engines. One file. No selectors to maintain three times.

Matthew Diakonov
12 min read

  • Rated 4.8, from real scripts you run, not a vendor demo
  • Runs on Chromium, Firefox, WebKit, Edge
  • Plan is a plain Markdown file on your disk
  • Open source: $0 + Anthropic tokens
  • Zero locator strings in the test file

The SERP answer every article gives, and why it is half the story

Search for this topic and you get Playwright's own docs plus a string of listicles. They all converge on the same three-line snippet: create a `projects` array, add `devices['Desktop Chrome']`, `devices['Desktop Firefox']`, `devices['Desktop Safari']`, run `playwright test`. That snippet is correct. It solves the easy half of the problem, which is how to tell a test runner to launch three browser engines.

The part the listicles skip is what the test file looks like after you set this up. Every example boils down to `await page.getByRole('button', { name: 'Sign up' }).click()`. That locator is shared across all three engine runs. When it resolves to a different DOM node on WebKit (different shadow-DOM traversal, a different implicit ARIA role on a given React wrapper), your test grows a per-engine branch. When the app ships a data-testid rename, the locator breaks in three places. Cross-browser Playwright, in the shape the SERP describes, multiplies selector maintenance by three.

Assrt's bet is that the multi-browser problem is not a runner-config problem. It is a selector-contract problem. Get rid of the persisted selector and three engines stop multiplying anything.

One Markdown plan routes to three Playwright engines

scenario.md (written by assrt_plan, checked into your Git repo) → assrt run → Chromium / Firefox / WebKit

The anchor fact: the runner has no way to accept a selector

Open /Users/matthewdi/assrt-mcp/src/core/agent.ts at line 16. That is where the TOOLS array lives: the entire vocabulary available to the LLM agent while it executes a test. There are 18 tools. Three of them interact with elements on the page: click, type_text, select_option. Every one of them has the same input shape.

The click tool (lines 32 to 42) accepts two fields: `element`, a human-readable prose description like "Submit button", and `ref`, an accessibility-tree node ID like "e5". Same for type_text (lines 44 to 55) and select_option (lines 57 to 67). There is no field named `selector`, `xpath`, `testid`, or `locator` in any of the 18 tool definitions. You cannot pass a CSS selector to the agent, because the parameter does not exist in the JSON Schema the agent reads.

This is why the same plan works on three engines. The `ref` field is not a persistent selector; it is an ID that only lives for the duration of one snapshot on one engine. When the engine changes, the snapshot is re-taken, the refs are new, the agent receives the new tree and picks the node labelled "Sign up button" again. The contract the test file commits to is the human label, and that label is an accessibility concept that every modern engine exposes the same way.

assrt-mcp/src/core/agent.ts (excerpt, lines 16-67)
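The excerpt itself is not reproduced on this page. As a hedged sketch, one entry in that TOOLS array could look like the following, using the field names the article describes (`element`, `ref`) and the Anthropic tool-use JSON Schema shape; the exact layout of the real agent.ts is an assumption:

```typescript
// Illustrative sketch only -- not the actual assrt-mcp source.
type ToolDef = {
  name: string;
  description: string;
  input_schema: {
    type: "object";
    properties: Record<string, { type: string; description: string }>;
    required: string[];
  };
};

const clickTool: ToolDef = {
  name: "click",
  description: "Click an element on the page",
  input_schema: {
    type: "object",
    properties: {
      element: {
        type: "string",
        description: 'Human-readable description, e.g. "Submit button"',
      },
      ref: {
        type: "string",
        description: 'Accessibility-tree node ID from the latest snapshot, e.g. "e5"',
      },
      // Note what is absent: no `selector`, `xpath`, `testid`, or `locator` field.
    },
    required: ["element", "ref"],
  },
};
```

The point the article is making is structural: because the schema has no selector-shaped parameter, there is no way for a persisted selector to enter the run.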

The per-engine flake pattern, lived from both sides

Below are two sides of the same bug (a product rename breaks a locator) under two different testing contracts. The difference is all in what the test file actually contains.

Same product bug, two multi-browser testing contracts

A week ago you wrote `page.getByTestId('email-input').fill(email)`. It passes on Chromium. Today someone renames the data-testid to `signup-email`. The CI run shows chromium FAIL, firefox FAIL, webkit FAIL. You fix the locator. It passes on Chromium and Firefox. It fails on WebKit because WebKit resolves `getByRole('button', { name: 'Sign up' })` to the outer wrapper on this particular React component, not the inner button. You add an `if (browserName === 'webkit')` branch. You now have per-engine test code. The next rename repeats the whole dance.

  • One rename breaks 3 engine runs, not 1
  • `if (browserName === 'webkit')` branches creep in
  • Selector drift is the dominant maintenance cost, not real bugs

What the multi-browser test file actually looks like

The traditional shape, then Assrt's shape. Same app under test, same scenarios, very different contract about what the file commits to.

The locator-based multi-browser test file:

```typescript
// playwright.config.ts
// The canonical "multi-browser" setup every SERP guide shows.
// Three engines, one test file, three times the selector maintenance.

import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  projects: [
    { name: 'chromium', use: { ...devices['Desktop Chrome'] } },
    { name: 'firefox',  use: { ...devices['Desktop Firefox'] } },
    { name: 'webkit',   use: { ...devices['Desktop Safari'] } },
  ],
});
```

```typescript
// signup.spec.ts
// One locator. Three engines. When the app adds a
// data-testid prefix, you fix it here. When WebKit resolves
// a different button because of shadow-DOM traversal, you
// add a per-project override. The selector IS the contract.

import { test, expect } from '@playwright/test';

test('user signs up', async ({ page }) => {
  await page.goto('/signup');
  await page.getByTestId('email-input').fill('a@b.com');
  await page.getByRole('button', { name: 'Sign up' }).click();
  await expect(page.getByRole('heading', { name: 'Dashboard' }))
    .toBeVisible();
});
```
The Assrt equivalent is roughly 22% fewer lines, with zero selectors to maintain.
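For contrast, the artifact on the Assrt side is a plan, not a spec. The exact generated file is not reproduced in this article, so treat this as an illustrative sketch assembled from the step wording quoted in the FAQ below:

```markdown
#Case 1: A new user signs up
Open the signup page
Type "a@b.com" into the email field
Click the Sign up button
Assert: the heading on the page says "Dashboard"
```

No locator imports, no config, no `projects` array; each sentence is re-resolved against the live accessibility tree at run time.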

The plan file is read three times, never edited. The engine changes, the accessibility tree changes, the refs change, the outcome stays the same.

assrt-mcp/src/core/browser.ts line 296, the Playwright MCP spawn point

One plan, three engine runs, diffable outcomes

A bash loop is the simplest way to sweep all three engines. Each run spawns its own @playwright/mcp subprocess, reads the same scenario.md, and writes a JSON result to /tmp/assrt/results/. Notice that the click and type refs differ per engine (e5, e7, e4) and the test still passes. That is the whole point.

assrt run across Chromium / Firefox / WebKit
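A sketch of that loop. It assumes the `--browser` flag is forwarded through to Playwright MCP (the FAQ below notes this is a one-line patch to cli.ts) and that each run writes /tmp/assrt/results/latest.json as described above:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Sweep the same Markdown plan across three Playwright engines.
# Assumption: `assrt run` forwards --browser to @playwright/mcp.
for engine in chromium firefox webkit; do
  assrt run --url http://localhost:3000 --plan-file scenario.md --browser "$engine"
  # Keep a per-engine copy of the result for later diffing.
  cp /tmp/assrt/results/latest.json "/tmp/assrt/results/${engine}.json"
done
```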

Four numbers to keep in mind

  • 3 engines covered from one plan (chromium, firefox, webkit)
  • 18 agent tools in agent.ts lines 16-196
  • 0 tools that accept a CSS selector or XPath
  • $0/mo, versus a typical closed cross-browser AI platform seat at ~$7.5K/mo

The 18 and 0 numbers are readable by eye from the 18 tool objects in the TOOLS array. Grep the same file for `selector`, `xpath`, `testid`, or `locator` in any of the input_schema blocks: 0 matches. That is the uncopyable part of the page.
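A quick way to run that check yourself, assuming a local checkout of the assrt-mcp repo (a plain grep may also match comments; the claim above is specifically about input_schema fields):

```shell
# From the root of an assrt-mcp checkout.
# Lists any selector-style field names in the agent's tool file;
# the article's claim is that none appear inside an input_schema block.
grep -nE '"(selector|xpath|testid|locator)"' src/core/agent.ts || echo "0 matches"
```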

What the one plan hands you, per engine

  • Chromium via @playwright/mcp
  • Firefox via @playwright/mcp
  • WebKit via @playwright/mcp
  • Edge via @playwright/mcp
  • Accessibility-tree snapshots
  • Fresh refs per step per engine
  • No selectors persisted
  • Video recording on all engines
  • Same scenario.md file
  • Same JSON result shape
  • Zero per-engine branches
  • Results diffable in bash

How to run the same plan on all three engines

From one Markdown file to a full cross-engine sweep

Step 1: Write one plan in plain Markdown

Create scenario.md with #Case blocks. Each step is a sentence: 'Click the Sign up button', 'Assert the heading says Dashboard'. No selector strings, no locator imports, no playwright.config projects array. The file layout is defined at /Users/matthewdi/assrt-mcp/src/core/scenario-files.ts:17.

Step 2: Run it on Chromium first to sanity-check

`assrt run --url http://localhost:3000 --plan-file scenario.md` spawns a local Chromium via @playwright/mcp, resolves every element from the live accessibility tree, and writes results to /tmp/assrt/results/latest.json. The persistent profile at ~/.assrt/browser-profile keeps cookies between runs if you want logged-in flows.

Step 3: Forward the --browser flag for Firefox

Playwright MCP accepts `--browser firefox` (confirmed at node_modules/@playwright/mcp/cli.js --help). The same plan file is read, the same accessibility tree protocol is used, only the rendering engine changes. Run time is usually 15 to 25 percent longer on Firefox for the same plan; that is a Firefox cold-start cost, not a plan cost.

Step 4: Forward --browser webkit for Safari coverage

WebKit is how you catch real Safari regressions without a Mac VM farm. The same plan runs again. The refs you see in the log will differ from Chromium or Firefox because WebKit serializes its accessibility tree differently, but the outcome is identical if the app renders the same observable behavior.

Step 5: Loop the three engines in a bash script

Wrap the three runs in a shell loop. Each run writes its own JSON result file; diff them to find per-engine divergence. The plan file is read three times, never edited. If a case fails only on WebKit, that is a genuine Safari regression, not a 'flaky locator on WebKit'.
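The diff step from that loop, sketched with jq. The per-engine file names and the sorted-JSON comparison are assumptions about how you saved the results; the result schema itself is whatever assrt wrote:

```shell
#!/usr/bin/env bash
# Compare Firefox and WebKit results against the Chromium baseline.
# Assumes each engine's JSON was copied to /tmp/assrt/results/<engine>.json.
for engine in firefox webkit; do
  if diff <(jq -S . /tmp/assrt/results/chromium.json) \
          <(jq -S . "/tmp/assrt/results/${engine}.json") > /dev/null; then
    echo "${engine}: matches chromium"
  else
    echo "${engine}: diverges from chromium"
  fi
done
```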

Multi-browser testing, two contracts compared

Closed cross-browser platforms and locator-based open-source tests pay the engine multiplier in different ways. Assrt routes around it.

| Feature | Locator-based / closed cloud | Assrt |
| --- | --- | --- |
| Where the test artifact lives | Vendor cloud or .spec.ts file with locator imports | /tmp/assrt/scenario.md — plain Markdown, checkable into Git |
| Selector strategy across engines | One locator string, shared across projects; drifts per engine | No selectors. Re-resolved from accessibility tree per engine per step |
| What changes when you add a new engine | New entry in `projects` + per-project selector overrides | Pass `--browser <engine>` to Assrt. Plan file unchanged. |
| Per-engine flakiness pattern | A locator resolves wrong on WebKit, right on Chromium | No locator to resolve wrong. Flakes only when the engine actually differs |
| Cost at cross-browser scale | ~$7.5K/month per seat for closed AI cross-browser platforms | $0 + Anthropic Haiku tokens. Open source, self-hosted |
| Vendor lock-in | Test cases in proprietary cloud; migration rewrites everything | Plan is a Markdown file. Zero vendor runtime dependency |
| What the runner uses under the hood | Proprietary runner or wrapped Playwright | Real @playwright/mcp spawning real Chromium/Firefox/WebKit |
| Data boundary | Your DOM and screenshots leave your network to the vendor cloud | Set ANTHROPIC_BASE_URL to a local proxy; nothing leaves your machine |

Numbers are 2026 price bands. Everything else is verifiable in the assrt-mcp repo.

See the same plan run on Chromium, Firefox, and WebKit, live

Book a 20-minute walkthrough. We will run your app through one scenario.md on all three engines and diff the results together.

Book a call

Multi-browser support FAQ

Does Assrt actually support Chromium, Firefox, and WebKit, or just Chromium with a compatibility story?

All three plus Edge. Assrt spawns Playwright MCP as a subprocess, and Playwright MCP accepts `--browser` with four values: chrome, firefox, webkit, msedge. You can confirm this yourself by running `node node_modules/@playwright/mcp/cli.js --help` from inside `/Users/matthewdi/assrt-mcp/`. The Assrt browser manager at /Users/matthewdi/assrt-mcp/src/core/browser.ts line 296 builds the CLI args array that is passed to Playwright MCP; passing `--browser webkit` through that array is a one-line change. Playwright itself ships all three engine binaries, so switching engines does not require a new installation step.

How is this different from setting `projects` with `devices['Desktop Firefox']` in playwright.config.ts?

The Playwright config approach runs your test file three times: once per engine. That works, until a locator like `page.locator('[data-testid="submit"]')` silently resolves to a different DOM node on WebKit than on Chromium (different shadow-DOM traversal, different accessibility role defaults). You then have one passing run and one failing run, and the 'fix' is a per-engine branch in the test. Assrt never writes a locator. The plan is sentence-level intent, and the runner re-resolves the target element per engine per step from the accessibility tree that engine returned. Three engines, one file, one outcome.

What does 'zero persisted selectors' mean in practice for a multi-browser tool?

Open /tmp/assrt/scenario.md after you generate a plan. You will see `#Case 1: A new user signs up` followed by steps like `Click the Sign up button` and `Assert: the heading on the page says "Dashboard"`. There is no `[data-testid=...]`, no CSS path, no XPath, no `getByRole('button', { name: /submit/i })`. When the runner executes a step, it calls `browser_snapshot` on the current engine, receives a fresh accessibility tree with brand-new refs like `e5`, asks Claude Haiku which node matches 'the Sign up button', and clicks that node by its ref. The ref only lives for the duration of that one step on that one engine. Nothing is persisted across engines or across runs.

Where in the Assrt source can I see that the agent never takes a CSS selector?

Look at /Users/matthewdi/assrt-mcp/src/core/agent.ts, starting at line 16 (the TOOLS array). The `click` tool (lines 32-42) takes two parameters: `element` (human-readable description like 'Submit button') and `ref` (accessibility-tree ID like 'e5'). `type_text` (lines 44-55) and `select_option` (lines 57-67) have the same shape. There are 18 tools total defined in that array, and not one of them accepts a field named `selector`, `xpath`, `testid`, or `locator`. The runner has no way to take a persisted selector, because the data type does not exist in the tool signatures. A reader who forks this file can verify the shape in under 30 seconds.

If there are no selectors, how is the test not flaky?

Most flakiness in cross-browser Playwright comes from two places: timing gaps (the page is still loading when the test asks for an element) and per-engine locator drift (the locator string resolves to one node on Chromium and a different one on WebKit). Assrt removes the second source entirely, because there is no locator to drift. The first source is handled by `wait_for_stable`, a tool the agent can call that waits until 2 consecutive seconds pass with zero DOM mutations (backed by a MutationObserver in the assrt repo at agent.ts:872-925). Fast pages return fast; streaming pages wait as long as they really need. Between those two, per-engine flake rates converge: you get the same outcome on Chromium, Firefox, and WebKit or you get a genuine failure that reproduces everywhere.
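The quiet-window idea behind `wait_for_stable` is easy to sketch. The following is an illustrative reimplementation in browser TypeScript, not the code from agent.ts; the 2-second quiet window matches the article, and everything else (function name, timeout, observed mutation types) is an assumption:

```typescript
// Sketch: resolve once the DOM has been quiet for `quietMs` milliseconds,
// or give up after `timeoutMs`. Runs in a page context (needs DOM APIs).
function waitForStable(quietMs = 2000, timeoutMs = 30000): Promise<boolean> {
  return new Promise((resolve) => {
    let quietTimer = setTimeout(() => done(true), quietMs);
    const deadline = setTimeout(() => done(false), timeoutMs);

    const observer = new MutationObserver(() => {
      // Any DOM change restarts the quiet window.
      clearTimeout(quietTimer);
      quietTimer = setTimeout(() => done(true), quietMs);
    });
    observer.observe(document.documentElement, {
      childList: true,
      subtree: true,
      attributes: true,
      characterData: true,
    });

    function done(stable: boolean) {
      observer.disconnect();
      clearTimeout(quietTimer);
      clearTimeout(deadline);
      resolve(stable);
    }
  });
}
```

Fast pages resolve after one quiet window; pages that keep streaming DOM updates keep pushing the window back until the deadline, which is the behavior the article describes.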

What does it cost to run the same plan on three engines with Assrt vs. a closed cross-browser platform?

Closed cross-browser platforms (BrowserStack Automate, Sauce Labs, LambdaTest) price by parallel session and by engine. Enterprise cross-browser AI platforms like Mabl and Testim are quoted around $7.5K/month per seat once you add their cross-browser add-ons. Assrt is open source; the only variable cost is Anthropic tokens for the Claude Haiku calls that interpret steps. A 5-case plan with 20 total steps and three engines is three agent runs totaling roughly 60-90 Haiku tool-call turns. At Haiku's 2026 rates, that is cents, not dollars, per full cross-browser sweep. The Chromium, Firefox, and WebKit binaries ship with Playwright at zero cost.

Can the same plan actually produce the same outcome on WebKit as on Chromium, or are there real differences?

The plan outcome is the same when it describes user-visible behavior. 'Click the Sign up button' finds the accessibility node labelled Sign up on any engine; 'Assert the heading says Dashboard' reads the h1 text content from any engine. Where it does not guarantee sameness is when the engines genuinely render different things: a CSS-grid feature Firefox does not support, a Safari-specific iOS viewport quirk, or a date input that becomes a native WebKit control. In those cases, the test surfaces the real divergence, which is exactly what a cross-browser test is supposed to do. What Assrt removes is the synthetic divergence that comes from the test file guessing wrong about a selector on one engine.

Is there a way to run the plan on all three engines from one command?

Right now you pick the engine via the Playwright MCP flag that Assrt forwards. The CLI spawn point is at /Users/matthewdi/assrt-mcp/src/core/browser.ts line 296 (the `args` array). For a full three-engine sweep, you call `assrt run ...` three times in a loop with `--browser chromium`, `--browser firefox`, `--browser webkit` (a one-line patch to cli.ts forwards the flag). Because the plan is a plain file at /tmp/assrt/scenario.md, the three runs read the same source. Results land in /tmp/assrt/results/<runId>.json; a shell loop that diffs the three result files is enough to surface which engines passed and which did not.

Does Assrt lock me into anything Playwright-specific?

The artifact you keep is /tmp/assrt/scenario.md, a plain Markdown file with #Case blocks. If Playwright is replaced tomorrow by something else, the plan still reads like English and can be re-executed by any runner that can interpret it. Compare that to a `.spec.ts` file full of `await page.getByRole('button', { name: 'Submit' }).click()`. That file is bound to Playwright's locator API by design. The layout of /tmp/assrt is defined at /Users/matthewdi/assrt-mcp/src/core/scenario-files.ts line 16 (ASSRT_DIR), lines 17-20 (SCENARIO_FILE, SCENARIO_META, RESULTS_DIR, LATEST_RESULTS). Everything is a flat file you can check into Git next to your app.

How do I pick an engine on first run?

Default is Chromium via persistent profile at ~/.assrt/browser-profile (see browser.ts:313). For WebKit or Firefox, pass `--browser webkit` or `--browser firefox` through to Playwright MCP. For extension mode (connecting to your already-running Chrome with all your cookies and logins intact), pass `--extension`; a token flow prints a command you run once to get the token, then Assrt saves it to ~/.assrt/extension-token for future runs. The token resolution chain lives at browser.ts:228-254. Engine choice is orthogonal to the plan; the same scenario.md runs either way.
