QA automation tutorial: automating the signup + OTP flow every other guide skips
Search this keyword and every article in the top ten stops at login with a fixture user. That test is nice; it is also a toy. The flow that actually breaks in production, the one your founder team keeps clicking through manually every release, is signup with a verification email. This tutorial automates that flow in two plain-English lines using three tools shipped inside Assrt. Every number, path, and regex below is a line of code you can open.
What every QA automation tutorial leaves out
Open the top six tutorials for this phrase: TestGrid, BDG, GeeksforGeeks, Edureka, TutorialsPoint, Sauce Labs. They all describe the same loop: install a runner, write a login test with a pre-created fixture, run it, call it a day. None of them attempts the one flow that fails first in production and is the hardest to maintain: a new user signs up, receives a verification email, types the code, and lands on the dashboard. They skip it for an honest reason: automating a real verification email is ugly without the right tools. You either mock your SMTP provider (and regret it the first time the template changes), hand-roll a Gmail IMAP fixture (and regret it when Google 2FA expires), or pay a vendor $7.5K a month (and regret it when you cancel).
The rest of this tutorial is the fourth option. A disposable mailbox, a code extractor, three tools the agent can call, and a plan you type in English. Everything below cites a file in /Users/matthewdi/assrt-mcp/src/core/ so you can verify it before you trust it.
One #Case block in scenario.md. Ten lines. No selectors, no waits, no assertions-as-code.
api.internal.temp-mail.io issues a 10-character address per run. No SMTP, no IMAP, no fixture pool to babysit.
Seven patterns at email.ts:101-109. First match wins. Covers the four common keyword framings plus raw 6-digit, 4-digit, 8-digit fallbacks.
/tmp/assrt/<runId>/ holds the video, screenshots, events.json, and scenario.md. Tar it and attach to any bug report.
Step 1. Write the plan
Save this as scenario.md in your current directory. The two lines that matter are step 2 (create_temp_email) and step 7 (wait_for_verification_code). Everything else is the usual navigate, type, click. The agent decides which of the 18 Playwright MCP tools to call for every body line.
Notice the absence of selectors. No getByRole('button', { name: 'Verify' }), no data-testid="otp-input". The planner calls snapshot to get a fresh accessibility tree and picks a ref by the English description. When your designer restyles the button or changes its markup, the test still passes as long as Verify is still the accessible name.
Step 2. See the pipeline
One hub in the middle (the planner) and two destinations worth pointing at: the browser page for the signup form, and api.internal.temp-mail.io for the inbox. Code extraction runs client-side inside the agent. Artifacts land on your disk. Nothing about the pipeline requires a vendor account.
scenario.md → Claude Haiku 4.5 → temp-mail.io + browser
Step 3. Run it and watch the tool calls
One command. Every tool call is logged. The verification code usually arrives by the second poll; the first pattern in the priority list (/(?:code|Code|CODE)[:\s]+(\d{4,8})/) matches most transactional templates on the first try.
Step 4. Read the code extractor for yourself
This is the file a QA automation tutorial should always show you. Every claim about reliability lives or dies by what happens when the email arrives. Assrt's extractor is 9 lines. It prefers keyword-anchored matches (code, verification, OTP, PIN) before falling back to raw digit lengths. The pattern that finally matches is also returned in the run log, so debugging a false match is a grep away.
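A minimal sketch of a first-match-wins extractor like the one described above, not a copy of email.ts. The four keyword patterns are the ones quoted in this tutorial; the exact form of the three raw-digit fallbacks, the `extractCode` name, and the `ExtractedCode` shape are assumptions made for illustration.

```typescript
// Keyword-anchored patterns come first, raw digit lengths last.
const CODE_PATTERNS: RegExp[] = [
  /(?:code|Code|CODE)[:\s]+(\d{4,8})/,
  /(?:verification|Verification)[:\s]+(\d{4,8})/,
  /(?:OTP|otp)[:\s]+(\d{4,8})/,
  /(?:pin|PIN|Pin)[:\s]+(\d{4,8})/,
  /\b(\d{6})\b/, // assumed form of the raw 6-digit fallback
  /\b(\d{4})\b/, // assumed form of the raw 4-digit fallback
  /\b(\d{8})\b/, // assumed form of the raw 8-digit fallback
];

// Hypothetical result shape: the code plus the pattern that produced it,
// so a false match can be traced from a log line.
interface ExtractedCode {
  code: string;
  pattern: string;
}

// Tries each pattern in priority order and returns on the first match.
// Returns null when nothing matches; a caller could then fall back to the
// full body, as the tutorial describes.
function extractCode(plainText: string): ExtractedCode | null {
  for (const pattern of CODE_PATTERNS) {
    const code = plainText.match(pattern)?.[1];
    if (code !== undefined) {
      return { code, pattern: pattern.source };
    }
  }
  return null;
}
```

Because the keyword patterns run before the fallbacks, a body such as "Order 123456 confirmed. OTP: 9911" resolves to 9911 rather than the unrelated six-digit order number.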
The mailbox itself is created by a six-line method. The important detail is the min_name_length: 10 request. A 10-character prefix gives you enough entropy to run dozens of parallel scenarios without collisions, while still being short enough to type into a cramped signup form.
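The request described above could look roughly like this. The endpoint and the `min_name_length` / `max_name_length` body come from this tutorial; the response shape (`email`, `token`), the `createMailbox` name, and the injected `FetchLike` parameter are assumptions, so treat this as a sketch rather than the method in email.ts.

```typescript
const BASE = "https://api.internal.temp-mail.io/api/v3";

// Minimal structural type so the function can take the global fetch
// or a fake in tests. Hypothetical helper type, not part of Assrt.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Assumed response shape; verify against the real API before relying on it.
interface TempMailbox {
  email: string;
  token: string;
}

async function createMailbox(fetchImpl: FetchLike): Promise<TempMailbox> {
  const res = await fetchImpl(`${BASE}/email/new`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    // Pinning min and max to 10 requests a 10-character prefix.
    body: JSON.stringify({ min_name_length: 10, max_name_length: 10 }),
  });
  if (!res.ok) {
    throw new Error(`temp-mail.io returned HTTP ${res.status}`);
  }
  const data = (await res.json()) as Partial<TempMailbox>;
  if (typeof data.email !== "string") {
    throw new Error("Response did not contain an email address");
  }
  return {
    email: data.email,
    token: typeof data.token === "string" ? data.token : "",
  };
}
```

On Node 18 or later, `createMailbox(fetch)` should be enough to try it; injecting the fetch function also keeps the method testable without network access.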
The three tools, and the pipeline behind them
Everything in this section comes from agent.ts and email.ts. The tools are declared between agent.ts:115 and agent.ts:170; the handlers that execute them are at agent.ts:800 through agent.ts:842. No magic, no SDK abstractions.
create_temp_email
Hits api.internal.temp-mail.io/api/v3/email/new with a 10-character prefix request. Result is stored on the agent as this.tempEmail and usable in every following step without repeating the address. One call per scenario.
wait_for_verification_code
Polls /email/{address}/messages every 3000ms. Default timeout 60000ms (60s), ceiling 120000ms (120s). Returns { code, from, subject, body } on first regex match.
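The polling behaviour described for wait_for_verification_code can be sketched as a small loop. The 3000ms interval, 60s default, and 120000ms ceiling come from this tutorial; the `waitForCode` name, the injected `getMessages` and `extract` callbacks, and the `intervalMs` override are illustrative assumptions, not the Assrt implementation.

```typescript
const POLL_INTERVAL_MS = 3000;
const DEFAULT_TIMEOUT_MS = 60_000;
const MAX_TIMEOUT_MS = 120_000;

interface InboxMessage {
  from: string;
  subject: string;
  body: string;
}

interface VerificationResult extends InboxMessage {
  code: string;
}

interface PollOptions {
  timeoutMs?: number;
  intervalMs?: number; // hypothetical override so tests need not wait 3s
}

// Polls the inbox until a message yields a code or the deadline passes.
async function waitForCode(
  getMessages: () => Promise<InboxMessage[]>,
  extract: (body: string) => string | null,
  options: PollOptions = {},
): Promise<VerificationResult | null> {
  // Clamp the requested timeout to the ceiling.
  const timeoutMs = Math.min(options.timeoutMs ?? DEFAULT_TIMEOUT_MS, MAX_TIMEOUT_MS);
  const intervalMs = options.intervalMs ?? POLL_INTERVAL_MS;
  const deadline = Date.now() + timeoutMs;

  while (Date.now() < deadline) {
    for (const message of await getMessages()) {
      const code = extract(message.body);
      if (code !== null) {
        return { code, ...message };
      }
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  return null; // the caller decides how to report the timeout
}
```

This sketch checks the inbox immediately and then sleeps between polls; whether the real method polls first or sleeps first is not something this tutorial states.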
check_email_inbox
Fetches the full current inbox without parsing. Use when the email is not a code, e.g. a welcome message whose subject you want to assert on.
The regex pipeline
Seven patterns in priority order. First four key on the words code / verification / OTP / PIN. Last three are raw digit fallbacks (6, 4, 8). HTML emails are stripped to plain text first.
Why not use a fixture account?
Fixture accounts fail the actual signup flow by definition. They test login. They do not test email delivery, the verification template, the expiry window, or the callback URL. The disposable flow exercises every stage your real users go through.
State lifecycle
The tempEmail object lives for the duration of one scenario. Starting a new #Case block creates a fresh mailbox. Parallel scenarios get parallel inboxes automatically.
Step 5. Open the run directory
The filesystem is the debugger. One directory per run at /tmp/assrt/<runId>/. The video player opens at 5x playback by default; the 1 / 2 / 3 / 5 / 0 keyboard bindings rebind the speed so you can slow down when the verification step looks wrong. No separate reporter UI, no vendor dashboard to log into.
Before and after this tutorial
The before is the test suite most teams ship: a login smoke test with a fixture user. The after is what you walk away with when signup and OTP are also automated.
Login smoke test → Real signup + OTP regression
Most tutorials stop here and tell you to come back for the 'advanced' lesson that never gets written.
- Creates a fixture user ahead of time
- Logs in with the fixture, asserts on /dashboard
- Silently skips signup, email delivery, OTP
- Breaks the first time the verification template changes
- Cannot test rate limits, expiry, or resend
Fixture email vs disposable mailbox, side by side
The fixture approach is what every Playwright tutorial at this keyword teaches. The disposable approach is what this tutorial teaches. The code delta is small; the coverage delta is most of your auth funnel.
Fixture user (Playwright) vs disposable mailbox (#Case)
```typescript
import { test, expect } from '@playwright/test';

// Pre-seeded user created by a fixture script.
// The real signup flow is never exercised.
test('login works', async ({ page }) => {
  await page.goto('/login');
  await page.getByLabel('Email').fill('demo@example.com');
  await page.getByLabel('Password').fill('hunter2');
  await page.getByRole('button', { name: 'Sign in' }).click();
  await expect(page).toHaveURL('/dashboard');
});

// Verification email template can rot for weeks
// before anyone notices.
```

Wire it into your own QA suite in five steps
The whole tutorial, compressed into one checklist you can hand to another engineer. Each step is independently verifiable.
Install assrt-mcp and store your LLM key
npm i -g @assrt/mcp (or npx). The agent reads your Anthropic or Gemini key from macOS Keychain in dev and from ANTHROPIC_API_KEY / GEMINI_API_KEY in CI. Nothing about this tutorial requires a vendor portal or a paid account beyond the LLM token.
Write scenario.md with one #Case block
Copy the ten-line plan above. The header `#Case 1: name` is parsed by the regex /(?:#?\s*(?:Scenario|Test|Case))\s*\d*[:.]\s*/gi at agent.ts line 569. Body lines can be any English a junior tester would follow.
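To see what that header regex does to a plan file, here is a small sketch. The regex is the one quoted above; splitting on it and treating the first line of each chunk as the case name is an assumption about how a parser might use it, and `parseCases` and `ScenarioCase` are hypothetical names.

```typescript
// Header regex as quoted in this tutorial.
const CASE_HEADER = /(?:#?\s*(?:Scenario|Test|Case))\s*\d*[:.]\s*/gi;

interface ScenarioCase {
  name: string;
  steps: string[];
}

// Splits a plan on case headers; the first non-empty line of each chunk
// is taken as the case name and the remaining lines as its steps.
function parseCases(plan: string): ScenarioCase[] {
  return plan
    .split(CASE_HEADER)
    .map((chunk) => chunk.trim())
    .filter((chunk) => chunk.length > 0)
    .map((chunk) => {
      const [name = "", ...steps] = chunk
        .split("\n")
        .map((line) => line.trim())
        .filter((line) => line.length > 0);
      return { name, steps };
    });
}

const plan = [
  "#Case 1: Signup with OTP",
  "1. Navigate to /signup",
  "2. Call create_temp_email",
  "",
  "#Case 2: Resend code",
  "1. Click Resend",
].join("\n");

console.log(parseCases(plan).map((c) => c.name));
```

One thing the pattern implies: it has no word boundary and is case-insensitive, so a body line containing "test:" or even "latest." would also match. Keeping those words away from a trailing colon or period in step text avoids accidental splits under this parsing approach.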
Run it against a local app
npx assrt-mcp --url http://localhost:3000 --plan scenario.md. The agent spawns @playwright/mcp, takes an accessibility snapshot, calls create_temp_email, and starts typing. You will see every tool call in the terminal as it happens.
Open the run directory and watch the video
/tmp/assrt/<runId>/video/player.html opens in any browser. Default playback is 5x so you see a twelve-second run in under three seconds. Keyboard bindings 1, 2, 3, 5, 0 remap the speed. This is the whole debugger, no reporter UI to install.
Move scenario.md into your repo and commit
The plan file is the artifact. Commit it under tests/scenarios/signup.md or similar. Add a CI job that runs `npx assrt-mcp --plan tests/scenarios/signup.md` and uploads /tmp/assrt/ as a GitHub Actions artifact on failure. Your signup flow is now under regression for zero infra cost.
Versus the QA automation tools that charge for this
A commercial QA automation platform that bolts verification onto your test suite typically lands between a few hundred and several thousand dollars a month. The table below is the honest split between what you rent and what you own with the #Case approach.
| Feature | Typical SaaS QA platform | Assrt (#Case) |
|---|---|---|
| Plan format | Proprietary YAML or dashboard recorder | Plaintext #Case blocks in scenario.md |
| Email verification built in | Requires custom webhook or paid add-on | Three tools shipped by default |
| How codes are extracted | Hidden inside a vendor service | 7 regex patterns in email.ts:101-109 |
| Mailbox provider | Managed vendor mailbox | temp-mail.io (swappable by forking email.ts) |
| Debug surface | Web dashboard behind a login | Local /tmp/assrt/<runId>/ tarball |
| Monthly cost at team scale | ~$7.5K/mo typical (Testim, mabl, QA Wolf) | $0 + LLM tokens |
| Cancel and keep the tests | No. Tests live in their cloud. | Yes. scenario.md is already in your repo. |
Frequently asked questions
Why do most QA automation tutorials skip the signup and OTP flow?
Because it is genuinely the hardest part of a real test suite to automate, and it is easier to write a tutorial about login with a fixture user. The signup flow usually involves a real mail server, a verification email, a time-sensitive code, a cross-session paste, and a callback URL. Most tutorials hand-wave this with 'use a test fixture account' or 'mock your email provider', and the reader walks away with a toy test that collapses on the first production use case. This tutorial fixes that gap by leaning on three Assrt tools (create_temp_email, wait_for_verification_code, check_email_inbox) that wrap a real disposable mailbox and a deterministic regex code extractor. Every step is checkable: the tools are defined in assrt-mcp/src/core/agent.ts between lines 115 and 130, and the regex patterns live in assrt-mcp/src/core/email.ts between lines 101 and 109.
What exactly is a disposable email, and why do I need one for QA automation?
A disposable email is a temporary inbox at a provider like temp-mail.io that accepts any message addressed to it and exposes those messages over an HTTP API. You do not run a mail server, you do not create a Gmail account, you do not have to clean up state between runs. Assrt calls `POST https://api.internal.temp-mail.io/api/v3/email/new` (email.ts line 44) with `{ min_name_length: 10, max_name_length: 10 }` so every run gets a unique 10-character prefix like aQ9mZ2cP7r@1secmail.net. The agent types that address into the signup form, your app sends the verification email, the mailbox receives it, and the agent reads it back with `GET /email/{address}/messages`. No SMTP stub, no fixture pool to maintain.
How does the verification code get extracted from the email body?
email.ts lines 101-109 run seven regex patterns in priority order against the plain-text body (HTML emails are stripped first). The patterns are, in order: /(?:code|Code|CODE)[:\s]+(\d{4,8})/, /(?:verification|Verification)[:\s]+(\d{4,8})/, /(?:OTP|otp)[:\s]+(\d{4,8})/, /(?:pin|PIN|Pin)[:\s]+(\d{4,8})/, then raw 6-digit, 4-digit, and 8-digit number matches as fallbacks. The first pattern that matches wins, so a well-formatted email like 'Your code: 482197' resolves before the extractor ever considers random 6-digit numbers elsewhere in the body. If no pattern matches, the full body is returned so the agent can reason about the content instead of failing hard.
What are the three disposable-email tools I call from my scenario file?
create_temp_email creates a new disposable address and stores it on the agent as `this.tempEmail` (agent.ts line 801). wait_for_verification_code polls the inbox every 3000ms for up to 60 seconds by default, capped at 120000ms by Math.min on agent.ts line 810, and returns `{ code, from, subject, body }`. check_email_inbox is a raw fetch of all current messages, useful when you want to assert on a welcome email that is not a verification code. You call them by name inside a plain-English `#Case` block, exactly as you would tell a careful junior tester: 'Call create_temp_email, type that address into the Email field, submit, then call wait_for_verification_code and type the code into the OTP field.'
Does this replace Playwright, or work with it?
It runs on top of Playwright. Assrt spawns @playwright/mcp under the hood, which is the official Playwright MCP server. The agent still drives a real Chromium, Firefox, or WebKit process via real Playwright APIs, and every interaction (navigate, click, type, snapshot) is a real Playwright command. The QA automation tutorial value-add is that you write the plan as English, not as code, and the agent translates the English into Playwright calls using the 18 tools defined in agent.ts between lines 16 and 196. If you already have a Playwright suite, the two can live side by side: keep the high-traffic tests in TypeScript, and use plain-English `#Case` blocks for flows that would otherwise never get automated (like signup + OTP).
How long does it take to run a signup + OTP test end to end?
For a typical SaaS signup flow with a 6-digit email code, expect about 8 to 14 seconds total. Breakdown: navigate and snapshot (1-2s), type email and password (1s each), click sign up (1-2s), create_temp_email and type it (usually already cached, under 1s), wait_for_verification_code polls every 3 seconds and most providers deliver in 3-6 seconds, type the code (1s), submit (1-2s), wait for the success page and assert. The poll interval of 3000ms is a deliberate cost-vs-latency trade: tight enough to catch fast providers, wide enough not to rate-limit temp-mail.io during a dozen parallel tests.
What happens if the verification email never arrives?
The waitForVerificationCode method at email.ts line 82 returns null after the timeout expires. Back in agent.ts lines 822-827, the tool handler converts that into a failed step with description 'Verification email timeout' and records it in the scenario result. The scenario is not aborted automatically; the agent sees the failure in its conversation history and can call suggest_improvement to log it as a product bug (maybe your transactional email provider is down), or retry with a fresh mailbox. In CI, you treat the failed scenario like any other red test: the run directory at /tmp/assrt/<runId>/ still contains the video, screenshots, and events.json so a human can review what the agent was waiting for.
Can I use my own SMTP server or mail provider instead of temp-mail.io?
Yes, but not through the bundled tools. The DisposableEmail class in email.ts is coupled to the temp-mail.io REST API (BASE constant on line 9). If you need to swap providers, fork that file and implement the same three methods: create(), getMessages(), and waitForVerificationCode(). The agent.ts handlers at lines 800-842 only use the public interface, so any object that returns the same shape works. For most tutorials and smoke tests temp-mail.io is fine; the reason to swap is typically compliance (you want an inbox on your own infra) or testing your own transactional templates with your own allow-list.
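If you do fork, the contract could be written down as an interface first. The three method names come from this tutorial; their parameters and return types, the `MailboxProvider` interface, and the `InMemoryMailbox` class are assumptions sketched for illustration, so check them against email.ts before relying on them.

```typescript
interface InboxMessage {
  from: string;
  subject: string;
  body: string;
}

interface VerificationResult extends InboxMessage {
  code: string;
}

// Assumed public surface shared by every provider.
interface MailboxProvider {
  create(): Promise<string>;
  getMessages(): Promise<InboxMessage[]>;
  waitForVerificationCode(timeoutMs?: number): Promise<VerificationResult | null>;
}

// Hypothetical in-process provider, e.g. fed by your own mail hook.
class InMemoryMailbox implements MailboxProvider {
  private address = "";
  private readonly messages: InboxMessage[] = [];

  async create(): Promise<string> {
    this.address = `qa-${Date.now()}@mail.example.test`;
    return this.address;
  }

  // Test hook: simulates your infrastructure delivering a message.
  deliver(message: InboxMessage): void {
    this.messages.push(message);
  }

  async getMessages(): Promise<InboxMessage[]> {
    return [...this.messages];
  }

  async waitForVerificationCode(timeoutMs = 60_000): Promise<VerificationResult | null> {
    const deadline = Date.now() + Math.min(timeoutMs, 120_000);
    do {
      for (const message of this.messages) {
        const code = message.body.match(/(?:code|Code|CODE)[:\s]+(\d{4,8})/)?.[1];
        if (code !== undefined) {
          return { code, ...message };
        }
      }
      await new Promise<void>((resolve) => setTimeout(resolve, 50));
    } while (Date.now() < deadline);
    return null;
  }
}
```

Writing the interface down separately keeps the swap small: a replacement only has to satisfy `MailboxProvider`, whether it talks to your own SMTP sink or to another hosted inbox.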
How is this different from the QA automation tools that charge seven thousand per month?
Three ways. First, the plan lives in your repo as scenario.md, not in a vendor dashboard, so you keep it when you cancel. Second, the runner is npx assrt-mcp, so nothing is rented. Third, the disposable-email integration is open-source code you can read, which is how you know the regex patterns, the polling interval, and the 10-character prefix rule in this tutorial are real. A commercial platform that builds the same feature behind a closed dashboard is selling you confidence in a black box. A tutorial that leans on open-source tools can cite email.ts line 49 for the prefix length and agent.ts line 810 for the timeout ceiling, and you can verify both in about 30 seconds.
Will this work in a headless CI environment?
Yes. The disposable-email flow is pure HTTP, so it runs the same on a CI worker as it does on your laptop. Playwright MCP launches Chromium headless by default, and temp-mail.io does not care whether the request comes from GitHub Actions or your terminal. The one thing to remember is that your LLM key needs to be available as an environment variable (ANTHROPIC_API_KEY or GEMINI_API_KEY); Assrt reads from keychain in dev and from env in CI. Upload /tmp/assrt/<runId>/ as a workflow artifact and a failing run is still inspectable from the Actions page.
Can I test signup + OTP for a flow that texts an SMS instead of sending email?
Out of the box, no. The three disposable tools are all email-flavored. For SMS you can use the http_request tool (agent.ts line 172) to poll a service like Twilio, a disposable SMS provider, or your own /debug endpoint that surfaces recent codes. Write the plan as 'Poll https://myservice.example/dev/last-sms and assert the body contains a 6-digit code, then type that code into the OTP field.' The agent will call http_request, pull the JSON, extract the code with a prompt-level regex, and continue. The pattern is the same as email; only the source of the code changes.
Who is this QA automation tutorial actually for?
People who have already done 'my first Playwright test' once and want to automate the flow that actually matters in production: a new user signing up, verifying their email, and landing on the dashboard. It is useful for small teams that cannot justify a Testim subscription, for solo founders who need to protect their signup funnel, for staff engineers who want a regression guard on auth changes without paying a vendor, and for anyone who has tried to mock a mail server in CI and lost a week to it.