Visual regression testing (Playwright)#

QuantEcon’s two theme repositories use Playwright for visual-regression testing: they render a small fixture site, take screenshots, and compare them pixel-for-pixel against committed baseline images. A failing test means the theme changed how something looks — which is exactly what you want to catch (and review) when working on a theme.

| Repository | What it themes | Playwright role |
| --- | --- | --- |
| quantecon-book-theme | The Sphinx / Jupyter Book theme (static HTML) | Snapshot every theme surface against a curated fixtures site |
| quantecon-theme-src | The MyST runtime theme (Remix server) | Snapshot the rendered fixture + a WebKit FOUC guard |

Note

Visual tests catch what unit tests cannot: a CSS change that shifts spacing, a markup change that breaks an admonition, or a dependency bump that re-renders math. Because the output is an image, “did this look right?” is answered by a diff a human can review, not by an assertion someone had to write in advance.

How it works (both repos)#

The shared pattern is:

  1. Serve a fixture site locally on a fixed port (Playwright’s webServer).

  2. Visit each page and call expect(page).toHaveScreenshot(...).

  3. Playwright compares the screenshot to the committed baseline under tests/visual/__snapshots__/ and fails on a diff larger than the configured tolerance.

  4. On an intentional change you regenerate the baselines (--update-snapshots) and commit the new images.

Both repos run desktop Chrome (1280-wide) and mobile (Pixel 5) viewports, store the test code in tests/visual/, and keep baselines in tests/visual/__snapshots__/<project>/. Where they differ is in how the fixture is served, because the two themes are fundamentally different artifacts.
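As a rough illustration of the shared pattern, a Playwright config along these lines would serve a fixture on a fixed port, define the two viewports, and keep baselines per project. This is a minimal sketch, not either repo's actual playwright.config.ts: the port, the project names, the fixture build path, and the exact snapshot path template are all assumptions.

```typescript
// playwright.config.ts: illustrative sketch, not either repo's actual config.
import { defineConfig, devices } from '@playwright/test';

// Assumption: the real port and serve command may differ.
const PORT = 8000;

export default defineConfig({
  testDir: './tests/visual',
  // Keep baselines under tests/visual/__snapshots__/<project>/.
  snapshotPathTemplate: '{testDir}/__snapshots__/{projectName}/{arg}{ext}',
  expect: {
    // Tolerate up to 1% differing pixels before failing.
    toHaveScreenshot: { maxDiffPixelRatio: 0.01 },
  },
  use: { baseURL: `http://localhost:${PORT}` },
  // Serve the fixture site on a fixed port before the tests start.
  // Assumption: fixtures/_build/html is a placeholder for the built site.
  webServer: {
    command: `python -m http.server ${PORT} --directory fixtures/_build/html`,
    url: `http://localhost:${PORT}`,
    reuseExistingServer: !process.env.CI,
  },
  projects: [
    {
      // Hypothetical project name; 1280-wide desktop viewport.
      name: 'desktop-chrome',
      use: { ...devices['Desktop Chrome'], viewport: { width: 1280, height: 720 } },
    },
    // Hypothetical project name; Pixel 5 device descriptor.
    { name: 'mobile', use: { ...devices['Pixel 5'] } },
  ],
});
```

With a config of this shape, each project writes to its own baseline directory, so a desktop diff never overwrites a mobile baseline.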

quantecon-book-theme — static HTML fixtures#

The Sphinx theme produces static HTML, so the fixture is a pre-built site served with a plain HTTP server:

  • The fixture site lives in a separate repo, quantecon-book-theme-fixtures: a landing page, ~12 synthetic pages (one per theme surface — typography, code blocks, math, admonitions, exercises, proofs, tables, figures, …) plus real-world lecture captures that previously exposed theme bugs.

  • It is pinned to a specific commit (FIXTURES_SHA in the workflows) so the input doesn’t move between theme PRs.

  • playwright.config.ts serves it with python -m http.server and points baseURL at it.

  • theme.spec.ts loops over the fixture pages taking a full-page screenshot plus header (.qe-page__header) and sidebar (.qe-sidebar) region snapshots, and adds targeted tests for dark mode, code blocks, f-string token styling, MathJax, typography, and definition lists.
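The per-page loop can be pictured as a small plan of screenshot targets: one full-page capture plus the two region captures. The sketch below models only that plan. The helper name, the baseline file naming, and the page slugs are hypothetical; only the two selectors come from the description above.

```typescript
// Hypothetical sketch of the per-page snapshot plan; not the real theme.spec.ts.
interface SnapshotTarget {
  name: string;            // baseline file name (naming scheme is an assumption)
  selector: string | null; // null means "the whole page"
}

// The two region selectors named in the text.
const REGIONS: Record<string, string> = {
  header: '.qe-page__header',
  sidebar: '.qe-sidebar',
};

function snapshotTargets(slug: string): SnapshotTarget[] {
  const regions = Object.entries(REGIONS).map(([region, selector]) => ({
    name: `${slug}-${region}.png`,
    selector,
  }));
  return [{ name: `${slug}-full.png`, selector: null }, ...regions];
}

// In a Playwright spec each target would map to one assertion, roughly:
//   selector === null
//     ? await expect(page).toHaveScreenshot(name, { fullPage: true })
//     : await expect(page.locator(selector)).toHaveScreenshot(name)
// The slugs here are made up for illustration.
for (const slug of ['typography', 'math']) {
  console.log(slug, snapshotTargets(slug).map((t) => t.name));
}
```

Keeping the plan as data makes it easy to see that adding a fixture page adds three baselines per project.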

Note

Because the fixtures site is rendered ahead of time (notebooks are not executed), every page is deterministic — there is no flaky plot output to skip. Pages with rendered math use a looser tolerance (maxDiffPixelRatio: 0.05, vs. the strict 0.01 default) because MathJax font metrics shift slightly between Ubuntu CI and macOS.
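To make the two tolerances concrete, the arithmetic behind a pixel-ratio threshold is sketched below. Playwright performs the comparison internally, so this is only an illustration of the idea; the helper names are hypothetical and the exact boundary behaviour of the real comparator is not claimed here.

```typescript
// Illustrative arithmetic only; Playwright performs this comparison internally.
const STRICT_RATIO = 0.01; // default tolerance from the text
const MATH_RATIO = 0.05;   // looser tolerance for pages with rendered math

// Hypothetical helper: choose the ratio for a page.
function ratioFor(hasMath: boolean): number {
  return hasMath ? MATH_RATIO : STRICT_RATIO;
}

// A screenshot passes when the share of differing pixels is within the ratio.
function withinTolerance(
  diffPixels: number,
  width: number,
  height: number,
  ratio: number,
): boolean {
  return diffPixels <= ratio * width * height;
}

// A 1280x720 image has 921,600 pixels, so 1% is about 9,216 pixels
// and 5% is about 46,080 pixels.
console.log(withinTolerance(9000, 1280, 720, ratioFor(false)));  // true
console.log(withinTolerance(20000, 1280, 720, ratioFor(false))); // false
console.log(withinTolerance(20000, 1280, 720, ratioFor(true)));  // true
```

The same 20,000-pixel diff fails a strict page but passes a math page, which is the trade-off the looser tolerance accepts.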

quantecon-theme-src — runtime MyST server#

The MyST theme is consumed by lectures as a built bundle served by a live Remix/myst server, so the fixture is rendered at run time:

  • The fixture is a minimal MyST project in tests/visual/fixture/ (intro.md, features.md, notebook.ipynb).

  • serve.sh injects the theme under test into the fixture’s myst.yml via the THEME_TEMPLATE environment variable, then runs myst start. THEME_TEMPLATE accepts either a local theme build directory (the candidate) or a GitHub archive zip URL (a deployed/released bundle) — so you can diff any two theme versions against identical content.

  • theme.spec.ts takes one full-page snapshot per surface; the notebook page matters most because it exercises the @myst-theme output-node rendering.
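The two kinds of THEME_TEMPLATE value can be told apart with a simple check. The sketch below is a hypothetical TypeScript model of that decision; serve.sh is a shell script and may classify the value differently, and the example path and URL are placeholders.

```typescript
// Hypothetical helper; serve.sh is a shell script and may decide differently.
type ThemeSource =
  | { kind: 'archive-url'; url: string }
  | { kind: 'local-dir'; path: string };

function classifyThemeTemplate(value: string): ThemeSource {
  const trimmed = value.trim();
  if (trimmed === '') {
    throw new Error('THEME_TEMPLATE must not be empty');
  }
  // Assumption: anything that starts with http(s):// is a remote archive.
  if (/^https?:\/\//i.test(trimmed)) {
    return { kind: 'archive-url', url: trimmed };
  }
  return { kind: 'local-dir', path: trimmed };
}

// Example values are placeholders, not real release URLs or paths.
console.log(classifyThemeTemplate('/work/.deploy/quantecon-theme').kind);
console.log(classifyThemeTemplate('https://example.org/theme/archive.zip').kind);
```

Running the suite once with each kind of value, against the same fixture, is what makes a diff between two theme versions meaningful.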

The FOUC guard (WebKit-only)#

quantecon-theme-src also ships fouc.spec.ts, a flash-of-unstyled-content guard for Safari/WebKit (#66). It aborts all external stylesheets so only the inline critical CSS in app/root.tsx can reach the first paint, then asserts the layout and font are already correct. It is snapshot-free (it checks computed display / font-family, not pixels), so it is robust across myst and CI versions. It runs on the webkit-fouc project only — Chromium paint-holds and cannot exhibit the flash.
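The guard combines two ideas: block stylesheet requests, then compare computed styles with what the inline critical CSS should already guarantee. The sketch below models both as plain functions. It is not the real fouc.spec.ts: the helper names, the element inspected, and the expected style values are all hypothetical.

```typescript
// Hypothetical sketch of the two checks; not the real fouc.spec.ts.
interface FirstPaintStyle {
  display: string;
  fontFamily: string;
}

// Decide whether a request should be aborted so only inline CSS applies.
function shouldAbort(resourceType: string): boolean {
  return resourceType === 'stylesheet';
}

// Compare computed styles with the values the critical CSS should guarantee.
function foucProblems(actual: FirstPaintStyle, expected: FirstPaintStyle): string[] {
  const problems: string[] = [];
  if (actual.display !== expected.display) {
    problems.push(`display is "${actual.display}", expected "${expected.display}"`);
  }
  // font-family is a comma-separated stack; check that the expected family leads it.
  if (!actual.fontFamily.startsWith(expected.fontFamily)) {
    problems.push(`font-family is "${actual.fontFamily}"`);
  }
  return problems;
}

// In Playwright the pieces would be wired up roughly as:
//   await page.route('**/*', (route) =>
//     shouldAbort(route.request().resourceType()) ? route.abort() : route.continue());
//   const actual = await page.evaluate(() => {
//     const s = getComputedStyle(document.body); // element choice is an assumption
//     return { display: s.display, fontFamily: s.fontFamily };
//   });
// The expected values below are placeholders, not the theme's real CSS.
const expected = { display: 'flex', fontFamily: 'Georgia' };
console.log(foucProblems({ display: 'flex', fontFamily: 'Georgia, serif' }, expected));
console.log(foucProblems({ display: 'block', fontFamily: 'Times' }, expected));
```

Because the check reads computed style values rather than pixels, it has no baseline image to regenerate.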

Running the tests locally#

In quantecon-book-theme, the tox environments handle everything, including cloning the fixtures repo into ./fixtures/:

# Run visual tests (clones fixtures if not present, builds them, runs Playwright)
tox -e visual

# Update baselines after an intentional change
tox -e visual-update

# Pin to a specific fixtures commit for local testing
FIXTURES_REF=<sha> tox -e visual

Prefer the manual route? tests/visual/README.md documents the by-hand setup (git clone the fixtures, pip install -e ., jb build, npm install, npx playwright install chromium, npm run test:visual).

In quantecon-theme-src, build the candidate theme, then point THEME_TEMPLATE at it:

# Build the candidate theme from this branch into .deploy/quantecon-theme
make build-theme

# Diff it against the committed baselines
THEME_TEMPLATE="$PWD/.deploy/quantecon-theme" npm run test:visual

# Refresh baselines once the diffs are confirmed intentional
THEME_TEMPLATE="$PWD/.deploy/quantecon-theme" npm run test:visual:update

# Run just the WebKit FOUC guard
THEME_TEMPLATE="$PWD/.deploy/quantecon-theme" npm run test:fouc

This requires the mystmd CLI (myst) on PATH and the Playwright browser binaries (npx playwright install --with-deps chromium webkit).

Warning

Snapshots are platform-specific — font rendering differs between macOS and Linux. In quantecon-book-theme, the committed tests/visual/__snapshots__/ baselines are the Ubuntu/CI images; local macOS runs use a separate, gitignored tests/visual/macos/ directory (selected via SNAPSHOT_DIR). This means the authoritative baselines are regenerated in CI, not from your laptop — see below.

Updating baselines#

quantecon-book-theme — via PR slash commands#

Because the committed baselines are the Ubuntu/CI images, you regenerate them on the PR, not locally. Comment on the pull request:

| Comment | Effect |
| --- | --- |
| /update-snapshots | Regenerates all baselines with --update-snapshots, uploads a before/after diff artifact, and commits the new images to your PR branch. Use when the theme styling legitimately changed. |
| /update-new-snapshots | Generates only missing baselines; existing images are left untouched. Use when you’ve added a new test/fixture page. |

The .github/workflows/update-snapshots.yml workflow performs the regeneration and pushes the commit.
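The mapping from a PR comment to an update mode can be sketched as a small parser. This is a hypothetical model only: the real dispatch lives in update-snapshots.yml, and the function name, the mode labels, and the rule that only the first word counts are assumptions.

```typescript
// Hypothetical sketch; the real logic lives in update-snapshots.yml.
type UpdateMode = 'all' | 'missing';

function parseSnapshotCommand(comment: string): UpdateMode | null {
  // Assumption: only the first whitespace-delimited word of the comment counts.
  const firstWord = comment.trim().split(/\s+/)[0];
  switch (firstWord) {
    case '/update-snapshots':
      return 'all';     // regenerate every baseline
    case '/update-new-snapshots':
      return 'missing'; // write only baselines that do not exist yet
    default:
      return null;      // not a snapshot command
  }
}

console.log(parseSnapshotCommand('/update-snapshots'));            // all
console.log(parseSnapshotCommand('/update-new-snapshots please')); // missing
console.log(parseSnapshotCommand('LGTM'));                         // null
```

Matching the whole first word, rather than a prefix, keeps the two commands from being confused with each other.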

Tip

First-time CI setup for a new branch: CI will fail because no Ubuntu baselines exist yet. Comment /update-new-snapshots to seed them, then push your real changes.

quantecon-theme-src — locally#

quantecon-theme-src regenerates baselines locally: run npm run test:visual:update (see Running the tests locally above) and commit the resulting images. Its CI runs only the FOUC guard, not the full snapshot suite. The pixel snapshots are a local/manual validation step, so review playwright-report/ before committing updated baselines.

CI integration#

In quantecon-book-theme, .github/workflows/ci.yml splits into three jobs so the preview deploys even when visual tests fail: reviewers need the preview to decide whether a failure is an intentional design change or a regression.

build  →  visual   (gate; may fail)
       ↘
         preview   (always runs; Netlify deploy reviewers eyeball)

build renders the pinned fixtures site once; visual and preview consume it in parallel. The visual job posts a “🎭 Visual Regression Test Results” summary comment and uploads the playwright-report artifact (open its index.html for diffs). Separately, tests.yml runs the pre-commit and pytest gates.

In quantecon-theme-src, .github/workflows/ci.yml runs Build & Typecheck (npm run compile, npm run prod:build) and the FOUC guard (WebKit) job, which builds the candidate theme with make build-theme and runs npm run test:fouc. The full Chromium snapshot suite is not run in CI; it is the local validation step described above.

Bumping the fixtures pin (quantecon-book-theme)#

The fixtures repo is pinned by FIXTURES_SHA, which appears in two workflow files — .github/workflows/ci.yml and .github/workflows/update-snapshots.yml. To adopt a new fixtures commit (e.g. after adding a fixture page upstream):

  1. Merge the new page in quantecon-book-theme-fixtures.

  2. Open a PR on quantecon-book-theme updating FIXTURES_SHA in both files (and add a matching test in theme.spec.ts if the new page needs one).

  3. Comment /update-new-snapshots on that PR to seed baselines for the new page.
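Because the pin lives in two files, a quick consistency check can catch a bump that updated only one of them. The sketch below is hypothetical: the way FIXTURES_SHA is written in the workflows (a `FIXTURES_SHA: <sha>` line) is an assumption, and the SHAs shown are made up.

```typescript
// Hypothetical consistency check; the YAML layout below is an assumption.
function extractFixturesSha(workflowText: string): string | null {
  // Matches lines such as `FIXTURES_SHA: "abc1234..."` (7 to 40 hex characters).
  const match = /FIXTURES_SHA:\s*["']?([0-9a-f]{7,40})["']?/.exec(workflowText);
  return match?.[1] ?? null;
}

// True only when both files carry a pin and the two pins are identical.
function pinsAgree(ciYml: string, updateYml: string): boolean {
  const a = extractFixturesSha(ciYml);
  const b = extractFixturesSha(updateYml);
  return a !== null && a === b;
}

// Made-up SHAs for illustration only.
const ci = 'env:\n  FIXTURES_SHA: "0123abc4567"\n';
const stale = 'env:\n  FIXTURES_SHA: "89abcde0123"\n';
console.log(pinsAgree(ci, ci));    // true
console.log(pinsAgree(ci, stale)); // false
```

A check like this would fit naturally in step 2, before commenting /update-new-snapshots on the PR.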

When you change a theme — checklist#

  1. Make your change and run the visual tests locally (tox -e visual, or make build-theme + npm run test:visual).

  2. If pixels changed, decide whether the change is intentional.

    • Intentional → regenerate baselines (/update-snapshots on the PR for quantecon-book-theme; npm run test:visual:update for quantecon-theme-src) and commit them.

    • Unintentional → it’s a regression; fix it.

  3. Review the diff (the playwright-report artifact / Netlify preview) and reference it in the PR so a reviewer can confirm the visual change is what you intended.

See also

The canonical, repo-local details live alongside the tests: quantecon-book-theme/tests/visual/README.md and quantecon-theme-src/tests/visual/README.md. This page is the cross-repo overview; those READMEs are the source of truth for each setup.