Tired of eyeballing Figma vs Storybook? Here’s how I gate design fidelity in CI

This content originally appeared on DEV Community and was authored by Kazunori Osaki

I got tired of eyeballing Figma vs Storybook on every PR and trying to remember:

“Is this actually what the designer meant?”

So I built uiMatch – a CLI that compares a Figma frame directly with your implementation (Storybook iframe or any URL), generates diff images, and gives you a 0–100 “Design Fidelity Score” you can gate in CI.

By the end of this post, you’ll have:

Automated Figma vs Storybook comparison running locally
A clear pass/fail quality gate you can understand and tune
CI that blocks PRs when designs drift too far

All in about 15 minutes.

Step 1: Run your first comparison locally

First, install uiMatch and Playwright:

# Install globally
npm install -g @uimatch/cli playwright
npx playwright install chromium

# or as a dev dependency
npm install -D @uimatch/cli playwright
npx playwright install chromium

Export your Figma token. You can create one under Figma Settings → Personal access tokens. The uiMatch CLI doesn’t load .env files automatically, so the token has to be set in your shell environment:

export FIGMA_ACCESS_TOKEN=figd_xxx

Now run a minimal comparison:

npx -p @uimatch/cli uimatch compare \
  figma=FILE_KEY:NODE_ID \
  story='http://localhost:6006/iframe.html?id=button--primary' \
  selector="#root button" \
  outDir=./uimatch-reports \
  profile=component/strict

Tip: I recommend the FILE_KEY:NODE_ID format. If you pass a full Figma URL instead, quote it in the shell (e.g., figma='https://www.figma.com/...') so the shell doesn’t interpret ? as a glob character or & as “run in the background”. The same applies to Storybook URLs that contain ?.

What this does:

Fetches the Figma frame via the Figma REST API
Captures your implementation using Playwright (Chromium)
Compares them using the component/strict profile
Saves figma.png, impl.png, diff.png, and report.json into ./uimatch-reports

What the output looks like

In the screenshots below:

The Figma design shows the label “uiMatch” on a dark button
The implementation shows the label “UI Match” on a green button
uiMatch detects the color / layout / text differences and highlights them in red on the diff

[Screenshots: Figma | Implementation | Diff]

Visually, the implementation still “looks right”, but the diff + metrics make it obvious where it deviates and whether it passes the quality gate.

Step 2: Understanding the report (DFS + quality gate)

uiMatch writes a JSON report that looks roughly like:

{
  "metrics": {
    "pixelDiffRatio": 0.0583,
    "colorDeltaEAvg": 0,
    "dfs": 97
  },
  "dimensions": {
    "figma": { "width": 1892, "height": 560 },
    "impl": { "width": 1892, "height": 560 },
    "compared": { "width": 1892, "height": 560 },
    "sizeMode": "pad",
    "adjusted": false
  },
  "qualityGate": {
    "pass": false,
    "cqi": 40,
    "cqiBreakdown": {
      "components": [
        { "name": "pixel", "rawValue": 0.0583, "threshold": 0.01, "penalty": 60, "weight": 0.6 }
        // ...
      ],
      "totalPenalty": 60,
      "baseScore": 100
    },
    "reasons": ["pixelDiffRatio 5.83% > 1.00%"],
    "thresholds": {
      "pixelDiffRatio": 0.01,
      "deltaE": 3
    }
  },
  "styleDiffs": [],
  "meta": {
    "figmaAutoRoi": { "applied": false }
  }
  // ... other fields omitted for brevity
}

Two key fields for CI:

metrics.dfs: A 0–100 “how close did we get?” score that blends pixel, color, and layout signals
qualityGate.pass: A strict pass/fail decision based on the active profile’s thresholds

In CI you’ll usually read just those two fields and ignore most of the rest.

The two are computed differently, which is why you can see dfs: 97 and qualityGate.pass: false at the same time: the overall fidelity is high, but the active profile (e.g., component/strict with pixelDiffRatio <= 1%) still decides that the diff is above the allowed limit and fails the gate.
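To make that relationship concrete, here is a minimal Node.js sketch that labels a result from those two fields. It only reads fields shown in the sample report; the function name describeResult and the highDfs cutoff of 90 are my own illustrative choices, not part of uiMatch.

```javascript
// Illustrative helper, not part of uiMatch: labels a report by combining
// the blended score (metrics.dfs) with the strict gate (qualityGate.pass).
function describeResult(report, highDfs = 90) {
  const dfs = report.metrics?.dfs ?? 0;
  const pass = report.qualityGate?.pass ?? false;
  const reasons = report.qualityGate?.reasons ?? [];

  if (pass) return `pass (DFS=${dfs})`;
  if (dfs >= highDfs) {
    // High fidelity overall, but at least one threshold was exceeded.
    return `close, but gate failed (DFS=${dfs}): ${reasons.join('; ')}`;
  }
  return `fail (DFS=${dfs}): ${reasons.join('; ')}`;
}

// Values taken from the sample report in this section.
const sampleReport = {
  metrics: { dfs: 97 },
  qualityGate: { pass: false, reasons: ['pixelDiffRatio 5.83% > 1.00%'] },
};
console.log(describeResult(sampleReport));
```

A "close, but gate failed" result is usually a hint to look at the diff image before deciding whether to fix the component or loosen the profile.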

Choosing the right profile

uiMatch ships with a few built-in quality gate profiles:

profile=component/strict   # pixelDiffRatio: 0.01 (1%), deltaE: 3.0
profile=component/dev      # pixelDiffRatio: 0.08 (8%), deltaE: 5.0
profile=page-vs-component  # padded / letterboxed comparisons
profile=lenient            # prototypes, rough drafts

Note on component/strict:
It’s intentionally very strict. Even with a visually “perfect” implementation,
font rendering differences and anti-aliasing can easily produce 2–3% pixel differences.
That’s normal.

In practice:

For day-to-day CI I’d default to component/dev or lenient

I reserve component/strict for design-system components in controlled environments (fixed fonts, consistent rendering stack, etc.)
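If it helps to see why the same metrics pass one profile and fail another, here is a rough sketch of a threshold gate. This is a hypothetical re-implementation for illustration, not uiMatch’s actual code: the PROFILES object shape and the evaluateGate function are assumptions, and only the threshold values come from the profile list above.

```javascript
// Thresholds copied from the profile list above; the object shape is hypothetical.
const PROFILES = {
  'component/strict': { pixelDiffRatio: 0.01, deltaE: 3.0 },
  'component/dev': { pixelDiffRatio: 0.08, deltaE: 5.0 },
};

// Formats a 0..1 ratio as a percentage with two decimals.
const pct = (ratio) => `${(ratio * 100).toFixed(2)}%`;

// Hypothetical threshold gate: fails if any metric exceeds its limit.
function evaluateGate(metrics, profileName) {
  const limits = PROFILES[profileName];
  if (!limits) throw new Error(`Unknown profile: ${profileName}`);

  const reasons = [];
  if (metrics.pixelDiffRatio > limits.pixelDiffRatio) {
    reasons.push(
      `pixelDiffRatio ${pct(metrics.pixelDiffRatio)} > ${pct(limits.pixelDiffRatio)}`
    );
  }
  if (metrics.colorDeltaEAvg > limits.deltaE) {
    reasons.push(`colorDeltaEAvg ${metrics.colorDeltaEAvg} > ${limits.deltaE}`);
  }
  return { pass: reasons.length === 0, reasons };
}

// Metrics from the sample report in Step 2.
const sampleMetrics = { pixelDiffRatio: 0.0583, colorDeltaEAvg: 0 };
console.log(evaluateGate(sampleMetrics, 'component/strict'));
console.log(evaluateGate(sampleMetrics, 'component/dev'));
```

With the sample metrics, the 5.83% pixel diff is over the 1% limit of component/strict but under the 8% limit of component/dev, which is the practical reason I default to the latter for everyday CI.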

Step 3: Wiring into CI (GitHub Actions)

uiMatch is built as a plain CLI: it reads env vars, talks to the Figma API, runs Playwright, and writes a JSON report + PNGs. That means it slots nicely into CI.

Here’s a basic GitHub Actions workflow:

name: uiMatch QA
on: [pull_request]

jobs:
  compare:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '22'

      - name: Install uiMatch + Playwright
        run: |
          npm install -g @uimatch/cli playwright
          npx playwright install --with-deps chromium

      - name: Run uiMatch
        env:
          FIGMA_ACCESS_TOKEN: ${{ secrets.FIGMA_TOKEN }}
        run: |
          npx -p @uimatch/cli uimatch compare \
            figma=${{ secrets.FIGMA_FILE }}:${{ secrets.FIGMA_NODE }} \
            story='https://your-storybook.com/iframe.html?id=button--primary' \
            selector="#root button" \
            outDir=uimatch-reports \
            profile=component/strict

      - name: Enforce quality gate
        run: |
          node - <<'EOF'
          const fs = require('fs');
          const report = JSON.parse(fs.readFileSync('uimatch-reports/report.json', 'utf8'));
          const dfs = report.metrics?.dfs ?? 0;
          const pass = report.qualityGate?.pass ?? false;

          if (!pass) {
            console.error(`❌ Quality gate failed (DFS=${dfs})`);
            console.error(`Reasons: ${(report.qualityGate?.reasons ?? []).join(', ') || 'none reported'}`);
            process.exit(1);
          }

          console.log(`✅ Quality gate passed (DFS=${dfs})`);
          EOF

      - name: Upload artifacts
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: uimatch-reports
          path: uimatch-reports/

Replace figma=..., story=..., and selector=... with your own setup.

I’m still dogfooding this workflow myself, so treat it as a starting point rather than a battle-tested template.
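One optional extra: surfacing the result on the workflow run page instead of only in the logs. The sketch below turns the report into a short Markdown summary and appends it to the file named by GITHUB_STEP_SUMMARY, which GitHub Actions sets for each step. The helper names renderSummary and publishSummary are hypothetical, and the script only relies on report fields shown in Step 2.

```javascript
const fs = require('fs');

// Hypothetical helper: builds a short Markdown summary from report.json fields.
function renderSummary(report) {
  const dfs = report.metrics?.dfs ?? 'n/a';
  const pass = report.qualityGate?.pass ?? false;
  const reasons = report.qualityGate?.reasons ?? [];
  const lines = [
    `### uiMatch: ${pass ? 'passed' : 'failed'}`,
    '',
    `- DFS: ${dfs}`,
    ...reasons.map((reason) => `- Reason: ${reason}`),
  ];
  return lines.join('\n') + '\n';
}

// GITHUB_STEP_SUMMARY is provided by GitHub Actions; outside of Actions
// the summary is printed to stdout instead.
function publishSummary(report) {
  const markdown = renderSummary(report);
  const target = process.env.GITHUB_STEP_SUMMARY;
  if (target) {
    fs.appendFileSync(target, markdown);
  } else {
    process.stdout.write(markdown);
  }
}
```

In the workflow you would call publishSummary(JSON.parse(fs.readFileSync('uimatch-reports/report.json', 'utf8'))) from a step that runs with if: always(), so the summary shows up even when the gate fails.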

Advanced: Text matching (catching `uiMatch` vs `UI Match`)

Pixel comparison is great, but copy/typo issues are easy to miss.

uiMatch has:

a separate text-diff subcommand
an experimental text-matching mode for compare

`text-diff` CLI

npx -p @uimatch/cli uimatch text-diff "Sign in" "SIGN  IN"
# → kind: "whitespace-or-case-only", similarity: 1.0

It classifies differences into buckets like:

exact-match
whitespace-or-case-only
normalized-match
mismatch

and gives you a similarity score (0–1) so you can decide how picky you want to be.
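To give a feel for how such buckets can be derived, here is a small sketch of a classifier. The bucket names come from the list above, but the rules are my assumptions for illustration and may not match what text-diff actually does. In particular, I assume “normalized-match” means “equal after Unicode NFKC normalization”, and the sketch leaves out the similarity score.

```javascript
// Removes all whitespace and lowercases, so "Sign in" and "SIGN  IN" compare equal.
const squash = (text) => text.replace(/\s+/g, '').toLowerCase();

// Hypothetical classifier: checks get progressively more forgiving.
function classifyText(expected, actual) {
  if (expected === actual) return 'exact-match';
  if (squash(expected) === squash(actual)) return 'whitespace-or-case-only';
  // Assumption: "normalized" means equal after NFKC (ligatures, full-width forms).
  if (squash(expected.normalize('NFKC')) === squash(actual.normalize('NFKC'))) {
    return 'normalized-match';
  }
  return 'mismatch';
}

console.log(classifyText('Sign in', 'SIGN  IN'));
console.log(classifyText('View docs', 'View on GitHub'));
```

Ordering the checks from strictest to loosest means each string pair lands in the most specific bucket that describes it, which makes it easy to fail CI on mismatch while only warning on the softer categories.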

Text matching in `compare` (experimental)

You can also ask compare to look at the text inside the target element and compare it to the Figma text:

npx -p @uimatch/cli uimatch compare \
  figma=FILE_KEY:NODE_ID \
  story='http://localhost:6006/iframe.html?id=hero' \
  selector="[data-testid='hero']" \
  text=true \
  textMode=descendants \
  outDir=./uimatch-reports

That’s what I use to catch things like:

uiMatch vs UI Match
View docs vs View on GitHub
accidental whitespace / casing issues

The text diff results are included in report.json alongside the pixel metrics.

Advanced: Targeting the right element (`selector`)

The selector you pass to uiMatch is forwarded straight to Playwright,
so you can reuse whatever locators you already use in your tests:

# Simple CSS
selector="#root button"

# Data attributes / test IDs
selector="[data-testid='hero']"

# Playwright role / text selectors
selector="role=button[name='Sign in']"
selector="button:has-text('View docs')"

That means you don’t need a second selector system just for visual diffing —
you can piggyback on your existing Playwright setup.

For more refactor-resistant targeting, uiMatch also has an experimental
@uimatch/selector-anchors plugin that resolves “anchors” in your code
to real selectors via AST. I’ll cover that in a separate post.

What uiMatch does under the hood

For those curious about the implementation:

Figma: Fetch PNG via Figma REST API (or use cached PNG in CI)
Implementation: Capture with Playwright + grab computed styles
Core engine:
- pixelmatch for pixel diffs
- Perceptual color difference (ΔE2000) for color comparisons
- Multiple size modes: strict, pad, crop, scale
- Quality gate profiles with configurable thresholds
- Design Fidelity Score (0–100) combining pixel, color, and layout signals
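For intuition about what pixelDiffRatio measures, here is a deliberately naive sketch: it counts pixels whose RGBA channels differ and divides by the total pixel count. This is a simplified stand-in, not how uiMatch or pixelmatch work internally (pixelmatch additionally uses a perceptual color metric and anti-aliasing detection); the function name and the tolerance parameter are my own.

```javascript
// Naive pixel diff: `a` and `b` are RGBA byte arrays of the same dimensions.
// Returns the fraction of pixels (0..1) where any channel differs by more
// than `tolerance`.
function naivePixelDiffRatio(a, b, width, height, tolerance = 0) {
  const expectedLength = width * height * 4;
  if (a.length !== expectedLength || b.length !== expectedLength) {
    throw new Error('Both buffers must be width * height * 4 bytes (RGBA)');
  }

  let differing = 0;
  for (let i = 0; i < expectedLength; i += 4) {
    const changed =
      Math.abs(a[i] - b[i]) > tolerance ||
      Math.abs(a[i + 1] - b[i + 1]) > tolerance ||
      Math.abs(a[i + 2] - b[i + 2]) > tolerance ||
      Math.abs(a[i + 3] - b[i + 3]) > tolerance;
    if (changed) differing++;
  }
  return differing / (width * height);
}

// 2x2 all-black image vs the same image with one green pixel.
const black = new Uint8Array(2 * 2 * 4).fill(0);
const oneGreen = Uint8Array.from(black);
oneGreen[1] = 255; // green channel of the first pixel
console.log(naivePixelDiffRatio(black, oneGreen, 2, 2));
```

One differing pixel out of four gives a ratio of 0.25. The same arithmetic on a real screenshot is what makes a few percent of anti-aliasing noise enough to trip a 1% threshold.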

Links & status

GitHub: https://github.com/kosaki08/uimatch
Docs: https://kosaki08.github.io/uimatch/

It’s still 0.x / experimental and I’m mostly using it on Storybook setups right now.
I’m very curious whether this kind of “Figma vs implementation in CI” flow would be useful to you, or if you’re solving it in a totally different way.

If you have ideas, weird edge cases, or “this would only be useful if it also did X”, I’d love to hear them.