
Image Difference Analysis

Takes a raw pixel diff and tells you what changed, where, and how much. No ML models - a deterministic pipeline that runs in the same binary as the diff itself. Available in both the native Node binding (@blazediff/core-native) and the WebAssembly build (@blazediff/core-wasm), so you can run it server-side or in the browser.

How it works

  1. Pixel diff → binary change mask
  2. Morphological close → bridge small gaps
  3. Connected components → isolate regions
  4. Per-region evidence extraction:
    • Dual-image gradients - edges in both images + spatial correlation to detect structural preservation
    • Color delta distribution - mean, max, and stddev of YIQ distance to separate uniform recolors from patchy texture changes
    • Background distance - how much changed pixels blend with local unchanged pixels in each image
  5. Six-label decision tree classifies each region
  6. Post-hoc shift detection matches Addition+Deletion pairs
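Steps 1 to 3 above can be sketched in a few dozen lines. This is an illustration only, not the library's implementation. The function names, the summed-channel threshold, the 3×3 kernel, and the 8-connectivity are all assumptions made for the example.

```typescript
// Illustrative sketch of steps 1-3. Names, threshold, kernel size, and
// connectivity are assumptions, not the library's actual choices.
type Box = { x: number; y: number; width: number; height: number; pixels: number };

// Step 1: mark every pixel whose summed RGBA channel difference exceeds `threshold`.
function changeMask(a: Uint8Array, b: Uint8Array, width: number, height: number, threshold = 0): Uint8Array {
  const mask = new Uint8Array(width * height);
  for (let i = 0; i < width * height; i++) {
    let delta = 0;
    for (let c = 0; c < 4; c++) delta += Math.abs(a[i * 4 + c] - b[i * 4 + c]);
    mask[i] = delta > threshold ? 1 : 0;
  }
  return mask;
}

// One 3x3 morphological pass. Dilation sets a pixel if any in-bounds neighbour
// is set; erosion sets it only if all in-bounds neighbours are set.
function morph(mask: Uint8Array, width: number, height: number, dilate: boolean): Uint8Array {
  const out = new Uint8Array(mask.length);
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      let anySet = 0;
      let allSet = 1;
      for (let dy = -1; dy <= 1; dy++) {
        for (let dx = -1; dx <= 1; dx++) {
          const nx = x + dx;
          const ny = y + dy;
          if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
          const v = mask[ny * width + nx];
          anySet |= v;
          allSet &= v;
        }
      }
      out[y * width + x] = dilate ? anySet : allSet;
    }
  }
  return out;
}

// Step 2: morphological close = dilate, then erode. Bridges one-pixel gaps.
function closeMask(mask: Uint8Array, width: number, height: number): Uint8Array {
  return morph(morph(mask, width, height, true), width, height, false);
}

// Step 3: 8-connected components via flood fill, returning one bounding box each.
function components(mask: Uint8Array, width: number, height: number): Box[] {
  const seen = new Uint8Array(mask.length);
  const boxes: Box[] = [];
  for (let start = 0; start < mask.length; start++) {
    if (!mask[start] || seen[start]) continue;
    let minX = width, minY = height, maxX = -1, maxY = -1, pixels = 0;
    const stack: number[] = [start];
    seen[start] = 1;
    while (stack.length > 0) {
      const i = stack.pop()!;
      const x = i % width;
      const y = (i - x) / width;
      pixels++;
      minX = Math.min(minX, x); maxX = Math.max(maxX, x);
      minY = Math.min(minY, y); maxY = Math.max(maxY, y);
      for (let dy = -1; dy <= 1; dy++) {
        for (let dx = -1; dx <= 1; dx++) {
          const nx = x + dx;
          const ny = y + dy;
          if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
          const j = ny * width + nx;
          if (mask[j] && !seen[j]) { seen[j] = 1; stack.push(j); }
        }
      }
    }
    boxes.push({ x: minX, y: minY, width: maxX - minX + 1, height: maxY - minY + 1, pixels });
  }
  return boxes;
}
```

In this sketch, two changed pixels separated by a one-pixel gap come out as two regions on the raw mask and as a single region after the close. That is the "bridge small gaps" effect named in step 2.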

Demo

[Figure: Image 1 and Image 2, the pair being compared]

Moderate visual change detected (1.87% of image, 10 regions). Content changed: 3 regions (bottom, center). Content added: 3 regions (right, bottom, bottom-left). Content removed: 2 regions (bottom, center). Content shifted: 2 regions (bottom, top-left).

Severity: medium · 1.87% changed

Regions (10)

bottom·deletion·mixed-region·(670, 977, 199×89) · 0.58%
bottom·content-change·sparse-distributed·(558, 738, 193×173) · 0.34%
bottom·content-change·mixed-region·(726, 910, 208×51) · 0.30%
center·content-change·sparse-distributed·(368, 663, 323×88) · 0.27%
bottom·shift·sparse-distributed·(432, 937, 50×151) · 0.11%
right·addition·sparse-distributed·(1004, 626, 71×130) · 0.08%
top-left·shift·edge-dominated·(343, 102, 39×126) · 0.07%
center·deletion·sparse-distributed·(505, 717, 49×88) · 0.05%
bottom·addition·sparse-distributed·(694, 1085, 80×33) · 0.04%
bottom-left·addition·edge-dominated·(253, 1191, 89×37) · 0.03%

Usage

Both bindings return the same result shape - summary, regions[] (with position, changeType, percentage, …), and severity. The native binding reads files; the WASM binding takes pre-decoded RGBA buffers.

import { interpret } from "@blazediff/core-native";

const result = await interpret("fixtures/3a.png", "fixtures/3b.png");

console.log(result.summary);
for (const region of result.regions) {
  console.log(`${region.position}: ${region.changeType} (${region.percentage.toFixed(2)}%)`);
}
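Because both bindings return the same shape, post-processing can be written once against that shape. The sketch below filters a result down to the regions worth a human look. The interfaces cover only the fields named above and are assumptions rather than the package's published typings. The `"high"` severity value, the `regionsToReview` helper, and its default thresholds are hypothetical. The sample object is hand-written for illustration.

```typescript
// Assumed subset of the result shape described above; the real typings may differ.
type Severity = "low" | "medium" | "high"; // "high" is an assumption, the demos show only low and medium
interface Region {
  position: string;
  changeType: string;
  percentage: number;
}
interface InterpretResult {
  summary: string;
  severity: Severity;
  regions: Region[];
}

// Hypothetical helper: drop ignored change types and tiny regions,
// then sort the remainder from largest to smallest.
function regionsToReview(
  result: InterpretResult,
  ignoredTypes: string[] = ["shift"],
  minPercentage = 0.1,
): Region[] {
  return result.regions
    .filter((r) => ignoredTypes.indexOf(r.changeType) === -1 && r.percentage >= minPercentage)
    .sort((a, b) => b.percentage - a.percentage);
}

// Hand-written sample using the change-type spellings shown in the demo output.
const sample: InterpretResult = {
  summary: "sample summary (hand-written for this example)",
  severity: "medium",
  regions: [
    { position: "right", changeType: "addition", percentage: 0.08 },
    { position: "bottom", changeType: "content-change", percentage: 0.34 },
    { position: "bottom", changeType: "shift", percentage: 0.11 },
    { position: "bottom", changeType: "deletion", percentage: 0.58 },
  ],
};

for (const region of regionsToReview(sample)) {
  console.log(`${region.position}: ${region.changeType} (${region.percentage.toFixed(2)}%)`);
}
```

Filtering on the structured fields rather than parsing `summary` keeps a check like this independent of the summary's wording.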

Identical images

[Figure: Image 1 and Image 2, an identical pair]

Images are identical

Severity: low · 0.00% changed

When nothing changed, regions is empty and summary reports that the images are identical. Both bindings behave the same way.

Change types

| Type | Meaning |
| --- | --- |
| Addition | Content appeared - blends with background in before image, distinct in after |
| Deletion | Content removed - distinct in before, blends with background in after |
| Shift | Content moved - matched Addition+Deletion pair with similar luminance |
| ColorChange | Recolor - edge structure preserved across both images, uniform color shift |
| ContentChange | Structural change - edges differ between images |
| RenderingNoise | Sub-pixel artifacts - filtered from output |
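The Shift row describes a post-pass that pairs an Addition with a Deletion of similar luminance. One way to picture it is a greedy matcher like the one below. This is a sketch under stated assumptions, not the library's algorithm. The `Candidate` fields, the `pairShifts` helper, the size check, and both tolerance values are hypothetical.

```typescript
// Hypothetical per-region summary; field names are assumptions for this sketch.
interface Candidate {
  id: number;
  changeType: "addition" | "deletion";
  width: number;
  height: number;
  meanLuma: number; // mean luminance of the region's changed pixels, 0-255
}

// Greedy pairing: for each addition, take the unused deletion of similar size
// whose mean luminance is closest, provided it is within `lumaTolerance`.
// Returns [deletionId, additionId] pairs that could be relabelled as shifts.
function pairShifts(
  regions: Candidate[],
  lumaTolerance = 8,
  sizeTolerance = 2,
): Array<[number, number]> {
  const deletions = regions.filter((r) => r.changeType === "deletion");
  const used = new Set<number>();
  const pairs: Array<[number, number]> = [];

  for (const add of regions) {
    if (add.changeType !== "addition") continue;
    let best: Candidate | undefined;
    let bestGap = Infinity;
    for (const del of deletions) {
      if (used.has(del.id)) continue;
      const sizeOk =
        Math.abs(add.width - del.width) <= sizeTolerance &&
        Math.abs(add.height - del.height) <= sizeTolerance;
      const gap = Math.abs(add.meanLuma - del.meanLuma);
      if (sizeOk && gap <= lumaTolerance && gap < bestGap) {
        best = del;
        bestGap = gap;
      }
    }
    if (best) {
      used.add(best.id);
      pairs.push([best.id, add.id]);
    }
  }
  return pairs;
}
```

Additions and deletions left unpaired keep their original labels. A strict matcher of this kind would trade missed pairs for few false pairings.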

Accuracy

Measured against datasets with hand-labeled change regions (see crates/blazediff-interpret-verify/BENCHMARKS.md for the full breakdown).

| Dataset | What it tests | Classifier-only macro F1 | End-to-end macro F1 |
| --- | --- | --- | --- |
| addition_deletion | Clean object insert / remove on photographs | 0.998 | 0.888 |
| shift | Sub-region translations with pixel-perfect ground truth | 0.813 | 0.628 |
| inpaintcoco | Inpaint edits that mix recolor and texture replacement | 0.440 | 0.260 |
| html_color_pairs | Recolors on rendered Tailwind UI screenshots | 0.993 | 0.786 |

Read this as: on clear add/delete edits the classifier almost never picks the wrong label (0.998 means 4 mistakes in 904 regions). On synthetic shifts the post-pass pairs about two thirds of moved-block events with near-perfect precision. On real inpainted photos it lands the right label in about four of every nine regions - the ColorChange vs ContentChange boundary is the dominant confusion and the focus area for the next iteration. html_color_pairs isolates that same boundary on 100 rendered Tailwind UI screenshots that differ only in color classes: a dedicated chromatic-recolor branch admits same-luminance hue swaps that YIQ otherwise scores as low-delta, lifting classifier F1 to 0.993 with zero false positives.

End-to-end runs the full detector pipeline before classifying, so the score also reflects detection misses and spurious small regions; classifier-only isolates labeling quality from detection.
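For readers unfamiliar with the metric, macro F1 is the unweighted mean of the per-label F1 scores, so a rare label counts as much as a common one. The sketch below shows the standard computation. The `Counts` type and function names are made up for this example, and the counts in the usage note are illustrative, not taken from the benchmark.

```typescript
// Per-label confusion counts: true positives, false positives, false negatives.
interface Counts {
  tp: number;
  fp: number;
  fn: number;
}

// F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall.
// Defined as 0 here when the label never occurs and is never predicted.
function f1({ tp, fp, fn }: Counts): number {
  const denominator = 2 * tp + fp + fn;
  return denominator === 0 ? 0 : (2 * tp) / denominator;
}

// Macro F1: average the per-label scores without weighting by label frequency.
function macroF1(perLabel: Counts[]): number {
  if (perLabel.length === 0) return 0;
  const total = perLabel.reduce((sum, counts) => sum + f1(counts), 0);
  return total / perLabel.length;
}
```

As a worked example, a label with perfect precision and two-thirds recall (say `tp: 2, fp: 0, fn: 1`) has F1 = 4 / 5 = 0.8. This shows how recall alone can pull a score down even when no wrong labels are emitted.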
