Image Difference Analysis

Takes a raw pixel diff and tells you what changed, where, and how much. No ML models - a deterministic pipeline that runs in the same binary as the diff itself. Available in both the native Node binding (@blazediff/core-native) and the WebAssembly build (@blazediff/core-wasm), so you can run it server-side or in the browser.

How it works

Pixel diff → binary change mask
Morphological close → bridge small gaps
Connected components → isolate regions
Per-region evidence extraction:
- Dual-image gradients - edges in both images + spatial correlation to detect structural preservation
- Color delta distribution - mean, max, and stddev of YIQ distance to separate uniform recolors from patchy texture changes
- Background distance - how much changed pixels blend with local unchanged pixels in each image
Six-label decision tree classifies each region
Post-hoc shift detection matches Addition+Deletion pairs

Demo

Image 1

Image 2

Moderate visual change detected (1.87% of image, 10 regions). Content changed: 3 regions (bottom, center). Content added: 3 regions (right, bottom, bottom-left). Content removed: 2 regions (bottom, center). Content shifted: 2 regions (bottom, top-left).

medium1.87% changed

Regions (10)Hover a region to highlight

bottom·deletion·mixed-region·(670, 977, 199×89) · 0.58%

bottom·content-change·sparse-distributed·(558, 738, 193×173) · 0.34%

bottom·content-change·mixed-region·(726, 910, 208×51) · 0.30%

center·content-change·sparse-distributed·(368, 663, 323×88) · 0.27%

bottom·shift·sparse-distributed·(432, 937, 50×151) · 0.11%

right·addition·sparse-distributed·(1004, 626, 71×130) · 0.08%

top-left·shift·edge-dominated·(343, 102, 39×126) · 0.07%

center·deletion·sparse-distributed·(505, 717, 49×88) · 0.05%

bottom·addition·sparse-distributed·(694, 1085, 80×33) · 0.04%

bottom-left·addition·edge-dominated·(253, 1191, 89×37) · 0.03%

Usage

Both bindings return the same result shape - summary, regions[] (with position, changeType, percentage, …), and severity. The native binding reads files; the WASM binding takes pre-decoded RGBA buffers.

core-native (Node)


import { interpret } from "@blazediff/core-native";
 
const result = await interpret("fixtures/3a.png", "fixtures/3b.png");
 
console.log(result.summary);
for (const region of result.regions) {
  console.log(`${region.position}: ${region.changeType} (${region.percentage.toFixed(2)}%)`);
}

Identical images

Image 1

Image 2

Images are identical

low0.00% changed

When nothing changed, regions is empty and summary reports no differences - identical for both bindings.

Change types

Type	Meaning
`Addition`	Content appeared - blends with background in before image, distinct in after
`Deletion`	Content removed - distinct in before, blends with background in after
`Shift`	Content moved - matched Addition+Deletion pair with similar luminance
`ColorChange`	Recolor - edge structure preserved across both images, uniform color shift
`ContentChange`	Structural change - edges differ between images
`RenderingNoise`	Sub-pixel artifacts - filtered from output

Accuracy

Measured against datasets with hand-labeled change regions (see crates/blazediff-interpret-verify/BENCHMARKS.md for the full breakdown).

Dataset	What it tests	Classifier-only macro F1	End-to-end macro F1
`addition_deletion`	Clean object insert / remove on photographs	0.998	0.888
`shift`	Sub-region translations with pixel-perfect ground truth	0.813	0.628
`inpaintcoco`	Inpaint edits that mix recolor and texture replacement	0.440	0.260
`html_color_pairs`	Recolors on rendered Tailwind UI screenshots	0.993	0.786

Read this as: on clear add/delete edits the classifier almost never picks the wrong label (0.998 means 4 mistakes in 904 regions). On synthetic shifts the post-pass pairs about two thirds of moved-block events with near-perfect precision. On real inpainted photos it lands the right label in about four of every nine regions - the ColorChange vs ContentChange boundary is the dominant confusion and the focus area for the next iteration. html_color_pairs isolates that same boundary on 100 rendered Tailwind UI screenshots that differ only in color classes: a dedicated chromatic-recolor branch admits same-luminance hue swaps that YIQ otherwise scores as low-delta, lifting classifier F1 to 0.993 with zero false positives.

End-to-end runs the full detector pipeline before classifying, so the score also reflects detection misses and spurious small regions; classifier-only isolates labeling quality from detection.