A semantic recording layer for web applications.

PageBlueprint captures machine-readable behavioral maps of running web apps — producing structured .appspec files that give AI agents the deep context they need to reason about your UI accurately and act on it reliably.

Three states, one workflow.

Record, annotate, and export — all from a compact extension popup.

Ready to record

One click starts capturing DOM, network requests, and interaction events on any web page.

Live session

Snapshot on demand, or pause to annotate ephemeral states like modals and dropdowns.

Export your .appspec

Choose exactly the formats you need — semantic tree, accessibility tree, simplified HTML, screenshot, or SVG. Visual formats export as a second .resources.json file.

AI agents are working blind.

Raw HTML is too heavy and screenshots are too shallow for LLMs to reason accurately about interactive UI.

01

Token bloat from raw exports

Full SVG trees and raw HTML dumps consume 50k–200k tokens — burning budget on structure, not meaning.

02

Dynamic states go unrecorded

Modals, dropdowns, and overlays open and close in milliseconds. Standard recorders miss them entirely.

03

No machine-readable narrative

When a user clicks through a flow, the sequence of intent is lost. There's no structured record of what happened and why.

04

Format inflexibility forces waste

One-size-fits-all exports force teams to ship full representations when all they need is a compact semantic tree.

Capture meaning, not just pixels.

Pain Point → PageBlueprint Feature

LLM token bloat → Multi-format export: semantic tree, accessibility tree, simplified HTML, screenshot, SVG — choose exactly what you need
Missed dynamic states → Pause-to-annotate + on-demand snapshots capture ephemeral UI states reliably
Lost interaction context → Event recording with contextual narrative generation ("Clicked 'Save' in Settings form")
Annotation friction → In-page annotation toolbar with shapes, freehand, and text — exported as an SVG vector overlay in the .appspec
AI ambiguity → Component inventory with selectors, roles, labels, and semantic purpose for every interactive element

Built for AI agents, not screen recorders.

Six properties that make PageBlueprint a fundamentally different kind of capture tool.

AI-first format design

.appspec is purpose-built for LLM consumption. Multiple compact representations tuned to minimize tokens while maximizing reasoning fidelity.

Semantic depth

Every interactive element is inventoried with its selector, ARIA role, label, and inferred purpose — not just its pixel position. AI agents can act on this data.
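
As a rough illustration, a single component-inventory entry might look like the TypeScript sketch below. The interface, field names, and `toAction` helper are assumptions for illustration only, not the actual .appspec schema:

```typescript
// Hypothetical shape of one .appspec component-inventory entry;
// field names are illustrative, not the real schema.
interface ComponentEntry {
  selector: string; // stable CSS selector for the element
  role: string;     // ARIA role
  label: string;    // accessible name
  purpose: string;  // inferred semantic purpose
}

const saveButton: ComponentEntry = {
  selector: "form#settings button[type='submit']",
  role: "button",
  label: "Save",
  purpose: "Persist the settings form",
};

// An agent can target the element without parsing raw HTML.
function toAction(entry: ComponentEntry): string {
  return `click ${entry.selector} (${entry.role} "${entry.label}")`;
}
```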

Interaction narratives

Records what users did — clicks, inputs, navigation — and produces contextual narratives AI agents can use to replay, test, or extend workflows.
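
A narrative like the "Clicked 'Save' in Settings form" example above could be generated from a structured event record. In this sketch, the event shape and the `narrate` helper are hypothetical, not PageBlueprint's actual API:

```typescript
// Hypothetical interaction-event record; the shape and the
// narrate() helper are illustrative, not PageBlueprint's API.
interface InteractionEvent {
  action: "click" | "input" | "navigate";
  label: string;   // accessible name of the target element
  context: string; // enclosing form, dialog, or region
}

const VERBS: Record<InteractionEvent["action"], string> = {
  click: "Clicked",
  input: "Typed into",
  navigate: "Navigated to",
};

function narrate(e: InteractionEvent): string {
  return `${VERBS[e.action]} '${e.label}' in ${e.context}`;
}
```

For example, `narrate({ action: "click", label: "Save", context: "Settings form" })` produces the narrative quoted above.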

In-page annotation

Freehand drawing, shapes, and text overlay are built into the recording session. Annotations export as an SVG vector overlay, embedded directly in the .appspec alongside structured element data.

Flexible export control

Select exactly the representations you need per export. No forced 200k-token SVG payloads when a 2k-token semantic tree will do.

Modern MV3 architecture

TypeScript + React 19 + Vite with a clean four-context message-passing model. Maintainable, auditable, and extensible.
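
One way a multi-context message-passing model stays auditable is by separating the message protocol from the Chrome APIs themselves. This TypeScript sketch assumes illustrative message names and a minimal state machine, not PageBlueprint's actual protocol:

```typescript
// Illustrative message protocol between extension contexts
// (e.g. popup, background service worker, content script);
// the message names and state machine are assumptions.
type Message =
  | { type: "START_RECORDING" }
  | { type: "TAKE_SNAPSHOT" }
  | { type: "EXPORT"; formats: string[] };

type PopupState = "ready" | "recording";

// A pure transition function keeps the protocol unit-testable
// outside the browser; real handlers would wrap it in
// chrome.runtime.onMessage listeners.
function reduce(state: PopupState, msg: Message): PopupState {
  switch (msg.type) {
    case "START_RECORDING":
      return "recording";
    case "TAKE_SNAPSHOT":
      return state; // snapshots do not change the session mode
    case "EXPORT":
      return "ready";
  }
}
```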

One capture. Five representations.

Every snapshot includes five output formats — semantic tree, accessibility tree, simplified HTML, screenshot, and SVG. Pick the ones that fit your token budget and task.

semantic-tree

Semantic Tree

Compact hierarchical summary of interactive elements. Ideal for LLM reasoning at minimal token cost.

accessibility-tree

Accessibility Tree

ARIA roles, labels, and states. Enables agents to navigate UI the same way assistive technology does.

simplified-html

Simplified HTML

Stripped, de-noised HTML preserving structure and interactivity. Readable by both agents and humans.

svg

SVG Render

Full visual representation with spatial layout. For agents that reason about position and visual hierarchy.

screenshot

Screenshot

Pixel-accurate JPEG capture of the visible tab at snapshot time. Gives agents visual grounding alongside structured data.
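
The textual/visual split described earlier — visual formats ship in a second .resources.json file — could be modeled as below. The `splitExport` helper and its grouping of format keys are illustrative assumptions, not the extension's actual export logic:

```typescript
// Hedged sketch of the export split: visual formats go to a
// companion .resources.json, textual formats stay in the .appspec.
const VISUAL_FORMATS = new Set(["screenshot", "svg"]);

function splitExport(formats: string[]): { appspec: string[]; resources: string[] } {
  return {
    appspec: formats.filter((f) => !VISUAL_FORMATS.has(f)),
    resources: formats.filter((f) => VISUAL_FORMATS.has(f)),
  };
}
```

For instance, selecting semantic-tree, svg, and screenshot would keep only the semantic tree in the .appspec and route the two visual formats to .resources.json.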

Purpose-built for builders working at the AI frontier.

Three primary use cases, one tool.

Primary

AI / LLM Engineers

Building automated testing, UI modification, and web interaction agents that need structured, token-efficient UI representations to act reliably.

Primary

QA Automation Teams

Recording test scenarios and generating verifiable .appspec behavioral specs that describe UI state precisely and reproducibly.

Primary

Web Developers

Debugging complex interactions or generating semantic documentation of running applications for AI-assisted development workflows.

Give your AI agents eyes that actually see.

Free to install. No data leaves your device.

Add to Chrome →