build · quickstart
Build with Harbor.
Install the reference extension. Bring your own model. Write three lines. The whole loop — Firefox, Ollama, your code — runs in about ten minutes.
1. Install Harbor
Harbor is two browser extensions and a small native bridge (Rust). Both extensions need to be loaded for the Web Agent API to work end-to-end.
Prerequisites
- Node.js 18+ · nodejs.org
- Rust · curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
- Ollama · ollama.com or brew install ollama
- Firefox 109+ or Chrome 120+ — Firefox recommended
Clone and build
git clone --recurse-submodules https://github.com/r/harbor.git
cd harbor

# Build the Harbor extension
cd extension && npm install && npm run build && cd ..

# Build the Web Agents API extension
cd web-agents-api && npm install && npm run build && cd ..

# Build and install the native bridge
cd bridge-rs && cargo build --release && ./install.sh && cd ..
Start a model
ollama serve &
ollama pull llama3.2
Load both extensions in Firefox
- Open about:debugging#/runtime/this-firefox
- "Load Temporary Add-on…" → extension/dist-firefox/manifest.json
- "Load Temporary Add-on…" again → web-agents-api/dist-firefox/manifest.json
- Open the Harbor sidebar (Cmd + B). Verify Bridge: Connected and LLM: Ollama.
Chrome works the same way: chrome://extensions → Load unpacked → the two dist-chrome/ directories. One extra step for Chrome's native messaging is documented in the repo.
2. Hello, browser AI
The minimum viable use of the API: ask permission, prompt the model, render the answer.
<button id="ask">Ask AI</button>
<div id="out"></div>
<script>
  document.getElementById("ask").onclick = async () => {
    await window.agent.requestPermissions({
      scopes: ["model:prompt"],
      reason: "To answer your question",
    });
    const session = await window.ai.createTextSession();
    const reply = await session.prompt("What is 2 + 2?");
    document.getElementById("out").textContent = reply;
  };
</script>
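The same flow can be factored into a small helper so the permission request, session setup, and cleanup live in one place. A sketch, assuming the window.agent / window.ai shapes shown above — the askModel name, the injected dependencies, and the optional destroy() call are our assumptions, made so the helper can run (and be tested) outside a browser:

```javascript
// Hypothetical helper: takes the agent and ai objects as arguments
// instead of reading window globals, so it is testable outside a browser.
// API shapes mirror the snippet above: requestPermissions({scopes, reason}),
// createTextSession(), session.prompt(question).
async function askModel(agent, ai, question) {
  await agent.requestPermissions({
    scopes: ["model:prompt"],
    reason: "To answer your question",
  });
  const session = await ai.createTextSession();
  try {
    return await session.prompt(question);
  } finally {
    // Release the session if the implementation exposes destroy()
    if (typeof session.destroy === "function") session.destroy();
  }
}
```

In the page, the click handler reduces to askModel(window.agent, window.ai, "What is 2 + 2?").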
3. Run an agent
Real agents create an explicit session in plan mode, propose
what they want to do, and only ask the user to upgrade to execute
when they have something to apply. The session carries a capability
token that bounds everything the agent can do — when you upgrade, the
token is re-minted with the new actions. agent.run remains
gated by the toolCalling feature flag in the Web Agents API
sidebar — flip it on for development.
const session = await agent.requestCapabilities({
  name: "Research assistant",
  reason: "Research assistant needs search and memory.",
  mode: "plan",
  require: [
    { action: "model.prompt.local" },
    { action: "tool.call", server: "brave-search", toolNames: ["search"] },
    { action: "tool.call", server: "memory", toolNames: ["save"] },
  ],
  budget: { maxToolCalls: 30, ttlMinutes: 15 },
});

for await (const ev of agent.run({
  sessionId: session.id,
  task: "Find recent press on AI safety",
  maxToolCalls: 10,
})) {
  if (ev.type === "thinking") console.log(ev.content);
  if (ev.type === "tool_call") console.log("→", ev.tool);
  if (ev.type === "final") console.log(ev.output);
}
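If you want the result as a value rather than console logs, the event stream folds naturally into a transcript. A sketch — collectRun is our name, not part of the API; it consumes any async iterable of { type, ... } events shaped like the ones above:

```javascript
// Hypothetical helper: fold the agent.run event stream into one object.
// Event shapes follow the loop above: thinking / tool_call / final.
async function collectRun(events) {
  const transcript = { thoughts: [], toolCalls: [], output: null };
  for await (const ev of events) {
    if (ev.type === "thinking") transcript.thoughts.push(ev.content);
    if (ev.type === "tool_call") transcript.toolCalls.push(ev.tool);
    if (ev.type === "final") transcript.output = ev.output;
  }
  return transcript;
}
```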
// Capability tokens are one-way: upgradeSession can narrow but not
// widen. To gain write authority later, mint a fresh session with a
// scoped, reason-tagged prompt the user can grant in one click:
const writeSession = await agent.requestCapabilities({
  name: "Apply suggested edit",
  reason: "Apply the changes you approved",
  mode: "execute",
  require: [{ action: "browser.write.interact" }],
});
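The "narrow but not widen" rule is easy to model: an upgrade is valid only if the requested actions are a subset of the token's current actions and the budget does not grow. A sketch of that check — the isNarrowing function and its data shapes are ours for illustration, not the extension's internals:

```javascript
// Hypothetical model of the one-way capability rule: an upgrade may
// only shrink the action set and the budget, never grow them.
function isNarrowing(current, requested) {
  const allowed = new Set(current.actions);
  const actionsOk = requested.actions.every((a) => allowed.has(a));
  const budgetOk =
    requested.budget.maxToolCalls <= current.budget.maxToolCalls &&
    requested.budget.ttlMinutes <= current.budget.ttlMinutes;
  return actionsOk && budgetOk;
}
```

Anything that fails this check — a new action like browser.write.interact, or a larger call budget — needs a fresh session and a fresh user grant, as above.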
4. Expose your own tools (W3C WebMCP)
Pages can declare JavaScript-backed tools the user's agent can call. No server, no auth — the function lives in the page.
navigator.modelContext.addTool({
  name: "search_archive",
  description: "Search our 20-year archive",
  inputSchema: { type: "object", properties: { query: { type: "string" } } },
  handler: ({ query }) => searchArchive(query),
});
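Because the handler runs in your page with whatever input the agent sends, it is worth validating arguments before touching page state. A minimal sketch — guardedHandler is our helper, checking only a tiny subset of the inputSchema above; a real deployment would use a full JSON Schema validator:

```javascript
// Hypothetical guard: check required string properties against a minimal
// subset of the inputSchema shape before invoking the wrapped handler.
function guardedHandler(schema, handler) {
  return (input) => {
    for (const [key, spec] of Object.entries(schema.properties ?? {})) {
      if (spec.type === "string" && typeof input?.[key] !== "string") {
        throw new TypeError(`expected string property "${key}"`);
      }
    }
    return handler(input);
  };
}
```

Registering the tool then becomes handler: guardedHandler(inputSchema, ({ query }) => searchArchive(query)).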
Going deeper
The full developer documentation lives in the repo. The starter guide is designed to drop into any AI coding assistant's context (Cursor, Claude Code, Copilot) so it can write Web Agent API code with you.
- Quickstart (zero → working code in 15 min) ↗
- BUILDING_ON_WEB_AGENTS_API — drop into your AI assistant's context ↗
- Full API reference ↗
- Testing — unit tests with mocks, E2E with Playwright ↗
- MCP server authoring guide (build your own tools) ↗
- Working demos — chat, summarizer, research agent, multi-agent ↗
Tell us what you build
What we want to see are sites that put the API to work, not more polish on the API itself. If you build something — even a half-broken sketch — let us know. The architecture lives or dies on the use cases that show up.