Documentation
Everything you need to install Ghost, understand its tools, and integrate it with your AI agent. Ghost turns any website into structured, callable MCP tools — no code, no cloud, no screenshots.
Quick Start
Get Ghost running in under 30 seconds. Three steps, no account required.
Install dependencies
Clone the repo and install packages. Node.js 20+ required.
git clone https://github.com/ajsai47/ghost.git
cd ghost && npm install
Build the Chrome extension (optional)
The extension enables visible browsing with login sessions. Ghost also works headlessly via Playwright without it.
cd packages/extension && node build.mjs
Then load it in Chrome: chrome://extensions → Developer Mode → Load unpacked → packages/extension/dist/
Register with your MCP client
Works with Claude Code, Codex, Cursor, Roo Code, or any MCP-compatible client.
claude mcp add ghost -- npx tsx ~/ghost/packages/mcp-server/src/index.ts
The path above assumes the repo was cloned into your home directory; adjust it if you cloned elsewhere.
Set your Anthropic API key (required for tool generation):
echo "ANTHROPIC_API_KEY=sk-ant-..." > ~/ghost/.env
You are ready. Open your MCP client and try:
ghost_go("extract the top stories from hacker news")
Ghost navigates to the site, analyzes the DOM, generates typed extraction tools, executes them, and returns structured JSON, all in a single tool call.
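For readers curious about what the client actually sends, a tool invocation like the one above travels as a JSON-RPC `tools/call` request under the Model Context Protocol. The sketch below only builds that request object; the helper name `buildToolCall` and the `id` value are illustrative, and the `instruction` argument name is taken from the ghost_go parameter table.

```typescript
// Shape of an MCP "tools/call" request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: wraps a tool name and its arguments in a request object.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

const request = buildToolCall(1, "ghost_go", {
  instruction: "extract the top stories from hacker news",
});
console.log(JSON.stringify(request, null, 2));
```

In practice your MCP client constructs and sends this message for you; the sketch is only meant to show where the tool name and its arguments end up.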
Core Tools
Ghost ships with 7 primary tools that are always available. These are the building blocks for all web automation.
ghost_go
The primary tool. Tell it what you want in plain English and it executes immediately — navigating, extracting, clicking, searching, whatever the task requires. This is the only tool most users will ever need.
| Parameter | Type | Description |
|---|---|---|
| instruction (required) | string | What you want to do, e.g. "go to hacker news and extract the top stories" |
| preview | boolean | If true, shows the execution plan without running it. Default: false (executes immediately). |
// Navigate and extract
ghost_go("extract the top 30 stories from Hacker News with scores")
// Search the web
ghost_go("search for AI startups and extract the top results")
// Fill a form
ghost_go("go to example.com/contact and fill the form with name 'Alex', email 'alex@co.com'")
// Preview without executing
ghost_go("extract repos from github.com/trending", preview: true)
How it works: ghost_go uses fast-path pattern matching for common commands (navigate, extract, search) and falls back to an LLM decomposer for complex multi-step instructions. Steps execute sequentially with error recovery.
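The fast-path idea can be sketched as a small router: try a handful of regular expressions first, and hand anything else to the LLM decomposer. Everything here is an assumption made for illustration (the patterns, the `planInstruction` name, the crude "and/then" multi-step check); Ghost's actual matcher is not documented on this page.

```typescript
type FastTool = "ghost_navigate" | "ghost_search" | "extract";

type Plan =
  | { kind: "fast"; tool: FastTool; arg: string }
  | { kind: "llm"; instruction: string };

// Hypothetical fast-path patterns; the real matcher may differ.
const FAST_PATHS: Array<{ pattern: RegExp; tool: FastTool }> = [
  { pattern: /^(?:go to|navigate to|open)\s+(\S+)$/i, tool: "ghost_navigate" },
  { pattern: /^search(?: for)?\s+(.+)$/i, tool: "ghost_search" },
  { pattern: /^extract\s+(.+)$/i, tool: "extract" },
];

function planInstruction(instruction: string): Plan {
  const text = instruction.trim();
  // Crude multi-step detection (assumption): conjunctions imply several steps.
  const looksMultiStep = /\b(?:and|then)\b/i.test(text);
  if (!looksMultiStep) {
    for (const { pattern, tool } of FAST_PATHS) {
      const match = pattern.exec(text);
      if (match) return { kind: "fast", tool, arg: match[1] ?? "" };
    }
  }
  // Everything else would be handed to the LLM decomposer.
  return { kind: "llm", instruction: text };
}

console.log(planInstruction("go to https://news.ycombinator.com"));
console.log(planInstruction("go to example.com/contact and fill the form"));
```

The ordering matters in a design like this: cheap pattern checks run before any model call, so simple commands never pay LLM latency.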
ghost_navigate
Navigate the browser to a specific URL. Automatically checks the tool registry cache and generates new tools if the site has not been visited before. Use this when you know the exact URL.
| Parameter | Type | Description |
|---|---|---|
| url (required) | string | The URL to navigate to. |
| tab_id | string | Target a specific tab by ID (from ghost_tab_list). Optional. |
ghost_navigate("https://news.ycombinator.com")
// Returns: Navigated to news.ycombinator.com — 6 tools available (cache).
ghost_search
Search the web via the Exa API and return results with optional content snippets. Can auto-navigate to the top result and generate extraction tools in a single call.
| Parameter | Type | Description |
|---|---|---|
| query (required) | string | The search query. |
| num_results | number | Number of results to return. Default: 10. |
| type | "auto", "neural", or "instant" | Search type. Default: auto. |
| contents | boolean | Include text content snippets in results. Default: false. |
| category | "news", "company", "research paper", "tweet", or "personal site" | Filter results by content category. |
| auto_navigate | boolean | Automatically navigate to the top result and generate tools. Default: false. |
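The parameter table maps naturally onto a typed interface. The sketch below is one way to model it, with a hypothetical `withSearchDefaults` helper that applies the documented defaults; the type and function names are illustrative and not part of Ghost's published API.

```typescript
type SearchType = "auto" | "neural" | "instant";
type SearchCategory = "news" | "company" | "research paper" | "tweet" | "personal site";

interface GhostSearchParams {
  query: string;
  num_results?: number;
  type?: SearchType;
  contents?: boolean;
  category?: SearchCategory;
  auto_navigate?: boolean;
}

interface ResolvedSearchParams {
  query: string;
  num_results: number;
  type: SearchType;
  contents: boolean;
  category: SearchCategory | undefined;
  auto_navigate: boolean;
}

// Hypothetical helper: fills in the defaults listed in the table.
function withSearchDefaults(params: GhostSearchParams): ResolvedSearchParams {
  return {
    query: params.query,
    num_results: params.num_results ?? 10,
    type: params.type ?? "auto",
    contents: params.contents ?? false,
    category: params.category, // no default: results are unfiltered when omitted
    auto_navigate: params.auto_navigate ?? false,
  };
}

const resolved = withSearchDefaults({
  query: "MCP server implementations",
  auto_navigate: true,
  num_results: 5,
});
console.log(resolved);
```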
ghost_search("MCP server implementations", auto_navigate: true, num_results: 5)
ghost_analyze
Manually trigger page analysis on the current browser tab. Generates typed MCP tools for the current page. Useful when you have already navigated somewhere and want to regenerate tools (for example, after scrolling to load more content).
This tool takes no parameters.
ghost_analyze()
// Returns: Generated 8 tools for github.com:
// - github_extract_repository_stats (extract)
// - github_extract_files (extract)
// - github_click_tab (click)
// ...
ghost_do
Execute a browser automation task from a natural language instruction. Unlike ghost_go, this tool gives you a plan preview before execution by default — useful when you want to review what Ghost will do before it does it.
| Parameter | Type | Description |
|---|---|---|
| instruction (required) | string | Natural language instruction, e.g. "go to hacker news and extract the top stories". |
| confirm | boolean | If true, execute the plan immediately. If false or omitted, return the plan for review. |
// Step 1: Preview the plan
ghost_do("navigate to github trending and extract repos")
// Returns: plan with steps [ghost_navigate, ghost_analyze, auto_extract]
// Step 2: Execute
ghost_do("navigate to github trending and extract repos", confirm: true)
ghost_setup
Diagnostic tool that checks your Ghost environment and provides step-by-step instructions to fix any issues. Run this if something is not working.
This tool takes no parameters.
Checks: Node.js version (≥20), ANTHROPIC_API_KEY, ~/.ghost/ directory, registry cache, Chrome extension connection, Playwright availability.
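As an illustration of the first check in that list, the sketch below tests a Node.js version string against the documented minimum of 20. How ghost_setup actually performs this check is not described here; the function name `meetsNodeRequirement` is hypothetical.

```typescript
// Returns true when a version string such as "v20.11.0" meets the minimum major version.
function meetsNodeRequirement(version: string, minMajor = 20): boolean {
  const match = /^v?(\d+)\./.exec(version);
  if (!match) return false; // unparseable input is treated as a failed check
  return Number(match[1]) >= minMajor;
}

console.log(meetsNodeRequirement("v20.11.0"));
console.log(meetsNodeRequirement("v18.19.1"));
```

Inside a Node.js process, the string to test would typically be `process.version`.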
ghost_status
Get Ghost connection status, registered tools, and registry stats. On first use, shows a getting-started guide. For returning users, shows a compact status table.
This tool takes no parameters.
## Ghost Status
| Component | Status |
|-----------------|----------------------------------|
| Extension | Connected |
| API Key | Set |
| Registry | 12 domains, 87 tools cached |
| Session tools | 6 site tools active |
| Auth | 3 saved sessions |
Dynamic Tools
When Ghost visits a website, it analyzes the DOM and auto-generates site-specific tools. These are typed MCP tools with CSS selectors, input schemas, and descriptions — all cached locally in ~/.ghost/registry/ for instant reuse.
How tools are generated
DOM Analysis
Ghost reads the live DOM, identifying interactive elements (buttons, links, forms), data blocks (tables, lists, cards), and navigation patterns.
Heuristic Generation
Initial tools are generated using fast heuristics based on element types, ARIA roles, and data attributes. Available in milliseconds.
LLM Refinement (background)
Claude Opus 4.6 refines the heuristic tools in the background using extended thinking — improving selectors, adding descriptions, and merging duplicates. Updates arrive automatically.
Registry Cache
Tools are saved as JSON in ~/.ghost/registry/{domain}/{pattern}.json. Every future visit loads from cache — zero generation cost.
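The cache step can be pictured as a load-or-generate lookup keyed by the documented path layout. The sketch below uses an in-memory map as a stand-in for the JSON files on disk; the `CachedTool` fields, the `RegistryCache` class, and the `_news` pattern value (borrowed from the registry layout shown later on this page) are assumptions for illustration.

```typescript
interface CachedTool {
  name: string;
  type: string;
  selector: string; // hypothetical field: the CSS selector the tool relies on
}

// Builds the documented cache location: ~/.ghost/registry/{domain}/{pattern}.json
function registryPath(homeDir: string, domain: string, pattern: string): string {
  return `${homeDir}/.ghost/registry/${domain}/${pattern}.json`;
}

// In-memory stand-in for the on-disk registry, keyed by file path.
class RegistryCache {
  private entries = new Map<string, CachedTool[]>();
  private homeDir: string;

  constructor(homeDir: string) {
    this.homeDir = homeDir;
  }

  // Returns cached tools when present; otherwise generates and stores them.
  getOrGenerate(
    domain: string,
    pattern: string,
    generate: () => CachedTool[],
  ): { tools: CachedTool[]; fromCache: boolean } {
    const key = registryPath(this.homeDir, domain, pattern);
    const cached = this.entries.get(key);
    if (cached) return { tools: cached, fromCache: true };
    const tools = generate();
    this.entries.set(key, tools);
    return { tools, fromCache: false };
  }
}

const cache = new RegistryCache("/home/alex");
let generations = 0;
const generate = (): CachedTool[] => {
  generations += 1;
  return [{ name: "hackernews_extract_stories", type: "extract", selector: "tr.athing" }];
};
const firstVisit = cache.getOrGenerate("news.ycombinator.com", "_news", generate);
const secondVisit = cache.getOrGenerate("news.ycombinator.com", "_news", generate);
console.log(firstVisit.fromCache, secondVisit.fromCache, generations);
```

The second lookup never calls `generate`, which is the "zero generation cost" property described above.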
Example generated tools
| Site | Generated Tool | Type | What It Does |
|---|---|---|---|
| Hacker News | hackernews_extract_stories | extract | Extract story titles, scores, users, and comment counts |
| Hacker News | hackernews_click_story | click | Click a story link to navigate to it |
| GitHub | github_extract_repository_stats | extract | Extract stars, forks, description, language |
| GitHub | github_click_tab | click | Click a repo tab (Code, Issues, PRs, etc.) |
| Wikipedia | wikipedia_extract_article | extract | Extract article content, sections, and references |
| Wikipedia | wikipedia_click_link | click | Click an internal wiki link |
| Amazon | amazon_extract_products | extract | Extract product name, price, rating, reviews |
| Any site | {domain}_submit_search | form | Submit the search form with a query |
Tool naming convention
Dynamic tools follow the pattern {domain_prefix}_{action}_{target}. The domain prefix is derived from the hostname (e.g., hackernews for news.ycombinator.com). Actions include extract, click, submit, scroll, and navigate.
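A minimal sketch of the naming pattern, assuming a small lookup table for special-case hostnames and a naive fallback that takes the label before the top-level domain. Only the hackernews mapping comes from the text; the fallback rule and the function names are guesses at how a prefix could be derived, not Ghost's actual logic.

```typescript
type Action = "extract" | "click" | "submit" | "scroll" | "navigate";

// Hypothetical hostname-to-prefix table; only this entry is documented.
const DOMAIN_PREFIXES: Record<string, string> = {
  "news.ycombinator.com": "hackernews",
};

function domainPrefix(hostname: string): string {
  const known = DOMAIN_PREFIXES[hostname];
  if (known) return known;
  // Fallback (assumption): drop "www." and use the label before the TLD.
  const labels = hostname.replace(/^www\./, "").split(".");
  const base = labels.length > 1 ? labels[labels.length - 2] : labels[0];
  return (base ?? "").toLowerCase().replace(/[^a-z0-9]/g, "");
}

// Builds {domain_prefix}_{action}_{target} with the target slugified.
function toolName(hostname: string, action: Action, target: string): string {
  const slug = target
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "_")
    .replace(/^_+|_+$/g, "");
  return `${domainPrefix(hostname)}_${action}_${slug}`;
}

console.log(toolName("news.ycombinator.com", "extract", "stories"));
console.log(toolName("github.com", "extract", "Repository Stats"));
```

The naive fallback would mislabel multi-part suffixes such as bbc.co.uk, which is one reason a real implementation might rely on an explicit mapping.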
Advanced Features
Macros
Chain multiple Ghost tools into reusable, multi-step workflows. Macros support variable interpolation, conditional execution, and looping.
{
  "name": "hn_paginated_extract",
  "description": "Extract stories from multiple pages of Hacker News",
  "steps": [
    { "tool": "ghost_navigate", "args": { "url": "https://news.ycombinator.com" }, "output_name": "nav" },
    { "tool": "hackernews_extract_stories", "args": {}, "output_name": "page_data" },
    { "tool": "hackernews_click_more", "args": {}, "output_name": "next" }
  ],
  "is_loop": true,
  "max_iterations": 3
}
| Parameter | Type | Description |
|---|---|---|
| name (required) | string | Name for the macro tool (becomes callable like any other tool). |
| description (required) | string | Human-readable description of what the macro does. |
| steps (required) | string (JSON) | JSON array of steps. Each step has tool, args, output_name, and optional condition. |
| is_loop | boolean | If true, steps repeat until condition fails or max_iterations. Default: false. |
| max_iterations | number | Max loop iterations. Default: 10. |
Variable interpolation: Use $input.param_name for user inputs and $prev.result.field for referencing previous step outputs.
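To make the interpolation rule concrete, here is a rough sketch of a resolver that walks a step's args and swaps in referenced values. It assumes a placeholder must be the entire string value and that unresolved references are left untouched; both behaviors, along with the names `interpolate` and `lookup`, are illustrative guesses and not documented Ghost semantics.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

interface InterpolationContext {
  input: { [key: string]: Json }; // user-supplied macro inputs
  prev: { [key: string]: Json }; // output of the previous step
}

// Walks a dotted path such as "result.field" through nested objects.
function lookup(root: Json | undefined, path: string[]): Json | undefined {
  let current: Json | undefined = root;
  for (const key of path) {
    if (current === null || typeof current !== "object" || Array.isArray(current)) {
      return undefined;
    }
    current = current[key];
  }
  return current;
}

// Replaces "$input.x" and "$prev.a.b" string values; other values pass through.
function interpolate(value: Json, ctx: InterpolationContext): Json {
  if (typeof value === "string") {
    const match = /^\$(input|prev)\.([\w.]+)$/.exec(value);
    if (!match) return value;
    const root = match[1] === "input" ? ctx.input : ctx.prev;
    const resolved = lookup(root, (match[2] ?? "").split("."));
    return resolved === undefined ? value : resolved;
  }
  if (Array.isArray(value)) return value.map((item) => interpolate(item, ctx));
  if (value !== null && typeof value === "object") {
    const out: { [key: string]: Json } = {};
    for (const [key, item] of Object.entries(value)) out[key] = interpolate(item, ctx);
    return out;
  }
  return value;
}

const ctx: InterpolationContext = {
  input: { query: "mcp" },
  prev: { result: { url: "https://example.com/next" } },
};
const args = interpolate({ url: "$prev.result.url", q: "$input.query", n: 3 }, ctx);
console.log(JSON.stringify(args));
```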
Page Monitoring
Track pages for changes over time. Ghost navigates to the monitored URL, extracts data, diffs against the previous result, and records the history.
// Add a monitor
ghost_monitor_add(url: "https://news.ycombinator.com", schedule: "1h", notify: "on_change")
// Run a check (extracts data, diffs against baseline)
ghost_monitor_check()
// List all monitors
ghost_monitor_list()
// Remove a monitor
ghost_monitor_remove(id: "mon_abc123")
Schedules are manual, 5m, 1h, or 1d. All check history is stored locally in ~/.ghost/monitors/.
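The "diff against the previous result" step could look roughly like the sketch below, which compares two snapshots of extracted rows and buckets them into added, removed, and changed. The row shape, the key function, and the `diffSnapshots` and `hasChanges` names are all assumptions; the page does not specify Ghost's diff format.

```typescript
interface Story {
  title: string;
  score: number;
}

interface Diff<T> {
  added: T[];
  removed: T[];
  changed: Array<{ before: T; after: T }>;
}

// Compares two snapshots of extracted rows, matching rows by a key function.
function diffSnapshots<T>(previous: T[], current: T[], keyOf: (row: T) => string): Diff<T> {
  const before = new Map(previous.map((row) => [keyOf(row), row] as const));
  const after = new Map(current.map((row) => [keyOf(row), row] as const));
  const diff: Diff<T> = { added: [], removed: [], changed: [] };
  for (const [key, row] of after) {
    const old = before.get(key);
    if (old === undefined) {
      diff.added.push(row);
    } else if (JSON.stringify(old) !== JSON.stringify(row)) {
      diff.changed.push({ before: old, after: row });
    }
  }
  for (const [key, row] of before) {
    if (!after.has(key)) diff.removed.push(row);
  }
  return diff;
}

// A notify: "on_change" monitor would presumably fire only when this is true.
function hasChanges<T>(diff: Diff<T>): boolean {
  return diff.added.length + diff.removed.length + diff.changed.length > 0;
}

const baseline: Story[] = [
  { title: "A", score: 10 },
  { title: "B", score: 5 },
];
const latest: Story[] = [
  { title: "A", score: 12 },
  { title: "C", score: 1 },
];
const storyDiff = diffSnapshots(baseline, latest, (story) => story.title);
console.log(storyDiff, hasChanges(storyDiff));
```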
Auth Persistence
Save login sessions so Ghost can access authenticated pages across restarts. Cookies and localStorage are persisted locally.
// After logging into a site in the Ghost browser:
ghost_auth_save(domain: "github.com")
// List saved sessions
ghost_auth_list()
// Returns: { domains: ["github.com", "notion.so"], count: 2 }
Multi-Tab Management
Open, switch between, and close browser tabs programmatically. Each tab auto-analyzes its page and generates tools independently.
// Open a new tab
ghost_tab_open(url: "https://github.com/trending")
// List all tabs
ghost_tab_list()
// Returns: { active_tab: "tab_1", tabs: [...], count: 3 }
// Close a tab
ghost_tab_close(tab_id: "tab_2")
Performance and Quality Tools
Ghost includes built-in observability for monitoring tool health, execution speed, and cache efficiency.
ghost_speed
Performance dashboard with per-domain breakdown, cache hit rates, and a speed comparison against screenshot agents. Use report: true for shareable markdown output, save: true to persist snapshots, and history: true for trend tracking.
ghost_quality
Quality scores, per-tool health breakdown (healthy / degraded / broken / unused), and execution metrics for all registry entries. Degraded tools trigger auto-regeneration.
ghost_analytics
Detailed tool usage analytics with filtering by domain and health tier. Sortable by calls, success rate, latency, or name.
File Downloads
Download files from any URL to the local filesystem.
ghost_download(url: "https://example.com/report.pdf", filename: "q4-report.pdf")
// Saved to ~/.ghost/downloads/q4-report.pdf
Architecture
Ghost is an MCP (Model Context Protocol) server that bridges AI agents and the web. It uses a dual-executor architecture with intelligent caching and self-healing capabilities.
System Overview
Claude Code / Codex / Any MCP Client
        |  MCP/stdio
        v
Ghost MCP Server (packages/mcp-server/src/index.ts)
        |
        +--- Executor Router (picks fastest path)
        |       |
        |       +- Tier 1: API Replay    → ~226ms (direct HTTP, no browser)
        |       +- Tier 2: Browser Fetch → ~350ms (authenticated API via cookies)
        |       +- Tier 3: DOM Extract   → ~436ms (full browser with selectors)
        |
        +--- Registry Cache (~/.ghost/registry/)
        |       Cached tools as JSON, zero-cost on repeat visits
        |
        +--- Chrome Extension (WebSocket :3456)
        |       Visible browsing, login sessions, DOM analysis
        |
        +--- Playwright (headless fallback)
                Background tasks, parallel extraction
3-Tier Execution
Ghost automatically selects the fastest execution strategy for every request. No configuration needed.
Tier 1: API Replay. Skips the browser entirely and makes direct HTTP calls to discovered API endpoints. Used for known domains with public APIs (HN, Reddit, GitHub).
Tier 2: Browser Fetch. Makes authenticated API calls using saved session cookies. Works behind logins without opening a full browser.
Tier 3: DOM Extract. Runs a full browser with typed CSS selectors. Still 100x faster than Computer Use. Used when no API endpoint is available.
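One way to picture the routing decision is a function that returns the cheapest tier a tool qualifies for. The capability flags and the `pickTier` name below are hypothetical; the page states only that selection is automatic, not how the router decides.

```typescript
type Tier = "api_replay" | "browser_fetch" | "dom_extract";

// Hypothetical facts the router might know about a tool's target.
interface ToolCapabilities {
  hasPublicApi: boolean; // a direct, unauthenticated HTTP endpoint is known
  hasAuthedApi: boolean; // an API endpoint exists but requires a login
  hasSavedSession: boolean; // session cookies were saved for the domain
}

// Returns the fastest tier the tool qualifies for, falling back to the DOM.
function pickTier(caps: ToolCapabilities): Tier {
  if (caps.hasPublicApi) return "api_replay";
  if (caps.hasAuthedApi && caps.hasSavedSession) return "browser_fetch";
  return "dom_extract";
}

console.log(pickTier({ hasPublicApi: true, hasAuthedApi: false, hasSavedSession: false }));
console.log(pickTier({ hasPublicApi: false, hasAuthedApi: true, hasSavedSession: true }));
console.log(pickTier({ hasPublicApi: false, hasAuthedApi: true, hasSavedSession: false }));
```

A real router would probably also fall through to the next tier when a faster one fails at runtime; this sketch covers only the up-front choice.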
Self-Healing Selectors
When a website changes its DOM structure, Ghost detects degraded selectors via quality scoring. Tools with low success rates are automatically flagged and regenerated.
- Each tool has a quality score based on success rate, latency, and execution count
- Tools scoring below 0.5 with >5 executions trigger auto-regeneration
- When rollback fails, Opus 4.6 with extended thinking reasons about the DOM change and generates repaired CSS selectors
- Maximum 3 automatic regenerations per tool to prevent loops
- Quality scores and health tiers are visible via ghost_quality
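The thresholds in the list above can be expressed as a small predicate. This sketch simplifies the quality score to the success rate alone, since the exact formula (which also weighs latency and execution count) is not given; the `ToolStats` shape and function names are illustrative.

```typescript
interface ToolStats {
  successes: number;
  executions: number;
  regenerations: number; // automatic regenerations already performed
}

const SCORE_THRESHOLD = 0.5; // tools scoring below this are candidates
const MIN_EXECUTIONS = 5; // must have more than this many executions
const MAX_REGENERATIONS = 3; // cap to prevent regeneration loops

// Simplified score (assumption): success rate only.
function qualityScore(stats: ToolStats): number {
  return stats.executions === 0 ? 1 : stats.successes / stats.executions;
}

function shouldRegenerate(stats: ToolStats): boolean {
  return (
    stats.executions > MIN_EXECUTIONS &&
    qualityScore(stats) < SCORE_THRESHOLD &&
    stats.regenerations < MAX_REGENERATIONS
  );
}

console.log(shouldRegenerate({ successes: 2, executions: 10, regenerations: 0 }));
console.log(shouldRegenerate({ successes: 2, executions: 10, regenerations: 3 }));
```

Requiring a minimum number of executions keeps one early failure from triggering a regeneration, and the cap keeps a permanently broken page from consuming LLM calls indefinitely.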
Registry Structure
~/.ghost/
├── registry/ # Cached tools (JSON per domain/pattern)
│ ├── news.ycombinator.com/
│ │ └── _news.json
│ ├── github.com/
│ │ └── _owner_repo.json
│ └── ...
├── auth/ # Saved session cookies and localStorage
│ ├── github.com.json
│ └── ...
├── monitors/ # Page change tracking data
│ └── ...
├── downloads/ # Downloaded files
├── speed-history.json # Performance snapshots
└── debug.log # Server debug log
Project Structure
ghost/
├── packages/
│ ├── mcp-server/src/ # MCP server — the brain
│ │ ├── index.ts # Entry point, all tool registrations
│ │ ├── executor.ts # Executor router (API → Bridge → Playwright)
│ │ ├── playwright-executor.ts # Headless browser, DOM analysis, tool execution
│ │ ├── api-replay-executor.ts # Direct HTTP for API-backed tools
│ │ ├── tool-generator.ts # LLM-powered tool refinement
│ │ ├── nl-decomposer.ts # Natural language → tool steps
│ │ ├── registry.ts # Local JSON registry management
│ │ ├── quality.ts # Tool health scoring
│ │ ├── macro-executor.ts # Multi-step workflow engine
│ │ ├── monitor-store.ts # Page change tracking
│ │ └── domain-api-map.ts # Known domain → public API mappings
│ ├── extension/src/ # Chrome extension (Manifest V3)
│ │ ├── background/ # Service worker, tool generation via Claude
│ │ ├── content/ # DOM analysis, tool execution in page
│ │ └── popup/ # Extension dashboard UI
│ └── shared/src/ # Shared types (GhostTool, WsProtocol, Registry)
└── BENCHMARKS.md # Performance benchmark results
Configuration
Ghost requires minimal configuration. One environment variable is needed for tool generation; everything else is optional.
Environment Variables
| Variable | Required | Description |
|---|---|---|
| ANTHROPIC_API_KEY | Yes | Your Anthropic API key. Used for LLM-powered tool generation and natural language decomposition. |
| GHOST_DOWNLOAD_DIR | No | Custom download directory. Default: ~/.ghost/downloads/ |
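A sketch of how these two variables might be resolved at startup, assuming the required key fails fast and the download directory falls back to the documented default. The `resolveConfig` function and `GhostConfig` shape are hypothetical; the environment is passed in as a plain object so the example stays self-contained.

```typescript
interface GhostConfig {
  apiKey: string;
  downloadDir: string;
}

// Resolves configuration from an environment-like object.
function resolveConfig(
  env: Record<string, string | undefined>,
  homeDir: string,
): GhostConfig {
  const apiKey = env["ANTHROPIC_API_KEY"];
  if (!apiKey) throw new Error("ANTHROPIC_API_KEY not set");
  return {
    apiKey,
    // Documented default: ~/.ghost/downloads/
    downloadDir: env["GHOST_DOWNLOAD_DIR"] ?? `${homeDir}/.ghost/downloads/`,
  };
}

const config = resolveConfig({ ANTHROPIC_API_KEY: "sk-ant-..." }, "/home/alex");
console.log(config.downloadDir);
```

In a Node.js process the first argument would typically be `process.env`.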
Set your API key in one of two ways:
# Option 1: write it to a .env file in the repo
echo "ANTHROPIC_API_KEY=sk-ant-..." > ~/ghost/.env
# Option 2: export it in your shell
export ANTHROPIC_API_KEY=sk-ant-...
MCP Client Registration
Ghost works with any MCP-compatible client. Here is how to register it with popular clients:
Claude Code:
claude mcp add ghost -- npx tsx ~/ghost/packages/mcp-server/src/index.ts
Clients configured through a JSON file:
{
  "mcpServers": {
    "ghost": {
      "command": "npx",
      "args": ["tsx", "/path/to/ghost/packages/mcp-server/src/index.ts"],
      "env": {
        "ANTHROPIC_API_KEY": "sk-ant-..."
      }
    }
  }
}
Chrome Extension Setup
The Chrome extension is optional but enables visible browsing with login sessions. Ghost uses Playwright as a fallback when the extension is not connected.
1. Build: cd packages/extension && node build.mjs
2. Open chrome://extensions in Chrome
3. Enable Developer Mode (top right toggle)
4. Click "Load unpacked" → select packages/extension/dist/
5. Set your Anthropic API key in the extension popup
Registry Location
All Ghost data is stored locally in ~/.ghost/. This includes the tool registry, auth sessions, monitoring data, downloaded files, and debug logs. None of these files are uploaded anywhere; note that tool generation and search do call the Anthropic and Exa APIs.
FAQ
Does Ghost require a cloud service or account?
Do I need the Chrome extension?
What happens when a website changes its layout?
How does Ghost compare to Computer Use (screenshot-based agents)?
Can Ghost handle authenticated/login-required pages?
What MCP clients are supported?
How do I see what tools Ghost has generated?
Can I use Ghost for scraping?
I get "ANTHROPIC_API_KEY not set" — what do I do?
ghost_go says "no page loaded" — what's wrong?
Ready to get started?
Ghost is open source, MIT licensed, and free to use. Install it in 30 seconds and give your AI agent structured web tools today.