🤖
Agent Browser
用于 AI 代理的 Headless 浏览器自动化 CLI。在与网站交互时使用 - 导航页面、填写表单、单击按钮、截取屏幕截图、提取数据。
安全通过
💬Prompt
技能说明
name: agent-browser description: Headless browser automation CLI for AI agents. Use when interacting with websites — navigating pages, filling forms, clicking buttons, taking screenshots, extracting data, scraping, testing web apps, downloading files, or automating any browser task. Triggers on requests to "open a website", "fill out a form", "click a button", "take a screenshot", "scrape data", "test this web app", "login to a site", "monitor a page", or any task requiring programmatic web interaction.
Browser Automation with agent-browser
Setup
Run scripts/setup.sh to install agent-browser and Chromium. Requires Node.js.
Core Workflow
Every browser automation follows this pattern:
- Navigate:
agent-browser open <url> - Snapshot:
agent-browser snapshot -i(get element refs like@e1,@e2) - Interact: Use refs to click, fill, select
- Re-snapshot: After navigation or DOM changes, get fresh refs
agent-browser open https://example.com/form
agent-browser snapshot -i
# Output: @e1 [input type="email"], @e2 [input type="password"], @e3 [button] "Submit"
agent-browser fill @e1 "user@example.com"
agent-browser fill @e2 "password123"
agent-browser click @e3
agent-browser wait --load networkidle
agent-browser snapshot -i # Check result
Command Chaining
Chain with && when you don't need intermediate output:
agent-browser open https://example.com && agent-browser wait --load networkidle && agent-browser snapshot -i
Run separately when you need to parse output first (e.g., snapshot to discover refs).
Essential Commands
# Navigate
agent-browser open <url>
agent-browser close
# See the page (always do this first)
agent-browser snapshot -i # Interactive elements with refs
agent-browser snapshot -i -C # Include onclick divs
# Interact using @refs
agent-browser click @e1
agent-browser fill @e2 "text"
agent-browser select @e1 "option"
agent-browser press Enter
agent-browser scroll down 500
# Get info
agent-browser get text @e1
agent-browser get url
agent-browser get title
# Wait
agent-browser wait @e1 # For element
agent-browser wait --load networkidle # For network idle
# Capture
agent-browser screenshot page.png
agent-browser screenshot --full # Full page
agent-browser pdf output.pdf
For the full command reference, see references/commands.md.
Ref Lifecycle (Important)
Refs (@e1, @e2) are invalidated when the page changes. Always re-snapshot after:
- Clicking links/buttons that navigate
- Form submissions
- Dynamic content loading (dropdowns, modals)
Common Patterns
Form Submission
agent-browser open https://example.com/signup
agent-browser snapshot -i
agent-browser fill @e1 "Jane Doe"
agent-browser fill @e2 "jane@example.com"
agent-browser select @e3 "California"
agent-browser click @e5
agent-browser wait --load networkidle
Login with State Persistence
agent-browser open https://app.example.com/login
agent-browser snapshot -i
agent-browser fill @e1 "$USERNAME" && agent-browser fill @e2 "$PASSWORD"
agent-browser click @e3
agent-browser wait --url "**/dashboard"
agent-browser state save auth.json
# Reuse later
agent-browser state load auth.json
agent-browser open https://app.example.com/dashboard
Data Extraction
agent-browser open https://example.com/products
agent-browser snapshot -i
agent-browser get text @e5
agent-browser get text body > page.txt
Screenshot & Diff
agent-browser screenshot baseline.png
# ... changes happen ...
agent-browser diff screenshot --baseline baseline.png
Parallel Sessions
agent-browser --session site1 open https://site-a.com
agent-browser --session site2 open https://site-b.com
agent-browser session list
Security (Optional)
export AGENT_BROWSER_CONTENT_BOUNDARIES=1 # Wrap output for AI safety
export AGENT_BROWSER_ALLOWED_DOMAINS="example.com" # Domain allowlist
export AGENT_BROWSER_MAX_OUTPUT=50000 # Prevent context flooding
Cleanup
Always close sessions when done: agent-browser close
如何使用「Agent Browser」?
- 打开小龙虾AI(Web 或 iOS App)
- 点击上方「立即使用」按钮,或在对话框中输入任务描述
- 小龙虾AI 会自动匹配并调用「Agent Browser」技能完成任务
- 结果即时呈现,支持继续对话优化