🤖
XPR Web Scraping
Tools for fetching and extracting cleaned text, metadata, and links from single or multiple web pages with format options and link filtering.
Security check passed
💬Prompt
Skill description
name: web-scraping
description: Web scraping tools for fetching and extracting data from web pages
Web Scraping
You have web scraping tools for fetching and extracting data from web pages:
Single page:
scrape_url: fetch a URL and get cleaned text content plus metadata (title, description, link count)
- Use format="text" (default) for most tasks; it strips all HTML
- Use format="markdown" to preserve headings, links, lists, bold/italic
- Use format="html" only when you need raw HTML
Link discovery:
extract_links: fetch a page and extract all links with text and type (internal/external)
- Use the pattern parameter to filter by regex (e.g. "\\.pdf$" for PDF links)
- Links are deduplicated and resolved to absolute URLs
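The link handling described for extract_links (resolve to absolute URLs, deduplicate, classify as internal or external, optionally filter by regex) can be illustrated with a minimal TypeScript sketch. This is not the tool's actual implementation: the `ExtractedLink` shape, the `filterLinks` helper, and the choice to compare hosts for the internal/external split are all assumptions made for illustration.

```typescript
// Hypothetical shape of one extracted link; the real tool's output fields may differ.
interface ExtractedLink {
  url: string;
  text: string;
  type: "internal" | "external";
}

// Resolve raw hrefs against the page URL, drop duplicates, classify each link,
// and optionally keep only URLs that match `pattern` (a regex source string).
function filterLinks(
  pageUrl: string,
  rawLinks: { href: string; text: string }[],
  pattern?: string,
): ExtractedLink[] {
  const base = new URL(pageUrl);
  const regex = pattern === undefined ? undefined : new RegExp(pattern);
  const seen = new Set<string>();
  const result: ExtractedLink[] = [];

  for (const { href, text } of rawLinks) {
    let resolved: URL;
    try {
      resolved = new URL(href, base); // relative href -> absolute URL
    } catch {
      continue; // skip hrefs that cannot be parsed
    }
    const url = resolved.href;
    if (seen.has(url)) continue; // deduplicate on the absolute URL
    seen.add(url);
    if (regex !== undefined && !regex.test(url)) continue;
    // Assumption: "internal" means same host as the page being scraped.
    result.push({ url, text, type: resolved.host === base.host ? "internal" : "external" });
  }
  return result;
}

// Example: keep only PDF links, as in the pattern "\\.pdf$" mentioned above.
const pdfLinks = filterLinks(
  "https://example.com/docs/",
  [
    { href: "report.pdf", text: "Report" },
    { href: "/docs/report.pdf", text: "Report (duplicate)" },
    { href: "page.html", text: "Page" },
  ],
  "\\.pdf$",
);
console.log(pdfLinks);
```

Note that `"\\.pdf$"` is a string literal: the doubled backslash yields the regex `\.pdf$`, which matches a literal dot followed by `pdf` at the end of the URL.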
Multi-page research:
scrape_multiple: fetch up to 10 URLs in parallel for comparison/research
- One failure doesn't block others (uses Promise.allSettled)
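The failure isolation that Promise.allSettled provides can be sketched as follows. This is a minimal illustration under stated assumptions, not the tool's source: `scrapeAll`, `PageResult`, and the injected `fetchPage` function are hypothetical names, and rejecting calls with more than 10 URLs is one possible way to enforce the stated limit.

```typescript
// Hypothetical per-URL result: either the page content or an error message.
type PageResult =
  | { url: string; ok: true; content: string }
  | { url: string; ok: false; error: string };

const MAX_URLS = 10; // limit stated in the tool description

async function scrapeAll(
  urls: string[],
  fetchPage: (url: string) => Promise<string>, // hypothetical single-page fetcher
): Promise<PageResult[]> {
  if (urls.length > MAX_URLS) {
    throw new Error(`At most ${MAX_URLS} URLs per call, got ${urls.length}`);
  }
  // allSettled waits for every promise and never rejects,
  // so one failed fetch does not discard the other results.
  const settled = await Promise.allSettled(urls.map(async (url) => fetchPage(url)));
  return settled.map((outcome, i): PageResult =>
    outcome.status === "fulfilled"
      ? { url: urls[i], ok: true, content: outcome.value }
      : { url: urls[i], ok: false, error: String(outcome.reason) },
  );
}
```

By contrast, Promise.all would reject as soon as any single fetch failed, which is why allSettled suits a batch where partial results are still useful.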
Best practices:
- Prefer "text" format for content extraction, "markdown" for preserving structure
- Don't scrape the same domain more than 5 times per minute
- Combine with store_deliverable to save scraped content as job evidence
- For very large pages, the content is limited to 5MB
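The guideline of at most 5 scrapes per domain per minute could be enforced on the caller's side with a small sliding-window counter. The sketch below is one way to do it and is not part of the skill itself: `DomainRateLimiter` and `tryAcquire` are hypothetical names, and treating the URL's hostname as the "domain" is an assumption.

```typescript
const MAX_PER_MINUTE = 5; // limit from the best practices above
const WINDOW_MS = 60_000;

class DomainRateLimiter {
  // Timestamps (ms) of recent requests, keyed by hostname.
  private readonly history = new Map<string, number[]>();

  // Returns true and records the request if the URL's hostname is still
  // under the limit at time `now`; returns false otherwise.
  tryAcquire(url: string, now: number = Date.now()): boolean {
    const host = new URL(url).hostname; // assumption: domain == hostname
    const recent = (this.history.get(host) ?? []).filter((t) => now - t < WINDOW_MS);
    const allowed = recent.length < MAX_PER_MINUTE;
    if (allowed) recent.push(now);
    this.history.set(host, recent);
    return allowed;
  }
}

// Usage: check before each scrape and wait or skip when the limit is reached.
const limiter = new DomainRateLimiter();
if (limiter.tryAcquire("https://example.com/page")) {
  // safe to call scrape_url for this domain
}
```

Passing `now` explicitly keeps the limiter easy to test without real delays.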
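How do I use "XPR Web Scraping"?
- Open 小龙虾AI (Web or iOS app)
- Click the "Use now" (立即使用) button above, or type a task description in the chat box
- 小龙虾AI automatically matches and invokes the "XPR Web Scraping" skill to complete the task
- Results appear immediately, and you can continue the conversation to refine them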