🤖

Intelligent Model Router

Intelligent model routing for sub-agent task delegation. Choose the optimal model based on task complexity, cost, and capability requirements. Reduces costs...

下载1.6k

星标0

版本3.0.1

安全通过

⚙️脚本

在 App 中使用在 ClawHub 查看 ↗

技能说明

name: intelligent-router description: Intelligent model routing for sub-agent task delegation. Choose the optimal model based on task complexity, cost, and capability requirements. Reduces costs by routing simple tasks to cheaper models while preserving quality for complex work. version: 3.2.0 core: true

Intelligent Router — Core Skill

CORE SKILL: This skill is infrastructure, not guidance. Installation = enforcement. Run bash skills/intelligent-router/install.sh to activate.

What It Does

Automatically classifies any task into a tier (SIMPLE/MEDIUM/COMPLEX/REASONING/CRITICAL) and recommends the cheapest model that can handle it well.

The problem it solves: Without routing, every cron job and sub-agent defaults to Sonnet (expensive). With routing, monitoring tasks use free local models, saving 80-95% on cost.

MANDATORY Protocol (enforced via AGENTS.md)

Before spawning any sub-agent:

python3 skills/intelligent-router/scripts/router.py classify "task description"

Before creating any cron job:

python3 skills/intelligent-router/scripts/spawn_helper.py "task description"
# Outputs the exact model ID and payload snippet to use

To validate a cron payload has model set:

python3 skills/intelligent-router/scripts/spawn_helper.py --validate '{"kind":"agentTurn","message":"..."}'

❌ VIOLATION (never do this):

# Cron job without model = Sonnet default = expensive waste
{"kind": "agentTurn", "message": "check server..."}  # ← WRONG

✅ CORRECT:

# Always specify model from router recommendation
{"kind": "agentTurn", "message": "check server...", "model": "ollama/glm-4.7-flash"}

Tier System

Tier	Use For	Primary Model	Cost
🟢 SIMPLE	Monitoring, heartbeat, checks, summaries	`anthropic-proxy-6/glm-4.7` (alt: proxy-4)	$0.50/M
🟡 MEDIUM	Code fixes, patches, research, data analysis	`nvidia-nim/meta/llama-3.3-70b-instruct`	$0.40/M
🟠 COMPLEX	Features, architecture, multi-file, debug	`anthropic/claude-sonnet-4-6`	$3/M
🔵 REASONING	Proofs, formal logic, deep analysis	`nvidia-nim/moonshotai/kimi-k2-thinking`	$1/M
🔴 CRITICAL	Security, production, high-stakes	`anthropic/claude-opus-4-6`	$5/M

SIMPLE fallback chain: anthropic-proxy-4/glm-4.7 → nvidia-nim/qwen/qwen2.5-7b-instruct ($0.15/M)

⚠️ ollama-gpu-server is BLOCKED for cron/spawn use. Ollama binds to 127.0.0.1 by default — unreachable over LAN from the OpenClaw host. The router_policy.py enforcer will reject any payload referencing it.

Tier classification uses 4 capability signals (not cost alone):

effective_params (50%) — extracted from model ID or known-model-params.json for closed-source models
context_window (20%) — larger = more capable
cost_input (20%) — price as quality proxy (weak signal, last resort for unknown sizes)
reasoning_flag (10%) — bonus for dedicated thinking specialists (R1, QwQ, Kimi-K2)

Policy Enforcer (NEW in v3.2.0)

router_policy.py catches bad model assignments before they are created, not after they fail.

Validate a cron payload before submitting

python3 skills/intelligent-router/scripts/router_policy.py check \
  '{"kind":"agentTurn","model":"ollama-gpu-server/glm-4.7-flash","message":"check server"}'
# Output: VIOLATION: Blocked model 'ollama-gpu-server/glm-4.7-flash'. Recommended: anthropic-proxy-6/glm-4.7

Get enforced model recommendation for a task

python3 skills/intelligent-router/scripts/router_policy.py recommend "monitor alphastrike service"
# Output: Tier: SIMPLE  Model: anthropic-proxy-6/glm-4.7

python3 skills/intelligent-router/scripts/router_policy.py recommend "monitor alphastrike service" --alt
# Output: Tier: SIMPLE  Model: anthropic-proxy-4/glm-4.7  ← alternate key for load distribution

Audit all existing cron jobs

python3 skills/intelligent-router/scripts/router_policy.py audit
# Scans all crons, reports any with blocked or missing models

Show blocklist

python3 skills/intelligent-router/scripts/router_policy.py blocklist

Policy rules enforced

Model must be set — no model field = Sonnet default = expensive waste
No blocked models — ollama-gpu-server/* and bare ollama/* are rejected for cron use
CRITICAL tasks — warns if using a non-Opus model for classified-critical work

Installation (Core Skill Setup)

Run once to self-integrate into AGENTS.md:

bash skills/intelligent-router/install.sh

This patches AGENTS.md with the mandatory protocol so it's always in context.

CLI Reference

# ── Policy enforcer (run before creating any cron/spawn) ──
python3 skills/intelligent-router/scripts/router_policy.py check '{"kind":"agentTurn","model":"...","message":"..."}'
python3 skills/intelligent-router/scripts/router_policy.py recommend "task description"
python3 skills/intelligent-router/scripts/router_policy.py recommend "task" --alt  # alternate proxy key
python3 skills/intelligent-router/scripts/router_policy.py audit     # scan all crons
python3 skills/intelligent-router/scripts/router_policy.py blocklist

# ── Core router ──
# Classify + recommend model
python3 skills/intelligent-router/scripts/router.py classify "task"

# Get model id only (for scripting)
python3 skills/intelligent-router/scripts/spawn_helper.py --model-only "task"

# Show spawn command
python3 skills/intelligent-router/scripts/spawn_helper.py "task"

# Validate cron payload has model set
python3 skills/intelligent-router/scripts/spawn_helper.py --validate '{"kind":"agentTurn","message":"..."}'

# List all models by tier
python3 skills/intelligent-router/scripts/router.py models

# Detailed scoring breakdown
python3 skills/intelligent-router/scripts/router.py score "task"

# Config health check
python3 skills/intelligent-router/scripts/router.py health

# Auto-discover working models (NEW)
python3 skills/intelligent-router/scripts/discover_models.py

# Auto-discover + update config
python3 skills/intelligent-router/scripts/discover_models.py --auto-update

# Test specific tier only
python3 skills/intelligent-router/scripts/discover_models.py --tier COMPLEX

Scoring System

15-dimension weighted scoring (not just keywords):

Reasoning markers (0.18) — prove, theorem, derive
Code presence (0.15) — code blocks, file extensions
Multi-step patterns (0.12) — first...then, numbered lists
Agentic task (0.10) — run, fix, deploy, build
Technical terms (0.10) — architecture, security, protocol
Token count (0.08) — complexity from length
Creative markers (0.05) — story, compose, brainstorm
Question complexity (0.05) — multiple who/what/how
Constraint count (0.04) — must, require, exactly
Imperative verbs (0.03) — analyze, evaluate, audit
Output format (0.03) — json, table, markdown
Simple indicators (0.02) — check, get, show (inverted)
Domain specificity (0.02) — acronyms, dotted notation
Reference complexity (0.02) — "mentioned above"
Negation complexity (0.01) — not, never, except

Confidence: 1 / (1 + exp(-8 × (score - 0.5)))

Config

Models defined in config.json. Add new models there, router picks them up automatically. Local Ollama models have zero cost — always prefer them for SIMPLE tasks.

Auto-Discovery (Self-Healing)

The intelligent-router can automatically discover working models from all configured providers via real live inference tests (not config-existence checks).

How It Works

Provider Scanning: Reads ~/.openclaw/openclaw.json → finds all models
Live Inference Test: Sends "hi" to each model, checks it actually responds (catches auth failures, quota exhaustion, 404s, timeouts)
OAuth Bypass: Providers with sk-ant-oat01-* tokens (Anthropic OAuth) are skipped in raw HTTP — OpenClaw refreshes these transparently, so they're always marked available
Thinking Model Support: Models that return content=None + reasoning_content (GLM-4.7, Kimi-K2, Qwen3-thinking) are correctly detected as available
Auto-Classification: Tiers assigned via tier_classifier.py using 4 capability signals
Config Update: Removes unavailable models, rebuilds tier primaries from working set
Cron: Hourly refresh (cron id: a8992c1f) keeps model list current, alerts if availability changes by >2

Usage

# One-time discovery
python3 skills/intelligent-router/scripts/discover_models.py

# Auto-update config with working models only
python3 skills/intelligent-router/scripts/discover_models.py --auto-update

# Set up hourly refresh cron
openclaw cron add --job '{
  "name": "Model Discovery Refresh",
  "schedule": {"kind": "every", "everyMs": 3600000},
  "payload": {
    "kind": "systemEvent",
    "text": "Run: bash skills/intelligent-router/scripts/auto_refresh_models.sh",
    "model": "ollama/glm-4.7-flash"
  }
}'

Benefits

✅ Self-healing: Automatically removes broken models (e.g., expired OAuth) ✅ Zero maintenance: No manual model list updates ✅ New models: Auto-adds newly released models ✅ Cost optimization: Always uses cheapest working model per tier

Discovery Output

Results saved to skills/intelligent-router/discovered-models.json:

{
  "scan_timestamp": "2026-02-19T21:00:00",
  "total_models": 25,
  "available_models": 23,
  "unavailable_models": 2,
  "providers": {
    "anthropic": {
      "available": 2,
      "unavailable": 0,
      "models": [...]
    }
  }
}

Pinning Models

To preserve a model even if it fails discovery:

{
  "id": "special-model",
  "tier": "COMPLEX",
  "pinned": true  // Never remove during auto-update
}

⚠️ Known Gap — Proactive Health-Based Routing (2026-03-04)

Current router is reactive not proactive:

Fallback only fires AFTER a 429 is received
No awareness of concurrent sessions on same proxy
No cooldown tracking after rate-limit events

Needed improvements:

Track last-429 timestamp per provider → skip if within cooldown window
Track active concurrent spawns per provider → if >1 active, route to OAuth
Before spawning N parallel agents, check if single provider can handle N concurrent
Expose router.get_best_available(n_concurrent=2) API

如何使用「Intelligent Model Router」？

打开小龙虾AI（Web 或 iOS App）
点击上方「立即使用」按钮，或在对话框中输入任务描述
小龙虾AI 会自动匹配并调用「Intelligent Model Router」技能完成任务
结果即时呈现，支持继续对话优化