
LLM Supervisor

Graceful rate limit handling with Ollama fallback. Notifies on rate limits, offers local model switch with confirmation for code tasks.

Downloads: 2.2k · Stars: 2 · Version: 0.3.0 · Category: Dev Tools · Security: Passed

Skill Description


---
name: llm-supervisor
description: Graceful rate limit handling with Ollama fallback. Notifies on rate limits, offers local model switch with confirmation for code tasks.
---

LLM Supervisor 🔮

Handles rate limits and model fallbacks gracefully.

Behavior

On Rate Limit / Overload Errors

When I encounter rate limits or overload errors from cloud providers (Anthropic, OpenAI):

  1. Tell the user immediately — Don't silently fail or retry endlessly
  2. Offer local fallback — Ask if they want to switch to Ollama
  3. Wait for confirmation — Never auto-switch for code generation tasks
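The three steps above can be sketched as a small shell handler. This is a minimal sketch, not the skill's actual implementation: `handle_provider_error` is a hypothetical helper, and 429/529 are the usual rate-limit and overload HTTP statuses returned by Anthropic and OpenAI.

```shell
# Hypothetical sketch of the rate-limit flow: notify, offer fallback,
# wait for explicit confirmation before switching.
handle_provider_error() {
  local http_status="$1"
  case "$http_status" in
    429|529)
      # 1. Tell the user immediately -- no silent retries
      echo "Cloud provider returned $http_status (rate limit/overload)."
      # 2. Offer the local fallback
      echo "Cloud is rate-limited. Switch to local Ollama (qwen2.5:7b)? Reply 'yes' to confirm."
      # 3. Wait for confirmation; never auto-switch for code tasks
      read -r answer
      if [ "$answer" = "yes" ]; then
        echo "Switching to local provider."
      else
        echo "Staying on cloud; will retry later."
      fi
      ;;
    *)
      echo "Unhandled status: $http_status"
      ;;
  esac
}
```

Keeping the confirmation as a blocking `read` mirrors the rule that the switch must never happen without an explicit "yes" from the user.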

Confirmation Required

Before using local models for code generation, ask:

"Cloud is rate-limited. Switch to local Ollama (qwen2.5:7b)? Reply 'yes' to confirm."

For simple queries (chat, summaries), the skill may switch without confirmation if the user has previously approved local fallback.

Commands

/llm status

Report current state:

  • Which provider is active (cloud/local)
  • Ollama availability and models
  • Recent rate limit events
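A status report covering those three items could look like the following sketch. The state variables are assumptions mirroring the State Tracking section below; the Ollama check uses the real `ollama list` command when it is installed.

```shell
# Hypothetical "/llm status" report. State variables are illustrative.
current_provider="cloud"   # "cloud" | "local"
last_rate_limit_at=""      # ISO timestamp, or empty for none

llm_status() {
  echo "Active provider: $current_provider"
  if command -v ollama >/dev/null 2>&1; then
    echo "Ollama: available"
    # Skip the header row of `ollama list` and print model names
    ollama list 2>/dev/null | tail -n +2 | awk '{print "  model: " $1}'
  else
    echo "Ollama: not installed"
  fi
  if [ -n "$last_rate_limit_at" ]; then
    echo "Last rate limit: $last_rate_limit_at"
  else
    echo "No recent rate limit events"
  fi
}
```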

/llm switch local

Manually switch to Ollama for the session.

/llm switch cloud

Switch back to cloud provider.
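The two switch commands could be backed by a single function like this sketch; `llm_switch` and `current_provider` are hypothetical names, not part of the skill's documented interface.

```shell
# Hypothetical backing for "/llm switch local" and "/llm switch cloud".
llm_switch() {
  case "$1" in
    local) current_provider="local"; echo "Switched to Ollama for this session." ;;
    cloud) current_provider="cloud"; echo "Switched back to cloud provider." ;;
    *)     echo "Usage: llm_switch local|cloud" >&2; return 1 ;;
  esac
}
```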

Using Ollama

```shell
# Check available models
ollama list

# Run a query
ollama run qwen2.5:7b "your prompt here"

# For longer prompts, use stdin
echo "your prompt" | ollama run qwen2.5:7b
```

Installed Models

Check with ollama list. Configured default: qwen2.5:7b

State Tracking

Track in memory during session:

  • currentProvider: "cloud" | "local"
  • lastRateLimitAt: timestamp or null
  • localConfirmedForCode: boolean

Reset to cloud at session start.
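The three fields above can be modeled directly as shell variables; this is a sketch of one possible in-session representation, with helper names invented for illustration.

```shell
# Session state mirroring the fields in the State Tracking section.
reset_state() {
  currentProvider="cloud"        # reset to cloud at session start
  lastRateLimitAt=""             # empty stands in for null
  localConfirmedForCode="false"
}

# Record a rate-limit event with a UTC timestamp.
record_rate_limit() {
  lastRateLimitAt=$(date -u +%Y-%m-%dT%H:%M:%SZ)
}

# Called only after the user explicitly confirms local code generation.
confirm_local_for_code() {
  localConfirmedForCode="true"
  currentProvider="local"
}

reset_state
```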

How to use "LLM Supervisor"?

  1. Open 小龙虾AI (Web or iOS App)
  2. Tap the "Use Now" button above, or type a task description in the chat box
  3. 小龙虾AI automatically matches and invokes the "LLM Supervisor" skill to complete the task
  4. Results appear instantly, and you can keep refining them in the conversation
