LLM Supervisor
Graceful rate limit handling with Ollama fallback. Notifies on rate limits, offers local model switch with confirmation for code tasks.
Security check passed
💬Prompt
Skill description
name: llm-supervisor
description: Graceful rate limit handling with Ollama fallback. Notifies on rate limits, offers local model switch with confirmation for code tasks.
LLM Supervisor 🔮
Handles rate limits and model fallbacks gracefully.
Behavior
On Rate Limit / Overload Errors
When I encounter rate limits or overload errors from cloud providers (Anthropic, OpenAI):
- Tell the user immediately — Don't silently fail or retry endlessly
- Offer local fallback — Ask if they want to switch to Ollama
- Wait for confirmation — Never auto-switch for code generation tasks
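The three steps above can be sketched as a small classifier that turns a provider error into a user-facing notice. This is a minimal sketch, not the skill's actual implementation: the function names are hypothetical, and treating HTTP 429 as a rate limit and 503/529 as overload is an assumption that may need adjusting per provider.

```python
from typing import Optional

# Assumption: 429 signals a rate limit; 503 and 529 signal provider overload.
RATE_LIMIT_STATUSES = {429}
OVERLOAD_STATUSES = {503, 529}


def classify_provider_error(status_code: int) -> Optional[str]:
    """Return "rate_limit", "overload", or None for unrelated errors."""
    if status_code in RATE_LIMIT_STATUSES:
        return "rate_limit"
    if status_code in OVERLOAD_STATUSES:
        return "overload"
    return None


def fallback_notice(provider: str, status_code: int) -> Optional[str]:
    """Build the message shown to the user, or None if no fallback applies.

    The caller shows this message and then waits for the user's reply;
    nothing here switches providers on its own.
    """
    kind = classify_provider_error(status_code)
    if kind is None:
        return None
    reason = "rate-limited" if kind == "rate_limit" else "overloaded"
    return f"{provider} is {reason}. Switch to local Ollama?"
```

Keeping classification separate from the switch itself means the user is told about the error first and nothing changes until they reply.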
Confirmation Required
Before using local models for code generation, ask:
"Cloud is rate-limited. Switch to local Ollama (qwen2.5:7b)? Reply 'yes' to confirm."
For simple queries (chat, summaries), I can switch without asking again if the user has previously approved local fallback.
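One way to express the confirmation rule in code is shown below. It is a sketch under stated assumptions: `Approvals`, `must_ask`, and the `local_approved_for_simple` flag are hypothetical names, and it assumes that a "yes" for code tasks is remembered for the rest of the session, as the `localConfirmedForCode` flag in the state section suggests.

```python
from dataclasses import dataclass

DEFAULT_MODEL = "qwen2.5:7b"


@dataclass
class Approvals:
    local_confirmed_for_code: bool = False   # user replied "yes" for code tasks
    local_approved_for_simple: bool = False  # hypothetical flag for chat/summaries


def confirmation_prompt(model: str = DEFAULT_MODEL) -> str:
    """Return the question asked before switching to the local model."""
    return (
        f"Cloud is rate-limited. Switch to local Ollama ({model})? "
        "Reply 'yes' to confirm."
    )


def is_yes(reply: str) -> bool:
    """Only an explicit 'yes' counts as confirmation."""
    return reply.strip().lower() == "yes"


def must_ask(task_kind: str, approvals: Approvals) -> bool:
    """Decide whether to ask before using the local model.

    Code tasks need their own confirmation; approval given for simple
    queries never carries over to code generation.
    """
    if task_kind == "code":
        return not approvals.local_confirmed_for_code
    return not approvals.local_approved_for_simple
```

For example, after the user approves local fallback for a chat query, `must_ask("chat", approvals)` returns `False` while `must_ask("code", approvals)` still returns `True`.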
Commands
/llm status
Report current state:
- Which provider is active (cloud/local)
- Ollama availability and models
- Recent rate limit events
/llm switch local
Manually switch to Ollama for the session.
/llm switch cloud
Switch back to cloud provider.
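The three commands could be dispatched roughly as follows. This is an illustrative sketch with hypothetical function names; the status reply here reports only the active provider and leaves out Ollama availability and recent rate limit events.

```python
from typing import Optional, Tuple


def parse_llm_command(text: str) -> Optional[Tuple[str, Optional[str]]]:
    """Parse "/llm status" or "/llm switch local|cloud"; None if unrecognized."""
    parts = text.strip().split()
    if not parts or parts[0] != "/llm":
        return None
    if parts[1:] == ["status"]:
        return ("status", None)
    if len(parts) == 3 and parts[1] == "switch" and parts[2] in ("local", "cloud"):
        return ("switch", parts[2])
    return None


def apply_command(current_provider: str, text: str) -> Tuple[str, str]:
    """Return (new_provider, reply_to_user) for a command string."""
    parsed = parse_llm_command(text)
    if parsed is None:
        return current_provider, (
            "Unknown command. Try /llm status, /llm switch local, "
            "or /llm switch cloud."
        )
    action, target = parsed
    if action == "status" or target is None:
        return current_provider, f"Active provider: {current_provider}"
    return target, f"Switched to {target} for this session."
```

Because `apply_command` returns the new provider rather than mutating anything, the caller decides where session state lives.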
Using Ollama
# Check available models
ollama list
# Run a query
ollama run qwen2.5:7b "your prompt here"
# For longer prompts, use stdin
echo "your prompt" | ollama run qwen2.5:7b
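The same commands can be driven from a script. The sketch below wraps `ollama list` output parsing and `ollama run` with the prompt passed on stdin; the helper names and the 120-second timeout are hypothetical, and the parser assumes `ollama list` prints a header row followed by one model per row with the name in the first column.

```python
import shutil
import subprocess
from typing import List

DEFAULT_MODEL = "qwen2.5:7b"


def ollama_available() -> bool:
    """True if an `ollama` executable is found on PATH."""
    return shutil.which("ollama") is not None


def parse_model_names(list_output: str) -> List[str]:
    """Extract model names from `ollama list` output.

    Assumes a header row, then one model per row with the name first.
    """
    rows = [line for line in list_output.splitlines() if line.strip()]
    return [row.split()[0] for row in rows[1:]]


def run_local(prompt: str, model: str = DEFAULT_MODEL, timeout: float = 120.0) -> str:
    """Send the prompt to `ollama run MODEL` on stdin and return its stdout.

    Raises subprocess.CalledProcessError on a non-zero exit code and
    subprocess.TimeoutExpired if the model takes longer than `timeout`.
    """
    result = subprocess.run(
        ["ollama", "run", model],
        input=prompt,
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    return result.stdout.strip()
```

Passing the prompt on stdin mirrors the `echo ... | ollama run` form above and avoids shell quoting problems with long or multi-line prompts.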
Installed Models
Check with ollama list. Configured default: qwen2.5:7b
State Tracking
Track in memory during session:
currentProvider: "cloud" | "local"
lastRateLimitAt: timestamp or null
localConfirmedForCode: boolean
Reset to cloud at session start.
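As an illustration, the session state could be held in a small dataclass. This is a sketch, not the skill's own code: the Python field names are snake_case renderings of the fields listed above, `record_rate_limit` and `start_session` are hypothetical helpers, and storing the timestamp as Unix seconds is an assumption.

```python
import time
from dataclasses import dataclass
from typing import Literal, Optional

Provider = Literal["cloud", "local"]


@dataclass
class SupervisorState:
    current_provider: Provider = "cloud"
    last_rate_limit_at: Optional[float] = None  # Unix seconds, or None if never hit
    local_confirmed_for_code: bool = False

    def record_rate_limit(self, now: Optional[float] = None) -> None:
        """Remember when the most recent rate limit happened."""
        self.last_rate_limit_at = time.time() if now is None else now


def start_session() -> SupervisorState:
    """Each session starts on cloud with no recorded events or confirmations."""
    return SupervisorState()
```

Creating a fresh object per session, instead of clearing fields one by one, keeps the reset to cloud from missing a field.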
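How do I use "LLM Supervisor"?
- Open 小龙虾AI (Web or iOS app)
- Click the "Use Now" button above, or type a task description into the chat box
- 小龙虾AI automatically matches the task and invokes the "LLM Supervisor" skill to complete it
- Results appear immediately, and you can keep chatting to refine them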