Monitoring

Set up observability for applications and infrastructure with metrics, logs, traces, and alerts.

Version: 1.0.0
Category: Developer tools

Skill Description


name: Monitoring
description: "Set up observability for applications and infrastructure with metrics, logs, traces, and alerts."

Complexity Levels

| Level | Tools | Setup Time | Best For |
| --- | --- | --- | --- |
| Minimal | UptimeRobot, Healthchecks.io | 15 min | Side projects, MVPs |
| Standard | Uptime Kuma, Sentry, basic Grafana | 1-2 hours | Small teams, startups |
| Professional | Prometheus, Grafana, Loki, Alertmanager | 1-2 days | Production systems |
| Enterprise | Datadog, New Relic, or full OSS stack | Ongoing | Large-scale operations |

The Three Pillars

| Pillar | What It Answers | Tools |
| --- | --- | --- |
| Metrics | "How is the system performing?" | Prometheus, Grafana, Datadog |
| Logs | "What happened?" | Loki, ELK, CloudWatch |
| Traces | "Why is this request slow?" | Jaeger, Tempo, Sentry |

Quick Start by Use Case

"I just want to know if it's down" → UptimeRobot (free) or Uptime Kuma (self-hosted). See simple.md.

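To make "is it down?" concrete, here is a minimal sketch of a single external probe using only the Python standard library. The helper name `check_http` and the 5-second timeout are assumptions for illustration; they are not part of UptimeRobot or Uptime Kuma, which run this kind of check on a schedule and from outside your network.

```python
import time
import urllib.error
import urllib.request


def check_http(url: str, timeout: float = 5.0) -> dict:
    """Probe a URL once and report whether it answered with a 2xx/3xx status.

    Hypothetical helper for illustration only.
    """
    started = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            status = response.status
            up = 200 <= status < 400
            error = None
    except urllib.error.HTTPError as exc:
        # The server answered, but with a 4xx/5xx status.
        status, up, error = exc.code, False, str(exc)
    except (urllib.error.URLError, OSError) as exc:
        # DNS failure, refused connection, timeout, TLS error, and so on.
        status, up, error = None, False, str(exc)
    return {
        "url": url,
        "up": up,
        "status": status,
        "latency_s": round(time.monotonic() - started, 3),
        "error": error,
    }
```

Run it from a machine outside the one being checked (for example via cron); a probe that lives on the same host goes down together with the service.
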
"I need to debug production errors" → Sentry with your framework SDK. 5-minute setup. See apm.md.

"I want real observability" → Prometheus + Grafana + Loki. See prometheus.md.

"I need to centralize logs" → Loki for simple, ELK for complex queries. See logs.md.

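Centralized log stores are easier to query when every line is structured. Below is a minimal sketch of JSON-lines logging with Python's standard `logging` module. The `request_id` field and the `build_logger` helper are hypothetical conventions chosen for this example, not something Loki or ELK requires.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "ts": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Assumed convention: callers pass extra={"request_id": "..."}.
        if hasattr(record, "request_id"):
            payload["request_id"] = record.request_id
        return json.dumps(payload)


def build_logger(name: str, stream=None) -> logging.Logger:
    """Create a logger that writes JSON lines to `stream` (stderr by default)."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    logger.propagate = False  # avoid duplicate lines via the root logger
    return logger
```

Usage: `build_logger("orders").info("payment failed", extra={"request_id": "abc123"})` writes a single JSON line that a log shipper can forward unchanged.
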
What to Monitor

Applications (RED Method)

  • Rate — requests per second
  • Errors — error rate by endpoint
  • Duration — latency (p50, p95, p99)
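
The three RED signals can be computed from raw request records. This is a sketch under stated assumptions: the `Request` shape is hypothetical, a status of 500 or above counts as an error, and percentiles use the nearest-rank method. A metrics system such as Prometheus would normally derive these from counters and histograms instead.

```python
import math
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Request:
    endpoint: str
    status: int          # HTTP status code
    duration_ms: float


def percentile(sorted_values: list, pct: float) -> float:
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def red_summary(requests: list, window_s: float) -> dict:
    """Summarize Rate, Errors, and Duration for one time window."""
    if not requests:
        raise ValueError("no requests in window")
    durations = sorted(r.duration_ms for r in requests)
    totals = defaultdict(int)
    failures = defaultdict(int)
    for r in requests:
        totals[r.endpoint] += 1
        if r.status >= 500:  # assumption: only 5xx counts as an error
            failures[r.endpoint] += 1
    return {
        "rate_rps": len(requests) / window_s,
        "error_rate_by_endpoint": {
            ep: failures[ep] / n for ep, n in totals.items()
        },
        "p50_ms": percentile(durations, 50),
        "p95_ms": percentile(durations, 95),
        "p99_ms": percentile(durations, 99),
    }
```

Percentiles matter more than averages here: a handful of very slow requests barely moves the mean but shows up clearly in p95 and p99.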

Infrastructure (USE Method)

  • Utilization — CPU, memory, disk usage
  • Saturation — queue depth, load average
  • Errors — hardware/system errors
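
A couple of USE signals can be read with the Python standard library alone. The sketch below is Unix-only (`os.getloadavg` is not available on Windows), and the saturation threshold of 1.0 runnable task per CPU is an assumed rule of thumb, not a fixed standard. Hardware and system errors usually come from kernel logs or an agent and are left out.

```python
import os
import shutil


def is_saturated(load_1m: float, cpus: int, threshold: float = 1.0) -> bool:
    """True when the 1-minute load per CPU exceeds the assumed threshold."""
    return load_1m / cpus > threshold


def use_snapshot(path: str = "/") -> dict:
    """Collect disk utilization and CPU saturation for the local host."""
    disk = shutil.disk_usage(path)
    load_1m, _load_5m, _load_15m = os.getloadavg()  # Unix only
    cpus = os.cpu_count() or 1
    return {
        "disk_utilization": disk.used / disk.total,  # fraction, 0.0 to 1.0
        "load_1m": load_1m,
        "cpus": cpus,
        "saturated": is_saturated(load_1m, cpus),
    }
```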

Alerting Principles

| Do | Don't |
| --- | --- |
| Alert on symptoms (user impact) | Alert on causes (CPU high) |
| Include runbook link | Require investigation to understand |
| Set appropriate severity | Make everything P1 |
| Require action | Alert on "interesting" metrics |

Alert fatigue kills monitoring. If alerts are ignored, you have no monitoring.

For alert configuration, severities, and on-call setup, see alerting.md.
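
The principles above can be enforced in code. The sketch below models a symptom-based alert that refuses to exist without a runbook link and carries an explicit severity. The class names, the two severity levels, and the example URL are hypothetical; real rules would live in your alerting tool's own format.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Severity(Enum):
    PAGE = "page"      # wake someone up now
    TICKET = "ticket"  # handle during working hours


@dataclass(frozen=True)
class AlertRule:
    name: str
    threshold: float   # symptom threshold, e.g. fraction of failed requests
    severity: Severity
    runbook_url: str

    def __post_init__(self):
        # Every alert must tell the responder what to do next.
        if not self.runbook_url:
            raise ValueError(f"alert {self.name!r} needs a runbook link")

    def evaluate(self, observed: float) -> Optional[str]:
        """Return a notification message, or None when nothing should fire."""
        if observed <= self.threshold:
            return None
        return (
            f"[{self.severity.value}] {self.name}: "
            f"{observed:.1%} > {self.threshold:.1%} | runbook: {self.runbook_url}"
        )
```

For example, `AlertRule("checkout error rate", 0.02, Severity.PAGE, "https://example.com/runbooks/checkout")` fires on what users experience (failed checkouts) rather than on a cause such as high CPU.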

Cost Comparison

| Solution | Monthly Cost (small) | Monthly Cost (medium) |
| --- | --- | --- |
| UptimeRobot | Free | $7 |
| Uptime Kuma | $5 (VPS) | $5 (VPS) |
| Sentry | Free / $26 | $80 |
| Grafana Cloud | Free tier | $50+ |
| Datadog | $15/host | $23/host + features |
| Self-hosted stack | $10-20 (VPS) | $50-100 (VPS) |
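
Per-host pricing scales differently from a flat VPS bill, so it helps to work out where the two cross. This is a rough sketch using the $15/host figure from the table as a default; the function names are made up for the example, and it ignores the engineering time a self-hosted stack needs.

```python
def per_host_monthly(hosts: int, price_per_host: float = 15.0) -> float:
    """Monthly bill for a per-host priced service."""
    return hosts * price_per_host


def break_even_hosts(flat_monthly: float, price_per_host: float = 15.0) -> float:
    """Host count at which per-host pricing equals a flat monthly cost."""
    return flat_monthly / price_per_host
```

With a flat cost of $60 per month, `break_even_hosts(60)` gives 4 hosts: below that the per-host service is cheaper on paper, above it the flat-cost stack is.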

Common Mistakes

  • Starting with Prometheus/Grafana when Uptime Kuma would suffice
  • No alerting (dashboards nobody watches)
  • Too many alerts (alert fatigue → ignored)
  • Missing runbooks (alert fires, nobody knows what to do)
  • Not monitoring from outside (only internal checks)
  • Storing logs forever (cost explodes)

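How do I use "Monitoring"?

  1. Open 小龙虾AI (web or iOS app)
  2. Click the "Use now" button above, or type a task description into the chat box
  3. 小龙虾AI automatically matches and invokes the "Monitoring" skill to complete the task
  4. Results appear immediately, and you can keep chatting to refine them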
