
MLOps

Deploy ML models to production with pipelines, monitoring, serving, and reproducibility best practices.


Skill Description


name: MLOps
slug: mlops
version: 1.0.0
description: "Deploy ML models to production with pipelines, monitoring, serving, and reproducibility best practices."
metadata: {"clawdbot":{"emoji":"🤖","requires":{"bins":[]},"os":["linux","darwin","win32"]}}

Quick Reference

| Topic | File | Key Trap |
|---|---|---|
| CI/CD and DAGs | pipelines.md | Coupling training/inference deps |
| Model serving | serving.md | Cold start with large models |
| Drift and alerts | monitoring.md | Only technical metrics |
| Versioning | reproducibility.md | Not versioning preprocessing |
| GPU infrastructure | gpu.md | GPU request = full device |

Critical Traps

Training-Serving Skew:

  • Preprocessing in notebook ≠ preprocessing in service → silent bugs
  • Pandas code carried from the notebook into the service → memory leaks in production (use native types on the request path)
  • Feature store values at training time ≠ serving time without proper joins
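One common way to reduce the skew described above is to keep preprocessing in a single function that both the training job and the service import. The sketch below is only an illustration under assumptions: the feature names (`age`, `income`), the scaling choices, and the helper names `build_training_matrix` and `handle_request` are hypothetical and do not come from this skill.

```python
import math
from typing import Dict, List

# Hypothetical feature order; training and serving must agree on it.
FEATURE_ORDER = ["age", "income"]


def preprocess(record: Dict[str, float]) -> List[float]:
    """Turn one raw record into a feature vector using plain Python types."""
    age = float(record.get("age", 0.0))
    income = float(record.get("income", 0.0))
    # log1p keeps a zero income valid; negative incomes are clamped to zero.
    return [age / 100.0, math.log1p(max(income, 0.0))]


def build_training_matrix(records: List[Dict[str, float]]) -> List[List[float]]:
    """Training path: apply the shared function to every historical record."""
    return [preprocess(r) for r in records]


def handle_request(payload: Dict[str, float]) -> List[float]:
    """Serving path: apply the exact same function to one incoming request."""
    return preprocess(payload)
```

Because both paths call `preprocess`, a change to the transformation cannot reach training without also reaching serving. The function works on dicts and floats, so the request path does not need a DataFrame.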

GPU Memory:

  • nvidia.com/gpu: 1 (set under resources.limits) reserves the ENTIRE GPU, not a slice of its memory
  • MIG/MPS sharing has real limitations (not plug-and-play)
  • OOM on GPU kills pod with no useful logs
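The whole-device rule above can be enforced before a manifest is ever submitted. The following is a minimal sketch, assuming the container spec is assembled as a Python dict; the helper name `gpu_resources` is hypothetical. It relies on two documented Kubernetes behaviours: GPU counts must be whole numbers, and a GPU limit is used as the request when no request is given.

```python
from typing import Dict

# Extended resource name exposed by the NVIDIA device plugin.
GPU_RESOURCE = "nvidia.com/gpu"


def gpu_resources(gpu_count: int) -> Dict[str, Dict[str, int]]:
    """Build the `resources` section of a container spec for whole GPUs."""
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(gpu_count, bool) or not isinstance(gpu_count, int):
        raise TypeError("GPU count must be a whole number; fractions are not allowed")
    if gpu_count < 1:
        raise ValueError("request at least one GPU, or omit the resource entirely")
    # GPUs are declared under limits; the limit doubles as the request.
    return {"limits": {GPU_RESOURCE: gpu_count}}
```

Calling `gpu_resources(1)` yields a section that claims one full device. Passing `0.5` fails early with a `TypeError`, so the mistake surfaces before the manifest reaches the cluster.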

Model Versioning ≠ Code Versioning:

  • Model artifacts need separate versioning (MLflow, W&B, DVC)
  • Training data version + preprocessing version + code version = reproducibility
  • Rollback requires keeping old model versions deployable
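The "data version + preprocessing version + code version" equation above can be made concrete by hashing the three identifiers into one fingerprint stored next to the model artifact. This is a sketch under assumptions: the function name `run_fingerprint` and the example version strings are hypothetical, and a tracking tool such as MLflow, W&B, or DVC would normally record these values for you.

```python
import hashlib
import json


def run_fingerprint(data_version: str, preprocessing_version: str, code_version: str) -> str:
    """Combine the three versions into one short, deterministic identifier."""
    payload = json.dumps(
        {
            "data": data_version,
            "preprocessing": preprocessing_version,
            "code": code_version,
        },
        sort_keys=True,  # stable key order keeps the hash reproducible
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


# Hypothetical identifiers: a dataset snapshot tag, a preprocessing tag, a git SHA.
fingerprint = run_fingerprint("sales-2024-05", "prep-v3", "9f1c2ab")
```

Two runs share a fingerprint only when all three inputs match. A change to preprocessing alone therefore produces a different identifier even if the code commit is unchanged.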

Drift Detection Timing:

  • Retraining trigger isn't just "drift > threshold" → cost/benefit matters
  • Delayed ground truth makes concept drift detection lag by weeks
  • Upstream data pipeline changes cause drift without model issues
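The first point above, that a drift score alone should not trigger retraining, can be sketched as a two-part check. The Population Stability Index used here is one common drift score chosen purely for illustration; the skill does not name a specific metric. The function names and the `expected_gain` and `retrain_cost` parameters are hypothetical and must be expressed in the same unit (for example, currency per month).

```python
import math
from typing import Sequence


def psi(expected: Sequence[float], actual: Sequence[float], eps: float = 1e-6) -> float:
    """Population Stability Index between two binned distributions (proportions)."""
    if len(expected) != len(actual):
        raise ValueError("both distributions must use the same bins")
    total = 0.0
    for e, a in zip(expected, actual):
        # Clamp empty bins so the logarithm stays defined.
        e, a = max(e, eps), max(a, eps)
        total += (a - e) * math.log(a / e)
    return total


def should_retrain(
    drift_score: float,
    drift_threshold: float,
    expected_gain: float,
    retrain_cost: float,
) -> bool:
    """Retrain only if drift is significant AND the gain outweighs the cost."""
    return drift_score > drift_threshold and expected_gain > retrain_cost
```

With this shape, a large drift score on a low-value model still returns `False` when `retrain_cost` exceeds `expected_gain`. Drift caused by an upstream pipeline change should be fixed at the source rather than routed through this check.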

Scope

This skill ONLY covers:

  • CI/CD pipelines for models
  • Model serving and scaling
  • Monitoring and drift detection
  • Reproducibility practices
  • GPU infrastructure patterns

Does NOT cover: ML algorithms, feature engineering, hyperparameter tuning.

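How do I use "MLOps"?

  1. Open 小龙虾AI (Web or iOS app)
  2. Click the "Use Now" button above, or type a task description into the chat box
  3. 小龙虾AI automatically matches and invokes the "MLOps" skill to complete the task
  4. Results appear immediately, and you can continue the conversation to refine them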
