# PyTorch

Avoid common PyTorch mistakes — train/eval mode, gradient leaks, device mismatches, and checkpoint gotchas.
## Skill Description

- **name:** PyTorch
- **description:** Avoid common PyTorch mistakes — train/eval mode, gradient leaks, device mismatches, and checkpoint gotchas.
- **metadata:** `{"clawdbot":{"emoji":"🔥","requires":{"bins":["python3"]},"os":["linux","darwin","win32"]}}`
## Train vs Eval Mode

- `model.train()` enables dropout, BatchNorm updates — default after init
- `model.eval()` disables dropout, uses running stats — MUST call for inference
- Mode is sticky — train/eval persists until explicitly changed
- `model.eval()` doesn't disable gradients — still need `torch.no_grad()`
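The last two points combine in practice: a minimal sketch (layer sizes are arbitrary) showing that `eval()` alone leaves gradient tracking on.

```python
import torch
import torch.nn as nn

# Small model with dropout, so train/eval mode actually changes behavior.
model = nn.Sequential(nn.Linear(4, 8), nn.Dropout(p=0.5), nn.Linear(8, 2))

model.train()   # dropout active (also the state right after init)
model.eval()    # dropout off; BatchNorm layers would use running stats

x = torch.randn(3, 4)

# eval() alone does NOT stop gradient tracking:
y = model(x)
print(y.requires_grad)   # True: a graph is still being built

# Combine eval() with no_grad() for inference:
with torch.no_grad():
    y = model(x)
print(y.requires_grad)   # False: no graph, less memory
```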
## Gradient Control

- `torch.no_grad()` for inference — reduces memory, speeds up computation
- `loss.backward()` accumulates gradients — call `optimizer.zero_grad()` before backward
- `zero_grad()` placement matters — before the forward pass, not after backward
- `.detach()` to stop gradient flow — prevents memory leak in logging
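A training step that puts these rules together; the model, data, and learning rate are placeholders.

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
x, target = torch.randn(8, 4), torch.randn(8, 1)

losses = []
for step in range(3):
    optimizer.zero_grad()            # clear grads BEFORE backward, or they accumulate
    loss = nn.functional.mse_loss(model(x), target)
    loss.backward()                  # adds into each parameter's .grad
    optimizer.step()
    losses.append(loss.detach())     # detach before logging: drops the graph
```

Logging `loss` without `.detach()` (or `.item()`) keeps every step's computation graph alive, which is a classic slow memory leak.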
Device Management
- Model AND data must be on same device —
model.to(device)andtensor.to(device) .cuda()vs.to('cuda')— both work,.to(device)more flexible- CUDA tensors can't convert to numpy directly —
.cpu().numpy()required torch.device('cuda' if torch.cuda.is_available() else 'cpu')— portable code
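A device-portable sketch, assuming nothing about whether CUDA is present; the model and shapes are arbitrary.

```python
import torch
import torch.nn as nn

# Portable device selection: falls back to CPU when CUDA is absent.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

model = nn.Linear(4, 2).to(device)   # move parameters
x = torch.randn(5, 4).to(device)     # move data to the SAME device

y = model(x)                         # a mismatch here would raise RuntimeError

# Tensors must come back to CPU before numpy conversion:
arr = y.detach().cpu().numpy()
print(arr.shape)   # (5, 2)
```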
## DataLoader

- `num_workers > 0` uses multiprocessing — Windows needs `if __name__ == '__main__':`
- `pin_memory=True` with CUDA — faster transfer to GPU
- Workers don't share state — random seeds differ per worker, set them in `worker_init_fn`
- Large `num_workers` can cause memory issues — start with 2-4, increase if CPU-bound
## Saving and Loading

- `torch.save(model.state_dict(), path)` — recommended, saves only weights
- Loading: create the model first, then `model.load_state_dict(torch.load(path))`
- `map_location` for cross-device loads — `torch.load(path, map_location='cpu')` if saved on GPU
- Saving the whole model pickles the code path — breaks if the code changes
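The recommended round trip might look like this; the file path and layer sizes are placeholders.

```python
import os
import tempfile

import torch
import torch.nn as nn

path = os.path.join(tempfile.gettempdir(), 'weights.pt')   # placeholder path

model = nn.Linear(4, 2)
torch.save(model.state_dict(), path)   # weights only, no pickled class

# Loading: rebuild the architecture in code, THEN fill in the weights.
restored = nn.Linear(4, 2)
restored.load_state_dict(torch.load(path, map_location='cpu'))

print(torch.equal(model.weight, restored.weight))   # True
```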
## In-place Operations

- In-place ops end with `_` — `tensor.add_(1)` vs `tensor.add(1)`
- In-place on a leaf variable breaks autograd — raises an error about a modified leaf
- In-place on an intermediate can corrupt gradients — avoid inside the computation graph
- `tensor.data` bypasses autograd — legacy, prefer `.detach()` for safety
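A quick demonstration of the in-place rules, using a throwaway tensor.

```python
import torch

t = torch.ones(3)
t.add_(1)                  # trailing underscore: modifies t itself
out = t.add(1)             # out-of-place: t unchanged, new tensor returned
print(t)                   # tensor([2., 2., 2.])

# In-place on a leaf that requires grad raises immediately:
leaf = torch.ones(3, requires_grad=True)
try:
    leaf.add_(1)
except RuntimeError as e:
    print('autograd error:', e)

# Read without tracking: prefer .detach() over the legacy .data
detached = leaf.detach()
print(detached.requires_grad)   # False
```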
## Memory Management

- Accumulated tensors leak memory — `.detach()` logged metrics before storing them
- `torch.cuda.empty_cache()` releases cached memory — but doesn't fix leaks
- Delete references and call `gc.collect()` — before `empty_cache()` if needed
- `with torch.no_grad():` prevents graph storage — crucial for validation loops
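A validation loop following these rules; the model and batches are stand-ins.

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 1)
batches = [(torch.randn(8, 4), torch.randn(8, 1)) for _ in range(5)]

model.eval()
total = 0.0
with torch.no_grad():                    # nothing is kept for backward
    for x, target in batches:
        loss = nn.functional.mse_loss(model(x), target)
        total += loss.item()             # .item(): plain float, tensor can be freed

print(total / len(batches))              # average validation loss
# On CUDA, if memory must actually be released: drop references first,
# then gc.collect() and torch.cuda.empty_cache() (neither fixes real leaks).
```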
## Common Mistakes

- BatchNorm with `batch_size=1` fails in train mode — use eval mode or `track_running_stats=False`
- Loss function reduction defaults to `'mean'` — you may want `'sum'` for gradient accumulation
- `cross_entropy` expects logits — not softmax output
- `.item()` to get a Python scalar — `.numpy()` or `[0]` is deprecated or errors
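Sketches of the `cross_entropy` and BatchNorm pitfalls, with toy shapes.

```python
import torch
import torch.nn as nn

logits = torch.randn(4, 3)             # raw scores, NOT softmax output
targets = torch.tensor([0, 2, 1, 0])

# cross_entropy applies log_softmax internally, so pass logits directly.
loss = nn.functional.cross_entropy(logits, targets)                  # 'mean'
loss_sum = nn.functional.cross_entropy(logits, targets, reduction='sum')
print(loss.item())                     # .item() is the scalar accessor

# BatchNorm with a batch of one fails in train mode:
bn = nn.BatchNorm1d(3)
try:
    bn(torch.randn(1, 3))
except (ValueError, RuntimeError) as e:
    print('batch_size=1:', e)
```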