Simulated Annealing · Claude Code Engineer's Handbook

Simulated Annealing

Physics / Optimization Algorithms · Used in the Prompt Tuning Protocol

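Core Formula

P(accept a worse solution) = exp(ΔE / T)
ΔE = new prompt score − current prompt score (negative = worse)
T = current temperature (cooled from 1.0 toward 0)
High T → large P → worse solutions are often accepted → big jumps
Low T → small P → almost only better solutions are accepted → fine-grained tuning

Key intuition: you can only escape a local optimum if you allow yourself to "take a wrong turn". Pure greedy search stays stuck on the first peak it climbs.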
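
To make the formula concrete, here is a minimal sketch of the acceptance probability, assuming the sign convention above (ΔE = new score − current score, so a worse candidate has a negative ΔE). The function name `acceptanceProbability` is illustrative and does not come from the handbook.

```typescript
// Acceptance probability for a candidate prompt.
// delta = newScore - currentScore (negative means the candidate is worse).
function acceptanceProbability(delta: number, temperature: number): number {
  if (temperature <= 0) throw new Error("temperature must be positive");
  if (delta >= 0) return 1; // improvements and ties are always accepted
  return Math.exp(delta / temperature);
}

// The same 0.5-point regression evaluated at three temperatures.
for (const T of [1.0, 0.5, 0.1]) {
  console.log(`T=${T} P=${acceptanceProbability(-0.5, T).toFixed(3)}`);
}
```

A 0.5-point regression is accepted about 61% of the time at T = 1.0, about 37% at T = 0.5, and under 1% at T = 0.1, which is the shift from exploration to fine-grained tuning described above.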

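Claude Code Scenario

Real-World Scenario

You are tuning the system prompt of a social-media content agent. Output quality sits at 7.5/10, and a week of wording tweaks has produced no improvement. You are stuck in a local optimum.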

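Recognition signals: when to use this pattern

What you observe	Diagnosis	Response
Tweaking wording → no change	Local optimum	Enter the high-temperature phase; change the structure
Adding an example → better → adding more → worse	Overfitting to the examples	Cool down; adjust only the number of examples
Swapping role/persona → quality swings widely	Temperature still high, large search space	Stay at high temperature; record the highest-scoring structure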

Code pattern: the tuning protocol in CLAUDE.md

prompt_optimizer.ts · Claude Code project

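// Simulated-annealing prompt tuner
// Run inside Claude Code: cc run optimize

interface PromptCandidate {
  content: string;
  score: number;
}

// mutateStructure and mutatePhrasing are not defined in this file. They are
// injected here (an assumption) so the function type-checks on its own.
interface Mutators {
  mutateStructure: (prompt: string) => Promise<string>; // swap sections / structure / persona
  mutatePhrasing: (prompt: string) => Promise<string>;  // change wording only
}

async function simulatedAnnealingOptimize(
  initialPrompt: string,
  evaluate: (p: string) => Promise<number>,
  mutators: Mutators,
  iterations = 100
): Promise<PromptCandidate> {
  let current: PromptCandidate = {
    content: initialPrompt,
    score: await evaluate(initialPrompt)
  };
  let best = { ...current };
  let T = 1.0; // initial temperature

  for (let i = 0; i < iterations; i++) {
    T *= 0.97; // cooling rate

    // High temperature: large changes; low temperature: small changes
    const candidate = T > 0.5
      ? await mutators.mutateStructure(current.content)
      : await mutators.mutatePhrasing(current.content);

    const newScore = await evaluate(candidate);
    const delta = newScore - current.score; // negative = candidate is worse

    // Accept if better, or with probability exp(delta / T) if worse
    if (delta > 0 || Math.random() < Math.exp(delta / T)) {
      current = { content: candidate, score: newScore };
    }

    // Track the best prompt seen so far, independent of the current position
    if (current.score > best.score) best = { ...current };

    console.log(`T=${T.toFixed(2)} candidate=${newScore.toFixed(1)} best=${best.score.toFixed(1)}`);
  }

  return best;
}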
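
It helps to check how the cooling schedule splits the run between the two mutation types. The sketch below replays the `T *= 0.97` schedule from the tuner above and counts how many iterations see T > 0.5. The helper name `coolingSchedule` is illustrative, not part of the handbook.

```typescript
// Temperatures seen by each iteration when cooling happens at the top of the
// loop, as in the tuner above: iteration i runs at T = t0 * rate^(i + 1).
function coolingSchedule(t0: number, rate: number, iterations: number): number[] {
  const temps: number[] = [];
  let T = t0;
  for (let i = 0; i < iterations; i++) {
    T *= rate;
    temps.push(T);
  }
  return temps;
}

const temps = coolingSchedule(1.0, 0.97, 100);
const structuralSteps = temps.filter((T) => T > 0.5).length; // iterations that mutate structure
const finalT = temps[temps.length - 1];
console.log(`structural steps: ${structuralSteps}, final T: ${finalT.toFixed(3)}`);
```

With the defaults, only 22 of the 100 iterations mutate structure, and the run ends at T ≈ 0.048. If structural exploration looks too short for your prompt, a slower rate such as 0.99 keeps T above 0.5 for about three times as many iterations.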

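⚠ Anti-pattern: tweaking wording while watching only the current score, with no "high-temperature phase". This is pure greedy search, and it gets stuck in the first local optimum it reaches. If you have tuned a prompt for two days and the score has not passed 7/10, that is the signal: you need annealing, not more tweaking.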

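Practical Mnemonics

High-temperature phase (T > 0.5): swap the persona, change the structure, add or remove whole example blocks, and accept temporary regressions.
Low-temperature phase (T < 0.3): change only words, tone, and punctuation; do not accept regressions.
Rule of thumb: if the same type of change has failed to improve the score 3 times in a row, raise the temperature and switch to a different type of change.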
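
The "3 times in a row" rule can be checked mechanically if each attempt is logged. This is a minimal sketch under assumed names: the `Trial` record, the `MutationKind` labels, and `shouldReheat` are hypothetical and not part of the handbook.

```typescript
type MutationKind = "structure" | "phrasing" | "examples";

interface Trial {
  kind: MutationKind; // what type of change was tried
  improved: boolean;  // did this trial beat the best score so far?
}

// Returns true when the most recent `limit` trials all used the same
// mutation kind and none of them improved the score.
function shouldReheat(history: Trial[], limit = 3): boolean {
  if (history.length < limit) return false;
  const recent = history.slice(-limit);
  const kind = recent[0].kind;
  return recent.every((t) => t.kind === kind && !t.improved);
}

const history: Trial[] = [
  { kind: "phrasing", improved: true },
  { kind: "phrasing", improved: false },
  { kind: "phrasing", improved: false },
  { kind: "phrasing", improved: false },
];
console.log(shouldReheat(history)); // true: three failed phrasing tweaks in a row
```

Keeping the check per mutation kind matters: three failures spread across different kinds of change do not indicate that any one of them is exhausted.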

Ready to use: paste into your CLAUDE.md

CLAUDE.md · Prompt Tuning Protocol

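## Prompt Tuning Protocol (Simulated Annealing Mode)

### Which phase are you in?
- High-temperature phase: new task, new scenario, results still poor. Make big changes and do not worry about a single regression.
- Low-temperature phase: results are already 7+/10 and tuning is converging. Make small changes and do not accept regressions.

### Reheat triggers (any one is enough)
- The same kind of change has failed 3 times in a row
- You have been stuck on the same problem for more than 2 days
- The output score keeps oscillating between 6 and 7 without breaking through

### Actions allowed in the high-temperature phase
- Replacing the persona / role entirely
- Adding or removing whole example sections
- Changing the output format (JSON vs Markdown vs plain text)
- Changing the task framing (e.g. from "generate" to "check")

### Actions forbidden in the low-temperature phase
- Changing more than 2 places at once
- Accepting a version that scores lower than the previous one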
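
Outside CLAUDE.md, the same protocol can be sketched as two small checks, for instance to gate an automated tuning loop. Everything here is an assumption for illustration: the `TuningState` fields, the function names, and in particular the reading of "keeps oscillating between 6 and 7" as "the last 5 scores all fall in [6, 7]".

```typescript
interface TuningState {
  sameKindFailures: number; // consecutive failed attempts using the same kind of change
  daysStuck: number;        // days spent on the same problem
  recentScores: number[];   // latest scores on the 0-10 scale, oldest first
}

// Any single trigger is enough to return to the high-temperature phase.
function reheatTriggered(s: TuningState, window = 5): boolean {
  const recent = s.recentScores.slice(-window);
  const oscillating =
    recent.length === window && recent.every((x) => x >= 6 && x <= 7);
  return s.sameKindFailures >= 3 || s.daysStuck > 2 || oscillating;
}

// Low-temperature rule: at most 2 edits at once, and no drop in score.
function lowTempChangeAllowed(
  editCount: number,
  prevScore: number,
  newScore: number
): boolean {
  return editCount <= 2 && newScore >= prevScore;
}

const state: TuningState = {
  sameKindFailures: 1,
  daysStuck: 1,
  recentScores: [6.5, 6.8, 6.2, 6.9, 6.4],
};
console.log(reheatTriggered(state));            // true: scores are stuck in the 6-7 band
console.log(lowTempChangeAllowed(1, 7.5, 7.4)); // false: the new version scores lower
```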
