Kalman Filter · Claude Code Engineer's Handbook

Cybernetics · Estimating an agent's true quality state

Core formula

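x̂ₙ = x̂ₙ₋₁ + K · (zₙ − x̂ₙ₋₁)

x̂ₙ = current estimate of the "true quality"
x̂ₙ₋₁ = estimate from the previous step
zₙ = the actual observation this time (a single score)
K = Kalman gain = the weight that balances trusting the observation vs. trusting the prediction

High K (near 1) → trust the new observation more; the estimate moves quickly
Low K (near 0) → trust history more; the estimate moves slowly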

Key intuition: a single score is not the true quality. Estimated true quality = previous estimate × (1 − K) + current observation × K.
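Expanding the update shows that the correction form and the weighted-average form are the same expression. The second line is a one-step numeric check with values chosen purely for illustration (previous estimate 8, a new score of 4, K = 0.3):

```latex
\begin{aligned}
\hat{x}_n &= \hat{x}_{n-1} + K\,(z_n - \hat{x}_{n-1}) = (1 - K)\,\hat{x}_{n-1} + K\,z_n \\
\hat{x}_n &= 0.7 \times 8 + 0.3 \times 4 = 6.8
\end{aligned}
```

A single score of 4 pulls the estimate from 8 down to 6.8, not all the way to 4.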

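Why you need a Kalman filter

Real-world scenario

Your content agent scored 8/10 yesterday and suddenly 4/10 today. Did the model really get worse, or was that piece of content just hard? A Kalman filter helps you separate noise from a real change in state.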

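Observed sequence	Naive reading	Kalman estimate (K = 0.3, seeded with the first score)
8, 8, 8, 4, 8, 8	Sees the 4 → "the system is broken"	x̂ ≈ 7.4 → the 4 was noise; the system is still healthy
8, 7, 6, 5, 4, 3	Each step looks only slightly worse; the latest is 3	x̂ ≈ 4.9 → the trend is deteriorating; intervention needed
4, 5, 4, 5, 4, 5	Oscillating, unclear what to do	x̂ ≈ 4.5 → a systemic problem, not noise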
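A fixed-gain version of the update can be run over sequences like the ones in the table. This is a minimal sketch rather than the handbook's tracker: `fixedGainEstimate` is a hypothetical helper, K is held constant at 0.3, and the estimate is assumed to be seeded with the first observation (a different seed or gain gives different numbers).

```typescript
// Hypothetical helper (not part of the tracker class): applies the fixed-gain
// update x̂ₙ = x̂ₙ₋₁ + K · (zₙ − x̂ₙ₋₁) across a whole score sequence.
// Assumption: the estimate is seeded with the first observation.
function fixedGainEstimate(scores: number[], K = 0.3): number {
  if (scores.length === 0) throw new Error('need at least one score');
  let estimate = scores[0];
  for (const z of scores.slice(1)) {
    estimate += K * (z - estimate); // move a fraction K toward the new score
  }
  return estimate;
}

const sequences = [
  [8, 8, 8, 4, 8, 8],
  [8, 7, 6, 5, 4, 3],
  [4, 5, 4, 5, 4, 5],
];
for (const seq of sequences) {
  console.log(seq.join(', '), '→', fixedGainEstimate(seq).toFixed(1));
}
// 8, 8, 8, 4, 8, 8 → 7.4
// 8, 7, 6, 5, 4, 3 → 4.9
// 4, 5, 4, 5, 4, 5 → 4.5
```

The single outlier barely moves the first estimate, while the steady decline in the second sequence drags the estimate down with a lag.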

Code pattern: agent quality-state tracking

quality-tracker.ts (agent monitoring layer)

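type QualityStatus = 'normal' | 'degrading' | 'noise';

interface TrackerOptions {
  processNoise?: number;     // Q: how fast true quality can drift per step (variance)
  observationNoise?: number; // R: variance of a single score
}

class KalmanQualityTracker {
  private estimate: number;     // current quality estimate
  private uncertainty = 1;      // variance of the estimate (P)
  private readonly processNoise: number;
  private readonly observationNoise: number;
  private consecutiveLow = 0;   // scores in a row more than 1 point below the estimate

  constructor(initialEstimate = 7, options: TrackerOptions = {}) {
    this.estimate = initialEstimate;
    this.processNoise = options.processNoise ?? 0.1;       // quality rarely jumps
    this.observationNoise = options.observationNoise ?? 2; // variance 2, i.e. std ≈ 1.4
  }

  update(observed: number): {
    estimate: number;
    kalmanGain: number;
    status: QualityStatus;
  } {
    // Predict step: uncertainty grows between observations
    const predictedUncertainty = this.uncertainty + this.processNoise;

    // Kalman gain: balances the previous estimate against the new observation
    const K = predictedUncertainty / (predictedUncertainty + this.observationNoise);

    // Update step: move the estimate a fraction K toward the observation
    const innovation = observed - this.estimate; // observed minus previous estimate
    this.estimate += K * innovation;
    this.uncertainty = (1 - K) * predictedUncertainty;

    // Noise vs. trend (heuristic thresholds):
    // three scores in a row more than 1 point below the estimate → degrading;
    // an isolated miss of more than 3 points → noise
    this.consecutiveLow = innovation < -1 ? this.consecutiveLow + 1 : 0;
    const status: QualityStatus =
      this.consecutiveLow >= 3 ? 'degrading'
      : Math.abs(innovation) > 3 ? 'noise'
      : 'normal';

    return {
      estimate: Math.round(this.estimate * 10) / 10,
      kalmanGain: Math.round(K * 100) / 100,
      status,
    };
  }
}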

// Usage: wire the tracker into the agent pipeline
// (agent, task, evaluate, and alertDegradation come from your own code)
const tracker = new KalmanQualityTracker(7.5);

const result = await agent.run(task);
const score = await evaluate(result);
const { estimate, status } = tracker.update(score);

if (status === 'degrading') alertDegradation(estimate);
if (status === 'noise') console.log('One-off outlier, ignoring');
// status === 'normal' → carry on without overreacting

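⚠ Anti-pattern: driving decisions directly from raw scores. The agent outputs 3/10 → an alert fires immediately → the system prompt gets rewritten. But that score may just be noise from one unusually hard task while the true quality is still 7.8/10. Overreacting to noise makes the system unstable.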

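Guide to choosing K

K = 0.1-0.2: evaluation is expensive (human scoring every time); update conservatively
K = 0.3-0.4: automated evaluation with moderate trust; the recommended starting point
K = 0.5-0.7: frequent, low-cost evaluation; respond quickly to quality changes
Variable K (dynamic Kalman): start with a high K to learn quickly, then lower K once the estimate stabilizes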
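The "variable K" option can be illustrated with the scalar Kalman recursion, where the gain is derived from the current uncertainty instead of being picked by hand. The sketch below is illustrative only: `gainSchedule` is a hypothetical helper, and the inputs (initial uncertainty 1, process noise 0.1, observation noise 2) are example values.

```typescript
// Hypothetical helper: the sequence of gains a scalar Kalman filter produces.
// p0 = initial estimate variance, q = process noise, r = observation noise.
function gainSchedule(p0: number, q: number, r: number, steps: number): number[] {
  const gains: number[] = [];
  let p = p0;
  for (let i = 0; i < steps; i++) {
    const predicted = p + q;           // predict: uncertainty grows
    const k = predicted / (predicted + r);
    gains.push(k);
    p = (1 - k) * predicted;           // update: uncertainty shrinks
  }
  return gains;
}

const gains = gainSchedule(1, 0.1, 2, 30);
console.log(gains[0].toFixed(2));  // 0.35
console.log(gains[29].toFixed(2)); // 0.20
```

With these inputs the gain starts in the "recommended starting point" band and settles near 0.2, so the filter learns quickly at first and becomes more conservative once the estimate is stable. A larger initial uncertainty gives a higher starting gain.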

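Ready to use: paste into your CLAUDE.md

CLAUDE.md quality-state tracking rules

## Agent quality tracking rules (Kalman filter principle)

### Use a moving average instead of a single score
- Never trigger an alert from a single score
- Judge the trend from a weighted average of the last 5 scores (recent scores weighted higher)

### Separate noise from real degradation
- A single score deviates from the historical mean by more than 3 → most likely noise; watch the next one
- 3 consecutive scores more than 1 point below the historical mean → real degradation; trigger I-term (integral) correction

### Response thresholds (prevent overreaction)
- Do not modify the system prompt because of one bad output
- Modify it only when the moving average has kept falling by more than 1 point
- Before updating the system prompt, first decide whether the cause is task difficulty or the system itself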
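For teams that want to enforce these rules in code rather than by convention, they could be sketched roughly as follows. Everything here is an assumption for illustration: the function names are hypothetical, the weights are linear (1 for the oldest score in the window up to 5 for the newest), and the "historical mean" is passed in by the caller as `baseline`.

```typescript
type Verdict = 'normal' | 'noise' | 'degrading';

// Weighted average of the last `window` scores; newer scores get larger
// weights (1, 2, ..., n). The linear weighting is an assumption.
function weightedRecentAverage(scores: number[], window = 5): number {
  if (scores.length === 0) throw new Error('need at least one score');
  const recent = scores.slice(-window);
  let weightedSum = 0;
  let weightTotal = 0;
  recent.forEach((score, i) => {
    weightedSum += score * (i + 1);
    weightTotal += i + 1;
  });
  return weightedSum / weightTotal;
}

// Applies the two rules: three scores in a row more than 1 point below the
// baseline count as degradation; a single miss of more than 3 counts as noise.
function classifyLatest(scores: number[], baseline: number): Verdict {
  const lastThree = scores.slice(-3);
  if (lastThree.length === 3 && lastThree.every((s) => baseline - s > 1)) {
    return 'degrading';
  }
  const latest = scores[scores.length - 1];
  if (latest !== undefined && Math.abs(latest - baseline) > 3) return 'noise';
  return 'normal';
}

console.log(classifyLatest([8, 8, 8, 4], 8));        // noise
console.log(classifyLatest([8, 7, 6.5, 6, 5.5], 8)); // degrading
console.log(classifyLatest([8, 7.5, 8], 8));         // normal
```

Checking the consecutive-low rule first means a sustained drop is reported as degradation even when its latest score is also a large outlier.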
