[Senate] Per-quest budget vs. actual variance with anomaly alerts

Goal

scripts/backfill_resource_usage.py already populated 5.7 M tokens of
resource_usage history, and aa8c3204_2746 shipped a static
budget-vs-actual line chart. We have zero programmatic alerting on
budget overruns and no per-quest variance attribution: when Codex blew
through 4 days of quota (memory: project_codex_model_quota_exhaustion),
it took a human to notice. Build a daily variance computation that
diffs quest_budget_allocations.allocated_tokens against
cost_ledger/resource_usage actuals per quest, attributes the delta
(model mix shift, task fan-out, retry storm), and emits Senate proposals
when |variance| > 2σ of the trailing-14-day baseline.
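
The threshold rule above can be sketched as follows. This is a minimal illustration, not the shipped implementation: the helper names `variance_pct` and `is_anomalous` are hypothetical, σ is assumed to be the population standard deviation of the trailing daily variance percentages, and the 25 % floor is taken from the acceptance criteria below.

```python
from statistics import pstdev


def variance_pct(allocated_tokens: int, actual_tokens: int) -> float:
    """Signed variance of actual vs. allocated, as a percentage of the allocation."""
    if allocated_tokens <= 0:
        raise ValueError("allocated_tokens must be positive")
    return 100.0 * (actual_tokens - allocated_tokens) / allocated_tokens


def is_anomalous(today_pct, trailing_pcts, sigmas=2.0, min_abs_pct=25.0):
    """True when |today_pct| exceeds both the fixed floor and sigmas * sigma.

    trailing_pcts: variance_pct values for the trailing window (assumed 14 days).
    Assumption: sigma is the population std-dev of that window.
    """
    if len(trailing_pcts) < 2:
        return False  # not enough history to estimate a baseline
    sigma = pstdev(trailing_pcts)
    return abs(today_pct) > min_abs_pct and abs(today_pct) > sigmas * sigma


# Hypothetical 14-day history for one quest, in percent.
trailing = [2, -3, 1, 0, 4, -2, 1, 3, -1, 0, 2, -4, 1, 0]
today = variance_pct(allocated_tokens=1000, actual_tokens=1500)
flagged = is_anomalous(today, trailing)
```

Requiring both conditions keeps a quiet quest (tiny σ) from alerting on a small absolute overrun, and keeps a noisy quest from alerting on a swing that is ordinary for it.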

Acceptance Criteria

☐ scidex/senate/budget_variance.py with compute_daily_variance(date), attribute_variance(quest_id, day), and alert_if_anomaly(variance, baseline) returning a structured reason code.
☐ Table budget_variance_daily(day, quest_id, allocated_tokens, actual_tokens, variance_pct, attribution_json, alert_emitted_at) with PK (day, quest_id).
☐ Cron entry in scidex/senate/scheduled_tasks.py running at 03:10 UTC.
☐ Attribution distinguishes: model_mix_shift (more Opus%), fanout (more tasks than allocated), retry_storm (>3 retries on a single task), quota_exhaust (rows in account_model_quota_state), unknown.
☐ When |variance_pct| > 25 % AND > 2σ baseline, create a Senate proposal of type budget_variance with the attribution as evidence and a suggested corrective action (e.g. throttle retries, re-route model).
☐ /resources page gets a "Variance" panel: heatmap (quests × last 14 days) coloured by variance_pct.
☐ GET /api/resources/variance?days=14 returns the heatmap data.
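
One possible shape for the attribution step in the criteria above is sketched here. Everything beyond the five reason codes is an assumption made for illustration: the `QuestDayMetrics` container, its field names, the 10-point margin for a model mix shift, and the order in which the causes are checked are not specified by this spec.

```python
from dataclasses import dataclass


@dataclass
class QuestDayMetrics:
    """Hypothetical per-quest, per-day inputs to attribution."""
    opus_share: float             # fraction of today's tokens spent on Opus
    baseline_opus_share: float    # same fraction over the trailing window
    tasks_run: int
    tasks_allocated: int
    max_retries_single_task: int
    quota_state_rows: int         # matching rows in account_model_quota_state


def attribute(m: QuestDayMetrics, mix_shift_margin: float = 0.10) -> str:
    """Return one reason code; the precedence order is an assumption."""
    if m.quota_state_rows > 0:
        return "quota_exhaust"
    if m.max_retries_single_task > 3:
        return "retry_storm"
    if m.tasks_run > m.tasks_allocated:
        return "fanout"
    if m.opus_share - m.baseline_opus_share > mix_shift_margin:
        return "model_mix_shift"
    return "unknown"


example = QuestDayMetrics(
    opus_share=0.30, baseline_opus_share=0.28,
    tasks_run=12, tasks_allocated=10,
    max_retries_single_task=5, quota_state_rows=0,
)
reason = attribute(example)
```

A single code per row is the simplest option; if several causes can overlap on the same day, attribution_json could instead carry every matching code with its estimated share of the delta.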

Approach

  • Join cost_ledger (Orchestra DB, orchestra/cost.py:41) on task_id → quests.id via tasks.quest_id to bucket per quest per day.
  • Compute baseline σ from trailing 14 days; store in same row to avoid recompute.
  • Reuse Senate proposal pipeline — see how scidex/senate/epistemic_health.py::generate_improvement_proposals writes proposals.
  • Render heatmap with the same SVG idiom used by /senate quality cards (no Chart.js dependency for this panel).
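
The per-quest, per-day bucketing in the first step above might look roughly like this. It is a sketch in plain Python rather than the actual join: the row shape `(task_id, timestamp, tokens)` and the `task_to_quest` mapping are hypothetical stand-ins for cost_ledger and tasks.quest_id, and timestamps are assumed to be timezone-aware.

```python
from collections import defaultdict
from datetime import datetime, timezone


def bucket_tokens(ledger_rows, task_to_quest):
    """Sum tokens per (UTC day, quest_id).

    ledger_rows: iterable of (task_id, timestamp, tokens); timestamps must be
    timezone-aware. Rows whose task has no quest mapping are skipped.
    """
    totals = defaultdict(int)
    for task_id, ts, tokens in ledger_rows:
        quest_id = task_to_quest.get(task_id)
        if quest_id is None:
            continue  # task not attached to any quest
        day = ts.astimezone(timezone.utc).date()
        totals[(day, quest_id)] += tokens
    return dict(totals)


rows = [
    ("t1", datetime(2026, 5, 1, 23, 30, tzinfo=timezone.utc), 100),
    ("t2", datetime(2026, 5, 1, 1, 0, tzinfo=timezone.utc), 50),
    ("t1", datetime(2026, 5, 2, 0, 5, tzinfo=timezone.utc), 7),
    ("tx", datetime(2026, 5, 2, 0, 6, tzinfo=timezone.utc), 999),  # unmapped
]
actuals = bucket_tokens(rows, {"t1": "q1", "t2": "q1"})
```

Bucketing by UTC day lines up with the 03:10 UTC cron run, so each run sees a complete previous day.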

Dependencies

  • 4acdb148 (resource_usage backfill): provides historical depth.
  • aa8c3204_2746: the dashboard panels we extend.

Work Log

File: q-ri-budget-variance-dashboard_spec.md
Modified: 2026-05-01 20:13
Size: 2.5 KB