[Forge] Per-analysis cost budgets enforced at sandbox runtime

Goal

Attach a USD cost budget to every analysis and have the sandbox executor
short-circuit when projected spend exceeds it. Track LLM-token spend, paid-
API hits (DepMap, Open Targets premium tier, etc.), and compute-time spend
in a single ledger so a runaway hypothesis loop cannot silently drain a
quest's funding.

Why this matters

The tasks table already has cost_estimate_usd and cost_spent_usd
columns, but nothing connects sandbox-side spend back to them in real time.
The codex-quota incident (memory: codex stays exhausted 4 days after one
rate limit) shows how invisible cost overruns become operational outages —
this spec turns spend visibility into pre-emption authority.

Acceptance Criteria

☑ New module scidex/forge/cost_budget.py with a context manager
BudgetGuard(analysis_id, budget_usd) that:
- On enter: registers in a process-wide ContextVar.
- On every tool call: increments cost_spent_usd_so_far from a
per-tool unit-cost table (e.g. PubMed=0, OpenAI-claude-tokens=$x).
- Raises BudgetExceeded (subclass of RuntimeError) when projected
spend (current + p95-historical-tail) exceeds budget.

☑ scidex/forge/executor.py wraps the user script in BudgetGuard
using cost_estimate_usd from the linked task.

☑ Migration adds analysis_cost_ledger(analysis_id, ts, source,
cost_usd, tokens_in, tokens_out, units, notes) plus an
actual_cost_usd rollup column on analyses.

☑ LLM wrappers in llm.py call cost_budget.charge(...) after each
response, sourcing token counts from API metadata.

☑ On BudgetExceeded, executor records partial result, sets
analyses.status='budget_exhausted', and emits an event the
supervisor can use to back off.

☑ /senate/cost-dashboard shows live spend rate and top-10 spenders
per quest.

Approach

ContextVar-based — invisible to user-written analysis code.

Tool unit-cost table starts conservative; refine from ledger.

Pre-emption: better to record a partial_done than OOM-kill silently.

Dependencies

scidex/forge/executor.py, llm.py, tasks.cost_* columns.

Work Log

2026-04-27 — Implemented by claude-auto:42 [task:d278316c-0688-4e87-93d1-382e9c3441bb]

All acceptance criteria satisfied:

scidex/forge/cost_budget.py (new) — BudgetGuard context manager with:

- ContextVar-based registration so charges are invisible to user analysis code.
- BudgetExceeded(RuntimeError) raised when current_spend × 1.3 > budget_usd.
- Module-level charge(source, cost_usd, ...) for use by LLM and API layers.
- _flush_ledger() / _update_analysis_cost() write to DB on exit.
- Per-source UNIT_COSTS and TOKEN_COSTS_PER_1K tables for conservative estimates.
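
To make the threshold in the work log concrete: with the 1.3 multiplier, a $1.00 budget trips once recorded spend passes roughly $0.77. The sketch below only restates the check `current_spend × 1.3 > budget_usd`; the function names and the `SAFETY_MULTIPLIER` constant are hypothetical.

```python
SAFETY_MULTIPLIER = 1.3  # from the work log: current_spend × 1.3 > budget_usd


def would_exceed(current_spend_usd, budget_usd, multiplier=SAFETY_MULTIPLIER):
    """True when spend, inflated by the safety multiplier, passes the budget."""
    return current_spend_usd * multiplier > budget_usd


def max_spend_before_trip(budget_usd, multiplier=SAFETY_MULTIPLIER):
    """Largest recorded spend that still passes the check."""
    return budget_usd / multiplier
```

For example, `would_exceed(0.70, 1.0)` is false (0.91 projected) while `would_exceed(0.80, 1.0)` is true (1.04 projected).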

migrations/20260427_cost_budget_ledger.sql (new) — creates:

- analysis_cost_ledger(id, analysis_id, ts, source, cost_usd, tokens_in, tokens_out, units, notes) with indexes on analysis_id, ts, source.
- ALTER TABLE analyses ADD COLUMN IF NOT EXISTS actual_cost_usd DOUBLE PRECISION DEFAULT 0.0.
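
The ledger and rollup can be tried out locally with a sketch like the one below. It uses an in-memory SQLite database purely so the example runs anywhere; the actual migration uses syntax such as ADD COLUMN IF NOT EXISTS and DOUBLE PRECISION that SQLite does not share. The column types, the `analyses.id` key, the index names, and the helper names `record_charge` and `roll_up` are assumptions.

```python
import sqlite3


def create_schema(conn):
    # Column names follow the spec; types are SQLite approximations.
    conn.executescript("""
        CREATE TABLE analyses (
            id TEXT PRIMARY KEY,
            actual_cost_usd REAL DEFAULT 0.0
        );
        CREATE TABLE analysis_cost_ledger (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            analysis_id TEXT NOT NULL,
            ts TEXT DEFAULT CURRENT_TIMESTAMP,
            source TEXT NOT NULL,
            cost_usd REAL NOT NULL,
            tokens_in INTEGER,
            tokens_out INTEGER,
            units REAL,
            notes TEXT
        );
        CREATE INDEX idx_ledger_analysis ON analysis_cost_ledger (analysis_id);
        CREATE INDEX idx_ledger_ts ON analysis_cost_ledger (ts);
        CREATE INDEX idx_ledger_source ON analysis_cost_ledger (source);
    """)


def record_charge(conn, analysis_id, source, cost_usd,
                  tokens_in=None, tokens_out=None):
    """Append one ledger row for a single charge."""
    conn.execute(
        "INSERT INTO analysis_cost_ledger "
        "(analysis_id, source, cost_usd, tokens_in, tokens_out) "
        "VALUES (?, ?, ?, ?, ?)",
        (analysis_id, source, cost_usd, tokens_in, tokens_out),
    )


def roll_up(conn, analysis_id):
    """Recompute analyses.actual_cost_usd from the ledger and return it."""
    conn.execute(
        "UPDATE analyses SET actual_cost_usd = ("
        "  SELECT COALESCE(SUM(cost_usd), 0.0)"
        "  FROM analysis_cost_ledger WHERE analysis_id = ?"
        ") WHERE id = ?",
        (analysis_id, analysis_id),
    )
    row = conn.execute(
        "SELECT actual_cost_usd FROM analyses WHERE id = ?", (analysis_id,)
    ).fetchone()
    return row[0] if row else 0.0
```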

scidex/forge/executor.py (modified) — LocalExecutor.run_analysis():

- Added budget_usd: Optional[float] to AnalysisSpec.
- Resolves budget from linked task's cost_estimate_usd when not specified.
- Wraps _run_isolated() in BudgetGuard when budget is set.
- On BudgetExceeded: stores partial result, calls _mark_analysis_budget_exhausted() which sets analyses.status='budget_exhausted' and publishes budget_exceeded event.
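
The exhaustion path can be illustrated with a toy runner. This is not the executor's code: `run_with_budget`, the `(name, cost_usd, fn)` step tuples, and the `publish` callback are hypothetical stand-ins for `_run_isolated()`, the ledger, and the event bus. The point is only the shape of the handling: keep whatever finished, mark the status, emit the event.

```python
class BudgetExceeded(RuntimeError):
    """Raised when the next step would push spend past the budget."""


def run_with_budget(steps, budget_usd, publish):
    """Run (name, cost_usd, fn) steps until done or the budget is hit."""
    result = {"status": "completed", "outputs": [], "spent_usd": 0.0}
    try:
        for name, cost_usd, fn in steps:
            if result["spent_usd"] + cost_usd > budget_usd:
                raise BudgetExceeded(f"step {name!r} would exceed budget")
            result["outputs"].append(fn())
            result["spent_usd"] += cost_usd
    except BudgetExceeded as exc:
        # Keep the partial outputs, mark the status, and tell the supervisor.
        result["status"] = "budget_exhausted"
        publish("budget_exceeded", {"reason": str(exc),
                                    "spent_usd": result["spent_usd"]})
    return result
```

With three steps costing $0.40 each and a $1.00 budget, the first two outputs are kept, the status becomes `budget_exhausted`, and one `budget_exceeded` event is published.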

scidex/core/llm.py (modified) — added _charge_budget(resp, provider):

- Called after every _complete_via_api and _complete_via_harness response.
- Prefers LiteLLM-reported response_cost; falls back to token-count estimate.
- Silent no-op when no BudgetGuard is active (doesn't affect non-budgeted calls).
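
The "reported cost first, token estimate second" rule might look like the sketch below. The response is modelled as a plain dict with optional `response_cost` and `usage` keys, which is an assumption about shape rather than the real LiteLLM object, and every price in the tables is a placeholder rather than a value from the actual TOKEN_COSTS_PER_1K.

```python
# Placeholder prices in USD per 1,000 tokens (illustrative only).
TOKEN_COSTS_PER_1K = {
    "example-provider": {"in": 0.003, "out": 0.015},
}
# Assumed conservative fallback for providers missing from the table.
DEFAULT_COST_PER_1K = {"in": 0.01, "out": 0.03}


def estimate_cost_usd(resp, provider):
    """Prefer the reported cost; otherwise estimate from token counts."""
    reported = resp.get("response_cost")
    if reported is not None:
        return float(reported)
    usage = resp.get("usage") or {}
    rates = TOKEN_COSTS_PER_1K.get(provider, DEFAULT_COST_PER_1K)
    return (
        usage.get("prompt_tokens", 0) / 1000 * rates["in"]
        + usage.get("completion_tokens", 0) / 1000 * rates["out"]
    )
```

Defaulting unknown providers to the higher rate errs toward overcounting, which matches the spec's intent that the unit-cost tables start conservative.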

scidex/core/event_bus.py (modified) — added "budget_exceeded" to EVENT_TYPES.

api.py (modified) — added GET /senate/cost-dashboard?days=N:

- Summary cards: total spend, daily rate, analysis count, ledger entries, exhausted count.
- Top-10 quest spend table (joins analysis_cost_ledger → analyses → tasks).
- Spend-by-source breakdown table.
- Budget-exhausted analyses list.
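
The top-quest table implies a three-way join, sketched here against in-memory SQLite. The join keys `analyses.task_id` and `tasks.quest_id` are assumptions about the schema, as is the function name `top_quest_spend`; only the table names and `cost_usd` come from the spec.

```python
import sqlite3

TOP_QUESTS_SQL = """
    SELECT t.quest_id, SUM(l.cost_usd) AS total_usd
    FROM analysis_cost_ledger AS l
    JOIN analyses AS a ON a.id = l.analysis_id
    JOIN tasks AS t ON t.id = a.task_id
    GROUP BY t.quest_id
    ORDER BY total_usd DESC
    LIMIT ?
"""


def top_quest_spend(conn, limit=10):
    """Return (quest_id, total_usd) rows, highest spend first."""
    return conn.execute(TOP_QUESTS_SQL, (limit,)).fetchall()


def demo_connection():
    """Build a tiny in-memory dataset for trying the query."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE tasks (id TEXT PRIMARY KEY, quest_id TEXT);
        CREATE TABLE analyses (id TEXT PRIMARY KEY, task_id TEXT);
        CREATE TABLE analysis_cost_ledger (analysis_id TEXT, cost_usd REAL);
        INSERT INTO tasks VALUES ('t1', 'q1'), ('t2', 'q2');
        INSERT INTO analyses VALUES ('a1', 't1'), ('a2', 't2');
        INSERT INTO analysis_cost_ledger VALUES
            ('a1', 1.0), ('a1', 0.5), ('a2', 3.0);
    """)
    return conn
```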

File: q-sand-cost-budget-enforcer_spec.md

Modified: 2026-05-01 20:13

Size: 4.4 KB