[Senate] Cross-account model router — Opus to highest-EV tasks only


Goal

Right now the dispatcher picks the next available slot regardless of
which model that slot runs and which task is next. Opus 4.7 (15× the
$/token of Haiku) routinely gets handed link-checker tasks while a
strategic-spec-design task waits for a free slot of any model. Build
a routing layer that, for each pending task, picks the *cheapest model
class capable of meeting the task's --requires bar* and reserves
expensive slots (Opus, GPT-5) for tasks above an EV threshold.

Acceptance Criteria

☐ scidex/senate/model_router.py::route(task, available_slots) -> slot_id | None.
☐ Ranking input: task.priority, task.requires (capability dict), ev_scorer.score_task, current per-model utilisation from quota_throttle.
☐ Decision rule: choose the slot whose model class is the minimum tier that satisfies requires, breaking ties by lowest current utilisation. Override for tasks with EV > P90 or quest.priority >= 95 — they can claim a higher tier.
☐ New table model_routing_decisions(decision_id, task_id, chosen_slot_id, chosen_model, alternatives_json, reason, decided_at) so we can audit "why did Opus get a typo fix?".
☐ Block Opus 4.7 from any task whose EV is below the 50th percentile and whose priority is < 90, unless no other model can satisfy the task's requires.
☐ Wire into the Orchestra slot loop before services.next_task; if no slot fits, mark the task await_slot, not blocked.
☐ /resources "Routing" panel shows last-24h matrix (model × layer) of decisions with $/decision.
☐ Test: synthetic queue of 10 tasks with mixed priorities + 4 slots (1 opus, 1 sonnet, 2 haiku) routes the top-2 EV tasks to opus/sonnet and the rest to haiku.
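
The decision rule and the Opus block above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the shipped implementation: the `Slot` and `Task` classes, the capability numbers, and passing `p50`/`p90` as arguments are all hypothetical, and the override is read here as "prefer the highest capable tier" using `task.priority` as a stand-in for quest priority.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical values for illustration; the canonical mapping is in AGENTS.md.
MODEL_TIER = {"haiku": 1, "sonnet": 2, "opus": 3}
MODEL_CAPABILITIES = {
    "haiku": {"reasoning": 4},
    "sonnet": {"reasoning": 7},
    "opus": {"reasoning": 10},
}
TOP_TIER = max(MODEL_TIER.values())


@dataclass
class Slot:
    slot_id: str
    model: str
    utilisation: float  # 0.0 to 1.0, as reported by the quota throttle


@dataclass
class Task:
    task_id: str
    priority: int
    ev: float
    requires: dict = field(default_factory=dict)


def model_satisfies(model: str, requires: dict) -> bool:
    """True if the model class meets every required capability level."""
    caps = MODEL_CAPABILITIES[model]
    return all(caps.get(name, 0) >= level for name, level in requires.items())


def route(task: Task, available_slots: list, p50: float, p90: float) -> Optional[str]:
    """Return the chosen slot_id, or None if the task should wait for a slot."""
    capable = [s for s in available_slots if model_satisfies(s.model, task.requires)]
    if not capable:
        return None

    # Override: high-EV or very high-priority tasks may claim the highest tier.
    if task.ev > p90 or task.priority >= 95:
        best = min(capable, key=lambda s: (-MODEL_TIER[s.model], s.utilisation))
        return best.slot_id

    # Opus block: low-value tasks never take a top-tier slot when some
    # cheaper model class could do the job, even if it is busy right now.
    low_value = task.ev < p50 and task.priority < 90
    cheaper_model_exists = any(
        tier < TOP_TIER and model_satisfies(model, task.requires)
        for model, tier in MODEL_TIER.items()
    )
    if low_value and cheaper_model_exists:
        capable = [s for s in capable if MODEL_TIER[s.model] < TOP_TIER]
        if not capable:
            return None  # caller would mark the task await_slot

    # Default: minimum capable tier, ties broken by lowest utilisation.
    best = min(capable, key=lambda s: (MODEL_TIER[s.model], s.utilisation))
    return best.slot_id
```

In this sketch a low-value task that only an Opus-class model can handle still routes to Opus, while a low-value task that Haiku could handle waits rather than occupying an idle Opus slot.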

Approach

  • Read AGENTS.md "Capability-Based Routing" table (line 706+) for canonical model-tier mapping; codify into MODEL_TIER constant.
  • Use ev_scorer.score_task as the EV input but cache per-task for 60 s.
  • Persist decision JSON for explainability; wiki pages already use the crosslink_emitter audit pattern, so follow it.
  • Add --dry-run mode that logs decisions but does not actually route, so we can validate before flipping the dispatcher.
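
The audit persistence and dry-run behaviour described in this list could look something like the sketch below. It assumes SQLite via the standard library purely for illustration (the spec does not name the database), takes the column names from the acceptance criteria, and uses hypothetical helper names (`ensure_table`, `persist_decision`, `decisions_per_model`). Skipping the write in dry-run mode is also an assumption of this sketch.

```python
import json
import logging
import sqlite3
import uuid
from datetime import datetime, timedelta, timezone

log = logging.getLogger("model_router")

DDL = """
CREATE TABLE IF NOT EXISTS model_routing_decisions (
    decision_id       TEXT PRIMARY KEY,
    task_id           TEXT NOT NULL,
    chosen_slot_id    TEXT,
    chosen_model      TEXT,
    alternatives_json TEXT NOT NULL,
    reason            TEXT NOT NULL,
    decided_at        TEXT NOT NULL
)
"""


def _utc_iso(moment: datetime) -> str:
    # Fixed-width UTC timestamps compare correctly as plain strings.
    return moment.astimezone(timezone.utc).isoformat(timespec="microseconds")


def ensure_table(conn: sqlite3.Connection) -> None:
    """Idempotent: safe to call before every read or write."""
    conn.execute(DDL)


def persist_decision(conn, task_id, chosen_slot_id, chosen_model,
                     alternatives, reason, *, dry_run=False):
    """Write one audit row; in dry-run mode only log it and return None."""
    row = (
        str(uuid.uuid4()), task_id, chosen_slot_id, chosen_model,
        json.dumps(alternatives), reason, _utc_iso(datetime.now(timezone.utc)),
    )
    if dry_run:
        log.info("dry-run routing decision: %s", row)
        return None
    ensure_table(conn)
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO model_routing_decisions VALUES (?, ?, ?, ?, ?, ?, ?)", row
        )
    return row[0]


def decisions_per_model(conn, hours=24):
    """Count decisions per chosen model over the last `hours` hours."""
    ensure_table(conn)
    cutoff = _utc_iso(datetime.now(timezone.utc) - timedelta(hours=hours))
    rows = conn.execute(
        "SELECT chosen_model, COUNT(*) FROM model_routing_decisions "
        "WHERE decided_at >= ? GROUP BY chosen_model",
        (cutoff,),
    ).fetchall()
    return dict(rows)
```

Storing the rejected alternatives as JSON next to a short reason string is what lets a later query answer "why did Opus get a typo fix?" without replaying the router.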

Dependencies

  • q-ri-quota-aware-throttle: provides per-model utilisation.
  • AGENTS.md capability tags on tasks (existing convention).

Work Log

    2026-04-27 — Slot 44 (claude-auto)

    • Reviewed codebase: orchestra/models.py for MODEL_CAPABILITIES/MODEL_TIER/MODEL_PROVIDER_COST;
      scidex/exchange/ev_scorer.py for scoring; scidex/senate/daily_budget.py for pricing patterns;
      api.py resources_dashboard for injection points.
    • Created scidex/senate/model_router.py:
      - MODEL_TIER dict (tier 1=haiku/minimax/glm, tier 2=sonnet/codex, tier 3=opus/gpt-5)
      - SlotInfo / TaskInfo / RoutingDecision dataclasses
      - route(task, available_slots, *, dry_run=False) -> slot_id | None with full decision logic
      - EV cache with 60 s TTL, P50/P90 percentile cache with 120 s TTL
      - Opus-block gate: skip tier-3 if EV < P50 AND priority < 90 AND cheaper slot capable
      - _persist_decision() writes to model_routing_decisions audit table
      - _ensure_table() idempotent CREATE TABLE IF NOT EXISTS
      - get_routing_matrix(hours=24) for the dashboard API
    • Added score_task(task_id: str) -> float | None to scidex/exchange/ev_scorer.py
    • Added /api/resources/routing-matrix JSON endpoint to api.py
    • Added routing panel to /resources dashboard (JS fetch + table render)
    • Created scidex/senate/test_model_router.py with 14 tests, all passing:
      - Synthetic queue: 10 tasks × 4 slots (1 opus, 1 sonnet, 2 haiku);
        top-2 EV tasks routed to opus/sonnet, rest to haiku ✓
      - Capabilities matching, utilization tie-break, dry_run no-persist ✓
    • Acceptance criteria status:
      - [x] route(task, available_slots) -> slot_id | None
      - [x] EV scorer + priority + tier ranking
      - [x] Min-tier + EV-override decision rule
      - [x] model_routing_decisions table (created at runtime on first use)
      - [x] Opus block for low-EV + low-priority tasks
      - [ ] Wire into Orchestra slot loop (requires Orchestra-side integration)
      - [x] /resources Routing panel
      - [x] Test: synthetic queue passes
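
The two caches noted in the log (60 s for EV scores, 120 s for percentiles) share one pattern, which can be sketched as a small TTL cache. This is an illustrative sketch only: `TTLCache` and its injectable `clock` parameter are hypothetical names, and the `score_task` in the usage comment stands in for the function added to ev_scorer.py.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Cache values per key and recompute them once they are older than the TTL."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests need not sleep
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_compute(self, key: Hashable, compute: Callable[[Hashable], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh
        value = compute(key)  # expired or missing: recompute and store
        self._entries[key] = (now + self._ttl, value)
        return value


# Usage sketch (score_task is assumed to take a task id):
#   ev_cache = TTLCache(60.0)
#   ev = ev_cache.get_or_compute(task_id, score_task)
```

Using a monotonic clock keeps expiry immune to wall-clock adjustments, and caching a `None` score the same way as a number avoids re-scoring unscorable tasks on every routing pass.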

    File: q-ri-cross-account-model-router_spec.md
    Modified: 2026-05-01 20:13
    Size: 4.3 KB