Spec: Agent Performance Dashboard
Task ID: e7e780b3-1362-4e21-a208-24cf7c422f93
Layer: Senate
Priority: 85
Goal
Build a dashboard showing agent performance: debate quality scores, hypothesis score improvements, and tool usage efficiency. Track which agents produce the best hypotheses.
Acceptance Criteria
☑ Dashboard at /senate/performance shows agent quality scores, hypothesis output, token efficiency
☑ KPI cards with total debates, hypotheses, avg scores, best agent
☑ Agent scorecards with composite ranking (quality/efficiency/contribution/precision)
☑ Agent synergy matrix showing collaboration quality
☑ Performance trajectory (older vs recent quality)
☑ Tool call efficiency table with success rates and latency
☑ Debate quality trends with Chart.js visualization
☑ Radar chart comparing agents across dimensions
☑ ROI per token analysis
☑ Hypothesis score movers and improvement attribution
☑ Debate depth vs hypothesis quality analysis
☑ Activity timeline with token allocation
☑ Per-analysis performance breakdown
☑ Persona believability scores
☑ Agent detail pages at /senate/agent/{id}
☑ /api/agent-scorecards JSON endpoint
☑ /api/agent-rankings comprehensive JSON endpoint
☑ Page caching for performance
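The composite ranking in the scorecard criterion could be computed roughly as follows. This is a minimal sketch, not the dashboard's actual formula: the spec names the four dimensions (quality, efficiency, contribution, precision) but not how they are combined, so the weights, the AgentScores type, and the assumption that every dimension is already normalized to 0..1 are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical weights: the spec lists the dimensions but gives no weighting.
WEIGHTS = {
    "quality": 0.4,
    "efficiency": 0.2,
    "contribution": 0.2,
    "precision": 0.2,
}


@dataclass
class AgentScores:
    """Per-agent dimension scores, assumed to be normalized to 0..1."""
    agent_id: str
    quality: float
    efficiency: float
    contribution: float
    precision: float


def composite_score(scores: AgentScores) -> float:
    """Weighted sum of the four scorecard dimensions."""
    return sum(weight * getattr(scores, dim) for dim, weight in WEIGHTS.items())


def rank_agents(all_scores: list[AgentScores]) -> list[AgentScores]:
    """Return agents ordered from highest to lowest composite score."""
    return sorted(all_scores, key=composite_score, reverse=True)
```

A weighted sum keeps the ranking easy to explain on a scorecard; a different combination (for example a geometric mean) would penalize agents that are weak on any single dimension.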
Work Log
2026-04-02 — Slot 8
- Explored existing codebase: found /senate/performance already has 15+ comprehensive sections
- Dashboard includes: KPI cards, scorecards, synergy matrix, trajectory, tool efficiency, radar chart, ROI, movers, contribution impact, believability, timeline, depth analysis
- Added page caching via _get_cached_page/_set_cached_page to reduce load time for 145KB page
- Added /api/agent-rankings comprehensive JSON endpoint with debate quality, hypothesis output, tool efficiency, cost estimates
- Added link to Rankings API from performance page
- Verified syntax: py_compile passes
- Result: Done — dashboard was already built; added caching + new API endpoint