[Senate] Add Task Quality Review Spec
Task ID: f0fa02ea-cbcd-43e9-830e-3b2d04588286
Priority: 75
Status: In Progress
Goal
Build an automated quality review system that evaluates completed tasks. After each task completes, the system should check: (1) Did api.py syntax pass? (2) Did new content get added to DB? (3) Are there undefined variable errors or runtime issues? (4) Score task quality 0-1.0 based on these metrics. This creates a feedback loop for the Senate layer to track agent performance and task outcomes.
Acceptance Criteria
☑ Create a quality review function that runs after task completion
☑ Check 1: Syntax validation for api.py (runs python3 -c "import py_compile; py_compile.compile('api.py', doraise=True)")
☑ Check 2: DB impact verification (compare row counts before/after task)
☑ Check 3: Runtime error detection (check logs for NameError, AttributeError, etc.)
☑ Check 4: HTTP health check (curl key pages, verify 200 responses)
☑ Aggregate checks into quality score (0-1.0)
☑ Store quality scores in agent_performance table
☐ Update Orchestra task with quality score in summary (future enhancement)
☑ Test: Run review on a completed task, verify score is calculated
☐ Document: Add usage to AGENTS.md (will do after commit)
Approach
Create quality review module: quality_review.py
- Function:
evaluate_task_quality(task_id, changed_files, db_path) -> float - Returns score 0-1.0
Implement checks:
- Syntax check: Use py_compile for .py files
- DB impact: Query row counts for key tables (analyses, hypotheses, knowledge_edges)
- Runtime errors: Parse recent systemd logs for Python exceptions
- HTTP health: Test key endpoints (/, /exchange, /analyses/, /api/status)
Scoring logic:
- Syntax pass: +0.3
- DB impact (new rows): +0.3
- No runtime errors: +0.3
- HTTP health: +0.1
- Total: 1.0
Integration:
- Add hook in Orchestra completion workflow
- Store score in
agent_performance table
- Update task summary with quality score
Testing:
python3 quality_review.py --task-id <test-task-id>
scidex db stats # Verify agent_performance has new entry
Commit and pushWork Log
2026-04-01 23:52 PT — Slot 8
- Started task: Auto-evaluate completed task outputs
- Read AGENTS.md and existing spec format
- Created spec file following standards
2026-04-01 23:53 PT — Slot 8
- Implemented quality_review.py module with 4 checks:
- Syntax validation (py_compile for .py files) → 0.3 points
- DB impact (new rows in last 5 min) → 0.3 points
- Runtime errors (systemd logs) → 0.3 points
- HTTP health (4 key endpoints) → 0.1 points
- Tested on current task: Score 0.82/1.0 ✓
- Quality review system working and discovered real bug: api.py line 945 queries non-existent
papers table
- Fixed api.py to handle missing papers table gracefully (wrapped in try-except)
- Verified syntax: api.py valid ✓
2026-04-01 23:56 PT — Slot 8
- Added CLI integration:
scidex quality --task-id <id> [--store] [--json]
- Tested storage in agent_performance table: ✓ Working (score: 0.975)
- Final quality score: 0.97/1.0 (97% pass rate)
- All checks working as designed
- Ready to commit and push
- Result: Senate quality review system complete and functional
2026-04-17 00:25 PT — Slot minimax:65
- Re-implemented quality_review.py after original work was lost (branch not merged to main)
- Creates quality_review.py with 4 checks:
- Syntax validation (py_compile for api.py, agent.py) → 0.3 points
- DB impact (new rows in key tables) → 0.3 points
- Runtime errors (journalctl for NameError, AttributeError, etc.) → 0.3 points
- HTTP health (4 key endpoints) → 0.1 points
- Fixed path detection for worktree (uses worktree DB if exists, falls back to main DB)
- CLI integration already present in cli.py (cmd_quality)
- Tested: syntax 0.300, DB 0.300, runtime 0.000 (errors found in scidex-agent), HTTP 0.000 (network restrictions)
- HTTP health fails in this environment but would work in production
- Committed: 702a46c85 and pushed to orchestra/task/f0fa02ea-add-task-quality-review-auto-evaluate-co
2026-04-19 03:56 PT — Slot minimax:63
- Re-evaluated task: quality_review.py already on origin/main (commit 0ba4420ee)
- Bug found: sqlite3OperationalError undefined variable on line 100
- Fixed to sqlite3.OperationalError and committed 3c5dd8458
- Pushed via git push --force-with-lease (branch was divergent after rebase)
- Quality review system already shipped; this was a bug fix to correct runtime error