[Senate] Audit watchdog auto-complete behavior (weekly)

Task

  • ID: a60ddedb-b559-4b62-9e32-a930fdbe15d3
  • Type: recurring
  • Frequency: weekly
  • Layer: Senate
Effort: quick

> Restoration note (2026-05-18): This spec was originally landed
> via SciDEX PR #1225 (branch watchdog-followups) on 2026-04-28.
> That PR was closed without merging; the spec file was dropped.
> The corresponding Orchestra task
> af07897f-6af5-4f87-b850-6294fcdd5a5c is still registered and
> active. Restoring this file fixes the broken spec_path reference.

Goal

Once a week, audit how the abandonment watchdog has been auto-completing
tasks and confirm:

  • Every parent task behind a [Watchdog] Fix: repair-task auto-complete
    is in a healthy terminal state (done for one-shots, done/open
    for recurring, archived/failed if explicitly closed).
  • Real work tasks that completed with commits>0 have a corresponding
    on-main commit (cross-check last_commit_sha against git log).
  • The 0-commit auto-complete population is dominated by the two
    legitimate reason="repair_clean_exit" / reason="repair_periodic_scan"
    shapes, NOT by some new pattern that's masking work loss.

If the audit finds an unexpected pattern (e.g. one-shot tasks
auto-completed with 0 commits and parent status=open), open a Senate
triage task with the affected run IDs.
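
The parent-state rule can be written as a small predicate. This is a minimal sketch, not the audit script itself: the function name and the task-type labels "one_shot" and "recurring" are assumptions made for illustration, and the status sets follow the stricter reading ({done, archived, failed} for one-shots, {open, done} for recurring).

```python
# Hypothetical task-type labels; the real Orchestra values may differ.
HEALTHY_PARENT_STATES = {
    "one_shot": {"done", "archived", "failed"},
    "recurring": {"open", "done"},
}


def is_healthy_parent(task_type: str, status: str) -> bool:
    """Return True when a repair task's parent is in an expected state."""
    allowed = HEALTHY_PARENT_STATES.get(task_type)
    if allowed is None:
        # Unknown task types are treated as unhealthy so they get reviewed.
        return False
    return status in allowed
```

With these sets, the example anomaly above (a one-shot parent still in status=open) evaluates to False and would be reported.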

Background

The 2026-04-27/28 manual audit caught the operator-facing confusion
("Auto-completed by abandonment watchdog: 0 commit(s) on") and verified
no work was actually being lost. Orchestra PR #223 (merged 2026-04-28)
removes the message confusion, but the behavior could still drift:
for example, a future regression could apply the
reason="repair_clean_exit" path to a real work task incorrectly.

This recurring task is the periodic check that the watchdog stays
honest.

What it does

The audit script (scripts/audit_watchdog_auto_completes.py,
shipped as part of this task) queries Orchestra's task_runs DB:

  • Find every task_runs row with status=completed in the last 7 days
    where the result_summary starts with Auto-completed.
  • For each, parse the reason (now embedded in the summary text post
    PR #223).
  • Categorize:
      - commits_landed: verify last_commit_sha is set (or, until
        the populate-last_commit_sha task lands, verify a [task:<id>]
        commit exists in git log --all).
      - repair_clean_exit / repair_periodic_scan: verify the parent
        task (extracted from the repair task description) is in
        {done, archived, failed} for one-shots, or {open, done} for
        recurring.
  • Emit a structured report to
    /home/ubuntu/scidex/logs/watchdog_auto_complete_audit-latest.json.
  • If any category contains anomalies (e.g. one-shot parent in open
    state when the repair-task auto-completed without commits, or a
    commits_landed row with no matching git commit), open a Senate
    triage task linking the run IDs.
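
The first two steps (select the recent auto-completed runs, then parse the reason) might look roughly like the sketch below. It rests on several assumptions: the spec does not name the database engine, so sqlite3 stands in here; the columns id, task_id and completed_at are hypothetical (status, result_summary and last_commit_sha come from the text above); and the summary is assumed to carry the reason as reason="<name>" or reason=<name>.

```python
import re
import sqlite3
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumed summary shape: ... reason="repair_clean_exit" ... (quotes optional).
REASON_RE = re.compile(r'reason="?([a-z_]+)"?')


def parse_reason(summary: str) -> Optional[str]:
    """Extract the reason token from a result_summary, or None if absent."""
    match = REASON_RE.search(summary)
    return match.group(1) if match else None


def fetch_auto_completes(conn, now: datetime, window_days: int = 7):
    """Return auto-completed runs finished within the audit window.

    Assumes completed_at holds ISO-8601 strings in one consistent
    timezone, so that string comparison matches time order.
    """
    cutoff = (now - timedelta(days=window_days)).isoformat()
    rows = conn.execute(
        "SELECT id, task_id, result_summary, last_commit_sha "
        "FROM task_runs "
        "WHERE status = 'completed' "
        "AND completed_at >= ? "
        "AND result_summary LIKE 'Auto-completed%' "
        "ORDER BY id",
        (cutoff,),
    ).fetchall()
    return [
        {
            "run_id": run_id,
            "task_id": task_id,
            "reason": parse_reason(summary),
            "last_commit_sha": sha,
        }
        for run_id, task_id, summary, sha in rows
    ]
```

Rows whose reason parses to None, or to a value other than the three named above, are natural candidates for the anomalies list, since they may indicate a new auto-complete shape.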

Acceptance criteria

  ☐ Recurring task fires weekly; orchestra runs shows a completed
    run within the last 8 days >95% of the time.
  ☐ Audit report exists, no older than 8 days during normal operation.
  ☐ Anomalies (if any) produce Senate triage tasks rather than silent
    log-only outcomes.
  ☐ Cap 3 triage tasks per cycle to prevent flood; if the audit finds
    >3 distinct anomaly classes, dedupe and open one summary task.
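
The cap in the last criterion can be sketched as a planning step that runs before any task is opened. The sketch assumes that an anomaly's "issue" string identifies its class; the task titles and the shape of the returned dicts are hypothetical.

```python
from collections import defaultdict

MAX_TRIAGE_TASKS = 3  # cap from the acceptance criteria


def plan_triage_tasks(anomalies):
    """Group anomalies by class and return the triage tasks to open.

    Assumption: each anomaly's "issue" string identifies its class.
    """
    run_ids_by_class = defaultdict(list)
    for anomaly in anomalies:
        run_ids_by_class[anomaly["issue"]].append(anomaly["run_id"])

    if not run_ids_by_class:
        return []

    if len(run_ids_by_class) > MAX_TRIAGE_TASKS:
        # Too many classes: collapse everything into one summary task.
        all_run_ids = sorted(
            {rid for ids in run_ids_by_class.values() for rid in ids}
        )
        return [{
            "title": "[Senate] Watchdog audit: multiple anomaly classes",
            "issues": sorted(run_ids_by_class),
            "run_ids": all_run_ids,
        }]

    # At most MAX_TRIAGE_TASKS classes: one task per class.
    return [
        {"title": f"[Senate] Watchdog audit: {issue}",
         "issues": [issue],
         "run_ids": ids}
        for issue, ids in run_ids_by_class.items()
    ]
```

Keeping the planning separate from the task-creation call makes the cap easy to test without touching Orchestra.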

Approach

  • Write scripts/audit_watchdog_auto_completes.py with the algorithm
    above. Use the read-only Orchestra DB connection (the audit is
    pure SELECT).
  • Match parent tasks via the [Watchdog] Fix: description pattern
    (the description carries the parent short-id in backticks, as we
    discovered during the 2026-04-28 manual audit).
  • Match commits_landed runs to git commits via
    git log --all --fixed-strings --grep='[task:<id>]'.
    Without --fixed-strings, git reads the square brackets as a regex
    bracket expression and matches almost every commit.
  • Output a JSON report:

        {
          "generated_at": "...",
          "window_days": 7,
          "total_auto_completes": 0,
          "by_reason": {"commits_landed": 0, "repair_clean_exit": 0, ...},
          "anomalies": [{"run_id": "...", "reason": "...", "issue": "..."}]
        }

  • If anomalies is non-empty, open Senate triage tasks (cap 3).
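
The matching and reporting steps above could be sketched as follows. Treat it as an illustration under stated assumptions: the short-id is assumed to be at least eight lowercase hex characters inside backticks, the helper names are hypothetical, and the run parameter exists only so the git call can be stubbed in tests. The git search passes --fixed-strings so that the square brackets in the [task:<id>] tag are matched literally rather than as a regex bracket expression.

```python
import json
import re
import subprocess
from datetime import datetime, timezone

# Assumed shape of the parent short-id: >= 8 lowercase hex chars in backticks.
SHORT_ID_RE = re.compile(r"`([0-9a-f]{8,})`")


def extract_parent_short_id(description):
    """Return the backticked short-id from a repair task description."""
    match = SHORT_ID_RE.search(description)
    return match.group(1) if match else None


def has_task_commit(task_id, repo_dir=".", run=subprocess.run):
    """Return True if any ref contains a commit tagged [task:<task_id>]."""
    result = run(
        ["git", "-C", repo_dir, "log", "--all", "--fixed-strings",
         f"--grep=[task:{task_id}]", "--format=%H", "--max-count=1"],
        capture_output=True, text=True, check=True,
    )
    return bool(result.stdout.strip())


def build_report(runs, anomalies, window_days=7, now=None):
    """Assemble the report dict in the shape shown above."""
    now = now or datetime.now(timezone.utc)
    by_reason = {}
    for run_row in runs:
        by_reason[run_row["reason"]] = by_reason.get(run_row["reason"], 0) + 1
    return {
        "generated_at": now.isoformat(),
        "window_days": window_days,
        "total_auto_completes": len(runs),
        "by_reason": by_reason,
        "anomalies": list(anomalies),
    }


def write_report(report, path):
    """Serialize the report as JSON to the given path."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(report, handle, indent=2)
```

Each entry in runs only needs a "reason" key here; the anomalies list is passed through unchanged.
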

Dependencies

  • Orchestra PR #223 (watchdog message clarity): merged 2026-04-28.
    The reason= field is now embedded in the summary text and parsable.
  • (Optional but improves the audit) Task
    ecead4e7-d198-4fd2-9bd1-299abc4de6a3 (populate last_commit_sha).
    Once landed, the audit's commits_landed verification becomes
    trivial instead of a git log --grep fallback.
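
The difference the optional dependency makes can be sketched as a verification that prefers last_commit_sha and uses the grep search only when the column is empty. This is one possible shape, not the shipped script: the branch name main, the function names, and the choice of git merge-base --is-ancestor as the on-main check are all assumptions.

```python
import subprocess


def sha_is_on_branch(sha, repo_dir=".", branch="main", run=subprocess.run):
    """Return True if sha is an ancestor of (or equal to) the branch tip."""
    result = run(
        ["git", "-C", repo_dir, "merge-base", "--is-ancestor", sha, branch],
        capture_output=True, text=True,
    )
    # Exit code 0 means ancestor, 1 means not an ancestor; others are errors.
    if result.returncode in (0, 1):
        return result.returncode == 0
    raise RuntimeError(f"git merge-base failed: {result.stderr.strip()}")


def verify_commits_landed(row, grep_fallback, on_branch=sha_is_on_branch):
    """Verify a commits_landed run, preferring the recorded SHA.

    grep_fallback is any callable taking a task id and returning a bool,
    for example a git log --grep search for the [task:<id>] tag.
    """
    sha = row.get("last_commit_sha")
    if sha:
        return on_branch(sha)
    return grep_fallback(row["task_id"])
```

A run that fails this check would be reported as a commits_landed anomaly.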

Dependents

None. This is a leaf operational task.

Why priority 65

Lower than the abandoned-run watchdog (95) because this audits an
already-audited pattern. Higher than typical research tasks because
operational integrity drift is hard to diagnose retroactively.

File: a60ddedb-b559-4b62-9e32-a930fdbe15d3_audit_watchdog_auto_completes_spec.md
Modified: 2026-05-20 18:03
Size: 5.0 KB