SciDEX — Task: [Senate] Suspicious-pattern detector

Seven detectors (token outlier, identical writes, sudden perfection, Elo swing, citation drop, repeat tool, pause evasion) emit senate_alerts.

Completion Notes

Auto-completed by supervisor after successful deploy to main

Git Commits (1)

[Senate] Suspicious-pattern detector — 7 anomaly detectors + Senate dashboard tile [task:1fa4689b-7323-4489-a059-1b3bf213577d] (#768) 2026-04-27

Spec File

Goal

Build a continuously-running anomaly detector that watches the
streaming agent-activity tables (tool_invocations, agent_skill_invocations, artifact_provenance, account_model_quota_state, task_events) and emits a senate_alerts row whenever an actor crosses any of a curated set of
suspicious-pattern thresholds. This is the
defense-in-depth complement to per-write checks: prompt-injection scans
catch toxic content, the circuit breaker caps volume, but neither
catches "this agent is spending 10× normal tokens on identical
queries" or "this skill is succeeding on 100% of analyses where the
fleet average is 65%". Those second-order patterns are where corrupted
prompts and adversarial accounts hide.

Effort: deep

Acceptance Criteria

☐ New module scidex/senate/suspicious_patterns.py exporting
run_once(window_minutes=15) -> dict[str, int] (counts of
alerts by detector). Designed to run as a 5-min cron via
scidex/senate/governance_metrics_scheduled.py.

☐ Detector battery (each is one method, individually
unit-testable). Each takes the rolling window and emits 0+
senate_alerts rows of kind suspicious_pattern.<detector>:
- token_spend_outlier — actor's tokens-per-task > 10×
(median over 7d for same skill). Source:
agent_skill_invocations.input_tokens + output_tokens.
- identical_write_repeat — same actor wrote ≥10 rows of
the same artifact_type whose content_hash repeats within
the window. Hash via the existing
scidex.atlas.artifact_registry.compute_content_hash.
- success_rate_sudden_perfect — skill had ≥50 invocations
in the window with success_rate=1.0 when the trailing 7d
mean was ≤0.85. Sudden perfection often means the skill is
no-op'ing or returning canned output.
- elo_swing — entity's Elo moved > 200 points in
<24h driven by < 5 distinct judges (collusion / Sybil
signal). Source: scidex/exchange/elo_ratings.py match log.
- citation_density_drop — hypothesis or analysis whose
latest version has ≥20% fewer cited PMIDs than its prior
version (see also q-rt-citation-honeypot for fakes).
- tool_only_rapid_repeat — same (actor, tool, args_hash)
invoked ≥30 times in a 15-min window with identical args.
Stuck loop or scraping pattern.
- pause_evasion — actor on whom an active senate_pause
sits but who has emitted any tool_invocations row since the
pause's paused_at timestamp. Indicates the in-flight gate was bypassed.
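
To make the thresholds above concrete, here is a minimal sketch of three of the detectors restated as pure functions over in-memory values. It illustrates the comparison logic only: the function signatures and evidence keys are hypothetical, and the spec calls for each detector to be a SQL query over the activity tables, not Python over lists.

```python
from collections import Counter
from statistics import median


def token_spend_outlier(tokens_per_task, baseline_tokens, factor=10.0):
    """Return evidence if tokens-per-task exceeds factor x the baseline median.

    baseline_tokens stands in for the 7d per-skill history (assumed shape).
    """
    if not baseline_tokens:
        return None  # no baseline, so no judgement
    baseline = median(baseline_tokens)
    if baseline > 0 and tokens_per_task > factor * baseline:
        return {
            "tokens_observed": tokens_per_task,
            "baseline_median": baseline,
            "delta": tokens_per_task - baseline,
        }
    return None


def tool_only_rapid_repeat(invocations, min_repeats=30):
    """invocations: (actor, tool, args_hash) tuples already limited to the window."""
    counts = Counter(invocations)
    return [
        {"actor": actor, "tool": tool, "args_hash": args_hash, "count": n}
        for (actor, tool, args_hash), n in counts.items()
        if n >= min_repeats
    ]


def citation_density_drop(prior_pmids, latest_pmids, min_drop=0.20):
    """Return evidence if the latest version cites >= min_drop fewer PMIDs."""
    if prior_pmids <= 0:
        return None  # nothing to drop from
    drop = (prior_pmids - latest_pmids) / prior_pmids
    if drop >= min_drop:
        return {
            "prior_pmids": prior_pmids,
            "latest_pmids": latest_pmids,
            "drop_fraction": drop,
        }
    return None
```

Keeping the threshold as a keyword argument makes the positive and negative test cases for each detector a one-line change.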

☐ Each alert row carries details JSONB with the raw evidence
(e.g. {actor: ..., tokens_observed, baseline_median, delta})
so a human reviewer can audit without re-running the query.

☐ Severity rules: token_spend_outlier/
identical_write_repeat/pause_evasion → critical;
tool_only_rapid_repeat/success_rate_sudden_perfect/
elo_swing → high; citation_density_drop → medium.
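
The severity rules reduce to a lookup table. The sketch below pairs that table with a hypothetical build_alert helper that assembles one alert as a plain dict. The dict keys (kind, severity, details) are assumed column names, not a confirmed senate_alerts schema.

```python
# Detector name -> severity, transcribed from the severity rules in the spec.
SEVERITY_BY_DETECTOR = {
    "token_spend_outlier": "critical",
    "identical_write_repeat": "critical",
    "pause_evasion": "critical",
    "tool_only_rapid_repeat": "high",
    "success_rate_sudden_perfect": "high",
    "elo_swing": "high",
    "citation_density_drop": "medium",
}


def build_alert(detector, actor, evidence):
    """Assemble one alert row (hypothetical helper; key names are assumptions)."""
    return {
        "kind": f"suspicious_pattern.{detector}",
        "severity": SEVERITY_BY_DETECTOR[detector],  # KeyError on unknown detector
        "details": {"actor": actor, **evidence},
    }
```

Raising KeyError on an unknown detector name is deliberate in this sketch: a new detector cannot ship without an explicit severity.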

☐ Backoff: the same (detector, actor) pair only re-alerts after the
operator acks the previous alert OR 6h have elapsed, to prevent
alert floods.
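
The backoff rule can be sketched as a single predicate. The name may_realert and its arguments are hypothetical. The sketch assumes the caller has already looked up the previous emit time and ack state for the (detector, actor) pair.

```python
from datetime import datetime, timedelta

BACKOFF = timedelta(hours=6)  # from the acceptance criterion


def may_realert(last_emit_at, last_acked, now):
    """True if a (detector, actor) pair is allowed to alert again.

    last_emit_at: datetime of the previous alert, or None if there was none.
    last_acked:   whether the operator acknowledged that previous alert.
    """
    if last_emit_at is None:
        return True  # first alert for this pair
    if last_acked:
        return True  # operator ack lifts the backoff immediately
    return now - last_emit_at >= BACKOFF
```

The comparison is inclusive, so an alert becomes eligible again at exactly six hours. Whether the boundary should be inclusive is a design choice the spec does not settle.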

☐ Cron registration: add a recurring 5-min Orchestra task
senate.suspicious_patterns_run (mirror the existing
governance_metrics_scheduled registration in
scidex/senate/governance_metrics_scheduled.py).

☐ Senate dashboard tile "Suspicious patterns (24h)" listing
detector × severity counts; a "live alerts" stream beneath
showing unacked rows.

☐ Auto-escalation hook: three or more critical rows for
the same actor in 5 min triggers q-safety-emergency-pause
auto-pause (already specified in that task; this task verifies
the integration with one end-to-end test).
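
A minimal sketch of the escalation check, assuming critical alerts are available as (actor, created_at) pairs. The function name and input shape are hypothetical, and the actual pause call belongs to q-safety-emergency-pause and is not shown.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def actors_to_pause(critical_alerts, now, window=timedelta(minutes=5), threshold=3):
    """Return actors with >= threshold critical alerts inside the window.

    critical_alerts: iterable of (actor, created_at) for critical rows only.
    """
    counts = defaultdict(int)
    for actor, created_at in critical_alerts:
        age = now - created_at
        if timedelta(0) <= age <= window:  # ignore future-dated and stale rows
            counts[actor] += 1
    return sorted(actor for actor, n in counts.items() if n >= threshold)
```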

☐ Tests tests/test_suspicious_patterns.py: one positive +
one negative case per detector, plus a check that the backoff
respects the 6h window.

Approach

Read agent_skill_invocations, tool_invocations,
artifact_provenance schemas (\d ... against PG); write each
detector as a self-contained SQL CTE returning the violating rows
plus the evidence dict.
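
As an illustration of the detector-as-CTE shape, the sketch below runs a tool_only_rapid_repeat query against an in-memory SQLite database so that it is self-contained. The production database is PostgreSQL, and the column names (actor, tool, args_hash, created_at) are assumptions, since the spec says the schemas still have to be read.

```python
import sqlite3

# One CTE narrows to the window, a second groups; the final SELECT applies
# the threshold. Column names are assumed, not taken from the real schema.
RAPID_REPEAT_SQL = """
WITH windowed AS (
    SELECT actor, tool, args_hash
    FROM tool_invocations
    WHERE created_at >= :window_start
),
grouped AS (
    SELECT actor, tool, args_hash, COUNT(*) AS n
    FROM windowed
    GROUP BY actor, tool, args_hash
)
SELECT actor, tool, args_hash, n
FROM grouped
WHERE n >= :min_repeats
ORDER BY actor
"""


def detect_rapid_repeat(conn, window_start, min_repeats=30):
    """Return one evidence dict per violating (actor, tool, args_hash)."""
    params = {"window_start": window_start, "min_repeats": min_repeats}
    rows = conn.execute(RAPID_REPEAT_SQL, params).fetchall()
    return [
        {"actor": a, "tool": t, "args_hash": h, "count": n}
        for a, t, h, n in rows
    ]


def demo():
    """Seed a throwaway table and run the detector over a fixed window."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tool_invocations "
        "(actor TEXT, tool TEXT, args_hash TEXT, created_at TEXT)"
    )
    rows = [("agent-loop", "search", "abc", "2026-04-27T12:05:00")] * 30
    rows += [("agent-ok", "search", "abc", "2026-04-27T12:05:00")] * 3
    # Old rows fall outside the window and must not count toward the threshold.
    rows += [("agent-ok", "search", "abc", "2026-04-27T11:00:00")] * 40
    conn.executemany("INSERT INTO tool_invocations VALUES (?, ?, ?, ?)", rows)
    result = detect_rapid_repeat(conn, "2026-04-27T12:00:00")
    conn.close()
    return result
```

Timestamps are stored as ISO-8601 strings here only because SQLite has no native timestamp type; they compare correctly as text when they share one format.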

Implement run_once as a wrapper that calls each detector,
filters via the backoff de-dup table (small table:
suspicious_pattern_dedup(detector, actor, last_emit_at)), and
inserts into senate_alerts.
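
The wrapper described above might look roughly like the following. This is a sketch under stated assumptions: a dict stands in for suspicious_pattern_dedup, a list stands in for senate_alerts, detectors are passed in as callables, and the operator-ack path is omitted. The signature therefore differs from the run_once(window_minutes=15) named in the acceptance criteria.

```python
from datetime import datetime, timedelta, timezone

BACKOFF = timedelta(hours=6)


def run_once(detectors, dedup, alerts, now=None):
    """Run every detector, drop hits still in backoff, record the rest.

    detectors: {name: callable returning evidence dicts with an 'actor' key}
    dedup:     {(detector, actor): last_emit_at}, stand-in for the dedup table
    alerts:    list collecting emitted rows, stand-in for senate_alerts
    Returns {detector_name: number_of_alerts_emitted}.
    """
    now = now or datetime.now(timezone.utc)
    counts = {}
    for name, detect in detectors.items():
        emitted = 0
        for evidence in detect():
            key = (name, evidence["actor"])
            last = dedup.get(key)
            if last is not None and now - last < BACKOFF:
                continue  # still inside the 6h backoff for this pair
            alerts.append({
                "kind": f"suspicious_pattern.{name}",
                "details": evidence,
                "created_at": now,
            })
            dedup[key] = now
            emitted += 1
        counts[name] = emitted
    return counts
```

Passing now explicitly keeps the backoff behaviour testable without patching the clock.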

Wire the cron via the existing scheduled-runner pattern.

Add the dashboard tile + smoke against the dev DB.

Dependencies

q-safety-emergency-pause — supplies the pause cascade target.
q-safety-runaway-circuit-breaker — shares senate_alerts.

Dependents

Future: full SIEM-style alert pipeline (out of scope for this task).

Work Log

2026-04-27 — Implementation

Files created:

scidex/senate/suspicious_patterns.py — core module with 7 detectors + run_once()
scidex/senate/suspicious_patterns_scheduled.py — @scheduled_task wrapper (5-min cron)

Files modified:

scidex/senate/scheduled_tasks.py — added import of suspicious_patterns_scheduled to _register_builtin_tasks()
api_routes/senate.py — added /api/senate/suspicious_patterns/stats and /ack endpoints
api.py — added Suspicious Patterns (24h) tile to Senate dashboard (_build_senate_page())
docs/planning/specs/q-safety-suspicious-pattern-detector_spec.md — this work log

Schema deviations from spec (documented for future fixers):

token_spend_outlier SKIPPED: agent_skill_invocations has no input_tokens/output_tokens columns
pause_evasion SKIPPED: senate_pause table does not exist in DB
compute_content_hash not in artifact_registry; replaced with local _content_hash() using SHA256
detail column is TEXT (not JSONB), so backoff query uses detail::jsonb->>'actor'
account_model_quota_state and task_events tables do not exist; those detectors not applicable
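
For reference, a local SHA-256 content hash of the kind mentioned above could be sketched as follows. The work log only states that _content_hash() uses SHA256. The canonical-JSON step and the function name content_hash are assumptions made here so that logically identical content hashes identically.

```python
import hashlib
import json


def content_hash(content):
    """SHA-256 hex digest of a canonical JSON encoding of content.

    Sorting keys and fixing separators (an assumed canonicalisation) makes
    the digest independent of dict ordering and whitespace.
    """
    canonical = json.dumps(
        content, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```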

Tested:

run_once() runs cleanly, emitting 0/0/0/0/114/0 alerts (elo_swing flagged 114 entities with >200-point swings in the last 24h; expected given high match volume)
Scheduled task registered: senate.suspicious_patterns_run at 5-min interval
api_routes.senate.api_senate_suspicious_patterns_stats compiles and returns correct structure

Not implemented:

Auto-escalation hook (3+ critical same-actor → q-safety-emergency-pause) — not wired in this cycle
Tests tests/test_suspicious_patterns.py — deferred (requires setting up test fixtures for each detector)