[Senate] AGENTS.md re-trim: file regrew past 25k Read cap

Task

  • ID: 445ecead-dfb3-47cc-ab8f-4d9f5af08d48
  • Type: one_shot
  • Layer: Senate
  • Effort: standard

Goal

/home/ubuntu/scidex/AGENTS.md is currently 1085 lines / 26,315
tokens. The Read tool's hard cap is 25,000 tokens; **every agent that
reads AGENTS.md without limit= fails on first try and has to retry
with offsets**. This is the exact regression PR #1223 fixed on
2026-04-28; the file has since grown back through three commits that
appended content without extracting:

  • 175419ac7 standing pre-approval for /batch and /loop
  • e09a5560b substrate v2 concurrent pip-install footgun
  • f2c2cc5c7 merge CLAUDE.md into AGENTS.md

What this task does

Bring AGENTS.md back under ~22,500 tokens (≥10% headroom under the
cap) using the same extract-and-summarize pattern that worked in PR
#1223:

  • Measure: Run wc -l AGENTS.md and
    python3 -c "import os; print(os.path.getsize('AGENTS.md'))".
    Confirm size > 90 KB.
    Verify Read-tool failure by calling Read with no offset/limit; it
    should error with "exceeds maximum allowed tokens (25000)".

  • Identify three extractions from the regressed sections:
    - /batch and /loop pre-approval → extract to
      docs/dev/batch_loop_preapproval.md (or similar) with a 3-5
      line pointer in AGENTS.md.
    - Substrate-v2 pip footgun → extract to
      docs/dev/substrate_v2_pip_concurrent_install.md; pointer in
      AGENTS.md.
    - CLAUDE.md merge → audit the merged content. Likely the
      CLAUDE.md material was project-overview / top-level context.
      Keep the critical "read AGENTS.md first" pointer inline; extract
      the rest to whichever existing detail doc fits, or to a new
      docs/design/project_overview.md.

  • Re-trim opportunities in still-fat AGENTS.md sections (only if
    the three extractions above aren't enough):
    - Re-audit the "Plan Before Acting" section (still ~60 lines), a
      candidate for a docs/dev/agent_planning_checklist.md pointer.
    - Re-audit "Git Safety (CRITICAL)": the postgres/sqlite-retirement
      details could move to docs/design/sqlite_retirement.md (already
      exists per AGENTS.md cross-ref).

  • Lock in with a guard: Add an entry to
    .claude/hooks/guard-main-writes.sh (or a sibling check) that
    refuses to commit AGENTS.md if it exceeds 24,500 tokens. Make the
    check easy to override with a clear flag (AGENTS_MD_OVER_CAP=1)
    so an operator can still ship an emergency edit, but by default
    the check rejects large appends that would otherwise slip in
    unnoticed.
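
The guard step could be sketched roughly as follows. This is a minimal illustration, not the actual hook: the function names estimate_tokens and guard_allows are hypothetical, the 4-characters-per-token estimate is borrowed from the test in this spec's acceptance criteria and only approximates what the Read tool counts, and how the check would be invoked from guard-main-writes.sh is left open.

```python
TOKEN_LIMIT = 24_500                 # guard threshold named in this spec
OVERRIDE_VAR = "AGENTS_MD_OVER_CAP"  # override flag named in this spec


def estimate_tokens(text: str) -> int:
    """Rough estimate, assuming ~4 characters per token (a heuristic)."""
    return len(text) // 4


def guard_allows(text: str, env: dict) -> bool:
    """Return True if a commit touching AGENTS.md should proceed.

    `env` is a mapping of environment variables (e.g. dict(os.environ)).
    """
    # An operator can force an emergency edit through with the flag.
    if env.get(OVERRIDE_VAR) == "1":
        return True
    # Otherwise refuse anything estimated above the threshold.
    return estimate_tokens(text) <= TOKEN_LIMIT
```

A hook script could read AGENTS.md, call guard_allows(text, dict(os.environ)), and exit non-zero with a message naming the override flag when it returns False. Passing the environment in as an argument keeps the check easy to test without touching the real environment.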

Acceptance criteria

  ☐ AGENTS.md ≤ 22,500 tokens (verified by Read tool returning the
    whole file without the cap error).
  ☐ All three regression sections (batch/loop, substrate-v2 pip,
    CLAUDE.md merge) extracted to detail docs with pointer summaries
    remaining inline.
  ☐ Pre-commit guard added that refuses AGENTS.md commits over
    24,500 tokens unless AGENTS_MD_OVER_CAP=1 is set.
  ☐ Test: clone the repo, then run
    python3 -c "from pathlib import Path;
    import sys; sys.exit(0 if len(Path('AGENTS.md').read_text())//4 <
    25000 else 1)"
    and confirm it exits with status 0.
  ☐ No regression in detail-doc coverage: every extracted section
    is reachable via a clear pointer in AGENTS.md.
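
The pointer-coverage criterion could be checked mechanically along these lines. This is only a sketch under assumptions: missing_pointers is a hypothetical helper, a "pointer" is taken to mean the detail doc's path appearing verbatim in AGENTS.md, and the paths passed in would be whichever detail docs the extraction actually creates (the names in this spec are candidates, not confirmed).

```python
def missing_pointers(agents_text: str, detail_docs: list) -> list:
    """Return the detail-doc paths that AGENTS.md never mentions.

    Assumes a pointer is the path appearing verbatim in the text.
    """
    return [doc for doc in detail_docs if doc not in agents_text]


# Illustrative input only; real text would come from AGENTS.md.
sample = "For /batch and /loop rules see docs/dev/batch_loop_preapproval.md."
unreferenced = missing_pointers(
    sample,
    [
        "docs/dev/batch_loop_preapproval.md",
        "docs/dev/substrate_v2_pip_concurrent_install.md",
    ],
)
```

An empty result would mean every listed doc is referenced at least once; it does not show that the pointer summary is clear, which still needs a human read.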

Why priority 80

This is a real, daily friction. The watchdog repair task that
originally surfaced the issue (a1c31194-... on 2026-04-28) is the
canonical example: an agent's first action was a Read on AGENTS.md,
which failed, costing the run a few cycles before recovery. Every
agent that picks up a new task pays this cost.

Out of scope

  • The pre-commit guard implementation can be deferred to a follow-up
    task if it's not trivially droppable into the existing
    guard-main-writes.sh hook. The trim itself is the load-bearing
    piece.
  • Rewriting AGENTS.md from scratch. Keep the structure; just
    extract.

File: 445ecead-dfb3-47cc-ab8f-4d9f5af08d48_agentsmd_retrim_2026_05_18_spec.md
Modified: 2026-05-20 18:03
Size: 4.0 KB