[Senate] AGENTS.md re-trim: file regrew past 25k Read cap
Task
- ID: 445ecead-dfb3-47cc-ab8f-4d9f5af08d48
- Type: one_shot
- Layer: Senate
- Effort: standard
Goal
/home/ubuntu/scidex/AGENTS.md is currently 1085 lines / 26,315
tokens. The Read tool's hard cap is 25,000 tokens; **every agent that
reads AGENTS.md without limit= fails on first try and has to retry
with offsets**. This is the exact regression PR #1223 fixed on
2026-04-28; the file has since grown back through three commits that
appended content without extracting:
- 175419ac7 standing pre-approval for /batch and /loop
- e09a5560b substrate v2 concurrent pip-install footgun
- f2c2cc5c7 merge CLAUDE.md into AGENTS.md
What this task does
Bring AGENTS.md back under ~22,500 tokens (≥10% headroom under the
cap) using the same extract-and-summarize pattern that worked in PR
#1223:
Measure: Run wc -l AGENTS.md and
python3 -c "import os; print(os.path.getsize('AGENTS.md'))".
Confirm the size exceeds 90 KB. Verify the Read-tool failure by
calling Read with no offset/limit; it should error with
"exceeds maximum allowed tokens (25000)".
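The measurement can also be scripted in one pass. The following is a minimal sketch, assuming the same 4-characters-per-token heuristic that the acceptance test in this task uses; the Read tool's actual tokenizer may count differently, so the estimate is only approximate. The names `measure` and `CHARS_PER_TOKEN` are hypothetical.

```python
from pathlib import Path

# Assumption: ~4 characters per token, matching the heuristic in the
# acceptance test. This is not the Read tool's real tokenizer.
CHARS_PER_TOKEN = 4


def measure(path):
    """Return line count, byte size, and an estimated token count for a file."""
    p = Path(path)
    text = p.read_text(encoding="utf-8")
    return {
        # splitlines() also counts a final line with no trailing newline,
        # so this can be one higher than `wc -l` in that case.
        "lines": len(text.splitlines()),
        "bytes": p.stat().st_size,
        "est_tokens": len(text) // CHARS_PER_TOKEN,
    }
```

Calling `measure("AGENTS.md")` from the repo root before and after the trim gives comparable numbers for the task log.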
Identify three extractions from the regressed sections:
- /batch and /loop pre-approval → extract to
  docs/dev/batch_loop_preapproval.md (or similar) with a 3-5
  line pointer in AGENTS.md.
- Substrate-v2 pip footgun → extract to
  docs/dev/substrate_v2_pip_concurrent_install.md; pointer in
  AGENTS.md.
- CLAUDE.md merge → audit the merged content. The CLAUDE.md
  material was likely project-overview / top-level context.
  Keep the critical "read AGENTS.md first" pointer inline; extract
  the rest to whichever existing detail doc fits, or to a new
  docs/design/project_overview.md.
Re-trim opportunities in still-fat AGENTS.md sections (if the three
extractions above aren't enough):
- Re-audit the "Plan Before Acting" section (still ~60 lines):
  candidate for a docs/dev/agent_planning_checklist.md pointer.
- Re-audit "Git Safety (CRITICAL)": postgres/sqlite-retirement
  details could move to docs/design/sqlite_retirement.md (already
  exists per AGENTS.md cross-ref).
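To decide which sections to re-audit first, it may help to rank them by size. This is a rough sketch, assuming AGENTS.md uses ATX-style markdown headings (lines starting with `#`) and the same 4-characters-per-token estimate; `section_sizes` is a hypothetical helper, not an existing tool in the repo.

```python
import re

CHARS_PER_TOKEN = 4  # assumption: rough heuristic, not a real tokenizer
HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


def section_sizes(text):
    """Split markdown on ATX headings.

    Returns (title, line_count, est_tokens) tuples, largest first.
    """
    sections = []
    title, body = "(preamble)", []
    in_fence = False
    for line in text.splitlines():
        # Toggle on code fences so '# comment' lines inside code
        # blocks are not mistaken for headings.
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
        match = None if in_fence else HEADING.match(line)
        if match:
            sections.append((title, body))
            title, body = match.group(2), [line]
        else:
            body.append(line)
    sections.append((title, body))
    rows = [
        (t, len(b), len("\n".join(b)) // CHARS_PER_TOKEN)
        for t, b in sections
        if b  # skip an empty preamble
    ]
    return sorted(rows, key=lambda row: row[2], reverse=True)
```

Printing the top few rows of `section_sizes(Path("AGENTS.md").read_text())` would show whether "Plan Before Acting" and "Git Safety (CRITICAL)" are in fact the largest remaining sections.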
Lock in with a guard: Add a check to
.claude/hooks/guard-main-writes.sh (or a sibling check) that
refuses to commit AGENTS.md if it exceeds 24,500 tokens. Make the
check easy to override with a clear flag (AGENTS_MD_OVER_CAP=1)
so an operator can still ship an emergency edit, while the default
blocks large appends that would otherwise slip through silently.
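One possible shape for the guard is a small Python helper that the shell hook calls. The sketch below assumes the hook can shell out to python3, that the staged copy of AGENTS.md is the right thing to check, and that 4 characters per token is an acceptable estimate. The 24,500 limit and the AGENTS_MD_OVER_CAP=1 override come from this task; the function names are hypothetical.

```python
import os
import subprocess
import sys

TOKEN_LIMIT = 24_500               # from the task: refuse commits above this
CHARS_PER_TOKEN = 4                # assumption: rough heuristic
OVERRIDE_VAR = "AGENTS_MD_OVER_CAP"


def guard(text, env):
    """Return (exit_code, message) for a candidate AGENTS.md body."""
    est = len(text) // CHARS_PER_TOKEN
    if est <= TOKEN_LIMIT:
        return 0, f"AGENTS.md ok: ~{est} tokens"
    if env.get(OVERRIDE_VAR) == "1":
        return 0, (f"AGENTS.md ~{est} tokens exceeds {TOKEN_LIMIT}; "
                   f"allowed by {OVERRIDE_VAR}=1")
    return 1, (f"AGENTS.md ~{est} tokens exceeds {TOKEN_LIMIT}. "
               f"Extract content to a detail doc, or set "
               f"{OVERRIDE_VAR}=1 for an emergency edit.")


def staged_agents_md():
    """Read the staged (index) version of AGENTS.md, or None if there is none."""
    result = subprocess.run(
        ["git", "show", ":AGENTS.md"],
        capture_output=True, text=True,
    )
    return result.stdout if result.returncode == 0 else None


def main():
    text = staged_agents_md()
    if text is None:
        return 0  # nothing staged to check
    code, message = guard(text, os.environ)
    print(message, file=sys.stderr)
    return code
```

The hook would exit with the value of `main()`. Keeping `guard` free of git and environment access makes the threshold and override logic easy to unit-test separately from the hook wiring.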
Acceptance criteria
☐ AGENTS.md ≤ 22,500 tokens (verified by Read tool returning the
whole file without the cap error).
☐ All three regression sections (batch/loop, substrate-v2 pip,
CLAUDE.md merge) extracted to detail docs with pointer summaries
remaining inline.
☐ Pre-commit guard added that refuses AGENTS.md commits over
24,500 tokens unless AGENTS_MD_OVER_CAP=1 is set.
☐ Test: in a fresh clone of the repo,
python3 -c "from pathlib import Path; import sys; sys.exit(0 if len(Path('AGENTS.md').read_text())//4 < 25000 else 1)"
exits with status 0.
☐ No regression in detail-doc coverage — every extracted section
is reachable via a clear pointer in AGENTS.md.
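The pointer-coverage criterion can be checked mechanically. This is a minimal sketch, assuming a pointer means the detail doc's repo-relative path appears verbatim in AGENTS.md; `pointer_problems` is a hypothetical helper, and the list of paths passed in would be whatever docs the extraction actually created.

```python
from pathlib import Path


def pointer_problems(repo_root, doc_paths, agents_name="AGENTS.md"):
    """List problems for extracted docs: missing file, or no pointer in AGENTS.md."""
    root = Path(repo_root)
    agents_text = (root / agents_name).read_text(encoding="utf-8")
    problems = []
    for rel in doc_paths:
        if not (root / rel).is_file():
            problems.append(f"{rel}: file does not exist")
        # Assumption: the pointer cites the path exactly as given here.
        if rel not in agents_text:
            problems.append(f"{rel}: no pointer in {agents_name}")
    return problems
```

An empty list from, for example, `pointer_problems(".", ["docs/dev/substrate_v2_pip_concurrent_install.md"])` would indicate that the doc exists and is referenced; any entries name what still needs fixing.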
Why priority 80
This is real, daily friction. The watchdog repair task that
originally surfaced the issue (a1c31194-... on 2026-04-28) is the
canonical example: an agent's first action was a Read on AGENTS.md,
which failed, costing the run a few cycles before recovery. Every
agent that picks up a new task pays this cost.
Out of scope
- The pre-commit guard implementation can be deferred to a follow-up
task if it's not trivially droppable into the existing
guard-main-writes.sh hook. The trim itself is the load-bearing
piece.
- Rewriting AGENTS.md from scratch. Keep the structure; just
extract.