Knowledge Growth Rate Tracking and Trajectory

Goal

Build a system to track SciDEX's knowledge growth over time to ensure the system is converging toward its mission, not stalling or regressing. Track key metrics hourly (analyses, hypotheses, KG edges, papers, wiki pages, causal edges), calculate growth rates, detect stalls and regressions, and display growth trajectories on /senate with projections.

Acceptance Criteria

☑ Create knowledge_snapshots table with schema: (id, timestamp, metric_name, value)
☑ Implement hourly snapshot function capturing: analyses_count, hypotheses_count, kg_edges_count, papers_count, wiki_pages_count, causal_edges_count
☑ Add quality filters: only count KG edges with non-empty source/target, hypotheses with citations
☑ Calculate growth rates: edges/hour, hypotheses/hour, papers/hour
☑ Implement stall detection: alert if growth rate drops below threshold for 2+ hours
☑ Implement regression detection: alert if any count decreases
☑ Add growth chart visualization to /senate page
☑ Add projections: at the current rate, when will we reach 1,000 hypotheses, 10,000 edges, and 100 analyses
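
The stall and regression criteria above can be read as simple rules over a series of hourly values for one metric. The following is a minimal sketch under stated assumptions, not the code in metrics.py: the function names, the default threshold of 0 items/hour, and the "at or below threshold" comparison are all hypothetical choices made for illustration.

```python
# Assumed defaults: the spec does not state the actual threshold value.
STALL_THRESHOLD_PER_HOUR = 0.0
STALL_WINDOW_HOURS = 2


def hourly_deltas(values):
    """Differences between consecutive hourly snapshot values."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


def is_stalled(values, threshold=STALL_THRESHOLD_PER_HOUR, window=STALL_WINDOW_HOURS):
    """True if each of the last `window` hourly deltas is at or below `threshold`."""
    deltas = hourly_deltas(values)
    if len(deltas) < window:
        return False  # not enough history to judge
    return all(delta <= threshold for delta in deltas[-window:])


def find_regressions(values):
    """Indices of snapshots whose value is lower than the previous snapshot."""
    return [i + 1 for i, delta in enumerate(hourly_deltas(values)) if delta < 0]
```

With these definitions a decreasing count also satisfies the stall rule, so a caller that wants the two alerts to be mutually exclusive would check for regressions first.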

Approach

  • Read api.py and PostgreSQL schema to understand existing database structure
  • Create migration script to add knowledge_snapshots table
  • Implement snapshot collection function in a new metrics.py module:
    - Query current counts from database with quality filters
    - Store in knowledge_snapshots table
    - Calculate growth rates from historical snapshots
    - Detect anomalies (stalls, regressions)
  • Add cron job or scheduled task to run snapshots hourly
  • Add /api/growth endpoint to serve growth data
  • Update /senate page to include growth chart with trend lines and projections
  • Test by simulating multiple snapshots and verifying calculations
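
As an illustration of the table and snapshot steps above, here is a minimal sketch that uses Python's built-in sqlite3 with an in-memory database as a stand-in for PostgreSQL, so it runs without a server. The column types, the `kg_edges(source, target)` table, and the helper names are assumptions for the example; only the four column names of knowledge_snapshots come from this spec.

```python
import sqlite3
from datetime import datetime, timezone

# Columns follow the spec: (id, timestamp, metric_name, value). Types are assumed.
SNAPSHOT_SCHEMA = """
CREATE TABLE IF NOT EXISTS knowledge_snapshots (
    id INTEGER PRIMARY KEY,
    timestamp TEXT NOT NULL,
    metric_name TEXT NOT NULL,
    value INTEGER NOT NULL
)
"""


def count_valid_edges(conn):
    """Quality filter: count only edges with non-empty source and target.

    The kg_edges table and its column names are hypothetical.
    """
    row = conn.execute(
        "SELECT COUNT(*) FROM kg_edges "
        "WHERE source IS NOT NULL AND source <> '' "
        "AND target IS NOT NULL AND target <> ''"
    ).fetchone()
    return row[0]


def save_snapshot(conn, counts, when=None):
    """Store one row per metric, all sharing the same timestamp."""
    when = when or datetime.now(timezone.utc)
    rows = [(when.isoformat(), name, value) for name, value in counts.items()]
    conn.executemany(
        "INSERT INTO knowledge_snapshots (timestamp, metric_name, value) "
        "VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()
    return len(rows)
```

Storing one row per metric keeps the table narrow, so a new metric can be tracked later without a schema change.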

Work Log

    2026-04-01 23:35 PT — Slot 11

    • Started task: Knowledge growth rate tracking
    • Created spec file
    • Created knowledge_snapshots table in PostgreSQL
    • Implemented metrics.py module with functions:
      - collect_snapshot(): Capture current counts with quality filters
      - save_snapshot(): Store metrics in database
      - calculate_growth_rates(): Compute items/hour over 24h window
      - detect_stalls(): Identify metrics with zero growth for 2+ hours
      - detect_regressions(): Flag metrics that decreased
      - project_milestones(): Estimate dates for 100 analyses, 1000 hypotheses, 10k edges
      - get_growth_history(): Retrieve last 7 days of data for charting
    • Added /api/growth endpoint in api.py (returns JSON with current, history, rates, milestones, stalls)
    • Added Knowledge Growth Trajectory section to /senate page showing:
      - System status (GROWING/STALLED/FLAT) with color coding
      - Growth rates for hypotheses, KG edges, analyses (per hour)
      - Milestone projections with estimated dates
      - Stall warnings if detected
      - Historical trend charts showing changes over 7 days
    • Scheduled hourly cron job (at :07) to run metrics.py snapshot collection
    • Created SQL migration file for knowledge_snapshots table
    • Tested implementation:
      - Created simulated historical data (6 snapshots over 6 hours)
      - Verified /api/growth endpoint returns correct data (200 OK)
      - Verified /senate page displays growth section (200 OK)
      - Verified all key pages still load (/, /exchange, /gaps, /analyses/, /senate)
      - Confirmed growth rates calculated correctly (0.33 hyp/hr, 0.71 edges/hr, 0.08 analyses/hr)
      - Confirmed milestone projections working (100 analyses by 2026-05-09, 1000 hyp by 2026-07-20, 10k edges by 2027-10-20)
      - Confirmed stall detection working (identifies metrics with no growth)
    • Committed changes and merged to main
    • Result: Complete — All acceptance criteria met. Growth tracking system operational with hourly snapshots, visualization, and projections.
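
The growth rates and milestone dates reported in this log follow from simple arithmetic: rate = (latest value - earliest value) / elapsed hours, and projected date = now + (target - current) / rate. The sketch below shows one way to compute both. It assumes a rate taken between the first and last snapshot in the window and a constant rate going forward; the function names and signatures are illustrative and may differ from calculate_growth_rates() and project_milestones() in metrics.py.

```python
from datetime import datetime, timedelta


def growth_rate_per_hour(snapshots):
    """Average items/hour between the first and last snapshot.

    `snapshots` is a list of (datetime, value) pairs sorted oldest first.
    """
    if len(snapshots) < 2:
        return 0.0
    (start_time, start_value), (end_time, end_value) = snapshots[0], snapshots[-1]
    hours = (end_time - start_time).total_seconds() / 3600
    if hours <= 0:
        return 0.0
    return (end_value - start_value) / hours


def project_milestone(current, target, rate, now):
    """Estimated datetime at which `target` is reached at a constant `rate`.

    Returns `now` if the target is already met, or None if the metric
    is flat or shrinking and the target would never be reached.
    """
    if current >= target:
        return now
    if rate <= 0:
        return None
    return now + timedelta(hours=(target - current) / rate)
```

Returning None for a non-positive rate lets the caller show "no projection" instead of a misleading date when a metric is stalled.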

    File: fd00fcd3-2e0a-4dc8-8b82-76f6c2fdc37d_spec.md
    Modified: 2026-05-01 20:13
    Size: 3.9 KB