Task ID: f3ae01b2-00b7-4b8c-a977-dc6a02becaeb Layer: [Analytics] Priority: medium
site_pageviews grows unboundedly. Trend queries (e.g. "how many visits last 30 days?") had to full-scan that raw table. No pre-aggregated daily summaries existed.
Implement a rollup pipeline that reads site_pageviews for a given UTC date and writes pre-aggregated metrics into site_daily_stats.
Both tables already exist in PostgreSQL:
site_pageviews — raw event log (path, page_type, referrer, session_id, status_code, response_time_ms, is_bot, created_at)site_daily_stats — pre-aggregated: (date, metric, value, breakdown_key) PK) — roll up one date (default: yesterday) — roll up all dates present in site_pageviews — return pre-aggregated stats for the last N daysThe rollup is designed to be called daily (e.g. via Orchestra cron or a systemd timer). Typical invocation:
POST /api/admin/daily-rollup # rolls up yesterday by defaultUpsert semantics mean re-running is idempotent.
and site_daily_stats tables already exist in PostgreSQL helper in api.py (near line 8135) endpoint with backfill support