[Forge] Resource-prediction model — predict CPU/RAM/time pre-run


Goal

Train a lightweight regression model that, given a candidate analysis script
+ runtime spec, predicts CPU-seconds, peak RAM, and wall-clock duration before it runs. Wire the predictions into the executor so the supervisor
can allocate the right slot, set a conservative cgroup memory cap, and reject
manifestly impossible jobs early.

Why this matters

The cgroup isolator at scidex/senate/cgroup_isolation.py already enforces
limits, but the limits are static defaults from RESOURCE_LIMITS. Half the
analyses are wasting headroom (8 GB cap on a 200 MB job) and the other half
are getting OOM-killed mid-debate. A predictor closes that loop, lifting
throughput without raising the host's resident-set ceiling.

Acceptance Criteria

☐ New module scidex/forge/resource_predictor.py exposes
predict(script_text, runtime_name, input_sizes_mb) returning
{cpu_seconds, peak_rss_mb, wall_seconds, confidence}.
☐ Model trained from RuntimeResult history rows (joined to existing
forge_runtime log lines); features include script-token counts of
heavy libs (scanpy, torch, pyscenic, pymc), runtime name,
input artifact sizes, and the analysis's prior runs (if any).
☐ Migration resource_prediction(analysis_id, predicted_cpu_s,
predicted_rss_mb, predicted_wall_s, actual_cpu_s, actual_rss_mb,
actual_wall_s, model_version, predicted_at, settled_at) for
ground-truth tracking.
☐ Executor: when deterministic mode is off, set
memory_limit_mb = 1.5 * predicted_rss_mb, capped to host budget;
timeout_seconds = max(60, 1.3 * predicted_wall_s).
☐ /forge/resource-predictor page renders prediction-vs-actual scatter
and current model accuracy (R² across the last 200 runs).
☐ Retrain job runs weekly via scidex-predictor-retrain.timer once
≥100 settled rows accumulate.
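
The feature list in the second criterion (heavy-lib token counts, runtime name, input artifact sizes) could be extracted roughly as follows. This is a minimal sketch, not the module's actual code: the function name extract_features, the KNOWN_RUNTIMES values, and the choice of size statistics are all assumptions made for illustration, and prior-run features are left out.

```python
import re
from collections import Counter

# Libraries the spec names as "heavy".
HEAVY_LIBS = ("scanpy", "torch", "pyscenic", "pymc")

# Hypothetical runtime names, used only to one-hot encode the runtime.
KNOWN_RUNTIMES = ("python", "r", "julia")

# Identifier-like tokens; numbers, operators and punctuation are skipped.
_TOKEN_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def extract_features(script_text, runtime_name, input_sizes_mb):
    """Turn a script plus runtime spec into a flat list of floats."""
    tokens = Counter(_TOKEN_RE.findall(script_text))

    # How often each heavy library's name appears as a token in the script.
    lib_counts = [float(tokens[lib]) for lib in HEAVY_LIBS]

    # One-hot runtime encoding; an unknown runtime becomes all zeros.
    runtime_onehot = [1.0 if runtime_name == r else 0.0 for r in KNOWN_RUNTIMES]

    # Summary of input artifacts: how many, total size, largest one.
    sizes = [float(s) for s in input_sizes_mb]
    size_stats = [float(len(sizes)), sum(sizes, 0.0), max(sizes, default=0.0)]

    return lib_counts + runtime_onehot + size_stats
```

For example, a script that imports scanpy once and mentions torch twice yields counts of 1.0 and 2.0 in the first two slots. Counting raw tokens also picks up names inside comments and strings, which may or may not be desirable.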

Approach

  • Start with sklearn.ensemble.GradientBoostingRegressor per target:
  small, fast, no GPU dependency.
  • Feature extraction reuses tool_chains.py import-detection scaffolding.
  • Confidence interval = predicted ± 1.96 * residual_std_for_this_runtime (about 95% coverage if residuals are roughly normal).
  • Roll-out: dry-run for 3 days (predict but don't enforce), then enforce.
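
The ± 1.96 * residual_std_for_this_runtime rule above can be sketched with the standard library alone. The helper names, the (runtime_name, predicted, actual) history shape, and the clamp of the lower bound at zero are assumptions for illustration. The spec does not say how the interval maps onto the single confidence value that predict() returns.

```python
from collections import defaultdict
from statistics import pstdev

Z_95 = 1.96  # two-sided 95% quantile of the standard normal


def residual_std_by_runtime(history):
    """history: iterable of (runtime_name, predicted, actual) tuples."""
    residuals = defaultdict(list)
    for runtime_name, predicted, actual in history:
        residuals[runtime_name].append(actual - predicted)
    # Population std of the residuals; a runtime with one sample gets 0.0.
    return {name: pstdev(vals) for name, vals in residuals.items()}


def prediction_interval(predicted, runtime_name, std_by_runtime, fallback_std=0.0):
    """Return (low, high) = predicted -/+ 1.96 * residual std for the runtime."""
    std = std_by_runtime.get(runtime_name, fallback_std)
    half_width = Z_95 * std
    # Resource usage cannot be negative, so clamp the lower bound at zero.
    return (max(0.0, predicted - half_width), predicted + half_width)
```

Note that pstdev measures spread around the mean residual, so a model that is consistently biased for one runtime would still report a narrow band. A root-mean-square of the residuals would be a stricter alternative.
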

Dependencies

  • scidex/senate/cgroup_isolation.py, forge/runtime.py,
  scidex/forge/executor.py.

Work Log

    2026-04-27 — Implementation complete [task:b3a31ccd-59a1-4216-a98c-bd3c72370f30]

    Files created/modified:

    • scidex/forge/resource_predictor.py — GBR model per target; predict(), train(),
    record_prediction(), settle_actuals(), get_accuracy_stats(), retrain_if_ready().
    Feature vector: 30 dims (heavy-lib import flags, token counts, runtime encoding,
    input sizes, prior-run actuals). Falls back to synthetic training from analyses.resource_cost
    until ≥10 real settled rows accumulate.
    • migrations/add_resource_prediction_table.py — Creates resource_prediction table with
    predicted/actual columns + partial index on settled_at.
    • scidex/forge/executor.py: LocalExecutor.run_analysis now calls predict() pre-run,
    writes prediction via record_prediction(), overrides memory_limit_mb and timeout_seconds
    with 1.5×predicted_rss / max(60, 1.3×predicted_wall) when confidence > 5%, and calls
    settle_actuals() post-run.
    • api.py — Added GET /forge/resource-predictor (HTML dashboard with Chart.js scatter),
    GET /api/forge/resource-predictor/stats, POST /api/forge/resource-predictor/predict.
    • deploy/bootstrap/systemd/scidex-predictor-retrain.service + .timer — weekly retrain
    at Sun 03:00 UTC via retrain_if_ready() (no-op until ≥100 settled rows).
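
The record, settle, and retrain-guard flow described in the list above can be illustrated against an in-memory SQLite database. This is a sketch under stated assumptions: the real database engine, the column types, the index name and predicate, and the function signatures are not given in the spec. Only the table name, column names, function names, and the 100-row threshold come from it.

```python
import sqlite3

SCHEMA = """
CREATE TABLE resource_prediction (
    analysis_id      TEXT NOT NULL,
    predicted_cpu_s  REAL, predicted_rss_mb REAL, predicted_wall_s REAL,
    actual_cpu_s     REAL, actual_rss_mb    REAL, actual_wall_s    REAL,
    model_version    TEXT,
    predicted_at     TEXT DEFAULT CURRENT_TIMESTAMP,
    settled_at       TEXT
);
-- Assumed predicate: index only the rows that already have ground truth.
CREATE INDEX idx_resource_prediction_settled
    ON resource_prediction (settled_at) WHERE settled_at IS NOT NULL;
"""

MIN_SETTLED_ROWS = 100  # retrain guard from the spec


def record_prediction(conn, analysis_id, pred, model_version):
    """Insert the pre-run prediction and return its rowid."""
    cur = conn.execute(
        "INSERT INTO resource_prediction (analysis_id, predicted_cpu_s,"
        " predicted_rss_mb, predicted_wall_s, model_version)"
        " VALUES (?, ?, ?, ?, ?)",
        (analysis_id, pred["cpu_seconds"], pred["peak_rss_mb"],
         pred["wall_seconds"], model_version),
    )
    return cur.lastrowid


def settle_actuals(conn, row_id, cpu_s, rss_mb, wall_s):
    """Fill in the measured values after the run and stamp settled_at."""
    conn.execute(
        "UPDATE resource_prediction SET actual_cpu_s = ?, actual_rss_mb = ?,"
        " actual_wall_s = ?, settled_at = CURRENT_TIMESTAMP WHERE rowid = ?",
        (cpu_s, rss_mb, wall_s, row_id),
    )


def ready_to_retrain(conn):
    """True once enough settled rows exist to justify a retrain."""
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM resource_prediction WHERE settled_at IS NOT NULL"
    ).fetchone()
    return count >= MIN_SETTLED_ROWS
```

A caller would run conn.executescript(SCHEMA) once, call record_prediction() before the job, and call settle_actuals() with the returned row id afterwards.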

    Dry-run mode: Set RESOURCE_PREDICTOR_DRY_RUN=1 to predict without enforcing.
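
How dry-run mode might interact with the executor's cap rule (1.5 × predicted RSS capped to the host budget, and max(60, 1.3 × predicted wall time)) is sketched below. The function names, the shape of the defaults dict, and the reading of "confidence > 5%" as a 0 to 1 fraction above 0.05 are assumptions, not the executor's actual code.

```python
import os

MIN_TIMEOUT_S = 60        # floor on the timeout, from the spec
RSS_HEADROOM = 1.5        # multiplier on predicted peak RSS
WALL_HEADROOM = 1.3       # multiplier on predicted wall-clock time
MIN_CONFIDENCE = 0.05     # assumed encoding of "confidence > 5%"


def dry_run_enabled(environ=os.environ):
    """True when RESOURCE_PREDICTOR_DRY_RUN=1 is set."""
    return environ.get("RESOURCE_PREDICTOR_DRY_RUN") == "1"


def choose_limits(prediction, defaults, host_budget_mb, dry_run):
    """Return (memory_limit_mb, timeout_seconds) for the executor.

    In dry-run mode, or when the prediction is not trusted, the static
    defaults are kept and the prediction is only recorded by the caller.
    """
    if dry_run or prediction["confidence"] <= MIN_CONFIDENCE:
        return defaults["memory_limit_mb"], defaults["timeout_seconds"]
    memory_limit_mb = min(RSS_HEADROOM * prediction["peak_rss_mb"], host_budget_mb)
    timeout_seconds = max(MIN_TIMEOUT_S, WALL_HEADROOM * prediction["wall_seconds"])
    return memory_limit_mb, timeout_seconds
```

With a predicted peak of 200 MB and a 10-second wall time, enforcement would give a 300 MB cap and the 60-second floor. The same call with dry_run=True returns the defaults unchanged.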

    Acceptance criteria status:

    • predict() function with correct signature and return keys
    • ✅ GBR trained from DB history; features include heavy-lib imports + runtime + sizes
    • resource_prediction migration applied
    • ✅ Executor dynamic caps: 1.5×rss + 1.3×wall with host budget ceiling
    • /forge/resource-predictor scatter + R² dashboard
    • scidex-predictor-retrain.timer weekly, ≥100-row guard

    File: q-sand-resource-prediction_spec.md
    Modified: 2026-05-01 20:13
    Size: 4.3 KB