[Forge] Attach DepMap dependency scores to therapeutic-target hypotheses done

← Real Data Pipeline
Per-lineage DepMap CRISPR Chronos summaries on every therapeutic-target hypothesis; quarterly refresh.

Completion Notes

Auto-completed by supervisor after successful deploy to main

Git Commits (2)

Squash merge: orchestra/task/a4c450f7-biomni-analysis-parity-port-15-use-cases (87 commits) (#717) 2026-04-27
[Forge] Attach DepMap dependency scores to therapeutic-target hypotheses; api.py bar chart tab [task:bbdeb2ac-028b-45a6-9646-f16232658c3e] (#650) 2026-04-27

Spec File

Goal

For every hypothesis tagged therapeutic_target=True, fetch DepMap CRISPR
Chronos dependency scores via the depmap skill and persist a per-lineage
summary so the Domain-Expert persona can cite real essentiality data when
arguing druggability.

Why this matters

DepMap dependency scores are the gold-standard signal for "is knocking this
gene out lethal in cancer cell lines?". A therapeutic-target hypothesis whose
gene scores -1.0 across 1100 cell lines is fundamentally different from one
that scores 0.0, but today the Domain Expert has no database-backed way to
distinguish them, so the Synthesizer's target_validity score drifts.

Acceptance Criteria

☐ Migration creates hypothesis_depmap_dependency(hypothesis_id, lineage,
mean_chronos, median_chronos, n_cell_lines, percent_essential,
depmap_release, fetched_at).
☐ scripts/backfill_hypothesis_depmap.py walks hypotheses where
payload_json->>'therapeutic_target'='true' OR a
target_class IN ('kinase','GPCR','enzyme','transporter') heuristic
fires, and populates the table per lineage (lung, brain, blood, ...).
☐ New module scidex/forge/depmap_client.py wraps the local depmap
skill (or DepMap public API at https://depmap.org/portal/api) and
returns dataclass results.
☐ Domain-Expert persona prompt receives a dependency_block listing
mean Chronos per lineage when the hypothesis names a target gene.
☐ /hypothesis/<id> shows a lineage bar chart with essentiality
threshold marked at -0.5.
☐ Backfill records release version (e.g. 23Q4) so reproducibility
checks can detect a skewed run.
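
The per-lineage columns named in the first criterion could be computed roughly as follows. This is a minimal sketch, not the shipped depmap_client.py: the `LineageSummary` dataclass, the `summarize_by_lineage` helper, the inclusive `<=` comparison against -0.5, and expressing percent_essential on a 0-100 scale are all assumptions made for illustration.

```python
from dataclasses import dataclass
from statistics import mean, median

# Threshold named in the acceptance criteria; treating it as inclusive is an assumption.
ESSENTIAL_THRESHOLD = -0.5


@dataclass(frozen=True)
class LineageSummary:  # hypothetical name, mirrors the table columns
    lineage: str
    mean_chronos: float
    median_chronos: float
    n_cell_lines: int
    percent_essential: float  # assumed to be on a 0-100 scale


def summarize_by_lineage(scores):
    """Summarize (lineage, chronos_score) pairs, one pair per cell line."""
    by_lineage = {}
    for lineage, score in scores:
        by_lineage.setdefault(lineage, []).append(score)

    summaries = []
    for lineage in sorted(by_lineage):
        values = by_lineage[lineage]
        n_essential = sum(1 for v in values if v <= ESSENTIAL_THRESHOLD)
        summaries.append(
            LineageSummary(
                lineage=lineage,
                mean_chronos=mean(values),
                median_chronos=median(values),
                n_cell_lines=len(values),
                percent_essential=100.0 * n_essential / len(values),
            )
        )
    return summaries


# Toy input, not real DepMap data.
example = summarize_by_lineage(
    [("lung", -1.0), ("lung", -0.5), ("lung", 0.0), ("blood", -0.25)]
)
```

Keeping the summary as a frozen dataclass matches the "returns dataclass results" criterion and makes each row straightforward to upsert.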

Approach

  • Use the in-repo depmap skill if installed; otherwise hit the public
    download manifest and cache the parquet under data/depmap/.
  • Hypothesis-level summary is computed once per release; a nightly job
    re-runs whenever a new DepMap release ships (quarterly cadence).
  • Persona injection mirrors the GTEx pattern in
    scidex/senate/personas/domain_expert.py.

    Dependencies

    • Quest q-555b6bea3848.
    • DepMap skill or public download API.

    Work Log

    2026-04-27 — Implementation (task:bbdeb2ac-028b-45a6-9646-f16232658c3e)

    All acceptance criteria implemented:

    Migration: migrations/add_hypothesis_depmap_dependency.py — creates hypothesis_depmap_dependency table with all required columns plus UNIQUE
    constraint on (hypothesis_id, lineage, depmap_release) and indexes on hypothesis_id and depmap_release. Applied and verified on live DB.
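
For orientation, the table shape described above might look like the following. This is a sketch run against an in-memory SQLite database so it is self-contained; the column types and the index names are assumptions, and the real migration (which targets the live DB) may differ.

```python
import sqlite3

# Column names come from the acceptance criteria; types and index names are assumed.
DDL = """
CREATE TABLE IF NOT EXISTS hypothesis_depmap_dependency (
    hypothesis_id     TEXT NOT NULL,
    lineage           TEXT NOT NULL,
    mean_chronos      REAL,
    median_chronos    REAL,
    n_cell_lines      INTEGER,
    percent_essential REAL,
    depmap_release    TEXT NOT NULL,
    fetched_at        TEXT NOT NULL,
    UNIQUE (hypothesis_id, lineage, depmap_release)
);
CREATE INDEX IF NOT EXISTS idx_hdd_hypothesis_id
    ON hypothesis_depmap_dependency (hypothesis_id);
CREATE INDEX IF NOT EXISTS idx_hdd_depmap_release
    ON hypothesis_depmap_dependency (depmap_release);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(DDL)

# Toy row, not real DepMap data.
row = ("h-1", "lung", -0.9, -0.8, 120, 75.0, "24Q2", "2026-04-27T00:00:00Z")
insert_sql = "INSERT INTO hypothesis_depmap_dependency VALUES (?, ?, ?, ?, ?, ?, ?, ?)"
conn.execute(insert_sql, row)

# A second plain INSERT of the same (hypothesis_id, lineage, depmap_release)
# violates the UNIQUE constraint.
try:
    conn.execute(insert_sql, row)
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute(
    "SELECT COUNT(*) FROM hypothesis_depmap_dependency"
).fetchone()[0]
```

Including depmap_release in the UNIQUE key lets summaries from different releases coexist for the same hypothesis and lineage.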

    Client module: scidex/forge/depmap_client.py — follows the census_expression.py pattern with parquet cache under data/depmap/.
    Primary path: DepMap portal API (/api/gene_dependency). Secondary path:
    downloads Model.csv (~546KB) from DepMap 24Q2 figshare (article 25880521)
    for cell-line lineage mapping. Falls back gracefully when network unavailable.
    Uses DEPMAP_RELEASE = "24Q2" (latest accessible release; update to 24Q4
    when figshare article becomes publicly available).

    Backfill script: scripts/backfill_hypothesis_depmap.py — follows backfill_hypothesis_census.py pattern. Selects hypotheses where
    hypothesis_type IN ('therapeutic', 'therapeutic_genetics', 'pharmacological', 'combination_target', 'pathway_target')
    OR druggability_score IS NOT NULL, with target_gene IS NOT NULL.
    De-duplicates API calls per gene. Upserts with ON CONFLICT DO UPDATE. Supports --dry-run, --stale-only, --ids, --limit.
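
The per-gene de-duplication and the upsert can be illustrated together. The sketch below is an assumption-laden stand-in for the real script: `group_by_gene`, `run_backfill`, and `fake_fetch` are hypothetical names, the table is trimmed to the columns the example needs, and SQLite stands in for the production database.

```python
import sqlite3
from collections import defaultdict

# Trimmed table: only the columns this sketch needs (types assumed).
SCHEMA = """
CREATE TABLE hypothesis_depmap_dependency (
    hypothesis_id  TEXT NOT NULL,
    lineage        TEXT NOT NULL,
    mean_chronos   REAL,
    depmap_release TEXT NOT NULL,
    fetched_at     TEXT NOT NULL,
    UNIQUE (hypothesis_id, lineage, depmap_release)
)
"""

UPSERT = """
INSERT INTO hypothesis_depmap_dependency
    (hypothesis_id, lineage, mean_chronos, depmap_release, fetched_at)
VALUES (?, ?, ?, ?, ?)
ON CONFLICT (hypothesis_id, lineage, depmap_release)
DO UPDATE SET mean_chronos = excluded.mean_chronos,
              fetched_at   = excluded.fetched_at
"""


def group_by_gene(hypotheses):
    """Map each target gene to the hypothesis ids that name it."""
    groups = defaultdict(list)
    for hypothesis_id, target_gene in hypotheses:
        if target_gene:  # mirrors the target_gene IS NOT NULL filter
            groups[target_gene].append(hypothesis_id)
    return dict(groups)


def run_backfill(conn, hypotheses, fetch, release, fetched_at):
    """Fetch once per gene, then upsert one row per (hypothesis, lineage)."""
    groups = group_by_gene(hypotheses)
    for gene, hypothesis_ids in groups.items():
        lineage_means = fetch(gene)  # one call shared by every hypothesis on this gene
        for hypothesis_id in hypothesis_ids:
            for lineage, mean_chronos in lineage_means.items():
                conn.execute(
                    UPSERT,
                    (hypothesis_id, lineage, mean_chronos, release, fetched_at),
                )
    conn.commit()
    return len(groups)  # number of fetch calls made


calls = []


def fake_fetch(gene):
    """Stand-in for the DepMap client; returns made-up values."""
    calls.append(gene)
    return {"lung": -0.9, "blood": -0.2}


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
hypotheses = [("h-1", "KRAS"), ("h-2", "KRAS"), ("h-3", "EGFR"), ("h-4", None)]

first = run_backfill(conn, hypotheses, fake_fetch, "24Q2", "2026-04-27T00:00:00Z")
second = run_backfill(conn, hypotheses, fake_fetch, "24Q2", "2026-04-28T00:00:00Z")

row_count = conn.execute(
    "SELECT COUNT(*) FROM hypothesis_depmap_dependency"
).fetchone()[0]
timestamps = {
    r[0] for r in conn.execute("SELECT fetched_at FROM hypothesis_depmap_dependency")
}
```

Running the backfill twice leaves the row count unchanged and only refreshes fetched_at, which is the behaviour a re-run after a failed or partial pass would rely on.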

    API endpoint: GET /api/hypotheses/{hypothesis_id}/depmap — returns JSON {hypothesis_id, target_gene, depmap_essential_threshold, lineage_summaries[]}.

    Bar chart in hypothesis page: _build_depmap_tab_html() helper generates
    SVG horizontal bar chart with essentiality threshold at -0.5 (yellow dashed)
    and -1.0 (red label). New "DepMap" tab added to /hypothesis/{id} tab strip.
    Tab panel shows bar chart + data table with Chronos scores, %essential, and
    verdict labels (Strong/Likely/—).
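
One plausible reading of the verdict column is a simple threshold mapping. The sketch below assumes "Strong" applies at or below -1.0 and "Likely" at or below -0.5, with the dash label for everything else; the work log names the labels and thresholds but not the exact rule, so both the mapping and the `depmap_verdict` name are hypothetical.

```python
STRONG_THRESHOLD = -1.0   # marked with the red label in the chart
LIKELY_THRESHOLD = -0.5   # the yellow dashed essentiality threshold
NO_VERDICT = "—"          # label used in the table for non-essential lineages


def depmap_verdict(mean_chronos):
    """Map a lineage's mean Chronos score to a table label (assumed rule)."""
    if mean_chronos is None:
        return NO_VERDICT
    if mean_chronos <= STRONG_THRESHOLD:
        return "Strong"
    if mean_chronos <= LIKELY_THRESHOLD:
        return "Likely"
    return NO_VERDICT


labels = [depmap_verdict(s) for s in (-1.3, -1.0, -0.7, -0.5, -0.1, None)]
```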

    Domain Expert injection: _build_depmap_dependency_block() in agent.py
    queries hypothesis_depmap_dependency by target_gene (joining through hypotheses) for pre-computed data, and falls back to a live
    dependency_for_gene() call when no rows exist. The block is injected into the Domain Expert's expert_prompt before the feasibility
    assessment round.
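
The text block handed to the persona could be assembled along these lines. This is a sketch only: the wording of the block, the `build_dependency_block` name, and the ordering by most negative score first are assumptions, not the format produced by _build_depmap_dependency_block() in agent.py.

```python
def build_dependency_block(target_gene, release, lineage_means):
    """Render per-lineage mean Chronos scores as plain text for a prompt.

    `lineage_means` maps lineage name to mean Chronos score. Returns an
    empty string when there is nothing to report, so the caller can skip
    injection entirely.
    """
    if not lineage_means:
        return ""

    lines = [
        f"DepMap CRISPR Chronos dependency for {target_gene} (release {release}):"
    ]
    # Most negative (most dependent) lineages first.
    for lineage, mean_chronos in sorted(lineage_means.items(), key=lambda kv: kv[1]):
        flag = " [at or below -0.5 essentiality threshold]" if mean_chronos <= -0.5 else ""
        lines.append(f"- {lineage}: mean Chronos {mean_chronos:+.2f}{flag}")
    return "\n".join(lines)


# Toy values, not real DepMap data.
block = build_dependency_block("KRAS", "24Q2", {"lung": -1.2, "blood": -0.1})
empty_block = build_dependency_block("KRAS", "24Q2", {})
```

Stating the release in the block header keeps the persona's citations traceable to a specific DepMap snapshot.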

    Note on data availability: DepMap portal API returns 403 in this environment
    (likely IP-restricted). Backfill requires outbound network access. Run manually
    after network access is confirmed:

    python3 scripts/backfill_hypothesis_depmap.py --stale-only --limit 200
