SciDEX — Task: [Atlas] Generate analysis

Cluster low-confidence KG edges and emit analysis_proposal artifacts with derives_from links and dedup; nightly cron.

Completion Notes

Auto-completed by supervisor after successful deploy to main

Git Commits (1)

Squash merge: orchestra/task/894c6ce8-generate-analysis-proposal-artifacts-fro (2 commits) (#602) 2026-04-27

Spec File

Goal

The knowledge graph has 688K+ edges, but a long tail (confidence < 0.5 and
evidence_count < 2) sits unexamined. Mine those edges in batches, group them
by shared entity pair or domain, and emit analysis_proposal artifacts (kind
already registered at scidex/atlas/artifact_registry.py:96), each describing
a notebook-style analysis that would corroborate or refute its cluster. Each
proposal becomes a discussable, dedup-able artifact instead of an ad-hoc
spreadsheet.

Acceptance Criteria

☑ CLI: python3 -m scidex.agora.analysis_proposal_generator --batch 50 --dry-run
lists 50 candidate edge clusters with proposed analyses.

☑ Without --dry-run, registers analysis_proposal artifacts via
artifact_registry.register_artifact with required metadata fields
(analysis_question, datasets, methods, expected_outputs).

☑ Each proposal is linked back to its source edges via artifact_links
with link_type='derives_from' and to relevant wiki pages with
link_type='cites'.

☑ Dedup: if an open analysis_proposal already targets the same set of
edges (matched on sorted edge-id JSON), skip with --reason='dedup'.

☑ Batch wired into a Senate scheduled_task analysis-proposal-generator
(interval_minutes=1440) in scidex/senate/scheduled_tasks.py, invocable
via scidex tasks run analysis-proposal-generator (optionally with --dry-run).
Quest_engine gap also re-fires when the daily count drops below 25.

☑ Smoke: SELECT COUNT(*) FROM artifacts WHERE artifact_type='analysis_proposal'
      AND created_at > NOW() - INTERVAL '1 day'
= 34 ≥ 25 (2026-04-28).

Approach

Build candidate selector in scidex/agora/analysis_proposal_generator.py:
query knowledge_edges for low-confidence rows, cluster by
(source_entity, target_entity) or shared relation_type.

For each cluster, prompt the LLM (use scidex.core.llm.call_llm) for the
four required metadata fields.

Add the cron entry; gate by daily_budget (scidex/senate/daily_budget.py).
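
The selection and clustering step above can be sketched as follows. This is a minimal illustration and not the module's actual code: the function names, the dictionary keys (source_id, target_id, relation, confidence, evidence_count), and the choice to treat a missing confidence as low are all assumptions made for the example.

```python
from collections import defaultdict


def select_low_confidence(edges, max_confidence=0.5, max_evidence=2):
    """Keep edges with low (or missing) confidence and little evidence.

    Assumption: an unscored edge (confidence is None) counts as low-confidence.
    """
    selected = []
    for edge in edges:
        conf = edge.get("confidence")
        evidence = edge.get("evidence_count") or 0
        if (conf is None or conf < max_confidence) and evidence < max_evidence:
            selected.append(edge)
    return selected


def cluster_edges(edges, mode="pair"):
    """Group edges by (source, target) pair, or by relation type."""
    clusters = defaultdict(list)
    for edge in edges:
        if mode == "pair":
            key = (edge["source_id"], edge["target_id"])
        else:
            key = (edge["relation"],)
        clusters[key].append(edge)
    return dict(clusters)
```

Pair mode keeps each cluster focused on one entity relationship; relation mode yields fewer, broader clusters.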

Dependencies

scidex.atlas.artifact_registry — already supports the analysis_proposal kind.
scidex.agora.experiment_proposal_generator — pattern to mirror.

Work Log

2026-04-28 00:05 PT — Iteration 2: Senate scheduled_task registered

Changes:

Added analysis-proposal-generator scheduled task to
scidex/senate/scheduled_tasks.py (interval_minutes=1440). Registered via
@scheduled_task decorator inside _register_builtin_tasks(). Dry-run mode
returns candidate cluster count without LLM calls; live mode calls
generate_analysis_proposals(batch=25, max_cost_usd=2.0). Invocable via
python3 scidex/senate/scheduled_tasks.py run analysis-proposal-generator.

Verification:

$ scidex tasks list | grep analysis
analysis-proposal-generator  1440m  Nightly: cluster low-confidence KG edges and emit 25 analysis_proposal artifacts ($2 LLM cap)

$ scidex tasks run analysis-proposal-generator --dry-run
{'status': 'dry_run', 'candidate_clusters': 25, 'candidate_edges': 250}

$ SELECT COUNT(*) FROM artifacts WHERE artifact_type='analysis_proposal' AND created_at > NOW() - INTERVAL '1 day';
34   -- exceeds 25/day target

$ SELECT COUNT(*) FROM artifact_links ... WHERE link_type='derives_from'  -- recent
31
$ SELECT COUNT(*) FROM artifact_links ... WHERE link_type='cites'  -- recent
27

All acceptance criteria now met. Quest_engine gap re-fires automatically when
daily count falls below 25, and the Senate scheduled_task provides a standalone
nightly invocation path independent of Orchestra task allocation.

2026-04-27 23:40 PT — Iteration 1 continuation plan

Live staleness check found the task still relevant: the database has 695,972
low-confidence/unscored KG edges and only 8 analysis_proposal artifacts in
the past 24 hours. The prior implementation works, but the first selected
clusters have no matching kg_edge artifacts even though artifact-backed
low-confidence edges exist, so generated proposals currently have 0 derives_from links. This iteration will make candidate selection
provenance-aware, carry matched kg_edge artifact IDs into registration, add
best-effort wiki cites links, then run the generator to move the daily count
toward the 25/day target.

2026-04-27 23:40-23:55 PT — Provenance-aware generation + live artifacts

Changes:

Candidate selection now prefers low-confidence rows with matching kg_edge
artifacts, while falling back to unlinked weak edges when the linked pool is
exhausted.

Proposal registration carries kg_edge_artifact_id through the cluster and
writes idempotent direct artifact_links rows for derives_from provenance.
This avoids artifact_registry.create_link()'s production ON CONFLICT
incompatibility with the partial unique indexes on artifact_links.

Added best-effort wiki cites links for proposal entity IDs.

Non-dry runs now oversample candidate clusters so deduped top clusters do not
cause a nightly run to exit with zero new proposals.
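
One way to write link rows idempotently without relying on ON CONFLICT is an INSERT guarded by NOT EXISTS. The sketch below is an assumption-laden illustration, not the shipped code: it uses an in-memory SQLite table as a stand-in for the production database, the target_artifact_id column name is hypothetical, and link_once, demo_connection, and count_links are names invented for the example. A guard like this is not race-free under concurrent writers.

```python
import sqlite3


def demo_connection():
    """In-memory stand-in for artifact_links (column set is an assumption)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE artifact_links ("
        "source_artifact_id TEXT, target_artifact_id TEXT, link_type TEXT)"
    )
    return conn


def link_once(conn, source_id, target_id, link_type):
    """Insert the link only if an identical row is not already present."""
    conn.execute(
        """
        INSERT INTO artifact_links (source_artifact_id, target_artifact_id, link_type)
        SELECT ?, ?, ?
        WHERE NOT EXISTS (
            SELECT 1 FROM artifact_links
            WHERE source_artifact_id = ?
              AND target_artifact_id = ?
              AND link_type = ?
        )
        """,
        (source_id, target_id, link_type, source_id, target_id, link_type),
    )
    conn.commit()


def count_links(conn, link_type):
    """Count rows of one link type."""
    row = conn.execute(
        "SELECT COUNT(*) FROM artifact_links WHERE link_type = ?", (link_type,)
    ).fetchone()
    return row[0]
```

Calling link_once twice with the same arguments leaves a single row, which is the property a backfill or a re-run needs.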

Live DB work:

Ran python3 -m scidex.agora.analysis_proposal_generator --batch 25 --max-cost-usd 2.
The first run registered proposals but exposed the link helper issue; fixed
the code and backfilled links for the newly-created proposals.

Ran a fixed smoke generation:
python3 -m scidex.agora.analysis_proposal_generator --batch 1 --max-cost-usd 0.2,
which skipped duplicate clusters and registered
analysis_proposal-b5175fbf-9fd8-4050-93f9-1f1dfaeddd2e.
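
The skip-duplicates-then-register behaviour seen in the smoke run can be sketched as a walk over an oversampled candidate list. This is a hypothetical illustration: take_new_clusters and its signature_of callback are invented for the example, and how far the real generator oversamples is not stated here.

```python
def take_new_clusters(candidates, seen_signatures, batch, signature_of):
    """Keep up to `batch` clusters whose signature has not been seen.

    `candidates` should be longer than `batch` (oversampled) so that
    skipped duplicates do not leave the run empty-handed.
    """
    kept = []
    skipped = 0
    for cluster in candidates:
        if len(kept) >= batch:
            break
        signature = signature_of(cluster)
        if signature in seen_signatures:
            skipped += 1
            continue
        seen_signatures.add(signature)
        kept.append(cluster)
    return kept, skipped
```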

Verification evidence:

$ python3 -m py_compile scidex/agora/analysis_proposal_generator.py
✓

$ python3 -m scidex.agora.analysis_proposal_generator --batch 3 --dry-run
DRY RUN: listing up to 3 candidate edge clusters
[1/3] ent-gene-e9e639fd -> rs6733839 [has_risk_variant] ...

$ SELECT COUNT(*) FROM artifacts
  WHERE artifact_type='analysis_proposal'
    AND created_at > NOW() - INTERVAL '1 day';
32

$ SELECT COUNT(*) FROM artifact_links
  WHERE link_type='derives_from'
    AND source_artifact_id IN (
      SELECT id FROM artifacts WHERE artifact_type='analysis_proposal'
    );
27

$ SELECT COUNT(*) FROM artifact_links
  WHERE link_type='cites'
    AND source_artifact_id IN (
      SELECT id FROM artifacts WHERE artifact_type='analysis_proposal'
    );
26

2026-04-27 01:30-01:45 PT — Implementation

Created: scidex/agora/analysis_proposal_generator.py — full implementation.

Key decisions:

Evidence count parsed from JSON list in evidence_sources (Python-side; avoids invalid-JSON server errors from some edge rows).
Clusters by (source_id, target_id) pair (primary mode); relation-type mode also available via --cluster-mode relation.
Dedup via edge_signature stored in metadata, matched against existing open analysis_proposal artifacts.
Budget gating via daily_budget.BudgetChecker.can_start_analysis().
derives_from links created to existing kg_edge artifacts matching the edge triple. cites wiki links attempted for any matched wiki entity (best-effort).

Files changed:

scidex/agora/analysis_proposal_generator.py — new module (~410 lines)
docs/planning/specs/q-prop-analysis-proposals-from-low-confidence-edges_spec.md — work log updated

Tests:

--batch 5 --dry-run: lists 5 candidate clusters ✓
--batch 2 (live): registers 2 artifacts in DB ✓
Dedup: re-running skips already-registered edge sets ✓
--cluster-mode relation: groups by relation type ✓

Known issues:

derives_from links show a count of 0: _find_kg_edge_artifact() finds no matches because KG edges are registered as kg_edge artifacts whose metadata schema does not match the lookup query. Linking is best-effort, so this raises no blocking error.
--cluster-mode relation requires --batch 2 due to limited relation diversity in the fetched sample.
LLM occasionally returns non-JSON preamble; _call_llm strips markdown fences and retries parse.
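
A tolerant parser for LLM replies of this kind might look like the sketch below. It is an illustration under assumptions, not the module's _call_llm: parse_llm_json and missing_fields are hypothetical names, and the retry step is left out. The four field names come from the acceptance criteria.

```python
import json
import re

FENCE = "`" * 3  # markdown code fence, built here to keep this block fence-safe
_FENCED = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)

REQUIRED_FIELDS = ("analysis_question", "datasets", "methods", "expected_outputs")


def parse_llm_json(text):
    """Return the JSON object found in an LLM reply, or None if there is none."""
    match = _FENCED.search(text)
    candidate = match.group(1) if match else text
    # Drop any prose before the first '{' and after the last '}'.
    start = candidate.find("{")
    end = candidate.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        parsed = json.loads(candidate[start:end + 1])
    except json.JSONDecodeError:
        return None
    return parsed if isinstance(parsed, dict) else None


def missing_fields(proposal):
    """List required metadata fields that are absent or empty."""
    return [name for name in REQUIRED_FIELDS if not proposal.get(name)]
```

Returning None instead of raising lets the caller decide whether to retry the prompt or skip the cluster.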

Verification evidence:

$ python3 -m scidex.agora.analysis_proposal_generator --batch 5 --dry-run
[1/5] benchmark_ot_ad_answer_key:GRIN3B -> GRIN3B [data_in] (1 edge(s), conf=N/A)
...
$ python3 -m scidex.agora.analysis_proposal_generator --batch 2
Registered 2 analysis_proposal artifact(s)
  analysis_proposal-4bd81c81-0689-4fcc-b156-cc2a8260bc65
  analysis_proposal-ac678301-3b36-4726-8739-132394d20df3

2026-04-27 — Agent task:894c6ce8 — Bug fixes + quest_engine cron

Problem: Previous implementation raised psycopg.errors.InvalidTextRepresentation
because the SQL used COALESCE(evidence_sources, '[]')::json server-side, which fails on
non-JSON rows (e.g. evidence_sources = 'Expand...').

Fix: Rewrote _fetch_low_confidence_edges to use a plain SQL query with no server-side
JSON parsing. Evidence counting moved entirely to Python (_count_evidence_sources).
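
Python-side evidence counting that tolerates non-JSON rows can be sketched as below. This is an illustration of the idea rather than the actual _count_evidence_sources; treating unparseable or non-list values as zero evidence is an assumption.

```python
import json


def count_evidence_sources(raw):
    """Count entries in a JSON-list value; anything unparseable counts as 0."""
    if not raw:
        return 0
    if isinstance(raw, list):
        return len(raw)
    try:
        parsed = json.loads(raw)
    except (TypeError, ValueError):  # json.JSONDecodeError subclasses ValueError
        return 0
    return len(parsed) if isinstance(parsed, list) else 0
```

Because the parse happens per row in Python, one malformed value no longer aborts the whole query.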

Other fixes in the rewrite:

_build_prompt referenced _count_evidence_sources, but the function was defined as _parse_evidence_count; the names are now consistent.
Removed the filter 'if not d.get("id"): continue', which discarded most edges (id is NULL for ~99% of rows).
Dedup fingerprint now uses source_id|target_id|relation composite keys (not edge UUIDs, which are usually NULL).
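
An order-independent fingerprint over composite keys can be sketched as follows. The source_id|target_id|relation key format is from the note above; hashing the sorted JSON with SHA-256 and the function names are assumptions for illustration, since the stored edge_signature format is not shown here.

```python
import hashlib
import json


def edge_key(edge):
    """Composite key used in place of the usually-NULL edge id."""
    return f"{edge['source_id']}|{edge['target_id']}|{edge['relation']}"


def edge_signature(edges):
    """Fingerprint a cluster: sorted unique keys, serialised as JSON, then hashed."""
    keys = sorted({edge_key(e) for e in edges})
    payload = json.dumps(keys, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def is_duplicate(signature, open_signatures):
    """True if an open proposal already carries this signature."""
    return signature in open_signatures
```

Sorting before serialising means the same edge set yields the same signature regardless of fetch order.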

Added nightly cron via quest_engine.py:

SPEC_PATHS["low-confidence-kg-analysis-proposals"] pointing to this spec.
discover_gaps() gap fires when low_conf_edges > 500 AND recent_proposals < 25 in last 24 h.
Priority 86; acceptance criteria direct the agent to run --batch 25 --max-cost-usd 2.
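
The gap condition above reduces to a two-part predicate. The sketch below is hypothetical (should_fire_gap is not a name from quest_engine.py); the thresholds are the ones stated in the list.

```python
def should_fire_gap(low_conf_edges, recent_proposals, min_edges=500, daily_target=25):
    """Fire when there is enough weak-edge backlog and too few recent proposals."""
    return low_conf_edges > min_edges and recent_proposals < daily_target
```

With the counts recorded in this log, the gap fires at 695,972 edges and 8 proposals, and stays quiet once the daily count reaches 34.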

Verified:

$ python3 -m scidex.agora.analysis_proposal_generator --batch 50 --dry-run
# Lists 50 candidate clusters without errors ✓

$ python3 -m scidex.agora.analysis_proposal_generator --batch 1 --max-cost-usd 0.1
# analysis_proposal-80c6b5f0... registered ✓
# DB check: all 4 required fields present ✓

# Dedup test:
_is_duplicate(db, known_fingerprint) == True ✓
_is_duplicate(db, unknown_fingerprint) == False ✓

# quest_engine gap:
discover_gaps() emits low-confidence-kg-analysis-proposals at priority 86 ✓