SciDEX has two parallel knowledge-graph (KG) structures: causal_edges (19,753 rows; free-text source_entity/target_entity with mechanism_description and evidence_pmids) and kg_edges (2,366 rows; typed entity references). The two are NOT connected.
The 19K causal edges are the richest mechanistic knowledge in SciDEX (genes to pathways to diseases, with evidence PMIDs), but they are not queryable via the KG API, not surfaced in hypothesis scoring, and not linked to canonical entities.
What to do:
1. Build an entity resolution pipeline: map the free-text entities in causal_edges to canonical_entities using fuzzy + semantic matching
2. For resolved pairs, create corresponding kg_edges entries
3. Track resolution quality (confidence, unresolved fraction)
4. Enrich hypothesis KG-connectivity scores with new edges
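Steps 2 and 3 could be sketched roughly as follows. This is a minimal illustration, not the actual pipeline: `link_edges`, the `resolve_fn` callback, the output field names (`source_id`, `target_id`, `confidence`), and the choice to take the lower of the two endpoint confidences as the edge confidence are all assumptions. Only `source_entity`, `target_entity`, and `evidence_pmids` come from the causal_edges description above; the real kg_edges schema may differ.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

# Hypothetical resolver result: (canonical_entity_id, confidence), or None if unresolved.
Resolved = Optional[Tuple[str, float]]


def link_edges(
    causal_edges: Iterable[Dict],
    resolve_fn: Callable[[str], Resolved],
) -> Tuple[List[Dict], Dict[str, float]]:
    """Build candidate kg_edges rows for causal edges whose two endpoints both resolve.

    Returns the candidate rows plus simple resolution-quality statistics.
    """
    linked: List[Dict] = []
    total = 0
    unresolved = 0
    for edge in causal_edges:
        total += 1
        src = resolve_fn(edge["source_entity"])
        tgt = resolve_fn(edge["target_entity"])
        if src is None or tgt is None:
            # At least one endpoint has no canonical match: count it and skip the edge.
            unresolved += 1
            continue
        linked.append(
            {
                "source_id": src[0],
                "target_id": tgt[0],
                # Assumption: an edge is only as trustworthy as its weaker endpoint.
                "confidence": min(src[1], tgt[1]),
                "evidence_pmids": edge.get("evidence_pmids"),
            }
        )
    stats = {
        "total": total,
        "linked": len(linked),
        "unresolved_fraction": (unresolved / total) if total else 0.0,
        "mean_confidence": (
            sum(row["confidence"] for row in linked) / len(linked) if linked else 0.0
        ),
    }
    return linked, stats
```

Comparing `stats["linked"]` against the per-iteration target, and watching `unresolved_fraction` across iterations, gives a simple way to report resolution quality.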
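Match strategy (tried in order; first match wins): exact lowercase match -> alias lookup -> substring match -> LLM fuzzy match
Confidence tiers: exact = 1.0; alias = 0.95; fuzzy score >= 0.8 = record the match with that score as its confidence; fuzzy score < 0.8 = skip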
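The match cascade and confidence tiers above might look something like the sketch below. Several things here are assumptions for illustration: the `Resolution` type, the `resolve` signature, and the two lookup dicts are hypothetical names; the substring tier has no confidence stated above, so `substring_confidence=0.9` is a placeholder; and `difflib.SequenceMatcher` stands in for the LLM fuzzy scorer, which would be passed in through `fuzzy_scorer`.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class Resolution:
    entity_id: str
    method: str  # "exact", "alias", "substring", or "fuzzy"
    confidence: float


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace."""
    return " ".join(text.lower().split())


def difflib_scorer(a: str, b: str) -> float:
    """Stand-in for the LLM fuzzy scorer (assumption): character-level similarity in [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()


def resolve(
    text: str,
    canonical: Dict[str, str],  # normalized canonical name -> entity id
    aliases: Dict[str, str],    # normalized alias -> entity id
    fuzzy_scorer: Callable[[str, str], float] = difflib_scorer,
    substring_confidence: float = 0.9,  # placeholder: not specified in the tiers above
    fuzzy_threshold: float = 0.8,
    min_substring_len: int = 4,         # assumption: guards against very short names
) -> Optional[Resolution]:
    key = normalize(text)
    if not key:
        return None
    # Tier 1: exact lowercase match.
    if key in canonical:
        return Resolution(canonical[key], "exact", 1.0)
    # Tier 2: alias lookup.
    if key in aliases:
        return Resolution(aliases[key], "alias", 0.95)
    # Tier 3: substring match, accepted only when it points at exactly one entity.
    hits = {
        eid
        for name, eid in canonical.items()
        if len(name) >= min_substring_len
        and len(key) >= min_substring_len
        and (name in key or key in name)
    }
    if len(hits) == 1:
        return Resolution(hits.pop(), "substring", substring_confidence)
    # Tier 4: fuzzy match; keep the best score and record it only at or above the threshold.
    best_score, best_id = 0.0, None
    for name, eid in canonical.items():
        score = fuzzy_scorer(key, name)
        if score > best_score:
            best_score, best_id = score, eid
    if best_id is not None and best_score >= fuzzy_threshold:
        return Resolution(best_id, "fuzzy", best_score)
    return None  # below threshold: skip


canonical = {"amyloid beta": "ENT:ABETA", "alzheimer's disease": "ENT:AD"}
aliases = {"abeta": "ENT:ABETA", "ad": "ENT:AD"}
print(resolve("Amyloid  Beta", canonical, aliases))
print(resolve("AD", canonical, aliases))
print(resolve("no such entity", canonical, aliases))
```

Returning the tier name alongside the confidence makes it possible to audit which tier produced each link, which helps when tuning the threshold or deciding how much to trust substring matches.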
Read first: docs/planning/specs/quest_atlas_causal_kg_entity_resolution.md
Success criteria per iteration: >= 5,000 causal edges entity-resolved and linked to kg_edges.
Completion Notes
Released by supervisor slot 12 because credential acquisition failed after pre-claim. Reason: worktree_creation_failed:branch_held_by_other:held_by=/tmp/cer-iter5