Spec: [Exchange] Enrich target profiles — add druggability analysis and clinical context
Task ID: f9c96de8-748f-4ebd-8df4-b4af6f54abf9
Goal
Enrich the 155 targets in the targets table with real API-sourced druggability data: ChEMBL drug compounds, gene-specific clinical trials from ClinicalTrials.gov, and Claude-generated target-specific druggability analysis.
Acceptance Criteria
☑ Script fetches real drug/compound data from ChEMBL API per gene
☑ Script fetches gene-specific clinical trials from ClinicalTrials.gov
☑ Claude Haiku generates target-specific druggability rationale (not templates)
☑ Claude Haiku generates mechanism-of-action for targets with short/empty MoA
☑ Merges new drugs with existing, deduplicating by name
☑ Updates clinical_trial_phase to highest observed phase
☑ DB retry logic for concurrent access
☑ Logs to agent_performance table
☑ Dry-run mode for safe testing
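The retry and dry-run criteria could look roughly like the sketch below. It assumes the database is SQLite accessed through the standard sqlite3 module, which this spec does not state; execute_with_retry and its parameters are hypothetical names, not taken from the script.

```python
import sqlite3
import time


def execute_with_retry(conn, sql, params=(), *, retries=5, base_delay=0.5, dry_run=False):
    """Run one write statement, retrying while the database is locked.

    Returns True once the statement is executed and committed,
    and False when nothing was written (for example in dry-run mode).
    Assumes an SQLite connection (an assumption, not confirmed by the spec).
    """
    if dry_run:
        # Dry-run: report the intended write and leave the database untouched.
        print(f"[dry-run] would execute: {sql} {params}")
        return False
    for attempt in range(retries):
        try:
            conn.execute(sql, params)
            conn.commit()
            return True
        except sqlite3.OperationalError as exc:
            # Only lock contention is retried; other errors propagate at once,
            # as does a lock error on the final attempt.
            if "locked" not in str(exc).lower() or attempt == retries - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)  # exponential backoff
    return False
```

Backing off exponentially keeps concurrent writers from retrying in lockstep, and routing every write through one helper means the dry-run flag only has to be checked in one place.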
Approach
Existing enrich_target_druggability.py uses template-based rationales that are generic per target class rather than specific to each target
Built enrich_target_clinical_context.py that:
- Queries ChEMBL mechanism API for real drug compounds per gene
- Queries ClinicalTrials.gov with gene-specific search terms
- Uses Claude Haiku to generate target-specific analysis using real data
- Merges new data with existing, preserving manually curated entries
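The merge step and the phase update can be illustrated with a minimal sketch. It assumes each drug is a dict with a "name" key and that phases are stored as labels such as "Phase 2"; merge_drugs, highest_phase, and PHASE_RANK are hypothetical names, and the script's actual data shapes may differ.

```python
from __future__ import annotations

from typing import Any

# Hypothetical ranking of phase labels; the real column values may differ.
PHASE_RANK = {"preclinical": 0, "phase 1": 1, "phase 2": 2, "phase 3": 3, "phase 4": 4}


def merge_drugs(existing: list[dict[str, Any]], fetched: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Append fetched drugs whose name is not already present (case-insensitive)."""
    merged = list(existing)  # existing, possibly curated, entries are kept as they are
    seen = {d["name"].strip().lower() for d in existing}
    for drug in fetched:
        key = drug["name"].strip().lower()
        if key not in seen:
            seen.add(key)
            merged.append(drug)
    return merged


def highest_phase(current: str | None, observed: list[str]) -> str | None:
    """Return the highest-ranked phase among the current value and the observed ones."""
    candidates = [p for p in [current, *observed] if p and p.lower() in PHASE_RANK]
    if not candidates:
        return current  # nothing recognizable observed, so keep what is stored
    return max(candidates, key=lambda p: PHASE_RANK[p.lower()])
```

Because existing entries are copied first and only unseen names are appended, a curated record always wins over an API record with the same name, and the stored phase can only move up, never down.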
Ran on targets with template-based or shallow druggability data
Work Log
2026-04-02 18:25 PT — Slot 10
- Explored existing target infrastructure: 155 targets, all fields populated but with template-based data
- Built enrich_target_clinical_context.py with ChEMBL + ClinicalTrials.gov + Claude Haiku pipeline
- Dry-run test: 3 targets processed, Claude generating 500-600 char rationales, ChEMBL finding compounds
- Ran enrichment on top 20 targets with weakest data
- Committed and pushed