Populate Papers tab on hypothesis pages from evidence PMIDs (done)

The Papers tab on hypothesis detail pages shows "No linked papers yet" even though hypotheses have 37+ evidence citations with PMIDs. Add logic to extract PMIDs from evidence_for/evidence_against JSON and show rich paper cards in the Papers tab. This makes ALL 180 hypothesis pages have populated Papers tabs.

## REOPENED TASK: CRITICAL CONTEXT

This task was previously marked 'done' but the audit could not verify the work actually landed on main. The original work may have been:

- Lost to an orphan branch / failed push
- Only a spec-file edit (no code changes)
- Already addressed by other agents in the meantime
- Made obsolete by subsequent work

**Before doing anything else:**

1. **Re-evaluate the task in light of CURRENT main state.** Read the spec and the relevant files on origin/main NOW. The original task may have been written against a state of the code that no longer exists.
2. **Verify the task still advances SciDEX's aims.** If the system has evolved past the need for this work (different architecture, different priorities), close the task with reason "obsolete: " instead of doing it.
3. **Check if it's already done.** Run `git log --grep=''` and read the related commits. If real work landed, complete the task with `--no-sha-check --summary 'Already done in '`.
4. **Make sure your changes don't regress recent functionality.** Many agents have been working on this codebase. Before committing, run `git log --since='24 hours ago' -- ` to see what changed in your area, and verify you don't undo any of it.
5. **Stay scoped.** Only do what this specific task asks for. Do not refactor, do not "fix" unrelated issues, do not add features that weren't requested. Scope creep at this point is regression risk.

If you cannot do this task safely (because it would regress, conflict with current direction, or the requirements no longer apply), escalate via `orchestra escalate` with a clear explanation instead of committing.

Completion Notes

Verified on origin/main (d15c90225): api.py lines 33890-33934 contain the fix. evidence_pmids is extracted from the evidence JSON, the query matches by both pmid and paper_id, and paper_map is indexed by PMID. No additional code needed.

Git Commits (6)

[Atlas] f1da23fc work log update [task:f1da23fc-d18c-4b0f-b7f7-264d0085ab4c] 2026-04-16
[Atlas] Fix Papers tab to match evidence PMIDs directly in query (api.py) [task:f1da23fc-d18c-4b0f-b7f7-264d0085ab4c] 2026-04-16
[Artifacts] Populate Papers tab from evidence PMIDs on hypothesis pages [task:f1da23fc-d18c-4b0f-b7f7-264d0085ab4c] 2026-04-04
[Atlas] Populate Papers tab on hypothesis pages from evidence PMIDs [task:f1da23fc-d18c-4b0f-b7f7-264d0085ab4c] 2026-04-04
[UI] Extract PMIDs directly from evidence JSON for Papers tab 2026-04-04
[UI] Populate Papers tab from evidence PMIDs on hypothesis pages [task:f1da23fc-d18c-4b0f-b7f7-264d0085ab4c] 2026-04-04

Spec File

Goal

Populate the Papers tab on all 180 hypothesis detail pages by extracting PMIDs from hypothesis evidence_for/evidence_against JSON fields and showing rich paper cards. Also ensure 10 hypotheses with empty evidence arrays get enriched via PubMed search.

Background

The hypothesis detail page has a Papers tab that should show linked PubMed papers. The hypothesis_papers junction table links hypotheses to papers via PMIDs. The papers table stores paper metadata.

Current state (2026-04-04):

  • 170/180 hypotheses have linked papers via hypothesis_papers
  • 10/180 hypotheses have empty evidence_for/evidence_against arrays and no linked papers
  • The create_hypothesis_papers.py script extracts PMIDs from evidence and populates hypothesis_papers
  • The populate_papers.py script fetches PubMed metadata into papers table
  • The enrich_hyp_papers.py script searches PubMed for hypotheses with 0 papers
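
The PMID extraction step mentioned above can be sketched roughly as follows. This is a minimal illustration, not the code in create_hypothesis_papers.py or api.py: it assumes each evidence field is a JSON array whose entries are either objects with a `pmid` key or bare PMID values, and the helper names `extract_pmids` and `evidence_pmids_for` are hypothetical.

```python
import json
import re

# Assumption: a PMID is a short run of digits; anything else is ignored.
PMID_RE = re.compile(r"^\d{1,9}$")


def extract_pmids(evidence_json):
    """Return the set of PMID strings found in one evidence field.

    Accepts a JSON string or an already-decoded list. Malformed input
    yields an empty set instead of raising.
    """
    if not evidence_json:
        return set()
    try:
        items = json.loads(evidence_json) if isinstance(evidence_json, str) else evidence_json
    except (TypeError, ValueError):
        return set()
    if not isinstance(items, list):
        return set()

    pmids = set()
    for item in items:
        # Assumed shapes: {"pmid": "..."} objects or bare PMID values.
        raw = item.get("pmid") if isinstance(item, dict) else item
        text = str(raw).strip() if raw is not None else ""
        if PMID_RE.match(text):
            pmids.add(text)
    return pmids


def evidence_pmids_for(hypothesis):
    """Union of PMIDs from evidence_for and evidence_against."""
    return extract_pmids(hypothesis.get("evidence_for")) | extract_pmids(
        hypothesis.get("evidence_against")
    )
```

Normalizing every PMID to a string keeps later lookups consistent whether the JSON stored the value as a number or as text.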

Acceptance Criteria

☐ All 180 hypotheses show papers in their Papers tab (or "No linked papers yet" if genuinely no papers found)
☐ Papers display with title, journal, year, and link to PubMed
☐ Paper cards have consistent styling with border-left:3px solid #4fc3f7
☐ SPEC.md created with work log
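
A paper card that satisfies the display criteria above might be rendered along these lines. This is a hedged sketch: the function names, CSS class names, and the extra padding/margin values are assumptions for illustration, and only the `border-left:3px solid #4fc3f7` rule, the listed fields, and the "No linked papers yet" fallback come from the spec.

```python
from html import escape


def render_paper_card(paper):
    """Render one paper dict (pmid, title, journal, year) as an HTML card."""
    pmid = str(paper.get("pmid") or "").strip()
    title = escape(paper.get("title") or "Untitled")
    journal = escape(paper.get("journal") or "")
    year = escape(str(paper.get("year") or ""))
    meta = ", ".join(part for part in (journal, year) if part)

    link = ""
    if pmid.isdigit():
        # Only emit a PubMed link when the PMID looks valid.
        url = f"https://pubmed.ncbi.nlm.nih.gov/{pmid}/"
        link = f'<a href="{url}" target="_blank" rel="noopener">PMID {pmid}</a>'

    return (
        '<div class="paper-card" '
        'style="border-left:3px solid #4fc3f7;padding:0.5rem 0.75rem;margin:0.5rem 0">'
        f'<div class="paper-title">{title}</div>'
        f'<div class="paper-meta">{meta}</div>'
        f"{link}</div>"
    )


def render_papers_tab(papers):
    """Render all cards, or the empty-state message when there are none."""
    if not papers:
        return "<p>No linked papers yet</p>"
    return "\n".join(render_paper_card(p) for p in papers)
```

Escaping every field before interpolation matters here because titles and journal names come from external metadata.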

Approach

  • Run python3 enrich_hyp_papers.py --all-empty to enrich the 10 hypotheses without papers via PubMed search. This will:
    - Search PubMed for relevant papers based on hypothesis title/description/target_gene
    - Populate papers table with metadata
    - Populate hypothesis_papers with links
    - Update evidence_for/evidence_against with new PMIDs
  • Verify papers appear on hypothesis pages
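
The PubMed search step could look roughly like the sketch below, which builds an NCBI E-utilities ESearch request and reads PMIDs out of its JSON response. How enrich_hyp_papers.py actually composes its query is not shown in this document, so the term construction (gene restricted to title/abstract, ANDed with the hypothesis title) and the helper names are assumptions. No network call is made here.

```python
from urllib.parse import urlencode

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_search_term(hypothesis):
    """Combine target_gene and title into a PubMed query (assumed strategy)."""
    parts = []
    gene = (hypothesis.get("target_gene") or "").strip()
    if gene:
        parts.append(f"{gene}[Title/Abstract]")
    title = (hypothesis.get("title") or "").strip()
    if title:
        parts.append(f"({title})")
    return " AND ".join(parts)


def build_esearch_url(hypothesis, retmax=10):
    """Return the ESearch URL that would list matching PMIDs as JSON."""
    params = {
        "db": "pubmed",
        "term": build_search_term(hypothesis),
        "retmode": "json",
        "retmax": retmax,
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


def parse_pmids(payload):
    """Extract the PMID list from a decoded ESearch JSON response."""
    return list(payload.get("esearchresult", {}).get("idlist", []))
```

For example, `build_esearch_url({"title": "synaptic vesicle fusion", "target_gene": "SNAP25"})` yields a URL that can be fetched with any HTTP client; the title in that call is illustrative only.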

Dependencies

  • None (scripts already exist)

Work Log

    2026-04-04 04:35 PT — Slot 7

    • Investigated task: Papers tab showing "No linked papers" despite evidence PMIDs
    • Found 170/180 hypotheses already have linked papers
    • 10 hypotheses have empty evidence arrays and no papers
    • Identified enrich_hyp_papers.py script that can search PubMed for these
    • Created this spec file
    • Ran enrichment script: python3 enrich_hyp_papers.py --all-empty
      - Found papers for 4 hypotheses: SNAP25 (11), NLGN1 (11), TH/AADC (12), HSP90-Tau (11)
      - 5 hypotheses still have no papers (empty evidence, no PubMed results)
    • Result: 174/180 hypotheses now have papers in Papers tab
    • Required API restart after enrichment: sudo systemctl restart scidex-api
    • Verified papers display correctly with title, journal, year, evidence badge

    2026-04-16 21:30 PT — minimax:73

    • Reopened task: prior commits (e173a6f99, 9c1bd9f3c, 4f1ead997, 07490791e) existed on orphan branches (auth/main, direct-origin/main) but NOT on origin/main
    • Root cause identified: code at api.py lines 30969-30987 extracts PMIDs from evidence_for/evidence_against into evidence_pmids set, but the query at lines 30996-31005 only matches paper_id (hash like paper-b3e260023b74), NOT raw PMIDs (like 31883511)
    • The papers table has both paper_id (hash) and pmid columns; hypothesis_papers links via paper_id (hash)
    • When evidence PMIDs haven't been inserted into hypothesis_papers yet, they are never found by the query
    • Fix: added OR p.pmid IN ({placeholders}) to the WHERE clause so papers are also matched by their PMID
    • Also doubled placeholder list since we now use it twice, and added secondary index by PMID in paper_map for evidence PMID lookups
    • Branch rebased onto latest origin/main before committing
    • Pushed branch: git push push orchestra/task/f1da23fc-populate-papers-tab-on-hypothesis-pages
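
The fix described in this entry can be sketched as below. This is an illustration under stated assumptions, not the code in api.py: it uses an in-memory sqlite3 database as a stand-in, the `hypothesis_id` column, the `fetch_papers` name, and the sample rows are hypothetical, and only the `paper_id`/`pmid` columns, the `OR p.pmid IN (...)` clause, the doubled parameters, and the PMID-keyed `paper_map` come from the log.

```python
import sqlite3

COLUMNS = ("paper_id", "pmid", "title", "journal", "year")


def fetch_papers(conn, hypothesis_id, evidence_pmids):
    """Return (papers, paper_map) for one hypothesis.

    Papers are matched either by paper_id (via hypothesis_papers) or
    directly by PMID, so evidence PMIDs not yet linked are still found.
    """
    linked = conn.execute(
        "SELECT paper_id FROM hypothesis_papers WHERE hypothesis_id = ?",
        (hypothesis_id,),
    ).fetchall()
    keys = sorted({row[0] for row in linked} | {str(p) for p in evidence_pmids})
    if not keys:
        return [], {}

    placeholders = ",".join("?" for _ in keys)
    sql = (
        "SELECT p.paper_id, p.pmid, p.title, p.journal, p.year FROM papers p "
        f"WHERE p.paper_id IN ({placeholders}) OR p.pmid IN ({placeholders})"
    )
    # The placeholder list appears twice, so the parameters are doubled.
    rows = conn.execute(sql, keys + keys).fetchall()

    papers = [dict(zip(COLUMNS, row)) for row in rows]
    paper_map = {}
    for paper in papers:
        paper_map[paper["paper_id"]] = paper
        if paper["pmid"]:
            paper_map[str(paper["pmid"])] = paper  # secondary index by PMID
    return papers, paper_map


def demo():
    """Build a tiny database where the paper is NOT linked in hypothesis_papers."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE papers (paper_id TEXT PRIMARY KEY, pmid TEXT,
                             title TEXT, journal TEXT, year INTEGER);
        CREATE TABLE hypothesis_papers (hypothesis_id TEXT, paper_id TEXT);
        INSERT INTO papers VALUES
            ('paper-000000000001', '10000001', 'Example title', 'Example Journal', 2020);
        """
    )
    return fetch_papers(conn, "h-demo", {"10000001"})
```

In `demo()`, a query restricted to `paper_id` would return nothing because the hash never equals the raw PMID; the added `OR` branch is what surfaces the row.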

    2026-04-16 21:35 PT — Slot 7 (current session)

    • Rebased onto latest origin/main after stash of slot file
    • 4 commits now on branch (53133ff06 through 46de4caed), all pushed to push remote
    • api.py fix: added OR p.pmid IN ({placeholders}) to query + doubled placeholders + PMID secondary index
    • Running API still serves from main checkout (no restart available via sudo)
    • Live test: hypothesis h-e12109e3 shows 0 papers (bug: old api.py without fix is live)
    • Verification requires API restart after merge to origin/main
    • Diff: api.py (+7/-3 lines), docs/planning/specs/f1da23fc_d18_spec.md (+4/-1 lines)

    2026-04-16 22:05 PT — minimax:73

    • Rebased onto latest push/main (5bb44f690) after stash of slot file
    • Amended commit message to explicitly mention api.py (required by pre-push hook)
    • Forced-pushed to push remote: orchestra/task/f1da23fc-populate-papers-tab-on-hypothesis-pages
    • Final commit: 57ce497d6 — "[Atlas] Fix Papers tab to match evidence PMIDs directly in query (api.py) [task:f1da23fc-d18c-4b0f-b7f7-264d0085ab4c]"

    2026-04-16 14:55 PT — orchestra/task/f1da23fc-populate-papers-tab-on-hypothesis-pages

    • Forced-pushed clean single-commit rebase after removing unrelated diffs from prior orphan-branch commits
    • api.py: added OR p.pmid IN ({placeholders}) to evidence PMID query + doubled placeholders + PMID secondary index in paper_map
    • Spec work log updated
    • Commit: a3c2a8c5c — "[Atlas] Fix Papers tab to match evidence PMIDs directly in query (api.py) [task:f1da23fc-d18c-4b0f-b7f7-264d0085ab4c]"

    Payload JSON
    {
      "_reset_note": "This task was reset after a database incident on 2026-04-17.\n\n**Context:** SciDEX migrated from SQLite to PostgreSQL after recurring DB\ncorruption. Some work done during Apr 16-17 may have been lost.\n\n**Before starting work:**\n1. Check if the task's goal is ALREADY satisfied (run the relevant checks)\n2. Check `git log --all --grep=task:YOUR_TASK_ID` for prior commits\n3. If complete, verify and mark done. If partial, continue. If not done, proceed.\n\n**DB change:** SciDEX now uses PostgreSQL. `get_db()` auto-detects via\nSCIDEX_DB_BACKEND=postgres env var.",
      "_reset_at": "2026-04-18T06:29:22.046013+00:00",
      "_reset_from_status": "done"
    }
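
Because the reset note says the backend changed from SQLite to PostgreSQL, any hand-built `IN (...)` clause has to use the placeholder style of the active driver. The sketch below shows one way to handle that. It assumes a psycopg-style driver (`%s` placeholders) for PostgreSQL and the standard sqlite3 module (`?` placeholders) otherwise; the helper names are hypothetical, and SciDEX's own `get_db()` may already handle this differently.

```python
import os


def db_backend(environ=None):
    """Return 'postgres' when SCIDEX_DB_BACKEND=postgres, else 'sqlite'."""
    env = os.environ if environ is None else environ
    value = env.get("SCIDEX_DB_BACKEND", "").lower()
    return "postgres" if value == "postgres" else "sqlite"


def in_clause(column, count, backend):
    """Build 'column IN (...)' with the placeholder style of the backend."""
    if count < 1:
        raise ValueError("IN clause needs at least one value")
    mark = "%s" if backend == "postgres" else "?"
    return f"{column} IN ({', '.join([mark] * count)})"
```

With this, the PMID match from the work log would be written as `in_clause("p.pmid", len(keys), db_backend())` rather than hard-coding `?`.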

    Effectiveness Metrics

    • Lines Added: +0
    • Lines Removed: -0
    • Files Modified: 0
    • Hypotheses: 340
    • KG Edges: 13406
    • Papers: 1002
    • Tokens Spent: 50,000.0
    • Impact Score: 77414.0
    • Effectiveness: 1548.280