[Forge] Build automated PubMed update pipeline for hypothesis evidence

Task ID: eba164da-60eb-48ba-99b3-7246ec365465

Goal

Create and maintain a recurring pipeline that searches PubMed for new papers related to top hypotheses and updates their evidence_for/evidence_against fields with classified citations. The pipeline runs as a systemd daemon, uses Claude Haiku for evidence classification, and tracks all updates in a monitoring log.
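The recurring behaviour described above can be sketched as a simple loop. This is a minimal illustration, not the code in pubmed_update_pipeline.py: `run_daemon`, `run_once`, and `max_cycles` are hypothetical names, and the 6-hour default is taken from the acceptance criteria.

```python
import time
from typing import Callable, Optional

# 6 hours, the default interval named in the acceptance criteria.
DEFAULT_INTERVAL_SECONDS = 6 * 60 * 60


def run_daemon(
    run_once: Callable[[], None],
    interval_seconds: float = DEFAULT_INTERVAL_SECONDS,
    sleep: Callable[[float], None] = time.sleep,
    max_cycles: Optional[int] = None,
) -> int:
    """Call run_once repeatedly, sleeping interval_seconds between cycles.

    Hypothetical helper: max_cycles exists only so the loop can be tested.
    A real daemon would pass None and run until systemd stops it.
    Returns the number of completed cycles.
    """
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        try:
            run_once()
        except Exception as exc:
            # Keep the daemon alive if a single update cycle fails.
            print(f"update cycle failed: {exc}")
        cycles += 1
        # Skip the final sleep when a cycle limit has been reached.
        if max_cycles is None or cycles < max_cycles:
            sleep(interval_seconds)
    return cycles
```

Passing `sleep` as a parameter is a design choice for testability: a dry run can substitute a no-op and exercise several cycles without waiting.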

Current State

The pipeline (pubmed_update_pipeline.py) is already built and running as the scidex-pubmed-update systemd service. It has processed 176 hypotheses and added 509 papers. API endpoints exist at /api/pubmed-pipeline-status and /api/pubmed-pipeline/trigger.

Acceptance Criteria

☑ Pipeline script with PubMed search, Claude Haiku classification, DB update
☑ Daemon mode with configurable interval (default 6 hours)
☑ systemd service running (scidex-pubmed-update)
☑ API endpoints for status and manual trigger
☑ Monitoring log table (pubmed_update_log)
☑ Deduplication of existing PMIDs
☑ Rate limiting for NCBI API compliance
☐ Evidence items include abstract snippets for richer display
☐ Classifier returns strength ratings (strong/moderate/weak)
☐ Search queries use hypothesis description keywords for broader coverage
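Two of the completed criteria, PMID deduplication and NCBI rate limiting, could look roughly like the sketch below. The names `filter_new_pmids` and `RateLimiter` are assumptions for illustration and may not match the pipeline's actual code. The default of 3 requests per second reflects NCBI's E-utilities guidance for clients without an API key (10 per second with a key).

```python
import time
from typing import Callable, Iterable, List, Optional, Set


def filter_new_pmids(candidates: Iterable[str], existing: Set[str]) -> List[str]:
    """Return candidate PMIDs not already stored, preserving order and
    dropping repeats within the candidate list itself."""
    seen = set(existing)
    fresh: List[str] = []
    for pmid in candidates:
        pmid = str(pmid).strip()
        if pmid and pmid not in seen:
            seen.add(pmid)
            fresh.append(pmid)
    return fresh


class RateLimiter:
    """Enforce a minimum gap between consecutive calls to wait()."""

    def __init__(
        self,
        max_per_second: float = 3.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None

    def wait(self) -> None:
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

In use, the pipeline would call `limiter.wait()` immediately before each PubMed request and pass search results through `filter_new_pmids` before classification, so that already-stored papers never reach the classifier.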

Approach

  • Enhance evidence item format to include abstract snippet (200 chars) and strength rating
  • Update Claude Haiku prompt to include strength classification
  • Improve build_queries() to extract keywords from description text
  • Test with dry run, verify evidence quality
  • Commit and deploy
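The first three steps above can be sketched as follows. This is an illustration under stated assumptions, not the pipeline's implementation: the field names (`pmid`, `title`, `snippet`, `strength`), the stopword list, the frequency-based keyword ranking, and the fallback to "weak" for unrecognised ratings are all hypothetical choices. Only the 200-character snippet length and the strong/moderate/weak scale come from this spec.

```python
import re
from typing import Dict, List

STRENGTHS = ("strong", "moderate", "weak")
SNIPPET_CHARS = 200  # snippet length named in the approach

# Small illustrative stopword list (an assumption, not the pipeline's list).
STOPWORDS = {
    "the", "and", "for", "with", "that", "this", "from", "are", "was",
    "were", "which", "into", "may", "can", "via", "has", "have", "its",
}


def extract_keywords(description: str, limit: int = 5) -> List[str]:
    """Pick the most frequent non-stopword terms from a description,
    breaking ties by first appearance."""
    words = re.findall(r"[A-Za-z][A-Za-z0-9-]{2,}", description.lower())
    counts: Dict[str, int] = {}
    for word in words:
        if word not in STOPWORDS:
            counts[word] = counts.get(word, 0) + 1
    # dicts keep insertion order and sorted() is stable, so ties stay
    # in first-appearance order.
    ranked = sorted(counts, key=lambda w: -counts[w])
    return ranked[:limit]


def make_snippet(abstract: str, max_chars: int = SNIPPET_CHARS) -> str:
    """Collapse whitespace and truncate at a word boundary,
    keeping the result within max_chars including the ellipsis."""
    text = " ".join(abstract.split())
    if len(text) <= max_chars:
        return text
    cut = text[: max_chars - 3].rsplit(" ", 1)[0]
    return cut + "..."


def normalize_strength(raw: str) -> str:
    """Map classifier output onto the three-level scale.
    Unrecognised values fall back to 'weak' (a conservative assumption)."""
    value = raw.strip().lower()
    return value if value in STRENGTHS else "weak"


def make_evidence_item(pmid: str, title: str, abstract: str, strength: str) -> Dict[str, str]:
    """Build one evidence entry with a snippet and a strength rating."""
    return {
        "pmid": pmid,
        "title": title,
        "snippet": make_snippet(abstract),
        "strength": normalize_strength(strength),
    }
```

Normalising the rating after the model responds, rather than trusting the raw text, keeps malformed classifier output from reaching the stored evidence fields.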

Work Log

    2026-04-02 — Slot 6

    • Reviewed existing pipeline: fully operational, 176 hypotheses tracked, 509 papers added
    • Identified quality gaps: evidence items lack abstract snippets and strength ratings
    • Enhancing classifier prompt and evidence item format

    File: eba164da-60eb-48ba-99b3-7246ec365465_spec.md
    Modified: 2026-05-01 20:13
    Size: 1.9 KB