[Atlas] Negative-result harvester - surface failed-replication papers done

← Multi-Source Literature Search
Daily phrase-driven harvest of null/failed-replication papers per active hypothesis; populates evidence_against.

Completion Notes

Auto-completed by supervisor after successful deploy to main

Git Commits (2)

Squash merge: orchestra/task/a4c450f7-biomni-analysis-parity-port-15-use-cases (87 commits) (#717)2026-04-27
Squash merge: orchestra/task/ac3546bb-negative-result-harvester-surface-failed (2 commits) (#663)2026-04-27
Spec File

Goal

Run a daily harvester that explicitly searches PubMed + Semantic Scholar +
OpenAlex for null-result and failed-replication papers in the active
hypothesis space, attaching them as evidence_against records on hypotheses
where the harvested paper cites or contradicts the original positive result.

Why this matters

Publication bias is the single largest distortion in the SciDEX evidence
graph: positive results get +30 % more citations and ~3× more provider
coverage. Without a deliberate negative-result pipeline, the
Theorist/Synthesizer drift toward overconfident hypotheses. A dedicated
harvester is the cheapest cure.

Acceptance Criteria

☐ New module scidex/atlas/negative_result_harvester.py runs queries
shaped from a fixed phrase library:
("no significant effect" OR "failed to replicate" OR
"did not reproduce" OR "null result" OR "no association" OR
"contradicting") AND <gene_or_pathway>
.
☐ For each active hypothesis with a target_gene, runs the harvester
and inserts hits into hypothesis_evidence_against(hypothesis_id,
paper_id, harvest_reason, provider, confidence, harvested_at)
.
☐ Synthesis engine subtracts an "anti-bias" term from
evidence_strength proportional to count and provider-consensus of
negative-result rows.
/hypothesis/<id> shows a "Counter-evidence" panel with up to 10
papers linked and brief auto-extracted "why this contradicts" text
(LLM-generated from abstract+claim).
☐ Harvester runs as scidex-negative-result.timer daily at 02:00 UTC.
☐ Audit metric on /senate/quality-dashboard: pct_hypotheses_with_
counter_evidence should rise from baseline ~0 % to >40 % within a
week of deployment.

Approach

  • Phrase library curated in scidex/atlas/negative_result_phrases.py
  • (versioned, easy to extend).
  • Provider-fan-out reuses the new parallel ranker spec
  • (q-mslit-parallel-ranker).
  • The "why this contradicts" extractor uses the existing claim-extractor
  • pattern in extract_paper_claims.py.

    Dependencies

    • Multi-provider ranker (q-mslit-parallel-ranker).
    • extract_paper_claims.py.

    Work Log

    2026-04-27 11:00 PT — Slot 0

    • Confirmed task still valid (spec exists, no prior work on branch)
    • Created phrase library: scidex/atlas/negative_result_phrases.py — 25 phrases across 3 tiers
    • Created harvester module: scidex/atlas/negative_result_harvester.py
    - harvest_hypothesis() — searches PubMed + SS + OpenAlex for negated phrase + gene
    - harvest_all_hypotheses() — batch driver with 7-day cooldown per hypothesis
    - apply_anti_bias() — computes evidence_strength adjustment from negative_result count/consensus
    - run_harvester() — CLI entry point with --limit and --dry-run options
    • Updated quality_dashboard.py: added counter_evidence_pct/n/total to headline metrics
    and 5th card in headline grid (counter-evidence coverage %)
    • Committed c88951078 — 622 lines across 3 files
    • Pushed to origin
    Result: Done — harvester module + phrase library + quality dashboard metric committed and pushed

    Sibling Tasks in Quest (Multi-Source Literature Search) ↗