Trust scores give single-shot analyses the same credibility as
replicated ones. We have replication_clustering.py for clustering
existing replications but no automated generator of new ones. Build
a Replication Runner: take a finalised analysis, re-execute it
verbatim on a different data slice (different cohort, different
publication-year window, different gene-list seed), then auto-compare
verdicts and assign a replication_status ∈ {confirmed, partial,
contradicted, untestable} to the original artifact.
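
The status assignment described above could be sketched as follows. This is a minimal sketch, not the runner's actual logic: the `ReplicaOutcome` type and the aggregation rule (all replicas agree gives confirmed, none agree gives contradicted, a mix gives partial, no runnable replica gives untestable) are assumptions made for illustration. The only rule the spec itself states is that the status is confirmed when the verdicts agree.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class ReplicaOutcome:
    """Hypothetical record of one replica run (not an existing SciDEX type)."""
    strategy: str                  # e.g. "temporal_holdout", "cohort_swap"
    verdict_match: Optional[bool]  # None when the replica could not be run


def aggregate_status(outcomes: Sequence[ReplicaOutcome]) -> Tuple[str, int, float]:
    """Return (replication_status, n_replicas, verdict_consistency).

    Assumed rule: only replicas that actually ran are counted, and
    verdict_consistency is the fraction of those that matched the parent.
    """
    ran = [o for o in outcomes if o.verdict_match is not None]
    if not ran:
        # Nothing could be re-executed, e.g. no alternative cohort exists.
        return "untestable", 0, 0.0

    agree = sum(1 for o in ran if o.verdict_match)
    consistency = agree / len(ran)

    if agree == len(ran):
        status = "confirmed"
    elif agree == 0:
        status = "contradicted"
    else:
        status = "partial"
    return status, len(ran), consistency
```

The returned triple lines up with the status, n_replicas and verdict_consistency columns of replication_history, so a caller could write it to that table together with a timestamp for computed_at.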

- scidex/atlas/replication_runner.py::replicate(analysis_id, slice_strategy) returns a new analysis_id linked via replication_links(parent_id, replica_id, strategy, similarity, verdict_match).
- temporal_holdout (papers > 2024 only), cohort_swap (other disease cohort if available), seed_perturb (different starter gene list with same domain), model_swap (Sonnet instead of Opus).
- replication_status and write to analyses.replication_status + replication_history(analysis_id, status, n_replicas, verdict_consistency, computed_at).
- epistemic_tiers.classify_* reads replication_status: confirmed ratchets tier toward T1, contradicted ratchets toward T4.
- T2/T3 analyses lacking replicas and queues replicas (auction-priced).
- /analysis/{id} shows a "Replications" panel with each replica's verdict + similarity heatmap.
- temporal_holdout produces analysis B; verdict comparison correct; status set to confirmed when both verdicts agree.
- verdict_match = cosine similarity over verdict embeddings (use existing vector_search engine for embeddings) plus dimension-by-dimension agreement.
- temporal_holdout: filter papers.published_year >= cutoff in the seed corpus; record the cutoff in replication_links.config_json.
- q-er-preregistration (replicas inherit the original prereg).
- epistemic_tiers.py (consumes replication_status).

{
"completion_shas": [
"cc8cb612f",
"4561ca06f"
],
"completion_shas_checked_at": ""
}
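
The verdict_match rule in the spec (cosine similarity over verdict embeddings plus dimension-by-dimension agreement) might look roughly like the sketch below. It takes the embeddings as plain float sequences, because the vector_search engine's API is not described here. The function names, the reading of "dimension" as a named verdict dimension with a categorical label, and both thresholds are assumptions for illustration, not values from the spec.

```python
import math
from typing import Dict, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def dimension_agreement(parent: Dict[str, str], replica: Dict[str, str]) -> float:
    """Fraction of shared verdict dimensions on which both analyses give the same label."""
    shared = parent.keys() & replica.keys()
    if not shared:
        return 0.0
    return sum(parent[k] == replica[k] for k in shared) / len(shared)


def verdict_match(
    parent_vec: Sequence[float],
    replica_vec: Sequence[float],
    parent_dims: Dict[str, str],
    replica_dims: Dict[str, str],
    min_similarity: float = 0.8,   # assumed threshold
    min_agreement: float = 0.5,    # assumed threshold
) -> Tuple[float, float, bool]:
    """Return (similarity, agreement, matched); both checks must pass to match."""
    similarity = cosine_similarity(parent_vec, replica_vec)
    agreement = dimension_agreement(parent_dims, replica_dims)
    matched = similarity >= min_similarity and agreement >= min_agreement
    return similarity, agreement, matched
```

Returning the two scores alongside the boolean keeps the similarity value available for the similarity column of replication_links and for the heatmap on the "Replications" panel.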