Effort: extensive
Build a first-class CRISPR-design pipeline that takes a gene symbol
(or an experimental hypothesis with a target_gene) and runs an
end-to-end workflow: pull the canonical transcript via biopython
Entrez, design SpCas9 sgRNAs across the coding region with
on-target scoring (Doench Rule Set 2), screen off-targets across the
human genome with pysam-backed BWA, render an annotated expression
construct map, and persist every step as a versioned artifact in the
artifact registry. The pipeline is invokable as a tool from any debate
("design 5 guides for <gene>") and from a script.
CRISPR design is a workflow, not a tool: guides without off-target
analysis are dangerous; off-target analysis without an annotated
construct is hard to act on; and none of the three is useful unless
the resulting artifact is reproducible. Today SciDEX has zero CRISPR
capability; an experiment-proposal generator
(q-prop-experiment-proposals-from-debate-cruxes) cannot translate a
"knock down MAPT" debate-crux into an actionable experimental
spec. This pipeline fills that gap and enables every wave-1 proposal
quest to emit truly executable proposals.
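
The requirement that every step be persisted as a reproducible, versioned artifact could be sketched as below. This is a minimal illustration, not the registry's actual interface: `write_step_artifact`, the per-step JSON files, and `manifest.json` are hypothetical names, and only the `crispr/<gene>/<run_id>/` directory layout is taken from this spec. The real `commit_artifact` call is not modeled.

```python
from __future__ import annotations

import hashlib
import json
import tempfile
from pathlib import Path


def write_step_artifact(root: Path, gene: str, run_id: str,
                        step: str, payload: dict) -> Path:
    """Write one pipeline step as JSON and record its hash in a manifest.

    Hypothetical helper: the file names and manifest format are assumptions.
    """
    run_dir = root / "crispr" / gene / run_id
    run_dir.mkdir(parents=True, exist_ok=True)

    # Sorted keys make the serialized bytes (and the hash) deterministic.
    body = json.dumps(payload, sort_keys=True, indent=2)
    step_path = run_dir / f"{step}.json"
    step_path.write_text(body, encoding="utf-8")

    # The manifest maps each step name to the SHA-256 of its content.
    manifest_path = run_dir / "manifest.json"
    if manifest_path.exists():
        manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    else:
        manifest = {"gene": gene, "run_id": run_id, "steps": {}}
    manifest["steps"][step] = hashlib.sha256(body.encode("utf-8")).hexdigest()
    manifest_path.write_text(
        json.dumps(manifest, sort_keys=True, indent=2), encoding="utf-8"
    )
    return step_path


if __name__ == "__main__":
    # Usage example with a throwaway directory and made-up payloads.
    demo_root = Path(tempfile.mkdtemp())
    write_step_artifact(demo_root, "MAPT", "demo-run", "guides",
                        {"guides": ["ACGTACGTACGTACGTACGT"]})
    print((demo_root / "crispr" / "MAPT" / "demo-run" / "manifest.json").read_text())
```

Hashing each step's serialized content gives a cheap way to check that a rerun with the same inputs produced the same outputs, which is the property the rationale above depends on.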
scidex/forge/crispr_design.py (≤900 LoC) with:
- design_guides(gene_symbol, n=20, pam='NGG', region='CDS'): [...] biopython Entrez, enumerates [...] crispritz if installed, else a vendored Rule Set 2 weights [...]
- screen_off_targets(guides, genome='hg38'): runs BWA-MEM [...] data/genomes/hg38/); reports CFD score per off-target hit; [...]
- build_construct(guide, vector='lentiCRISPRv2'): renders an [...] Bio.SeqIO.write) with [...]
- pipeline(gene_symbol): composes the three calls, writes [...] data/scidex-artifacts/crispr/<gene>/<run_id>/, [...] commit_artifact.
- crispr_design_run(run_id PRIMARY KEY, gene_symbol, [...]
- tools.py registers crispr_design_pipeline(gene_symbol) as a [...] @log_tool_call instrumentation.
- /api/crispr/design/<gene> POST endpoint kicks off a run and [...]
- /artifacts/<id> page renders the construct as a SnapGene-style [...]
- python -m scidex.forge.crispr_design --gene [...]
- tests/test_crispr_design.py: synthetic 1 kb gene, [...]
- [...] crispritz is present, prefer it.
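
The candidate-enumeration step behind design_guides with pam='NGG' can be illustrated with a small standard-library sketch. It scans both strands of a sequence for 20-nt protospacers followed by an NGG PAM. The names `Guide`, `reverse_complement`, and `enumerate_ngg_guides` are hypothetical, and the sketch leaves out transcript fetching, CDS restriction, and Rule Set 2 scoring, so it should be read as an assumption-laden outline rather than the module's implementation.

```python
from __future__ import annotations

from dataclasses import dataclass

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase ACGT string."""
    return seq.translate(_COMPLEMENT)[::-1]


@dataclass(frozen=True)
class Guide:
    protospacer: str  # 20 nt, written 5'->3' on the targeted strand
    pam: str          # the 3 nt immediately 3' of the protospacer
    strand: str       # '+' or '-'
    start: int        # 0-based plus-strand coordinate of the leftmost protospacer base


def enumerate_ngg_guides(seq: str, guide_len: int = 20) -> list[Guide]:
    """List every protospacer followed by an NGG PAM on either strand."""
    seq = seq.upper()
    n = len(seq)
    guides: list[Guide] = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for i in range(len(s) - guide_len - 3 + 1):
            proto = s[i:i + guide_len]
            pam = s[i + guide_len:i + guide_len + 3]
            # Require NGG and skip windows containing ambiguous bases.
            if pam[1:] != "GG" or set(proto + pam) - set("ACGT"):
                continue
            # Map minus-strand windows back to plus-strand coordinates.
            start = i if strand == "+" else n - (i + guide_len)
            guides.append(Guide(proto, pam, strand, start))
    return guides


if __name__ == "__main__":
    for g in enumerate_ngg_guides("CCTACGTACGTACGTACGTACGTAGG"):
        print(g)
```

Reporting plus-strand coordinates for both strands keeps later steps, such as filtering guides to a coding region, independent of which strand a guide targets.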
- scripts/build_crispr_offtarget_index.py [...] .bwt/.pac/.sa files under data/genomes/hg38/ [...]
- [...] biopython features API; vector backbones [...] .gb files in scidex/forge/crispr_vectors/.
- q-tools-skill-marketplace lists [...] biopython, pysam skills.
- [...] data/scidex-artifacts/ submodule.
- q-prop-experiment-proposals-from-debate-cruxes: consumer of [...]
- scidex/forge/crispr_design.py (~550 LoC) with design_guides, screen_off_targets, build_construct, pipeline functions
- [...] 100_add_crispr_design_run.py
- crispr_design_pipeline in tools.py with @log_tool_call
- POST /api/crispr/design/{gene_symbol} endpoint to api.py
- crispr_design artifact type viewer in artifact_detail (type_viewer_html)
- tests/test_crispr_design.py (18 tests, all passing)
- python -m scidex.forge.crispr_design --gene MAPT → 4 guides, [...] crispr-mapt-<run_id>