Two pieces of external content in SciDEX are currently managed by ad-hoc copies instead of tracked git submodules. This spec cleans both up and lays groundwork for session-scoped workspaces.
forge/skills/ is a manually copied mirror

- SKILL.md bundles were copied on 2026-04-16 from K-Dense-AI/scientific-agent-skills (commit 304dc005cf on the import branch).
- The registration script (register_kdense_skills.py) lives in an abandoned worktree and was never merged to main.
- forge/skills/ is not referenced by any live code path. The canonical skill loader (scidex/forge/skills_canonical.py) reads from skills/ (23 SciDEX-native API wrappers), not forge/skills/.
- .claude/skills/ already symlinks the 23 native skills + 9 founding personas for Claude Code discovery; the same symlink pattern should extend to K-Dense.

/data/scidex-artifacts is already a separate repo

Despite ADR-001 (2026-04-12) deciding to keep artifacts in the main repo, SciDEX-AI/SciDEX-Artifacts was created on 2026-04-18 and is now 1.2 GB / 9,995 files:

- site/figures/ (8,569), site/notebooks/, datasets/. The API at api.py serves from the main-repo path; dual-path fallback code exists but is not hit in prod.
- ADR-001's "revisit when volume exceeds 1 GB" trigger has fired. Reality has already forked; docs haven't caught up.
K-Dense's share model (e.g. session_20260403_101458_7f548077d599) bundles an analysis's inputs, outputs, and code into a single session directory with a manifest. SciDEX has no equivalent; artifacts float loose. Adding a tiny session layer now keeps future work cheap.
- git submodule add https://github.com/K-Dense-AI/scientific-agent-skills.git vendor/kdense-skills (pinned to current head)
- Remove forge/skills/* from main (content is now in the submodule)
- Symlink forge/skills/<slug> → ../../vendor/kdense-skills/scientific-skills/<slug> (keeps the forge/skills/ namespace that existing specs reference)
- Symlink .claude/skills/<slug> → ../../vendor/kdense-skills/scientific-skills/<slug> (one symlink per skill) so Claude Code inside the SciDEX checkout auto-discovers all 134 K-Dense skills in addition to the existing 23 + 9
- Move register_kdense_skills.py into scripts/register_kdense_skills.py on main; point it at vendor/kdense-skills/scientific-skills/

- git submodule add https://github.com/SciDEX-AI/SciDEX-Artifacts.git data/scidex-artifacts (not vendor/ since it's first-class content, not vendored code; path matches the upstream repo name)
- ln -s data/scidex-artifacts /home/ubuntu/scidex/.../mount (skipped: host /data/scidex-artifacts already exists; document that it's the same checkout once the submodule is initialized)
- Add 002-artifacts-separate-repo.md, which supersedes ADR-001 with the factual history (why it split despite the prior decision)
- Do not remove site/figures/, site/notebooks/, datasets/ from main yet. Surface the duplication for human review in a follow-up PR; touching ~10K LFS+tracked files is out of scope for this spec

Add scripts/install_skills.sh:
- git submodule update --init --recursive vendor/kdense-skills vendor/mimeo vendor/mimeographs
- Creates .claude/skills/ symlinks for all 3 sources (personas, native skills, K-Dense skills) if missing
- --register-db flag: runs python -m scidex.forge.skills_canonical (native) and python scripts/register_kdense_skills.py (K-Dense) to populate the DB

Add to scidex-artifacts repo:
- sessions/README.md: describes the session directory convention
- sessions/.gitkeep
- sessions/_schema/session.schema.json: JSON Schema for the session manifest

scidex/atlas/sessions.py: create_session(title, owner) -> session_id, load_session(id) -> dict, list_sessions(limit), link_artifact(session_id, artifact_path, role). Thin wrapper over the filesystem under data/scidex-artifacts/sessions/<id>/

{
"id": "session_<utc-iso-compact>_<8-hex>",
"title": "...",
"owner": "...",
"created_at": "...",
"artifacts": [
{"path": "figures/foo.png", "role": "output", "sha": "..."},
{"path": "notebooks/bar.html", "role": "code"}
],
"parent_session": null,
"notes_md": "..."
}

Phase D is intentionally minimal: filesystem-backed, no DB, no UI. Subsequent work can wire it into analyses, wiki pages, and the share link surface.
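The filesystem-backed session layer described above could look roughly like the sketch below. It is a minimal illustration under stated assumptions, not the actual scidex/atlas/sessions.py: the manifest file name (manifest.json), the extra root and artifacts_root keyword arguments, the timestamp layout used for <utc-iso-compact>, the choice of SHA-256 for the optional "sha" field, and returning a list of ids from list_sessions are all hypothetical choices that the spec does not fix.

```python
import hashlib
import json
import secrets
from datetime import datetime, timezone
from pathlib import Path

# The spec places sessions under data/scidex-artifacts/sessions/<id>/.
SESSIONS_ROOT = Path("data/scidex-artifacts/sessions")
MANIFEST_NAME = "manifest.json"  # assumption: the spec does not name the manifest file


def _write_manifest(session_dir, manifest):
    (Path(session_dir) / MANIFEST_NAME).write_text(json.dumps(manifest, indent=2))


def create_session(title, owner, root=SESSIONS_ROOT):
    """Create a session directory with an empty manifest and return its id."""
    now = datetime.now(timezone.utc)
    # Assumed id layout: session_<YYYYMMDDTHHMMSSZ>_<8 hex chars>.
    session_id = f"session_{now.strftime('%Y%m%dT%H%M%SZ')}_{secrets.token_hex(4)}"
    session_dir = Path(root) / session_id
    session_dir.mkdir(parents=True, exist_ok=False)
    _write_manifest(session_dir, {
        "id": session_id,
        "title": title,
        "owner": owner,
        "created_at": now.isoformat(),
        "artifacts": [],
        "parent_session": None,
        "notes_md": "",
    })
    return session_id


def load_session(session_id, root=SESSIONS_ROOT):
    """Return the manifest of one session as a dict."""
    return json.loads((Path(root) / session_id / MANIFEST_NAME).read_text())


def list_sessions(limit=20, root=SESSIONS_ROOT):
    """Return up to `limit` session ids, newest first (ids sort by timestamp)."""
    root = Path(root)
    if not root.is_dir():
        return []
    ids = sorted(
        (p.name for p in root.iterdir() if (p / MANIFEST_NAME).is_file()),
        reverse=True,
    )
    return ids[:limit]


def link_artifact(session_id, artifact_path, role, root=SESSIONS_ROOT, artifacts_root=None):
    """Append an artifact entry to the manifest; hash the file if it can be found."""
    manifest = load_session(session_id, root)
    entry = {"path": str(artifact_path), "role": role}
    if artifacts_root is not None:
        target = Path(artifacts_root) / artifact_path
        if target.is_file():
            # Assumption: "sha" is a SHA-256 hex digest of the file contents.
            entry["sha"] = hashlib.sha256(target.read_bytes()).hexdigest()
    manifest["artifacts"].append(entry)
    _write_manifest(Path(root) / session_id, manifest)
    return entry
```

Keeping the "sha" field optional mirrors the example manifest, where one artifact carries a hash and the other does not. Passing root explicitly makes the functions easy to exercise against a temporary directory.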
skills DB table (they use a different metadata schema; keeping them FS-only for Claude Code discovery is enough for v1)

Each phase rolls back via git submodule deinit <path> && rm -rf <path> && git checkout -- .gitmodules followed by restoring the deleted directories from git show HEAD~N:<path>.
- git submodule status shows both new submodules pinned
- ls .claude/skills/ | wc -l returns 166 (9 personas + 23 native + 134 K-Dense)
- python -m scidex.forge.skills_canonical --dry-run still reports 23 native skills discovered (unchanged)
- python scripts/register_kdense_skills.py --dry-run reports 134 K-Dense skills
- ls -la data/scidex-artifacts/figures/ returns ~8.5K files (via submodule)

- /api/sessions/{id} route
- git submodule update --remote vendor/kdense-skills + diff review
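The one-symlink-per-skill layout and the entry-count check listed above can be sketched in Python as follows. This is only an illustration: the spec's own installer is a shell script, link_skills and count_entries are hypothetical helper names, and treating any directory that contains a SKILL.md file as a skill is an assumption about the upstream layout.

```python
import os
from pathlib import Path


def link_skills(source_dir, link_dir):
    """Create one relative symlink in link_dir per skill directory in source_dir.

    Existing entries are left untouched, so the function is safe to re-run.
    Returns the names of the links created by this call.
    """
    source_dir, link_dir = Path(source_dir), Path(link_dir)
    link_dir.mkdir(parents=True, exist_ok=True)
    created = []
    for skill in sorted(source_dir.iterdir()):
        # Assumption: a skill is a directory that contains a SKILL.md file.
        if not (skill / "SKILL.md").is_file():
            continue
        link = link_dir / skill.name
        if link.is_symlink() or link.exists():
            continue
        # A relative target keeps the link valid wherever the checkout lives.
        link.symlink_to(os.path.relpath(skill, link_dir), target_is_directory=True)
        created.append(skill.name)
    return created


def count_entries(link_dir):
    """Count directory entries, comparable to `ls <dir> | wc -l`."""
    return sum(1 for _ in Path(link_dir).iterdir())
```

Run from the repository root, link_skills("vendor/kdense-skills/scientific-skills", ".claude/skills") would produce targets of the form ../../vendor/kdense-skills/scientific-skills/<slug>, and count_entries(".claude/skills") could then be compared against the expected 166.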