Tau propagation mechanisms and therapeutic interception points¶
Notebook ID: nb-SDA-2026-04-04-gap-tau-prop-20260402003221 · Analysis: SDA-2026-04-04-gap-tau-prop-20260402003221 · Generated: 2026-04-10
Research question¶
Investigate prion-like spreading of tau pathology through connected brain regions, focusing on trans-synaptic transfer, extracellular vesicle-mediated spread, and intervention strategies at each propagation step
Approach¶
This notebook is generated programmatically from real Forge tool calls and SciDEX debate data. Code cells load cached evidence bundles from data/forge_cache/seaad/*.json and query live data from scidex.db. Re-run python3 scripts/regenerate_notebooks.py --analysis SDA-2026-04-04-gap-tau-prop-20260402003221 --force to refresh.
7 hypotheses were generated and debated. The knowledge graph has 100 edges.
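Because several cells below fall back to placeholder rows when a cached bundle is missing or empty, it can help to inventory the cache before reading the results. The following is a minimal sketch, assuming only the `*.json` layout described above; `cache_inventory` and the demo file contents are hypothetical and not part of the generated notebook.

```python
import json
import tempfile
from pathlib import Path

def cache_inventory(cache_dir):
    """Map each cached bundle name to its record count (0 means empty, None means unreadable)."""
    inventory = {}
    for path in sorted(Path(cache_dir).glob('*.json')):
        try:
            payload = json.loads(path.read_text())
        except json.JSONDecodeError:
            inventory[path.stem] = None
            continue
        # Lists and dicts report their length; any other JSON value counts as one record.
        inventory[path.stem] = len(payload) if isinstance(payload, (list, dict)) else 1
    return inventory

# Demo on a throwaway directory; the contents are made up for illustration.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    (tmp / 'string_network.json').write_text(
        json.dumps([{'protein1': 'A', 'protein2': 'B', 'score': 0.9}]))
    (tmp / 'enrichr_KEGG_Pathways.json').write_text('[]')
    demo = cache_inventory(tmp)

print(demo)
```

Pointing the same function at `data/forge_cache/seaad` would show which bundles are empty, under the assumption that the directory exists in the working tree.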
Debate Summary¶
Quality score: 0.54 · Rounds: 4 · Personas: Theorist, Skeptic, Domain_Expert, Synthesizer
1. Forge tool provenance¶
import json, sys, sqlite3
from pathlib import Path
import pandas as pd
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
from IPython.display import display

matplotlib.rcParams['figure.dpi'] = 110
matplotlib.rcParams['figure.facecolor'] = 'white'

REPO = Path('.').resolve()
sys.path.insert(0, str(REPO))
CACHE_SUB = 'seaad'
CACHE = REPO / 'data' / 'forge_cache' / CACHE_SUB

def load(name):
    """Return the cached JSON bundle for `name`, or {} if the file is missing."""
    p = CACHE / f'{name}.json'
    if p.exists():
        return json.loads(p.read_text())
    return {}

db_path = Path('/home/ubuntu/scidex/scidex.db')
try:
    db = sqlite3.connect(str(db_path))
    prov = pd.read_sql_query('''
        SELECT skill_id, status, COUNT(*) AS n_calls,
               ROUND(AVG(duration_ms),0) AS mean_ms
        FROM tool_calls
        WHERE created_at >= date('now','-30 days')
        GROUP BY skill_id, status
        ORDER BY n_calls DESC
    ''', db)
    db.close()
    prov['tool'] = prov['skill_id'].str.replace('tool_', '', regex=False)
    print(f'{len(prov)} tool-call aggregates (last 30 days):')
    # A bare expression inside try/except is not auto-rendered, so display it explicitly.
    display(prov[['tool','status','n_calls','mean_ms']].head(20))
except Exception as e:
    print(f'Provenance unavailable: {e}')
77 tool-call aggregates (last 30 days):
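To make the aggregation concrete without access to `scidex.db`, the same query can be run against an in-memory SQLite table. This is a sketch only: the column names come from the query above, while the rows and skill names are invented for illustration.

```python
import sqlite3

con = sqlite3.connect(':memory:')
con.execute(
    'CREATE TABLE tool_calls (skill_id TEXT, status TEXT, duration_ms REAL, created_at TEXT)'
)
# Hypothetical rows; the last one falls outside the 30-day window.
con.executemany(
    "INSERT INTO tool_calls VALUES (?, ?, ?, date('now', ?))",
    [
        ('tool_pubmed_search', 'success', 100.0, '-1 days'),
        ('tool_pubmed_search', 'success', 300.0, '-2 days'),
        ('tool_pubmed_search', 'error', 50.0, '-3 days'),
        ('tool_string_network', 'success', 80.0, '-45 days'),
    ],
)
rows = con.execute('''
    SELECT skill_id, status, COUNT(*) AS n_calls,
           ROUND(AVG(duration_ms), 0) AS mean_ms
    FROM tool_calls
    WHERE created_at >= date('now', '-30 days')
    GROUP BY skill_id, status
    ORDER BY n_calls DESC
''').fetchall()
con.close()

for row in rows:
    print(row)
```

Each output row is one (skill, status) pair, which is why the count printed above refers to aggregates rather than individual tool calls.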
2. Target gene annotations¶
ann_rows = []
for g in ['VCP']:
    mg = load(f'mygene_{g}')
    hpa = load(f'hpa_{g}')
    if not mg and not hpa:
        # Neither source returned data: emit a placeholder row.
        ann_rows.append({'gene': g, 'name': '—', 'protein_class': '—',
                         'disease_involvement': '—'})
        continue
    ann_rows.append({
        'gene': g,
        'name': (mg.get('name') or '')[:55],
        'protein_class': ', '.join((hpa.get('protein_class') or [])[:2])[:55]
                         if isinstance(hpa.get('protein_class'), list)
                         else str(hpa.get('protein_class') or '—')[:55],
        'disease_involvement': ', '.join((hpa.get('disease_involvement') or [])[:2])[:55]
                               if isinstance(hpa.get('disease_involvement'), list)
                               else str(hpa.get('disease_involvement') or '')[:55],
    })
pd.DataFrame(ann_rows)
| | gene | name | protein_class | disease_involvement |
|---|---|---|---|---|
| 0 | VCP | — | — | — |
3. GO Biological Process enrichment (Enrichr)¶
go_bp = load('enrichr_GO_Biological_Process')
if isinstance(go_bp, list) and go_bp:
    go_df = pd.DataFrame(go_bp[:10])[['term','p_value','odds_ratio','genes']]
    go_df['p_value'] = go_df['p_value'].apply(lambda p: f'{p:.2e}')
    go_df['odds_ratio'] = go_df['odds_ratio'].round(1)
    go_df['term'] = go_df['term'].str[:60]
    go_df['n_hits'] = go_df['genes'].apply(len)
    go_df['genes'] = go_df['genes'].apply(lambda g: ', '.join(g))
    # An expression inside an if-block is not auto-rendered, so display it explicitly.
    display(go_df[['term','n_hits','p_value','odds_ratio','genes']])
else:
    print('No GO:BP enrichment data')
# Visualize top GO BP enrichment
go_bp = load('enrichr_GO_Biological_Process')
if isinstance(go_bp, list) and go_bp:
    top = go_bp[:8]
    # Reverse so the most significant term is drawn at the top of the bar chart.
    terms = [t['term'][:45] for t in top][::-1]
    neglogp = [-np.log10(max(t['p_value'], 1e-300)) for t in top][::-1]
    fig, ax = plt.subplots(figsize=(9, 4.5))
    ax.barh(terms, neglogp, color='#4fc3f7')
    ax.set_xlabel('-log10(p-value)')
    ax.set_title('Top GO:BP enrichment (Enrichr)')
    ax.grid(axis='x', alpha=0.3)
    plt.tight_layout(); plt.show()
else:
    print('No GO:BP data to plot')
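The bar lengths in the plot cell are p-values mapped to -log10, with a floor of 1e-300 so that a reported p-value of exactly zero does not produce an infinite bar. A small standalone sketch of that transform follows; the helper name `neg_log10_p` and the sample p-values are illustrative, not taken from the cached enrichment.

```python
import math

def neg_log10_p(p, floor=1e-300):
    """Return -log10(p), clamping p at `floor` to avoid log10(0)."""
    return -math.log10(max(p, floor))

# Illustrative p-values only.
values = [neg_log10_p(p) for p in (0.05, 1e-3, 0.0)]
print(values)
```

A p-value of 0.05 maps to roughly 1.3 and 0.001 maps to 3, so each additional unit on the x-axis corresponds to a tenfold smaller p-value.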
4. KEGG pathway enrichment¶
kegg = load('enrichr_KEGG_Pathways')
if isinstance(kegg, list) and kegg:
    kegg_df = pd.DataFrame(kegg[:10])[['term','p_value','odds_ratio','genes']]
    kegg_df['genes'] = kegg_df['genes'].apply(lambda g: ', '.join(g))
    kegg_df['p_value'] = kegg_df['p_value'].apply(lambda p: f'{p:.2e}')
    kegg_df['odds_ratio'] = kegg_df['odds_ratio'].round(1)
    # An expression inside an if-block is not auto-rendered, so display it explicitly.
    display(kegg_df)
else:
    print('No KEGG enrichment data')
No KEGG enrichment data
5. STRING protein interaction network¶
ppi = load('string_network')
if isinstance(ppi, list) and ppi:
    ppi_df = pd.DataFrame(ppi).sort_values('score', ascending=False)
    display_cols = [c for c in ['protein1','protein2','score','escore','tscore'] if c in ppi_df.columns]
    print(f'{len(ppi_df)} STRING edges')
    # An expression inside an if-block is not auto-rendered, so display it explicitly.
    display(ppi_df[display_cols].head(20))
else:
    print('No STRING edges returned')
11 STRING edges
# Network figure
ppi = load('string_network')
if isinstance(ppi, list) and ppi:
    import math
    nodes = sorted({p for e in ppi for p in (e['protein1'], e['protein2'])})
    n = len(nodes)
    # Place nodes evenly around the unit circle.
    pos = {n_: (math.cos(2*math.pi*i/n), math.sin(2*math.pi*i/n)) for i, n_ in enumerate(nodes)}
    fig, ax = plt.subplots(figsize=(7, 7))
    # Edge opacity and width scale with the STRING score.
    for e in ppi:
        x1,y1 = pos[e['protein1']]; x2,y2 = pos[e['protein2']]
        ax.plot([x1,x2],[y1,y2], color='#888', alpha=0.3+0.5*e['score'],
                linewidth=0.5+2*e['score'])
    for name,(x,y) in pos.items():
        ax.scatter([x],[y], s=450, color='#ffd54f', edgecolors='#333', zorder=3)
        ax.annotate(name, (x,y), ha='center', va='center', fontsize=8, fontweight='bold', zorder=4)
    ax.set_aspect('equal'); ax.axis('off')
    ax.set_title(f'STRING PPI network ({len(ppi)} edges)')
    plt.tight_layout(); plt.show()
else:
    print('No STRING data to visualize')
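The figure cell uses a plain circular layout and scales each edge's opacity and width by its score. The sketch below isolates those two steps without matplotlib. It assumes scores lie in the 0 to 1 range, which is what the alpha formula requires; the node names, scores, and the helper names `circular_layout` and `edge_style` are hypothetical.

```python
import math

def circular_layout(nodes):
    """Place the distinct node names evenly on the unit circle, in sorted order."""
    ordered = sorted(set(nodes))
    n = len(ordered)
    return {
        name: (math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n))
        for i, name in enumerate(ordered)
    }

def edge_style(score):
    """Return (alpha, linewidth) for an edge; alpha is capped at 1.0 as a safeguard."""
    return min(1.0, 0.3 + 0.5 * score), 0.5 + 2 * score

# Hypothetical edges in the same shape as the cached STRING bundle.
edges = [
    {'protein1': 'A', 'protein2': 'B', 'score': 0.9},
    {'protein1': 'B', 'protein2': 'C', 'score': 0.5},
    {'protein1': 'C', 'protein2': 'D', 'score': 0.7},
]
pos = circular_layout(p for e in edges for p in (e['protein1'], e['protein2']))
styles = [edge_style(e['score']) for e in edges]
print(pos)
print(styles)
```

The cap on alpha is an addition for safety and is not in the notebook cell; it only matters if a bundle ever stores scores on a scale larger than 1.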
6. Reactome pathway footprint¶
pw_rows = []
for g in ['VCP']:
    pws = load(f'reactome_{g}')
    if isinstance(pws, list):
        pw_rows.append({'gene': g, 'n_pathways': len(pws),
                        'top_pathway': (pws[0]['name'] if pws else '—')[:70]})
    else:
        # A missing bundle loads as {}, which is reported as zero pathways.
        pw_rows.append({'gene': g, 'n_pathways': 0, 'top_pathway': '—'})
pd.DataFrame(pw_rows).sort_values('n_pathways', ascending=False)
| | gene | n_pathways | top_pathway |
|---|---|---|---|
| 0 | VCP | 0 | — |
7. Allen Brain Atlas ISH regional expression¶
ish_rows = []
for g in ['VCP']:
    ish = load(f'allen_ish_{g}')
    # Parentheses make the existing precedence explicit; behavior is unchanged.
    regions = (ish.get('regions') or []) if isinstance(ish, dict) else []
    ish_rows.append({
        'gene': g,
        'n_ish_regions': len(regions),
        'top_region': (regions[0].get('structure','') if regions else '—')[:45],
        'top_energy': round(regions[0].get('expression_energy',0), 2) if regions else None,
    })
pd.DataFrame(ish_rows)
| | gene | n_ish_regions | top_region | top_energy |
|---|---|---|---|---|
| 0 | VCP | 0 | — | — |
8. Hypothesis ranking (7 hypotheses)¶
hyp_data = [('VCP-Mediated Autophagy Enhancement', 0.563), ('Trans-Synaptic Adhesion Molecule Modulation', 0.559), ('Synaptic Vesicle Tau Capture Inhibition', 0.557), ('LRP1-Dependent Tau Uptake Disruption', 0.554), ('Extracellular Vesicle Biogenesis Modulation', 0.553), ('TREM2-mediated microglial tau clearance enhancement', 0.521), ('HSP90-Tau Disaggregation Complex Enhancement', 0.506)]
titles = [h[0] for h in hyp_data][::-1]
scores = [h[1] for h in hyp_data][::-1]
fig, ax = plt.subplots(figsize=(10, max(8, len(titles)*0.4)))
colors = ['#ef5350' if s >= 0.6 else '#ffa726' if s >= 0.5 else '#66bb6a' for s in scores]
ax.barh(range(len(titles)), scores, color=colors)
ax.set_yticks(range(len(titles))); ax.set_yticklabels(titles, fontsize=7)
ax.set_xlabel('Composite Score'); ax.set_title('Tau propagation mechanisms and therapeutic interception points')
ax.grid(axis='x', alpha=0.3)
plt.tight_layout(); plt.show()
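The color thresholds in the ranking cell can be checked directly against the seven composite scores. The sketch below reuses the scores and hex colors from the cell above; the helper name `band_color` and the use of `Counter` are illustrative choices.

```python
from collections import Counter

# Composite scores copied from hyp_data above.
hyp_scores = {
    'VCP-Mediated Autophagy Enhancement': 0.563,
    'Trans-Synaptic Adhesion Molecule Modulation': 0.559,
    'Synaptic Vesicle Tau Capture Inhibition': 0.557,
    'LRP1-Dependent Tau Uptake Disruption': 0.554,
    'Extracellular Vesicle Biogenesis Modulation': 0.553,
    'TREM2-mediated microglial tau clearance enhancement': 0.521,
    'HSP90-Tau Disaggregation Complex Enhancement': 0.506,
}

def band_color(score):
    """Same thresholds as the plotting cell: >= 0.6, >= 0.5, otherwise."""
    if score >= 0.6:
        return '#ef5350'
    if score >= 0.5:
        return '#ffa726'
    return '#66bb6a'

band_counts = Counter(band_color(s) for s in hyp_scores.values())
spread = max(hyp_scores.values()) - min(hyp_scores.values())
print(band_counts, round(spread, 3))
```

All seven hypotheses land in the middle band and the top and bottom scores differ by about 0.057, so the ordering in the chart rests on small differences.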
9. Score dimension heatmap (top 10)¶
labels = ['VCP-Mediated Autophagy Enhancement', 'Trans-Synaptic Adhesion Molecule Modulat', 'Synaptic Vesicle Tau Capture Inhibition', 'LRP1-Dependent Tau Uptake Disruption', 'Extracellular Vesicle Biogenesis Modulat', 'TREM2-mediated microglial tau clearance ', 'HSP90-Tau Disaggregation Complex Enhance']
matrix = np.array([[0.5124, 0.46359999999999996, 0.49776, 0.518, 0.56, 0, 0, 0.72, 0], [0.357, 0.32299999999999995, 0.3468, 0.36, 0.4, 0, 0, 0.35, 0], [0.357, 0.32299999999999995, 0.3468, 0.36, 0.462, 0, 0, 0.35, 0], [0.5091692307692308, 0.46067692307692304, 0.49462153846153845, 0.5149230769230769, 1.0, 0, 0, 0.62, 0], [0.357, 0.32299999999999995, 0.3468, 0.36, 0.367, 0, 0, 0.35, 0], [0.5419906661714097, 0.49037250748841826, 0.5265052185665123, 0.566181586829914, 0.462, 0, 0, 0.72, 0], [0.5544, 0.5016, 0.53856, 0.5780000000000001, 0.12, 0, 0, 0.82, 0]])
dims = ['novelty_score', 'feasibility_score', 'impact_score', 'mechanistic_plausibility_score', 'clinical_relevance_score', 'data_availability_score', 'reproducibility_score', 'druggability_score', 'safety_profile_score']
if matrix.size:
    fig, ax = plt.subplots(figsize=(10, 5))
    im = ax.imshow(matrix, cmap='RdYlGn', aspect='auto', vmin=0, vmax=1)
    ax.set_xticks(range(len(dims)))
    # Turn 'novelty_score' into 'Novelty' etc. for the axis labels.
    ax.set_xticklabels([d.replace('_score','').replace('_',' ').title() for d in dims],
                       rotation=45, ha='right', fontsize=8)
    ax.set_yticks(range(len(labels))); ax.set_yticklabels(labels, fontsize=7)
    ax.set_title('Score dimensions — top hypotheses')
    plt.colorbar(im, ax=ax, shrink=0.8)
    plt.tight_layout(); plt.show()
else:
    print('No score data available')
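Several columns of the score matrix are zero for every hypothesis, which the heatmap renders as solid red. A quick check like the one below makes those columns explicit. It is a sketch in plain Python: the values are rounded to three decimals from the matrix above, and whether a zero means "scored as zero" or "not scored" is not stated in the notebook, so that interpretation is left open.

```python
dims = ['novelty_score', 'feasibility_score', 'impact_score',
        'mechanistic_plausibility_score', 'clinical_relevance_score',
        'data_availability_score', 'reproducibility_score',
        'druggability_score', 'safety_profile_score']

# Rows follow the hypothesis order above; values rounded from the matrix cell.
matrix = [
    [0.512, 0.464, 0.498, 0.518, 0.560, 0, 0, 0.72, 0],
    [0.357, 0.323, 0.347, 0.360, 0.400, 0, 0, 0.35, 0],
    [0.357, 0.323, 0.347, 0.360, 0.462, 0, 0, 0.35, 0],
    [0.509, 0.461, 0.495, 0.515, 1.000, 0, 0, 0.62, 0],
    [0.357, 0.323, 0.347, 0.360, 0.367, 0, 0, 0.35, 0],
    [0.542, 0.490, 0.527, 0.566, 0.462, 0, 0, 0.72, 0],
    [0.554, 0.502, 0.539, 0.578, 0.120, 0, 0, 0.82, 0],
]

# A dimension is flagged when every hypothesis has a zero in that column.
all_zero_dims = [
    dim for j, dim in enumerate(dims)
    if all(row[j] == 0 for row in matrix)
]
print(all_zero_dims)
```

Three of the nine dimensions are flagged, so only six columns of the heatmap carry information that distinguishes the hypotheses.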
10. PubMed evidence per hypothesis¶
Hypothesis 1: VCP-Mediated Autophagy Enhancement¶
Target genes: VCP · Composite score: 0.563
Molecular Mechanism and Rationale
The valosin-containing protein (VCP), also known as p97, represents a critical hexameric AAA+ ATPase that orchestrates multiple cellular quality control pathways, including autophagy, endoplasmic reticulum-associated degradation (ERAD), and proteasomal degradation. In the context of tauopathies, VCP functions as a key regulatory hub for tau aggregate clearance through its essential role in autophagosome maturation and lysosomal fusion. The molecular mechani
hid = 'h-18a0fcc6'
papers = load(f'pubmed_{hid}')
if isinstance(papers, list) and papers:
    lit = pd.DataFrame(papers)
    cols = [c for c in ['year','journal','title','pmid'] if c in lit.columns]
    if cols:
        lit = lit[cols].copy()  # work on an independent copy of the selected columns
        if 'title' in lit.columns:
            lit['title'] = lit['title'].str[:80]
        if 'journal' in lit.columns:
            lit['journal'] = lit['journal'].str[:30]
        if 'year' in lit.columns:
            lit = lit.sort_values('year', ascending=False)
        display_df = lit
    else:
        display_df = pd.DataFrame(papers[:5])
else:
    display_df = pd.DataFrame([{'note':'no PubMed results'}])
display_df
| | note |
|---|---|
| 0 | no PubMed results |
Hypothesis 2: Trans-Synaptic Adhesion Molecule Modulation¶
Target genes: NLGN1 · Composite score: 0.559
Molecular Mechanism and Rationale
The neurexin-neuroligin trans-synaptic adhesion system represents a critical molecular bridge that maintains synaptic integrity while potentially facilitating pathological tau propagation in neurodegenerative diseases. Neuroligin-1 (NLGN1), the primary target of this therapeutic approach, is a postsynaptic cell adhesion molecule that forms heterotypic interactions with presynaptic neurexins (NRXN1, NRXN2, NRXN3). This interaction occurs through the extracel
hid = 'h-fdaae8d9'
papers = load(f'pubmed_{hid}')
if isinstance(papers, list) and papers:
    lit = pd.DataFrame(papers)
    cols = [c for c in ['year','journal','title','pmid'] if c in lit.columns]
    if cols:
        lit = lit[cols].copy()  # work on an independent copy of the selected columns
        if 'title' in lit.columns:
            lit['title'] = lit['title'].str[:80]
        if 'journal' in lit.columns:
            lit['journal'] = lit['journal'].str[:30]
        if 'year' in lit.columns:
            lit = lit.sort_values('year', ascending=False)
        display_df = lit
    else:
        display_df = pd.DataFrame(papers[:5])
else:
    display_df = pd.DataFrame([{'note':'no PubMed results'}])
display_df
| | note |
|---|---|
| 0 | no PubMed results |
Hypothesis 3: Synaptic Vesicle Tau Capture Inhibition¶
Target genes: SNAP25 · Composite score: 0.557
Molecular Mechanism and Rationale
The synaptic vesicle tau capture inhibition hypothesis centers on the critical role of SNAP25 (Synaptosome-Associated Protein of 25 kDa) in facilitating pathological tau protein uptake at presynaptic terminals during synaptic vesicle recycling processes. SNAP25 is a key component of the SNARE (Soluble N-ethylmaleimide-sensitive factor Attachment protein REceptor) complex, which mediates synaptic vesicle fusion with the presynaptic membrane during neurotrans
hid = 'h-73e29e3a'
papers = load(f'pubmed_{hid}')
if isinstance(papers, list) and papers:
    lit = pd.DataFrame(papers)
    cols = [c for c in ['year','journal','title','pmid'] if c in lit.columns]
    if cols:
        lit = lit[cols].copy()  # work on an independent copy of the selected columns
        if 'title' in lit.columns:
            lit['title'] = lit['title'].str[:80]
        if 'journal' in lit.columns:
            lit['journal'] = lit['journal'].str[:30]
        if 'year' in lit.columns:
            lit = lit.sort_values('year', ascending=False)
        display_df = lit
    else:
        display_df = pd.DataFrame(papers[:5])
else:
    display_df = pd.DataFrame([{'note':'no PubMed results'}])
display_df
| | note |
|---|---|
| 0 | no PubMed results |
Hypothesis 4: LRP1-Dependent Tau Uptake Disruption¶
Target genes: LRP1 · Composite score: 0.554
Overview
LRP1 (Low-density lipoprotein receptor-related protein 1) functions as a critical gateway receptor mediating the cellular internalization of pathological tau species in Alzheimer's disease. This therapeutic hypothesis proposes developing selective small molecule inhibitors targeting the tau-binding domain of LRP1 to block cellular uptake of pathological tau while preserving essential LRP1 functions in lipid metabolism, cellular signaling, and vascular homeostasis. The strategy addr
hid = 'h-4dd0d19b'
papers = load(f'pubmed_{hid}')
if isinstance(papers, list) and papers:
    lit = pd.DataFrame(papers)
    cols = [c for c in ['year','journal','title','pmid'] if c in lit.columns]
    if cols:
        lit = lit[cols].copy()  # work on an independent copy of the selected columns
        if 'title' in lit.columns:
            lit['title'] = lit['title'].str[:80]
        if 'journal' in lit.columns:
            lit['journal'] = lit['journal'].str[:30]
        if 'year' in lit.columns:
            lit = lit.sort_values('year', ascending=False)
        display_df = lit
    else:
        display_df = pd.DataFrame(papers[:5])
else:
    display_df = pd.DataFrame([{'note':'no PubMed results'}])
display_df
| | note |
|---|---|
| 0 | no PubMed results |
Hypothesis 5: Extracellular Vesicle Biogenesis Modulation¶
Target genes: CHMP4B · Composite score: 0.553
Molecular Mechanism and Rationale¶
The endosomal sorting complex required for transport III (ESCRT-III) represents a critical molecular machinery governing the final stages of extracellular vesicle (EV) biogenesis, particularly the formation of multivesicular bodies (MVBs) and subsequent exosome release. CHMP4B (Charged Multivesicular body Protein 4B) functions as a core component of the ESCRT-III complex, working in concert with other CHMP proteins (CHMP2A, CHMP3, CHMP6) to execute membr
hid = 'h-55ef81c5'
papers = load(f'pubmed_{hid}')
if isinstance(papers, list) and papers:
    lit = pd.DataFrame(papers)
    cols = [c for c in ['year','journal','title','pmid'] if c in lit.columns]
    if cols:
        lit = lit[cols].copy()  # work on an independent copy of the selected columns
        if 'title' in lit.columns:
            lit['title'] = lit['title'].str[:80]
        if 'journal' in lit.columns:
            lit['journal'] = lit['journal'].str[:30]
        if 'year' in lit.columns:
            lit = lit.sort_values('year', ascending=False)
        display_df = lit
    else:
        display_df = pd.DataFrame(papers[:5])
else:
    display_df = pd.DataFrame([{'note':'no PubMed results'}])
display_df
| | note |
|---|---|
| 0 | no PubMed results |
Hypothesis 6: TREM2-mediated microglial tau clearance enhancement¶
Target genes: TREM2 · Composite score: 0.521
TREM2-Mediated Microglial Reprogramming for Tau Clearance in Alzheimer's Disease
Overview: Microglia as Tau Propagators vs. Tau Clearers
TREM2 (Triggering Receptor Expressed on Myeloid cells 2) is a microglial surface receptor that regulates phagocytic activity, metabolic fitness, and inflammatory responses. In Alzheimer's disease, TREM2 function becomes critically important: Loss-of-function variants (R47H, R62H) increase AD risk 2-4-fold, while enhancing TREM2 signaling shows therape
hid = 'h-b234254c'
papers = load(f'pubmed_{hid}')
if isinstance(papers, list) and papers:
    lit = pd.DataFrame(papers)
    cols = [c for c in ['year','journal','title','pmid'] if c in lit.columns]
    if cols:
        lit = lit[cols].copy()  # work on an independent copy of the selected columns
        if 'title' in lit.columns:
            lit['title'] = lit['title'].str[:80]
        if 'journal' in lit.columns:
            lit['journal'] = lit['journal'].str[:30]
        if 'year' in lit.columns:
            lit = lit.sort_values('year', ascending=False)
        display_df = lit
    else:
        display_df = pd.DataFrame(papers[:5])
else:
    display_df = pd.DataFrame([{'note':'no PubMed results'}])
display_df
| | note |
|---|---|
| 0 | no PubMed results |
Hypothesis 7: HSP90-Tau Disaggregation Complex Enhancement¶
Target genes: HSP90AA1 · Composite score: 0.506
Molecular Mechanism and Rationale¶
The heat shock protein 90 (HSP90) chaperone system represents a critical cellular machinery for protein folding, stability, and quality control. HSP90AA1, the inducible cytoplasmic isoform of HSP90, exhibits distinct conformational states that can be allosterically modulated to enhance specific client protein interactions. In the context of tau pathology, HSP90 demonstrates intrinsic disaggregation activity toward tau aggregates through a complex mechani
hid = 'h-0f00fd75'
papers = load(f'pubmed_{hid}')
if isinstance(papers, list) and papers:
    lit = pd.DataFrame(papers)
    cols = [c for c in ['year','journal','title','pmid'] if c in lit.columns]
    if cols:
        lit = lit[cols].copy()  # work on an independent copy of the selected columns
        if 'title' in lit.columns:
            lit['title'] = lit['title'].str[:80]
        if 'journal' in lit.columns:
            lit['journal'] = lit['journal'].str[:30]
        if 'year' in lit.columns:
            lit = lit.sort_values('year', ascending=False)
        display_df = lit
    else:
        display_df = pd.DataFrame(papers[:5])
else:
    display_df = pd.DataFrame([{'note':'no PubMed results'}])
display_df
| | note |
|---|---|
| 0 | no PubMed results |
11. Knowledge graph edges (100 total)¶
edge_data = [{'source': 'LRP1', 'relation': 'regulates', 'target': 'LRP1-Dependent Tau Uptake Disr', 'strength': 0.7}, {'source': 'TREM2', 'relation': 'regulates', 'target': 'TREM2-mediated microglial tau ', 'strength': 0.7}, {'source': 'CHMP4B', 'relation': 'regulates', 'target': 'Extracellular Vesicle Biogenes', 'strength': 0.7}, {'source': 'VCP', 'relation': 'regulates', 'target': 'VCP-Mediated Autophagy Enhance', 'strength': 0.7}, {'source': 'HSP90AA1', 'relation': 'regulates', 'target': 'HSP90-Tau Disaggregation Compl', 'strength': 0.7}, {'source': 'SNAP25', 'relation': 'regulates', 'target': 'Synaptic Vesicle Tau Capture I', 'strength': 0.7}, {'source': 'NLGN1', 'relation': 'regulates', 'target': 'Trans-Synaptic Adhesion Molecu', 'strength': 0.7}, {'source': 'LRP1-Dependent Tau Uptake Disr', 'relation': 'therapeutic_target', 'target': "Alzheimer's Disease", 'strength': 0.65}, {'source': 'TREM2-mediated microglial tau ', 'relation': 'therapeutic_target', 'target': "Alzheimer's Disease", 'strength': 0.65}, {'source': 'Extracellular Vesicle Biogenes', 'relation': 'therapeutic_target', 'target': "Alzheimer's Disease", 'strength': 0.65}, {'source': 'VCP-Mediated Autophagy Enhance', 'relation': 'therapeutic_target', 'target': "Alzheimer's Disease", 'strength': 0.65}, {'source': 'HSP90-Tau Disaggregation Compl', 'relation': 'therapeutic_target', 'target': "Alzheimer's Disease", 'strength': 0.65}, {'source': 'Synaptic Vesicle Tau Capture I', 'relation': 'therapeutic_target', 'target': "Alzheimer's Disease", 'strength': 0.65}, {'source': 'Trans-Synaptic Adhesion Molecu', 'relation': 'therapeutic_target', 'target': "Alzheimer's Disease", 'strength': 0.65}, {'source': 'LRP1', 'relation': 'regulates', 'target': 'Tau Propagation', 'strength': 0.6}, {'source': 'TREM2', 'relation': 'regulates', 'target': 'Tau Propagation', 'strength': 0.6}, {'source': 'CHMP4B', 'relation': 'regulates', 'target': 'Tau Propagation', 'strength': 0.6}, {'source': 'VCP', 'relation': 'regulates', 
'target': 'Tau Propagation', 'strength': 0.6}, {'source': 'HSP90AA1', 'relation': 'regulates', 'target': 'Tau Propagation', 'strength': 0.6}, {'source': 'SNAP25', 'relation': 'regulates', 'target': 'Tau Propagation', 'strength': 0.6}, {'source': 'NLGN1', 'relation': 'regulates', 'target': 'Tau Propagation', 'strength': 0.6}, {'source': 'HSP90AA1', 'relation': 'participates_in', 'target': 'Tau protein / microtubule-asso', 'strength': 0.53}, {'source': 'HSP90AA1', 'relation': 'associated_with', 'target': "Alzheimer's Disease", 'strength': 0.53}, {'source': 'TREM2', 'relation': 'associated_with', 'target': "Alzheimer's Disease", 'strength': 0.5}, {'source': 'VCP', 'relation': 'participates_in', 'target': 'Autophagy-lysosome pathway', 'strength': 0.49}]
if edge_data:
    # An expression inside an if-block is not auto-rendered, so display it explicitly.
    display(pd.DataFrame(edge_data).head(25))
else:
    print('No KG edge data available')
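A per-relation summary is often easier to read than the raw edge list. The sketch below groups edges by `relation` and reports the count and mean strength. It uses a five-edge subset copied from `edge_data` above so that it runs on its own; the helper name `summarize_relations` is hypothetical.

```python
from collections import defaultdict

# Five edges copied from edge_data above (a subset, for brevity).
edges = [
    {'source': 'LRP1', 'relation': 'regulates', 'target': 'Tau Propagation', 'strength': 0.6},
    {'source': 'TREM2', 'relation': 'regulates', 'target': 'Tau Propagation', 'strength': 0.6},
    {'source': 'HSP90AA1', 'relation': 'associated_with', 'target': "Alzheimer's Disease", 'strength': 0.53},
    {'source': 'TREM2', 'relation': 'associated_with', 'target': "Alzheimer's Disease", 'strength': 0.5},
    {'source': 'VCP', 'relation': 'participates_in', 'target': 'Autophagy-lysosome pathway', 'strength': 0.49},
]

def summarize_relations(edge_list):
    """Return {relation: (edge count, mean strength rounded to 3 decimals)}."""
    grouped = defaultdict(list)
    for edge in edge_list:
        grouped[edge['relation']].append(edge['strength'])
    return {
        relation: (len(vals), round(sum(vals) / len(vals), 3))
        for relation, vals in grouped.items()
    }

summary = summarize_relations(edges)
print(summary)
```

Passing the full `edge_data` list to the same function would summarize all 25 edges shown in the cell above.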
12. Caveats¶
This notebook uses real Forge tool calls cached from live APIs, but:
- Enrichment is against curated gene-set libraries, not genome-wide screens
- STRING/Reactome/HPA/MyGene reflect curated knowledge
- PubMed literature is ranked by search relevance; it is not a systematic review
The cached evidence bundle is the minimum viable real-data analysis for this topic.