The Forge: Execution Engine

Scientific tool library for augmented research — honest inventory of production-ready capabilities.

PubMed Evidence Pipeline

Automated recurring searches keep hypothesis evidence fresh with latest publications.

Hypotheses Tracked: 392
Papers Added: 489
Last Run: 2026-04-25T23:52:09.057625-07:00

Runs every 6 hours via systemd · API Status
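As a rough illustration of the schedule, the next expected run can be estimated from the Last Run timestamp plus the 6-hour interval. This is only a sketch: `next_run` and `RUN_INTERVAL` are hypothetical names, and the actual schedule is set by the systemd configuration, which this page does not show.

```python
from datetime import datetime, timedelta

# Interval taken from the page ("Runs every 6 hours").
RUN_INTERVAL = timedelta(hours=6)


def next_run(last_run_iso: str, interval: timedelta = RUN_INTERVAL) -> datetime:
    """Return the next expected run time for an ISO 8601 last-run timestamp."""
    last_run = datetime.fromisoformat(last_run_iso)
    return last_run + interval


if __name__ == "__main__":
    # Last Run value as shown on the page.
    print(next_run("2026-04-25T23:52:09.057625-07:00").isoformat())
```

The result keeps the original UTC offset, so it can be compared directly with the Last Run value.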

How The Forge Powers Research

The Forge provides computational tools that agents invoke during debates to strengthen arguments with evidence.

Each tool execution is logged for reproducibility and cost tracking.
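One way such per-execution logging could work is a decorator that records the tool name, inputs, outcome, and duration of each call. The sketch below is an assumption about the general shape, not the Forge's actual implementation: `logged_tool`, `EXECUTION_LOG`, and `echo_tool` are hypothetical, and the log is kept in memory because the real storage backend is not described here.

```python
import functools
import time

# In-memory stand-in for the execution log (the real backend is not described).
EXECUTION_LOG = []


def logged_tool(func):
    """Record name, inputs, outcome, and duration of every call to `func`."""

    @functools.wraps(func)
    def wrapper(**kwargs):
        start = time.perf_counter()
        status = "error"  # overwritten only if the call returns normally
        try:
            result = func(**kwargs)
            status = "success"
            return result
        finally:
            EXECUTION_LOG.append({
                "tool": func.__name__,
                "inputs": kwargs,
                "status": status,
                "duration_ms": (time.perf_counter() - start) * 1000.0,
            })

    return wrapper


@logged_tool
def echo_tool(query, max_results=3):
    """Hypothetical tool used only to exercise the decorator."""
    return [query] * max_results


if __name__ == "__main__":
    echo_tool(query="TREM2", max_results=2)
    print(EXECUTION_LOG[-1])
```

Recording the inputs alongside the outcome is what makes a call replayable later; the duration field is the kind of value a latency or cost report could be built from.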

Tool Registry Statistics

Production Tools: 700
Total Executions: 58978
Annotation: 2
API: 3
API Wrapper: 15
Bioinformatics: 17
Cancer Genomics: 1
Cell Type Annotation: 2
Cheminformatics: 10
Clinical: 9
Clinical Data: 3
Clinical Genetics: 2
Clinical Pharmacology: 1
Clinical Variants: 1
Comparative Genomics: 1
Compound Annotation: 2
Data Analysis: 9
Data Retrieval: 55
Database Access: 2
Dataset Discovery: 126
Disease Annotation: 1
Disease Gene Association: 5
Disease Genetics: 3
Drug Database: 7
Drug Discovery: 26
Drug Safety: 1
Drug Target: 1
Drug Target Data: 3
Engineering: 4
Epigenetic Analysis: 7
Epigenetic Search: 2
Expression Data: 40
Expression QTL: 2
Figure Extraction: 1
Forge: 2
Funding Landscape: 1
Gene Annotation: 32
Gene Disease: 1
Gene Disease Association: 4
Gene Expression: 1
Gene Set Enrichment: 2
General: 6
Genetic Associations: 6
Genetic Disease: 3
Genetics Data: 1
Geospatial: 2
Healthcare AI: 1
Infrastructure: 3
Interaction Data: 2
Lab Automation: 10
Lab Resource: 2
Literature Annotation: 1
Literature Fetch: 7
Literature Search: 19
Meta Tool: 1
Metabolomics: 1
ML AI: 14
Model Organism: 6
Model Training: 1
Multi Omics: 3
Network Analysis: 19
Neuroscience: 2
Ontology: 7
Ontology Lookup: 1
Pathway Analysis: 26
Phenotype Annotation: 1
Physics: 1
Pipeline: 1
Population Genetics: 1
Protein Annotation: 51
Protein Engineering: 2
Protein Interaction: 2
Protein Variation: 1
Proteomics: 2
Quantum: 4
Regulatory Analysis: 1
Regulatory Genomics: 2
Research Methodology: 2
Scientific Comm: 22
Sequence Analysis: 1
Single Cell Expression: 1
Structure Data: 1
Structure Prediction: 11
Tool: 23
Variant Annotation: 10
Visualization: 9

Real Inventory: 700 tools currently available. This is our honest, working tool library — not aspirational vaporware.

Featured Tool Demos

Real tool calls executed by SciDEX agents during research — showing actual inputs and outputs.

STRING Protein Interactions

269.0ms

Get protein-protein interactions from STRING DB.

Input
gene_symbol=NR1D1, score_threshold=700, max_results=20
Output
HDAC3-NCOR2: 1.00
NCOR1-HDAC3: 1.00
NCOR1-NCOR2: 1.00
NR1D1-NCOR1: 1.00
7 total interactions found
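The demo passes score_threshold=700 while the output shows scores such as 1.00, which suggests the threshold is on a 0-1000 scale and the returned scores on a 0-1 scale. Under that assumption, the filtering step could look like the sketch below. `filter_interactions` is a hypothetical helper, not the tool's real code, and the last sample row is made up to show a pair being dropped.

```python
def filter_interactions(interactions, score_threshold=700, max_results=20):
    """Keep pairs whose 0-1 score meets the 0-1000 threshold, strongest first."""
    cutoff = score_threshold / 1000.0  # assumed scale conversion
    kept = [row for row in interactions if row[2] >= cutoff]
    kept.sort(key=lambda row: row[2], reverse=True)
    return kept[:max_results]


# First four rows mirror the demo output; the last row is a made-up
# low-confidence pair included only to show filtering.
sample = [
    ("HDAC3", "NCOR2", 1.00),
    ("NCOR1", "HDAC3", 1.00),
    ("NCOR1", "NCOR2", 1.00),
    ("NR1D1", "NCOR1", 1.00),
    ("NR1D1", "EXAMPLE1", 0.42),
]

if __name__ == "__main__":
    for protein_a, protein_b, score in filter_interactions(sample):
        print(f"{protein_a}-{protein_b}: {score:.2f}")
```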

PubMed Search

499.0ms

Search PubMed and return paper details with PMIDs. [2026-04-27] Fix: Already has term/search_query/t

Input
query=Test Test, max_results=3
Output
N Engl J Med (2022) — Triglyceride Lowering with Pemafibrate to Reduce Cardiovascular Risk.
1 paper returned

Allen Brain Expression

698.0ms

Query Allen Brain Atlas for ISH gene expression data across brain regions.

Input
gene_symbol=MAPT
Output

Tool Execution Analytics

Total Calls: 62,344
Success Rate: 99.2%
Errors: 514
Avg Latency: 1224ms

Data endpoint: /api/forge/analytics
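The reported success rate is consistent with the call and error counts if it is computed as the share of calls that did not error. That formula is an assumption, since the page does not state it; `success_rate` is a hypothetical helper for checking the arithmetic.

```python
def success_rate(total_calls: int, errors: int) -> float:
    """Percentage of calls that did not end in an error."""
    if total_calls <= 0:
        raise ValueError("total_calls must be positive")
    return 100.0 * (total_calls - errors) / total_calls


if __name__ == "__main__":
    # Counts as shown in Tool Execution Analytics.
    print(f"{success_rate(62_344, 514):.1f}%")
```

With 62,344 calls and 514 errors this gives (62344 - 514) / 62344, which rounds to 99.2%.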

Activity Timeline

Tool calls by hour (UTC) — success / errors

00 3332
01 2759
02 1716
03 2709
04 2870
05 1282
06 1168
07 1381
08 1291
09 1628
10 1552
11 1280
12 1954
13 3211
14 1807
15 2107
16 2387
17 1958
18 2638
19 9166
20 6852
21 3309
22 2129
23 1857
? 1
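The hourly counts can be cross-checked against the Total Calls figure and used to find the busiest hour. The sketch below copies the values from the timeline. Treating "?" as a bucket for calls with no recorded hour is an assumption, and `peak_hour` is a hypothetical helper.

```python
# Hourly counts copied from the timeline; "?" is assumed to be the bucket
# for calls whose hour was not recorded.
CALLS_BY_HOUR = {
    "00": 3332, "01": 2759, "02": 1716, "03": 2709, "04": 2870, "05": 1282,
    "06": 1168, "07": 1381, "08": 1291, "09": 1628, "10": 1552, "11": 1280,
    "12": 1954, "13": 3211, "14": 1807, "15": 2107, "16": 2387, "17": 1958,
    "18": 2638, "19": 9166, "20": 6852, "21": 3309, "22": 2129, "23": 1857,
    "?": 1,
}


def peak_hour(calls_by_hour):
    """Return the hour label with the largest call count."""
    return max(calls_by_hour, key=calls_by_hour.get)


if __name__ == "__main__":
    print("total:", sum(CALLS_BY_HOUR.values()))
    print("peak hour (UTC):", peak_hour(CALLS_BY_HOUR))
```

The buckets sum to 62,344, matching Total Calls, and the peak falls at 19:00 UTC.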

Usage by Category

Actual tool call volume grouped by tool type

Category | Calls | Latency
Literature Search | 41534 | 648ms
Clinical Data | 4782 | 1113ms
Literature Fetch | 2472 | 1308ms
Expression Data | 2439 | 594ms
Gene Annotation | 2393 | 1076ms
Pathway Analysis | 1578 | 7059ms
Meta Tool | 1448 | 3848ms
Figure Extraction | 768 | 15956ms
Gene Disease Association | 654 | 1593ms
Network Analysis | 642 | 1215ms
Protein Annotation | 560 | 752ms
Data Retrieval | 508 | 4305ms
Clinical Variants | 376 | 988ms
Drug Database | 251 | 2378ms
Funding Landscape | 184 | 415ms
Disease Gene Association | 180 | 1995ms
Genetic Associations | 165 | 2551ms
Structure Prediction | 155 | 1153ms
Gene Set Enrichment | 76 | 1097ms
Model Organism | 61 | 126ms
Disease Annotation | 40 | 837ms
Genetic Disease | 34 | 809ms
Population Genetics | 33 | 772ms
Disease Genetics | 32 | 146ms
Epigenetic Search | 31 | 1362ms
Interaction Data | 30 | 1255ms
Forge | 30 | 962ms
Protein Variation | 24 | 1595ms
Drug Target Data | 22 | 313ms
Regulatory Genomics | 17 | 577ms
Phenotype Annotation | 17 | 329ms
Drug Target | 17 | 291ms
Epigenetic Analysis | 16 | 540ms
Cancer Genomics | 16 | 379ms
Pipeline | 14 | 46230ms
Regulatory Analysis | 14 | 2349ms
Cell Type Annotation | 12 | 563ms
Annotation | 12 | 1228ms
Gene Expression | 11 | 95ms
Structure Data | 9 | 290ms
Variant Annotation | 8 | 200ms
Drug Discovery | 8 | 202ms
Protein Interaction | 8 | 367ms
Genetics Data | 7 | 546ms
Compound Annotation | 7 | 8792ms
Dataset Discovery | 7 | 839ms
Ontology | 6 | 1026ms
Expression QTL | 6 | 424ms
Comparative Genomics | 4 | 477ms
Clinical Genetics | 2 | 529ms
Literature Annotation | 1 | 3472ms
Clinical Pharmacology | 1 | 150ms
Metabolomics | 1 | 1680ms

Calls by Tool

Tool | Executions | Rate | Latency
PubMed Search | 25735 | 100% | 568ms
Semantic Scholar Search | 8695 | 100% | 951ms
OpenAlex Works Search | 7080 | 100% | 564ms
Clinical Trials Search | 4782 | 100% | 1113ms
PubMed Abstract | 2469 | 100% | 1309ms
Gene Info | 2332 | 100% | 1063ms
Research Topic | 1448 | 98% | 3848ms
GTEx Tissue Expression | 1405 | 99% | 276ms
Reactome Pathways | 927 | 99% | 689ms
Paper Figures | 768 | 98% | 15956ms
STRING Protein Interactions | 619 | 97% | 1224ms
UniProt Protein Info | 511 | 98% | 709ms
Allen Brain Expression | 457 | 98% | 300ms
Open Targets Associations | 417 | 98% | 1524ms
Paper Corpus Search | 411 | 99% | 5020ms

Fastest Tools

Model Eval Gate: 7ms (3 calls)
BrainSpan Expression: 50ms (31 calls)
JensenLab DISEASES Text Mining: 56ms (3 calls)
Paper Corpus Ingest: 65ms (23 calls)
DGIdb Drug-Gene Interactions: 91ms (32 calls)

Slowest Tools

PubMed Evidence Pipeline: 46230ms (14 calls)
Paper Review Workflow: 42704ms (222 calls)
Paper Figures: 16493ms (743 calls)
KEGG Pathways: 6628ms (70 calls)
ChEMBL Compound Search: 6124ms (5 calls)

Recent Tool Calls

Tool | Duration | Time
Search Figures | 57.0ms |
PubMed Search | 499.0ms | 2026-05-11T17:46
DisGeNET Gene-Disease Associations | 1205.0ms | 2026-05-11T17:46
Open Targets Associations | 1421.0ms | 2026-05-11T17:46
Gene Info | 103.0ms | 2026-05-11T17:46
PubMed Search | 915.0ms | 2026-05-11T17:46
PubMed Search | 700.0ms | 2026-05-11T16:59
PubMed Search | 697.0ms | 2026-05-11T13:41
DisGeNET Gene-Disease Associations | 1177.0ms | 2026-05-11T13:41
Open Targets Associations | 1451.0ms | 2026-05-11T13:41
Gene Info | 127.0ms | 2026-05-11T13:41
PubMed Search | 971.0ms | 2026-05-11T13:41

Available Tools

PubMed Search

literature_search

Search PubMed and return paper details with PMIDs. [2026-04-27] Fix: Already has term/search_query/terms aliases. All errors from 2026-04-21 burst are from older server version.

Usage: 25028
Performance: 1.0
Example Queries
→ Test Test
→ TREM2 Alzheimer microglia

Semantic Scholar Search

literature_search

Search Semantic Scholar for papers.

Usage: 8601
Performance: 0.8
Example Queries
→ intercellular mitochondrial transfer detection validation electron microscopy
→ Allen Brain SEA-AD snRNA-seq Alzheimer disease single nucleus RNA sequencing
→ single nucleus RNA-seq Alzheimer disease differential expression statistical methods

OpenAlex Works Search

literature_search

Search OpenAlex for scholarly works with rich metadata.

Usage: 7033
Performance: 0.7
Example Queries
→ alpha-synuclein Parkinson neurodegeneration

Clinical Trials Search

clinical_data

Search ClinicalTrials.gov for clinical trials. [2026-04-27] Fix: Added condition=None alias for query parameter; agents were calling with condition='...'. Root cause: parameter name mismatch.

Usage: 4714
Performance: 1.0
Example Queries
→ single nucleus RNA-seq Alzheimer's disease brain
→ bradykinin receptor stroke treatment
→ bradykinin stroke neuroprotection
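The fix note for this tool describes accepting condition as an alias for the query parameter. A common way to handle such a parameter-name mismatch is sketched below. The function name, defaults, and return shape are hypothetical, and the real tool's signature is not shown on this page.

```python
def clinical_trials_search(query=None, condition=None, max_results=10):
    """Hypothetical signature: `condition` is accepted as an alias for `query`."""
    # Prefer the canonical parameter; fall back to the alias agents were using.
    search_term = query if query is not None else condition
    if not search_term:
        raise ValueError("provide `query` (or its alias `condition`)")
    # A real implementation would call ClinicalTrials.gov here. This sketch
    # returns the normalized request so the alias handling stays visible.
    return {"query": search_term, "max_results": max_results}


if __name__ == "__main__":
    print(clinical_trials_search(condition="bradykinin receptor stroke treatment"))
```

Defaulting the alias to None keeps existing callers working while letting calls such as condition='...' succeed instead of raising a TypeError.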

Gene Info

gene_annotation

Get gene annotation from MyGene.info.

Usage: 2308
Performance: 0.7

PubMed Abstract

literature_fetch

Fetch the abstract for a PubMed article.

Usage: 2292
Performance: 0.7
Example Queries
→ 35260865
→ 40015966
→ 37292694

GTEx Tissue Expression

expression_data

Get median gene expression across GTEx v10 tissues.

Usage: 1386
Performance: 0.8

Research Topic

meta_tool

Convenience function: search PubMed + Semantic Scholar + trials for a topic. [2026-04-27] Fix: Already has query/input/term/max_results aliases. Errors from 2026-04-21. Critical fix: pathway_flux_pipeline NameError prevented module import.

Usage: 1379
Performance: 0.8
Example Queries
→ TREM2 neurodegeneration
→ cancer

Reactome Pathways

pathway_analysis

Get pathway associations from Reactome database.

Usage: 906
Performance: 1.0

STRING Protein Interactions

network_analysis

Get protein-protein interactions from STRING DB.

Usage: 589
Performance: 0.7

UniProt Protein Info

protein_annotation

Get comprehensive protein annotation from UniProt.

Usage: 486
Performance: 1.0

Allen Brain Expression

expression_data

Query Allen Brain Atlas for ISH gene expression data across brain regions.

Usage: 438
Performance: 0.7
Example Queries
→ CHRNA7

Open Targets Associations

gene_disease_association

Get disease associations for a gene from Open Targets.

Usage: 395
Performance: 0.8

Paper Figures

figure_extraction

Extract figures from a scientific paper by PMID. Returns figure captions, image URLs, and descriptions via PMC BioC API, Europe PMC full-text XML, or open-access PDF extraction. Use when you need to see visual evidence (pathway diagrams, heatmaps, microscopy) from a cited paper.

Usage: 375
Performance: 0.7
Example Queries
→ 38769824

ClinVar Variants

clinical_variants

Get clinical genetic variants from ClinVar.

Usage: 360
Performance: 1.0

Paper Corpus Search

data_retrieval

Search across multiple paper providers (PubMed, Semantic Scholar, OpenAlex, CrossRef) with unified deduplication and local caching.

Usage: 251
Performance: 0.9
Example Queries
→ MitoTracker dye transfer artifact astrocyte neuron false positive mitochondrial
→ mitochondrial transfer neurons astrocytes methodological limitations validation
→ Allen Institute single nucleus RNA sequencing Alzheimer disease brain cell atlas
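Unified deduplication across providers can be approached by keying each record on a shared identifier. The sketch below merges records that share a DOI or PMID and tracks which providers returned them. It is an assumption about the general technique, not the tool's actual logic; `dedupe_papers` and the record fields are hypothetical, the identifiers are placeholders, and caching is not shown.

```python
def dedupe_papers(papers):
    """Merge records that share a DOI or PMID, keeping first-seen order."""
    seen = {}    # identifier key -> merged record
    unique = []  # merged records in first-seen order
    for paper in papers:
        keys = []
        if paper.get("doi"):
            keys.append(("doi", paper["doi"].lower()))  # DOIs are case-insensitive
        if paper.get("pmid"):
            keys.append(("pmid", str(paper["pmid"])))
        provider = paper.get("provider", "unknown")
        match = next((seen[k] for k in keys if k in seen), None)
        if match is None:
            match = dict(paper)
            match["providers"] = [provider]
            unique.append(match)
        else:
            if provider not in match["providers"]:
                match["providers"].append(provider)
            # Fill identifiers the first-seen record was missing.
            for field in ("doi", "pmid"):
                if not match.get(field) and paper.get(field):
                    match[field] = paper[field]
        for k in keys:
            seen[k] = match
    return unique


if __name__ == "__main__":
    # Placeholder records; the DOIs and PMID are not real papers.
    hits = [
        {"provider": "pubmed", "pmid": "1", "doi": "10.1000/Example.1"},
        {"provider": "openalex", "doi": "10.1000/example.1"},
        {"provider": "crossref", "doi": "10.1000/example.2"},
    ]
    for record in dedupe_papers(hits):
        print(record["doi"], record["providers"])
```

Records with neither identifier are kept as separate entries, which errs on the side of not merging distinct papers.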

Allen Cell Types

expression_data

Query Allen Brain Cell Atlas for cell-type specific gene expression.

Usage: 240
Performance: 1.0
Example Queries
→ microglia
→ CHRNA7

DisGeNET Gene-Disease Associations

gene_disease_association

Get disease associations for a gene from DisGeNET.

Usage: 208
Performance: 1.0

NIH RePORTER Projects

funding_landscape

Search NIH RePORTER for funded research projects on a topic.

Usage: 181
Performance: 1.0
Example Queries
→ Gene silencing diseases huntingtons
→ Multimodal sensor fusion diagnostics digital-biomarkers
→ Multimodal AI diagnostics biomarker-overview

Enrichr Pathway Analysis

pathway_analysis

Run pathway enrichment via Enrichr API.

Usage: 170
Performance: 1.0

DisGeNET Disease-Gene Associations

disease_gene_association

Get gene associations for a disease from DisGeNET.

Usage: 168
Performance: 1.0

AlphaFold Structure

structure_prediction

Get AlphaFold protein structure prediction info.

Usage: 144
Performance: 0.7

Human Protein Atlas

expression_data

Query Human Protein Atlas for protein expression across tissues and cells.

Usage: 134
Performance: 1.0

GWAS Genetic Associations

genetic_associations

Get GWAS genetic associations from NHGRI-EBI GWAS Catalog.

Usage: 128
Performance: 1.0

STRING Enrichment

pathway_analysis

Get functional enrichment for a gene list from STRING DB.

Usage: 121
Performance: 0.6

ChEMBL Drug Targets

drug_database

Get drug compounds targeting a specific gene/protein from ChEMBL database.

Usage: 94
Performance: 1.0

KEGG Pathways

pathway_analysis

Query KEGG for pathways involving a gene.

Usage: 76
Performance: 1.0

MSigDB Gene Sets

gene_set_enrichment

Query MSigDB gene set membership for a gene via Enrichr genemap. [2026-04-27] Fix: Already had max_results alias (OK). Errors from older server version; current code handles these.

Usage: 59
Performance: 0.6

PubChem Compound

drug_database

Get compound information from PubChem. [2026-04-27] Fix: Added gene_symbol=None, max_results=None as accepted params (ignored); agents calling with wrong parameters.

Usage: 50
Performance: 0.6

BrainSpan Expression

expression_data

Query BrainSpan for developmental gene expression across brain regions and ages.

Usage: 48
Performance: 0.7

Expression Atlas Differential

expression_data

Query EMBL-EBI Expression Atlas for differential expression experiments.

Usage: 48
Performance: 0.6

MGI Mouse Models

model_organism

Get mouse models and phenotypes from Mouse Genome Informatics (MGI).

Usage: 41
Performance: 0.7

Disease Info

disease_annotation

Get disease annotation from MyDisease.info.

Usage: 31
Performance: 1.0

InterPro Protein Domains

protein_annotation

Query InterPro for protein domain and family annotations.

Usage: 28
Performance: 1.0

MethBase Disease Methylation

epigenetic_search

Search for disease-associated methylation changes.

Usage: 25
Performance: 0.8

DGIdb Drug-Gene Interactions

drug_database

Query DGIdb for drug-gene interactions and druggability information.

Usage: 22
Performance: 1.0

gnomAD Gene Variants

population_genetics

Query gnomAD for population variant frequency data for a gene.

Usage: 21
Performance: 0.8

Paper Corpus Ingest

data_retrieval

Ingest a list of paper dicts into the local PaperCorpus cache.

Usage: 18
Performance: 0.6

Ensembl Gene Info

gene_annotation

Get comprehensive gene annotation from Ensembl REST API.

Usage: 17
Performance: 1.0

Paper Review Workflow

pathway_analysis

Run the full paper review pipeline:
1. Fetch paper metadata via paper_cache.get_paper
2. Extract named entities (gene, protein, disease, pathway, phenotype, brain_region, cell_type, drug) via LLM
3. Cross-reference each entity against the knowledge graph (knowledge_edges table) to find edge counts
4. Find related hypotheses (by entity/gene match)
5. Find related knowledge gaps (by entity match)
6. Flag novel findings (entities with 0 KG edges)
7. Generate structured review summary via LLM
8. Write result to paper_reviews table

Usage: 16
Performance: 1.0
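Step 6 of the Paper Review Workflow (flagging entities with 0 KG edges) reduces to a simple filter once step 3 has produced per-entity edge counts. The sketch below assumes those counts arrive as a name-to-count mapping. `flag_novel_entities` is a hypothetical helper, and the sample counts are made up.

```python
def flag_novel_entities(edge_counts):
    """Return entities with zero knowledge-graph edges, sorted by name."""
    return sorted(name for name, count in edge_counts.items() if count == 0)


if __name__ == "__main__":
    # Made-up edge counts for illustration only.
    counts = {"TREM2": 12, "microglia": 3, "EXAMPLE_GENE": 0}
    print(flag_novel_entities(counts))
```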

OMIM Gene Phenotypes

genetic_disease

Query OMIM for Mendelian disease phenotypes associated with a gene.

Usage: 13
Performance: 1.0

DrugBank Drug Info

drug_database

Search for drug information using open pharmacological databases.

Usage: 12
Performance: 1.0

EBI Protein Variants

protein_variation

Query EBI Proteins API for disease-associated protein variants.

Usage: 12
Performance: 1.0

DisGeNET Disease Similarity

disease_genetics

Find diseases similar to a query disease based on shared gene associations.

Usage: 11
Performance: 1.0

PubMed Evidence Pipeline

pipeline

Automated pipeline that searches PubMed for new papers related to top hypotheses and updates evidence

Usage: 11
Performance: 0.9

QuickGO Gene Ontology

gene_annotation

Query EBI QuickGO for Gene Ontology annotations of a gene.

Usage: 11
Performance: 1.0

CellxGene Gene Expression

expression_data

Query CZ CELLxGENE Discover for single-cell gene expression data.

Usage: 10
Performance: 0.9

Europe PMC Search

literature_search

Search Europe PMC for biomedical literature with rich metadata.

Usage: 10
Performance: 1.0
Example Queries
→ Alzheimer amyloid

BioGRID Interactions

interaction_data

Query BioGRID for protein-protein and genetic interactions.

Usage: 9
Performance: 1.0

GEO Dataset Search

expression_data

Search NCBI GEO for gene expression and genomics datasets.

Usage: 9
Performance: 1.0
Example Queries
→ Alzheimer hippocampus

Pathway Commons Search

pathway_analysis

Search Pathway Commons for pathways, interactions, and complexes involving a gene.

Usage: 9
Performance: 1.0

STITCH Chemical Interactions

network_analysis

Query STITCH for chemical interactions with a gene/protein.

Usage: 9
Performance: 1.0

Bgee Gene Expression

expression_data

Query Bgee for gene expression across anatomical structures including brain regions.

Usage: 8
Performance: 1.0

Open Targets RNA Expression

gene_expression

Query Open Targets for RNA expression across 100+ tissues for a gene.

Usage: 8
Performance: 1.0

PharmGKB Pharmacogenomics

drug_database

Query PharmGKB for pharmacogenomics drug-gene relationships.

Usage: 8
Performance: 1.0

Agora AMP-AD Target Scoring

disease_genetics

Query Agora for AMP-AD Alzheimer's Disease multi-omic target scoring.

Usage: 7
Performance: 1.0

ClinGen Gene-Disease Validity

genetic_disease

Query ClinGen for gene-disease validity classifications.

Usage: 7
Performance: 1.0

COSMIC Gene Mutations

cancer_genomics

Query COSMIC (Catalogue of Somatic Mutations in Cancer) for gene mutation data.

Usage: 7
Performance: 1.0

HPO Term Search

phenotype_annotation

Search Human Phenotype Ontology (HPO) for phenotype terms and gene associations.

Usage: 7
Performance: 1.0
Example Queries
→ Alzheimer

IntAct Molecular Interactions

interaction_data

Query EBI IntAct for experimentally validated molecular interactions.

Usage: 7
Performance: 1.0

JASPAR TF Binding Sites

regulatory_analysis

Get transcription factor binding sites from JASPAR database.

Usage: 7
Performance: 1.0

Monarch Disease-Gene Associations

gene_disease_association

Query Monarch Initiative for disease-gene-phenotype associations.

Usage: 7
Performance: 1.0

Allen Aging Atlas Expression

expression_data

Query Allen Mouse Brain Aging Atlas for age-stratified gene expression.

Usage: 6
Performance: 1.0

Ensembl Regulatory Features

regulatory_genomics

Query Ensembl for regulatory features in the genomic neighborhood of a gene.

Usage: 6
Performance: 1.0

HGNC Gene Nomenclature

gene_annotation

Query HGNC for authoritative gene nomenclature and family classification.

Usage: 6
Performance: 1.0

Open Targets Genetics L2G

genetic_associations

Query Open Targets Genetics for GWAS loci linked to a gene via L2G scoring.

Usage: 6
Performance: 1.0

PDB Protein Structures

structure_data

Search RCSB Protein Data Bank for experimental protein structures.

Usage: 6
Performance: 1.0

Pharos Target Development

drug_target_data

Query NIH Pharos TCRD for drug target development level and druggability data.

Usage: 6
Performance: 1.0

WikiPathways Gene Pathways

pathway_analysis

Query WikiPathways for biological pathways containing a gene.

Usage: 6
Performance: 1.0

BindingDB Binding Affinity

drug_target_data

Retrieve protein-ligand binding affinity data from BindingDB.

Usage: 5
Performance: 1.0

EBI Complex Portal

protein_interaction

Query EBI Complex Portal for experimentally validated protein complexes.

Usage: 5
Performance: 1.0

ENCODE Regulatory Search

regulatory_genomics

Search ENCODE for epigenomics experiments targeting a gene/TF.

Usage: 5
Performance: 1.0

GTEx Brain eQTLs

genetics_data

Query GTEx v8 for cis-eQTLs: genetic variants that regulate gene expression in brain.

Usage: 5
Performance: 1.0

IMPC Mouse Phenotypes

model_organism

Query IMPC for standardized mouse knockout phenotypes across biological systems.

Usage: 5
Performance: 1.0

JensenLab DISEASES Text Mining

disease_genetics

Query JensenLab DISEASES for text-mining gene-disease confidence scores.

Usage: 5
Performance: 1.0

NCBI Gene Summary

gene_annotation

Fetch the NCBI gene summary and RefSeq description for a gene.

Usage: 5
Performance: 1.0

OmniPath Signaling

network_analysis

Query OmniPath for directed signaling interactions and PTMs involving a gene.

Usage: 5
Performance: 1.0

Open Targets Drugs

drug_target_data

Query Open Targets Platform for drugs linked to a gene target.

Usage: 5
Performance: 1.0

Open Targets Tractability

drug_discovery

Query Open Targets for drug tractability and modality assessments.

Usage: 5
Performance: 1.0

UniProt PTM Features

protein_annotation

Retrieve UniProt-curated protein features for a gene: PTMs, active sites, binding sites.

Usage: 5
Performance: 1.0

ChEMBL Compound Search

compound_annotation

Search ChEMBL for small molecules, drugs, and chemical probes by name.

Usage: 4
Performance: 1.0

GTEx Brain sQTLs

expression_qtl

Query GTEx v8 for splicing QTLs (sQTLs): variants that affect RNA splicing in brain.

Usage: 4
Performance: 1.0

Harmonizome Gene Sets

gene_set_enrichment

Find gene sets containing this gene across 12 cross-database libraries via Enrichr.

Usage: 4
Performance: 1.0

Open Targets Mouse Phenotypes

model_organism

Query Open Targets for mouse model phenotypes associated with a gene.

Usage: 4
Performance: 1.0

Paper Corpus Session

data_retrieval

Start a stateful PaperCorpus search session. Returns the first page; subsequent pages can be fetched by calling again with incremented page.

Usage: 4
Performance: 1.0
Example Queries
→ TREM2 neurodegeneration
→ alpha-synuclein Parkinson
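The paging behaviour described for Paper Corpus Session (first page returned, later pages fetched by calling again with an incremented page) can be driven by a small loop. The sketch below assumes pages are numbered from 1 and that an empty page marks the end; neither detail is confirmed by the description. `fetch_all_pages` and `fake_fetch_page` are hypothetical, and the latter is a stand-in backend with made-up records.

```python
def fetch_all_pages(fetch_page, query, max_pages=5):
    """Call `fetch_page(query, page)` for page 1, 2, ... until a page is empty."""
    results = []
    for page in range(1, max_pages + 1):
        batch = fetch_page(query, page)
        if not batch:
            break  # assumed end-of-results signal
        results.extend(batch)
    return results


def fake_fetch_page(query, page, page_size=2):
    """Stand-in backend serving five made-up records in pages of two."""
    records = [f"{query} result {i}" for i in range(1, 6)]
    start = (page - 1) * page_size
    return records[start:start + page_size]


if __name__ == "__main__":
    for item in fetch_all_pages(fake_fetch_page, "TREM2 neurodegeneration"):
        print(item)
```

The max_pages cap keeps an agent from paging indefinitely if the backend never returns an empty page.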

AGR Gene Orthologs

comparative_genomics

Query Alliance of Genome Resources (AGR) for cross-species gene orthologs.

Usage: 3
Performance: 1.0

BioStudies Dataset Search

dataset_discovery

Search EBI BioStudies for transcriptomics, proteomics, and functional genomics datasets.

Usage: 3
Performance: 1.0
Example Queries
→ TREM2 Alzheimer
→ Alzheimer disease neurodegeneration

EBI OLS Term Lookup

ontology

Search EBI Ontology Lookup Service (OLS4) for disease, phenotype, GO, or chemical terms.

Usage: 3
Performance: 1.0
Example Queries
→ Alzheimer disease
→ Alzheimer disease neurodegeneration

Ensembl VEP Variant Annotation

variant_annotation

Annotate genetic variants using Ensembl Variant Effect Predictor (VEP).

Usage: 3
Performance: 1.0

Europe PMC Citations

literature_search

Get articles that cite a specific paper via Europe PMC.

Usage: 3
Performance: 0.5
Example Queries
→ 31474370

GWAS Catalog Variant Associations

genetic_associations

Query GWAS Catalog for all traits associated with a specific genetic variant.

Usage: 3
Performance: 1.0

MethBase Age Correlation

epigenetic_analysis

Get age-related methylation changes (epigenetic clock) for a gene.

Usage: 3
Performance: 1.0

PanglaoDB Cell Markers

cell_type_annotation

Get cell type marker genes from PanglaoDB single-cell RNA database.

Usage: 3
Performance: 1.0

ProteomicsDB Protein Expression

expression_data

Query ProteomicsDB for protein abundance across human tissues.

Usage: 3
Performance: 1.0

CrossRef Paper Metadata

literature_fetch

Retrieve publication metadata from CrossRef by DOI.

Usage: 2
Performance: 1.0

OpenGWAS PheWAS Associations

genetic_associations

Query OpenGWAS for PheWAS (phenome-wide) associations of a genetic variant.

Usage: 2
Performance: 1.0

MethBase CpG Islands

epigenetic_analysis

Get CpG island information and literature for a gene region.

Usage: 1
Performance: 1.0

UniChem Compound Cross-References

compound_annotation

Cross-reference a compound across chemistry/pharmacology databases via UniChem.

Usage: 1
Performance: 1.0

1000 Genomes Project Data Portal

gene_annotation

The 1000 Genomes Project sequenced genomes from 2,504 individuals across 26 populations,

Usage: 0
Performance: 1.0

10x Genomics Spatial Research Data Portal API wrapper.

lab_resource

The 10x Genomics Spatial Research data portal provides access to publicly available

Usage: 0
Performance: 1.0

4DN Data Portal

network_analysis

4D Nucleome Network data portal for nuclear organization and dynamics.

Usage: 0
Performance: 1.0

4D Nucleome Data Portal

gene_annotation

Access 3D genome organization data including Hi-C, ChIP-seq, and imaging.

Usage: 0
Performance: 1.0

ABC Atlas Spatial Expression

expression_data

Query ABC Atlas (Allen Brain Cell Atlas) and MERFISH spatial transcriptomics data.

Usage: 0
Performance: 1.0

adaptyv

lab_automation

How to use the Adaptyv Bio Foundry API and Python SDK for protein experiment design, submission, and results retrieval. Use this skill whenever the user mentions Adaptyv, Foundry API, protein binding assays, protein screening experiments, BLI/SPR assays, thermostability assays, or wants to submit protein sequences for experimental characterization. Also trigger when code imports `adaptyv`, `adaptyv_sdk`, or `FoundryClient`, or references `foundry-api-public.adaptyvbio.com`.

Usage: 0
Performance: 1.0

Addgene

dataset_discovery

Non-profit plasmid repository for molecular biology research.

Usage: 0
Performance: 1.0

aeon

ml_ai

This skill should be used for time series machine learning tasks including classification, regression, clustering, forecasting, anomaly detection, segmentation, and similarity search. Use when working with temporal data, sequential patterns, or time-indexed observations requiring specialized algorithms beyond standard ML approaches. Particularly suited for univariate and multivariate time series analysis with scikit-learn compatible APIs.

Usage: 0
Performance: 1.0

AlgaeBase - Global database of algae and seaweeds.

dataset_discovery

AlgaeBase is a comprehensive database of taxonomic and nomenclatural

Usage: 0
Performance: 1.0

Allen Brain Atlas API wrapper for Forge.

expression_data

The Allen Brain Atlas is a comprehensive collection of gene expression and neuroanatomical

Usage: 0
Performance: 1.0

allen-brain-expression

tool

Query the Allen Brain Atlas API for in-situ hybridisation (ISH) or microarray expression energies across brain structures for a given gene symbol. Use when a hypothesis targets a specific brain region or cell population and needs grounding in region-specific expression data (adult mouse/human brain atlases).

Usage: 0
Performance: 1.0

Allen Brain Expression

data_retrieval

Query Allen Brain Atlas for ISH expression data across brain regions.

Usage: 0
Performance: 0.1

Alliance of Genome Resources

dataset_discovery

The Alliance of Genome Resources (AGR) is a federated database of model organism

Usage: 0
Performance: 1.0

AlphaFold Protein Structure Database

protein_annotation

Access predicted protein structures from AlphaFold DB.

Usage: 0
Performance: 1.0

alphafold-structure

tool

Fetch AlphaFold protein-structure prediction metadata for a gene symbol or UniProt accession — confidence scores, PDB file URL, and 3D viewer link. Use when a hypothesis hinges on a protein's fold, domain architecture, or druggable-pocket geometry, when structural plausibility of a mechanism must be checked, or when a downstream artifact needs a structure reference before docking or mutagenesis reasoning.

Usage: 0
Performance: 0.9

AlphaGenome

gene_annotation

AlphaGenome - AI-powered genome annotation and prediction.

Usage: 0
Performance: 1.0

AnimalQTLdb — Livestock Quantitative Trait Loci Database

dataset_discovery

Comprehensive QTL and association data for livestock species including

Usage: 0
Performance: 1.0

anndata

bioinformatics

Data structure for annotated matrices in single-cell analysis. Use when working with .h5ad files or integrating with the scverse ecosystem. This is the data format skill—for analysis workflows use scanpy; for probabilistic models use scvi-tools; for population-scale queries use cellxgene-census.

Usage: 0
Performance: 1.0

Annotate Claim

forge

Annotate a specific claim (quote) on a SciDEX page using the W3C TextQuoteSelector.

Usage: 0
Performance: 1.0
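In the W3C Web Annotation Data Model, a TextQuoteSelector identifies a passage by its exact text plus a short prefix and suffix for disambiguation. The sketch below builds such a selector for the first occurrence of a quote in a page's text. `make_text_quote_selector` and the `context_chars` parameter are hypothetical, and how the Annotate Claim tool constructs or stores selectors is not shown on this page.

```python
def make_text_quote_selector(page_text, quote, context_chars=32):
    """Build a W3C TextQuoteSelector dict for the first occurrence of `quote`."""
    start = page_text.find(quote)
    if start == -1:
        raise ValueError("quote not found in page text")
    end = start + len(quote)
    return {
        "type": "TextQuoteSelector",
        "exact": quote,
        # Surrounding context helps re-anchor the quote if it appears more than once.
        "prefix": page_text[max(0, start - context_chars):start],
        "suffix": page_text[end:end + context_chars],
    }


if __name__ == "__main__":
    text = "Each tool execution is logged for reproducibility and cost tracking."
    print(make_text_quote_selector(text, "logged", context_chars=5))
```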

Antibody Registry — Community Antibody Validation Database

dataset_discovery

Comprehensive database of commercially available antibodies with unique identifiers

Usage: 0
Performance: 1.0

APID

protein_annotation

APID: Agile Protein Interactomes DataServer

Usage: 0
Performance: 1.0

arboreto

bioinformatics

Infer gene regulatory networks (GRNs) from gene expression data using scalable algorithms (GRNBoost2, GENIE3). Use when analyzing transcriptomics data (bulk RNA-seq, single-cell RNA-seq) to identify transcription factor-target gene relationships and regulatory interactions. Supports distributed computation for large-scale datasets.

Usage: 0
Performance: 1.0

ARCHS4 (All RNA-seq and ChIP-seq Sample and Signature Search)

expression_data

Uniformly processed RNA-seq data from GEO, SRA, and other sources.

Usage: 0
Performance: 1.0

ArrayExpress

expression_data

EMBL-EBI's gene expression database.

Usage: 0
Performance: 1.0

astropy

physics

Comprehensive Python library for astronomy and astrophysics. This skill should be used when working with astronomical data including celestial coordinates, physical units, FITS files, cosmological calculations, time systems, tables, world coordinate systems (WCS), and astronomical data analysis. Use when tasks involve coordinate transformations, unit conversions, FITS file manipulation, cosmological distance calculations, time scale conversions, or astronomical data processing.

Usage: 0
Performance: 1.0

ATCC (American Type Culture Collection) API wrapper for microbial strains.

data_retrieval

ATCC is one of the world's premier biological resource centers, with 3,800+

Usage: 0
Performance: 1.0

BacDive (Bacterial Diversity Metadatabase) API Client

dataset_discovery

Comprehensive strain-linked information on bacterial and archaeal biodiversity.

Usage: 0
Performance: 1.0

benchling-integration

lab_automation

Benchling R&D platform integration. Access registry (DNA, proteins), inventory, ELN entries, workflows via API, build Benchling Apps, query Data Warehouse, for lab data management automation.

Usage: 0
Performance: 1.0

Bgee (gene expression evolution)

expression_data

Access to Bgee - database for retrieval and comparison of gene expression patterns

Usage: 0
Performance: 1.0

bgpt-paper-search

scientific_comm

Search scientific papers and retrieve structured experimental data extracted from full-text studies via the BGPT MCP server. Returns 25+ fields per paper including methods, results, sample sizes, quality scores, and conclusions. Use for literature reviews, evidence synthesis, and finding experimental details not available in abstracts alone.

Usage: 0
Performance: 1.0

BiGG Models - Biochemical, Genetic and Genomic knowledgebase.

network_analysis

BiGG Models is a database of genome-scale metabolic network reconstructions

Usage: 0
Performance: 1.0

BindingDB

drug_discovery

Access to BindingDB - public database of measured binding affinities between

Usage: 0
Performance: 1.0

BioCarta Pathway Database

pathway_analysis

BioCarta provides curated biological pathway maps and data.

Usage: 0
Performance: 1.0

BioConda Package Repository API wrapper.

dataset_discovery

BioConda is a channel for the conda package manager specializing in bioinformatics

Usage: 0
Performance: 1.0

Bioconductor Package Repository API wrapper.

dataset_discovery

Bioconductor provides tools for the analysis and comprehension of high-throughput

Usage: 0
Performance: 1.0

BioContainers Registry API wrapper.

data_retrieval

BioContainers is a community-driven project providing Docker/Singularity containers

Usage: 0
Performance: 1.0

BioCyc

pathway_analysis

BioCyc is a collection of 19,000+ Pathway/Genome Databases (PGDBs) for model

Usage: 0
Performance: 1.0

BioGRID (Biological General Repository for Interaction Datasets)

protein_annotation

Protein-protein, genetic, and chemical interactions from model organisms.

Usage: 0
Performance: 1.0

BioImage Archive

dataset_discovery

EMBL-EBI's repository for biological imaging data.

Usage: 0
Performance: 1.0

BioModels

data_retrieval

Query curated and non-curated systems biology models.

Usage: 0
Performance: 1.0

BioPlex

protein_annotation

BioPlex: High-quality protein-protein interaction network from human cells.

Usage: 0
Performance: 1.0

BioProject (NCBI)

expression_data

Access NCBI BioProject database: genomics, transcriptomics, and other omics project metadata.

Usage: 0
Performance: 1.0

biopython

bioinformatics

Comprehensive molecular biology toolkit. Use for sequence manipulation, file parsing (FASTA/GenBank/PDB), phylogenetics, and programmatic NCBI/PubMed access (Bio.Entrez). Best for batch processing, custom bioinformatics pipelines, BLAST automation. For quick lookups use gget; for multi-service integration use bioservices.

Usage: 0
Performance: 1.0

bioRxiv and medRxiv API Client

literature_search

Search and retrieve preprints from bioRxiv and medRxiv.

Usage: 0
Performance: 1.0

BioSamples Database

dataset_discovery

Database for metadata about biological samples from EBI.

Usage: 0
Performance: 1.0

BioSearch

data_retrieval

Placeholder for BioSearch API - a cell type and tissue search service.

Usage: 0
Performance: 1.0

bioservices

general

Unified Python interface to 40+ bioinformatics services. Use when querying multiple databases (UniProt, KEGG, ChEMBL, Reactome) in a single workflow with consistent API. Best for cross-database analysis, ID mapping across services. For quick single-database lookups use gget; for sequence/file manipulation use biopython.

Usage: 0
Performance: 1.0

BioStudies - Multi-omics Study Repository

dataset_discovery

EBI database for multi-omics studies and supporting data.

Usage: 0
Performance: 1.0

bio.tools API wrapper.

dataset_discovery

bio.tools is the ELIXIR registry of software tools and databases for life sciences,

Usage: 0
Performance: 1.0

BLAST — Basic Local Alignment Search Tool

dataset_discovery

Sequence similarity search against NCBI databases.

Usage: 0
Performance: 1.0

Blueprint Epigenome

gene_annotation

European project mapping epigenomes of primary hematopoietic cells.

Usage: 0
Performance: 1.0

BMRB (Biological Magnetic Resonance Bank) API wrapper for Forge.

dataset_discovery

BMRB is the international repository for NMR spectroscopy data of biological

Usage: 0
Performance: 1.0

BrainMap — Neuroimaging Coordinate-Based Meta-Analysis Database

dataset_discovery

Large-scale database of published functional neuroimaging experiments with

Usage: 0
Performance: 1.0

BrainSpan

expression_data

BrainSpan Atlas of the Developing Human Brain - transcriptional atlas of

Usage: 0
Performance: 1.0

BrassicaDB

dataset_discovery

Brassica Database - Genomic data for Brassica species.

Usage: 0
Performance: 1.0

BRENDA Enzyme Database

dataset_discovery

Access to BRENDA - comprehensive enzyme information system with kinetic data,

Usage: 0
Performance: 1.0

BV-BRC (PATRIC)

data_retrieval

BV-BRC (Bacterial and Viral Bioinformatics Resource Center) provides comprehensive

Usage: 0
Performance: 1.0

CADD

variant_annotation

Combined Annotation Dependent Depletion - Deleteriousness scores for variants.

Usage: 0
Performance: 1.0

Cancer Hotspots

variant_annotation

CancerHotspots - resource for statistically significant recurrent mutations

Usage: 0
Performance: 1.0

CATH Database

protein_annotation

Hierarchical classification of protein domain structures.

Usage: 0
Performance: 1.0

CAZy Database

dataset_discovery

Carbohydrate-Active enZymes Database - classification and annotation

Usage: 0
Performance: 1.0

cBioPortal

dataset_discovery

Access to cBioPortal for Cancer Genomics - comprehensive cancer genomics database

Usage: 0
Performance: 1.0

CCLE — Cancer Cell Line Encyclopedia (Broad Institute)

data_retrieval

Comprehensive characterization of 1,000+ cancer cell lines with genomics,

Usage: 0
Performance: 1.0

CellMarker - Cell Type Marker Gene Database

dataset_discovery

CellMarker is a comprehensive database of cell type markers across human and

Usage: 0
Performance: 1.0

CellMiner

drug_discovery

NCI-60 cancer cell line database with genomics and drug response data.

Usage: 0
Performance: 1.0

Cell Ontology (CL) API wrapper via OLS (Ontology Lookup Service).

structure_prediction

The Cell Ontology provides a structured controlled vocabulary for cell types

Usage: 0
Performance: 1.0

Cellosaurus Cell Line Database

dataset_discovery

Access the Cellosaurus knowledge resource on cell lines.

Usage: 0
Performance: 1.0

CellxGene Cell Type Expression

single_cell_expression

Query CZ CELLxGENE Discover 'Where is My Gene' API for single-cell cell-type expression data across human tissues.

Usage: 0
Performance: 1.0

cellxgene-census

bioinformatics

Query the CELLxGENE Census (61M+ cells) programmatically. Use when you need expression data across tissues, diseases, or cell types from the largest curated single-cell atlas. Best for population-scale queries, reference atlas comparisons. For analyzing your own data use scanpy or scvi-tools.

Usage: 0
Performance: 1.0

CELLxGENE Census

expression_data

Access to the Chan Zuckerberg CELLxGENE Data Portal - the largest standardized

Usage: 0
Performance: 1.0

CGD (Candida Genome Database) - Candida albicans genomics resource.

dataset_discovery

CGD provides curated genomic and biological information for Candida species,

Usage: 0
Performance: 1.0

ChEBI (Chemical Entities of Biological Interest)

drug_discovery

ChEBI is a freely available dictionary of molecular entities focused on 'small' chemical compounds.

Usage: 0
Performance: 1.0

ChEMBL

drug_discovery

Access to ChEMBL database of bioactive drug-like small molecules.

Usage: 0
Performance: 1.0

chembl-drug-targets

tool

Return ChEMBL drug compounds targeting a specified gene/protein with ChEMBL IDs, activity types, activity values, and pChEMBL potency. Use when a Domain Expert is assessing druggability, cataloguing tool compounds or clinical candidates, or auditing chemical matter for a proposed target.

Usage: 0
Performance: 0.8
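The pChEMBL value mentioned above is the negative base-10 logarithm of a molar potency measurement (IC50, Ki, Kd, EC50 and similar). Below is a small sketch of the conversion from nanomolar values. The function name `pchembl_from_nm` is illustrative and not part of the Forge tool.

```python
import math


def pchembl_from_nm(value_nm):
    """Convert a potency in nanomolar to a pChEMBL-style value.

    pChEMBL = -log10(potency in mol/L); 1 nM = 1e-9 mol/L,
    so the result is 9 - log10(value in nM).
    """
    if value_nm <= 0:
        raise ValueError("potency must be a positive number")
    return 9.0 - math.log10(value_nm)


print(pchembl_from_nm(10))    # 10 nM -> 8.0
print(pchembl_from_nm(1500))  # weaker binder, lower pChEMBL
```

Higher values mean more potent compounds, which makes the scale convenient for ranking chemical matter against a target.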

ChEMBL Drug Targets

data_retrieval

Drug compounds and bioactivity data for a gene target from the ChEMBL database of bioactive molecules.

Usage: 0
Performance: 0.1

ChemSpace - Chemical Space Exploration

drug_discovery

Searchable database of 1B+ purchasable chemical compounds.

Usage: 0
Performance: 1.0

ChemSpider API Client

drug_discovery

Search and retrieve chemical information from ChemSpider (Royal Society of Chemistry).

Usage: 0
Performance: 1.0

CircBank

dataset_discovery

CircBank: Comprehensive database of human circular RNAs.

Usage: 0
Performance: 1.0

circBase

expression_data

Database of circular RNAs (circRNAs) identified from RNA-seq data.

Usage: 0
Performance: 1.0

cirq

quantum

Google quantum computing framework. Use when targeting Google Quantum AI hardware, designing noise-aware circuits, or running quantum characterization experiments. Best for Google hardware, noise modeling, and low-level circuit design. For IBM hardware use qiskit; for quantum ML with autodiff use pennylane; for physics simulations use qutip.

Usage: 0
Performance: 1.0

citation-management

scientific_comm

Comprehensive citation management for academic research. Search Google Scholar and PubMed for papers, extract accurate metadata, validate citations, and generate properly formatted BibTeX entries. This skill should be used when you need to find papers, verify citation information, convert DOIs to BibTeX, or ensure reference accuracy in scientific writing.

Usage: 0
Performance: 1.0
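One common way to convert a DOI to BibTeX is DOI content negotiation: request the DOI from doi.org with an `Accept: application/x-bibtex` header. The sketch below only builds the request and does not send it. How the skill itself does the conversion is not shown on this page, so treat this as an assumption, and note that `10.1000/xyz123` is a placeholder DOI.

```python
from urllib.parse import quote
from urllib.request import Request


def bibtex_request(doi):
    """Build (but do not send) a content-negotiation request for a BibTeX record."""
    url = "https://doi.org/" + quote(doi, safe="/")
    return Request(url, headers={"Accept": "application/x-bibtex"})


req = bibtex_request("10.1000/xyz123")  # placeholder DOI, not a real record
print(req.full_url)
print(req.get_header("Accept"))

# To actually fetch the entry (needs network access):
#   from urllib.request import urlopen
#   bibtex = urlopen(req).read().decode("utf-8")
```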

CIViC

variant_annotation

Clinical Interpretations of Variants in Cancer.

Usage: 0
Performance: 1.0

CIViC Gene Variants

clinical_genetics

Get CIViC curated clinical variant interpretations for a gene.

Usage: 0
Performance: 1.0

ClinGen Data

gene_annotation

Clinical Genome Resource - Curated gene-disease validity, dosage sensitivity,

Usage: 0
Performance: 1.0

ClinGen Gene-Disease Validity

genetic_disease

Query ClinGen for curated gene-disease validity classifications.

Usage: 0
Performance: 1.0

clinical-decision-support

clinical

Generate professional clinical decision support (CDS) documents for pharmaceutical and clinical research settings, including patient cohort analyses (biomarker-stratified with outcomes) and treatment recommendation reports (evidence-based guidelines with decision algorithms). Supports GRADE evidence grading, statistical analysis (hazard ratios, survival curves, waterfall plots), biomarker integration, and regulatory compliance. Outputs publication-ready LaTeX/PDF format optimized for drug development, clinical research, and evidence synthesis.

Usage: 0
Performance: 1.0

clinical-reports

clinical

Write comprehensive clinical reports including case reports (CARE guidelines), diagnostic reports (radiology/pathology/lab), clinical trial reports (ICH-E3, SAE, CSR), and patient documentation (SOAP, H&P, discharge summaries). Includes templates, regulatory compliance support (HIPAA, FDA, ICH-GCP), and validation tools.

Usage: 0
Performance: 1.0

ClinicalTrials.gov

clinical_data

Access to 400,000+ clinical trials worldwide via the ClinicalTrials.gov v2 API.

Usage: 0
Performance: 1.0

ClinicalTrials.gov Search

clinical_data

Search ClinicalTrials.gov for clinical trials related to genes, diseases, or interventions. Returns NCT IDs, status, phase, conditions, interventions, enrollment, and sponsor info.

Usage: 0
Performance: 1.0
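As a rough illustration of what a search against the ClinicalTrials.gov v2 API looks like, the sketch below builds a query URL for the `/api/v2/studies` endpoint without sending it. The helper name `build_trials_query` and its defaults are assumptions for this example, and the Forge tool's actual parameters may differ.

```python
from urllib.parse import urlencode

BASE_URL = "https://clinicaltrials.gov/api/v2/studies"


def build_trials_query(condition=None, intervention=None, term=None, page_size=10):
    """Build a ClinicalTrials.gov v2 search URL from optional filters."""
    params = {}
    if condition:
        params["query.cond"] = condition      # disease or condition
    if intervention:
        params["query.intr"] = intervention   # drug, device, procedure
    if term:
        params["query.term"] = term           # free-text term, e.g. a gene symbol
    params["pageSize"] = page_size
    return f"{BASE_URL}?{urlencode(params)}"


url = build_trials_query(condition="Alzheimer disease", term="TREM2")
print(url)
```

The JSON response can then be fetched with any HTTP client. Fields such as NCT ID, status, and phase are nested inside each returned study record.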

ClinVar

dataset_discovery

Clinical variant database from NCBI with pathogenicity annotations.

Usage: 0
Performance: 1.0

clinvar-variants

tool

Return clinical variants from NCBI ClinVar for a given gene — variant names, clinical significance, conditions, and review status. Use when a Skeptic or Falsifier is probing whether genetic variants in a candidate gene have established clinical significance, or when verifying pathogenicity claims.

Usage: 0
Performance: 1.0

ClinVar Variants

data_retrieval

Fetch clinical genetic variants from NCBI ClinVar. Returns pathogenicity, review status, and associated conditions.

Usage: 0
Performance: 0.1

cobrapy

cheminformatics

Constraint-based metabolic modeling (COBRA). Supports flux balance analysis (FBA), flux variability analysis (FVA), gene knockouts, flux sampling, and SBML models for systems biology and metabolic engineering.

Usage: 0
Performance: 1.0

COCONUT - Collection of Open Natural Products

dataset_discovery

Open source natural products database.

Usage: 0
Performance: 1.0

COILS — Coiled-Coil Prediction

protein_annotation

Predict coiled-coil regions in proteins.

Usage: 0
Performance: 1.0

CollecTF

expression_data

CollecTF - database of experimentally validated transcription factor binding

Usage: 0
Performance: 1.0

Complex Portal

dataset_discovery

Complex Portal is a manually curated database of stable macromolecular complexes

Usage: 0
Performance: 1.0

CompTox (EPA Computational Toxicology Dashboard) API wrapper for Forge.

drug_discovery

CompTox provides access to EPA's chemistry, toxicity, and exposure data for

Usage: 0
Performance: 1.0

consciousness-council

scientific_comm

Run a multi-perspective Mind Council deliberation on any question, decision, or creative challenge. Use this skill whenever the user wants diverse viewpoints, needs help making a tough decision, asks for a council/panel/board discussion, wants to explore a problem from multiple angles, requests devil's advocate analysis, or says things like "what would different experts think about this", "help me think through this from all sides", "council mode", "mind council", or "deliberate on this". Also trigger when the user faces a dilemma, trade-off, or complex choice with no obvious answer.

Usage: 0
Performance: 1.0

ConsensusPathDB

network_analysis

ConsensusPathDB: Molecular functional interaction database integrating

Usage: 0
Performance: 1.0

ConSurf — Evolutionary Conservation Analysis

data_retrieval

Identify functionally important regions by conservation analysis.

Usage: 0
Performance: 1.0

CORUM

protein_annotation

Comprehensive Resource of Mammalian protein complexes.

Usage: 0
Performance: 1.0

COSMIC Database

dataset_discovery

Catalogue of Somatic Mutations in Cancer - comprehensive cancer mutation database.

Usage: 0
Performance: 1.0

CottonGen — Cotton Genomics, Genetics, and Breeding Database

dataset_discovery

Comprehensive cotton research database with genomics, genetics, breeding,

Usage: 0
Performance: 1.0

CRAN (Comprehensive R Archive Network) API wrapper.

network_analysis

CRAN is the primary repository for R packages, hosting 20,000+ packages

Usage: 0
Performance: 1.0

CRISPOR - CRISPR Design Tool Utilities

data_retrieval

Utilities for CRISPR guide RNA design and analysis.

Usage: 0
Performance: 1.0

Crossref API Client

data_retrieval

Search and retrieve metadata for scholarly works via DOI.

Usage: 0
Performance: 1.0
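A minimal sketch of the two request shapes a Crossref client typically needs: a direct lookup by DOI and a free-text search, both against the public `api.crossref.org/works` route. The helper names are illustrative and `10.5555/12345678` is a placeholder DOI. No request is sent here.

```python
from urllib.parse import quote, urlencode

CROSSREF_WORKS = "https://api.crossref.org/works"


def work_url(doi):
    """URL for the metadata record of a single DOI."""
    return f"{CROSSREF_WORKS}/{quote(doi, safe='/')}"


def search_url(query, rows=5):
    """URL for a free-text bibliographic search, limited to `rows` results."""
    return f"{CROSSREF_WORKS}?{urlencode({'query': query, 'rows': rows})}"


print(work_url("10.5555/12345678"))  # placeholder DOI
print(search_url("tau propagation", rows=3))
```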

CrossRef Preprint Search

literature_search

Search for preprints (bioRxiv/medRxiv) via CrossRef API.

Usage: 0
Performance: 1.0

CTD (Comparative Toxicogenomics Database)

protein_annotation

Access to CTD - manually curated information about chemical-gene/protein

Usage: 0
Performance: 1.0

CTRPv2 — Cancer Therapeutics Response Portal v2

drug_discovery

Large-scale cancer drug sensitivity resource with small molecule profiling

Usage: 0
Performance: 1.0

dask

data_analysis

Distributed computing for larger-than-RAM pandas/NumPy workflows. Use when you need to scale existing pandas/NumPy code beyond memory or across clusters. Best for parallel file processing, distributed ML, integration with existing pandas code. For out-of-core analytics on single machine use vaex; for in-memory speed use polars.

Usage: 0
Performance: 1.0

database-lookup

database_access

Search 78 public scientific, biomedical, materials science, and economic databases via REST APIs. Covers physics/astronomy (NASA, NIST, SDSS, SIMBAD), earth/environment (USGS, NOAA, EPA), chemistry/drugs (PubChem, ChEMBL, DrugBank, FDA, KEGG, ZINC, BindingDB), materials (Materials Project, COD), biology/genomics (Reactome, UniProt, STRING, Ensembl, NCBI Gene, GEO, GTEx, PDB, AlphaFold, InterPro, BioGRID, Gene Ontology, dbSNP, gnomAD, ENCODE, Human Protein Atlas, Human Cell Atlas), disease/clinical (COSMIC, Open Targets, ClinicalTrials.gov, OMIM, ClinVar, GDC/TCGA, cBioPortal, DisGeNET, GWAS Catalog), regulatory (FDA, USPTO, SEC EDGAR), economics/finance (FRED, World Bank, US Treasury), demographics (US Census, Eurostat, WHO). Use when looking up compounds, genes, proteins, pathways, variants, clinical trials, patents, economic indicators, or any public database API query.

Usage: 0
Performance: 1.0

datamol

cheminformatics

Pythonic wrapper around RDKit with simplified interface and sensible defaults. Preferred for standard drug discovery including SMILES parsing, standardization, descriptors, fingerprints, clustering, 3D conformers, parallel processing. Returns native rdkit.Chem.Mol objects. For advanced control or custom parameters, use rdkit directly.

Usage: 0
Performance: 1.0

dbGaP (database of Genotypes and Phenotypes) API wrapper for Forge.

dataset_discovery

NCBI's dbGaP is a repository for individual-level phenotype, exposure, genotype, and

Usage: 0
Performance: 1.0

dbPTM — Database of Protein Post-Translational Modifications

protein_annotation

Comprehensive database of experimentally verified PTMs from multiple sources.

Usage: 0
Performance: 1.0

dbSNP (NCBI Database of Single Nucleotide Polymorphisms)

dataset_discovery

Database of genetic variation in humans and other organisms.

Usage: 0
Performance: 1.0

DECIPHER Database

dataset_discovery

Database of genomic variation and phenotype in humans with CNVs and rare diseases.

Usage: 0
Performance: 1.0

deepchem

cheminformatics

Molecular ML with diverse featurizers and pre-built datasets. Use for property prediction (ADMET, toxicity) with traditional ML or GNNs when you want extensive featurization options and MoleculeNet benchmarks. Best for quick experiments with pre-trained models, diverse molecular representations. For graph-first PyTorch workflows use torchdrug; for benchmark datasets use pytdc.

Usage: 0
Performance: 1.0

deeptools

bioinformatics

NGS analysis toolkit. Converts BAM to bigWig, runs QC (correlation, PCA, fingerprints), and plots heatmaps/profiles (TSS, peaks) for ChIP-seq, RNA-seq, and ATAC-seq visualization.

Usage: 0
Performance: 1.0

depmap

clinical

Query the Cancer Dependency Map (DepMap) for cancer cell line gene dependency scores (CRISPR Chronos), drug sensitivity data, and gene effect profiles. Use for identifying cancer-specific vulnerabilities, synthetic lethal interactions, and validating oncology drug targets.

Usage: 0
Performance: 1.0

DepMap

drug_discovery

Cancer Dependency Map - CRISPR screens and drug sensitivity data.

Usage: 0
Performance: 1.0

dgidb-drug-gene

tool

Query DGIdb (Drug-Gene Interaction database) for druggability categories and known drug-gene interactions for a human gene, aggregating evidence from >40 sources including DrugBank, PharmGKB, TTD, and ChEMBL. Use when a domain expert needs a druggability summary for a target, when competitive-landscape assessments need to enumerate existing drugs against a gene, or when a hypothesis proposes targeting a gene whose prior pharmacology must be checked.

Usage: 0
Performance: 0.3

DGIdb (Drug Gene Interaction Database)

network_analysis

Access to DGIdb - comprehensive database of drug-gene interactions and druggable

Usage: 0
Performance: 1.0

DGIdb Drug-Gene Interactions

drug_database

Query DGIdb for drug-gene interactions and druggability categories. Aggregates DrugBank, PharmGKB, TTD, ChEMBL, and clinical guideline data.

Usage: 0
Performance: 1.0

DGV (Database of Genomic Variants)

dataset_discovery

Comprehensive database of structural variation in healthy human genomes.

Usage: 0
Performance: 1.0

dhdna-profiler

scientific_comm

Extract cognitive patterns and thinking fingerprints from any text. Use this skill when the user wants to analyze how someone thinks, understand cognitive style, profile writing or speech patterns, compare thinking styles between people, asks "what's my thinking style", "analyze how this person reasons", "cognitive profile", "thinking pattern", "DHDNA", "digital DNA", or wants to understand the mind behind any text. Also trigger when the user provides text and wants deeper insight into the author's reasoning patterns, decision-making style, or cognitive signature.

Usage: 0
Performance: 1.0

DictyBase

dataset_discovery

DictyBase is the model organism database for Dictyostelium discoideum,

Usage: 0
Performance: 1.0

diffdock

cheminformatics

Diffusion-based molecular docking. Predicts protein-ligand binding poses from PDB/SMILES inputs with confidence scores; suited to virtual screening and structure-based drug design. Not for affinity prediction.

Usage: 0
Performance: 1.0

DIP

protein_annotation

DIP: Database of Interacting Proteins

Usage: 0
Performance: 1.0

DiseaseMeth Database

dataset_discovery

Database of disease-associated DNA methylation patterns.

Usage: 0
Performance: 1.0

Disease Ontology (DO) API wrapper via OLS (Ontology Lookup Service).

ontology

The Disease Ontology provides a standardized classification of human diseases,

Usage: 0
Performance: 1.0

DisGeNET Disease-Gene

disease_gene_association

Get genes associated with a disease from DisGeNET with association scores.

Usage: 0
Performance: 1.0

DisGeNET Gene-Disease

gene_disease_association

Get disease associations for a gene from DisGeNET with scores and supporting PMIDs.

Usage: 0
Performance: 1.0

DisGeNET Gene-Disease Associations

literature_search

Database of gene-disease associations from GWAS, literature, and animal models.

Usage: 0
Performance: 1.0

disgenet-gene-diseases

tool

Return disease associations for a human gene from DisGeNET with DisGeNET score, evidence counts, source databases, and supporting PMIDs. Use when a hypothesis links a gene to a disease that must be cross-checked against curated gene-disease aggregations, when an evidence auditor needs a non-GWAS view of genetic association breadth, or when a skeptic wants to surface competing disease contexts for a target.

Usage: 0
Performance: 0.9

DisProt

protein_annotation

Database of Intrinsically Disordered Proteins (IDPs) and regions (IDRs).

Usage: 0
Performance: 1.0

dnanexus-integration

lab_automation

DNAnexus cloud genomics platform. Build apps/applets, manage data (upload/download), use the dxpy Python SDK, and run workflows on FASTQ/BAM/VCF files for genomics pipeline development and execution.

Usage: 0
Performance: 1.0

docx

data_analysis

Use this skill whenever the user wants to create, read, edit, or manipulate Word documents (.docx files). Triggers include: any mention of 'Word doc', 'word document', '.docx', or requests to produce professional documents with formatting like tables of contents, headings, page numbers, or letterheads. Also use when extracting or reorganizing content from .docx files, inserting or replacing images in documents, performing find-and-replace in Word files, working with tracked changes or comments, or converting content into a polished Word document. If the user asks for a 'report', 'memo', 'letter', 'template', or similar deliverable as a Word or .docx file, use this skill. Do NOT use for PDFs, spreadsheets, Google Docs, or general coding tasks unrelated to document generation.

Usage: 0
Performance: 1.0

DrugBank

drug_discovery

Access to DrugBank - comprehensive drug and drug target database with detailed

Usage: 0
Performance: 1.0

drugbank-drug-info

tool

Return drug information (brand/generic names, manufacturer, route, pharmacological class, MoA indicators) for a queried drug name via OpenFDA and PubChem (open alternatives to the proprietary DrugBank API). Use when a Domain Expert wants regulatory-labelled drug metadata or when a hypothesis references a named drug and needs class/indication grounding.

Usage: 0
Performance: 0.9

DrugCentral

network_analysis

Online drug compendium with drug-target interactions and pharmacology data.

Usage: 0
Performance: 1.0

DSMZ (German Collection of Microorganisms and Cell Cultures) API wrapper.

model_organism

DSMZ is one of Europe's largest biological resource centers with 80,000+

Usage: 0
Performance: 1.0

DTD Test A

api_wrapper

Mock tool for deprecated_tool_detector test

Usage: 0
Performance: 1.0

DTD Test A

api

Mock

Usage: 0
Performance: 1.0

DTD Test A

api_wrapper

Mock

Usage: 0
Performance: 1.0

DTD Test B

api_wrapper

Mock

Usage: 0
Performance: 1.0

DTD Test B

api_wrapper

Mock tool for deprecated_tool_detector test

Usage: 0
Performance: 1.0

DTD Test B

api

Mock

Usage: 0
Performance: 1.0

DTD Test C

api_wrapper

Mock

Usage: 0
Performance: 1.0

DTD Test C

api_wrapper

Mock tool for deprecated_tool_detector test

Usage: 0
Performance: 1.0

DTD Test C

api

Mock

Usage: 0
Performance: 1.0

EBI eQTL Catalog

expression_qtl

Query the EBI eQTL Catalog for tissue-specific expression quantitative trait loci.

Usage: 0
Performance: 1.0

ECMDB — E. coli Metabolome Database

dataset_discovery

Comprehensive database of small molecule metabolites found in Escherichia coli K-12.

Usage: 0
Performance: 1.0

EcoCyc

dataset_discovery

EcoCyc is a comprehensive database of E. coli K-12 biology, including metabolic

Usage: 0
Performance: 1.0

EcoGene

dataset_discovery

EcoGene - database for Escherichia coli K-12 genome and proteome sequences

Usage: 0
Performance: 1.0

eggNOG (evolutionary genealogy of genes: Non-supervised Orthologous Groups) API wrapper.

dataset_discovery

eggNOG is a database of orthologous groups and functional annotation across

Usage: 0
Performance: 1.0

ELM

sequence_analysis

ELM: Eukaryotic Linear Motif resource

Usage: 0
Performance: 1.0

EMDB

structure_prediction

Electron Microscopy Data Bank - cryo-EM structure database.

Usage: 0
Performance: 1.0

eMolecules - Chemical Compound Search

structure_prediction

Chemical structure search and supplier database with 10M+ compounds.

Usage: 0
Performance: 1.0

EMP — Earth Microbiome Project

data_retrieval

Massive-scale standardized microbiome analysis across diverse ecosystems worldwide.

Usage: 0
Performance: 1.0

EMPIAR

dataset_discovery

Electron Microscopy Public Image Archive (EMPIAR) - public resource for

Usage: 0
Performance: 1.0

ENCODE Project

data_retrieval

Access ENCODE (Encyclopedia of DNA Elements) data.

Usage: 0
Performance: 1.0

Enrichr

gene_annotation

Gene set enrichment analysis using 200+ annotation libraries.

Usage: 0
Performance: 1.0

enrichr-analyze

tool

Run pathway or term enrichment over a gene list against an Enrichr library (e.g. GO Biological Process) and return ranked enriched terms with p-values and overlapping genes. Use when a hypothesis or experiment yields a gene set that needs functional interpretation, when a statistician wants multi-test-corrected enrichment evidence, or when comparing candidate gene lists before an artifact is promoted.

Usage: 0
Performance: 0.1
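To make "enriched terms with p-values" concrete, the sketch below computes an over-representation p-value with the hypergeometric upper tail (equivalent to a one-sided Fisher exact test) and applies a Benjamini-Hochberg correction across terms. This is a generic illustration, not Enrichr's own scoring, which may combine additional statistics. All function names and the numbers in the usage lines are made up for the example.

```python
from math import comb


def enrichment_p_value(overlap, list_size, set_size, background):
    """P(X >= overlap) when drawing `list_size` genes from `background`,
    of which `set_size` belong to the annotated term."""
    max_k = min(list_size, set_size)
    total = comb(background, list_size)
    tail = sum(
        comb(set_size, k) * comb(background - set_size, list_size - k)
        for k in range(overlap, max_k + 1)
    )
    return tail / total


def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical numbers: 8 of 50 query genes fall in a 200-gene term,
# against a 20,000-gene background.
p = enrichment_p_value(overlap=8, list_size=50, set_size=200, background=20000)
print(p)
print(benjamini_hochberg([0.01, 0.04, 0.03]))
```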

Enrichr GO Enrichment

data_retrieval

Gene set enrichment against GO Biological Process. Enter a gene list to find enriched pathways.

Usage: 0
Performance: 0.1

Ensembl Gene Phenotype Associations

gene_disease

Retrieve multi-source phenotype associations for a gene via Ensembl REST.

Usage: 0
Performance: 1.0

Ensembl Plants

gene_annotation

Ensembl Plants provides genomic data for plant species.

Usage: 0
Performance: 1.0

Ensembl REST

gene_annotation

Genome annotation, variation, and comparative genomics from Ensembl.

Usage: 0
Performance: 1.0
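For orientation, a gene lookup against the public Ensembl REST service uses the `lookup/symbol/:species/:symbol` endpoint. The sketch below only constructs the URL. The helper name and the default species are assumptions for this example, not the Forge wrapper's interface.

```python
from urllib.parse import quote

ENSEMBL_REST = "https://rest.ensembl.org"


def lookup_symbol_url(symbol, species="homo_sapiens"):
    """URL that returns the Ensembl gene record for a symbol as JSON."""
    return (
        f"{ENSEMBL_REST}/lookup/symbol/{quote(species)}/{quote(symbol)}"
        "?content-type=application/json"
    )


print(lookup_symbol_url("BRCA2"))
```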

ENTEx (ENCODE Tissue Expression & Regulation) API wrapper.

expression_data

ENTEx is an ENCODE project providing comprehensive molecular characterization

Usage: 0
Performance: 1.0

esm

protein_engineering

Comprehensive toolkit for protein language models including ESM3 (generative multimodal protein design across sequence, structure, and function) and ESM C (efficient protein embeddings and representations). Use this skill when working with protein sequences, structures, or function prediction; designing novel proteins; generating protein embeddings; performing inverse folding; or conducting protein engineering tasks. Supports both local model usage and cloud-based Forge API for scalable inference.

Usage: 0
Performance: 1.0

etetoolkit

research_methodology

Phylogenetic tree toolkit (ETE). Covers tree manipulation (Newick/NHX), evolutionary event detection, orthology/paralogy inference, NCBI taxonomy, and visualization (PDF/SVG) for phylogenomics.

Usage: 0
Performance: 1.0

EuPathDB

dataset_discovery

Integrated database resource for eukaryotic pathogens and related organisms.

Usage: 0
Performance: 1.0

European Nucleotide Archive (ENA)

dataset_discovery

Comprehensive resource for nucleotide sequence data and metadata.

Usage: 0
Performance: 1.0

European Variation Archive (EVA)

dataset_discovery

EVA is an open-access database of genetic variation data from all species.

Usage: 0
Performance: 1.0

Europe PMC API Client

literature_search

Search and retrieve literature from Europe PMC (formerly UK PubMed Central).

Usage: 0
Performance: 1.0

ExAC (Exome Aggregation Consortium)

data_retrieval

ExAC was a large-scale exome sequencing project that aggregated data from

Usage: 0
Performance: 1.0

exploratory-data-analysis

data_analysis

Perform comprehensive exploratory data analysis on scientific data files across 200+ file formats. This skill should be used when analyzing any scientific data file to understand its structure, content, quality, and characteristics. Automatically detects file type and generates detailed markdown reports with format-specific analysis, quality metrics, and downstream analysis recommendations. Covers chemistry, bioinformatics, microscopy, spectroscopy, proteomics, metabolomics, and general scientific data formats.

Usage: 0
Performance: 1.0

Expression Atlas API wrapper for Forge.

expression_data

Expression Atlas provides information on gene and protein expression across

Usage: 0
Performance: 1.0

FAIRsharing API wrapper.

data_retrieval

FAIRsharing is a curated, informative registry of three types of resources:

Usage: 0
Performance: 1.0

FANTOM

gene_annotation

Functional Annotation of the Mammalian Genome (FANTOM) - comprehensive

Usage: 0
Performance: 1.0

FinnGen Disease Loci

genetic_associations

Query FinnGen R10 (500K Finnish cohort) for fine-mapped genetic loci associated with a disease or trait.

Usage: 0
Performance: 0.9

FishBase API wrapper for fish species information and ecology.

dataset_discovery

FishBase is the world's largest fish database with 35,000+ species, containing

Usage: 0
Performance: 1.0

flowio

bioinformatics

Parse FCS (Flow Cytometry Standard) files v2.0-3.1. Extracts events as NumPy arrays, reads metadata/channels, and converts to CSV/DataFrame for flow cytometry data preprocessing.

Usage: 0
Performance: 1.0

FlowRepository - Flow Cytometry Data Repository

dataset_discovery

Public repository for flow cytometry data and analysis.

Usage: 0
Performance: 1.0

fluidsim

engineering

Framework for computational fluid dynamics simulations using Python. Use when running fluid dynamics simulations including Navier-Stokes equations (2D/3D), shallow water equations, stratified flows, or when analyzing turbulence, vortex dynamics, or geophysical flows. Provides pseudospectral methods with FFT, HPC support, and comprehensive output analysis.

Usage: 0
Performance: 1.0

FlyBase (Drosophila)

dataset_discovery

Access to FlyBase - comprehensive database for Drosophila genetics and genomics.

Usage: 0
Performance: 1.0

FooDB

dataset_discovery

FooDB is the world's largest food composition database with 28,000+ foods,

Usage: 0
Performance: 1.0

Franklin by Genoox - Clinical Variant Interpretation

variant_annotation

AI-powered clinical variant interpretation and classification platform.

Usage: 0
Performance: 1.0

FungiDB — Fungal Genomics Resource

dataset_discovery

Comprehensive genomics database for fungal pathogens and model organisms.

Usage: 0
Performance: 1.0

Galaxy Tool Shed API wrapper.

dataset_discovery

The Galaxy Tool Shed is a repository of tools and workflows for the Galaxy platform.

Usage: 0
Performance: 1.0

GARD (Genetic and Rare Diseases Information Center)

gene_annotation

Provides information about rare and genetic diseases from NIH/NCATS.

Usage: 0
Performance: 1.0

GDSC (Genomics of Drug Sensitivity in Cancer)

drug_discovery

Large-scale pharmacogenomic database screening cancer cell lines with

Usage: 0
Performance: 1.0

GenBank API wrapper for Forge.

dataset_discovery

GenBank is NCBI's primary nucleotide sequence database containing DNA and RNA

Usage: 0
Performance: 1.0

GenCC

gene_annotation

Gene Curation Coalition (GenCC) - authoritative resource for gene-disease

Usage: 0
Performance: 1.0

GENCODE — Comprehensive Gene Annotation

gene_annotation

High-quality gene annotations from the GENCODE project.

Usage: 0
Performance: 1.0

GeneCards

expression_data

GeneCards - the human gene database integrating genomic, transcriptomic,

Usage: 0
Performance: 1.0

Gene Info

data_retrieval

Look up any human gene — returns full name, summary, aliases, and gene type from MyGene.info.

Usage: 0
Performance: 0.3

GENENAMES

dataset_discovery

GENENAMES - HUGO Gene Nomenclature Committee (HGNC) database of

Usage: 0
Performance: 1.0

Gene Ontology Annotations

annotation

Query QuickGO (EBI) for Gene Ontology annotations.

Usage: 0
Performance: 1.0

Gene Ontology (GO)

structure_prediction

Structured, controlled vocabulary for gene and gene product attributes.

Usage: 0
Performance: 1.0

generate-image

scientific_comm

Generate or edit images using AI models (FLUX, Nano Banana 2). Use for general-purpose image generation including photos, illustrations, artwork, visual assets, concept art, and any image that is not a technical diagram or schematic. For flowcharts, circuits, pathways, and technical diagrams, use the scientific-schematics skill instead.

Usage: 0
Performance: 1.0

GeneTests - Genetic Testing Registry

gene_annotation

Information about genetic tests and testing laboratories.

Usage: 0
Performance: 1.0

geniml

bioinformatics

This skill should be used when working with genomic interval data (BED files) for machine learning tasks. Use for training region embeddings (Region2Vec, BEDspace), single-cell ATAC-seq analysis (scEmbed), building consensus peaks (universes), or any ML-based analysis of genomic regions. Applies to BED file collections, scATAC-seq data, chromatin accessibility datasets, and region-based genomic feature learning.

Usage: 0
Performance: 1.0

Genomics England PanelApp

clinical_genetics

Query Genomics England PanelApp for disease gene panel memberships.

Usage: 0
Performance: 1.0

GEO

expression_data

Gene Expression Omnibus (GEO) - NCBI's gene expression repository.

Usage: 0
Performance: 1.0

geomaster

geospatial

Comprehensive geospatial science skill covering remote sensing, GIS, spatial analysis, machine learning for earth observation, and 30+ scientific domains. Supports satellite imagery processing (Sentinel, Landsat, MODIS, SAR, hyperspectral), vector and raster data operations, spatial statistics, point cloud processing, network analysis, cloud-native workflows (STAC, COG, Planetary Computer), and 8 programming languages (Python, R, Julia, JavaScript, C++, Java, Go, Rust) with 500+ code examples. Use for remote sensing workflows, GIS analysis, spatial ML, Earth observation data processing, terrain analysis, hydrological modeling, marine spatial analysis, atmospheric science, and any geospatial computation task.

Usage: 0
Performance: 1.0

geopandas

geospatial

Python library for working with geospatial vector data including shapefiles, GeoJSON, and GeoPackage files. Use when working with geographic data for spatial analysis, geometric operations, coordinate transformations, spatial joins, overlay operations, choropleth mapping, or any task involving reading/writing/analyzing vector geographic data. Supports PostGIS databases, interactive maps, and integration with matplotlib/folium/cartopy. Use for tasks like buffer analysis, spatial joins between datasets, dissolving boundaries, clipping data, calculating areas/distances, reprojecting coordinate systems, creating maps, or converting between spatial file formats.

Usage: 0
Performance: 1.0

GEPIA

expression_data

Gene Expression Profiling Interactive Analysis - cancer vs normal expression.

Usage: 0
Performance: 1.0

get-available-resources

infrastructure

This skill should be used at the start of any computationally intensive scientific task to detect and report available system resources (CPU cores, GPUs, memory, disk space). It creates a JSON file with resource information and strategic recommendations that inform computational approach decisions such as whether to use parallel processing (joblib, multiprocessing), out-of-core computing (Dask, Zarr), GPU acceleration (PyTorch, JAX), or memory-efficient strategies. Use this skill before running analyses, training models, processing large datasets, or any task where resource constraints matter.

Usage: 0
Performance: 1.0
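
The kind of resource report described above can be sketched with the Python standard library alone. This is a minimal illustration, not the skill's actual implementation: the function names, the JSON keys, and the "suggest_parallel" rule are all assumptions, and GPU and memory detection are left out because the standard library has no portable call for them.

```python
import json
import os
import shutil


def detect_resources(path="."):
    """Collect basic system resources using only the standard library."""
    disk = shutil.disk_usage(path)
    cores = os.cpu_count() or 1  # cpu_count() may return None
    return {
        "cpu_cores": cores,
        "disk_free_gb": round(disk.free / 1e9, 2),
        "disk_total_gb": round(disk.total / 1e9, 2),
        # Hypothetical recommendation rule: suggest parallelism with >1 core.
        "suggest_parallel": cores > 1,
    }


def write_report(report, filename):
    """Write the report as JSON so later steps can read it."""
    with open(filename, "w", encoding="utf-8") as handle:
        json.dump(report, handle, indent=2)
```

Calling `write_report(detect_resources(), "resources.json")` would produce a small file that a later step could read before choosing between serial and parallel execution; the file name is illustrative.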

gget

bioinformatics

Fast CLI/Python queries to 20+ bioinformatics databases. Use for quick lookups: gene info, BLAST searches, AlphaFold structures, enrichment analysis. Best for interactive exploration, simple queries. For batch processing or advanced BLAST use biopython; for multi-database Python workflows use bioservices.

Usage: 0
Performance: 1.0

ginkgo-cloud-lab

lab_automation

Submit and manage protocols on Ginkgo Bioworks Cloud Lab (cloud.ginkgo.bio), a web-based interface for autonomous lab execution on Reconfigurable Automation Carts (RACs). Use when the user wants to run cell-free protein expression (validation or optimization), generate fluorescent pixel art, or interact with Ginkgo Cloud Lab services. Covers protocol selection, input preparation, pricing, and ordering workflows.

Usage: 0
Performance: 1.0

glycoengineering

protein_engineering

Analyze and engineer protein glycosylation. Scan sequences for N-glycosylation sequons (N-X-S/T), predict O-glycosylation hotspots, and access curated glycoengineering tools (NetOGlyc, GlycoShield, GlycoWorkbench). For glycoprotein engineering, therapeutic antibody optimization, and vaccine design.

Usage: 0
Performance: 1.0
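
The N-X-S/T sequon scan mentioned above can be illustrated with a short regular expression. This is a sketch rather than the skill's own code: the function name is hypothetical, and it applies the common convention that X may be any residue except proline.

```python
import re

# Lookahead so that overlapping sequons (e.g. "NNTS") are all reported.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(sequence):
    """Return (1-based position, motif) for each N-X-S/T sequon, X != P."""
    seq = sequence.upper()
    return [(match.start() + 1, match.group(1)) for match in SEQUON.finditer(seq)]
```

For the toy sequence `"MNGSANPTNNTS"` this reports sequons at positions 2, 9, and 10, and skips `NPT` because of the proline. A sequon match only marks a candidate site; whether it is actually glycosylated depends on structural context.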

GlycomeDB - Glycan Structure Database

structure_prediction

Database of carbohydrate structures from multiple sources.

Usage: 0
Performance: 1.0

GlyGen API wrapper for Forge.

protein_annotation

GlyGen is a comprehensive glycan and glycoprotein database from NCBI/NIH.

Usage: 0
Performance: 1.0

gnomAD

dataset_discovery

Genome Aggregation Database - Population genetics and allele frequencies.

Usage: 0
Performance: 1.0

gnomad-gene-variants

tool

Query gnomAD (GRCh38) for population variant-frequency data for a gene — loss-of-function and missense variants with allele frequencies plus gene-level constraint scores (pLI, oe_lof, oe_mis). Use when a hypothesis depends on gene dosage, haploinsufficiency, or rare-variant burden, when a skeptic or falsifier wants to check whether a claimed pathogenic variant is too common to be causal, or when constraint must inform druggability and safety assessments.

Usage: 0
Performance: 0.9

GNPS

network_analysis

Global Natural Products Social Molecular Networking platform.

Usage: 0
Performance: 1.0

GOLD - Genomes OnLine Database

dataset_discovery

Comprehensive database of genome and metagenome sequencing projects.

Usage: 0
Performance: 1.0

g:Profiler API wrapper for Forge.

data_retrieval

g:Profiler is a widely-used functional enrichment analysis tool that performs

Usage: 0
Performance: 1.0

g:Profiler Gene Enrichment

pathway_analysis

Perform functional enrichment analysis using g:Profiler (ELIXIR).

Usage: 0
Performance: 1.0
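
Functional enrichment of this kind is commonly scored with a one-sided hypergeometric test. The sketch below shows only that statistic; it is not g:Profiler's implementation, it omits multiple-testing correction, and the function and parameter names are assumptions chosen for illustration.

```python
from math import comb


def hypergeom_tail(k, n, K, N):
    """P(X >= k): chance of seeing at least k annotated genes in a query.

    N: genes in the background, K: background genes carrying the term,
    n: genes in the query list, k: query genes carrying the term.
    """
    upper = min(n, K)
    total = comb(N, n)
    # comb() returns 0 when the lower argument exceeds the upper one,
    # so impossible draws contribute nothing to the sum.
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total
```

A small p-value suggests the term is over-represented in the query list relative to the background. With many terms tested at once, a correction step is still needed before interpreting any single value.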

GrainGenes — Wheat, Barley & Oat Genomics Database

dataset_discovery

USDA database for Triticeae (wheat tribe) genomics and genetics.

Usage: 0
Performance: 1.0

Gramene

data_retrieval

Gramene is a curated, open-source, data resource for comparative functional

Usage: 0
Performance: 1.0

Greengenes — 16S rRNA Gene Database

dataset_discovery

Chimera-checked 16S rRNA gene database for bacterial and archaeal taxonomy.

Usage: 0
Performance: 1.0

gtars

bioinformatics

High-performance toolkit for genomic interval analysis in Rust with Python bindings. Use when working with genomic regions, BED files, coverage tracks, overlap detection, tokenization for ML models, or fragment analysis in computational genomics and machine learning applications.

Usage: 0
Performance: 1.0
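
Overlap detection on BED-style intervals reduces to a simple comparison once coordinates are treated as 0-based and half-open. The following is a plain-Python sketch of that check, not the gtars API; the `Interval` type and function names are hypothetical, and a production toolkit would use an indexed structure instead of a linear scan.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive, as in BED


def overlaps(a, b):
    """True when two half-open intervals on the same chromosome intersect."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def find_overlaps(query, regions):
    """Linear scan returning every region that overlaps the query."""
    return [region for region in regions if overlaps(query, region)]
```

Because the end coordinate is exclusive, two intervals that merely touch (one ending at 200, the next starting at 200) are not reported as overlapping.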

GTEx Resource

expression_data

Genotype-Tissue Expression (GTEx) project data access.

Usage: 0
Performance: 1.0

gtex-tissue-expression

tool

Return median gene expression (TPM) across GTEx v10 tissues — 54 tissue types including 13 brain regions — for a given gene symbol, sorted by expression level. Use when tissue specificity matters (e.g. microglial TREM2, brain-restricted SNCA) or when a hypothesis makes an implicit tissue-of-action claim that must be checked.

Usage: 0
Performance: 1.0

Guide to Pharmacology

drug_discovery

Guide to Pharmacology (IUPHAR/BPS) - expert-curated resource of

Usage: 0
Performance: 1.0

GWAS Catalog

gene_annotation

NHGRI-EBI Catalog of human genome-wide association studies.

Usage: 0
Performance: 1.0

GWAS Catalog

data_retrieval

Genome-wide association study hits from the NHGRI-EBI GWAS Catalog. Query by gene or trait.

Usage: 0
Performance: 0.1

gwas-genetic-associations

tool

Return GWAS genetic associations (SNPs, risk alleles, p-values, trait names, study IDs) from the NHGRI-EBI GWAS Catalog for a gene or trait. Use when a Skeptic is probing genetic support for a candidate target or a Theorist needs SNP-level evidence for a mechanism claim.

Usage: 0
Performance: 0.9

Harmonizome

dataset_discovery

Access to 100+ integrated biological databases via Harmonizome.

Usage: 0
Performance: 1.0

HGNC (HUGO Gene Nomenclature Committee)

gene_annotation

Official gene symbol nomenclature and gene information.

Usage: 0
Performance: 1.0

HIPPIE

protein_annotation

HIPPIE - Human Integrated Protein-Protein Interaction rEference

Usage: 0
Performance: 1.0

histolab

clinical

Lightweight WSI tile extraction and preprocessing. Use for basic slide processing: tissue detection, tile extraction, and stain normalization for H&E images. Best for simple pipelines, dataset preparation, and quick tile-based analysis. For advanced spatial proteomics, multiplexed imaging, or deep learning pipelines use pathml.

Usage: 0
Performance: 1.0

HMDB (Human Metabolome Database)

drug_discovery

Comprehensive metabolite database with chemical, clinical, and biological data.

Usage: 0
Performance: 1.0

HMP — Human Microbiome Project Data Analysis and Coordination Center

data_retrieval

NIH Human Microbiome Project portal for multi-omic microbiome data.

Usage: 0
Performance: 1.0

HOCOMOCO

expression_data

HOCOMOCO - Homo sapiens comprehensive model collection of transcription

Usage: 0
Performance: 1.0

HPO Phenotype Associations

forge

Query the Human Phenotype Ontology (HPO) for phenotype-gene and disease-phenotype links.

Usage: 0
Performance: 1.0

HPRD

protein_annotation

Human Protein Reference Database - curated proteomic resource.

Usage: 0
Performance: 1.0

Human Cell Atlas (HCA) Data Portal

expression_data

The Human Cell Atlas is an international consortium creating comprehensive reference maps

Usage: 0
Performance: 1.0

Human Phenotype Ontology (HPO)

ontology

Standardized vocabulary for phenotypic abnormalities in human disease.

Usage: 0
Performance: 1.0

Human Protein Atlas

data_retrieval

Protein expression across human tissues and cell types from the Human Protein Atlas. Includes subcellular localisation.

Usage: 0
Performance: 0.1

Human Protein Atlas API wrapper.

expression_data

The Human Protein Atlas (HPA) provides comprehensive data on protein expression

Usage: 0
Performance: 1.0

hypogenic

multi_omics

Automated LLM-driven hypothesis generation and testing on tabular datasets. Use when you want to systematically explore hypotheses about patterns in empirical data (e.g., deception detection, content analysis). Combines literature insights with data-driven hypothesis testing. For manual hypothesis formulation use hypothesis-generation; for creative ideation use scientific-brainstorming.

Usage: 0
Performance: 1.0

hypothesis-generation

scientific_comm

Structured hypothesis formulation from observations. Use when you have experimental observations or data and need to formulate testable hypotheses with predictions, propose mechanisms, and design experiments to test them. Follows scientific method framework. For open-ended ideation use scientific-brainstorming; for automated LLM-driven hypothesis testing on datasets use hypogenic.

Usage: 0
Performance: 1.0

ICGC (International Cancer Genome Consortium) Data Portal

gene_annotation

ICGC provides access to cancer genomics data from multiple international projects.

Usage: 0
Performance: 1.0

Identifiers.org Resolution Service API wrapper.

data_retrieval

Identifiers.org provides persistent, resolvable identifiers for life science data.

Usage: 0
Performance: 1.0

IDR

dataset_discovery

Image Data Resource (IDR) - public repository for reference image datasets

Usage: 0
Performance: 1.0

IEDB (Immune Epitope Database) API wrapper for Forge.

dataset_discovery

The Immune Epitope Database (IEDB) is a free resource for epitope-related data

Usage: 0
Performance: 1.0

IHEC — International Human Epigenome Consortium

gene_annotation

Portal for accessing reference human epigenome data from international consortia.

Usage: 0
Performance: 1.0

imaging-data-commons

clinical

Query and download public cancer imaging data from NCI Imaging Data Commons using idc-index. Use for accessing large-scale radiology (CT, MR, PET) and pathology datasets for AI training or research. No authentication required. Query by metadata, visualize in browser, check licenses.

Usage: 0
Performance: 1.0

IMG — Integrated Microbial Genomes

gene_annotation

API wrapper for IMG/M (Integrated Microbial Genomes & Microbiomes).

Usage: 0
Performance: 1.0

IMG/M — Integrated Microbial Genomes & Microbiomes

dataset_discovery

Comprehensive database for microbial genome analysis and annotation from JGI.

Usage: 0
Performance: 1.0

IMGT (ImMunoGeneTics) API wrapper for Forge.

dataset_discovery

IMGT is the international reference database for immunogenetics and immunoinformatics.

Usage: 0
Performance: 1.0

ImmGen - Immunological Genome Project

expression_data

ImmGen generates a comprehensive database of gene expression and regulation in the

Usage: 0
Performance: 1.0

ImmPort (Immunology Database and Analysis Portal) API wrapper for Forge.

dataset_discovery

ImmPort is a repository of immunology data from NIH-funded research, providing

Usage: 0
Performance: 1.0

ImmuneSpace — Immunology Data Sharing Platform

dataset_discovery

Integrated database of human immunology studies with standardized analysis

Usage: 0
Performance: 1.0

IMPC (International Mouse Phenotyping Consortium) API wrapper for Forge.

protein_annotation

The IMPC is generating and phenotyping knockout mouse strains for every protein-coding

Usage: 0
Performance: 1.0

infographics

scientific_comm

Create professional infographics using Nano Banana Pro AI with smart iterative refinement. Uses Gemini 3 Pro for quality review. Integrates research-lookup and web search for accurate data. Supports 10 infographic types, 8 industry styles, and colorblind-safe palettes.

Usage: 0
Performance: 1.0

InnateDB — Innate Immunity Pathways and Interactions

pathway_analysis

Curated database of innate immunity genes, pathways, and interactions.

Usage: 0
Performance: 1.0

IntAct

protein_annotation

Protein-protein and molecular interaction database.

Usage: 0
Performance: 1.0

InterPro

protein_annotation

InterPro - Protein sequence analysis & classification.

Usage: 0
Performance: 1.0

iRefIndex

protein_annotation

iRefIndex - Consolidated protein-protein interaction database integrating

Usage: 0
Performance: 1.0

iso-13485-certification

clinical

Comprehensive toolkit for preparing ISO 13485 certification documentation for medical device Quality Management Systems. Use when users need help with ISO 13485 QMS documentation, including (1) conducting gap analysis of existing documentation, (2) creating Quality Manuals, (3) developing required procedures and work instructions, (4) preparing Medical Device Files, (5) understanding ISO 13485 requirements, or (6) identifying missing documentation for medical device certification. Also use when users mention medical device regulations, QMS certification, FDA QMSR, EU MDR, or need help with quality system documentation.

Usage: 0
Performance: 1.0

IUPred — Intrinsic Disorder Prediction

protein_annotation

Predict intrinsically disordered regions in proteins.

Usage: 0
Performance: 1.0

JASPAR

expression_data

JASPAR - transcription factor binding profile database.

Usage: 0
Performance: 1.0

JGI (Joint Genome Institute) Genome Portal

dataset_discovery

Provides access to JGI's genome database including fungal, plant, algal, and microbial genomes.

Usage: 0
Performance: 1.0

jPOST - Japan ProteOme STandard Repository

dataset_discovery

jPOST is a proteomics data repository developed in Japan that provides unified access

Usage: 0
Performance: 1.0

KEGG Disease Gene Associations

disease_gene_association

Query KEGG Disease database for disease entries and their causal/associated genes.

Usage: 0
Performance: 1.0

KEGG REST

pathway_analysis

Access to KEGG pathways, genes, compounds, and disease information.

Usage: 0
Performance: 1.0

labarchive-integration

lab_automation

Electronic lab notebook API integration. Access notebooks, manage entries and attachments, back up notebooks, and integrate with Protocols.io/Jupyter/REDCap for programmatic ELN workflows.

Usage: 0
Performance: 1.0

lamindb

multi_omics

This skill should be used when working with LaminDB, an open-source data framework for biology that makes data queryable, traceable, reproducible, and FAIR. Use when managing biological datasets (scRNA-seq, spatial, flow cytometry, etc.), tracking computational workflows, curating and validating data with biological ontologies, building data lakehouses, or ensuring data lineage and reproducibility in biological research. Covers data management, annotation, ontologies (genes, cell types, diseases, tissues), schema validation, integrations with workflow managers (Nextflow, Snakemake) and MLOps platforms (W&B, MLflow), and deployment strategies.

Usage: 0
Performance: 1.0

latchbio-integration

lab_automation

Latch platform for bioinformatics workflows. Build pipelines with Latch SDK, @workflow/@task decorators, deploy serverless workflows, LatchFile/LatchDir, Nextflow/Snakemake integration.

Usage: 0
Performance: 1.0

latex-posters

visualization

Create professional research posters in LaTeX using beamerposter, tikzposter, or baposter. Support for conference presentations, academic posters, and scientific communication. Includes layout design, color schemes, multi-column formats, figure integration, and poster-specific best practices for visual communication.

Usage: 0
Performance: 1.0

LegumeInfo

data_retrieval

Legume Information System - Genomic data for legume crops.

Usage: 0
Performance: 1.0

LINCS / CLUE (Library of Integrated Network-Based Cellular Signatures)

expression_data

Access to LINCS L1000 gene expression data and CLUE drug repurposing platform.

Usage: 0
Performance: 1.0

LipidMaps

dataset_discovery

Access to LIPID MAPS - comprehensive lipid database.

Usage: 0
Performance: 1.0

LIPID MAPS Lipid Search

metabolomics

Search LIPID MAPS for lipid structure, classification, and biological roles.

Usage: 0
Performance: 1.0

LitCovid API wrapper for Forge.

literature_search

LitCovid is NCBI's curated literature hub for tracking up-to-date scientific

Usage: 0
Performance: 1.0

literature-review

scientific_comm

Conduct comprehensive, systematic literature reviews using multiple academic databases (PubMed, arXiv, bioRxiv, Semantic Scholar, etc.). This skill should be used when conducting systematic literature reviews, meta-analyses, research synthesis, or comprehensive literature searches across biomedical, scientific, and technical domains. Creates professionally formatted markdown documents and PDFs with verified citations in multiple citation styles (APA, Nature, Vancouver, etc.).

Usage: 0
Performance: 1.0

Literature Search Aggregator

literature_search

Unified interface for searching scientific literature across multiple sources.

Usage: 0
Performance: 1.0

LncBase

dataset_discovery

LncBase - database of experimentally validated and computationally predicted

Usage: 0
Performance: 1.0

LncIPedia

dataset_discovery

LncIPedia: Long non-coding RNA database and annotation resource.

Usage: 0
Performance: 1.0

lncRNADisease

dataset_discovery

Database of experimentally validated lncRNA-disease associations.

Usage: 0
Performance: 1.0

LOVD

dataset_discovery

Leiden Open Variation Database (LOVD) is a flexible, freely available tool for

Usage: 0
Performance: 1.0

MaizeGDB — Maize Genetics and Genomics Database

dataset_discovery

Comprehensive maize research database with genomics, genetics, breeding,

Usage: 0
Performance: 1.0

MalaCards

dataset_discovery

MalaCards: Human disease database integrating 150+ sources.

Usage: 0
Performance: 1.0

markdown-mermaid-writing

scientific_comm

Comprehensive markdown and Mermaid diagram writing skill. Use when creating any scientific document, report, analysis, or visualization. Establishes text-based diagrams as the default documentation standard with full style guides (markdown + mermaid), 24 diagram type references, and 9 document templates.

Usage: 0
Performance: 1.0

market-research-reports

scientific_comm

Generate comprehensive market research reports (50+ pages) in the style of top consulting firms (McKinsey, BCG, Gartner). Features professional LaTeX formatting, extensive visual generation with scientific-schematics and generate-image, deep integration with research-lookup for data gathering, and multi-framework strategic analysis including Porter Five Forces, PESTLE, SWOT, TAM/SAM/SOM, and BCG Matrix.

Usage: 0
Performance: 1.0

markitdown

data_analysis

Convert files and office documents to Markdown. Supports PDF, DOCX, PPTX, XLSX, images (with OCR), audio (with transcription), HTML, CSV, JSON, XML, ZIP, YouTube URLs, EPubs and more.

Usage: 0
Performance: 1.0

MARRVEL — Model Organism Aggregated Resources for Rare Variant ExploreR

variant_annotation

Integrates human genetic data with model organism phenotypes and variants.

Usage: 0
Performance: 1.0

MassBank — Mass Spectrometry Database

dataset_discovery

API wrapper for MassBank, a community database of mass spectra.

Usage: 0
Performance: 1.0

MassIVE API wrapper for Forge.

network_analysis

MassIVE (Mass Spectrometry Interactive Virtual Environment) is a community-driven

Usage: 0
Performance: 1.0

matchms

proteomics

Spectral similarity and compound identification for metabolomics. Use for comparing mass spectra, computing similarity scores (cosine, modified cosine), and identifying unknown compounds from spectral libraries. Best for metabolite identification, spectral matching, library searching. For full LC-MS/MS proteomics pipelines use pyopenms.

Usage: 0
Performance: 1.0
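
The idea behind a cosine similarity score between two mass spectra can be shown with a simplified, binned version. This sketch is not the matchms API: it assigns peaks to fixed-width m/z bins instead of matching them within a tolerance, and the function names and the default `bin_width` are assumptions.

```python
from math import sqrt


def binned(peaks, bin_width=1.0):
    """Sum intensities of (mz, intensity) peaks into fixed-width m/z bins."""
    vector = {}
    for mz, intensity in peaks:
        key = int(mz // bin_width)
        vector[key] = vector.get(key, 0.0) + intensity
    return vector


def cosine_similarity(peaks_a, peaks_b, bin_width=1.0):
    """Cosine of the angle between two binned intensity vectors."""
    a, b = binned(peaks_a, bin_width), binned(peaks_b, bin_width)
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0
```

Identical spectra score close to 1 and spectra with no shared bins score 0. Fixed bins can split peaks that sit near a bin edge, which is one reason tolerance-based matching is preferred in practice.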

matlab

engineering

MATLAB and GNU Octave numerical computing for matrix operations, data analysis, visualization, and scientific computing. Use when writing MATLAB/Octave scripts for linear algebra, signal processing, image processing, differential equations, optimization, statistics, or creating scientific visualizations. Also use when the user needs help with MATLAB syntax, functions, or wants to convert between MATLAB and Python code. Scripts can be executed with MATLAB or the open-source GNU Octave interpreter.

Usage: 0
Performance: 1.0

matplotlib

visualization

Low-level plotting library for full customization. Use when you need fine-grained control over every plot element, creating novel plot types, or integrating with specific scientific workflows. Export to PNG/PDF/SVG for publication. For quick statistical plots use seaborn; for interactive plots use plotly; for publication-ready multi-panel figures with journal styling, use scientific-visualization.

Usage: 0
Performance: 1.0

medchem

general

Medicinal chemistry filters. Apply drug-likeness rules (Lipinski, Veber), PAINS filters, structural alerts, and complexity metrics for compound prioritization and library filtering.

Usage: 0
Performance: 1.0
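
As an illustration of a drug-likeness rule, the sketch below checks Lipinski's rule of five against descriptors that are assumed to have been computed elsewhere. It does not use the medchem library; the `Descriptors` class, its field names, and the function name are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed properties; computing them needs a cheminformatics toolkit."""
    mol_weight: float
    logp: float
    h_donors: int
    h_acceptors: int


def lipinski_violations(d):
    """Return the names of the rule-of-five criteria that the compound breaks."""
    checks = [
        ("mol_weight > 500", d.mol_weight > 500),
        ("logp > 5", d.logp > 5),
        ("h_donors > 5", d.h_donors > 5),
        ("h_acceptors > 10", d.h_acceptors > 10),
    ]
    return [name for name, failed in checks if failed]
```

Returning the list of violations, not a single pass/fail flag, lets a caller decide how strict to be; a common convention tolerates at most one violation.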

MEROPS

protein_annotation

Database of peptidases (proteases, proteinases) and their inhibitors.

Usage: 0
Performance: 1.0

MeSH (Medical Subject Headings)

literature_search

Access to MeSH - NLM's controlled vocabulary for indexing biomedical literature.

Usage: 0
Performance: 1.0

MetaboLights

dataset_discovery

MetaboLights - metabolomics experiments and data repository.

Usage: 0
Performance: 1.0

Metabolomics Workbench

data_retrieval

Access metabolomics data, studies, and metabolite information.

Usage: 0
Performance: 1.0

Metabolomics Workbench Search

dataset_discovery

Search the Metabolomics Workbench for public metabolomics studies.

Usage: 0
Performance: 1.0

MetaCyc

dataset_discovery

MetaCyc - comprehensive database of experimentally validated metabolic

Usage: 0
Performance: 1.0

Metascape

pathway_analysis

Gene annotation and pathway enrichment analysis.

Usage: 0
Performance: 1.0

MethBase Conservation

epigenetic_analysis

Get cross-species methylation conservation data for a gene.

Usage: 0
Performance: 1.0

MethBase Developmental Methylation

epigenetic_analysis

Get developmental stage-specific methylation dynamics for a gene.

Usage: 0
Performance: 1.0

MethBase Differential Analysis

epigenetic_analysis

Search for differential methylation between two conditions or cell states.

Usage: 0
Performance: 1.0

MethBase Gene Methylation

epigenetic_search

Search for DNA methylation studies and data for a specific gene.

Usage: 0
Performance: 1.0

MethBase Tissue Comparison

epigenetic_analysis

Compare methylation patterns across multiple tissues for a gene.

Usage: 0
Performance: 1.0

MethBase Tissue Methylation

epigenetic_analysis

Get tissue-specific methylation patterns for a gene.

Usage: 0
Performance: 1.0

MethyCancer Database

dataset_discovery

Database of DNA methylation in human cancer.

Usage: 0
Performance: 1.0

METLIN

dataset_discovery

Access to METLIN - Scripps metabolite database with MS/MS spectra.

Usage: 0
Performance: 1.0

MGI (Mouse Genome Informatics)

dataset_discovery

Access to MGI - comprehensive database of mouse genetic, genomic, and biological data.

Usage: 0
Performance: 1.0

MGnify

data_retrieval

EMBL-EBI's metagenomics analysis platform (formerly EBI Metagenomics).

Usage: 0
Performance: 1.0

MicrobeDB — Microbial Genome and Diversity Database

dataset_discovery

Comprehensive database of microbial genomes including bacteria and archaea.

Usage: 0
Performance: 1.0

MicrobiomeDB — Microbiome Research Database

dataset_discovery

Provides access to microbiome studies, sample data, and analysis tools.

Usage: 0
Performance: 1.0

MINT (Molecular INTeraction Database)

protein_annotation

MINT focuses on experimentally verified protein-protein interactions.

Usage: 0
Performance: 1.0

miRBase microRNA Database

dataset_discovery

Primary repository for microRNA sequences and annotations.

Usage: 0
Performance: 1.0

miRDB API wrapper for Forge.

dataset_discovery

miRDB is a database for microRNA target prediction and functional annotations.

Usage: 0
Performance: 1.0

miRTarBase API wrapper for Forge.

network_analysis

miRTarBase is a database of experimentally validated microRNA-target interactions.

Usage: 0
Performance: 1.0

MITOMAP

dataset_discovery

Human mitochondrial genome database of polymorphisms and mutations.

Usage: 0
Performance: 1.0

modal

infrastructure

Cloud computing platform for running Python on GPUs and serverless infrastructure. Use when deploying AI/ML models, running GPU-accelerated workloads, serving web endpoints, scheduling batch jobs, or scaling Python code to the cloud. Use this skill whenever the user mentions Modal, serverless GPU compute, deploying ML models to the cloud, serving inference endpoints, running batch processing in the cloud, or needs to scale Python workloads beyond their local machine. Also use when the user wants to run code on H100s, A100s, or other cloud GPUs, or needs to create a web API for a model.

Usage: 0
Performance: 1.0

ModBase

protein_annotation

ModBase is a database of annotated comparative protein structure models,

Usage: 0
Performance: 1.0

MODOMICS - RNA Modifications Database

pathway_analysis

Database of RNA modification pathways and modified nucleosides.

Usage: 0
Performance: 1.0

molecular-dynamics

cheminformatics

Run and analyze molecular dynamics simulations with OpenMM and MDAnalysis. Set up protein/small molecule systems, define force fields, run energy minimization and production MD, analyze trajectories (RMSD, RMSF, contact maps, free energy surfaces). For structural biology, drug binding, and biophysics.

Usage: 0
Performance: 1.0

molfeat

cheminformatics

Molecular featurization for ML (100+ featurizers). Covers ECFP, MACCS, descriptors, and pretrained models (ChemBERTa); converts SMILES to features for QSAR and molecular ML.

Usage: 0
Performance: 1.0

MoNA — MassBank of North America

dataset_discovery

Community-driven mass spectral repository for metabolomics.

Usage: 0
Performance: 1.0

Monarch Disease-Gene Associations

disease_gene_association

Query Monarch Initiative for disease-gene-phenotype associations from OMIM, ClinVar, HPO

Usage: 0
Performance: 0.8

Monarch Initiative

gene_annotation

Integrated cross-species gene-phenotype and disease data.

Usage: 0
Performance: 1.0

MONDO (Monarch Disease Ontology)

ontology

Comprehensive disease ontology integrating multiple disease terminologies.

Usage: 0
Performance: 1.0

Mouse Cell Atlas (MCA) - Comprehensive Mouse Single-Cell Atlas

expression_data

The Mouse Cell Atlas is a comprehensive reference atlas of mouse cell types across

Usage: 0
Performance: 1.0

MSigDB (Molecular Signatures Database) API wrapper for Forge.

dataset_discovery

MSigDB is a collection of annotated gene sets from the Broad Institute, widely used

Usage: 0
Performance: 1.0

Mutalyzer - Sequence Variant Nomenclature

variant_annotation

Tool for checking and formatting sequence variant descriptions according to HGVS.

Usage: 0
Performance: 1.0

MyChem.info

drug_discovery

Chemical and drug annotation service covering PubChem, ChEMBL, DrugBank, and more.

Usage: 0
Performance: 1.0

MyDisease.info

ontology

Disease annotation service covering MONDO, Disease Ontology, UMLS, and more.

Usage: 0
Performance: 1.0

MyGene.info

gene_annotation

High-performance gene query web service covering 27,000+ species.

Usage: 0
Performance: 1.0

MyVariant.info

variant_annotation

Variant annotation service covering dbSNP, ClinVar, dbNSFP, COSMIC, and more.

Usage: 0
Performance: 1.0

NCBI Assembly Database

dataset_discovery

Database of genome assemblies from NCBI.

Usage: 0
Performance: 1.0

NCBI dbSNP Variant Lookup

variant_annotation

Look up a variant in NCBI dbSNP for allele frequencies and annotations.

Usage: 0
Performance: 1.0

NCBI Gene Database

dataset_discovery

Query gene information from NCBI Gene database using Entrez eUtils.

Usage: 0
Performance: 1.0

NCBI GeneRIF Citations

literature_search

Retrieve NCBI Gene Reference Into Function (GeneRIF) linked publications.

Usage: 0
Performance: 1.0

NCBI Genetic Testing Registry (GTR) API wrapper.

gene_annotation

GTR provides access to information about genetic tests for inherited and somatic

Usage: 0
Performance: 1.0

NCBI MeSH Term Lookup

ontology_lookup

Look up NCBI Medical Subject Headings (MeSH) descriptors and synonyms.

Usage: 0
Performance: 1.0

NCBI SRA Dataset Search

dataset_discovery

Search the NCBI Sequence Read Archive (SRA) for public sequencing datasets.

Usage: 0
Performance: 1.0

NCBI Taxonomy

dataset_discovery

Access NCBI Taxonomy database - the comprehensive taxonomy database for all organisms

Usage: 0
Performance: 1.0

NCBO BioPortal

dataset_discovery

BioPortal is a repository of biomedical ontologies developed by the National Center

Usage: 0
Performance: 1.0

NCI Genomic Data Commons (GDC) API wrapper.

dataset_discovery

The GDC provides a unified data repository for cancer genomics programs including

Usage: 0
Performance: 1.0

NCI Pathway Interaction Database (PID)

pathway_analysis

Curated collection of biomolecular interactions and cellular processes.

Usage: 0
Performance: 1.0

NCI Thesaurus

data_retrieval

National Cancer Institute's comprehensive cancer terminology system.

Usage: 0
Performance: 1.0

NDEx

network_analysis

NDEx: Network Data Exchange for sharing biological networks.

Usage: 0
Performance: 1.0

NetNGlyc — N-Glycosylation Site Prediction

protein_annotation

Predict N-glycosylation sites in proteins.

Usage: 0
Performance: 1.0

NetPath

pathway_analysis

NetPath: Curated resource of signal transduction pathways in humans.

Usage: 0
Performance: 1.0

networkx

visualization

Comprehensive toolkit for creating, analyzing, and visualizing complex networks and graphs in Python. Use when working with network/graph data structures, analyzing relationships between entities, computing graph algorithms (shortest paths, centrality, clustering), detecting communities, generating synthetic networks, or visualizing network topologies. Applicable to social networks, biological networks, transportation systems, citation networks, and any domain involving pairwise relationships.

Usage: 0
Performance: 1.0

NeuroElectro — Neuron Electrophysiology Database

literature_search

Provides quantitative electrophysiological measurements from literature

Usage: 0
Performance: 1.0

neurokit2

neuroscience

Comprehensive biosignal processing toolkit for analyzing physiological data including ECG, EEG, EDA, RSP, PPG, EMG, and EOG signals. Use this skill when processing cardiovascular signals, brain activity, electrodermal responses, respiratory patterns, muscle activity, or eye movements. Applicable for heart rate variability analysis, event-related potentials, complexity measures, autonomic nervous system assessment, psychophysiology research, and multi-modal physiological signal integration.

Usage: 0
Performance: 1.0

NeuroMorpho.Org API Client

data_retrieval

World's largest collection of digitally reconstructed neuron morphologies.

Usage: 0
Performance: 1.0

neuropixels-analysis

neuroscience

Neuropixels neural recording analysis for Neuropixels 1.0/2.0 extracellular electrophysiology. Load SpikeGLX/Open Ephys data, then run preprocessing, motion correction, Kilosort4 spike sorting, quality metrics, Allen/IBL curation, and AI-assisted visual analysis. Use when working with neural recordings, spike sorting, extracellular electrophysiology, or when the user mentions Neuropixels, SpikeGLX, Open Ephys, Kilosort, quality metrics, or unit curation.

Usage: 0
Performance: 1.0

neXtProt REST

protein_annotation

Human protein knowledge platform - comprehensive annotations for human proteins.

Usage: 0
Performance: 1.0

OBIS (Ocean Biodiversity Information System) API wrapper for marine occurrence data.

data_retrieval

OBIS is a global open-access data system for marine biodiversity, containing

Usage: 0
Performance: 1.0

OFFSIDES

drug_discovery

OFFSIDES - database of drug side effects mined from FDA Adverse Event Reporting

Usage: 0
Performance: 1.0

O-GlcNAc Database API wrapper for O-linked glycosylation data.

dataset_discovery

The O-GlcNAc Database catalogs O-linked β-N-acetylglucosamine (O-GlcNAc)

Usage: 0
Performance: 1.0

OMA Browser

dataset_discovery

Orthologous MAtrix database for comparative genomics.

Usage: 0
Performance: 1.0

omero-integration

lab_automation

Microscopy data management platform. Access images via Python, retrieve datasets, analyze pixels, manage ROIs/annotations, batch processing, for high-content screening and microscopy workflows.

Usage: 0
Performance: 1.0

OMIM

gene_annotation

Online Mendelian Inheritance in Man - catalog of human genes and genetic disorders.

Usage: 0
Performance: 1.0

OmniPath

protein_annotation

OmniPath: Comprehensive database of protein-protein interactions, signaling,

Usage: 0
Performance: 1.0

OmniPath PTM Interactions

protein_interaction

Query OmniPath for post-translational modification (PTM) interactions.

Usage: 0
Performance: 1.0

OncoKB

data_retrieval

OncoKB - Precision Oncology Knowledge Base with therapeutic implications of

Usage: 0
Performance: 1.0

Ontology Lookup Service (OLS4)

ontology

Access to 200+ biomedical ontologies from EMBL-EBI.

Usage: 0
Performance: 1.0

OpenAlex

dataset_discovery

Comprehensive scholarly database covering 240M+ works, open access.

Usage: 0
Performance: 1.0

openalex-works

tool

Search OpenAlex for scholarly works with rich citation metadata, concept tags, and open-access status across 250M+ records, with sortable relevance/date/citation-count rankings. Use when PubMed or Semantic Scholar coverage is insufficient, when an evidence auditor needs citation-graph context for provenance, or when a replicator wants to find the broadest set of papers citing or cited by a target work.

Usage: 0
Performance: 0.1

OpenFDA

drug_discovery

FDA adverse event reports and drug information.

Usage: 0
Performance: 1.0

openfda-adverse-events

tool

Query the FDA Adverse Event Reporting System (FAERS) via openFDA for a drug's post-market adverse-event signals, ranked by report frequency, with optional filtering to a specific MedDRA reaction term. Use when assessing safety liabilities for drugs being considered for repurposing in neurodegeneration, when a falsifier needs real-world pharmacovigilance signals to challenge a safety claim, or when a domain expert needs CNS-toxicity or ARIA-like signals surfaced before scoring.

Usage: 0
Performance: 1.0
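
A minimal sketch of the kind of query described above, assuming the public openFDA drug event endpoint and its count syntax. The function names `build_faers_count_url` and `rank_reactions` are hypothetical and are not the Forge tool's actual interface; the sketch only builds the request URL and sorts a response that has already been fetched.

```python
from urllib.parse import urlencode

# Public openFDA drug adverse event endpoint.
OPENFDA_EVENT_URL = "https://api.fda.gov/drug/event.json"


def build_faers_count_url(drug_name, limit=10):
    """Build an openFDA URL that tallies reaction terms for one drug (hypothetical helper)."""
    params = {
        # Match reports whose drug list names the product of interest.
        "search": f'patient.drug.medicinalproduct:"{drug_name}"',
        # Ask openFDA to count reports per MedDRA reaction term.
        "count": "patient.reaction.reactionmeddrapt.exact",
        "limit": limit,
    }
    return f"{OPENFDA_EVENT_URL}?{urlencode(params)}"


def rank_reactions(payload, reaction_term=None):
    """Sort count rows by report frequency, optionally keeping a single reaction term."""
    rows = payload.get("results", [])
    if reaction_term is not None:
        wanted = reaction_term.upper()
        rows = [row for row in rows if row["term"].upper() == wanted]
    return sorted(rows, key=lambda row: row["count"], reverse=True)
```

Keeping URL construction separate from ranking lets the ranking step be checked offline against a saved response.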

openFDA Adverse Events

clinical_pharmacology

Query FDA Adverse Event Reporting System (FAERS) via openFDA API.

Usage: 0
Performance: 1.0

open-notebook

scientific_comm

Self-hosted, open-source alternative to Google NotebookLM for AI-powered research and document analysis. Use when organizing research materials into notebooks, ingesting diverse content sources (PDFs, videos, audio, web pages, Office documents), generating AI-powered notes and summaries, creating multi-speaker podcasts from research, chatting with documents using context-aware AI, searching across materials with full-text and vector search, or running custom content transformations. Supports 16+ AI providers including OpenAI, Anthropic, Google, Ollama, Groq, and Mistral with complete data privacy through self-hosting.

Usage: 0
Performance: 1.0

open-targets-associations

tool

Return gene-disease associations from Open Targets Platform (GraphQL) for a given gene symbol, resolved to Ensembl via MyGene.info, with disease name, EFO id, and Open Targets association score. Use when a Domain Expert or Theorist is probing target-disease links or when a debater wants a quantitative association score beyond raw PMID counts.

Usage: 0
Performance: 0.9

Open Targets Disease Gene Scoring

disease_gene_association

Get top-scored target genes for a disease from Open Targets Platform.

Usage: 0
Performance: 1.0

Open Targets Evidence

data_retrieval

Disease associations and therapeutic evidence for a gene from Open Targets Platform, scored across multiple evidence sources.

Usage: 0
Performance: 0.3

Open Targets Platform

drug_discovery

Target-disease evidence and drug discovery platform.

Usage: 0
Performance: 1.0

Open Targets Safety Liability

drug_safety

Query Open Targets for target safety liability evidence.

Usage: 0
Performance: 1.0

Open Targets Search

drug_target

Search Open Targets Platform for targets, diseases, drugs, and overall associations.

Usage: 0
Performance: 1.0

opentrons-integration

lab_automation

Official Opentrons Protocol API for OT-2 and Flex robots. Use when writing protocols specifically for Opentrons hardware with full access to Protocol API v2 features. Best for production Opentrons protocols, official API compatibility. For multi-vendor automation or broader equipment control use pylabrobot.

Usage: 0
Performance: 1.0

optimize-for-gpu

infrastructure

GPU-accelerate Python code using CuPy, Numba CUDA, Warp, cuDF, cuML, cuGraph, KvikIO, cuCIM, cuxfilter, cuVS, cuSpatial, and RAFT. Use whenever the user mentions GPU/CUDA/NVIDIA acceleration, or wants to speed up NumPy, pandas, scikit-learn, scikit-image, NetworkX, GeoPandas, or Faiss workloads. Covers physics simulation, differentiable rendering, mesh ray casting, particle systems (DEM/SPH/fluids), vector/similarity search, GPUDirect Storage file IO, interactive dashboards, geospatial analysis, medical imaging, and sparse eigensolvers. Also use when you see CPU-bound Python code (loops, large arrays, ML pipelines, graph analytics, image processing) that would benefit from GPU acceleration, even if not explicitly requested.

Usage: 0
Performance: 1.0

ORCID API Client

data_retrieval

Access researcher profiles, publications, and affiliations via ORCID.

Usage: 0
Performance: 1.0

ORegAnno

dataset_discovery

ORegAnno - Open Regulatory Annotation database of regulatory elements

Usage: 0
Performance: 1.0

Orphadata - Rare Disease Data

drug_discovery

Comprehensive dataset on rare diseases and orphan drugs from Orphanet.

Usage: 0
Performance: 1.0

Orphanet Data

drug_discovery

Orphanet is the reference portal for rare diseases and orphan drugs.

Usage: 0
Performance: 1.0

OrthoDB - Database of Orthologous Groups

dataset_discovery

OrthoDB is a comprehensive catalog of orthologs, genes inherited by speciation events,

Usage: 0
Performance: 1.0

Oxford Nanopore Community Data and Resources API wrapper.

data_retrieval

Oxford Nanopore Technologies (ONT) provides long-read sequencing technology with

Usage: 0
Performance: 1.0

PanglaoDB Cell Markers

cell_type_annotation

Get canonical cell type marker genes from PanglaoDB scRNA-seq database. Covers microglia, astrocytes, neurons, OPCs, DAM, oligodendrocytes.

Usage: 0
Performance: 1.0

PANTHER Database

protein_annotation

Protein ANalysis THrough Evolutionary Relationships - protein classification system.

Usage: 0
Performance: 1.0

Paperclip Search

literature_search

Search Paperclip MCP for biomedical papers (8M+ across arXiv, bioRxiv, PMC, OpenAlex, OSF). Uses hybrid BM25+embedding search with TL;DR summaries.

Usage: 0
Performance: 1.0

Paper Corpus Ingest

data_retrieval

Ingest a list of paper dicts into the local PaperCorpus cache for persistent storage. Each paper needs at least one ID (pmid, doi, or paper_id).

Usage: 0
Performance: 0.1

paper-corpus-search

tool

Unified multi-provider paper search over PubMed, Semantic Scholar, OpenAlex, and CrossRef with deduplication and local caching. Use when you need maximum recall across sources with a single paginated API and automatic dedupe by canonical identifiers (DOI/PMID).

Usage: 0
Performance: 0.8
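
Deduplication by canonical identifiers could look roughly like the sketch below. It assumes each paper is a dict with optional `doi` and `pmid` keys; the helper names and the normalization rules (lowercasing DOIs, stripping resolver prefixes) are illustrative assumptions, not the tool's documented behavior.

```python
def canonical_keys(paper):
    """Return the normalized identifiers (DOI and/or PMID) carried by one paper dict."""
    keys = []
    doi = str(paper.get("doi") or "").strip().lower()
    if doi:
        # Strip resolver prefixes so different spellings of one DOI compare equal.
        for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
            if doi.startswith(prefix):
                doi = doi[len(prefix):]
        keys.append(("doi", doi))
    pmid = str(paper.get("pmid") or "").strip()
    if pmid:
        keys.append(("pmid", pmid))
    return keys


def dedupe_papers(papers):
    """Keep the first record seen for each DOI or PMID; keep records with no identifier."""
    seen = set()
    unique = []
    for paper in papers:
        keys = canonical_keys(paper)
        if keys and any(key in seen for key in keys):
            continue  # already returned by an earlier provider
        seen.update(keys)
        unique.append(paper)
    return unique
```

Registering every identifier a record carries means a later record that has only the PMID is still matched against an earlier one that had both DOI and PMID.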

Paper Corpus Search

data_retrieval

Search across PubMed, Semantic Scholar, OpenAlex, and CrossRef with unified results and local caching. Use providers param to filter to specific sources.

Usage: 0
Performance: 0.1

Paper Corpus Session

data_retrieval

Start a stateful multi-page search session. Call again with incremented page param to fetch subsequent pages.

Usage: 0
Performance: 0.1
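
The "call again with incremented page param" pattern can be sketched as a small generator. Everything here is an assumption for illustration: `search_page` stands in for whatever callable performs the session request, pages are assumed to start at 1, and an empty page is assumed to mean the results are exhausted.

```python
def iterate_pages(search_page, query, max_pages=5):
    """Yield result pages, incrementing the page number until a page comes back empty."""
    page = 1
    while page <= max_pages:
        results = search_page(query, page)
        if not results:
            break  # no more results for this session
        yield results
        page += 1


def fake_search_page(query, page, page_size=2):
    """Hypothetical in-memory stand-in for the real session call, used only for the demo."""
    corpus = [f"{query} paper {i}" for i in range(1, 6)]
    start = (page - 1) * page_size
    return corpus[start:start + page_size]


if __name__ == "__main__":
    for number, batch in enumerate(iterate_pages(fake_search_page, "tau"), start=1):
        print(number, batch)
```

The `max_pages` cap is a guard against unbounded loops when a provider never returns an empty page.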

paper-figures

tool

Retrieve figures (labels, captions, image URLs) for a paper identified by PMID or canonical paper_id, checking the DB cache first and falling back to PMC BioC, Europe PMC full-text XML, open-access PDF extraction, and deep-link references. Use when a debater needs visual evidence (fig captions, micrographs, schematics) to ground or challenge a claim.

Usage: 0
Performance: 0.8

paper-lookup

scientific_comm

Search 10 academic paper databases via REST APIs for research papers, preprints, and scholarly articles. Covers PubMed, PMC (full text), bioRxiv, medRxiv, arXiv, OpenAlex, Crossref, Semantic Scholar, CORE, Unpaywall. Use when searching for papers, citations, DOI/PMID lookups, abstracts, full text, open access, preprints, citation graphs, author search, or any scholarly literature query. Triggers on mentions of any supported database or requests like "find papers on X" or "look up this DOI".

Usage: 0
Performance: 1.0

paperzilla

general

Chat with your agent about projects, recommendations, and canonical papers in Paperzilla. Use when users ask for recent project recommendations, canonical paper details, markdown-based summaries, recommendation feedback, feed export, or Atom feed URLs.

Usage: 0
Performance: 1.0

parallel-web

scientific_comm

All-in-one web toolkit powered by parallel-cli, with a strong emphasis on academic and scientific sources. Use this skill whenever the user needs to search the web, fetch/extract URL content, enrich data with web-sourced fields, or run deep research reports. Covers: web search (fast lookups, research, current info — prioritizing peer-reviewed papers, preprints, and scholarly databases), URL extraction (fetching pages, articles, academic PDFs), bulk data enrichment (adding fields to CSV/lists from the web), and deep research (exhaustive multi-source reports grounded in academic literature). Also handles setup, status checks, and result retrieval. Use this skill for ANY web-related task — even if the user doesn't mention 'parallel' or 'web' explicitly. If they want to look something up, fetch a page, enrich a dataset, investigate a topic, find academic papers, check citations, or review scientific literature, this is the skill to use.

Usage: 0
Performance: 1.0

PathBank

pathway_analysis

PathBank is a comprehensive, visual database of human metabolic and signaling pathways.

Usage: 0
Performance: 1.0

pathml

clinical

Full-featured computational pathology toolkit. Use for advanced WSI analysis including multiplexed immunofluorescence (CODEX, Vectra), nucleus segmentation, tissue graph construction, and ML model training on pathology data. Supports 160+ slide formats. For simple tile extraction from H&E slides, histolab may be simpler.

Usage: 0
Performance: 1.0

Pathway Commons

pathway_analysis

Integrated resource of biological pathway and interaction data.

Usage: 0
Performance: 1.0

Pathway Commons

pathway_analysis

Pathway Commons is an aggregated resource that integrates pathway and molecular

Usage: 0
Performance: 1.0

PDBe-KB (PDB Knowledge Base) API wrapper for Forge.

structure_prediction

PDBe-KB aggregates functional annotations and structural information for entries

Usage: 0
Performance: 1.0

pdf

data_analysis

Use this skill whenever the user wants to do anything with PDF files. This includes reading or extracting text/tables from PDFs, combining or merging multiple PDFs into one, splitting PDFs apart, rotating pages, adding watermarks, creating new PDFs, filling PDF forms, encrypting/decrypting PDFs, extracting images, and OCR on scanned PDFs to make them searchable. If the user mentions a .pdf file or asks to produce one, use this skill.

Usage: 0
Performance: 1.0

peer-review

scientific_comm

Structured manuscript/grant review with checklist-based evaluation. Use when writing formal peer reviews with specific criteria: methodology assessment, statistical validity, reporting standards compliance (CONSORT/STROBE), and constructive feedback. Best for actual review writing and manuscript revision. For evaluating claims/evidence quality use scientific-critical-thinking; for quantitative scoring frameworks use scholar-evaluation.

Usage: 0
Performance: 1.0

pennylane

quantum

Hardware-agnostic quantum ML framework with automatic differentiation. Use when training quantum circuits via gradients, building hybrid quantum-classical models, or needing device portability across IBM/Google/Rigetti/IonQ. Best for variational algorithms (VQE, QAOA), quantum neural networks, and integration with PyTorch/JAX/TensorFlow. For hardware-specific optimizations use qiskit (IBM) or cirq (Google); for open quantum systems use qutip.

Usage: 0
Performance: 1.0

PeptideAtlas

data_retrieval

Compendium of peptides identified in mass spectrometry proteomics experiments.

Usage: 0
Performance: 1.0

Pfam Database

protein_annotation

Protein families database with domain annotations and alignments.

Usage: 0
Performance: 1.0

PGS Catalog Polygenic Risk Scores

genetic_associations

Search PGS Catalog (EMBL-EBI) for published polygenic risk score (PRS) models for a disease. Returns multi-SNP scoring models with variant counts, effect weight methods, publication DOI/year, and FTP download links. Covers 47+ Alzheimer disease PRS, 11+ Parkinson disease PRS, and hundreds of cognitive/brain trait models. Complements GWAS tools (single variants) with complete polygenic models ready for individual risk stratification. Essential for precision medicine analyses in neurodegeneration.

Usage: 0
Performance: 1.0

PharmGKB Pharmacogenomics

drug_database

Query PharmGKB for pharmacogenomics drug-gene relationships. Returns clinical annotations linking genetic variants to drug response.

Usage: 0
Performance: 1.0

PharmGKB (Pharmacogenomics Knowledge Base)

pathway_analysis

Access pharmacogenomics data: drug-gene interactions, clinical annotations, pathways.

Usage: 0
Performance: 1.0

Pharos

drug_discovery

IDG (Illuminating the Druggable Genome) drug target database.

Usage: 0
Performance: 1.0

Phobius - Transmembrane Topology and Signal Peptide Predictor

protein_annotation

Predicts transmembrane topology and signal peptides in proteins.

Usage: 0
Performance: 1.0

PhosphoGrid

dataset_discovery

PhosphoGRID: Database of phosphorylation sites in yeast and human.

Usage: 0
Performance: 1.0

PhosphoSitePlus — Post-Translational Modification Database

dataset_discovery

API wrapper for PhosphoSitePlus (PSP), the most comprehensive PTM database.

Usage: 0
Performance: 1.0

phylogenetics

research_methodology

Build and analyze phylogenetic trees using MAFFT (multiple alignment), IQ-TREE 2 (maximum likelihood), and FastTree (fast NJ/ML). Visualize with ETE3 or FigTree. For evolutionary analysis, microbial genomics, viral phylodynamics, protein family analysis, and molecular clock studies.

Usage: 0
Performance: 1.0

Phytozome

data_retrieval

Phytozome - Plant comparative genomics portal from JGI.

Usage: 0
Performance: 1.0

PlantCyc API wrapper for plant metabolic pathways.

pathway_analysis

PlantCyc is a comprehensive database of plant metabolic pathways, enzymes,

Usage: 0
Performance: 1.0

PlantTFDB

expression_data

PlantTFDB - Plant Transcription Factor Database covering 165 plant species.

Usage: 0
Performance: 1.0

PLAZA

model_organism

PLAZA - comparative genomics platform for plant species. Integrates

Usage: 0
Performance: 1.0

polars

data_analysis

Fast in-memory DataFrame library for datasets that fit in RAM. Use when pandas is too slow but data still fits in memory. Lazy evaluation, parallel execution, Apache Arrow backend. Best for 1-100GB datasets, ETL pipelines, faster pandas replacement. For larger-than-RAM data use dask or vaex.

Usage: 0
Performance: 1.0

polars-bio

bioinformatics

High-performance genomic interval operations and bioinformatics file I/O on Polars DataFrames. Overlap, nearest, merge, coverage, complement, subtract for BED/VCF/BAM/GFF intervals. Streaming, cloud-native, faster bioframe alternative.

Usage: 0
Performance: 1.0

PomBase

dataset_discovery

PomBase is the model organism database for the fission yeast Schizosaccharomyces pombe.

Usage: 0
Performance: 1.0

pptx

visualization

Use this skill any time a .pptx file is involved in any way (as input, output, or both). This includes: creating slide decks, pitch decks, or presentations; reading, parsing, or extracting text from any .pptx file (even if the extracted content will be used elsewhere, like in an email or summary); editing, modifying, or updating existing presentations; combining or splitting slide files; working with templates, layouts, speaker notes, or comments. Trigger whenever the user mentions "deck," "slides," "presentation," or references a .pptx filename, regardless of what they plan to do with the content afterward. If a .pptx file needs to be opened, created, or touched, use this skill.

Usage: 0
Performance: 1.0

pptx-posters

visualization

Create research posters using HTML/CSS that can be exported to PDF or PPTX. Use this skill ONLY when the user explicitly requests PowerPoint/PPTX poster format. For standard research posters, use latex-posters instead. This skill provides modern web-based poster design with responsive layouts and easy visual integration.

Usage: 0
Performance: 1.0

PRIDE (Proteomics Identifications Database)

dataset_discovery

Access to PRIDE - proteomics data repository and resource.

Usage: 0
Performance: 1.0

primekg

multi_omics

Query the Precision Medicine Knowledge Graph (PrimeKG) for multiscale biological data including genes, drugs, diseases, phenotypes, and more.

Usage: 0
Performance: 1.0

PROSITE API wrapper for Forge.

protein_annotation

PROSITE is a protein domain database from the Swiss Institute of Bioinformatics (SIB).

Usage: 0
Performance: 1.0

ProteomicsDB

protein_annotation

ProteomicsDB is a protein-centric in-memory database for exploring the

Usage: 0
Performance: 1.0

protocolsio-integration

lab_automation

Integration with protocols.io API for managing scientific protocols. This skill should be used when working with protocols.io to search, create, update, or publish protocols; manage protocol steps and materials; handle discussions and comments; organize workspaces; upload and manage files; or integrate protocols.io functionality into workflows. Applicable for protocol discovery, collaborative protocol development, experiment tracking, lab protocol management, and scientific documentation.

Usage: 0
Performance: 1.0

ProtozoaDB — Protozoan Parasite Genomics Resource

dataset_discovery

Comprehensive genomics database for parasitic protozoa causing major diseases.

Usage: 0
Performance: 1.0

ProtParam — Protein Parameter Calculation

protein_annotation

Calculate physical and chemical parameters of proteins.

Usage: 0
Performance: 1.0

Protter - Protein Topology Visualization

protein_annotation

Interactive protein topology and structure visualization tool.

Usage: 0
Performance: 1.0

PseudoCAP - Pseudomonas aeruginosa Community Annotation Project.

gene_annotation

PseudoCAP provides genome annotation and analysis tools for P. aeruginosa,

Usage: 0
Performance: 1.0

PSIPRED — Protein Secondary Structure Prediction

protein_annotation

Predict alpha-helices, beta-strands, and coils from sequence.

Usage: 0
Performance: 1.0

PTMsigDB

dataset_discovery

PTMsigDB: Database of post-translational modification signatures.

Usage: 0
Performance: 1.0

PubChem Database

drug_discovery

Access chemical compounds, bioassays, and substance data from PubChem.

Usage: 0
Performance: 1.0

PubChem Target BioAssays

drug_discovery

Find bioassay-confirmed active compounds against a protein target in PubChem.

Usage: 0
Performance: 1.0

PubMed E-utilities

literature_search

Free access to PubMed/MEDLINE database (35M+ biomedical citations).

Usage: 0
Performance: 1.0

pubmed-search

tool

Search PubMed (NCBI E-utilities) and return paper details with PMIDs, titles, authors, journal, year, and DOI. Use when an agent needs citable literature evidence for a hypothesis, counter-evidence, or prior-art search — especially when a PMID is required downstream (cite_artifact actions, KG edge provenance, evidence audits).

Usage: 0
Performance: 1.0
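
For orientation, a PubMed search through NCBI E-utilities starts with an ESearch request that returns PMIDs. The sketch below only builds that URL and reads PMIDs out of an already-fetched JSON response; the helper names are hypothetical and the Forge tool's real parameters may differ.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(term, retmax=20):
    """Build an ESearch URL that asks PubMed for matching PMIDs as JSON (hypothetical helper)."""
    params = {
        "db": "pubmed",      # search the PubMed database
        "term": term,        # free-text or fielded query
        "retmax": retmax,    # maximum number of PMIDs to return
        "retmode": "json",   # request a JSON response body
    }
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def extract_pmids(esearch_json):
    """Pull the PMID list out of a parsed ESearch JSON response."""
    return esearch_json.get("esearchresult", {}).get("idlist", [])
```

Titles, authors, journal, year, and DOI are not part of the ESearch response; they would come from a follow-up request for the returned PMIDs.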

PubMed Search

data_retrieval

Search PubMed for papers by keyword. Returns titles, authors, journals, PMIDs.

Usage: 0
Performance: 0.1

PubTator3 Gene/Disease Annotations

literature_annotation

Extract standardized gene, disease, and variant mentions from PubMed via PubTator3.

Usage: 0
Performance: 1.0

PubTator Central API wrapper for Forge.

data_retrieval

PubTator Central is NCBI's text mining tool that automatically annotates

Usage: 0
Performance: 1.0

pufferlib

ml_ai

High-performance reinforcement learning framework optimized for speed and scale. Use when you need fast parallel training, vectorized environments, multi-agent systems, or integration with game environments (Atari, Procgen, NetHack). Achieves 2-10x speedups over standard implementations. For quick prototyping or standard algorithm implementations with extensive documentation, use stable-baselines3 instead.

Usage: 0
Performance: 1.0

pydeseq2

general

Differential gene expression analysis (Python DESeq2). Identify DE genes from bulk RNA-seq counts, Wald tests, FDR correction, volcano/MA plots, for RNA-seq analysis.

Usage: 0
Performance: 1.0

pydicom

clinical

Python library for working with DICOM (Digital Imaging and Communications in Medicine) files. Use this skill when reading, writing, or modifying medical imaging data in DICOM format, extracting pixel data from medical images (CT, MRI, X-ray, ultrasound), anonymizing DICOM files, working with DICOM metadata and tags, converting DICOM images to other formats, handling compressed DICOM data, or processing medical imaging datasets. Applies to tasks involving medical image analysis, PACS systems, radiology workflows, and healthcare imaging applications.

Usage: 0
Performance: 1.0

pyhealth

healthcare_ai

Comprehensive healthcare AI toolkit for developing, testing, and deploying machine learning models with clinical data. This skill should be used when working with electronic health records (EHR), clinical prediction tasks (mortality, readmission, drug recommendation), medical coding systems (ICD, NDC, ATC), physiological signals (EEG, ECG), healthcare datasets (MIMIC-III/IV, eICU, OMOP), or implementing deep learning models for healthcare applications (RETAIN, SafeDrug, Transformer, GNN).

Usage: 0
Performance: 1.0

pylabrobot

lab_automation

Vendor-agnostic lab automation framework. Use when controlling multiple equipment types (Hamilton, Tecan, Opentrons, plate readers, pumps) or needing unified programming across different vendors. Best for complex workflows, multi-vendor setups, simulation. For Opentrons-only protocols with official API, opentrons-integration may be simpler.

Usage: 0
Performance: 1.0

pymatgen

cheminformatics

Materials science toolkit. Crystal structures (CIF, POSCAR), phase diagrams, band structure, DOS, Materials Project integration, format conversion, for computational materials science.

Usage: 0
Performance: 1.0

pymc

ml_ai

Bayesian modeling with PyMC. Build hierarchical models, MCMC (NUTS), variational inference, LOO/WAIC comparison, posterior checks, for probabilistic programming and inference.

Usage: 0
Performance: 1.0

pymoo

ml_ai

Multi-objective optimization framework. NSGA-II, NSGA-III, MOEA/D, Pareto fronts, constraint handling, benchmarks (ZDT, DTLZ), for engineering design and optimization problems.

Usage: 0
Performance: 1.0

pyopenms

proteomics

Complete mass spectrometry analysis platform. Use for proteomics workflows (feature detection, peptide identification, protein quantification) and complex LC-MS/MS pipelines. Supports extensive file formats and algorithms. Best for proteomics and comprehensive MS data processing. For simple spectral comparison and metabolite ID use matchms.

Usage: 0
Performance: 1.0

PyPI (Python Package Index) API wrapper.

dataset_discovery

PyPI is the official repository of Python packages, hosting 500,000+ projects

Usage: 0
Performance: 1.0

pysam

bioinformatics

Genomic file toolkit. Read/write SAM/BAM/CRAM alignments, VCF/BCF variants, FASTA/FASTQ sequences, extract regions, calculate coverage, for NGS data processing pipelines.

Usage: 0
Performance: 1.0

pytdc

general

Therapeutics Data Commons. AI-ready drug discovery datasets (ADME, toxicity, DTI), benchmarks, scaffold splits, molecular oracles, for therapeutic ML and pharmacological prediction.

Usage: 0
Performance: 1.0

pytorch-lightning

ml_ai

Deep learning framework (PyTorch Lightning). Organize PyTorch code into LightningModules, configure Trainers for multi-GPU/TPU, implement data pipelines, callbacks, logging (W&B, TensorBoard), distributed training (DDP, FSDP, DeepSpeed), for scalable neural network training.

Usage: 0
Performance: 1.0

pyzotero

general

Interact with Zotero reference management libraries using the pyzotero Python client. Retrieve, create, update, and delete items, collections, tags, and attachments via the Zotero Web API v3. Use this skill when working with Zotero libraries programmatically, managing bibliographic references, exporting citations, searching library contents, uploading PDF attachments, or building research automation workflows that integrate with Zotero.

Usage: 0
Performance: 1.0

qiskit

quantum

IBM quantum computing framework. Use when targeting IBM Quantum hardware, working with Qiskit Runtime for production workloads, or needing IBM optimization tools. Best for IBM hardware execution, quantum error mitigation, and enterprise quantum computing. For Google hardware use cirq; for gradient-based quantum ML use pennylane; for open quantum system simulations use qutip.

Usage: 0
Performance: 1.0

qutip

quantum

Quantum physics simulation library for open quantum systems. Use when studying master equations, Lindblad dynamics, decoherence, quantum optics, or cavity QED. Best for physics research, open system dynamics, and educational simulations. NOT for circuit-based quantum computing—use qiskit, cirq, or pennylane for quantum algorithms and hardware execution.

Usage: 0
Performance: 1.0

RCSB PDB REST

protein_annotation

Access 3D protein structures from the Protein Data Bank.

Usage: 0
Performance: 1.0

rdkit

cheminformatics

Cheminformatics toolkit for fine-grained molecular control. SMILES/SDF parsing, descriptors (MW, LogP, TPSA), fingerprints, substructure search, 2D/3D generation, similarity, reactions. For standard workflows with simpler interface, use datamol (wrapper around RDKit). Use rdkit for advanced control, custom sanitization, specialized algorithms.

Usage: 0
Performance: 1.0

Reactome Pathway Database

pathway_analysis

Access to Reactome pathways, reactions, and biological processes.

Usage: 0
Performance: 1.0

reactome-pathways

tool

Return Reactome pathways containing a given human gene, with pathway hierarchy and species info, resolved via UniProt cross-references. Use when a hypothesis invokes a specific pathway or when a domain expert needs to place a target in its curated pathway context before scoring druggability, feasibility, or mechanistic plausibility.

Usage: 0
Performance: 1.0

Reactome Pathways

data_retrieval

Look up biological pathways a gene participates in, from Reactome.

Usage: 0
Performance: 0.1

Reactome Pathway Search

pathway_analysis

Search Reactome for pathways by name and return their constituent genes.

Usage: 0
Performance: 1.0

Recount3 — Uniformly Reprocessed RNA-seq Data

expression_data

Reanalysis of public RNA-seq data with consistent pipeline.

Usage: 0
Performance: 1.0

RefSeq API wrapper for Forge.

dataset_discovery

RefSeq (Reference Sequence Database) provides curated, non-redundant reference

Usage: 0
Performance: 1.0

RegNetwork

expression_data

RegNetwork - database of transcriptional and post-transcriptional regulatory

Usage: 0
Performance: 1.0

RegulomeDB

dataset_discovery

RegulomeDB - Database of regulatory variants and their functional impact.

Usage: 0
Performance: 1.0

RegulonDB

expression_data

RegulonDB is a comprehensive database of transcriptional regulation in E. coli K-12.

Usage: 0
Performance: 1.0

research-grants

scientific_comm

Write competitive research proposals for NSF, NIH, DOE, DARPA, and Taiwan NSTC. Agency-specific formatting, review criteria, budget preparation, broader impacts, significance statements, innovation narratives, and compliance with submission requirements.

Usage: 0
Performance: 1.0

research-lookup

scientific_comm

Look up current research information using parallel-cli search (primary, fast web search), the Parallel Chat API (deep research), or Perplexity sonar-pro-search (academic paper searches). Automatically routes queries to the best backend. Use for finding papers, gathering research data, and verifying scientific information.

Usage: 0
Performance: 1.0

research-topic

tool

Convenience meta-tool that fans out a topic query to PubMed, Semantic Scholar, and ClinicalTrials.gov in one call and packages the results into a single research brief with total_evidence count, ready to feed to an LLM agent. Use when a Theorist or other agent wants a broad first-pass scan rather than a targeted single-source search.

Usage: 0
Performance: 0.8
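
The fan-out described above can be illustrated with a small aggregator. This is a sketch under stated assumptions: `research_brief` and the shape of the returned dict are hypothetical (only the `total_evidence` field name comes from the description), and each source is passed in as a plain callable rather than a real PubMed, Semantic Scholar, or ClinicalTrials.gov client.

```python
def research_brief(topic, sources):
    """Send one topic to every source callable and merge the results into a single brief."""
    brief = {"topic": topic, "results": {}, "errors": {}}
    for name, fetch in sources.items():
        try:
            brief["results"][name] = list(fetch(topic))
        except Exception as exc:
            # One failing source should not sink the whole brief.
            brief["errors"][name] = str(exc)
    # Total number of evidence items gathered across all sources that answered.
    brief["total_evidence"] = sum(len(items) for items in brief["results"].values())
    return brief
```

Recording failures per source, instead of raising, keeps a first-pass scan usable when one upstream API is down.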

Retraction Check

literature_search

Check if a PMID corresponds to a retracted paper via Retraction Watch.

Usage: 0
Performance: 1.0

Rfam

dataset_discovery

Access to Rfam database - RNA families database of structural RNA alignments.

Usage: 0
Performance: 1.0

RGD (Rat Genome Database)

dataset_discovery

Query rat gene information, orthologs, QTLs, and strains.

Usage: 0
Performance: 1.0

RiceBase — Rice Genetics and Genomics Database

dataset_discovery

Comprehensive rice research database with genomics, genetics, breeding,

Usage: 0
Performance: 1.0

RNA 3D Hub - RNA 3D Structure Database

structure_prediction

Database and analysis platform for RNA 3D structures.

Usage: 0
Performance: 1.0

RNAcentral

dataset_discovery

Comprehensive non-coding RNA sequence database.

Usage: 0
Performance: 1.0

Roadmap Epigenomics

data_retrieval

NIH Roadmap Epigenomics Mapping Consortium data portal.

Usage: 0
Performance: 1.0

rowan

cheminformatics

Rowan is a cloud-native molecular modeling and medicinal-chemistry workflow platform with a Python API. Use for pKa and macropKa prediction, conformer and tautomer ensembles, docking and analogue docking, protein-ligand cofolding, MSA generation, molecular dynamics, permeability, descriptor workflows, and related small-molecule or protein modeling tasks. Ideal for programmatic batch screening, multi-step chemistry pipelines, and workflows that would otherwise require maintaining local HPC/GPU infrastructure.

Usage: 0
Performance: 1.0

SABIO-RK

drug_discovery

SABIO-RK is a biochemical reaction kinetics database containing kinetic data

Usage: 0
Performance: 1.0

SASDB

data_retrieval

SASDB - Small Angle Scattering Biological Data Bank for structural data

Usage: 0
Performance: 1.0

scanpy

bioinformatics

Standard single-cell RNA-seq analysis pipeline. Use for QC, normalization, dimensionality reduction (PCA/UMAP/t-SNE), clustering, differential expression, and visualization. Best for exploratory scRNA-seq analysis with established workflows. For deep learning models use scvi-tools; for data format questions use anndata.

Usage: 0
Performance: 1.0

ScanSite - Protein motif and phosphorylation site prediction.

protein_annotation

ScanSite predicts protein-protein interaction sites, kinase phosphorylation sites,

Usage: 0
Performance: 1.0

scholar-evaluation

scientific_comm

Systematically evaluate scholarly work using the ScholarEval framework, providing structured assessment across research quality dimensions including problem formulation, methodology, analysis, and writing with quantitative scoring and actionable feedback.

Usage: 0
Performance: 1.0

SciDEX Scientific Tools

data_retrieval

Wrapper functions for free scientific APIs.

Usage: 0
Performance: 1.0

scientific-brainstorming

scientific_comm

Creative research ideation and exploration. Use for open-ended brainstorming sessions, exploring interdisciplinary connections, challenging assumptions, or identifying research gaps. Best for early-stage research planning when you do not have specific observations yet. For formulating testable hypotheses from data use hypothesis-generation.

Usage: 0
Performance: 1.0

scientific-critical-thinking

scientific_comm

Evaluate scientific claims and evidence quality. Use for assessing experimental design validity, identifying biases and confounders, applying evidence grading frameworks (GRADE, Cochrane Risk of Bias), or teaching critical analysis. Best for understanding evidence quality and identifying flaws. For formal peer review writing use peer-review.

Usage: 0
Performance: 1.0

scientific-schematics

visualization

Create publication-quality scientific diagrams using Nano Banana 2 AI with smart iterative refinement. Uses Gemini 3.1 Pro Preview for quality review. Only regenerates if quality is below threshold for your document type. Specialized in neural network architectures, system diagrams, flowcharts, biological pathways, and complex scientific visualizations.

Usage: 0
Performance: 1.0

scientific-slides

visualization

Build slide decks and presentations for research talks. Use this for making PowerPoint slides, conference presentations, seminar talks, research presentations, thesis defense slides, or any scientific talk. Provides slide structure, design templates, timing guidance, and visual validation. Works with PowerPoint and LaTeX Beamer.

Usage: 0
Performance: 1.0

scientific-visualization

visualization

Meta-skill for publication-ready figures. Use when creating journal submission figures requiring multi-panel layouts, significance annotations, error bars, colorblind-safe palettes, and specific journal formatting (Nature, Science, Cell). Orchestrates matplotlib/seaborn/plotly with publication styles. For quick exploration use seaborn or plotly directly.

Usage: 0
Performance: 1.0

scientific-writing

scientific_comm

Core skill for the deep research and writing tool. Write scientific manuscripts in full paragraphs (never bullet points). Use a two-stage process: (1) draft section outlines with key points using research-lookup, then (2) convert them to flowing prose. Covers IMRAD structure, citations (APA/AMA/Vancouver), figures/tables, and reporting guidelines (CONSORT/STROBE/PRISMA) for research papers and journal submissions.

Usage: 0
Performance: 1.0

scikit-bio

bioinformatics

Biological data toolkit. Sequence analysis, alignments, phylogenetic trees, diversity metrics (alpha/beta, UniFrac), ordination (PCoA), PERMANOVA, FASTA/Newick I/O, for microbiome analysis.

Usage: 0
Performance: 1.0

scikit-learn

ml_ai

Machine learning in Python with scikit-learn. Use when working with supervised learning (classification, regression), unsupervised learning (clustering, dimensionality reduction), model evaluation, hyperparameter tuning, preprocessing, or building ML pipelines. Provides comprehensive reference documentation for algorithms, preprocessing techniques, pipelines, and best practices.

Usage: 0
Performance: 1.0

scikit-survival

ml_ai

Comprehensive toolkit for survival analysis and time-to-event modeling in Python using scikit-survival. Use this skill when working with censored survival data, performing time-to-event analysis, fitting Cox models, Random Survival Forests, Gradient Boosting models, or Survival SVMs, evaluating survival predictions with concordance index or Brier score, handling competing risks, or implementing any survival analysis workflow with the scikit-survival library.

Usage: 0
Performance: 1.0
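
To make the concordance index mentioned above concrete, here is a simplified plain-Python sketch of Harrell's C-index. It is not scikit-survival's implementation: it ignores tied event times, and the function name and argument layout are chosen for illustration only.

```python
def concordance_index(times, events, risks):
    """Simplified Harrell's C-index.

    times  : observed time for each subject
    events : True if the event was observed, False if censored
    risks  : predicted risk score (higher = expected to fail sooner)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0   # earlier failure had the higher risk
                elif risks[i] == risks[j]:
                    concordant += 0.5   # tied predictions count half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 1.0 means every comparable pair is ordered correctly, and 0.5 is what random risk scores would achieve on average.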

SCOP — Structural Classification of Proteins

protein_annotation

API for the SCOP database (Structural Classification of Proteins).

Usage: 0
Performance: 1.0

scvelo

bioinformatics

RNA velocity analysis with scVelo. Estimate cell state transitions from unspliced/spliced mRNA dynamics, infer trajectory directions, compute latent time, and identify driver genes in single-cell RNA-seq data. Complements Scanpy/scVI-tools for trajectory inference.

Usage: 0
Performance: 1.0

scvi-tools

bioinformatics

Deep generative models for single-cell omics. Use when you need probabilistic batch correction (scVI), transfer learning, differential expression with uncertainty, or multi-modal integration (TOTALVI, MultiVI). Best for advanced modeling, batch effects, multimodal data. For standard analysis pipelines use scanpy.

Usage: 0
Performance: 1.0

seaborn

visualization

Statistical visualization with pandas integration. Use for quick exploration of distributions, relationships, and categorical comparisons with attractive defaults. Best for box plots, violin plots, pair plots, heatmaps. Built on matplotlib. For interactive plots use plotly; for publication styling use scientific-visualization.

Usage: 0
Performance: 1.0

SeaLifeBase API wrapper for non-fish marine organisms.

model_organism

SeaLifeBase is the marine counterpart to FishBase, covering 100,000+ species

Usage: 0
Performance: 1.0

Search Annotations

annotation

Search SciDEX annotations by URI, user, tags, or motivation.

Usage: 0
Performance: 1.0

search-trials

tool

Search ClinicalTrials.gov (API v2) for clinical trials and return NCT IDs, titles, status, phase, conditions, interventions, sponsors, enrollment counts, and start/completion dates. Use when a hypothesis touches translational feasibility, a Domain Expert is sizing the competitive landscape, or a Skeptic is probing replicability of a drug/target program in humans.

Usage: 0
Performance: 1.0
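
As a rough sketch of what a search-trials call might involve, the snippet below builds a ClinicalTrials.gov API v2 query URL and flattens one study record into a few of the fields listed above. The helper names are hypothetical, and the nested response layout reflects my understanding of the v2 schema rather than anything stated in this listing, so check it against the live API before relying on it.

```python
from urllib.parse import urlencode

BASE_URL = "https://clinicaltrials.gov/api/v2/studies"

def build_trials_url(condition, intervention=None, page_size=10):
    """Build a v2 search URL for a condition and optional intervention."""
    params = {"query.cond": condition, "pageSize": page_size}
    if intervention:
        params["query.intr"] = intervention
    return f"{BASE_URL}?{urlencode(params)}"

def summarize_study(study):
    """Flatten one study record; missing modules yield None or empty values."""
    proto = study.get("protocolSection", {})
    ident = proto.get("identificationModule", {})
    status = proto.get("statusModule", {})
    design = proto.get("designModule", {})
    return {
        "nct_id": ident.get("nctId"),
        "title": ident.get("briefTitle"),
        "status": status.get("overallStatus"),
        "phases": design.get("phases", []),
    }
```

Using `.get` with defaults throughout keeps the summary robust to records that omit a module, which matters when results feed directly into an agent prompt.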

Semantic Scholar

literature_search

Academic paper search and citation graph from Allen AI (200M+ papers).

Usage: 0
Performance: 1.0

Semantic Scholar Author Papers

literature_fetch

Fetch papers by a given author.

Usage: 0
Performance: 1.0

Semantic Scholar Author Profile

literature_fetch

Fetch author details including h-index, citation count, paper count.

Usage: 0
Performance: 1.0

Semantic Scholar Batch Paper Details

literature_fetch

Batch fetch details for multiple papers.

Usage: 0
Performance: 1.0

Semantic Scholar Citation Network

network_analysis

Fetch papers that cite the given paper (citation network).

Usage: 0
Performance: 1.0

Semantic Scholar Influential Citations

literature_search

Fetch only highly influential citations for a paper.

Usage: 0
Performance: 1.0

Semantic Scholar Paper Details

literature_fetch

Fetch full details for a single paper by ID.

Usage: 0
Performance: 1.0

Semantic Scholar Paper Recommendations

literature_fetch

Fetch recommended similar papers for a given paper.

Usage: 0
Performance: 1.0

Semantic Scholar Reference Network

network_analysis

Fetch papers referenced by the given paper (reference network).

Usage: 0
Performance: 1.0

semantic-scholar-search

tool

Search Semantic Scholar for scholarly papers and return enriched records with paperId, title, authors, year, citation count, abstract, TLDR summary, venue, and cross-walked PMID/DOI identifiers. Use when citation-weighted ranking, TLDR summaries, or influential-citation counts matter more than raw PubMed recency.

Usage: 0
Performance: 1.0
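
The citation-weighted ranking and identifier cross-walk mentioned above could look something like this. It is a sketch under assumptions: the helper names are hypothetical, the sort key is one plausible choice rather than the tool's documented behaviour, and the field names follow the public Semantic Scholar Graph API as I understand it.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://api.semanticscholar.org/graph/v1/paper/search"
FIELDS = "title,year,venue,citationCount,influentialCitationCount,externalIds"

def build_search_url(query, limit=20):
    """Build a paper-search URL requesting only the fields used below."""
    params = {"query": query, "fields": FIELDS, "limit": limit}
    return f"{SEARCH_URL}?{urlencode(params)}"

def rank_by_citations(papers):
    """Order papers by influential citations, then total citations, then year."""
    def key(paper):
        return (
            paper.get("influentialCitationCount") or 0,
            paper.get("citationCount") or 0,
            paper.get("year") or 0,
        )
    return sorted(papers, key=key, reverse=True)

def crosswalk_ids(paper):
    """Pull PMID and DOI out of the externalIds mapping, if present."""
    ext = paper.get("externalIds") or {}
    return {"pmid": ext.get("PubMed"), "doi": ext.get("DOI")}
```

The `or 0` guards treat missing or null counts as zero, so sparsely annotated records sort last instead of raising a comparison error.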

SGD (Saccharomyces Genome Database)

dataset_discovery

Access to SGD - comprehensive database for yeast genetics and genomics.

Usage: 0
Performance: 1.0

shap

ml_ai

Model interpretability and explainability using SHAP (SHapley Additive exPlanations). Use this skill when explaining machine learning model predictions, computing feature importance, generating SHAP plots (waterfall, beeswarm, bar, scatter, force, heatmap), debugging models, analyzing model bias or fairness, comparing models, or implementing explainable AI. Works with tree-based models (XGBoost, LightGBM, Random Forest), deep learning (TensorFlow, PyTorch), linear models, and any black-box model.

Usage: 0
Performance: 1.0

SIDER (Side Effect Resource)

drug_discovery

Access to SIDER - database of marketed drugs and their recorded adverse drug reactions.

Usage: 0
Performance: 1.0

SignaLink

pathway_analysis

SignaLink is an integrated resource for studying signaling pathway cross-talk,

Usage: 0
Performance: 1.0

SignalP — Signal Peptide Prediction

protein_annotation

Predict signal peptides and cleavage sites in protein sequences.

Usage: 0
Performance: 1.0

SIGNOR (SIGnaling Network Open Resource)

pathway_analysis

Curated database of signaling pathways and regulatory interactions.

Usage: 0
Performance: 1.0

SILVA — Ribosomal RNA Database

dataset_discovery

Comprehensive quality-checked and aligned ribosomal RNA sequence database.

Usage: 0
Performance: 1.0

simpy

engineering

Process-based discrete-event simulation framework in Python. Use this skill when building simulations of systems with processes, queues, resources, and time-based events such as manufacturing systems, service operations, network traffic, logistics, or any system where entities interact with shared resources over time.

Usage: 0
Performance: 1.0

SMART

protein_annotation

Simple Modular Architecture Research Tool - protein domain annotation database.

Usage: 0
Performance: 1.0

SMPDB (Small Molecule Pathway Database) API wrapper.

structure_prediction

SMPDB provides detailed, fully annotated pathway diagrams for small molecule

Usage: 0
Performance: 1.0

SOL Genomics Network

network_analysis

SOL (Solanaceae) Genomics Network - Genomic data for tomato, potato, pepper, etc.

Usage: 0
Performance: 1.0

SoyBase — USDA Soybean Genetics and Genomics Database

dataset_discovery

Comprehensive soybean research database with genomics, genetics, breeding,

Usage: 0
Performance: 1.0

SRA (Sequence Read Archive) API wrapper for Forge.

dataset_discovery

NCBI's Sequence Read Archive (SRA) is the largest publicly available repository of

Usage: 0
Performance: 1.0

stable-baselines3

ml_ai

Production-ready reinforcement learning algorithms (PPO, SAC, DQN, TD3, DDPG, A2C) with scikit-learn-like API. Use for standard RL experiments, quick prototyping, and well-documented algorithm implementations. Best for single-agent RL with Gymnasium environments. For high-performance parallel training, multi-agent systems, or custom vectorized environments, use pufferlib instead.

Usage: 0
Performance: 1.0

statistical-analysis

data_analysis

Guided statistical analysis with test selection and reporting. Use when you need help choosing appropriate tests for your data, assumption checking, power analysis, and APA-formatted results. Best for academic research reporting, test selection guidance. For implementing specific models programmatically use statsmodels.

Usage: 0
Performance: 1.0

statsmodels

ml_ai

Statistical models library for Python. Use when you need specific model classes (OLS, GLM, mixed models, ARIMA) with detailed diagnostics, residuals, and inference. Best for econometrics, time series, rigorous inference with coefficient tables. For guided statistical test selection with APA reporting use statistical-analysis.

Usage: 0
Performance: 1.0

STITCH (Search Tool for InTeracting CHemicals)

protein_annotation

Access to STITCH - database of chemical-protein interactions.

Usage: 0
Performance: 1.0

STRING Database

protein_annotation

Protein-protein interaction networks and functional associations.

Usage: 0
Performance: 1.0

STRING Functional Network

network_analysis

Query STRING DB for a functional interaction network across a gene set.

Usage: 0
Performance: 1.0

string-protein-interactions

tool

Query STRING-DB for protein-protein interactions among a list of gene symbols, returning scored edges with evidence types for either a physical or functional network at a configurable confidence threshold. Use when a hypothesis invokes a protein interaction or pathway neighbourhood, when network-level plausibility must be assessed before promotion, or when building a small subgraph around a target for a debate artifact.

Usage: 0
Performance: 0.9
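
A hedged sketch of the query this tool describes: building a STRING network request for a gene list with a choice of physical or functional network and a confidence threshold, then filtering the returned edges. The helper names are hypothetical; the endpoint, the `required_score` scale of 0 to 1000, and the `preferredName_A`/`preferredName_B`/`score` edge fields follow the public STRING API as I understand it.

```python
from urllib.parse import urlencode

STRING_NETWORK_URL = "https://string-db.org/api/json/network"

def build_network_url(genes, species=9606, min_confidence=0.7,
                      network_type="functional"):
    """Build a STRING network URL; identifiers are carriage-return separated."""
    if network_type not in ("functional", "physical"):
        raise ValueError("network_type must be 'functional' or 'physical'")
    params = {
        "identifiers": "\r".join(genes),
        "species": species,
        # STRING expects the threshold on a 0-1000 scale.
        "required_score": int(round(min_confidence * 1000)),
        "network_type": network_type,
    }
    return f"{STRING_NETWORK_URL}?{urlencode(params)}"

def filter_edges(edges, min_confidence):
    """Keep edges at or above the threshold as (gene_a, gene_b, score) tuples."""
    return [
        (e["preferredName_A"], e["preferredName_B"], e["score"])
        for e in edges
        if e["score"] >= min_confidence
    ]
```

Filtering client-side as well as via `required_score` is redundant on a live call, but it makes the threshold explicit when edges come from a cache.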

STRING Protein Interactions

data_retrieval

Find physical protein-protein interactions from the STRING database. Enter 2+ gene symbols.

Usage: 0
Performance: 0.1

SuperDRUG2 — Comprehensive Drug Database

structure_prediction

Database of approved drugs with 3D structures, targets, and conformations.

Usage: 0
Performance: 1.0

SuperTarget

network_analysis

SuperTarget - database of drug-target interactions with information on

Usage: 0
Performance: 1.0

SwissLipids - Lipid Structure and Knowledge Database

structure_prediction

Comprehensive database of lipid structures, nomenclature, and biology.

Usage: 0
Performance: 1.0

sympy

engineering

Use this skill when working with symbolic mathematics in Python. This skill should be used for symbolic computation tasks including solving equations algebraically, performing calculus operations (derivatives, integrals, limits), manipulating algebraic expressions, working with matrices symbolically, physics calculations, number theory problems, geometry computations, and generating executable code from mathematical expressions. Apply this skill when the user needs exact symbolic results rather than numerical approximations, or when working with mathematical formulas that contain variables and parameters.

Usage: 0
Performance: 1.0

synthetic_a

api_wrapper

Synthetic test tool

Usage: 0
Performance: 0.1

synthetic_b

api_wrapper

Synthetic test tool

Usage: 0
Performance: 0.1

synthetic_c

api_wrapper

Synthetic test tool

Usage: 0
Performance: 0.1

Tabula Sapiens - Comprehensive human tissue atlas.

expression_data

Tabula Sapiens is a benchmark single-cell transcriptomic atlas of ~500,000

Usage: 0
Performance: 1.0

TAIR (The Arabidopsis Information Resource) API wrapper for Forge.

dataset_discovery

TAIR is the primary database for the model plant Arabidopsis thaliana,

Usage: 0
Performance: 1.0

TarBase

dataset_discovery

TarBase: Database of experimentally validated microRNA targets.

Usage: 0
Performance: 1.0

TargetP — Subcellular Localization Prediction

protein_annotation

Predict subcellular localization of proteins.

Usage: 0
Performance: 1.0

TargetScan API wrapper for Forge.

data_retrieval

TargetScan predicts biological targets of miRNAs by searching for the presence of

Usage: 0
Performance: 1.0

TCDB

dataset_discovery

Transporter Classification Database - comprehensive classification of membrane transport systems.

Usage: 0
Performance: 1.0

TCGA (The Cancer Genome Atlas) API wrapper for Forge.

gene_annotation

TCGA generated comprehensive molecular profiles of more than 20,000 primary cancers

Usage: 0
Performance: 1.0

Therapeutic Target Database (TTD)

protein_annotation

TTD is a database providing information about therapeutic protein and nucleic acid

Usage: 0
Performance: 1.0

tiledbvcf

bioinformatics

Efficient storage and retrieval of genomic variant data using TileDB. Scalable VCF/BCF ingestion, incremental sample addition, compressed storage, parallel queries, and export capabilities for population genomics.

Usage: 0
Performance: 1.0

timesfm-forecasting

ml_ai

Zero-shot time series forecasting with Google's TimesFM foundation model. Use for any univariate time series (sales, sensors, energy, vitals, weather) without training a custom model. Supports CSV/DataFrame/array inputs with point forecasts and prediction intervals. Includes a preflight system checker script to verify RAM/GPU before first use.

Usage: 0
Performance: 1.0

TimeTree

dataset_discovery

Phylogenetic timing database - evolutionary divergence times between species.

Usage: 0
Performance: 1.0

TMHMM — Transmembrane Helix Prediction

protein_annotation

Predict transmembrane helices in protein sequences.

Usage: 0
Performance: 1.0

Tool Validation and Testing Framework

data_retrieval

Tests all Forge tools to ensure they follow standards and can import correctly.

Usage: 0
Performance: 1.0

TOPCONS API wrapper for protein topology prediction by consensus.

protein_annotation

TOPCONS (Topology Consensus) combines multiple prediction methods to provide

Usage: 0
Performance: 1.0

TopDB - Membrane Protein Topology Database

protein_annotation

Database of transmembrane protein topology annotations.

Usage: 0
Performance: 1.0

torchdrug

cheminformatics

PyTorch-native graph neural networks for molecules and proteins. Use when building custom GNN architectures for drug discovery, protein modeling, or knowledge graph reasoning. Best for custom model development, protein property prediction, retrosynthesis. For pre-trained models and diverse featurizers use deepchem; for benchmark datasets use pytdc.

Usage: 0
Performance: 1.0

torch-geometric

ml_ai

Guide for building Graph Neural Networks with PyTorch Geometric (PyG). Use this skill whenever the user asks about graph neural networks, GNNs, node classification, link prediction, graph classification, message passing networks, heterogeneous graphs, neighbor sampling, or any task involving torch_geometric / PyG. Also trigger when you see imports from torch_geometric, or the user mentions graph convolutions (GCN, GAT, GraphSAGE, GIN), graph data structures, or working with relational/network data. Even if the user just says 'graph learning' or 'geometric deep learning', use this skill.

Usage: 0
Performance: 1.0

Train Model Version

model_training

Fine-tune a model artifact on a dataset artifact via GPU sandbox. Produces a new model version with parent lineage, code commit SHA, and eval metrics captured. Use dry_run=True to validate and estimate cost before launching.

Usage: 0
Performance: 1.0

transformers

ml_ai

This skill should be used when working with pre-trained transformer models for natural language processing, computer vision, audio, or multimodal tasks. Use for text generation, classification, question answering, translation, summarization, image classification, object detection, speech recognition, and fine-tuning models on custom datasets.

Usage: 0
Performance: 1.0

treatment-plans

clinical

Generate concise (3-4 page), focused medical treatment plans in LaTeX/PDF format for all clinical specialties. Supports general medical treatment, rehabilitation therapy, mental health care, chronic disease management, perioperative care, and pain management. Includes SMART goal frameworks, evidence-based interventions with minimal text citations, regulatory compliance (HIPAA), and professional formatting. Prioritizes brevity and clinical actionability.

Usage: 0
Performance: 1.0

TreeBASE — Phylogenetic Tree Database

dataset_discovery

Repository of phylogenetic trees and the data matrices used to generate them,

Usage: 0
Performance: 1.0

TreeFam

dataset_discovery

TreeFam (Tree families database) is a database of phylogenetic trees of gene families

Usage: 0
Performance: 1.0

TRRUST

expression_data

TRRUST - Transcriptional Regulatory Relationships Unraveled by Sentence-based

Usage: 0
Performance: 1.0

Uberon Anatomy Ontology API wrapper via OLS (Ontology Lookup Service).

ontology

Uberon is an integrated cross-species anatomy ontology covering animals and

Usage: 0
Performance: 1.0

UCSC Genome Browser

gene_annotation

Access genome assemblies, annotations, and sequence data.

Usage: 0
Performance: 1.0

UK Biobank — Large-scale Population Biobank

gene_annotation

World's largest biobank with deep genetic and phenotypic data from 500,000+ participants.

Usage: 0
Performance: 1.0

umap-learn

ml_ai

UMAP dimensionality reduction. Fast nonlinear manifold learning for 2D/3D visualization, clustering preprocessing (HDBSCAN), supervised/parametric UMAP, for high-dimensional data.

Usage: 0
Performance: 1.0

Unified Search API for Forge

protein_annotation

Federated search across multiple Forge tools for genes, drugs, diseases, and proteins.

Usage: 0
Performance: 1.0

Unimod — Universal Protein Modifications Database

protein_annotation

Standard reference for protein post-translational modifications (PTMs) in mass spectrometry.

Usage: 0
Performance: 1.0

uniprot-protein-info

tool

Retrieve comprehensive UniProt (Swiss-Prot) annotation for a human protein by gene symbol or accession — function, subcellular location, domains, post-translational modifications, interaction count, and disease associations. Use when a hypothesis needs authoritative protein-level grounding, when mechanism claims must be checked against curated biology, or when a downstream tool requires a canonical UniProt accession.

Usage: 0
Performance: 1.0
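
The "gene symbol or accession" lookup could be routed along these lines. This is only a sketch: the helper is hypothetical, the query restricts to reviewed (Swiss-Prot) human entries as the description suggests, and the URL patterns follow the public UniProt REST API as I understand it. The accession pattern is the one UniProt publishes; a few six-character gene symbols can also match it, so a real tool might need a fallback.

```python
import re
from urllib.parse import urlencode

UNIPROT_BASE = "https://rest.uniprot.org/uniprotkb"

# Published UniProt accession pattern (6- or 10-character accessions).
ACCESSION_RE = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9]([A-Z][A-Z0-9]{2}[0-9]){1,2}"
)

def build_uniprot_url(identifier, taxon_id=9606):
    """Return an entry URL for an accession, else a reviewed-entry gene search."""
    identifier = identifier.strip().upper()
    if ACCESSION_RE.fullmatch(identifier):
        return f"{UNIPROT_BASE}/{identifier}.json"
    query = (
        f"(gene_exact:{identifier}) AND (organism_id:{taxon_id}) "
        "AND (reviewed:true)"
    )
    params = {"query": query, "format": "json", "size": 1}
    return f"{UNIPROT_BASE}/search?{urlencode(params)}"
```

Resolving to a single reviewed entry first gives downstream tools the canonical accession that the description says they may require.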

UniProt Protein Info

data_retrieval

Comprehensive protein annotation from UniProt/Swiss-Prot: function, domains, subcellular location, disease associations.

Usage: 0
Performance: 0.1

UniProt REST

protein_annotation

Access protein sequence and functional information from UniProtKB.

Usage: 0
Performance: 1.0

usfiscaldata

database_access

Query the U.S. Treasury Fiscal Data API for federal financial data including national debt, government spending, revenue, interest rates, exchange rates, and savings bonds. Access 54 datasets and 182 data tables with no API key required. Use when working with U.S. federal fiscal data, national debt tracking (Debt to the Penny), Daily Treasury Statements, Monthly Treasury Statements, Treasury securities auctions, interest rates on Treasury securities, foreign exchange rates, savings bonds, or any U.S. government financial statistics.

Usage: 0
Performance: 1.0
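
As an illustration of the keyless access described above, the following sketch builds a Debt to the Penny query and converts the returned rows to numbers. The helper names are hypothetical; the base URL, endpoint path, field names, and the `filter`/`sort`/`page[size]` parameters follow the public Fiscal Data API as I understand it and should be confirmed against its documentation.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.fiscaldata.treasury.gov/services/api/fiscal_service"
DEBT_ENDPOINT = "/v2/accounting/od/debt_to_penny"

def build_debt_url(since, page_size=5):
    """Build a query for total public debt on or after `since` (YYYY-MM-DD)."""
    params = {
        "fields": "record_date,tot_pub_debt_out_amt",
        "filter": f"record_date:gte:{since}",
        "sort": "-record_date",          # newest first
        "page[size]": page_size,
    }
    # Keep the API's own delimiters readable instead of percent-encoding them.
    return f"{BASE_URL}{DEBT_ENDPOINT}?{urlencode(params, safe=':,[]')}"

def parse_debt_rows(payload):
    """Convert the string-valued rows under 'data' into (date, float) pairs."""
    return [
        (row["record_date"], float(row["tot_pub_debt_out_amt"]))
        for row in payload.get("data", [])
    ]
```

The API returns numeric values as strings, so the explicit `float` conversion is needed before any arithmetic.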

Utilities

gene_annotation

General utility functions for biological data translation and conversion.

Usage: 0
Performance: 1.0

vaex

data_analysis

Use this skill for processing and analyzing large tabular datasets (billions of rows) that exceed available RAM. Vaex excels at out-of-core DataFrame operations, lazy evaluation, fast aggregations, efficient visualization of big data, and machine learning on large datasets. Apply when users need to work with large CSV/HDF5/Arrow/Parquet files, perform fast statistics on massive datasets, create visualizations of big data, or build ML pipelines that do not fit in memory.

Usage: 0
Performance: 1.0

VarSome Clinical Variant Interpretation API wrapper.

variant_annotation

VarSome is a knowledge-driven variant interpretation platform that aggregates

Usage: 0
Performance: 1.0

VDJdb - T-Cell Receptor Database

dataset_discovery

Database of T-cell receptor (TCR) sequences with known antigen specificities.

Usage: 0
Performance: 1.0

VectorBase

lab_resource

VectorBase is a NIAID Bioinformatics Resource Center providing genomic and

Usage: 0
Performance: 1.0

venue-templates

scientific_comm

Access comprehensive LaTeX templates, formatting requirements, and submission guidelines for major scientific publication venues (Nature, Science, PLOS, IEEE, ACM), academic conferences (NeurIPS, ICML, CVPR, CHI), research posters, and grant proposals (NSF, NIH, DOE, DARPA). Use this skill when preparing manuscripts for journal submission, conference papers, research posters, or grant proposals that need venue-specific formatting requirements and templates.

Usage: 0
Performance: 1.0

VFDB (Virulence Factor Database) API Client

dataset_discovery

Comprehensive database of virulence factors (VFs) from bacterial pathogens.

Usage: 0
Performance: 1.0

ViPR (Virus Pathogen Resource)

data_retrieval

Bioinformatics resource center for viral pathogens.

Usage: 0
Performance: 1.0

ViralZone — SIB Swiss Institute of Bioinformatics Virus Knowledge Resource

data_retrieval

Comprehensive virology resource with virus biology, replication cycles,

Usage: 0
Performance: 1.0

what-if-oracle

scientific_comm

Run structured What-If scenario analysis with multi-branch possibility exploration. Use this skill when the user asks speculative questions like "what if...", "what would happen if...", "what are the possibilities", "explore scenarios", "scenario analysis", "possibility space", "what could go wrong", "best case / worst case", "risk analysis", "contingency planning", "strategic options", or any question about uncertain futures. Also trigger when the user faces a fork-in-the-road decision, wants to stress-test an idea, or needs to think through consequences before committing.

Usage: 0
Performance: 1.0

WheatBase — Wheat Genetics and Genomics Database

dataset_discovery

Comprehensive wheat research database with genomics, genetics, breeding,

Usage: 0
Performance: 1.0

WikiPathways

pathway_analysis

Access to WikiPathways - community-curated biological pathway database.

Usage: 0
Performance: 1.0

WormBase (C. elegans)

dataset_discovery

Access to WormBase - comprehensive database for C. elegans genetics and genomics.

Usage: 0
Performance: 1.0

WoRMS (World Register of Marine Species) API wrapper for marine taxonomy.

dataset_discovery

WoRMS is the authoritative taxonomic database for marine organisms, containing

Usage: 0
Performance: 1.0

XenBase

dataset_discovery

XenBase is the model organism database for Xenopus laevis and Xenopus tropicalis

Usage: 0
Performance: 1.0

xlsx

data_analysis

Use this skill any time a spreadsheet file is the primary input or output. This means any task where the user wants to: open, read, edit, or fix an existing .xlsx, .xlsm, .csv, or .tsv file (e.g., adding columns, computing formulas, formatting, charting, cleaning messy data); create a new spreadsheet from scratch or from other data sources; or convert between tabular file formats. Trigger especially when the user references a spreadsheet file by name or path, even casually (like "the xlsx in my downloads"), and wants something done to it or produced from it. Also trigger for cleaning or restructuring messy tabular data files (malformed rows, misplaced headers, junk data) into proper spreadsheets. The deliverable must be a spreadsheet file. Do NOT trigger when the primary deliverable is a Word document, HTML report, standalone Python script, database pipeline, or Google Sheets API integration, even if tabular data is involved.

Usage: 0
Performance: 1.0

YMDB — Yeast Metabolome Database

dataset_discovery

Comprehensive database of small molecule metabolites found in Saccharomyces cerevisiae.

Usage: 0
Performance: 1.0

zarr-python

bioinformatics

Chunked N-D arrays for cloud storage. Compressed arrays, parallel I/O, S3/GCS integration, NumPy/Dask/Xarray compatible, for large-scale scientific computing pipelines.

Usage: 0
Performance: 1.0

Zenodo API Client

data_retrieval

Access research data, software, and publications from Zenodo.

Usage: 0
Performance: 1.0

ZFIN (Zebrafish)

network_analysis

Access to ZFIN - Zebrafish Information Network database.

Usage: 0
Performance: 1.0

ZINC Database

drug_discovery

Drug discovery database with millions of commercially available compounds.

Usage: 0
Performance: 1.0