Experiment: Multi-Ethnic Parkinson's Disease GWAS
Executive Summary
This experiment outlines a comprehensive multi-ethnic genome-wide association study (GWAS) designed to identify population-specific and shared genetic risk factors for Parkinson's disease (PD) across diverse ancestry groups. The study addresses a critical gap in PD genetics research, where approximately 95% of GWAS data has been derived from European-ancestry populations, leaving substantial portions of global genetic diversity uncharacterized[@nalls2019]. By systematically investigating genetic risk factors across European, East Asian, African, South Asian, Latin American, and Middle Eastern populations, this experiment aims to uncover novel risk loci, improve polygenic risk score (PRS) accuracy for underrepresented populations, and advance precision medicine approaches that benefit all patients with PD regardless of ancestry background.
The experimental design incorporates rigorous quality control protocols, state-of-the-art imputation methodologies using multi-ancestry reference panels, trans-ethnic meta-analysis approaches, and machine learning-based polygenic risk score optimization. The study is positioned to significantly advance our understanding of the shared and population-specific genetic architecture of PD while addressing critical health equity concerns in genetic research.
Research Hypothesis and Objectives
Primary Hypothesis
The Ethnicity-Specific Genetic Architecture Hypothesis proposes that Parkinson's disease risk is influenced by both shared genetic variants conserved across populations and population-specific variants that have arisen through demographic history, founder effects, and adaptive selection. This hypothesis predicts that multi-ethnic GWAS will reveal: (1) risk loci with consistent effects across all ancestries representing core PD biology, (2) population-specific variants with effects limited to particular genetic backgrounds, and (3) variants with differential effect sizes across populations due to linkage disequilibrium (LD) structure differences and gene-environment interactions.
Primary Objectives
Secondary Objectives
Study Design
Population Cohort Assembly
The experimental design encompasses multiple geographically diverse cohorts representing the major continental ancestry groups. Each population stratum includes carefully phenotyped PD cases and neurologically healthy controls to ensure adequate statistical power for association testing.
| Ancestry Group | Target Cases | Target Controls | Data Sources | Minimum Power |
|---------------|--------------|-----------------|-------------|---------------|
| European | 15,000 | 30,000 | IPDGC, GP2, UK Biobank | 0.90 |
| East Asian | 5,000 | 10,000 | J-PDGC, Taiwan Biobank, Korean cohorts | 0.80 |
| African | 3,000 | 6,000 | IPDGC-Africa, African American cohorts | 0.75 |
| South Asian | 2,000 | 4,000 | Indian PD registries | 0.70 |
| Latin American | 1,500 | 3,000 | LASPD, multi-country cohorts | 0.70 |
| Middle Eastern | 1,000 | 2,000 | Regional PD registries | 0.65 |
| Ashkenazi Jewish | 500 | 1,000 | Specialized AJ PD registries | 0.60 |
The sample size targets are derived from power calculations assuming an additive genetic model, allele frequencies ranging from 0.01 to 0.50, and odds ratios of 1.15-1.35 for typical GWAS-discovered variants. These targets represent substantial increases over historical cohorts and reflect the growing international collaboration in PD genetics research.
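The kind of power calculation described above can be sketched as follows. This is a minimal approximation, not the study's own calculation, whose exact settings are not given here. It assumes a two-sided Wald test on the per-allele log odds ratio, treats the supplied allele frequency as the control frequency, and uses the genome-wide threshold of 5×10⁻⁸. The function name `gwas_power` and the example inputs are illustrative.

```python
from math import log, sqrt
from statistics import NormalDist


def gwas_power(n_cases, n_controls, maf, odds_ratio, alpha=5e-8):
    """Approximate power of a two-sided Wald test on the per-allele log odds ratio."""
    # Risk-allele frequency in cases implied by the control frequency and the OR.
    odds_cases = odds_ratio * maf / (1 - maf)
    p_cases = odds_cases / (1 + odds_cases)
    # Large-sample variance of the log odds ratio from allele counts
    # (each person contributes two alleles).
    var = (1 / (2 * n_cases * p_cases * (1 - p_cases))
           + 1 / (2 * n_controls * maf * (1 - maf)))
    z_effect = log(odds_ratio) / sqrt(var)
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    return std_normal.cdf(z_effect - z_crit) + std_normal.cdf(-z_effect - z_crit)


# Illustrative inputs: largest and smallest strata at one assumed variant.
european = gwas_power(15_000, 30_000, maf=0.20, odds_ratio=1.15)
ashkenazi = gwas_power(500, 1_000, maf=0.20, odds_ratio=1.15)
print(f"European stratum: {european:.3f}")
print(f"Ashkenazi Jewish stratum: {ashkenazi:.5f}")
```

Results from a sketch like this are very sensitive to the assumed allele frequency, effect size, and significance threshold, so it is best used to compare strata under a common set of assumptions.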
Phenotype Definition and Quality Assurance
PD Case Definition: Cases meet UK Brain Bank or Movement Disorder Society (MDS) clinical diagnostic criteria for Parkinson's disease, confirmed by board-certified neurologists with movement disorder specialization. All cases have documented disease duration of at least one year to ensure diagnostic accuracy.
Control Definition: Controls are neurologically healthy individuals without PD symptoms or family history of PD in first-degree relatives, matched to cases by ancestry group, sex, and age within 5-year bins.
Phenotype Harmonization: A standardized phenotyping protocol ensures consistency across sites:
Inclusion and Exclusion Criteria
Inclusion Criteria:
- Age 35-90 years at enrollment
- Clinical diagnosis of idiopathic Parkinson's disease (UK Brain Bank or MDS criteria)
- Documented ancestry (self-reported and genetically confirmed)
- Written informed consent for genetic research
Exclusion Criteria:
- Atypical parkinsonism (PSP, CBS, MSA, vascular parkinsonism)
- Known pathogenic mutations in Mendelian PD genes (LRRK2, GBA, SNCA, PRKN, PINK1, DJ-1)
- History of neuroleptic use within 12 months of symptom onset
- Evidence of secondary parkinsonism (drug-induced, traumatic, vascular)
Genotyping and Quality Control
Genotyping Platform Selection
The experiment employs ancestry-diverse genotyping arrays optimized for population-specific variant detection:
Illumina Global Diversity Array (GDA): Designed specifically for multi-ancestry studies with enhanced coverage of low-frequency variants in diverse populations, including rare variants specific to African and Asian ancestries.
Affymetrix Axiom World Array: Provides comprehensive coverage across continental populations with dedicated content for understudied populations, particularly relevant for Latin American admixture mapping.
Custom Multi-Ancestry Chip: A supplementary custom content panel targeting:
- Known PD risk loci with ancestry-specific tagging SNPs
- Fine-mapping regions around established GWAS signals
- Expression quantitative trait loci in brain tissue
- Ancestry-informative markers for PCA correction
Pre-Genotyping Quality Control
Rigorous sample-level quality control ensures data integrity:
SNP-Level Quality Control
Post-genotyping SNP filtering follows established protocols:
| QC Metric | Threshold | Rationale |
|-----------|-----------|-----------|
| SNP call rate | >98% | Maintain high-quality genotype data |
| Hardy-Weinberg equilibrium | p > 1×10⁻⁶ | Remove genotyping artifacts; typically tested in controls so that true risk variants are not discarded |
| Minor allele frequency | >1% (population-specific) | Exclude rare variants with unstable estimates, applying the threshold within each ancestry |
| Imputation quality (INFO) | >0.7 | Ensure accurate genotype inference |
| Differential missingness | p > 1×10⁻⁵ | Remove SNPs whose missingness differs between cases and controls |
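As a rough illustration of how the call-rate, Hardy-Weinberg, and allele-frequency thresholds in the table combine, the sketch below filters a single SNP from its genotype counts. It is a simplification under stated assumptions: production pipelines usually apply an exact Hardy-Weinberg test (for example in PLINK), whereas this uses a 1-df chi-square approximation, and the function names are hypothetical.

```python
from math import erfc, sqrt


def hwe_chisq_pvalue(n_aa, n_ab, n_bb):
    """1-df chi-square test of Hardy-Weinberg proportions from genotype counts."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    if min(expected) == 0:
        return 1.0  # monomorphic site: no departure can be tested
    chisq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return erfc(sqrt(chisq / 2))  # survival function of chi-square with 1 df


def passes_snp_qc(n_aa, n_ab, n_bb, n_missing,
                  min_call_rate=0.98, min_hwe_p=1e-6, min_maf=0.01):
    """Apply the call-rate, MAF, and HWE thresholds from the QC table."""
    called = n_aa + n_ab + n_bb
    if called == 0:
        return False
    call_rate = called / (called + n_missing)
    freq_a = (2 * n_aa + n_ab) / (2 * called)
    maf = min(freq_a, 1 - freq_a)
    return (call_rate > min_call_rate
            and maf > min_maf
            and hwe_chisq_pvalue(n_aa, n_ab, n_bb) > min_hwe_p)


print(passes_snp_qc(640, 320, 40, n_missing=5))   # well-behaved SNP
print(passes_snp_qc(500, 0, 500, n_missing=5))    # no heterozygotes at all
```

In a case-control setting the Hardy-Weinberg counts would normally come from controls only, and the allele-frequency threshold would be evaluated separately within each ancestry group.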
Imputation Strategy
Genotype imputation leverages diverse reference panels to maximize variant discovery:
Primary Reference Panel: TOPMed freeze 8 (n ≈ 97,000 genomes) provides a large, high-quality multi-ancestry reference for African, European, and admixed populations[@topmed].
Secondary Panels: For populations underrepresented in TOPMed:
- Human Genome Diversity Project (HGDP) for Middle Eastern populations
- Singapore Sequencing Malay/Indian projects for South Asian fine-mapping
- Latin American-specific reference panels under development
Statistical Analysis Framework
Population-Specific Association Testing
Within each ancestry group, genome-wide association testing employs:
Statistical Model: Logistic regression under an additive genetic model with the following covariates:
- Age (continuous, centered)
- Sex (binary)
- Genetic principal components (top 10)
- Genotyping array (binary, if applicable)
- Site (categorical, if multi-site)
Multiple Testing Correction: Genome-wide significance threshold of p < 5×10⁻⁸; suggestive threshold of p < 1×10⁻⁶ for secondary analyses.
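Written out, the per-variant association model described above takes roughly the following form. The symbols are notation chosen here for illustration, not taken from the study protocol: G_i is the effect-allele dosage (0 to 2) for individual i, PC_ik is the k-th genetic principal component, and the array and site terms are included only where applicable.

```latex
\operatorname{logit}\,\Pr(\mathrm{PD}_i = 1)
  = \beta_0
  + \beta_G\, G_i
  + \beta_{\mathrm{age}}\,(\mathrm{age}_i - \overline{\mathrm{age}})
  + \beta_{\mathrm{sex}}\,\mathrm{sex}_i
  + \sum_{k=1}^{10} \gamma_k\, \mathrm{PC}_{ik}
  + \beta_{\mathrm{array}}\,\mathrm{array}_i
  + \boldsymbol{\delta}^{\top}\mathbf{site}_i
```

The quantity tested at each variant is β_G, with exp(β_G) interpreted as the per-allele odds ratio under the additive model.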
Trans-Ethnic Meta-Analysis
The experimental design incorporates multiple meta-analysis approaches to leverage shared and heterogeneous genetic effects:
Fixed-Effects Meta-Analysis: Inverse-variance weighted meta-analysis using METAL software, appropriate for variants with consistent effect directions across populations. This approach maximizes power for shared genetic architecture.
Random-Effects Meta-Analysis: DerSimonian-Laird random effects model for variants showing evidence of heterogeneity (Cochran's Q p < 0.05), accommodating differential effect sizes across ancestries.
Bayesian Trans-Ethnic Meta-Analysis: TRAITBASS (Trans-Ancestry Bayesian Meta-Analysis of Summary Statistics) provides probabilistic inference on cross-population effect heterogeneity, generating posterior probabilities for shared versus population-specific effects.
Heterogeneity Assessment: Key metrics include:
- Cochran's Q statistic and p-value
- I² index quantifying heterogeneity proportion
- Genetic effect correlation (r_g) between population pairs
Fine-Mapping and Functional Prioritization
Conditional Analysis: Stepwise conditional analysis within each ancestry group identifies independent signals at each locus, using GCTA-COJO or similar software.
Bayesian Fine-Mapping: Probabilistic fine-mapping using FINEMAP and SuSiE (susieR) to generate credible sets of putative causal variants, leveraging trans-ethnic convergence to narrow causal intervals.
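To make the idea of a credible set concrete, the sketch below builds one from summary statistics using Wakefield's approximate Bayes factor. It assumes a single causal variant per locus and ignores LD. This is a much simpler technique than FINEMAP or SuSiE and is shown only to illustrate the concept. The prior standard deviation of 0.2, the function name, and the variant identifiers are assumptions made for the example.

```python
from math import exp, log


def credible_set(variants, prior_sd=0.2, coverage=0.95):
    """variants: list of (variant_id, beta, se). Returns (credible set, PIPs)."""
    w = prior_sd ** 2  # prior variance of the true effect
    log_bfs = {}
    for vid, beta, se in variants:
        v = se ** 2
        r = w / (v + w)
        z = beta / se
        # Wakefield approximate Bayes factor in favour of association.
        log_bfs[vid] = 0.5 * log(1 - r) + 0.5 * r * z * z

    # Normalise in log space for numerical stability.
    max_log = max(log_bfs.values())
    unnorm = {vid: exp(lbf - max_log) for vid, lbf in log_bfs.items()}
    total = sum(unnorm.values())
    pips = {vid: u / total for vid, u in unnorm.items()}

    # Add variants in order of posterior probability until coverage is reached.
    chosen, cumulative = [], 0.0
    for vid in sorted(pips, key=pips.get, reverse=True):
        chosen.append(vid)
        cumulative += pips[vid]
        if cumulative >= coverage:
            break
    return chosen, pips


# Hypothetical locus with three variants.
locus = [("var_a", 0.30, 0.03), ("var_b", 0.10, 0.03), ("var_c", 0.02, 0.03)]
cs, pips = credible_set(locus)
print(cs, {k: round(p, 4) for k, p in pips.items()})
```

Trans-ethnic data help because variants that are in high LD in one ancestry are often separable in another, so posterior probability concentrates on fewer variants and the credible set shrinks.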
Functional Annotation Integration: Prioritization incorporates:
- RegulomeDB scores for regulatory variant classification
- GTEx eQTL colocalization in brain tissues
- Chromatin state annotations from ENCODE and Roadmap
- Protein-altering predictions from SIFT and PolyPhen
- Evolutionary constraint metrics (GERP++, LoFtool)
Polygenic Risk Score Development
Methodology Development
The experimental design includes comprehensive PRS development to address well-documented performance disparities across ancestries:
Base PRS Construction: Multiple PRS methodologies will be evaluated:
| Method | Software | Key Features |
|--------|----------|--------------|
| LD clumping + thresholding | PRSice, PLINK | Standard approach, computational efficiency |
| Bayesian shrinkage with LD modeling | LDpred | Bayesian integration of SNP heritability |
| Penalized and Bayesian regression | lassosum, SBayesR | Regularized regression, population-specific optimization |
| Transcriptomic imputation | PRS-Targets | Integration of tissue-specific gene expression |
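Whatever method produces the per-variant weights, the score itself is a weighted sum of effect-allele dosages. The sketch below shows that final scoring step and a within-cohort standardization. The variant identifiers, weights, and function names are hypothetical, and it assumes dosages and weights refer to the same effect allele.

```python
from statistics import mean, pstdev


def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele dosages over variants present in both inputs."""
    shared = dosages.keys() & weights.keys()
    return sum(weights[v] * dosages[v] for v in shared)


def standardize(scores):
    """Convert raw scores to z-scores within one ancestry group."""
    mu, sd = mean(scores), pstdev(scores)
    if sd == 0:
        raise ValueError("scores have zero variance")
    return [(s - mu) / sd for s in scores]


# Hypothetical weights (log odds ratios) and dosages for three individuals.
weights = {"var_a": 0.10, "var_b": 0.30, "var_d": 0.50}
cohort = [
    {"var_a": 2, "var_b": 1, "var_c": 0},
    {"var_a": 0, "var_b": 0, "var_c": 2},
    {"var_a": 1, "var_b": 2, "var_c": 1},
]
raw = [polygenic_score(person, weights) for person in cohort]
print(raw)
print(standardize(raw))
```

Standardizing within each ancestry group, rather than across the pooled sample, avoids interpreting differences in allele frequency between groups as differences in risk.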
Population-Specific Optimization: For each ancestry group:
Validation and Calibration
Internal Validation: Split-sample validation within each ancestry group, with 70% of data for training and 30% for testing.
External Validation: Independent replication in distinct cohorts not included in discovery meta-analysis, with particular emphasis on non-European validation.
Performance Metrics: Primary metrics include:
- Area under the receiver operating characteristic curve (AUC-ROC)
- Odds ratio per standard deviation (OR/SD)
- Positive predictive value at clinically relevant thresholds
- Calibration (observed versus expected risk)
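Two of these metrics are simple enough to sketch directly. The example below computes AUC-ROC as the probability that a randomly chosen case scores higher than a randomly chosen control (ties count one half), and the positive predictive value among individuals at or above a score threshold. The function names and example scores are hypothetical. The quadratic pairwise loop is chosen for clarity, not speed.

```python
def auc_roc(case_scores, control_scores):
    """Probability that a random case outscores a random control (ties count 0.5)."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def positive_predictive_value(scores, labels, threshold):
    """Fraction of true cases (label 1) among individuals scoring >= threshold."""
    flagged = [lab for s, lab in zip(scores, labels) if s >= threshold]
    if not flagged:
        return None  # nobody exceeds the threshold
    return sum(flagged) / len(flagged)


# Hypothetical standardized scores in a held-out test split.
cases = [1.2, 0.4, 0.9, -0.1]
controls = [0.3, -0.5, 0.4, -1.0]
print(auc_roc(cases, controls))
print(positive_predictive_value(cases + controls, [1] * 4 + [0] * 4, threshold=0.4))
```

A positive predictive value computed this way reflects the case fraction in the sample. In a case-control design it would need reweighting to the population prevalence of PD before being read as a clinical quantity.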
Clinical Implementation Readiness
The PRS development framework addresses practical implementation requirements:
Expected Outcomes and Deliverables
Primary Deliverables
Secondary Deliverables
Anticipated Impact
The experimental outcomes are expected to substantially advance multiple research and clinical domains:
Genetic Discovery: The study will expand our understanding of PD genetic architecture beyond European-centric findings, potentially revealing novel biological pathways not apparent in single-ancestry analyses.
Precision Medicine: Ancestry-specific PRS models will enable more accurate genetic risk prediction for underrepresented populations, supporting equitable implementation of precision medicine approaches.
Therapeutic Development: Population-specific genetic findings may reveal novel therapeutic targets relevant to particular ancestry groups, while shared findings will continue to inform broadly applicable therapeutic strategies.
Health Equity: By explicitly addressing ancestry-related disparities in genetic research, this experiment contributes to broader efforts to ensure that advances in genetic medicine benefit all populations.
Timeline and Milestones
Phase 1: Cohort Assembly and Harmonization (Months 1-8)
| Milestone | Target Date | Dependencies |
|----------|-------------|--------------|
| Data sharing agreements finalized | Month 2 | IRB approvals, DAC agreements |
| Cohort harmonization protocol complete | Month 3 | Standardized phenotype definitions |
| All genotype data transferred | Month 5 | Genotyping completion at sites |
| Centralized data repository established | Month 6 | Secure computing infrastructure |
| Pre-imputation QC complete | Month 8 | All cohorts passing QC thresholds |
Phase 2: Genotype Imputation and Primary Analysis (Months 9-14)
| Milestone | Target Date | Dependencies |
|----------|-------------|--------------|
| Imputation completed for all cohorts | Month 10 | TOPMed panel access |
| Ancestry-specific GWAS complete | Month 12 | Imputation quality thresholds |
| Trans-ethnic meta-analysis complete | Month 14 | All GWAS results available |
| Novel loci prioritized | Month 14 | Fine-mapping integration |
Phase 3: Polygenic Risk Score Development (Months 15-20)
| Milestone | Target Date | Dependencies |
|----------|-------------|--------------|
| PRS optimization complete | Month 17 | Meta-analysis summary statistics |
| Internal validation complete | Month 18 | Held-out test split within each ancestry group |
| External validation complete | Month 19 | External cohort replication |
| PRS deployment ready | Month 20 | Performance thresholds met |
Phase 4: Dissemination and Translation (Months 21-24)
| Milestone | Target Date | Dependencies |
|----------|-------------|--------------|
| Primary publication submission | Month 21 | All analyses complete |
| Summary statistics release | Month 22 | Publication acceptance |
| Clinical implementation pilot | Month 24 | IRB approval for pilot |
| Open-source pipeline release | Month 24 | Documentation complete |
Budget Justification
Resource Allocation
| Category | Cost (USD) | Justification |
|----------|------------|--------------|
| Genotyping (new samples) | $1,500,000 | 8,000 samples × $187.50 array cost |
| Data processing | $300,000 | Compute infrastructure, cloud storage |
| Statistical analysis | $200,000 | Personnel time, software licenses |
| Personnel (3 FTE) | $600,000 | Lead analyst, coordinators |
| Travel and collaboration | $150,000 | Consortium meetings, site visits |
| Publication and dissemination | $50,000 | Open access fees, documentation |
| Total | $2,800,000 | |
The budget reflects economies of scale achievable through existing IPDGC infrastructure and international collaboration. Approximately 60% of the required samples are anticipated to be available through existing consortium cohorts, reducing new genotyping requirements.
Ethical Considerations and Governance
Informed Consent and Data Governance
The experiment adheres to the highest standards of ethical research conduct:
- Summary statistics rather than individual-level data sharing where possible
- Differential privacy techniques for any individual-level analyses
- No return of individual results to participants (research-only designation)
Ancestry Terminology and Representation
The experiment employs precise, respectful population descriptors:
- Continental ancestry groups are described using established genetic nomenclature (European, African, East Asian, South Asian, Middle Eastern)
- Self-reported ancestry is supplemented with genetically determined ancestry for accuracy
- Population-specific terminology is reviewed by representatives from each included community
- The experiment avoids essentializing ancestry categories while acknowledging their utility for stratified analyses
Benefit Sharing
The experiment is committed to ensuring equitable benefits:
Integration with Related Research Programs
IPDGC Collaboration
The experiment builds upon and complements the International Parkinson Disease Genomics Consortium (IPDGC)[@ipdgc] framework, contributing to its multi-ethnic expansion objectives. Key integration points include:
- Leveraging existing IPDGC data infrastructure and quality control pipelines
- Contributing to IPDGC working groups focused on diversity and precision medicine
- Sharing analytical methods and best practices developed through the experiment
Global Parkinson's Genetics Program (GP2)
The experiment coordinates with the Global Parkinson's Genetics Program (GP2)[@gp2] to maximize sample size and analytical power:
- GP2 Phase 3 data will be incorporated into ancestry-specific analyses
- Standardized analytical protocols ensure consistency with GP2 methods
- Joint analyses will be conducted for trans-ethnic meta-analysis
Related Neurodegeneration Studies
The experimental design draws on methodological advances from related efforts:
- Alzheimer's disease multi-ethnic GWAS (AA-AD)
- Amyotrophic lateral sclerosis genetics consortia
- Huntington's disease population genetics studies
These connections enable methodological cross-fertilization and potential future collaborative analyses examining shared genetic architecture across neurodegenerative diseases.
Conclusion
This multi-ethnic Parkinson's disease GWAS represents a comprehensive experimental framework designed to address critical gaps in our understanding of PD genetics across diverse global populations. By combining rigorous methodology with international collaboration and explicit attention to health equity, the experiment is positioned to deliver scientific advances that benefit all patients with Parkinson's disease regardless of their ancestry background.
The experimental design incorporates state-of-the-art approaches to genotype imputation, trans-ethnic meta-analysis, and polygenic risk score optimization, while maintaining the flexibility to accommodate emerging technologies and analytical methods. The expected deliverables—including novel risk loci, an ancestry-specific effect atlas, and validated PRS models—will substantially advance both basic understanding of PD biology and clinical implementation of precision medicine approaches.
Success of this experiment will depend on sustained international collaboration, commitment to data sharing across institutional and national boundaries, and ongoing engagement with patient communities and advocacy organizations. The experimental framework is designed to be adaptable, with clear decision points for incorporating new methodologies and responding to emerging findings.
See Also
- [Parkinson's Disease](/diseases/parkinsons-disease)
- [LRRK2 Gene](/genes/lrrk2)
- [SNCA Gene](/genes/snca)
- [GBA Gene](/genes/gba)
- [Genetic Epidemiology](/mechanisms/genetic-epidemiology)
- [Polygenic Risk Scores](/mechanisms/polygenic-risk-scores-neurodegeneration)
- [Parkinson's Disease Genetics](/diseases/parkinsons-disease-genetics)
- [International Parkinson Disease Genomics Consortium](/organizations/international-parkinson-disease-genomics-consortium)
- [Multi-Ethnic GWAS Mechanisms](/mechanisms/multi-ethnic-pd-gwas)
References