Integrative machine learning approach to risk prediction for dementia and Alzheimer's disease.

Stern A, Linial M
Geroscience 2026

Dementia, particularly Alzheimer's disease (AD), presents a growing global health challenge characterized by cognitive decline, behavioral changes, and loss of independence. With increasing life expectancy, early diagnosis and improved clinical strategies are urgently needed. This study developed and evaluated machine learning (ML) models to predict AD risk using UK Biobank data, integrating health, genetic, and lifestyle factors. The cohort included 2878 AD cases and 72,366 controls. Among several algorithms, CatBoost performed best (ROC-AUC = 0.773), especially in females. Inputs included ICD-10 codes from 5 years pre-diagnosis, ApoE-ε4 genotype, and a large collection of modifiable risk factors. Despite fewer cases, the risk prediction models for vascular dementia (VaD) outperformed the AD-specific models. ApoE-ε4 was the most predictive genetic marker, while other common variants had limited utility. Key non-genetic predictors included comorbidities (e.g., diabetes, hypertension), education, physical activity, and diet. These findings highlight the value of integrating diverse data sources for dementia risk prediction and emphasize the role of sex-specific modeling and modifiable factors in early, personalized intervention strategies.
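The headline metric above, ROC-AUC, can be read as the probability that a randomly chosen case receives a higher risk score than a randomly chosen control. The following is a minimal sketch of that calculation using the rank-sum (Mann-Whitney U) identity. It is an illustration only, not the authors' code: the function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from the study.

```python
def roc_auc(labels, scores):
    """ROC-AUC via the rank-sum identity; tied scores get average ranks.

    labels: iterable of 0 (control) or 1 (case); scores: predicted risk.
    Hypothetical helper for illustration, not part of the study.
    """
    pairs = sorted(zip(scores, labels))  # ascending by score
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied scores starting at position i.
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # 1-based average rank of the run
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1

    n_pos = sum(1 for _, label in pairs if label == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("ROC-AUC needs at least one case and one control")

    rank_sum_pos = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    u_statistic = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u_statistic / (n_pos * n_neg)


# Toy data (made up): two controls, two cases.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)
```

On this reading, a value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from controls, so the reported 0.773 would mean that a case outranks a control in roughly 77% of case-control pairs.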

5 Figures Extracted
Fig. 1
Age matching protocol. A The distribution of the control and AD groups by age. B Following a protocol for age-matching schemes, a major confounding...
Fig. 2
Performance of the risk factor predictive models for AD from UKB. A Comparison of selected models’ performance by the mean of the ROC-AUC for ten dif...
Fig. 3
Feature importance of AD predictive risk model. A Top 20 features from the “all” model selected iteration using SHAP. The most informative feature i...
Fig. 4
Interpretation of AD predictive risk model by the feature properties. A Heatmap of mean SHAP importances of the selected ICD-10 features used in mod...
Fig. 5
Partition of dementia cohort in UKB to subgroups by clinical ICD-10. A Venn diagram of different dementia diagnoses. The AD model consists of patien...