Explainable machine learning for early detection of Parkinson’s disease in aging populations using vocal biomarkers
Egbo B., Nigmetolla Z., Khan N.A., Jamwal P.K.
2025, Frontiers Media SA
Frontiers in Aging Neuroscience
2025, Volume 17
Introduction: Parkinson’s Disease (PD) is a progressive neurodegenerative disorder that significantly affects the aging population, creating a growing burden on global health systems. Early detection of PD is clinically challenging because symptoms emerge gradually and ambiguously. Methods: This study presents a machine-learning framework for the early identification of PD using non-invasive biomedical voice biomarkers from the UCI Parkinson’s dataset. The dataset consists of 195 sustained phonation recordings from 31 participants (23 PD and 8 healthy controls, ages 46–85). The methodology includes subject-level stratified splitting and normalization, along with BorderlineSMOTE to address class imbalance. An XGBoost model is first used to select the top 10 acoustic features, followed by a Bayesian-optimized XGBoost classifier whose decision threshold is tuned via F1-maximization on validation data. Results: On the held-out test set, the model achieves 98.0% accuracy, 0.97 macro-F1, and 0.991 ROC-AUC. This exceeds a deep neural network baseline by 4.0 percentage points in accuracy (94.0% to 98.0%), 4.3 percentage points in macro-F1 (92.7% to 97.0%), and 0.050 in AUC (0.941 to 0.991). Compared to a classical SVM, the margins are 7.0 percentage points in accuracy (91.0% to 98.0%), 6.5 percentage points in macro-F1 (90.5% to 97.0%), and 0.089 in AUC (0.902 to 0.991). Discussion: Model decisions are explained using SHAP, which offers global and patient-specific insights into the influential voice features. These findings indicate the feasibility of a non-invasive, scalable, and explainable voice-based tool for early PD screening, highlighting its potential integration into mobile or telehealth diagnostic platforms.
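Two of the steps named in the Methods, subject-level stratified splitting and F1-based threshold tuning, can be illustrated with a minimal, dependency-free Python sketch. This is not the authors' implementation: the function names (`subject_level_split`, `f1_at_threshold`, `tune_threshold`), the 25% test fraction, the use of positive-class F1, and the choice of candidate thresholds from the observed scores are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def subject_level_split(subject_ids, labels, test_fraction=0.25, seed=0):
    """Split recording indices so that no subject appears in both sets.

    Subjects are grouped by class first, so each class sends roughly
    `test_fraction` of its subjects (at least one) to the test set.
    Assumes every recording of a subject carries the same label.
    """
    subject_label = {}
    for sid, y in zip(subject_ids, labels):
        subject_label[sid] = y

    by_class = defaultdict(list)
    for sid, y in subject_label.items():
        by_class[y].append(sid)

    rng = random.Random(seed)
    test_subjects = set()
    for sids in by_class.values():
        sids = sorted(sids)          # fixed order before shuffling
        rng.shuffle(sids)
        n_test = max(1, round(len(sids) * test_fraction))
        test_subjects.update(sids[:n_test])

    train_idx = [i for i, s in enumerate(subject_ids) if s not in test_subjects]
    test_idx = [i for i, s in enumerate(subject_ids) if s in test_subjects]
    return train_idx, test_idx


def f1_at_threshold(y_true, scores, threshold):
    """Positive-class F1 when predicting 1 for score >= threshold."""
    tp = fp = fn = 0
    for y, s in zip(y_true, scores):
        pred = 1 if s >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 1:
            fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def tune_threshold(y_true, scores):
    """Return the observed score that maximizes F1 on validation data."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: f1_at_threshold(y_true, scores, t))
```

Splitting by subject rather than by recording matters here because the dataset holds several recordings per participant (195 recordings from 31 participants); a recording-level split would let the same voice appear in both training and test data. The threshold should be tuned on validation scores only, as the abstract states, so that the held-out test set stays untouched.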
aging-related neurodegeneration, biomedical voice biomarkers, early diagnosis and predictive modeling, explainable machine learning, Parkinson’s disease
Department of Electrical and Computer Engineering, School of Engineering and Digital Sciences, Nazarbayev University, Astana, Kazakhstan
School of Information Technology and Systems, University of Canberra, Canberra, ACT, Australia