Parkinson’s Disease Detection Using Machine Learning: A Comparative Study of Classification Algorithms

Rifat, Samiul Haque

dc.contributor.author	Rifat, Samiul Haque
dc.date.accessioned	2026-04-21T04:28:43Z
dc.date.available	2026-04-21T04:28:43Z
dc.date.issued	2025-05-14
dc.identifier.uri	http://dspace.daffodilvarsity.edu.bd:8080/handle/123456789/16941
dc.description	Project Report	en_US
dc.description.abstract	Parkinson’s disease (PD) is a neurodegenerative movement disorder caused by the loss of dopaminergic neurons. Its cardinal motor symptoms are tremor, bradykinesia, and rigidity, which severely affect patients’ quality of life. Early and reliable diagnosis of PD is important for successful treatment and management. This study compares machine learning models for PD detection using a dataset of demographic, clinical, and voice features. The report evaluates six classifiers on the task of distinguishing normal from PD cases: Multinomial Naïve Bayes (MNB), Logistic Regression, Random Forest Classifier, Gaussian Naïve Bayes (GNB), Decision Tree Classifier, and Support Vector Classifier (SVC). In the experiments, the Random Forest Classifier achieved the best test accuracy of 90.07%, followed by the Decision Tree Classifier at 87.23%. Logistic Regression reached a test accuracy of 79.91% and Gaussian Naïve Bayes 76.12%, while Multinomial Naïve Bayes and SVC achieved lower accuracies of 68.56% and 62.17%, respectively. Notably, the Random Forest and Decision Tree models fit the training data perfectly (100% training accuracy), while the scikit-learn baseline model achieved almost the same accuracy on the test dataset. The gap between training and test accuracy raises concerns about overfitting, especially for the Decision Tree. The present work emphasizes the need to choose models suited to the properties of the data and the requirements of PD detection tasks. Random Forest, for instance, has already been applied in this context and performs well; however, such ensembles are more complex and computationally expensive than simpler models such as Logistic Regression.
In addition, the results highlight the need for further data preprocessing (such as feature scaling) and for hyperparameter tuning to improve the convergence and generalization of the models. By advancing the application of machine learning to neurodegenerative disease diagnosis, this research offers insights into avenues for improved early detection and tailored treatment of Parkinson’s disease.	en_US
dc.description.sponsorship	Daffodil International University	en_US
dc.language.iso	en_US	en_US
dc.publisher	Daffodil International University	en_US
dc.subject	Parkinson’s Disease	en_US
dc.subject	Neurodegenerative Disease	en_US
dc.subject	Machine Learning	en_US
dc.subject	Healthcare Technology	en_US
dc.subject	Gaussian Naïve Bayes (GNB)	en_US
dc.subject	Support Vector Classifier (SVC)	en_US
dc.title	Parkinson’s Disease Detection Using Machine Learning: A Comparative Study of Classification Algorithms	en_US
dc.type	Other	en_US
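
The comparison described in the abstract can be sketched as follows. This is a minimal illustration, not the report's actual code: the dataset is not part of this record, so the sketch uses synthetic data from scikit-learn's `make_classification`, and the function name `compare_classifiers`, the sample sizes, the split ratio, and all hyperparameters are assumptions. Features are scaled to [0, 1] because Multinomial Naïve Bayes requires non-negative inputs. Accuracies from this sketch will not match the figures reported above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB, MultinomialNB
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def compare_classifiers(seed=0):
    """Return {model name: (train accuracy, test accuracy)} on synthetic data."""
    # Synthetic stand-in for the PD dataset (assumption: binary labels,
    # numeric features; the real feature set is not available here).
    X, y = make_classification(
        n_samples=400, n_features=12, n_informative=6, random_state=seed
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )

    # Fit the scaler on the training split only, then clip the test split
    # so MultinomialNB never sees negative values.
    scaler = MinMaxScaler().fit(X_train)
    X_train = scaler.transform(X_train)
    X_test = scaler.transform(X_test).clip(0.0, 1.0)

    models = {
        "Multinomial NB": MultinomialNB(),
        "Logistic Regression": LogisticRegression(max_iter=1000),
        "Random Forest": RandomForestClassifier(n_estimators=100, random_state=seed),
        "Gaussian NB": GaussianNB(),
        "Decision Tree": DecisionTreeClassifier(random_state=seed),
        "SVC": SVC(),
    }

    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        # score() returns mean accuracy for classifiers.
        results[name] = (model.score(X_train, y_train), model.score(X_test, y_test))
    return results


if __name__ == "__main__":
    for name, (train_acc, test_acc) in compare_classifiers().items():
        gap = train_acc - test_acc  # a large gap hints at overfitting
        print(f"{name:20s} train={train_acc:.3f} test={test_acc:.3f} gap={gap:.3f}")
```

Reporting training accuracy next to test accuracy makes the overfitting concern visible: an unconstrained decision tree typically reaches 100% on the training split, so the size of the train/test gap is the figure to watch rather than the training score itself.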