Fine Tuned Multitasking Neural Network for Parkinson's Disease Detection from Voice Recordings

López-Santander, Diego Alexander; Ríos-Urrego, Cristian David; Orozco-Arroyave, Juan Rafael

dc.creator	López-Santander, Diego Alexander
dc.creator	Ríos-Urrego, Cristian David
dc.creator	Orozco-Arroyave, Juan Rafael
dc.date	2025-07-28
dc.description	Parkinson's disease (PD) is the second most prevalent neurodegenerative disorder in old age. It is characterized by symptoms such as resting tremor, rigidity, and gait disturbances. It also affects natural speech production, causing voice tremor and imprecise pronunciation, among other impairments. Given the prevalence of speech disorders in PD, analyzing an individual's speech provides a non-invasive, cost-effective means of detection and monitoring. The objective of this paper was to use deep learning, specifically a pre-trained convolutional neural network and a multitask approach, to classify speech recordings from PD patients and healthy controls (HC) based on spectral representations. The proposed methodology evaluated the effectiveness of pre-trained ResNet models, fine-tuned on Spanish, Italian, and German speech databases, for both single-task and multitask classification. The results indicated that multitask learning, which includes auxiliary tasks such as vowel and sex classification, improves the model's performance compared to single-task learning by exploiting shared representations across related tasks. The multitask approach improved classification accuracy by up to 5%, and the inclusion of intermediate models for fine-tuning improved classification accuracy by up to 10% with respect to the implemented baseline. In conclusion, this work contributes to the growing body of literature demonstrating the viability of deep learning methods for non-invasive PD detection and highlights the advantages of multitask learning for pathological speech classification.	en-US
dc.description	La enfermedad de Parkinson (EP) es el segundo trastorno neurodegenerativo más prevalente en la vejez. Se caracteriza por síntomas como temblor en reposo, rigidez y alteraciones de la marcha. También afecta a la producción natural del habla, causando temblor de voz y pronunciación imprecisa. Dada la prevalencia de los trastornos del habla en la EP, el análisis del habla de un individuo proporciona un medio no invasivo y económico para su detección y monitorización. El objetivo de este trabajo consistió en aprovechar el potencial del aprendizaje profundo, específicamente una red neuronal convolucional pre entrenada y un enfoque multitarea, para clasificar grabaciones del habla de pacientes con EP y controles sanos (HC) utilizando representaciones espectrales. La metodología de análisis multitarea propuesta consistió en evaluar la eficacia de los modelos ResNet pre entrenados, afinados en bases de datos en español, italiano y alemán, tanto para enfoques de clasificación de una sola tarea como multitarea. Los resultados indicaron que el aprendizaje multitarea, que incluye tareas adicionales como la clasificación de vocales y la clasificación de sexos, mejora el rendimiento del modelo en comparación con el aprendizaje monotarea al aprovechar las representaciones compartidas entre tareas relacionadas. El enfoque multitarea mostró una mejora de hasta el 5 % en la tasa de acierto de la clasificación, y la inclusión de los modelos intermedios para el ajuste fino produjo una mejora de hasta el 10 % con respecto al modelo baseline implementado. Finalmente, se concluye que este trabajo contribuye al creciente cuerpo de literatura que demuestra la viabilidad de los métodos de aprendizaje profundo para la detección no invasiva de la EP y destaca las ventajas del aprendizaje multitarea para la clasificación patológica del habla.	es-ES
dc.format	application/pdf
dc.format	text/xml
dc.format	application/epub+zip
dc.format	text/html
dc.identifier	https://revistas.itm.edu.co/index.php/tecnologicas/article/view/3307
dc.identifier	10.22430/22565337.3307
dc.language	eng
dc.publisher	Instituto Tecnológico Metropolitano (ITM)	en-US
dc.relation	https://revistas.itm.edu.co/index.php/tecnologicas/article/view/3307/3710
dc.relation	https://revistas.itm.edu.co/index.php/tecnologicas/article/view/3307/3778
dc.relation	https://revistas.itm.edu.co/index.php/tecnologicas/article/view/3307/3779
dc.relation	https://revistas.itm.edu.co/index.php/tecnologicas/article/view/3307/3888
dc.relation	/ref/A. H. V. Schapira, C. Warren Olanow, J. Timothy Greenamyre, and E. Bezard, “Slowing of neurodegeneration in Parkinson's disease and Huntington's disease: future therapeutic perspectives,” Lancet, vol. 384, no. 9942, pp. 545-555, Aug. 2014. https://doi.org/10.1016/S0140-6736(14)61010-2
dc.relation	/ref/J. Jankovic, and A. E. Lang, “Diagnosis and assessment of Parkinson disease and other movement disorders,” in Bradley's Neurology in Clinical Practice E-Book. 8th ed. Oxford, UK: Elsevier, 2021, pp. 310-33. https://www.clinicalkey.com/nursing/#!/content/book/3-s2.0-B9780323642613000243?scrollTo=%23hl0002636
dc.relation	/ref/M. Sapmaz Atalar, O. Oguz, and G. Genc, “Hypokinetic Dysarthria in Parkinson's Disease: A Narrative Review,” Med. Bull. Sisli Etfal Hosp., vol. 57, no. 2, pp. 163-170, 2023. https://doi.org/10.14744/SEMB.2023.29560
dc.relation	/ref/F. Cao, A. P. Vogel, P. Gharahkhani, and M. E. Renteria, “Speech and language biomarkers for Parkinson’s disease prediction, early diagnosis and progression,” npj Parkinsons Dis., vol. 11, no. 1, p. 57, Mar. 2025. https://doi.org/10.1038/s41531-025-00913-4
dc.relation	/ref/J. Rusz et al., “Smartphone allows capture of speech abnormalities associated with high risk of developing Parkinson’s disease,” IEEE Transact. Neur. Systems Rehab. Engin., vol. 26, no. 8, pp. 1495-1507, Aug. 2018. https://doi.org/10.1109/TNSRE.2018.2851787
dc.relation	/ref/A. Lowit, A. Marchetti, S. Corson, and A. Kuschmann, “Rhythmic performance in hypokinetic dysarthria: Relationship between reading, spontaneous speech and diadochokinetic tasks,” J. Communic. Disord., vol. 72, no. 26, Mar-Apr. 2018. https://doi.org/10.1016/j.jcomdis.2018.02.005
dc.relation	/ref/P. Kumar Keserwani, S. Das, and N. Sarkar, “A comparative study: prediction of parkinson’s disease using machine learning, deep learning and nature inspired algorithm,” Multimed. Tools Appl., vol. 83, no. 27, pp. 69393-69441, Jan. 2024. https://doi.org/10.1007/s11042-024-18186-z
dc.relation	/ref/A. Shrestha, and A. Mahmood, “Review of deep learning algorithms and architectures,” IEEE Acc., vol. 7, pp. 53040-53065, Apr. 2019. https://doi.org/10.1109/ACCESS.2019.2912200
dc.relation	/ref/M. Shaban, “Deep learning for Parkinson’s disease diagnosis: a short survey,” Computers, vol. 12, no. 3, p. 58, Mar. 2023. https://doi.org/10.3390/computers12030058
dc.relation	/ref/J. Rasheed, A. Ali Hameed, N. Ajlouni, A. Jamil, A. Özyavaş, and Z. Orman, “Application of adaptive back-propagation neural networks for Parkinson’s disease prediction,” in 2020 Inter. Conf. Data Analytics Bus. Indust.: Way Towards a Sustainable Economy, Sakheer, Bahrain, 2020, pp. 1-5. https://doi.org/10.1109/ICDABI51230.2020.9325709
dc.relation	/ref/S. Rahman, M. Hasan, A. Krishno Sarkar, and F. Khan, “Classification of Parkinson’s Disease using Speech Signal with Machine Learning and Deep Learning Approaches,” Europ. J. Electr. Engin. Comput. Sci., vol. 7, no. 2, pp. 20-27, Mar. 2023. https://doi.org/10.24018/ejece.2023.7.2.488
dc.relation	/ref/M. Little, 2007, “Parkinsons” UCI Machine Learning Repository. https://doi.org/10.24432/C59C74
dc.relation	/ref/A. Rehman, T. Saba, M. Mujahid, F. S. Alamri, and N. ElHakim, “Parkinson’s disease detection using hybrid LSTM-GRU deep learning model,” Electronics, vol. 12, no. 13, p. 2856, Jun. 2023. https://doi.org/10.3390/electronics12132856
dc.relation	/ref/J. Mallela et al., “Voice based classification of patients with Amyotrophic Lateral Sclerosis, Parkinson’s Disease and Healthy Controls with CNN-LSTM using transfer learning,” in 2020 IEEE Inter. Conf. Acoust. Speech Sign. Process, Barcelona, Spain, 2020, pp. 6784-6788. https://doi.org/10.1109/ICASSP40776.2020.9053682
dc.relation	/ref/O. Karaman, H. Çakın, A. Alhudhaif, and K. Polat, “Robust automated Parkinson disease detection based on voice signals with transfer learning,” Expert Syst. Appl., vol. 178, p. 115013, Sep. 2021. https://doi.org/10.1016/j.eswa.2021.115013
dc.relation	/ref/K. G. Dávid Sztahó, and T. Miklós Gábriel, “Deep learning solution for pathological voice detection using LSTM-based autoencoder hybrid with multi-task learning,” in 14th Inter. Joint Conf. Biomed. Engin. Syst. Technol, Vienna, Austria, 2021, pp. 135-141. https://www.scitepress.org/PublishedPapers/2021/101931/101931.pdf
dc.relation	/ref/J. C. Vásquez-Correa, T. Arias-Vergara, J. R. Orozco-Arroyave, and E. Nöth, “A Multitask Learning Approach to Assess the Dysarthria Severity in Patients with Parkinson's Disease,” in Proceed. Interspeech, Hyderabad, India, 2018, pp. 456-460. https://doi.org/10.21437/Interspeech.2018-1988
dc.relation	/ref/J. R. Orozco-Arroyave, J. D. Arias-Londoño, J. F. Vargas-Bonilla, M. C. Gonzalez-Rátiva, and E. Nöth, “New Spanish speech corpus database for the analysis of people suffering from Parkinson's disease,” in Proceed. LREC, 2014, pp. 342-347. https://www5.informatik.uni-erlangen.de/Forschung/Publikationen/2014/Orozco14-NSS.pdf
dc.relation	/ref/C. G. Goetz et al., “Movement Disorder Society‐sponsored revision of the Unified Parkinson's Disease Rating Scale (MDS‐UPDRS): scale presentation and clinimetric testing results,” Movem. Disord., vol. 23, no. 15, pp. 2129-2170, Nov. 2008. https://doi.org/10.1002/mds.22340
dc.relation	/ref/Universidad del Sarre, and Hospital Universitario de Essen, “Saarbrücken Voice Database,” stimmdb.coli. Accessed: Jun. 20, 2024. [Online]. Available: https://stimmdb.coli.uni-saarland.de/
dc.relation	/ref/G. Dimauro, V. Di Nicola, V. Bevilacqua, D. Caivano, and F. Girardi, “Assessment of speech intelligibility in Parkinson’s disease using a speech-to-text system,” IEEE Acc., vol. 5, pp. 22199-22208, Oct. 2017. https://doi.org/10.1109/ACCESS.2017.2762475
dc.relation	/ref/D. A. López-Santander, C. David Rios-Urrego, C. Bergler, E. Nöth, and J. R. Orozco-Arroyave, “Robust Classification of Parkinson’s Speech: An Approximation to a Scenario With Non-controlled Acoustic Conditions,” in Text, Speech, and Dialogue. TSD 2024. Lecture Notes in Computer Science, E. Nöth, A. Horák, P. Sojka, Eds., Cham, Switzerland: Springer, 2024, pp. 252-262. https://doi.org/10.1007/978-3-031-70566-3_22
dc.relation	/ref/K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in 2016 IEEE Conf. Comput. Vision Pattern Recogn. (CVPR), Las Vegas, USA, 2016, pp. 770-778. https://doi.org/10.1109/CVPR.2016.90
dc.relation	/ref/S. Ruder, “An Overview of Multi-Task Learning in Deep Neural Networks,” arXiv: 1706.05098, 2017. https://doi.org/10.48550/arXiv.1706.05098
dc.relation	/ref/M. Fontana, M. Spratling, and M. Shi, “When multitask learning meets partial supervision: A computer vision review,” Proceed. IEEE, vol. 112, no. 6, pp. 516-543, Aug. 2024. https://doi.org/10.1109/JPROC.2024.3435012
dc.relation	/ref/G. Pironkov, S. Dupont, and T. Dutoit, “Multi-Task Learning for Speech Recognition: An Overview,” in ESANN 2016 Proceed. Europ. Symp. Artif. Neur. Net., Comput. Intellig. Mach. Learn., Bruges, Belgium, 2016, pp. 189-194. https://www.esann.org/sites/default/files/proceedings/legacy/es2016-154.pdf
dc.relation	/ref/H. Harutyunyan, H. Khachatrian, D. C. Kale, G. Ver Steeg, and A. Galstyan, “Multitask learning and benchmarking with clinical time series data,” Scient. Data, vol. 6, no. 96, Jun. 2019. https://doi.org/10.1038/s41597-019-0103-9
dc.relation	/ref/S. Chen, Y. Zhang, and Q. Yang, “Multi-Task Learning in Natural Language Processing: An Overview,” arXiv: 2109.09138, 2021. https://doi.org/10.48550/arXiv.2109.09138
dc.relation	/ref/F. Amato, L. Borzì, G. Olmo, C. A. Artusi, G. Imbalzano, and L. Lopiano, “Speech impairment in Parkinson’s disease: acoustic analysis of unvoiced consonants in Italian native speakers,” IEEE Acc., vol. 9, pp. 166370-166381, Dec. 2021. https://doi.org/10.1109/ACCESS.2021.3135626
dc.rights	Copyright (c) 2025 TecnoLógicas	en-US
dc.rights	https://creativecommons.org/licenses/by-nc-sa/4.0	en-US
dc.source	TecnoLógicas; Vol. 28 No. 63 (2025); e3307	en-US
dc.source	TecnoLógicas; Vol. 28 Núm. 63 (2025); e3307	es-ES
dc.source	2256-5337
dc.source	0123-7799
dc.subject	aprendizaje profundo	es-ES
dc.subject	aprendizaje multitarea	es-ES
dc.subject	clasificación de habla patológica	es-ES
dc.subject	aprendizaje por transferencia	es-ES
dc.subject	deep learning	en-US
dc.subject	multitask learning	en-US
dc.subject	pathological speech classification	en-US
dc.subject	transfer learning	en-US
dc.title	Fine Tuned Multitasking Neural Network for Parkinson's Disease Detection from Voice Recordings	en-US
dc.title	Red neuronal multitarea para la detección de la enfermedad de Parkinson a partir de grabaciones de voz	es-ES
dc.type	info:eu-repo/semantics/article
dc.type	info:eu-repo/semantics/publishedVersion
dc.type	Research Papers	en-US
dc.type	Artículos de investigación	es-ES
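
The abstract describes multitask learning in which PD detection is trained jointly with auxiliary vowel and sex classification over shared representations. A minimal sketch of one common way to combine such tasks is shown below: a weighted sum of per-task cross-entropy losses computed from separate classification heads. The record does not state the loss formulation, the task weights, or the number of vowel classes, so the weighted-sum loss, the values in `weights`, the five-class vowel head, and all function and variable names here are assumptions for illustration only, not the authors' implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Negative log-probability assigned to the correct class index."""
    return -math.log(softmax(logits)[target])


def multitask_loss(head_logits, targets, weights):
    """Weighted sum of per-task cross-entropy losses (assumed formulation)."""
    return sum(
        weights[task] * cross_entropy(head_logits[task], targets[task])
        for task in head_logits
    )


# Hypothetical outputs of three heads that share one backbone, for a single
# spectrogram: PD vs. HC, vowel identity (five classes assumed), speaker sex.
head_logits = {
    "pd": [0.2, 1.1],
    "vowel": [0.1, 0.3, 2.0, -0.5, 0.0],
    "sex": [1.5, -0.2],
}
targets = {"pd": 1, "vowel": 2, "sex": 0}

# Hypothetical weights: the main task dominates, auxiliary tasks regularize.
weights = {"pd": 1.0, "vowel": 0.3, "sex": 0.3}

loss = multitask_loss(head_logits, targets, weights)
print(f"combined multitask loss: {loss:.4f}")
```

Setting the auxiliary weights to zero reduces this to the single-task loss, which is one simple way to compare single-task and multitask training under the same code path.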

Files

Original bundle

Name: 3307_Diagramado_Eng_V3.pdf
Size: 642.47 KB
Format: Adobe Portable Document Format

Name: 344281872006.xml
Size: 94.42 KB
Format: Extensible Markup Language

Name: 344281872006.epub
Size: 1.09 MB
Format: Electronic publishing

Collections

Revista TecnoLógicas - Cosecha Portal de Revistas