Optimizing Human-device Interaction through Real-Time Automated Speech Recognition and NLP

Oishy, Tahiya Rahman

dc.contributor.author	Oishy, Tahiya Rahman
dc.date.accessioned	2026-06-25T03:33:49Z
dc.date.available	2026-06-25T03:33:49Z
dc.date.issued	2024-12-28
dc.identifier.uri	http://dspace.daffodilvarsity.edu.bd:8080/handle/123456789/17401
dc.description	Project Report	en_US
dc.description.abstract	Automatic speech recognition (ASR) is a technique that enables machines to interpret spoken language and convert it into text. To produce text from speech, an ASR system receives input from the speaker and then decodes it using patterns, algorithms, or models. This project examines how speech recognition systems can be used to automate tasks, focusing on the performance of online and offline engines (Google API, PocketSphinx, and Vosk) under various conditions. The ASR model was analyzed in detail, with the combination of the Hidden Markov Model and the Gaussian Mixture Model (HMM and GMM) serving as the basis of the experiment. The project was built in Python and initially targeted three platforms: Google API, PocketSphinx, and Vosk. The three platforms were compared for robustness and overall performance; notably, Vosk achieved considerably better accuracy than Google API and PocketSphinx. An assessment platform was prepared using voices from different age groups, and voice, frequency-noise, and word error rate (WER) were considered to assess the robustness of these systems. The findings showed that Vosk outperformed Google API and PocketSphinx in a variety of contexts. A predefined command list was therefore set up as a methodical foundation for assessing each system in an automated application. Despite its limitations, the research supports interaction between humans and computers, especially for people with disabilities who face challenges using devices. Finally, the work opens a new window for ASR techniques, owing to its effectiveness with real-time data and its evidence of higher accuracy.	en_US
dc.description.sponsorship	Daffodil International University	en_US
dc.language.iso	en_US	en_US
dc.publisher	Daffodil International University	en_US
dc.subject	Automatic Speech Recognition (ASR)	en_US
dc.subject	Machine Learning	en_US
dc.subject	Deep Learning	en_US
dc.subject	Algorithms	en_US
dc.subject	Natural Language Processing (NLP)	en_US
dc.title	Optimizing Human-device Interaction through Real-Time Automated Speech Recognition and NLP	en_US
dc.type	Other	en_US
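
The abstract names word error rate (WER) as one of the comparison criteria but does not describe how it was computed. As a minimal sketch, assuming the standard definition (word-level edit distance divided by the number of reference words), WER for a recognized command could be calculated as follows. The function name and the sample commands are hypothetical and are not taken from the report.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length.

    Assumption: text is lowercased and split on whitespace; the report's
    own normalization steps are not stated.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Word-level Levenshtein distance, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical command and recognizer output: one substitution in four words.
print(word_error_rate("turn on the light", "turn off the light"))
```

A lower value means a closer match to the reference command; averaging this score over a predefined command list would give one figure per engine for comparison.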