DSpace Repository

Deep Learning-Based Sentiment Detection in Code-Mixed Language: Exploring RNNs and Transformers Architectures

dc.contributor.author Akter, Kazi Ayesha
dc.contributor.author Bristy, Prathona Rani
dc.date.accessioned 2026-04-05T04:30:56Z
dc.date.available 2026-04-05T04:30:56Z
dc.date.issued 2025-09-17
dc.identifier.uri http://dspace.daffodilvarsity.edu.bd:8080/handle/123456789/16574
dc.description Project Report en_US
dc.description.abstract Banglish, the informal hybrid of Bengali and English typed in the Latin alphabet, poses distinctive challenges: nonstandard spelling, transliteration drift, and code-switching. This study builds a full pipeline for Banglish, from data acquisition and annotation to modeling and analysis, over seven categories (Appearance, Not Hate, Others, Racial, Religious, Sexual, and Slang). We create and clean a social-media corpus, design a preprocessing suite tailored to Banglish [custom stop-word filtering, regex tokenization, and rule-based normalization of spelling variants], and address class imbalance via staged over- and under-sampling to a balanced set of 2,000 instances per class. To compare performance, we test recurrent architectures (LSTM, GRU, BiLSTM, BiGRU) and their hybrids (LSTM+GRU, BiLSTM+BiGRU) against transformer models (mBERT, XLM-RoBERTa) under identical training conditions. mBERT performs best (accuracy 0.88, macro-F1 0.87), followed by BiLSTM+BiGRU as the strongest RNN model (accuracy 0.84, macro-F1 0.84), whereas XLM-RoBERTa performs worst (accuracy 0.75, macro-F1 0.74), suggesting that a well-matched multilingual transformer outperforms recurrent models on this task. Confusion-matrix analysis reveals that the RNNs consistently collapse ambiguous classes (Not Hate, Others, Sexual) into Appearance, a failure that mBERT substantially reduces. We conclude that, with Banglish-specific preprocessing and balanced evaluation, multilingual transformers provide the most reliable basis for moderating Banglish content, while under tighter en_US
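The abstract's preprocessing suite and staged resampling could be sketched as below. This is a minimal illustration only: the stop-word list, spelling rules, and function names are assumptions for the sketch, not the authors' published implementation.

```python
import random
import re

# Hypothetical examples; the actual Banglish stop-word list and
# spelling-normalization rules are not given in this record.
BANGLISH_STOPWORDS = {"ami", "tumi", "kintu", "ar"}
SPELLING_RULES = {"bhalo": "valo", "korbo": "krbo"}  # merge spelling variants

def preprocess(text: str) -> list[str]:
    """Regex tokenization, rule-based spelling normalization,
    and custom stop-word filtering, as described in the abstract."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    tokens = [SPELLING_RULES.get(t, t) for t in tokens]
    return [t for t in tokens if t not in BANGLISH_STOPWORDS]

def balance(samples_by_class: dict[str, list], per_class: int = 2000, seed: int = 42) -> dict[str, list]:
    """Staged resampling to a fixed size per class: undersample classes
    above the target, oversample (with replacement) classes below it."""
    rng = random.Random(seed)
    balanced = {}
    for label, items in samples_by_class.items():
        if len(items) >= per_class:
            balanced[label] = rng.sample(items, per_class)  # undersample
        else:
            balanced[label] = [rng.choice(items) for _ in range(per_class)]  # oversample
    return balanced
```

With seven classes balanced to 2,000 instances each, this yields the 14,000-example training set implied by the abstract.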
dc.description.sponsorship Daffodil International University en_US
dc.language.iso en_US en_US
dc.publisher Daffodil International University en_US
dc.subject Natural Language Processing (NLP) en_US
dc.subject Banglish Text Classification en_US
dc.subject Code-Switching Detection en_US
dc.subject mBERT en_US
dc.subject Multilingual Transformers en_US
dc.title Deep Learning-Based Sentiment Detection in Code-Mixed Language: Exploring RNNs and Transformers Architectures en_US
dc.type Other en_US

