dc.description.abstract |
This research project, “Modelling of Bangla Real Word Error Correction”, presents a language
model that finds real-word errors in a Bangla sentence and suggests a correction for each
erroneous word. Real-word error correction is currently a topic of considerable interest in
Natural Language Processing. The syntactic and grammatical rules of Bangla are rather
complex, which makes the language difficult to process. Words can be ambiguous, with their
meaning dependent on the context. In this project, we proposed a model with
a Bidirectional LSTM (Long Short-Term Memory) network. LSTM is
an RNN (Recurrent Neural Network) architecture that can process not only single data points
but also entire sequences of data. First, trigram sequences were created to capture the context
of a sentence and were fed into the LSTM model. Because the Bidirectional LSTM model
captures both the forward and the backward relationships in a sequence, it can gain a
better understanding of the context of a Bangla sentence. After training the model and
applying it to detect and correct real-word errors, we obtained an accuracy of
74.450% on the test dataset. In predicting the next word from the sentence context, it
performed even better, with 85.47% accuracy. The proposed model was tested in many
ways after implementation, and it works successfully in both detecting and correcting
real-word errors. |
en_US |