Bangla E-Mail Body to Subject Generation Using Sequence to Sequence RNNs

Talukder, Moyin; Alim, Md. Samiul

URI: http://dspace.daffodilvarsity.edu.bd:8080/handle/123456789/9214

Date: 22-09-13

Abstract:

Automatic subject generation has become one of the major problems in deep learning and natural language processing in recent years. Subject generation condenses a lengthy email body into a brief summary line. Our goal is to develop an effective and efficient Bengali subject generator that produces a clear and informative subject from a given Bengali email body. To do this, we gathered a variety of email bodies, including educational and commercial ones, and used our model to generate subjects from those texts. Our model is a sequence-to-sequence model: the encoding layer employs bi-directional RNNs, while the decoding layer uses LSTMs with an attention model. While developing this model, we encountered difficulties with text pre-processing, counting vocabulary and missing words, word embedding, detecting new terms, and other tasks. Our primary objectives were to generate a subject and to reduce the training loss. In our study, we reduced the training loss to 0.001, and the model produces a fluent, concise subject from a provided email body.
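To make the encoder and attention description concrete, the following is a minimal sketch in plain Python of how a decoder step could attend over bi-directional encoder states. It is not the authors' implementation: the abstract does not specify the attention scoring function, so dot-product scoring is an assumption here, and the function names (`bidirectional_states`, `attention_context`) and the toy vectors are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def bidirectional_states(forward_states, backward_states):
    """Concatenate forward and backward hidden states per input position.

    Assumes both lists are already aligned to the same token positions.
    """
    return [f + b for f, b in zip(forward_states, backward_states)]


def attention_context(decoder_state, encoder_states):
    """One attention step (dot-product scoring is an assumption)."""
    scores = [
        sum(d * e for d, e in zip(decoder_state, h)) for h in encoder_states
    ]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    # Context vector: attention-weighted sum of the encoder states.
    context = [
        sum(w * h[i] for w, h in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context


# Toy example: three email-body tokens, hidden size 2 per direction.
forward = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
backward = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]
encoder_states = bidirectional_states(forward, backward)

decoder_state = [2.0, 0.0, 0.0, 2.0]  # hypothetical decoder LSTM state
weights, context = attention_context(decoder_state, encoder_states)
```

In a full model, the context vector would be combined with the decoder LSTM state to predict the next subject word; in this toy input, the first token receives the largest attention weight because its encoder state is most aligned with the decoder state.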
