dc.description.abstract |
Image captioning is an active research area owing to its social and commercial applications. To describe a picture, its visual features must first be understood. In this work, we use a Convolutional Neural Network (CNN) trained on images to extract features, from which words are generated and then arranged into a sentence. The activations of the CNN serve as input to a Recurrent Neural Network (RNN), which generates a complete caption; the two networks act as an encoder and a decoder, respectively. The datasets used in existing work on this task were inadequate, lacked diversity, and did not provide multiple captions per image. In this paper, we introduce a deep-learning-based method for captioning sunset-related images in the Bengali language. To achieve better results, we propose a model that merges an LSTM layer with the second-to-last layer of the VGG16 model through a dense layer. Our proposed model achieves 78.26% accuracy on sunset-related image captioning in Bengali. |
en_US |