DSpace Repository

Extractive text rank-based NLP news summarization for multiple domains

dc.contributor.author Mustofa, Md. Wazih Ullah
dc.date.accessioned 2024-10-03T08:12:29Z
dc.date.available 2024-10-03T08:12:29Z
dc.date.issued 2024-01-26
dc.identifier.uri http://dspace.daffodilvarsity.edu.bd:8080/handle/123456789/13500
dc.description.abstract This paper analyzes extractive summarization of news articles using Natural Language Processing (NLP) techniques. Approximately two thousand articles covering business, entertainment, politics, sports, and technology were gathered from different online platforms, including the well-known "Prothom Alo" newspaper. My method began with a preprocessing step that removed punctuation and special characters and corrected spelling with TextBlob. The primary focus of my study is the implementation of the TextRank algorithm, which adapts the PageRank algorithm to natural language text. With this technique, the text is represented as a graph whose vertices are the sentences and whose edges are weighted by the cosine similarity between them. I describe my process for vectorizing sentences and building a similarity matrix from the cosine similarity of each sentence pair. The paper explores the algorithmic details of using a customized sentence similarity function to rank sentences by relevance and importance. I then compared the generated summaries against the original texts, calculating similarity scores to evaluate the efficacy of my summarization process. The study aims to highlight the effectiveness of extractive summarization in processing large volumes of news data and to offer insight into the potential of NLP in media analytics. By comparing the actual summaries with those generated through my method, I draw conclusions about the precision and utility of extractive summarization for diverse news content. This research contributes to the field by demonstrating a practical application of NLP to the efficient processing and summarization of large-scale news data. en_US
dc.publisher Daffodil International University en_US
dc.subject NLP (Natural Language Processing) en_US
dc.subject Domain Adaptation en_US
dc.subject Information Retrieval en_US
dc.subject Automatic Summarization en_US
dc.title Extractive text rank-based NLP news summarization for multiple domains en_US
dc.type Other en_US
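
The pipeline outlined in the abstract (vectorize sentences, build a cosine similarity matrix, rank sentences with a PageRank-style iteration, then score the summary against the original text) can be sketched roughly as below. This is a minimal illustration, not the author's implementation. The bag-of-words vectors, the naive sentence splitter, the damping factor of 0.85, the fixed iteration count, and all function names are assumptions introduced here. The paper's customized similarity function and its TextBlob spell correction step are not reproduced.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter (assumption): break after ., !, ? or the Bangla danda."""
    return [s.strip() for s in re.split(r"(?<=[.!?।])\s+", text) if s.strip()]


def vectorize(sentence):
    """Bag-of-words term counts; lowercasing and \\w+ drop punctuation."""
    return Counter(re.findall(r"\w+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def similarity_matrix(sentences):
    """Pairwise cosine similarity; the diagonal is zero (no self-loops)."""
    vecs = [vectorize(s) for s in sentences]
    n = len(vecs)
    return [[cosine(vecs[i], vecs[j]) if i != j else 0.0 for j in range(n)]
            for i in range(n)]


def textrank(matrix, damping=0.85, iterations=50):
    """Weighted PageRank by power iteration over the sentence graph."""
    n = len(matrix)
    if n == 0:
        return []
    scores = [1.0 / n] * n
    out_weight = [sum(row) for row in matrix]
    for _ in range(iterations):
        scores = [
            (1 - damping) / n + damping * sum(
                matrix[j][i] / out_weight[j] * scores[j]
                for j in range(n) if out_weight[j] > 0
            )
            for i in range(n)
        ]
    return scores


def summarize(text, k=2):
    """Return the k highest-ranked sentences in their original order."""
    sentences = split_sentences(text)
    scores = textrank(similarity_matrix(sentences))
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return " ".join(sentences[i] for i in sorted(ranked[:k]))


def summary_similarity(summary, original):
    """Cosine similarity between a summary and the text it was drawn from."""
    return cosine(vectorize(summary), vectorize(original))


if __name__ == "__main__":
    article = (
        "The team won the match. The match was played in Dhaka. "
        "Stock prices fell sharply today. The team celebrated the win in Dhaka."
    )
    summary = summarize(article, k=2)
    print(summary)
    print(round(summary_similarity(summary, article), 3))
```

A sentence that shares no words with the rest of the article receives only the baseline score, so it is unlikely to be selected. Swapping `vectorize` for TF-IDF or embedding vectors would change the edge weights without changing the ranking step.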