Extractive text rank-based NLP news summarization for multiple domain

Mustofa, Md. Wazih Ullah

dc.contributor.author	Mustofa, Md. Wazih Ullah
dc.date.accessioned	2024-07-15T05:05:18Z
dc.date.available	2024-07-15T05:05:18Z
dc.date.issued	2024-01-22
dc.identifier.uri	http://dspace.daffodilvarsity.edu.bd:8080/handle/123456789/12951
dc.description.abstract	This paper presents an analysis of extractive summarization of news articles using Natural Language Processing (NLP) techniques. Approximately two thousand articles covering business, entertainment, politics, sports, and technology were gathered from online platforms, including the well-known "Prothom Alo" newspaper. Preprocessing consisted of removing punctuation and special characters and correcting spelling with TextBlob. The primary focus of the study is an implementation of the TextRank algorithm, which adapts the PageRank algorithm to natural language text. With this technique, text is represented as a graph in which vertices are sentences and edges carry the cosine similarity between them. I describe my process for vectorizing sentences and building a similarity matrix from the cosine similarity of each sentence pair. The paper explores the details of using a customized sentence similarity function to rank sentences by relevance and importance. I then compared the generated summaries against the original texts, calculating similarity scores to evaluate the summarization process. The study aims to highlight the effectiveness of extractive summarization in processing large volumes of news data and offers insights into the potential of NLP in media analytics. By comparing the actual summaries with those generated by my method, I draw conclusions about the precision and utility of extractive summarization for diverse news content. This research contributes to the field by demonstrating a practical application of NLP to the efficient processing and summarization of large-scale news data.	en_US
dc.publisher	Daffodil International University	en_US
dc.subject	Natural Language Processing (NLP)	en_US
dc.subject	Online Platform	en_US
dc.subject	Algorithms	en_US
dc.subject	Data Science	en_US
dc.title	Extractive text rank-based NLP news summarization for multiple domain	en_US
dc.type	Other	en_US
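
The pipeline outlined in the abstract (vectorize sentences, build a cosine similarity matrix, rank sentences with a PageRank-style iteration, then score the summary against the original text) can be illustrated with a minimal Python sketch. This is not the author's implementation. The function names, the bag-of-words vectorization, the naive sentence splitter, the damping factor of 0.85, and the fixed iteration count are all assumptions made for illustration. The TextBlob spell-correction step mentioned in the abstract is left out so that the example uses only the standard library.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace (assumption)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def vectorize(sentence):
    """Lowercase, drop punctuation and special characters, and count the words."""
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def textrank(sentences, damping=0.85, iterations=50):
    """Score sentences by power iteration over the sentence similarity graph."""
    n = len(sentences)
    if n == 0:
        return []
    vecs = [vectorize(s) for s in sentences]
    # Similarity matrix: edge weight is the cosine similarity, with no self-loops.
    sim = [[cosine(vecs[i], vecs[j]) if i != j else 0.0 for j in range(n)]
           for i in range(n)]
    out_weight = [sum(row) for row in sim]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        # Each sentence receives a share of its neighbours' scores,
        # proportional to the edge weight; isolated sentences pass on nothing.
        scores = [
            (1 - damping) / n
            + damping * sum(sim[j][i] / out_weight[j] * scores[j]
                            for j in range(n) if out_weight[j])
            for i in range(n)
        ]
    return scores


def summarize(text, k=2):
    """Return the k top-ranked sentences, kept in their original order."""
    sentences = split_sentences(text)
    scores = textrank(sentences)
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return " ".join(sentences[i] for i in sorted(ranked[:k]))


def summary_similarity(summary, original):
    """Cosine similarity between a summary and its source text."""
    return cosine(vectorize(summary), vectorize(original))


if __name__ == "__main__":
    article = ("The team won the match. The match was won by the home team. "
               "Stocks fell sharply.")
    summary = summarize(article, k=2)
    print(summary)
    print(round(summary_similarity(summary, article), 3))
```

In this sketch a sentence that shares no words with any other sentence keeps only the baseline score, so it is the first to be dropped from the summary. A real run over news articles would likely replace the raw word counts with TF-IDF or embedding vectors and use a proper sentence tokenizer.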