Abstract:
A web crawler is a program that systematically browses the Internet and downloads web content. It automatically detects and organizes resources from the web according to user needs, and it is an essential component of a search engine. A search engine's crawler browses webpages, gathers links from across the Internet, and evaluates each page's significance using metrics such as the number of pages that link to it. Search engines maintain records of the webpages their crawlers have visited and indexed; a website's pages will not appear in search results if they are not indexed. Within a short period of time, a crawler contacts millions of websites, consuming a sizable amount of network, storage, and memory resources, so web crawlers have become increasingly prominent over time. While earlier research focused on scalability and robustness, several gaps remain in this area, including intersecting sub-problems, limited scalability, growing runtime and delayed network loading, low load-balancing rates, and poor fault tolerance.
In this paper, the major focus is on coping with failure, with attention to deployment and fault tolerance across internal and external links. Only a small number of papers explain the programmatic approach to crawling and its different phases, and then only briefly. Here, we address the knowledge underlying the implementation of web crawling in order to make it easier for the audience to understand.
Keywords: web crawler, search engines, web pages, web spider.