Crawlets: agents for high performance web search engines

By Prasanna Thati, Po-Hao Chang, and Gul Agha

http://dx.doi.org/10.1007/3-540-45647-3_9

Abstract

Some of the reasons for unsatisfactory performance of today’s search engines are their centralized approach to web crawling and lack of explicit support from web servers. We propose a modification to conventional crawling in which a search engine uploads simple agents, called crawlets, to web sites. A crawlet crawls pages at a site locally and sends a compact summary back to the search engine. This not only reduces bandwidth requirements and network latencies, but also parallelizes crawling. Crawlets also provide an effective means for achieving the performance gains of personalized web servers, and can make up for the lack of cooperation from conventional web servers. The specialized nature of crawlets allows simple solutions to security and resource control problems, and reduces software requirements at participating web sites. In fact, we propose an implementation that requires no changes to web servers, but only the installation of a few (active) web pages at host sites.
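The crawling scheme sketched in the abstract, in which an uploaded crawlet crawls a site's pages locally and ships back only a compact summary, can be illustrated with a minimal toy. The function names, the in-memory page store, and the inverted-index summary format below are illustrative assumptions, not the paper's actual implementation:

```python
# Toy sketch of a crawlet (names and summary format are assumptions):
# crawl a site's pages locally, then return a compact summary
# (here, an inverted word index) instead of shipping full page
# contents back to the search engine.
import re

def crawlet(site_pages):
    """site_pages: dict mapping local URL -> raw HTML of a hosted page."""
    index = {}
    for url, html in site_pages.items():
        text = re.sub(r"<[^>]+>", " ", html)            # strip HTML tags
        for word in set(re.findall(r"[a-z]+", text.lower())):
            index.setdefault(word, []).append(url)
    return index                                         # compact summary sent back

# Example: two pages hosted at a participating site.
pages = {
    "/index.html": "<html><body>Web crawling with agents</body></html>",
    "/about.html": "<html><body>Agents reduce bandwidth</body></html>",
}
summary = crawlet(pages)
```

The point of the sketch is the bandwidth argument: the search engine receives only `summary`, a structure far smaller than the concatenated page bodies, and the per-site crawls can run in parallel.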

BibTeX

@inproceedings{conf/ma/ThatiCA01,
author = "Thati, Prasanna and Chang, Po-Hao and Agha, Gul",
editor = "Picco, Gian Pietro",
title = "Crawlets: Agents for High Performance Web Search Engines",
booktitle = "Mobile Agents",
crossref = "conf/ma/2001",
ee = "http://dx.doi.org/10.1007/3-540-45647-3_9",
keywords = "formal methods, multi-agent systems, web computing,
p2p",
pages = "119--134",
year = "2001",
}

@proceedings{conf/ma/2001,
editor = "Picco, Gian Pietro",
title = "Mobile Agents, 5th International Conference, MA 2001,
Atlanta, GA, USA, December 2-4, 2001, Proceedings",
isbn = "3-540-42952-2",
publisher = "Springer",
series = "Lecture Notes in Computer Science",
volume = "2240",
year = "2002",
}