Computer Science and Software Engineering

Crawlers, Agents, and Web Retrieval

Ricardo Baeza-Yates

Universidad de Chile

Fri Mar 12 15:10:00 NZDT 2004 in Room 031, MSCS

Abstract

Agents are everywhere. They are embedded in many systems with different goals, such as personalization, adaptation, and negotiation. Crawlers, on the other hand, are a simple type of program that wanders the Web every day collecting data for search engines. Crawlers, however, are becoming more complex. But can they be categorized as agents? In this talk we survey crawlers and relate them to generic agents by analyzing their main goals, i.e., page quality, page quantity, and page freshness. As the Web grows and changes every day, crawlers have become the main bottleneck for Web retrieval. It is therefore interesting to explore how agent technology, or variants of it, could help develop search engines that are more effective, efficient, and scalable. We present several schemes in which the Web server and the search engine cooperate to help each other. This raises the question of what impact agents have on the information market. We describe the weaknesses of this approach and how they can be partially addressed, opening the door to many research problems.

