TR-IIS-05-015
Design and Implementation of Domain-Based Proxy Prefetching
Ray-I Chang, Jan-Ming Ho
Abstract
Users are usually interested in a few specific domains
while surfing the Internet. Based on this domain-preferential browsing behavior,
the Domain-Top (DT) proxy
prefetching method is proposed. DT uses the popular pages of a popular
domain to model users’ future demands. When any page in a popular domain
is requested, the popular pages in that domain are treated as likely
future requests and are prefetched. DT prefetching rests on the
hypothesis that browsing behavior is always domain-preferential.
However, clients may also explore the Internet aimlessly and access different
domains in the near future. Analyzing proxy logs without accounting for such diverse
browsing behavior may lead to wrong predictions in prefetching. This paper proposes
the DTC (DT prefetching with Classification) method, which tries to improve DT
prefetching by removing unreliable logs. DTC adopts the concept of entropy to
classify browsing behavior into "domain mode" and "exploratory
mode". Only access logs in domain mode are considered when calculating the
popular domains. Unlike DT, which considers a constant number of popular
pages in prefetching, DTC assigns each domain a suitable number of popular pages.
Experiments on real traces show that the proposed DTC method achieves a higher
hit ratio than the DT method. Because DTC uses only historical logs
to decide offline which pages and domains are popular for prefetching,
only a few function modules of an existing proxy need to be revised. It imposes
little overhead and can be easily implemented in Squid, the most famous open
source proxy server.
Keywords: proxy caching, web prefetching, open source software
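
The offline classification and prefetch lookup described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: the log is a list of (client, url) pairs, entropy is the Shannon entropy of each client's domain distribution, a client counts as being in "domain mode" when that entropy does not exceed a hypothetical threshold max_entropy, and every popular domain gets the same pages_per_domain limit (the report assigns a suitable number per domain, which this sketch does not model). All function and parameter names are hypothetical.

```python
import math
from collections import Counter, defaultdict
from urllib.parse import urlsplit


def domain_of(url):
    """Return the host part of a URL, used here as its 'domain'."""
    return urlsplit(url).hostname or ""


def domain_entropy(urls):
    """Shannon entropy (in bits) of the domain distribution of some URLs."""
    counts = Counter(domain_of(u) for u in urls)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def build_popular_table(log, max_entropy=1.0, top_domains=2, pages_per_domain=2):
    """Offline step: drop high-entropy ('exploratory mode') clients, then
    rank domains and, inside each popular domain, rank pages by hit count.

    max_entropy, top_domains and pages_per_domain are assumed parameters.
    """
    by_client = defaultdict(list)
    for client, url in log:
        by_client[client].append(url)

    domain_hits = Counter()
    page_hits = defaultdict(Counter)
    for urls in by_client.values():
        if domain_entropy(urls) > max_entropy:
            continue  # treated as exploratory mode; log is not used
        for url in urls:
            d = domain_of(url)
            domain_hits[d] += 1
            page_hits[d][url] += 1

    return {
        d: [u for u, _ in page_hits[d].most_common(pages_per_domain)]
        for d, _ in domain_hits.most_common(top_domains)
    }


def prefetch_candidates(url, popular):
    """Online step: popular pages of the requested page's domain, if any."""
    return [p for p in popular.get(domain_of(url), []) if p != url]
```

In this sketch the table is built once from historical logs, and the proxy only performs a dictionary lookup per request, which mirrors the abstract's point that the popular pages and domains are decided offline.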