Adaptive Web caching addresses the problem of global data dissemination [136] [89]. It deals in particular with the "hot spot" phenomenon, in which specific content suddenly becomes massively popular and in high demand. Adaptive Web caching proposes a multicast-based approach to cope with the increasing scale of data dissemination. In that context, when multiple users are interested in the same data, a copy of the data can be obtained from the original server and then forwarded using multicast to all interested users.
Figure B.1: An example of multicast caching
Requests for content on the Internet, however, are asynchronous, because different users request Web content at different times. It is therefore proposed to use caches to support multicast data dissemination. Adaptive caching consists of multiple distributed cache groups called meshes. These meshes are organized according to content and demand, and also include Web servers. In this architecture, Web servers and caches are organized into multiple overlapping multicast groups, as shown in Figure B.1. When a user requests a data object, it sends the request to a nearby cache. If that cache does not find the requested object locally, it multicasts the request to a nearby local group of which it is a member; if some cache in the group has the requested object, that cache multicasts it to the group, and the initial cache forwards a copy back to the user. To handle a miss within the local group, each cache joins more than one multicast group, so that the cache groups heavily overlap each other. When there is a cache miss in one group, each cache of that group checks whether its other group lies in the direction of the originating server of the requested Web document. A cache that finds itself in the right position forwards the request to its other group and informs the current group that it has done so.
If the second cache group also misses, the request is forwarded further following the same rules. Proceeding in this fashion, the request either reaches a cache group that holds the data object or is forwarded through a chain of overlapping cache groups between the client and the originating server until it reaches the group that includes the original server of the requested data object. Once the request reaches a group in which one or more nodes have the requested data, the node holding the page multicasts the response to the group.
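The forwarding rule described above can be illustrated with a minimal Python sketch. It is not taken from the adaptive caching proposal itself: the function name `resolve`, the dictionary-based data model, and the node and group names are hypothetical, and the chain of groups is assumed to be already ordered from the client side towards the originating server.

```python
def resolve(url, start, stores, chain):
    """Follow a request through a chain of overlapping cache groups.

    stores: hypothetical map of node name -> set of URLs that node holds.
    chain:  list of (group name, set of member names), assumed to be ordered
            from the client side towards the originating server.
    Returns (name of the node that answers, names of the groups queried).
    """
    # The cache nearest to the user first checks its own store.
    if url in stores[start]:
        return start, []
    queried = []
    for i, (group_name, members) in enumerate(chain):
        queried.append(group_name)
        # The request is multicast to the group; any member holding it answers.
        holders = sorted(m for m in members if url in stores[m])
        if holders:
            return holders[0], queried
        # On a miss, a member that also belongs to the next group forwards it.
        if i + 1 < len(chain) and not (members & chain[i + 1][1]):
            raise ValueError(f"{group_name} does not overlap the next group")
    return None, queried


# Hypothetical topology: three overlapping groups ending at the origin server.
stores = {"C1": set(), "C2": set(), "C3": {"/a"}, "C4": set(),
          "origin": {"/a", "/b"}}
chain = [("G1", {"C1", "C2"}),
         ("G2", {"C2", "C3", "C4"}),
         ("G3", {"C4", "origin"})]

hit_in_cache = resolve("/a", "C1", stores, chain)
hit_at_origin = resolve("/b", "C1", stores, chain)
```

In this toy topology the request for `/a` is answered by a cache in the second group, while the request for `/b` travels through all three groups before the origin server answers it.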
For this caching infrastructure to be scalable, the organization of Web caches into overlapping groups must be self-configuring. Self-organizing algorithms and protocols are proposed that allow cache groups to adjust themselves dynamically to changing conditions in network topology, traffic load, and user demand. To this end, adaptive caching uses the Cache Group Management Protocol (CGMP) and the Content Routing Protocol (CRP). CGMP specifies how meshes are formed and how individual caches join and leave those meshes. In general, caches are organized into overlapping multicast groups, which use voting and feedback techniques to estimate the usefulness of admitting or excluding members. The ongoing negotiation of mesh formation and membership results in a virtual topology. CRP is used to locate cached content within the existing meshes. It takes advantage of the overlapping nature of the meshes to propagate object queries between groups as well as to propagate popular objects throughout the mesh.
An important assumption of the adaptive caching approach is that the deployment of cache clusters across administrative domains is not an issue. If the topologies are to be more flexible, the administrative cache policies must be relaxed so that groups can form naturally at different locations in the network.
LSAM
LSAM (Large Scale Active Middleware) [119] is an architecture that uses self-organizing multicast push based on interest groups. LSAM proxies are deployed near both clients and servers. Near the client, the proxy acts as an intelligent cache, allowing multicast channels to preload it with relevant pages. Near the server, the proxy acts as an intelligent pump, managing multicast groups and detecting page affinities in order to multicast related information to a set of interested client caches.
LSAM uses multicast for the automated push of popular Web pages. LSAM proxies are deployed as a server pump and a distributed filter hierarchy. These components automatically track the popularity of Web page groups and automatically manage server push. In this architecture, Web pages are organized into affinity groups according to their popularity. Individual requests trigger multicast responses when the requested pages are members of active affinity groups. A request is checked at intermediate proxies and forwarded to the server. The pump multicasts the response to the filters in the group, and the final proxy unicasts it back to the originating client. Subsequent requests are handled locally by the filters near the clients.
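The interaction between the pump and the filters can be sketched in Python as follows. This is only an illustration under stated assumptions, not the LSAM implementation: the class names `Pump` and `Filter`, the request counter, and the `threshold` that decides when an affinity group counts as active are all hypothetical, since the description above does not specify how popularity is measured.

```python
from collections import Counter


class Filter:
    """Proxy near the clients; acts as a cache that can be preloaded."""

    def __init__(self):
        self.cache = set()

    def request(self, page, pump, group_filters):
        # Pages pushed earlier are served locally, without contacting the pump.
        if page in self.cache:
            return "local"
        return pump.serve(page, group_filters)


class Pump:
    """Proxy near the server; tracks the popularity of affinity groups."""

    def __init__(self, affinity_of, threshold=2):
        self.affinity_of = affinity_of  # page -> affinity group (hypothetical)
        self.threshold = threshold      # assumed activation rule, not from LSAM
        self.hits = Counter()

    def serve(self, page, group_filters):
        group = self.affinity_of[page]
        self.hits[group] += 1
        if self.hits[group] >= self.threshold:
            # Active affinity group: push the page to every filter in the group.
            for f in group_filters:
                f.cache.add(page)
            return "multicast"
        return "unicast"


pump = Pump({"/news/1": "news", "/news/2": "news"}, threshold=2)
f1, f2 = Filter(), Filter()
filters = [f1, f2]

first = f1.request("/news/1", pump, filters)   # group not yet active
second = f2.request("/news/2", pump, filters)  # group becomes active
third = f1.request("/news/2", pump, filters)   # already pushed to f1
```

Once the hypothetical threshold is reached, a page requested through one filter is also preloaded into the other filters of the group, so a later request for it at another filter is answered locally.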
IMPPS
IMPPS (Intelligent Multicast Push and Proxy System) [83] is a system for interactive multicast caching that runs both at the end-user location and at the base station. IMPPS uses reliable multicast instead of TCP/IP to request and deliver Web content interactively. Furthermore, IMPPS can be used to keep Web caches up to date by pushing fresh and popular Web content. This architecture is proposed in the context of LMDS (Local Multipoint Distribution Service) [72], a broadband wireless access technology that provides two-way transmission of data and multimedia. Requests from the clients are received by the IMPPS proxy-cache through unicast IP. If the data is not available, the request is forwarded via reliable multicast to the authenticated remote IMPPS proxy-caches. The remote IMPPS proxy-cache checks whether it can serve the requested Web object itself or whether the object has to be fetched from the Internet. In case of a hit at the remote IMPPS proxy-cache, it sends the Web data object back via multicast. If the Web data object is not cached, it is requested from a nearby proxy via multicast. IMPPS proxy-caches communicate with each other using MCP (Multicast Cache Protocol), which is provided by a specific multicast transport protocol that replaces IP multicast. IMPPS also proposes the use of push techniques to disseminate fresh Web content via multicast into the Web caches of the end users.
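The request path described above can be summarized in a short Python sketch. It is a simplified model under stated assumptions rather than the IMPPS implementation: the function `impps_lookup`, the dictionary caches, and the callable `fetch_upstream` (standing in for the nearby proxy or the Internet) are hypothetical, transport details such as reliable multicast and MCP are not modelled, and storing fetched objects in both caches is an assumption.

```python
def impps_lookup(url, local_cache, remote_cache, fetch_upstream):
    """Return (body, source) for a client request.

    local_cache:    dict for the proxy-cache at the end-user location.
    remote_cache:   dict for the remote proxy-cache at the base station.
    fetch_upstream: hypothetical callable used when both caches miss.
    """
    # The client request reaches the local proxy-cache via unicast IP.
    if url in local_cache:
        return local_cache[url], "local"
    # Local miss: the request goes to the remote proxy-cache
    # (reliable multicast in IMPPS, a plain lookup in this sketch).
    if url in remote_cache:
        body, source = remote_cache[url], "remote"
    else:
        # Remote miss: the object has to be obtained from further upstream.
        body, source = fetch_upstream(url), "upstream"
        remote_cache[url] = body  # assumption: the remote cache keeps a copy
    local_cache[url] = body       # assumption: the local cache keeps a copy
    return body, source


local, remote = {}, {"/a": b"A"}


def fetch(url):
    # Stand-in for a fetch from a nearby proxy or the Internet.
    return b"origin:" + url.encode()


r1 = impps_lookup("/a", local, remote, fetch)
r2 = impps_lookup("/a", local, remote, fetch)
r3 = impps_lookup("/b", local, remote, fetch)
```

The three calls exercise the three cases in turn: a hit at the remote proxy-cache, a subsequent hit at the local proxy-cache, and a miss at both that falls through to the upstream fetch.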