In this paper we present our investigation into developing a service overlay network on top of CDNs for delivering content-oriented web services. Our goal is to make rich content services part of the Internet infrastructure services, accessible to both content providers and content consumers. A prototype system is reported in the paper.
Content delivery network, web services, adaptive content delivery, personalization
As the Internet moves toward a service-centric model, more and more storage and computational resources are being placed in the Internet infrastructure and offered as services to customers. In the content networking world, for instance, this trend can be seen in the development of content delivery networks (CDNs), which make content distribution a network infrastructure service available to content providers and network access providers. Meanwhile, progress on the standardization and development of web services based on XML, SOAP, UDDI, and WSDL has marked the beginning of a new era in which every computational resource and service on the Internet can be connected to provide new user experiences for accessing, sharing, and using information anytime, anywhere, from any device.
In this paper, we present our recent progress on bridging the content networking world and the web services world, building on our previous work on the content services network. Our goal is to build a service overlay network on top of content delivery networks that provides value-added services for content while it is flowing through the Internet. We began by focusing on services such as content adaptation for universal access and content personalization, but the framework we describe is extensible to other content-oriented services.
To achieve this goal, several technical challenges need to be tackled. First, service-enabled web caches, which do more than caching, have to be deployed within CDNs. Second, we need a way to instruct these service-enabled web caches to perform the desired actions. These actions include performing light-weight processing locally and communicating with other network entities, such as application servers, to provide more complicated services.
Our system comprises two layers of network infrastructure: a content delivery overlay (i.e., CDNs) and a service delivery overlay. The content delivery overlay is a network of service-enabled web caches, which extend the functionality of traditional web caches to perform value-added processing. The service delivery overlay consists of a large number of application servers, which act as remote call-out servers for the service-enabled web caches. These servers are managed by so-called services delivery and management (SDM) servers. The two overlays work together to deliver content-oriented web services for content.
The service-enabled web cache is the key component that enables the content delivery overlay to interact with the service delivery overlay. Figure 1 shows the service-enabled web cache, which consists of six main parts: an instruction parser, a message parser, an instruction processor, a service execution module, an instruction cache, and a result cache.
Figure 1. The service-enabled web cache.
The basic operations of the service-enabled web cache are as follows:
Before a service becomes available, it must be registered in the UDDI registry. Components received from service providers, such as service specifications and binaries, are stored in the service database. The service information is then published to the UDDI registry for public discovery and access. Along with service registration and publication, the corresponding application or executable module is provided to an SDM server, which then dynamically selects a set of edge servers on which to deploy the service, based on estimated demand and required geographical coverage.
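The selection step can be illustrated with a minimal sketch. The paper does not specify the selection algorithm, so the greedy policy below, along with the names `EdgeServer`, `select_edge_servers`, `estimated_demand`, and `max_servers`, is purely an assumption made for illustration: it first covers every required region with that region's highest-demand server, then fills any remaining slots by demand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EdgeServer:
    name: str
    region: str
    estimated_demand: float  # hypothetical demand estimate for the service


def select_edge_servers(servers, required_regions, max_servers):
    """Greedy sketch (assumed policy, not the paper's algorithm)."""
    by_demand = sorted(servers, key=lambda s: s.estimated_demand, reverse=True)
    chosen = []
    covered = set()

    # Pass 1: take the highest-demand server in each required region.
    for server in by_demand:
        if len(chosen) >= max_servers:
            break
        if server.region in required_regions and server.region not in covered:
            chosen.append(server)
            covered.add(server.region)

    # Pass 2: fill the remaining slots with the highest-demand servers left.
    for server in by_demand:
        if len(chosen) >= max_servers:
            break
        if server not in chosen:
            chosen.append(server)

    return chosen
```

A real SDM server would presumably also weigh server load and deployment cost; those factors are left out here.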
One fundamental requirement for enabling content-oriented web services in the content delivery path is determining whether the current content (e.g., an HTTP message) should be serviced and, if so, how to invoke the service. A straightforward approach is to label the content to indicate that it needs special handling. A service-enabled web cache that intercepts the labeled content then takes the appropriate action according to the instruction. For content labeling, there are two choices: 1) embed service instructions directly into the content (e.g., in HTTP headers), or 2) attach an indicator (a URI) to the content that refers to an external document containing the service instructions. However, both methods require changes to origin servers or user agents, which is a significant obstacle to service deployment.
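To make the two labeling options concrete, the sketch below shows how a cache might detect either kind of label on an HTTP message. The header names `X-Service-Instruction` and `X-Service-Instruction-URI` are hypothetical, chosen only for this illustration; the paper does not name any headers.

```python
from typing import Dict, Optional, Tuple

# Hypothetical header names, used only for this illustration.
INLINE_HEADER = "X-Service-Instruction"         # option 1: embedded instruction
REFERENCE_HEADER = "X-Service-Instruction-URI"  # option 2: pointer to a document


def find_label(headers: Dict[str, str]) -> Optional[Tuple[str, str]]:
    """Return ("inline", text) or ("reference", uri), or None if unlabeled."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    if INLINE_HEADER.lower() in normalized:
        return ("inline", normalized[INLINE_HEADER.lower()])
    if REFERENCE_HEADER.lower() in normalized:
        return ("reference", normalized[REFERENCE_HEADER.lower()])
    return None
```

Either way, something upstream has to add the header, which is exactly the deployment obstacle noted above.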
Another approach is to let the subscriber and the service provider agree on a service contract, and then let the system convert that contract into a service binding. A service binding describes the association between the subscribed services and the subscriber (e.g., the domain name of the subscriber if it is a content provider). Service bindings are maintained by the service delivery overlay, which continually updates the service-enabled web caches with the latest service instructions.
Content providers and content consumers first find services of interest in the UDDI registry, and then follow the access point to the service subscription interface. The resulting service binding is stored in the service database. This binding enables future delivery of the subscribed services to the content.
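As a rough illustration of how such bindings could be stored and looked up, the sketch below uses Python's built-in `sqlite3` module as a stand-in for the service database. The table layout, the column names, and the helpers `open_service_database`, `subscribe`, and `services_for` are all assumptions made for this example, not the prototype's actual schema.

```python
import sqlite3


def open_service_database():
    """Create an in-memory database with a hypothetical bindings table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE bindings ("
        " subscriber TEXT NOT NULL,"   # e.g. a content provider's domain name
        " service TEXT NOT NULL,"      # name of the subscribed service
        " parameters TEXT NOT NULL DEFAULT '',"
        " PRIMARY KEY (subscriber, service))"
    )
    return db


def subscribe(db, subscriber, service, parameters=""):
    """Record a binding; subscribing again updates the stored parameters."""
    db.execute(
        "INSERT OR REPLACE INTO bindings (subscriber, service, parameters)"
        " VALUES (?, ?, ?)",
        (subscriber, service, parameters),
    )
    db.commit()


def services_for(db, subscriber):
    """Return (service, parameters) pairs bound to a subscriber, by name."""
    rows = db.execute(
        "SELECT service, parameters FROM bindings"
        " WHERE subscriber = ? ORDER BY service",
        (subscriber,),
    )
    return rows.fetchall()
```

Keying bindings on the subscriber's domain name mirrors the example given for content providers; a binding for a content consumer would need some other identifier.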
After service subscription and binding, service instructions are generated. The instructions are transferred from the SDM servers to the service-enabled web caches with which the subscriber is associated. The service-enabled web cache then uses the instructions to determine whether a message needs to be serviced. If the message satisfies the condition specified in an instruction, the service-enabled web cache either provides the service locally or issues a remote callout to the application servers.
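The matching and dispatch logic described above can be sketched as follows. This is a minimal illustration under stated assumptions: the `Message` and `Instruction` fields, the condition format (host plus content-type prefix), and the `remote_callout` callable are all hypothetical, since the paper does not define the instruction format or the callout protocol.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Message:
    host: str
    content_type: str
    body: str


@dataclass
class Instruction:
    host: str                          # condition: subscriber's domain
    content_type_prefix: str           # condition: e.g. "text/html"
    service: str                       # name of the service to apply
    callout_url: Optional[str] = None  # None means the service runs locally


def condition_holds(instr: Instruction, msg: Message) -> bool:
    """Check the (assumed) execution condition against a message."""
    return (msg.host == instr.host
            and msg.content_type.startswith(instr.content_type_prefix))


def apply_instructions(
    msg: Message,
    instructions: List[Instruction],
    local_services: Dict[str, Callable[[Message], Message]],
    remote_callout: Callable[[str, str, Message], Message],
) -> Message:
    """Apply every matching instruction in order, locally or via callout."""
    for instr in instructions:
        if not condition_holds(instr, msg):
            continue
        if instr.callout_url is None:
            msg = local_services[instr.service](msg)
        else:
            msg = remote_callout(instr.callout_url, instr.service, msg)
    return msg
```

Passing `remote_callout` in as a parameter keeps the sketch independent of any particular callout protocol and makes the dispatch logic easy to exercise with a stub.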
We have built a prototype system called Media Companion to evaluate our ideas. This system is currently deployed in the corporate network of Microsoft Research Asia.
The service-enabled web cache is built on Microsoft Internet Security and Acceleration (ISA) Server 2000. We use the HTTP port to upload service instructions from the SDM servers to the ISA server. In our prototype, an instruction module contains the instructions for one content provider or content consumer, and typically holds several instructions. A newly uploaded module completely replaces the old one. We use Microsoft Visual Studio .NET to develop the service binding and enabling components. The service interface, which receives requests from subscribers, was written in ASP.NET and C#. The service database, which stores the binding relationships between subscribers and services, was implemented using Microsoft Access. It also stores other service-related information such as execution conditions, parameters, and location.
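The module replacement behavior described above, where a new upload discards the old module entirely rather than merging with it, can be sketched as below. The `InstructionCache` class and its method names are hypothetical and are written in Python for brevity; they are not the prototype's C# or ISA Server code.

```python
class InstructionCache:
    """Holds one instruction module per subscriber (illustrative only)."""

    def __init__(self):
        self._modules = {}

    def upload(self, subscriber, instructions):
        # Replace the whole module; nothing from the old one is kept.
        # Copy the list so later changes by the caller do not leak in.
        self._modules[subscriber] = list(instructions)

    def instructions_for(self, subscriber):
        # Return a copy so callers cannot modify the cached module.
        return list(self._modules.get(subscriber, []))
```

Whole-module replacement keeps the cache's state a direct copy of the latest upload, at the cost of resending unchanged instructions.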
Five different content-oriented services are available in our corporate network. They are web-page adaptation, video summarization (as shown in Figure 2), face extraction, personalized content insertion and language translation.
Figure 2. Adaptive video delivery to mobile devices.
In this paper we presented our work on delivering content-oriented web services to Internet media. We described the framework of our system which consists of a new service delivery overlay and the service-enabled web cache in the content delivery overlay. The services are enabled based on a subscription model which allows content providers and content consumers to easily leverage the edge resources to deliver or access information more effectively.