News Column

Patent Issued for Optimizing Multi-Hit Caching for Long Tail Content

February 13, 2014

By a News Reporter-Staff News Editor at Computer Weekly News -- A patent by the inventors Khakpour, Amir (Los Angeles, CA); Peters, Robert J. (Santa Monica, CA), filed on December 19, 2012, was published online on January 28, 2014, according to news reporting originating from Alexandria, Virginia, by VerticalNews correspondents.

Patent number 8639780 is assigned to EdgeCast Networks, Inc. (Santa Monica, CA).

The following quote was obtained by the news editors from the background information supplied by the inventors: "Content delivery networks (CDNs) have greatly improved the way content is transferred across data networks such as the Internet. One way a CDN accelerates the delivery of content is to reduce the distance that content travels in order to reach a destination. To do so, the CDN strategically locates surrogate origin servers, also referred to as caching servers or edge servers, at various points-of-presence (PoPs) that are geographically proximate to large numbers of end users and the CDN utilizes a traffic management system to route requests for content hosted by the CDN to the caching server that can optimally deliver the requested content to the requesting end user. As used hereafter optimal delivery of content refers to the most efficient available means with which content can be delivered from a server to an end user machine over a data network. Optimal delivery of content can be quantified in terms of latency, jitter, packet loss, distance, and overall end user experience.

"Determination of the optimal caching server may be based on geographic proximity to the requesting end user as well as other factors such as load, capacity, and responsiveness of the caching servers. The optimal caching server delivers the requested content to the requesting end user in a manner that is more efficient than when origin servers of the content provider deliver the requested content. For example, a CDN may locate caching servers in Los Angeles, Dallas, and New York. These caching servers may cache content that is published by a particular content provider with an origin server in Miami. When a requesting end user in San Francisco submits a request for the published content, the CDN will deliver the content from the Los Angeles caching server on behalf of the content provider as opposed to the much greater distance that would be required when delivering the content from the origin server in Miami. In this manner, the CDN reduces the latency, jitter, and amount of buffering that is experienced by the requesting end user. The CDN also allows the content provider to offload infrastructure costs, configuration management, and maintenance to the CDN while being able to rapidly scale resources as needed. Content providers can therefore devote more time to the creation of content and less time to the creation of an infrastructure that delivers the created content to the end users. As a result of these and other benefits, many different CDNs are in operation today. Edgecast, Akamai, Limelight, and CDNetworks are some examples of operating CDNs.

"CDNs differentiate themselves on the basis of cost and performance. One area in which CDNs strive to improve in terms of cost and performance is caching. However, it is often the case that improved caching performance begets increased costs. For example, a CDN can deploy additional storage to each of its caching servers at added cost in order to increase the amount of available cache at each of its caching servers. Similarly, the CDN can deploy more expensive solid state disks (SSDs) in its caching servers instead of cheaper magnetic disk in order to improve responsiveness of its caching servers.

"To avoid these tradeoffs in cost and performance, CDNs and other cache operators are continually in search of new caching techniques, devices, etc. that improve caching performance without added cost. One such area of focus is the efficiency with which existing caching servers cache content.

"CDNs typically utilize first hit caching to determine when to cache content. First hit caching has been preferred because of its simplicity and relative good performance. When performing first hit caching, a caching server will retrieve requested content from an origin, pass the retrieved content to the requesting end user, and store the content to local cache when the content is requested for the first time. The next time that content is requested, the caching server will retrieve and serve the content from its local cache rather than from the origin.

"However, first hit caching performance is greatly affected by caching of 'long-tail' content. As a result, first hit caching yields suboptimal resource utilization. FIG. 1 illustrates the long-tail distribution of content for purposes of explaining its impact on first hit caching.

"In FIG. 1, the x-axis represents content that is requested at a caching server over an interval of time. The y-axis represents the number of requests for each item of content during that interval of time. As shown, some percentage of 'hot' content 110 is requested frequently and some percentage of content, also referred to as the 'long-tail' content 120, is infrequently requested (i.e., once or a small number of times). A caching server performing first hit caching caches all such content the first time it is requested. In so doing, caching servers with scarce cache availability may replace hot content with long-tail content in cache. This in turn increases cache miss rates. This issue can be resolved with added cost to the caching server operator by increasing the available storage at each cache server. Doing so however introduces other inefficiencies and performance degradations that result from caching of long-tail content. Specifically, long-tail content is rarely, if ever, served from cache. Consequently, a caching server wastes resource intensive write operations to cache long-tail content and to purge long-tail content from cache when the content expires. Such extraneous write operations could potentially degrade the responsiveness of the caching server by introducing delay when having to respond to other operations. Such extraneous write operations reduce the ability of the caching server to handle greater loads. Such extraneous write operations also reduce the useful life for the storage hardware at the caching server. Specifically, magnetic disk drives are more likely to suffer mechanical failure sooner and SSDs are more likely to suffer from failing memory cells sooner when performing the extraneous writes associated with caching the long-tail content. Further still, increased disk fragmentation results at the caching server because of the additional writing and purging of the long-tail content. Such disk fragmentation has been shown to slow access to content and thereby degrade caching performance.

"To avoid these and other shortcomings associated with first hit caching and, more specifically, the shortcomings associated with caching long-tail content, some CDNs have utilized second hit caching or multi-hit caching that cache content when it is requested two or more times. This avoids caching some of the long-tail content that is requested only once or a few times. However, these multi-hit caching techniques suffer from other shortcomings that reintroduce the tradeoff between performance and cost. Some such shortcomings include increased processor and memory overhead needed to track content request counts, to track when to cache content, and to track what content has been cached. For example, some existing multi-hit caching techniques store the uniform resource locators (URLs) or textual names of the content being requested in conjunction with the number of times that content is requested, thereby imposing onerous memory overhead. As another example, some existing multi-hit caching techniques identify whether content is cached or has been previously requested one or more times with a sorted list or similar structure where the searching of such a structure imposes log(n) complexity and onerous processing overhead as a result. These inefficiencies and overhead increase latency, access times, and overall responsiveness of the caching server, thus offsetting the performance gains that are realized from avoiding caching long-tail content.

"Moreover, some second hit caching or multi-hit caching techniques impose added cost in the form of infrastructure modifications and additions that are needed to maintain content request counts and where content is cached. For example, United State Patent Publication 2010/0332595 entitled 'Handling Long-Tail Content in a Content Delivery Network (CDN)' introduces a new server, referred to as a popularity server, into existing infrastructure to track the number of times content is requested. In addition to the added costs for deploying and maintaining the popularity server, the centralized framework also introduces performance reducing delay as a result of the communication that occurs between the caching servers and the popularity server.

"Accordingly, there is a need to improve CDN performance without increased cost. One specific area of need is to improve cache performance without increasing cost and without offsetting other areas of performance. Specifically, there is a need for an optimized multi-hit caching technique that avoids the performance impact that long-tail content has on cache performance while still achieving similar performance as first hit caching in terms of identifying what content to cache and identifying whether content is cached."

In addition to the background information obtained for this patent, VerticalNews journalists also obtained the inventors' summary information for this patent: "It is an object of the embodiments described herein to provide an optimized multi-hit caching technique. More specifically, it is an object to minimize the effect of long-tail content on cache performance while retaining much of the efficiency and minimal overhead that is associated with first hit caching in determining when to cache content and in determining which content has been cached. Stated differently, it is an object to minimize the performance impact associated with caching of long-tail content without imposing onerous processing and memory overhead to track and identify content request counts and cached content. In satisfying these objects, it is an object to reduce (when compared to first hit caching techniques) the number of writes and purges that are performed by the caching server, thereby (1) extending the life of the computer readable storage medium of the caching server, (2) improving latency and access times of the caching server by freeing the caching server from performing extraneous write and purge operations, (3) reducing the amount of storage needed at the caching server to maintain a sufficiently high cache hit rate and a sufficiently low cache miss rate, and (4) reducing the disk defragmentation that occurs at the caching server. The optimized multi-hit caching technique is intended for execution by caching servers operating in a distributed environment such as a content delivery network (CDN) whereby each of the caching servers of the CDN can perform the optimized multi-hit caching independently without a centralized framework. It should be apparent that the optimized multi-hit caching technique is also applicable to any server that performs caching in an intranet, wide area network (WAN), internet, or with other communicably coupled set of networked devices.

"In some embodiments, the optimized multi-hit caching technique performs N-hit caching whereby content is cached to a computer readable storage medium of a caching server when that content is requested N times, wherein N is an integer value greater than one. In some embodiments, the optimized multi-hit caching technique efficiently tracks the number of requests using N-1 bit arrays when performing N-hit caching. In some embodiments, the optimized multi-hit caching technique is interval restricted such that content is cached when the requisite number of requests for that content is received within a particular specified interval. Such N-hit interval restricted caching avoids much of the performance impact that is associated with caching of long-tail content, where long-tail content is content that is requested less than N times during the particular specified interval. The performance impact associated with caching long-tail content includes (1) greater utilization of the computer readable storage medium thereby necessitating that each caching server be deployed with a larger storage medium or be subject to greater thrashing (content replacement), (2) increased load on the caching server as a result of having to perform additional write and purge operations to cache the long-tail content, (3) less uptime for the caching server as the storage medium of the caching server is more likely to suffer failure because of greater utilization, and (4) decreased performance for the caching server as a result of greater disk fragmentation.

"To simplify the discussion, the optimized multi-hit caching technique will be described by example of a second hit caching implementation that is interval restricted. To efficiently implement such second hit caching and avoid much of the processing and memory overhead associated with tracking the number of times content has been requested and performing the lookup associated therewith, some embodiments utilize hashing in conjunction with a single-bit bit array or bitmap. For an N-hit implementation of the optimized caching technique, N-1 bit arrays may be used.

"In some embodiments, the optimized caching technique is performed when a request for content (i.e., content request) is received at a caching server. A lookup in cache determines if the requested content is already cached. If so, the content is served from cache. Otherwise, hashing is performed to convert an identifier that is extract from the content request (e.g., filename, URL, etc.) into a set of bit indices that uniquely identify the content request according to the positions of the produced bit indices in the bit array. More specifically, the caching server extracts from the content request an identifier for identifying the content being requested. The caching server uses the extracted identifier as input to each hash function of a set of hash functions. Each hash function produces an index at a particular position of the bit array and the collective set of produced indices and their corresponding positions in the bit array uniquely represent the content being requested. Each produced index is then compared with indices previously entered in the bit array. When the corresponding bit array indices representing the requested content are not set in the bit array, the bit array identifies the content request as the first request or first hit for such content. Accordingly, the requested content is retrieved from an origin and served to the requestor without caching to avoid caching on the first hit. The bit array indices representing the requested content are also populated in the bit array to record the first hit for such content. When the corresponding bit array indices representing the requested content are already set in the bit array, the requested content is retrieved from an origin, forwarded to the requestor, and cached as the current request will be indicative of the second request or second hit for the content. In some embodiments, a second bit array is used in conjunction with second hit caching to efficiently perform the lookup in cache.

"As noted above, the optimized multi-hit caching is interval restricted in some embodiments to further mitigate the performance impact that is associated with caching long tail content. This is because in traditional multi-hit caching techniques that are not interval restricted, the Nth hit to cache the long-tail content is an eventuality and given an infinite duration, such traditional multi-hit caching techniques will receive the Nth request for the long-tail content and therefore cache the long-tail content. By restricting the interval with which the requisite Nth hit occurs before caching content, the effectiveness of multi-hit caching in avoiding the performance impact associated with long-tail content is increased. Specifically, the multi-hit caching is optimized to define long-tail content in terms of at least two dimensions that include 1) a requisite number of N hits and 2) a particular specified duration. For example, given a ten second interval and second-hit caching, the optimized multi-hit caching technique caches content that is requested at least twice during a particular ten second interval. Content that is requested less than two times in each ten second interval is considered long-tail content. To restrict the interval for the optimized multi-hit caching technique, some embodiments flush or clear the bit array at periodic intervals or upon defined events.

"Using the bit indices to represent requested content eliminates much of the onerous memory overhead that is associated with storing URLs, filenames, or other identifiers for the requested content. The bit array in conjunction with the bit array indices representing requested content allows the hit count for all content requests to be tracked using a single fixed sized storage structure. Moreover, hashing enables searching of the bit array in constant time to determine if requested content has not yet been requested or has been requested at least once.

"The hashing and bit array are consistent with a standard bloom filter implementation. However, the standard bloom filter is not suited for purposes of content caching. This is because the standard bloom filter, or more specifically the array of the standard bloom filter, does not provide functionality to remove indices representing one particular piece of content from the array without affecting identification of other content that may be represented with one or more indices overlapping with the indices representing the particular piece of content. As the array of the standard bloom filter is continually populated with new indices and stale indices are not removed, the ratio of false positives increases, thereby lessening the accuracy and effectiveness with which the standard bloom filter identifies content request counts over time. Furthermore, simply flushing the array of the standard bloom filter causes request counts for relevant or actively monitored content to be lost in addition to request counts for stale or expired content. This loss of information can lead to N+1 hit caching when N-hit caching is being performed.

"Accordingly to perform the optimized multi-hit caching using a set of hash functions with at least one bit array that is interval restricted and periodically flushed, some embodiments implement a proprietary modified bloom filter. The modified bloom filter, also referred to as a rolling flushed bloom filter, stores a copy of the bit array prior to flushing or clearing each of the indices of the bit array at specified intervals. A copy of the bit array is made to avoid losing track of content that was requested during the previous interval. In some such embodiments, the bit indices representing requested content are thus compared against a previous copied state of the bit array and a current state of the bit array (1) to avoid caching of long-tail content that is not requested a requisite number of times during the previous and current intervals, (2) to ensure the effectiveness of the bit array in accurately representing content request counts by reducing the possibility of a false positive by flushing stale bit indices representing long-tail content from the bit array, and (3) to avoid the potential for N+1 hit caching.

"Some embodiments of the optimized multi-hit caching technique further incorporate tiered caching to negate the load increase that an origin would otherwise experience when performing N-hit caching in place of first hit caching. The optimized multi-hit caching with tiered caching uses a first cache tier that performs the optimized multi-hit caching using the modified bloom filter and a second cache tier that performs first hit caching. In this manner, optimized multi-hit caching is performed to avoid the performance impact of long-tail content with the load to the origin being the same as what the origin would experience if only first hit caching was performed."

URL and more information on this patent, see: Khakpour, Amir; Peters, Robert J.. Optimizing Multi-Hit Caching for Long Tail Content. U.S. Patent Number 8639780, filed December 19, 2012, and published online on January 28, 2014. Patent URL:

Keywords for this news article include: EdgeCast Networks Inc.

Our reports deliver fact-based news of research and discoveries from around the world. Copyright 2014, NewsRx LLC

For more stories covering the world of technology, please see HispanicBusiness' Tech Channel

Source: Computer Weekly News

Story Tools