News Column

Patent Application Titled "Predictive Cache Replacement" Published Online

June 3, 2014



By a News Reporter-Staff News Editor at Information Technology Newsweekly -- According to news reporting originating from Washington, D.C., by VerticalNews journalists, a patent application by the inventor Frachtenberg, Eitan (Palo Alto, CA), filed on November 12, 2012, was made available online on May 22, 2014.

No assignee for this patent application has been made.

Reporters obtained the following quote from the background information supplied by the inventors: "Caching is a mechanism that accelerates access (i.e., reduces latency) to data on slower storage media by managing a subset of the data in a smaller, faster, and typically more expensive storage medium. As a result, the average latency of the data access will be between the latency of the slower and faster storage. Caches come in many shapes and forms. For example, caches can be embodied in hardware (such as CPU caches) and/or software (such as Memcached). In some cases, caches can also be layered across several storage layers or tiers.

"Typically, data access is not uniformly random. Instead, data access often has spatial and/or temporal locality properties. Spatial locality refers to accessing data within a relatively close or similar location to other data that has previously been accessed. Temporal locality refers to the reuse of data that has previously been accessed within a relatively close time period. These locality properties give a measure of predictability to the data that will be accessed, allowing only those data elements predicted to be recalled soon to be stored in the smaller cache.

"The ratio of all data accesses that can be served by the cache is called the hit ratio, and is one of the main metrics of success of a cache. Hit ratios can have a significant impact on performance and, as a result, have high economic implications. While there have been many efforts to come up with better ways to determine which items to store in the cache and which to evict to make room for more likely items, the traditional caching policies have been typically based on coarse locality metrics (e.g., oldest items are always evicted). Consequently, data which is more likely to be accessed may be evicted. As such, there are a number of challenges and inefficiencies found in traditional caching policies."

In addition to obtaining background information on this patent application, VerticalNews editors also obtained the inventor's summary information for this patent application: "Systems and methods are described for predictive cache replacement policies. In some embodiments, a method includes counting data access observations that fall into disjoint categories. The disjoint categories may be multi-dimensional in some cases (e.g., time of day and region). A prediction model can be dynamically built to predict future access patterns based on the data access observations. Using this prediction model, a determination can be made as to which data should be evicted from a cache. In some embodiments, building the prediction model includes dynamically building a tree data structure representing the data access observations. The tree data structure can include a plurality of nodes that are added at each time interval and include a cumulative distinct key count. The nodes added at the current time interval can include a current key count. When the current key count is zero, a data access pattern associated with a key associated with the current key count of zero can be recorded.

"In some embodiments, the data access observations can be represented with a string pattern by assigning a distinct letter to each data access that corresponds to one of the disjoint categories. Then, the string pattern can be analyzed using a compression algorithm. The string may preserve temporal relations of the data observations.

"Some embodiments provide for a computer-implemented method that includes generating a histogram of recorded data access observations that fall into disjoint categories representing time intervals (e.g., uniform or non-uniform intervals) since insertion of data into a cache. A set of access patterns can be generated that can be used to predict a probability of accesses on future data stored in the cache. In some embodiments, the set of access patterns are captured based on the disjoint categories and the histogram of recorded data access observations. By identifying the probability of access on the future data as being lower than any other data based on the set of access patterns, a determination can be made as to which data should be evicted from the cache.

"In some embodiments, each element in a data set (e.g., data representing access patterns of data within a cache) can be mapped into one of a plurality of disjoint bins or categories. A string representing the data set can then be dynamically mapped by assigning one or more letters to each of the plurality of disjoint bins or categories. As a result, each element in a data set is represented with the letter assigned to the disjoint bin or category each element was mapped into. A pattern matching analysis can then be performed based on the string. The pattern matching analysis can include a Markov Chain model and/or partial pattern matching by using a pattern's prefix to predict a suffix.

"Embodiments of the present invention also include computer-readable storage media containing sets of instructions to cause one or more processors to perform the methods, variations of the methods, and other operations described herein.

"Various embodiments of the present invention can include a system having a processor, memory, database, cache, observation engine, prediction module, eviction module, histogram generator, pattern generator, string converter, and/or other modules or components. In some embodiments, the cache can have data stored thereon. The observation engine can be configured to record data access observations regarding how the data is accessed within the cache. The prediction module can dynamically build a statistically aggregated prediction model to predict future access patterns based on the data access observations over time. The eviction module can be configured to evict data from the cache which has the lowest probability of future access as predicted by the statistically aggregated prediction model.

"In some embodiments, the histogram generator can be configured to generate a histogram based on the data access observations. The pattern generator can be configured to build a set of access patterns by creating a tree data structure that is updated over time by adding a new set of nodes having a current key count representing a number of data access observations recorded during a current time interval. Each node in the new set of nodes branch (e.g., representing disjoint categories) from previous nodes associated with the previous time interval having a cumulative key count. The string converter can be used to map the data access patterns to a string of characters. In some cases, the string of characters preserves temporal relations of the data access patterns.

"While multiple embodiments are disclosed, still other embodiments of the present invention will become apparent to those skilled in the art from the following detailed description, which shows and describes illustrative embodiments of the invention. As will be realized, the invention is capable of modifications in various aspects, all without departing from the scope of the present invention. Accordingly, the drawings and detailed description are to be regarded as illustrative in nature and not restrictive.

BRIEF DESCRIPTION OF THE DRAWINGS

"Embodiments of the present invention will be described and explained through the use of the accompanying drawings in which:

"FIG. 1 illustrates an example of a networked-based environment in which some embodiments of the present invention may be utilized;

"FIG. 2 is a block diagram with a set of components that may be used in a storage system in accordance with various embodiments of the present invention;

"FIG. 3 is a flowchart with a set of operations for evicting data in accordance with one or more embodiments of the present invention;

"FIG. 4 is a flowchart with a set of operations for generating a histogram-based tree data structure that can be used in accordance with some embodiments of the present invention;

"FIG. 5 illustrates examples of data access patterns in accordance with various embodiments of the present invention;

"FIG. 6 is a histogram generated based on the data access patterns illustrated in FIG. 5;

"FIG. 7 is an illustration of a dynamically generated histogram-based tree data structure that can be used in some embodiments of the present invention;

"FIG. 8 is a flowchart with a set of operations for performing a compression analysis on a data set in accordance with various embodiments of the present invention;

"FIG. 9 is a block diagram of a system architecture of a social networking system with which one or more embodiments of the present invention may be utilized; and

"FIG. 10 illustrates an example of a computer system with which some embodiments of the present invention may be utilized.

"The drawings have not necessarily been drawn to scale. For example, the dimensions of some of the elements in the figures may be expanded or reduced to help improve the understanding of the embodiments of the present invention. Similarly, some components and/or operations may be separated into different blocks or combined into a single block for the purposes of discussion of some of the embodiments of the present invention. Moreover, while the invention is amenable to various modifications and alternative forms, specific embodiments have been shown by way of example in the drawings and are described in detail below. The intention, however, is not to limit the invention to the particular embodiments described. On the contrary, the invention is intended to cover all modifications, equivalents, and alternatives falling within the scope of the invention as defined by the appended claims."

For more information, see this patent application: Frachtenberg, Eitan. Predictive Cache Replacement. Filed November 12, 2012 and posted May 22, 2014. Patent URL: http://appft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&u=%2Fnetahtml%2FPTO%2Fsearch-adv.html&r=512&p=11&f=G&l=50&d=PG01&S1=20140515.PD.&OS=PD/20140515&RS=PD/20140515

Keywords for this news article include: Patents, Information Technology, Information and Data Architecture.

Our reports deliver fact-based news of research and discoveries from around the world. Copyright 2014, NewsRx LLC


For more stories covering the world of technology, please see HispanicBusiness' Tech Channel



Source: Information Technology Newsweekly


Story Tools






HispanicBusiness.com Facebook Linkedin Twitter RSS Feed Email Alerts & Newsletters