News Column

Patent Application Titled "On-Disk Multimap" Published Online

August 5, 2014



By a News Reporter-Staff News Editor at Information Technology Newsweekly -- According to news reporting originating from Washington, D.C., by VerticalNews journalists, a patent application by the inventors Kirazci, Ulas (Mountain View, CA); Banachowski, Scott (Mountain View, CA), filed on August 20, 2013, was made available online on July 24, 2014.

The assignee for this patent application is Google Inc.

Reporters obtained the following quote from the background information supplied by the inventors: "Computing devices use data structures to efficiently store and access data. Different data structures may be better at handling large amounts of data, accessing data, or minimizing storage space. Some data structures are a collection of key-value pairs where the computing device uses the key to access one or more values that correspond to the key.

"A multimap is a data structure that stores key-value pairs. The data type for each key and value can vary according to the type of multimap, but can include strings and integers. Each key can map to more than one value. For example, the key 'chocolate' can map to both 'cake' and 'pudding' such that the multimap will return both values 'cake' and 'pudding' when given the key 'chocolate.'

"A trie is a data structure that stores key-value pairs. In a trie, the keys and the values can be strings or integers, and each key can only map to one value. Some tries, such as the ones discussed in this application, may limit the values to integers. A trie is tree structure such that a root node is linked to a number of nodes and each of those nodes can link to other nodes. Linked nodes contain keys. Nodes can also link to leaves. A leaf is a type of node that contains a value that corresponds to a key. Leaves do not link to additional nodes. Each node contains one character, and each leaf contains an integer. For example, to access the value for the key 'chocolate,' a computing device traverses the nodes of the trie that contain the letters 'c,' 'h,' 'o,' 'c,' 'o,' 'l,' 'a,' 't,' and 'e.' The node 'e' links to the leaf that contains the value for '386' which is the value for the key 'chocolate.'"

In addition to obtaining background information on this patent application, VerticalNews editors also obtained the inventors' summary information for this patent application: "According to an innovative aspect of the subject matter described in this specification, a device can use a trie to store data. The key-value pairs stored in the trie are not limited to string-integer pairs, and each key can map to more than one value. To insert an original key-value pair into the trie, a computing device encodes the original key and inserts the encoded original key into the trie. The device determines a hash value of the original value and inserts the hash value of the original value as the value of the encoded original key. The device appends the value of the encoded original key, the hash value of the original value, and a fixed string and inserts the appended value of the encoded original key, hash value of the original value, and fixed string as a second key.

"To retrieve a value for a given an original key, the device encodes the original key and uses the encoded original key to retrieve a value of the encoded original key. Using the value of the encoded original key, the device determines a second key. The device traverses the trie along the nodes of the second key. Once the device reaches the end of the second key, the device continues to traverse the trie until the device reaches a leaf. The nodes between the end of the second key and the leaf is the value that matches the original key.

"In general, another innovative aspect of the subject matter described in this specification may be embodied in methods that include the actions of receiving a key-value pair including a key k and a value v; encoding the key-value pair as (i) a first key-value pair including a first key k1 and first value v1, and (ii) a second key-value pair including a second key k2; and inserting the first key-value pair and the second key-value pair in a trie.

"These and other embodiments can each optionally include one or more of the following features. The action of encoding the key-value pair as (i) a first key-value pair including a first key k1 and first value v1, and (ii) a second key-value pair including a second key k2 includes determining a length of the key k; and encoding a prefix of the first key k1, the encoded prefix of the first key k1 including a first predetermined byte; data reflecting the length of the key k; and the key k. The actions further including determining a number of values that are in the trie and that correspond to the prefix of the first key k1.

"The action of encoding the key-value pair as (i) a first key-value pair including a first key k1 and first value v1, and (ii) a second key-value pair including a second key k2 includes encoding the first key k1, the encoded first key k1 including the prefix of the first key k1; and the number of values that are in the trie and that correspond to the prefix of the first key k1. The action of encoding the key-value pair as (i) a first key-value pair including a first key k1 and first value v1, and (ii) a second key-value pair including a second key k2 includes encoding the first value v1 as a hash of the value v. The action of encoding the key-value pair as (i) a first key-value pair including a first key k1 and first value v1, and (ii) a second key-value pair including a second key k2 includes encoding a prefix of the second key k2, the encoded prefix of the second key k2 including a second predetermined byte; and a hash of the value v.

"The action of claim 1, wherein encoding the key-value pair as (i) a first key-value pair including a first key k1 and first value v1, and (ii) a second key-value pair including a second key k2 includes encoding the second key k2, the encoded second key k2 including the prefix of the second key k2; and the first value v1. The second key-value pair includes a second value v2 including a number of occurrences of the value v associated in the trie with the prefix of the first key k1. The first value v1 is an integer.

"Another innovative aspect of the subject matter described in this specification may be embodied in methods that include the actions of receiving a key k; encoding the key k; retrieving a value v1 for the encoded key k from a trie in which each key-value pairs are stored as (i) a first key-value pair including a first key and first value, and (ii) a second key-value pair including a second key; encoding a key k1 based on the value v1; accessing the trie using the encoded key k1; extracting a particular key associated with the encoded key k1; and outputting the particular key as a value associated with the key k.

"These and other embodiments can each optionally include one or more of the following features. The encoded key k includes a first predetermined byte; data reflecting a length of the key k; a delimiter; and the key k. The actions further include retrieving a second value v2 for the encoded key k from the trie; encoding a key k2 based on the value v2; accessing the trie using the encoded key k2; extracting a second particular key associated with the encoded key k1; and outputting the second particular key as a second value associated with the key k.

"The action of encoding a key k1 based on the value v1 includes converting the value v1 to hexadecimal form; and setting the encoded key k1 to the converted value v1. The action of extracting a particular key associated with the encoded key k1 includes accessing subsequent nodes of the trie after accessing the trie using the encoded key k1; and outputting the subsequent nodes as the particular key. The value v1 is an integer.

"Other embodiments of this aspect include corresponding systems, apparatus, and computer programs recorded on computer storage devices, each configured to perform the operations of the methods.

"Particular embodiments of the subject matter described in this specification can be implemented so as to realize one or more of the following advantages. For example, the trie can store key-value pairs where the key is a string and the value is a string. The trie can store multiple values that correspond to one key.

"The details of one or more embodiments of the subject matter described in this specification are set forth in the accompanying drawings and the description below. Other features, aspects, and advantages of the subject matter will become apparent from the description, the drawings, and the claims.

BRIEF DESCRIPTION OF THE DRAWINGS

"FIG. 1 is an example computing device with a trie and illustrates an example storage and retrieval processes.

"FIG. 2 is an example trie with one key-value pair inserted.

"FIG. 3 is an example trie with two key-value pairs inserted.

"FIG. 4 is an example trie with three key-value pairs inserted.

"FIG. 5 is a flow chart of an example process to insert a key-value pair into a trie.

"FIG. 6 is a flow chart of an example process to retrieve a value from a trie using a key.

"Like reference numbers and designations in the various drawings indicate like elements."

For more information, see this patent application: Kirazci, Ulas; Banachowski, Scott. On-Disk Multimap. Filed August 20, 2013 and posted July 24, 2014. Patent URL: http://appft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&u=%2Fnetahtml%2FPTO%2Fsearch-adv.html&r=404&p=9&f=G&l=50&d=PG01&S1=20140717.PD.&OS=PD/20140717&RS=PD/20140717

Keywords for this news article include: Google Inc., Information Technology, Information and Data Architecture.

Our reports deliver fact-based news of research and discoveries from around the world. Copyright 2014, NewsRx LLC


For more stories covering the world of technology, please see HispanicBusiness' Tech Channel



Source: Information Technology Newsweekly


Story Tools






HispanicBusiness.com Facebook Linkedin Twitter RSS Feed Email Alerts & Newsletters