News Column

"Efficiently Updating and Deleting Data in a Data Storage System" in Patent Application Approval Process

February 11, 2014



By a News Reporter-Staff News Editor at Information Technology Newsweekly -- A patent application by the inventors Dean, Jeffrey Adgate (Palo Alto, CA); Ghemawat, Sanjay (Mountain View, CA); Fikes, Andrew (Los Altos, CA), filed on June 4, 2013, was made available online on January 30, 2014, according to news reporting originating from Washington, D.C., by VerticalNews correspondents.

This patent application has not been assigned to a company or institution.

The following quote was obtained by the news editors from the background information supplied by the inventors: "Systems storing a large amount of data must efficiently store, update, and retrieve the data in order to be useful. Some such systems employ a distributed data storage system that uses a plurality of servers to allow even a large number of geographically diverse users to easily update and retrieve data. When a request to store, update, or delete stored data is received the request is routed to the appropriate server. The data then changed in accordance with the request. Some requests will alter only a single data value, while others will request to alter a large range of data values.

"In addition, data storage systems may find it useful to provide historical information for a particular piece of data. Historical data allows changes in a particular piece of data to be tracked over time. Depending on the amount and quality of stored historical information, it may be possible to determine the exact value of particular data at a specific point in the past. Thus, storing this historical data necessitates a large amount of historical data storage. The data storage system must be able to efficiently store historical data information for the data storage system to be useful and cost effective. Techniques which improve the efficiency of adding, storing, and removing data from a data storage system greatly improve the usefulness of such systems."

In addition to the background information obtained for this patent application, VerticalNews journalists also obtained the inventors' summary information for this patent application: "In accordance with some implementations, a method of storing data is disclosed. The method is performed on a data storage server having one or more processors and memory storing one or more programs for execution by the one or more processors. The data storage server receives a first data request, the request including a first range of one or more keys and an associated first value. The data storage server then receives a second data request, the request including a second range of one or more keys and an associated second value. The data storage server determines whether the first range overlaps, at least in part, the second range. In accordance with a determination that the first range overlaps, at least in part, the second range, the data storage server identifies one or more overlap points associated with the first range and the second range. For each of the identified overlap points, the data storage server then creates respective data items including respective ranges of keys, the respective ranges of each respective data item including one or more keys that are either: (a) the one or more keys between a terminal key of the first or second range and the overlap point, or (b) the one or more keys between two adjacent overlap points. The data storage server then stores the created data items in the memory, wherein all the keys included in the first and second data requests are included in one of the created data items.

"In accordance with some implementations, a method for data storage and retrieval is disclosed. The method is performed on a data storage server having one or more processors and memory storing one or more programs for execution by the one or more processors. The data storage server stores one or more immutable data files, each immutable data file including one or more key ranges, and each key range including one or more keys and an associated value. In response to receiving a data request, the data storage server reads one or more respective key ranges out of each immutable file in the one or more immutable files and stores the respective key ranges in memory. The data storage server then identifies, for all the respective key ranges, overlap points that identify, in one or more key ranges, keys that overlap with keys in other key ranges. The data storage server creates one or more data items, each data item including a range of one or more keys, each range including one or more keys that are either 1) the one or more keys between two overlap points, 2) the one or more keys between a terminal key of a key range and an overlap point, or 3) the one or more keys between two terminal keys of a single key range, such that the none of the created data items include any overlap points. For each respective created data item the data storage server determines whether the respective created data item key range matches the key range of another created data item. In accordance with a determination that the respective created data item key range matches the key range of another created data item, the data storage server removes the older data item. The data storage server determines whether any other data item has a key range which is adjacent to the key range of the respective data item. In accordance with a determination that a second data item has a key range adjacent to the key range of the respective data item; the data storage server determines whether a value associated with the key ranges in the respective created data item and a value associated with the key ranges in the identified second created data item are the same. In accordance with a determination that the value associated with the key ranges in the respective created data item is the same as the value associated with the key ranges in the identified second created data item, the data storage server merges the respective data item and the identified second data item into a single data item including a key range covering the combined key ranges of both data items.

"In accordance with some implementations, a system for storing data is disclosed. The system has one or more processors and memory storing one or more programs to be executed by the one or more processors. The one or more programs include instructions which cause the system to receive a first data request, the request including a first range of one or more keys and an associated first value. The system then receives a second data request, the request including a second range of one or more keys and an associated second value. The system determines whether the first range overlaps, at least in part, the second range. In accordance with a determination that the first range overlaps, at least in part, the second range, the system identifies one or more overlap points associated with the first range and the second range. For each of the identified overlap points, the system then creates respective data items including respective ranges of keys, the respective ranges of each respective data item including one or more keys that are either: (a) the one or more keys between a terminal key of the first or second range and the overlap point, or (b) the one or more keys between two adjacent overlap points. The system then stores the created data items in the memory, wherein all the keys included in the first and second data requests are included in one of the created data items.

"In accordance with some implementations, a system for data storage and retrieval is disclosed. The system has one or more processors and memory storing one or more programs to be executed by the one or more processors. The one or more programs include instructions which cause the system to store one or more immutable data files, each immutable data file including one or more key ranges, and each key range including one or more keys and an associated value. In response to receiving a data request, the system reads one or more respective key ranges out of each immutable file in the one or more immutable files and stores the respective key ranges in memory. The system then identifies, for all the respective key ranges, overlap points that identify, in one or more key ranges, keys that overlap with keys in other key ranges. The system creates one or more data items, each data item including a range of one or more keys, each range including one or more keys that are either 1) the one or more keys between two overlap points, 2) the one or more keys between a terminal key of a key range and an overlap point, or 3) the one or more keys between two terminal keys of a single key range, such that the none of the created data items include any overlap points. For each respective created data item the system determines whether the respective created data item key range matches the key range of another created data item. In accordance with a determination that the respective created data item key range matches the key range of another created data item, the system removes the older data item. The system then determines whether any other data item has a key range which is adjacent to the key range of the respective data item. In accordance with a determination that a second data item has a key range adjacent to the key range of the respective data item; the system determines whether a value associated with the key ranges in the respective created data item and a value associated with the key ranges in the identified second created data item are the same. In accordance with a determination that the value associated with the key ranges in the respective created data item is the same as the value associated with the key ranges in the identified second created data item, the system merges the respective data item and the identified second data item into a single data item including a key range covering the combined key ranges of both data items.

"In accordance with some implementations, a non-transitory computer readable storage medium storing one or more programs configured for execution by a data storage server system is disclosed. The one or more programs include instructions which cause the system to receive a first data request, the request including a first range of one or more keys and an associated first value. The system then receives a second data request, the request including a second range of one or more keys and an associated second value. The system determines whether the first range overlaps, at least in part, the second range. In accordance with a determination that the first range overlaps, at least in part, the second range, the system identifies one or more overlap points associated with the first range and the second range. For each of the identified overlap points, the system then creates respective data items including respective ranges of keys, the respective ranges of each respective data item including one or more keys that are either: (a) the one or more keys between a terminal key of the first or second range and the overlap point, or (b) the one or more keys between two adjacent overlap points. The system then stores the created data items in the memory, wherein all the keys included in the first and second data requests are included in one of the created data items.

"In accordance with some implementations, a non-transitory computer readable storage medium storing one or more programs configured for execution by a data storage server system is disclosed. The one or more programs include instructions which cause the system to store one or more immutable data files, each immutable data file including one or more key ranges, and each key range including one or more keys and an associated value. In response to receiving a data request, the system reads one or more respective key ranges out of each immutable file in the one or more immutable files and stores the respective key ranges in memory. The system then identifies, for all the respective key ranges, overlap points that identify, in one or more key ranges, keys that overlap with keys in other key ranges. The system creates one or more data items, each data item including a range of one or more keys, each range including one or more keys that are either 1) the one or more keys between two overlap points, 2) the one or more keys between a terminal key of a key range and an overlap point, or 3) the one or more keys between two terminal keys of a single key range, such that the none of the created data items include any overlap points. For each respective created data item the system determines whether the respective created data item key range matches the key range of another created data item. In accordance with a determination that the respective created data item key range matches the key range of another created data item, the system removes the older data item. The system then determines whether any other data item has a key range which is adjacent to the key range of the respective data item. In accordance with a determination that a second data item has a key range adjacent to the key range of the respective data item; the system determines whether a value associated with the key ranges in the respective created data item and a value associated with the key ranges in the identified second created data item are the same. In accordance with a determination that the value associated with the key ranges in the respective created data item is the same as the value associated with the key ranges in the identified second created data item, the system merges the respective data item and the identified second data item into a single data item including a key range covering the combined key ranges of both data items.

BRIEF DESCRIPTION OF THE DRAWINGS

"The implementations disclosed herein are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings. Like reference numerals refer to corresponding parts throughout the drawings.

"FIG. 1 is a block diagram illustrating a distributed storage system, according to some implementations.

"FIG. 2 is a block diagram illustrating components of the distributed storage system, according to some implementations.

"FIG. 3 is a block diagram illustrating the components of a data storage server, according to some implementations.

"FIG. 4 is a block diagram that illustrates the process of resolving overlaps between two data ranges according to some implementations.

"FIG. 5 is a block diagram illustrating a merging of multiple immutable data files into a single, new immutable data file according to some implementations.

"FIG. 6 is a block diagram illustrating a server system 120, in accordance with some implementations.

"FIG. 7 is a flow diagram illustrating the process of resolving overlaps between key ranges in accordance with some implementations.

"FIG. 8 is a flow diagram illustrating the process of resolving overlaps between key ranges in accordance with some implementations.

"FIG. 9 is a flow diagram illustrating the process of resolving overlaps between key ranges when performing a merge in accordance with some implementations.

"FIG. 10 is a flow diagram illustrating the process of resolving overlaps between key ranges when performing a merge in accordance with some implementations.

"FIG. 11 is a flow diagram illustrating the process of resolving overlaps between key ranges when performing a merge in accordance with some implementations."

URL and more information on this patent application, see: Dean, Jeffrey Adgate; Ghemawat, Sanjay; Fikes, Andrew. Efficiently Updating and Deleting Data in a Data Storage System. Filed June 4, 2013 and posted January 30, 2014. Patent URL: http://appft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&u=%2Fnetahtml%2FPTO%2Fsearch-adv.html&r=380&p=8&f=G&l=50&d=PG01&S1=20140123.PD.&OS=PD/20140123&RS=PD/20140123

Keywords for this news article include: Patents, Information Technology, Information and Data Storage.

Our reports deliver fact-based news of research and discoveries from around the world. Copyright 2014, NewsRx LLC


For more stories covering the world of technology, please see HispanicBusiness' Tech Channel



Source: Information Technology Newsweekly


Story Tools