Patent Application Titled "Systems and Methods of Simulating the State of a Distributed Storage System" Published Online

February 20, 2014



By a News Reporter-Staff News Editor at Computer Weekly News -- According to news reporting originating from Washington, D.C., by VerticalNews journalists, a patent application by the inventors Zunger, Yonatan (Mountain View, CA); Drobychev, Alexandre (San Jose, CA); Kesselman, Alexander (Sunnyvale, CA); Vickrey, Rebekah C. (Mountain View, CA); Dachille, Frank Clare (Mountain View, CA); Datuashvili, George (Cupertino, CA), filed on September 25, 2013, was made available online on February 6, 2014.

No assignee has been named for this patent application.

Reporters obtained the following quote from the background information supplied by the inventors: "The enterprise computing landscape has recently undergone a fundamental shift in storage architectures in which the central-service architecture has given way to distributed storage systems. Distributed storage systems built from commodity computer systems can deliver high performance, availability, and scalability for new data-intensive applications at a fraction of the cost compared to monolithic disk arrays. To unlock the full potential of distributed storage systems, data is replicated across multiple instances of the distributed storage system at different geographical locations, thereby increasing availability and reducing network distance from clients.

"In a distributed storage system, objects are dynamically created and deleted in different instances of the distributed storage system. However, different replication requests may have different priorities. It is important to execute replication requests in priority order so as to replicate the more important objects first. For example, a newly uploaded object has just one replica. Thus, it is more important to create replicas of the new object before creating replicas of existing objects that already has a plurality of replicas in order to minimize the probability of data loss in the new object. Another example is a video that becomes a hit over night. In this case, the number of replicas of the video needs to be increased as soon as possible in order to handle the increased demand. Therefore, it is desirable to properly prioritize replication requests and execute them in a timely fashion while sustaining very high loads.

"In a small-scale distributed storage system, managing replicas of objects is a tractable problem. However, there are no existing techniques for managing replicas of objects in a planet-wide distributed storage system that includes trillions of objects, petabytes of data, dozens of data centers across the planet."

In addition to obtaining background information on this patent application, VerticalNews editors also obtained the inventors' summary information for this patent application: "To address the aforementioned deficiencies, some embodiments provide a system, a computer-readable storage medium including instructions, and a computer-implemented method for generating replication requests for objects in a distributed storage system. Replication requests for objects in a distributed storage system are generated based at least in part on replication policies for the objects and a current state of the distributed storage system, wherein a respective replication request for a respective object instructs a respective instance of the distributed storage system to replicate the respective object so as to at least partially satisfy a replication policy for the respective object, wherein a respective replication policy includes criteria specifying at least storage device types on which replicas of the object are to be stored. At least a subset of the replication requests is then distributed to the respective instances of the distributed storage system for execution.

"In some embodiments, prior to distributing at least the subset of the replication requests to the respective instances of the distributed storage system for execution, the replication requests are partitioned into groups of respective replication requests corresponding to respective instances of the distributed storage system at which the respective replication requests are to be performed.

"In some embodiments, prior to distributing the at least the subset of the replication requests for each group of respective replication requests to the respective instances of the distributed storage system, priorities of the replication requests are calculated. For each group of replication requests, the replication requests in the group of replication requests are sorted by priority to produce a sorted group of replication requests.

"In some embodiments, the priority of a respective replication request is calculated as a difference between a metric corresponding to a benefit of performing the respective replication request and a metric corresponding to a cost of performing the respective replication request.

"In some embodiments, prior to distributing a respective subset of replication requests for a respective group of replication requests to a respective instance of the distributed storage system for execution, the respective subset of replication requests for the respective group that can be completed within a predetermined time interval is determined. In some embodiments, replication requests for the respective group of replication requests that are not included in the subset of replication requests for the respective group of replication requests are discarded.

"In some embodiments, the predetermined time interval is the time interval between iterations of the generating, the partitioning, and the distributing.

"In some embodiments, a respective subset of replication requests for a respective group of replication requests is distributed to a respective instance of the distributed storage system for execution by distributing a respective sorted group of replication requests to the respective instance of the distributed storage system.

"In some embodiments, the current state of the distributed storage system includes a current network state, current user quotas for storage space in the distributed storage system, storage space in the distributed storage system that are currently used by users, current storage space available at instances of the distributed storage system, current statuses of replication queues at instances of the distributed storage system, current planned maintenance operations zones, and a list of current replicas of objects in the distributed storage system.

"In some embodiments, a replication policy for an object includes criteria selected from the group consisting of, a minimum number of replicas of the object that must be present in the distributed storage system, a maximum number of the replicas of the object that are allowed to be present in the distributed storage system, storage device types on which the replicas of the object are to be stored, locations at which the replicas of the object may be stored, locations at which the replicas of the object may not be stored, and a range of ages for the object during which the replication policy for the object applies.

"In some embodiments, the replication requests are background replication requests.

"In some embodiments, a respective object is a binary large object (blob).

"Some embodiments provide a system, a computer-readable storage medium including instructions, and a computer-implemented method for generating replication requests for objects in a distributed storage system. For each object in a distributed storage system, replication policies for the object that have not been satisfied are determined. Next, the replication requests for the object whose replication policies have not been satisfied are ranked based on a number of replicas of the object that need to be created in order to satisfy the replication policies for the object. Replication requests for the object are generated based at least in part on the replication policies for the object that have not been satisfied, costs and benefits for performing the replication requests, and a current state of the distributed storage system, wherein a respective replication request for a respective object instructs a respective instance of the distributed storage system to replicate the respective object so as to at least partially satisfy a replication policy for the respective object. At least a subset of the replication requests for the objects in the distributed storage system are distributed to respective instances of the distributed storage system corresponding to the replication requests for execution.

"In some embodiments, prior to distributing the at least the subset of the replication requests for the objects in the distributed storage system to respective instances of the distributed storage system corresponding to the replication requests for execution, the replication requests are partitioned into groups of respective replication requests corresponding to respective instances of the distributed storage system at which respective predetermined actions of the respective replication requests are to be performed. Priorities of the replication requests are then calculated. For each group of replication requests, the replication requests in the group of replication requests are sorted by priority to produce a sorted group of replication requests.

"In some embodiments, a priority of a respective replication request is calculated as a difference between a metric corresponding to a benefit of performing the respective replication request and a metric corresponding to a cost of performing the respective replication request.

"In some embodiments, prior to distributing the at least the subset of the replication requests for the objects in the distributed storage system to respective instances of the distributed storage system corresponding to the replication requests for execution, the at least the subset of replication requests that can be completed within a predetermined time interval is determined. In some embodiments, replication requests for the respective group of replication requests that are not included in the subset of replication requests for the respective group of replication requests are discarded.

"In some embodiments, replication requests are distributed to a replication queue in a respective instance of the distributed storage system.

"Some embodiments provide a system, a computer-readable storage medium including instructions, and a computer-implemented method for simulating a state of a distributed storage system. A current state of a distributed storage system and replication policies for the objects in the distributed storage system is obtained. Proposed modifications to the current state of the distributed storage system are received. The state of the distributed storage system over time is simulated based on the current state of the distributed storage system, the replication policies for the objects in the distributed storage system, and the proposed modifications to the current state of the distributed storage system. Reports relating to the time evolution of the current state of the distributed storage system are generated based on the simulation.

"In some embodiments, a respective proposed modification to the current state of the distributed storage system includes information relating to the respective proposed modification to the current state of the distributed storage system and a time at which the respective proposed modification to the current state of the distributed storage system is to Occur.

"In some embodiments, a respective proposed modification to the current state of the distributed storage system is selected from the group consisting of an addition of storage space in the distributed storage system, a removal of storage space in the distributed storage system, an addition of instances of the distributed storage system, a removal of instances of the distributed storage system, an increase in the amount of data stored in the distributed storage system, a decrease in the amount of data stored in the distributed storage system, a modification to replication policies for objects in the distributed storage system, an addition of network resources in the distributed storage system, and a modification to an algorithm that generates replication requests.

"In some embodiments, at least one of the proposed modifications to the current state of the distributed storage system are implemented based on the reports.

"Some embodiments provide a system, a computer-readable storage medium including instructions, and a computer-implemented method for generating and distributing replica removal requests for objects in a distributed storage system. Replica removal requests for objects in a distributed storage system are generated based at least in part on replication policies for the objects, wherein a respective replica removal request instructs a respective instance of the distributed storage system to remove a respective replica of the respective object so as to at least partially satisfy replication policies for the respective object. The replica removal requests for the objects in the distributed storage system are then distributed to respective instances of the distributed storage system corresponding to the replica removal requests for execution.

"In some embodiments, a replica removal request for a respective object in the distributed storage system is generated based at least in part on replication policies for the respective object as follows. Replication policies for the respective object that have been violated are identified. Next, a replica of the respective object to be removed from an instance of the distributed storage system is selected based at least in part on last access times of replicas of the respective object and the current storage space available at instances of the distributed storage system including the replicas of the respective object. The replica removal request for the replica of the respective object is then generated.

"In some embodiments, a replica removal request for the replica of the respective object is generated as follows. It is determined that an instance of the distributed storage system including a replica of the respective object is being deactivated. It is then determined whether the deactivation of the instance of the distributed storage system causes a number of replicas of the respective object to be below a minimum number of replicas of the respective object as specified by the replication policies for the respective object. If the deactivation of the instance of the distributed storage system causes the number of replicas of the respective object to be below the minimum number of replicas of the respective object, a replication request to replicate the respective object is generated based at least in part on replication policies for the respective object and a current state of the distributed storage system. Next, the replication request is distributed to a respective instance of the distributed storage system for execution. The replica removal request for the respective object is generated only after the replication request to replicate the respective object has been completed.

"In some embodiments, replica removal requests are generated for an object whose replicas violate replication policies for the object.

"In some embodiments, replica removal requests are generated for an object for which dynamic replication requests caused the number of replicas of the object to exceed the number of replicas of the object specified in the replication policies for the object, wherein a dynamic replication request generates a replica of the object based at least in part on a current level of demand for the object.

"Some embodiments provide a system, a computer-readable storage medium including instructions, and a computer-implemented method for generating and distributing replica removal requests for objects in a distributed storage system. The following operations are performed for each object in a distributed storage system. One or more replicas of the object to be removed from the distributed storage system are identified based at least in part on replication policies for the object. Next, teplica removal requests for the one or more replicas of the object are generated, wherein a respective replica removal request instructs a respective instance of the distributed storage system to remove a respective replica of the respective object so as to at least partially satisfy replication policies for the respective object. The replica removal requests for the object in the distributed storage system are then distributed to respective instances of the distributed storage system corresponding to the replica removal requests for execution.

"In some embodiments, the replica removal requests for the one or more replicas of the object are generated as follows. Replication policies for the object that have been violated are identified. Next, the one or more replicas of the object to be removed from instances of the distributed storage system are selected based at least in part on last access times of replicas of the respective object and the current storage space available at the instances of the distributed storage system including the replicas of the respective object. The replica removal requests for the one or more selected replica of the respective object are then generated.

"In some embodiments, a replica removal requests for the one or more replicas of the object are generated as follows. It is determined that instance of the distributed storage system including the replica of the object is being deactivated. Next, it is determined whether the deactivation of the instance of the distributed storage system causes a number of replicas of the object to be below a minimum number of replicas of the object as specified by the replication policies for the object. If the deactivation of the instance of the distributed storage system causes the number of replicas of the object to be below the minimum number of replicas of the object, a replication request to replicate the object is generated based at least in part on replication policies for the object and a current state of the distributed storage system. The replication request is then distributed to a respective instance of the distributed storage system for execution. The replica removal request for the object is generated only after the replication request to replicate the object has been completed.

BRIEF DESCRIPTION OF THE DRAWINGS

"FIG. 1A is a conceptual illustration for placing multiple instances of a database at physical sites all over the globe, according to some embodiments.

"FIG. 1B illustrates basic functionality at each instance according to some embodiments.

"FIGS. 1C-1G illustrate ways that a distributed storage system may be integrated with systems that provide user applications according to some embodiments.

"FIG. 2 is a block diagram illustrating multiple instances of a replicated database, with an exemplary set of programs and/or processes shown for the first instance according to some embodiments.

"FIG. 3 is a block diagram that illustrates an exemplary instance for the system, and illustrates what blocks within the instance with which a user interacts, according to some embodiments.

"FIG. 4 is a block diagram of an instance server that may be used for the various programs and processes, according to some embodiments.

"FIG. 5 illustrates a typical allocation of instance servers to various programs or processes, according to some embodiments.

"FIG. 6A is a block diagram illustrating the creation and the initial replication of an object, according to some embodiments.

"FIG. 6B is a block diagram illustrating the background replication of the object, according to some embodiments.

"FIG. 6C is a block diagram illustrating a dynamic replication of the object, according to some embodiments.

"FIG. 6D is a block diagram illustrating the removal of a replica of the object, according to some embodiments.

"FIG. 7 is a flowchart of a method for generating replication requests for objects in a distributed storage system, according to some embodiments.

"FIG. 8 is a flowchart of another method for generating and distributing replication requests for objects in a distributed storage system, according to some embodiments.

"FIG. 9 is a flowchart of a method for generating replica removal requests for objects in a distributed storage system, according to some embodiments.

"FIG. 10 is a flowchart of a method for generating a replica removal request for an object in the distributed storage system, according to some embodiments.

"FIG. 11 is a flowchart of another method for generating a replica removal request for an object in the distributed storage system, according to some embodiments.

"FIG. 12 is a flowchart of another method for generating and distributing replica removal requests for objects in a distributed storage system, according to some embodiments.

"FIG. 13 is a flowchart of a method for generating replica removal requests for the one or more replicas of an object, according to some embodiments.

"FIG. 14 is a flowchart of another method for generating replica removal requests for the one or more replicas of an object, according to some embodiments.

"FIG. 15 is a flowchart of a method for simulating a state of a distributed storage system, according to some embodiments.

"Like reference numerals refer to corresponding parts throughout the drawings."

For more information, see this patent application: Zunger, Yonatan; Drobychev, Alexandre; Kesselman, Alexander; Vickrey, Rebekah C.; Dachille, Frank Clare; Datuashvili, George. Systems and Methods of Simulating the State of a Distributed Storage System. Filed September 25, 2013 and posted February 6, 2014. Patent URL: http://appft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&u=%2Fnetahtml%2FPTO%2Fsearch-adv.html&r=1188&p=24&f=G&l=50&d=PG01&S1=20140130.PD.&OS=PD/20140130&RS=PD/20140130

Keywords for this news article include: Patents.

Our reports deliver fact-based news of research and discoveries from around the world. Copyright 2014, NewsRx LLC





Source: Computer Weekly News

