
Patent Issued for Affinitizing Datasets Based on Efficient Query Processing

September 9, 2014

By a News Reporter-Staff News Editor at Information Technology Newsweekly -- From Alexandria, Virginia, VerticalNews journalists report that a patent by the inventors Zhou, Jingren (Bellevue, WA); Helland, Patrick James (Seattle, WA); Forbes, Jonathan (Bellevue, WA); Burd, Yaron (Kirkland, WA), filed on October 15, 2010, was published online on August 26, 2014.

Patent number 8819017 is assigned to Microsoft Corporation (Redmond, WA).

News editors obtained the following quote from the background information supplied by the inventors: "Query processing typically requires that a group of datasets be processed together. However, when the group of datasets is stored, datasets are broken into extents that are randomly placed across a data center to allow for even load distribution across the data center. Accordingly, storing extents of datasets randomly across the data center fails to account for inefficiencies that result from randomized storage structures."

As a supplement to the background information on this patent, VerticalNews correspondents also obtained the inventors' summary information for this patent: "This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify essential features of the claimed subject matter, nor is it intended to be used as an aid in isolation to determine the scope of the claimed subject matter. Embodiments of the present invention provide methods for affinitizing datasets based on efficient query processing. In particular, methods are provided for assigning affinity identifiers to datasets that are related in query processing. The related datasets are transparently broken into extents by a distribution component at a data center. Additionally, the extents of related datasets are preferentially distributed based on their shared affinity identifier to be within close proximity to other extents with the same affinity identifier.

"Data is generally stored in data centers based on an equal distribution algorithm in order to prevent data skew. By distributing data throughout a data center, general data traffic is also spread across the data center, thereby minimizing data traffic jams. However, the way in which data is distributed across data centers does not account for affinitization of data. Accordingly, data that is processed together is not stored together. By storing data within close proximity of other related data, responses to queries may be sped up while overall traffic across the data center may be decreased. As described above, assigning identifiers to related datasets may be used to affinitize the related datasets and, additionally, the extents of the related datasets. As such, affinitized extents of related datasets may be distributed within close proximity when storage space that is close to related extents is available."

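As an illustration only, and not drawn from the patent text itself, the placement preference described in the summary might be sketched in Python as follows. The Rack class, the place_extent function, the per-rack capacity counts, and the least-loaded fallback (a deterministic stand-in for the random, load-balancing placement the background describes) are all hypothetical names and assumptions introduced for this sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Rack:
    """Hypothetical storage location; capacity is counted in extents."""
    name: str
    capacity: int
    # Each stored extent is recorded as (extent_id, affinity_id or None).
    extents: List[Tuple[str, Optional[str]]] = field(default_factory=list)

    def has_space(self) -> bool:
        return len(self.extents) < self.capacity

    def affinity_count(self, affinity_id: str) -> int:
        return sum(1 for _, a in self.extents if a == affinity_id)


def place_extent(racks: List[Rack], extent_id: str,
                 affinity_id: Optional[str] = None) -> str:
    """Store one extent and return the name of the rack that received it."""
    open_racks = [r for r in racks if r.has_space()]
    if not open_racks:
        raise RuntimeError("no free storage in any rack")

    # Preference: a rack with free space that already holds extents
    # carrying the same affinity identifier.
    related = [r for r in open_racks
               if affinity_id is not None and r.affinity_count(affinity_id) > 0]
    if related:
        target = max(related, key=lambda r: r.affinity_count(affinity_id))
    else:
        # Assumed fallback: least-loaded rack, ties broken by name,
        # standing in for even load distribution.
        target = min(open_racks, key=lambda r: (len(r.extents), r.name))

    target.extents.append((extent_id, affinity_id))
    return target.name


def demo() -> dict:
    """Place extents of two related datasets and one unrelated dataset."""
    racks = [Rack("A", 2), Rack("B", 2), Rack("C", 2)]
    requests = [
        ("orders-0", "q1"),     # first extent of a dataset tagged q1
        ("orders-1", "q1"),     # joins orders-0 while space is available
        ("customers-0", "q1"),  # related dataset, but that rack is now full
        ("logs-0", None),       # no affinity identifier assigned
    ]
    return {ext: place_extent(racks, ext, aff) for ext, aff in requests}
```

In this sketch the second extent lands beside the first because both carry the same affinity identifier, while the third falls back to ordinary placement once the preferred rack is full. That mirrors the summary's condition that related extents are placed close together only when nearby storage space is available.
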
For additional information on this patent, see: Zhou, Jingren; Helland, Patrick James; Forbes, Jonathan; Burd, Yaron. Affinitizing Datasets Based on Efficient Query Processing. U.S. Patent Number 8819017, filed October 15, 2010, and published online on August 26, 2014. Patent URL:

Keywords for this news article include: Information Technology, Information and Data Traffic, Microsoft Corporation.

Our reports deliver fact-based news of research and discoveries from around the world. Copyright 2014, NewsRx LLC


Source: Information Technology Newsweekly
