Researchers Submit Patent Application, "Techniques for Data Assignment from an External Distributed File System to a Database Management System", for Approval
The patent's assignee is
News editors obtained the following quote from the background information supplied by the inventors: "After over two decades of electronic data automation and the improved ability for capturing data from a variety of communication channels and media, even the smallest of enterprises find that the enterprise is processing terabytes of data with regularity. Moreover, mining, analysis, and processing of that data have become extremely complex. The average consumer expects electronic transactions to occur flawlessly and with near instant speed. The enterprise that cannot meet expectations of the consumer is quickly out of business in today's highly competitive environment.
"Consumers have a plethora of choices for nearly every product and service, and enterprises can be created and up-and-running in the industry in mere days. The competition and the expectations are breathtaking from what existed just a few short years ago.
"The industry infrastructure and applications have generally answered the call providing virtualized data centers that give an enterprise an ever-present data center to run and process the enterprise's data. Applications and hardware to support an enterprise can be outsourced and available to the enterprise twenty-four hours a day, seven days a week, and three hundred sixty-five days a year.
"As a result, the most important asset of the enterprise has become its data. That is, information gathered about the enterprise's customers, competitors, products, services, financials, business processes, business assets, personnel, service providers, transactions, and the like.
"Updating, mining, analyzing, reporting, and accessing the enterprise information can still become problematic because of the sheer volume of this information and because often the information is dispersed over a variety of different file systems, databases, and applications.
"In response, the industry has recently embraced a data platform referred to as Apache Hadoop™ (Hadoop™). Hadoop™ is an Open Source software architecture that supports data-intensive distributed applications. It enables applications to work with thousands of network nodes and petabytes (1000 terabytes) of data. Hadoop™ provides interoperability between disparate file systems, fault tolerance, and High Availability (HA) for data processing. The architecture is modular and expandable with the whole database development community supporting, enhancing, and dynamically growing the platform.
"However, because of Hadoop™'s success in the industry, enterprises now have or depend on a large volume of their data, which is stored external to their core in-house database management system (DBMS). This data can be in a variety of formats and types, such as: web logs; call details with customers; sensor data, Radio Frequency Identification (RFID) data; historical data maintained for government or industry compliance reasons; and the like. Enterprises have embraced Hadoop™ for data types such as the above referenced because Hadoop™ is scalable, cost efficient, and reliable.
"One challenge in integrating Hadoop™ architecture with an enterprise DBMS is efficiently assigning data blocks and managing workloads between nodes. That is, even when the same hardware platform is used to deploy some aspects of Hadoop and a DBMS the resulting performance of such a hybrid system can be poor because of how the data is distributed and how workloads are processed."
As a supplement to the background information on this patent application, VerticalNews correspondents also obtained the inventors' summary information for this patent application: "In various embodiments, techniques for data assignment from an external distributed file system (DFS) to a DBMS are presented. According to an embodiment, a method for data assignment from an external DFS to a DBMS is provided.
"Specifically, an initial assignment for first nodes to second nodes is received in a bipartite graph. The first nodes represent data blocks in an external distributed file system and the second nodes represent access module processors of a database management system (DBMS). A residual graph is constructed with a negative cycle having the initial assignment. The residual graph is processed through iterations, with each of which the initial assignment is adjusted to eliminate negative cycles. Finally, a final assignment is achieved by removing all negative cycles of the residual graph, for each of the data blocks to one of the access module processors as an assignment flow.
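The negative-cycle procedure summarized above resembles the textbook cycle-canceling method for minimum-cost flow. The following Python fragment is a minimal sketch of that general method, not the inventors' implementation. The cost matrix (0 for a block stored on the same node as an access module processor, 1 otherwise), the per-processor capacity, and the names find_negative_cycle and rebalance are all assumptions made for illustration.

```python
def find_negative_cycle(num_nodes, edges):
    """Bellman-Ford with every distance starting at 0 (a virtual source).
    Returns the nodes of a negative cycle in edge order, or None."""
    dist = [0] * num_nodes
    pred = [None] * num_nodes
    last = None
    for _ in range(num_nodes):
        last = None
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                pred[v] = u
                last = v
        if last is None:          # a full pass without relaxation: no negative cycle
            return None
    for _ in range(num_nodes):    # walk back far enough to land on the cycle
        last = pred[last]
    cycle, node = [last], pred[last]
    while node != last:
        cycle.append(node)
        node = pred[node]
    cycle.reverse()               # now cycle[i] -> cycle[i + 1] follows edge direction
    return cycle


def rebalance(cost, initial, capacity):
    """Adjust an initial block-to-processor assignment until the residual
    graph has no negative cycle. cost[b][a] is the (assumed) cost of
    placing block b on processor a; capacity caps blocks per processor."""
    n, m = len(cost), len(cost[0])
    assign = list(initial)
    sink = n + m                  # node ids: blocks 0..n-1, processors n..n+m-1, sink
    while True:
        load = [assign.count(a) for a in range(m)]
        edges = []
        for b in range(n):
            for a in range(m):
                if assign[b] == a:    # used edge: residual runs backwards, negated cost
                    edges.append((n + a, b, -cost[b][a]))
                else:                 # unused edge: block may move to this processor
                    edges.append((b, n + a, cost[b][a]))
        for a in range(m):
            if load[a] < capacity:    # processor can still accept a block
                edges.append((n + a, sink, 0))
            if load[a] > 0:           # processor can give a block up
                edges.append((sink, n + a, 0))
        cycle = find_negative_cycle(n + m + 1, edges)
        if cycle is None:
            return assign
        for i, u in enumerate(cycle):
            v = cycle[(i + 1) % len(cycle)]
            if u < n:                 # block -> processor edge: reassign the block
                assign[u] = v - n


# Hypothetical example: blocks 0-2 are local to processor 0, block 3 to processor 1.
cost = [[0, 1], [0, 1], [0, 1], [1, 0]]
final = rebalance(cost, initial=[1, 1, 0, 0], capacity=2)
print(final, sum(cost[b][final[b]] for b in range(len(cost))))
```

Rebuilding the residual graph on every iteration keeps the sketch short at the expense of speed. With integer costs, each canceled cycle strictly lowers the total cost, so the loop terminates.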
BRIEF DESCRIPTION OF THE DRAWINGS
"FIG. 1 is a diagram depicting an even assignment of data from a HDFS to a parallel DBMS, according to an example embodiment.
"FIG. 2 is a diagram showing a bipartite graph for the example presented in the FIG. 1, according to an example embodiment.
"FIG. 3 is a diagram illustrating an even assignment with minimal cost as shown in the FIG. 2, according to an example embodiment.
"FIG. 4 is a diagram illustrating an assignment of a block of data using an Approximate-Greedy Algorithm, according to an example embodiment.
"FIG. 5 is a diagram of a method for data assignment from an external DFS to a DBMS, according to an example embodiment.
"FIG. 6 is a diagram of another method for data assignment from an external DFS to a DBMS, according to an example embodiment.
"FIG. 7 is a diagram of yet another method for data assignment from an external DFS to a DBMS, according to an example embodiment."
For additional information on this patent application, see: Qi, Yan; Xu, Yu; Kostamaa, Olli Pekka; Wen, Jian. Techniques for Data Assignment from an External Distributed File System to a Database Management System. Filed
Keywords for this news article include:
Our reports deliver fact-based news of research and discoveries from around the world. Copyright 2014, NewsRx LLC