The assignee for this patent application is
Reporters obtained the following quote from the background information supplied by the inventors: "Embodiments disclosed herein provide techniques for detecting reference data tables in Extract, Transform, and Load (ETL) processes.
"ETL processes are implemented in programs organized in project folders for a data integration solution. Thus, the installation of data integration software might have one or multiple processes with one or multiple jobs grouped into each of the processes. ETL processes typically integrate data from multiple, heterogeneous data sources into a central repository, such as a data warehouse (DW) or a master data management (MDM) system. Reference data generally defines a set of values that describes other data. Some examples of reference data are: gender, country codes, courtesy titles (Mr., Mrs., Miss, Dr., etc.), units of measure, and so on. Reference data can be found in applications by code tables, lookup tables, properties files, or it may be hard-coded. Consistent reference data is the cornerstone of many information centric applications such as data warehousing, master data management (MDM), as well as in operational business applications such as customer relationship management (CRM) and enterprise resource planning (ERP). Without consistent reference data, many business problems can occur. For example, in DW environments, revenue reports by country and customer type, created using reference data describing these entities, may produce incorrect results due to the inconsistent reference data. In MDM environments, product categorization may produce unexpected results, and customer information cannot be established, without consistent reference data for each type of entity.
"Reference Data Management (RDM) systems have emerged to ensure consistency of reference data across applications and between enterprises. RDM systems vary from implementation to implementation, but generally an RDM solution provides a single place for business owners to create, update, review and distribute reference data across an enterprise.
"Reference data management solutions are particularly useful in data integration projects. Typically, at any given point in time in medium to large enterprises, there are one or more data integration projects being implemented to, for example, add additional sources to a data warehouse and standardize data from multiple legacy systems prior to integration into SAP applications.
"In many ETL processes, reference data is used to transcode source reference data values to target reference data values, such that reference data is harmonized in the target system when a process is complete. Transcoding is needed where one or more code values in the source system has a different meaning in the target system, or where the code values for the same meaning are different in the source and target system. Both issues are addressed by implementing transcoding tables harmonizing the reference data while data is exchanged between one or more source and target systems. Reference data is also used in every ETL process in order to validate data in order to ensure its 'loadability' into the target against reference data tables from the target."
In addition to obtaining background information on this patent application, VerticalNews editors also obtained the inventors' summary information for this patent application: "Embodiments disclosed herein provide a method, computer program product, and system for identifying reference data tables in an extract-transform-load data integration process, by identifying, by operation of one or more computer processors, at least a first reference data operator in the process, wherein the first reference data operator references one or more tables and evaluating at least a first table referenced by the reference data operator to determine whether the first table is a reference data table by assigning a score to the first table, wherein the score is indicative of the likelihood that the first table is a reference data table and wherein a reference data table contains a set of values that describes other data.
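The summary does not say how the score is computed, so the sketch below is only an illustration of the general idea of assigning a likelihood score to a candidate table. The three signals used (a unique first column, a small row count, and a small column count), the thresholds, and the function name `reference_table_score` are all assumptions made for this example and should not be read as the inventors' method.

```python
def reference_table_score(rows, max_rows=1000, max_columns=5):
    """Return a score in [0, 1]; higher means more likely a reference table.

    Assumed heuristic (not from the patent application): a reference
    data table tends to be small, narrow, and keyed by a unique code
    in its first column. Each signal contributes one third of the score.
    """
    if not rows:
        return 0.0
    row_count = len(rows)
    column_count = len(rows[0])
    first_column = [row[0] for row in rows]

    has_unique_key = len(set(first_column)) == row_count
    is_small = row_count <= max_rows
    is_narrow = column_count <= max_columns

    signals = [has_unique_key, is_small, is_narrow]
    return sum(signals) / len(signals)


# A code table of country codes: unique keys, few rows, two columns.
country_codes = [("DE", "Germany"), ("FR", "France"), ("US", "United States")]

# A transactional table: repeated first-column values and many columns.
orders = [
    ("C1", "2014-01-02", "A", 10, 9.5, "EUR"),
    ("C1", "2014-01-03", "B", 2, 4.0, "EUR"),
]

country_score = reference_table_score(country_codes)
orders_score = reference_table_score(orders)
```

Under these assumptions the country code table satisfies all three signals and the order table satisfies only one, so a tool could rank the former higher in a list of candidates shown to a user.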
BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS
"So that the manner in which the above recited aspects are attained and can be understood in detail, a more particular description of embodiments of the disclosure, briefly summarized above, may be had by reference to the appended drawings.
"It is to be noted, however, that the appended drawings illustrate only typical embodiments of this disclosure and are therefore not to be considered limiting of its scope, for the disclosure may admit to other equally effective embodiments.
"FIG. 1 is a block diagram illustrating a system for detecting reference data tables in an ETL process, according to one embodiment disclosed herein.
"FIG. 2 is a flowchart depicting a method for detecting reference data tables in an ETL process, according to one embodiment disclosed herein.
"FIG. 3 is a flowchart depicting a method for identifying candidate reference data tables in an ETL process, according to one embodiment disclosed herein.
"FIG. 4 is a flowchart depicting a method for determining whether a candidate reference data table is a reference data table, according to one embodiment disclosed herein.
"FIG. 5 is a flowchart depicting a method for calculating a maximal value partition, according to one embodiment disclosed herein.
"FIG. 6 is a flowchart depicting a method for detecting an indirect relationship between a table and a concept in an ontology, according to one embodiment disclosed herein.
"FIG. 7 is a flowchart depicting a method for a method for scoring candidate reference data tables, according to one embodiment disclosed herein.
"FIG. 8 illustrates an exemplary graphical user interface (GUI) screen displaying an exemplary list of candidate reference data tables presented to a user, according to one embodiment disclosed herein."
For more information, see this patent application: MANDELSTEIN, Dan J.; Milman, Ivan M.; Oberhofer, Martin A.; Pandit, Sushain. Detecting Reference Data Tables in Extract-Transform-Load Processes. Filed
Keywords for this news article include: Information Technology, Information and Data Management, Information and Data Tabulation,
Our reports deliver fact-based news of research and discoveries from around the world. Copyright 2014, NewsRx LLC