
Researchers Submit Patent Application, "Database Analyzer and Database Analysis Method", for Approval

May 20, 2014



By a News Reporter-Staff News Editor at Information Technology Newsweekly -- From Washington, D.C., VerticalNews journalists report that a patent application by the inventors HASHIMOTO, Yasunori (Tokyo, JP); MIBE, Ryota (Tokyo, JP); YOSHIMURA, Kentaro (Tokyo, JP); DANNO, Hirofumi (Tokyo, JP); ISHIKAWA, Sadahiro (Tokyo, JP); YAMAGUCHI, Kiyoshi (Tokyo, JP), filed on October 23, 2013, was made available online on May 8, 2014.

The patent's assignee is Hitachi, Ltd.

News editors obtained the following quote from the background information supplied by the inventors: "Recently, databases which retain a large amount of data have been actively used; however, regarding development of a database, it is necessary to adjust various parameters relating to the database, such as the size of resources to be allocated within the database (tuning of the database). A general method for tuning a database appropriately is to perform a test to impose load on the database by using dummy test data and thereby evaluate the status of the database.

"A commercially available test data generation tool can be used to create such test data, but a user needs to set characteristics of data to be generated with respect to, for example, the range of data values and occurrence frequency. In order to do so, it is important to definitely understand what data having what kind of characteristics are stored in an analysis target database.

"For example, Patent Literature 1 describes a test data generator for generating dummy test data from data stored in an existing database. The test data generator described in Patent Literature 1 can generate dummy test data which is suited for actual circumstances, by calculating characteristics of the data from the data stored in the existing database, which is actually in operation, and generating necessary test data for a target database to be developed by utilizing the calculated characteristics."
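To illustrate the kind of data characteristics the background passage refers to (value range and occurrence frequency), the following is a minimal sketch, not the method of Patent Literature 1. The function names profile_column and generate_dummy, the profile layout, and the sample values are all assumptions made for illustration.

```python
from collections import Counter
import random


def profile_column(values):
    """Summarize one column: null share, value range, and value frequencies."""
    non_null = [v for v in values if v is not None]
    return {
        "null_ratio": (len(values) - len(non_null)) / len(values) if values else 0.0,
        "min": min(non_null) if non_null else None,
        "max": max(non_null) if non_null else None,
        "frequency": Counter(non_null),
    }


def generate_dummy(profile, n, seed=0):
    """Draw n dummy values that follow the observed frequencies and null share."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    values = list(profile["frequency"].keys())
    weights = list(profile["frequency"].values())
    out = []
    for _ in range(n):
        if not values or rng.random() < profile["null_ratio"]:
            out.append(None)
        else:
            out.append(rng.choices(values, weights=weights, k=1)[0])
    return out


# Hypothetical values standing in for a column of an existing database.
existing = [10, 20, 20, None, 30, 20]
profile = profile_column(existing)
dummy = generate_dummy(profile, 100)
```

A profile of this kind is roughly what a user of a test data generation tool would otherwise have to work out and enter by hand.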

As a supplement to the background information on this patent application, VerticalNews correspondents also obtained the inventors' summary information for this patent application: "Problems to be Solved by the Invention

"Meanwhile, the test data generator described in Patent Literature 1 obtains characteristics of data by focusing attention on the characteristics of the data between table columns with respect to a data group which is an analysis target but cannot obtain table-column-based data characteristics. So, there is a problem of difficulty to generate an appropriate amount of test data which secures exhaustivity, based on data-column-based characteristics.

"An explanation will be given below by giving a specific example. For example, if data-column data of data groups in a certain database are divided into three types of data groups, that is, a 'null value,' 'half-size character strings,' and 'full-size character strings,' it can be expected that exhaustivity of a test of the database can be secured by conducting the test by creating test data for respective cases in which the above-mentioned three types of information is handled. However, in a case of the test data generator described in Patent Literature 1, it cannot acquire characteristics of data on a table column basis, so that you have no choice but to select a method of conducting the test by using all pieces of test data generated by the test data generator or conducting the test by using data randomly selected from all the pieces of test data generated by the test data generator. When all the pieces of test data are used under this circumstance, there is a possibility that the test data more than an essentially necessary test amount may be used in order to secure exhaustivity of the test, which results in a problem in terms of test cost and test time efficiency. Moreover, when the randomly-selected data are used, there is a problem of incapability to secure exhaustivity. Specifically speaking, it is difficult for the test data generator described in Patent Literature 1 to generate appropriate test data based on the data-column-based characteristics.
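The three-way example in the passage above can be sketched in a few lines. This is an illustration only, under stated assumptions: the names classify and representatives are hypothetical, a string counts as "full-size" if it contains at least one character whose Unicode East Asian Width is Fullwidth or Wide, and every other non-null string (including an empty one) counts as "half-size". The application itself does not specify these rules.

```python
import unicodedata


def classify(value):
    """Assign a column value to one of three illustrative classes."""
    if value is None:
        return "null"
    # Assumption: any Fullwidth ("F") or Wide ("W") character makes the
    # whole string "full-size".
    if any(unicodedata.east_asian_width(ch) in ("F", "W") for ch in value):
        return "full-size"
    return "half-size"


def representatives(column):
    """Keep the first value seen for each class, one test case per class."""
    picked = {}
    for value in column:
        picked.setdefault(classify(value), value)
    return picked


# Hypothetical column contents; the escapes spell katakana and fullwidth Latin text.
column = [
    "abc",
    None,
    "\u30c7\u30fc\u30bf",  # katakana, Wide characters
    "xyz",
    "\uff21\uff22\uff23",  # fullwidth Latin letters
    None,
]
reps = representatives(column)
```

Testing with one representative per class covers all three cases with three rows, rather than every generated row or a random sample that might miss a class.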

"The present invention was devised in consideration of the above-described circumstances and aims at proposing a database analyzer and database analysis method capable of exhaustively analyzing a database and providing a data pattern obtained by classifying data groups of the database in terms of table-column-based characteristics.

"Means for Solving the Problems

"In order to solve the above-mentioned problems, provided according to the present invention is a database analyzer for analyzing a data group stored in an analysis target database by focusing attention on a designated table column in the data, the database analyzer including: a storage unit storing data; a data sorting unit for sorting a data group acquired from the analysis target database based on data values of the table column and storing it as analysis target data in the storage unit; a data pattern creation processing unit for creating a group for each of the data values based on differences between the data values of the analysis target data and storing a data pattern, which is a collection of the groups, in the storage unit; a data pattern judgment processing unit for judging validity of the data pattern stored in the storage unit based on a first judgment standard; and a data pattern transformation processing unit for transforming and reconstructing the data pattern and storing the reconstructed data pattern in the storage unit if a negative result is obtained for the validity judgment by the data pattern judgment processing unit; wherein the data pattern transformation processing unit reconstructs the data pattern with respect to constituent elements of each group included in the data pattern by transforming each group in accordance with a specified conversion rule for converting the constituent elements, which are conceptually similar to each other, into the same constituent element.

"Furthermore, in order to solve the above-mentioned problems, provided according to the present invention is a database analysis method by a database analyzer for analyzing a data group stored in an analysis target database by focusing attention on a designated table column in the data, the database analyzer including a storage unit storing data, the database analysis method including: a data sorting step executed by the data analyzer sorting a data group acquired from the analysis target database based on data values of the table column and storing it as analysis target data in the storage unit; a data pattern creation step executed by the data analyzer creating a group for each of the data values based on differences between the data values of the analysis target data and storing a data pattern, which is a collection of the groups, in the storage unit; a data pattern judgment step executed by the data analyzer judging validity of the data pattern stored in the storage unit based on a first judgment standard; and a data pattern reconstruction step executed, if a negative result is obtained for the validity judgment by the data pattern judgment unit, by the data analyzer reconstructing the data pattern with respect to constituent elements of each group included in the data pattern by transforming each group in accordance with a specified conversion rule for converting the constituent elements, which are conceptually similar to each other, into the same constituent element and storing the reconstructed data pattern in the storage unit.
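The sequence of steps described above (sort, create a data pattern, judge its validity, and transform it if the judgment is negative) can be sketched as follows. This is a rough illustration and not the applicants' implementation. Every specific is an assumption: the function names, the judgment standard (a maximum number of groups), and the conversion rule (every digit becomes "9", every lowercase letter "a", every uppercase letter "A"). The sketch also assumes the column holds only strings.

```python
from itertools import groupby


def sort_values(values):
    """Data sorting step: order the column values."""
    return sorted(values)


def create_pattern(sorted_values):
    """Data pattern creation step: one group per distinct value, with its count."""
    return {key: len(list(group)) for key, group in groupby(sorted_values)}


def is_valid(pattern, max_groups):
    """Judgment step. Assumed standard: the pattern has at most max_groups groups."""
    return len(pattern) <= max_groups


def convert(element):
    """Assumed conversion rule: map similar characters to one stand-in character."""
    if element.isdigit():
        return "9"
    if element.isalpha():
        return "A" if element.isupper() else "a"
    return element


def transform_pattern(pattern):
    """Transformation step: rewrite each group key and merge groups that now match."""
    rebuilt = {}
    for key, count in pattern.items():
        new_key = "".join(convert(ch) for ch in key)
        rebuilt[new_key] = rebuilt.get(new_key, 0) + count
    return rebuilt


def analyze(values, max_groups=3):
    """Run the steps in order; transform only if the first pattern is judged invalid."""
    pattern = create_pattern(sort_values(values))
    if not is_valid(pattern, max_groups):
        pattern = transform_pattern(pattern)
    return pattern


# Hypothetical column values.
column = ["A-001", "A-002", "B-117", "x9", "y3", "A-001"]
result = analyze(column)
print(result)  # {'A-999': 4, 'a9': 2}
```

With these sample values the initial pattern has five groups, which fails the assumed limit of three, so the conversion rule collapses them into two groups.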

"Advantageous Effects of Invention

"The present invention can exhaustively analyze a database and provide a data pattern obtained by classifying data groups of the database in terms of table-column-based characteristics.

"Brief Description of Drawings

"FIG. 1 is a block diagram illustrating a configuration example of a database analyzer according to a first embodiment.

"FIG. 2 is a flowchart illustrating a database analysis processing sequence for analyzing data groups of a database.

"FIG. 3 is a schematic diagram for explaining analysis target data.

"FIG. 4 is schematic diagram (1) for explaining processing for creating an initial data pattern.

"FIG. 5 is schematic diagram (2) for explaining the processing for creating the initial data pattern.

"FIG. 6 is schematic diagram (3) for explaining the processing for creating the initial data pattern.

"FIG. 7 is a table illustrating an example of a data pattern evaluation standard.

"FIG. 8 is a schematic diagram for explaining processing for evaluating validity of a data pattern.

"FIG. 9 is a table illustrating an example of data pattern transformational rules.

"FIG. 10 is schematic diagram (1) for explaining processing for transforming a data pattern.

"FIG. 11 is schematic diagram (2) for explaining processing for transforming a data pattern.

"FIG. 12 is schematic diagram (3) for explaining processing for transforming a data pattern.

"FIG. 13 is a schematic diagram for explaining processing for deciding a reconstructed data pattern from among data patterns after the transformation processing.

"FIG. 14 is a schematic diagram for explaining validity evaluation of the reconstructed data pattern.

"FIG. 15 is a schematic diagram for explaining an example of processing for outputting a data pattern.

"FIG. 16 is a block diagram illustrating a configuration example of a database analyzer according to a second embodiment.

"FIG. 17 is schematic diagram (1) for explaining processing for creating an initial data pattern according to the second embodiment.

"FIG. 18 is schematic diagram (2) for explaining processing for creating the initial data pattern according to the second embodiment.

"FIG. 19 is a table showing an example of a data pattern evaluation standard according to the second embodiment.

"FIG. 20 is a schematic diagram for explaining data pattern validity evaluation according to the second embodiment.

"FIG. 21 is a table showing an example of data pattern transformational rules according to the second embodiment.

"FIG. 22 is a schematic diagram for explaining exception pattern judgment processing based on data pattern rejection rules according to the second embodiment.

"FIG. 23 is a schematic diagram for explaining exception pattern rejection processing according to the second embodiment.

"FIG. 24 is a schematic diagram for explaining processing for evaluating validity of a finally reconstructed data pattern according to the second embodiment.

"FIG. 25 is a schematic diagram for explaining an example of processing for outputting a data pattern according to the second embodiment."

For additional information on this patent application, see: HASHIMOTO, Yasunori; MIBE, Ryota; YOSHIMURA, Kentaro; DANNO, Hirofumi; ISHIKAWA, Sadahiro; YAMAGUCHI, Kiyoshi. Database Analyzer and Database Analysis Method. Filed October 23, 2013 and posted May 8, 2014. Patent URL: http://appft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&u=%2Fnetahtml%2FPTO%2Fsearch-adv.html&r=915&p=19&f=G&l=50&d=PG01&S1=20140501.PD.&OS=PD/20140501&RS=PD/20140501

Keywords for this news article include: Hitachi Ltd., Information Technology, Information and Data Generators.

Our reports deliver fact-based news of research and discoveries from around the world. Copyright 2014, NewsRx LLC





Source: Information Technology Newsweekly