News Column

Patent Application Titled "Determining Similarity of Unfielded Names Using Feature Assignments" Published Online

June 24, 2014



By a News Reporter-Staff News Editor at Information Technology Newsweekly -- According to news reporting originating from Washington, D.C., by VerticalNews journalists, a patent application by the inventor Patman Maguire, Frankie E. (Washington D.C., WA), filed on July 11, 2013, was made available online on June 12, 2014.

No assignee has been named for this patent application.

Reporters obtained the following quote from the background information supplied by the inventor: "Data storage systems typically break personal names into multiple parts (i.e., parse the personal names) and store these parts in different fields, which may be labeled with terms such as 'given name,' 'middle name,' 'surname,' etc. Such a parsed name may be referred to as a fielded name, and parts of the name may be referred to as terms. Record retrieval systems then compare members of the same field to each other to determine which names are a match for a query. For example, a search for a database record with the name-related fields 'GivenName=Mary', 'Surname=Smith' would compare 'Mary' to terms stored in the field named 'GivenName' and 'Smith' to terms stored in the field named 'Surname.'

"Fielded names contribute to match failures in searches based on name because there is not always a strict correspondence between the terms used in a name and the fields into which the terms are parsed. This is especially true when names from various linguistic and cultural origins are stored in a system designed around one name model. For example, a typical male name in Saudi Arabia is made up of a person's given name, his father's name, his grandfather's name, and a family or tribal name. Western data storage systems may store names in the following fields: 'given name,' 'middle name,' and 'surname'. In such systems, the given name portion of the Saudi Arabian name corresponds to the given name field found in the Western data storage systems. Other parts of the Saudi Arabian name may be distributed across the available fields in various ways in different data storage systems. When a name search is done, the inconsistent fielding may lead to there being no corresponding name parts within the same fields as those of the query.

"Some search systems allow multiple parses of the names to be compared, and then searching on each of the possible parses. For example, 'Islam Azam Muhammed Metwali' might be variously represented as 'Metwali, Islam Azam Muhammed,' 'Muhammed Metwali, Islam Azam,' and 'Azam Muhammed Metwali, Islam.' While this strategy may reduce the chance that relevant names will be missed altogether, it also tends to increase the number of false positives returned by a search. For example, 'Mohammedi, Islam Baahi' would be allowed by the third parse, even though it is not a variant form of 'Islam Azam Muhammed Metwali.' This approach also requires multiple comparisons, which increases search times.

"Other systems match on tokens rather than names, then return the full names containing the matching tokens. A token may be described as a space-delimited sequence of characters representing a word in a name. In these other systems, returned names may be sorted for presentation based on various filtering or relevance criteria. The sorting criteria may be based on factors other than token similarity. For example, in one system, a search on 'Fernando Gomes' with no further qualifying information returns 'Fernando Jose Ferreira Gomes' and 'Fernando Gomes da Gama,' as the top two out of twenty matching names, ahead of the exact match 'Fernando Gomes,' and then also returns 'Paulo Francisco Gomes Fernandez' ahead of 'Fernando Luciano Gomes de Mendezes,' even though the latter name is more similar to the query."
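The two problems described in the background, inconsistent fielding and token matches ranked by something other than similarity, can be illustrated with a short Python sketch. Everything in it is an assumption made for illustration: the field names, the sample records, and the helper functions fielded_match and jaccard are hypothetical and do not come from the patent application, and token-set (Jaccard) similarity is only one simple stand-in for "similar to the query."

```python
# Illustrative sketch only. Field names, records, and both helper functions
# are assumptions made for this example, not part of the patent application.


def fielded_match(query, record):
    """Return True only if every query field equals the same field in the record."""
    return all(
        record.get(field, "").lower() == value.lower()
        for field, value in query.items()
    )


def jaccard(name_a, name_b):
    """Token-set similarity: shared tokens divided by all distinct tokens."""
    a = set(name_a.lower().split())
    b = set(name_b.lower().split())
    return len(a & b) / len(a | b)


# One four-part name, split across fields differently by two systems.
record_a = {"GivenName": "Islam", "MiddleName": "Azam Muhammed", "Surname": "Metwali"}
record_b = {"GivenName": "Islam", "MiddleName": "Azam", "Surname": "Muhammed Metwali"}

query = dict(record_a)
same_fielding = fielded_match(query, record_a)   # fields line up with the query
other_fielding = fielded_match(query, record_b)  # same name, different fielding

# The names from the quoted search example, ranked by token-set similarity.
candidates = [
    "Fernando Jose Ferreira Gomes",
    "Fernando Gomes da Gama",
    "Fernando Gomes",
    "Paulo Francisco Gomes Fernandez",
    "Fernando Luciano Gomes de Mendezes",
]
ranked = sorted(candidates, key=lambda n: jaccard("Fernando Gomes", n), reverse=True)
```

Under these assumptions, the field-by-field comparison rejects record_b even though it holds the same name as the query, while the similarity-based ranking places the exact match "Fernando Gomes" first and "Fernando Luciano Gomes de Mendezes" ahead of "Paulo Francisco Gomes Fernandez."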

In addition to obtaining background information on this patent application, VerticalNews editors also obtained the inventor's summary information for this patent application: "Provided are a method, computer program product, and system for comparing names. A first phrase score is obtained by comparing a name phrase in a first name to a name phrase in a second name. A second phrase score is obtained by comparing another name phrase in the first name to another name phrase in the second name. An overall score is generated based on the obtained first phrase score and the obtained second phrase score. The overall score is updated based on comparing features of the first name with features of the second name.

BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS

"Referring now to the drawings in which like reference numbers represent corresponding parts throughout:

"FIG. 1 illustrates a computing environment in accordance with certain embodiments.

"FIG. 2 illustrates, in a flow diagram, operations for comparing two names using features in accordance with certain embodiments. FIG. 2 is formed by FIG. 2A, FIG. 2B, and FIG. 2C.

"FIG. 3 depicts a cloud computing node in accordance with certain embodiments.

"FIG. 4 depicts a cloud computing environment in accordance with certain embodiments.

"FIG. 5 depicts abstraction model layers in accordance with certain embodiments."
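The scoring flow in the inventor's summary (two phrase scores, an overall score, then an update based on features) might be sketched in Python as follows. This is a rough illustration under stated assumptions, not the method claimed in the application: the summary does not say how phrases are compared, how the scores are combined, or which features are used. Here the phrase comparison uses difflib.SequenceMatcher from the Python standard library, the overall score is a plain average, and the only "feature" is a hypothetical token count with a hypothetical fixed penalty.

```python
from difflib import SequenceMatcher


def phrase_score(phrase_a, phrase_b):
    """Assumed phrase comparison: character-level similarity in [0, 1]."""
    return SequenceMatcher(None, phrase_a.lower(), phrase_b.lower()).ratio()


def token_count(name_phrases):
    """Hypothetical feature: total number of space-delimited tokens in a name."""
    return sum(len(phrase.split()) for phrase in name_phrases)


def compare_names(first, second, feature_penalty=0.1):
    """Compare two names, each given as a (phrase_1, phrase_2) tuple.

    The averaging step and the penalty value are assumptions for illustration.
    """
    # First and second phrase scores, one per pair of corresponding phrases.
    first_score = phrase_score(first[0], second[0])
    second_score = phrase_score(first[1], second[1])

    # Overall score generated from the two phrase scores (assumed: mean).
    overall = (first_score + second_score) / 2

    # Update the overall score by comparing features of the two names.
    if token_count(first) != token_count(second):
        overall -= feature_penalty

    return max(0.0, overall)


identical = compare_names(("Islam Azam", "Muhammed Metwali"),
                          ("Islam Azam", "Muhammed Metwali"))
shorter = compare_names(("Islam", "Metwali"),
                        ("Islam Azam", "Metwali"))
```

In this sketch, two identical names score 1.0, and a name with a missing token scores lower both because its first phrase is less similar and because the token-count feature differs.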

For more information, see this patent application: Patman Maguire, Frankie E. Determining Similarity of Unfielded Names Using Feature Assignments. Filed July 11, 2013 and posted June 12, 2014. Patent URL: http://appft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&u=%2Fnetahtml%2FPTO%2Fsearch-adv.html&r=1209&p=25&f=G&l=50&d=PG01&S1=20140605.PD.&OS=PD/20140605&RS=PD/20140605

Keywords for this news article include: Patents, Information Technology, Information and Data Storage.

Our reports deliver fact-based news of research and discoveries from around the world. Copyright 2014, NewsRx LLC





Source: Information Technology Newsweekly

