News Column

Patent Issued for Method and Device for Improved Ontology Engineering

September 2, 2014

By a News Reporter-Staff News Editor at Information Technology Newsweekly -- Collibra NV/SA (Brussels, BE) has been issued patent number 8812553, according to news reporting originating out of Alexandria, Virginia, by VerticalNews editors.

The patent's inventors are Trog, Damien (Oudergem, BE); Christiaens, Stijn (Zele, BE); De Leenheer, Pieter (Brussel, BE); Van De Maele, Felix Urbain Yolande (B-Oostende, BE); Meersman, Robert Alfons (Deurne, BE).

This patent was filed on April 29, 2010 and was published online on August 19, 2014.

From the background information supplied by the inventors, news correspondents obtained the following quote: "Internet and other open connectivity environments create a strong demand for sharing the semantics of data. Ontology systems are becoming increasingly essential for nearly all computer applications. Organizations are looking towards them as vital machine-processable semantic resources for many application areas. An ontology is an agreed understanding (i.e. semantics) of a certain domain, axiomatized and represented formally as logical theory in the form of a computer-based resource. By sharing an ontology, autonomous and distributed applications can meaningfully communicate to exchange data and thus make transactions interoperate independently of their internal technologies.

"Ontologies capture domain knowledge of a particular part of the real-world, e.g., knowledge about product delivery. Ontologies can be seen as a formal representation of the knowledge by a set of concepts and the relationships between those concepts within a domain. Ontologies must capture this knowledge independently of application requirements (e.g. customer product delivery application vs. deliverer product delivery application). Application-independence is the main disparity between an ontology and a classical data schema (e.g., EER, ORM, UML) although each captures knowledge at a conceptual level. For example, many researchers have confused ontologies with data schemes, knowledge bases, or even logic programs. Unlike a conceptual data schema or a 'classical' knowledge base that captures semantics for a given enterprise application, the main and fundamental advantage of an ontology is that it captures domain knowledge highly independent of any particular application or task. A consensus on ontological content is the main requirement in ontology engineering, and this is what mainly distinguishes it from conceptual data modelling.

"The main foundational challenge in ontology engineering is the trade-off between ontology usability and reusability. The more an ontology is independent of application perspectives, the less usable it will be. In contrast, the closer an ontology is to application perspectives, the less reusable it will be.

"Certain prior art systems use XML schemas as so-called ontologies. However, XML schemas are not ontologies for the following reasons. They define a single representation syntax for a particular problem domain but not the semantics of domain elements. They define the sequence and hierarchical ordering of fields in a valid document instance, but do not specify the semantics of this ordering. For example, there is no explicit semantics of nesting elements. They do not aim at carving out re-usable, context-independent categories of things--e.g. whether a data element 'student' refers to the human being or the role of being as student. Quite the opposite, one can often observe that XML schema definitions tangle very different categories in their element definitions, which hampers the reuse of respective XML data in new contexts.

"Ontology systems are typically used for querying multiple information systems. The ontology system typically comprises a union of the elements within said information systems. Prior art systems, as described in US2006/101073 and WO2008/088721, typically describe a system and method for data integration whereby multiple XML source schemas are queried through a common XML target schema.

"However, recent developments in open connectivity applications demand communication between two or more information systems. Any communication between two or more information systems occurs in some format serialized in a language such as XML. In order to align the different formats (e.g., the format of the sending party and the format expected by the receiving party), people responsible for the systems have to align as well, until they reach an agreement on what to send, and how exactly it will be represented. Currently, this problem is solved ad hoc by creating some case specific solution (e.g., an XSLT script). However, there is absolutely no extra value or means for reusability created by taking this approach.

"Current solutions mostly consist of creating custom transformations between every format. Point to point approaches are fast but difficult to make, manage and maintain. Hub and spoke approaches are more efficient but more difficult to develop and maintain, and have problems with flexibility.

"Typical prior art systems, such as EP 1 260 916, model entities and the binary relations between them. This is like speaking a two-word language. However, real world natural language consists of sentences, linking multiple words in a semantical relationship. It is inherent that sentences comprise more meaning.

"In the paper 'Towards Ontological Commitments with .OMEGA.-RIDL Markup Language' (D. Trog et al., Advances in Rule Interchange and Applications, Lecture Notes in Computer Science, pp. 92-106) a markup language (XML) representation of the .OMEGA.-RIDL language is described. The different constraints are presented in both controlled natural language and markup language. A representation of a conceptual path is shown. It is to be noted that a conceptual path provides the basis to compose a conceptual query. However, the paper remains silent on how such a query can be composed or executed. The paper does not discuss performing data format translations. Only conceptual querying is discussed, which involves reading. Updates, which involve both reading and writing operations to perform a translation, are not discussed.

"In 'Ontology Engineering--the DOGMA approach' (M. Jarrar et al, Advances in Web Semantics I, vol. 4891, 2009-01-01, pp. 7-34) the authors describe the DOGMA ontology approach compared with other approaches. The paper is about the motivation behind splitting the Ontology Base (also Lexon Base) and axiomatizations (also commitments), what they dub the Double Articulation Principle. The constructs are formalized in first order logic, with discussion about description logics. Only a search/retrieval scenario is given as an example without actually explaining how it would work. Again this is limited to conceptual querying.

"Patent application EP1327941 A2 describes a method for transforming data from one data schema to another by mapping the schemas into an ontology model, and deriving a transformation. The result of the method is a unidirectional transformation script, such as XSLT, which is processed by a pre-existing transformation engine. The patent application describes a frame-based approach (i.e. classes having properties), where data schema elements are mapped on properties of classes.

"Hence, there is a need for more natural language and re-usability in ontology engineering and more specifically in the communication between information data systems."

Supplementing the background information on this patent, VerticalNews reporters also obtained the inventors' summary information for this patent: "The solution of the present invention adheres to a fact-based approach (wherein objects play roles with each other) that allows construction of complex paths. These paths are a form of controlled natural language that improve readability and reduce the number of concepts needed.

"In a first aspect the invention provides a method for populating a data system for use in a computer application, whereby the data system has a structure addressable by at least one application path. The method comprises the steps of: a. mapping the at least one application path of the data system to at least one conceptual path of an ontology system, said at least one conceptual path addressing a part of the structure of the ontology system, and b. populating the data system at a location addressed by the application path with data values contained in the conceptual path.

"In the proposed method some given ontology system is used. The ontology system has a certain syntactic structure. Conceptual paths can be defined that are capable of addressing parts of the structure of the ontology system. A data system used by a computer application is provided with a given structure that can be addressed (or parts thereof can be addressed) by application paths. When linking the data system to the ontology system, the present invention proposes performing a mapping between the application paths of the data system and the conceptual paths of the ontology system. In this invention these mappings are interpreted in real-time by a translation engine as described in detail below. In this way, different data structures with different representations, syntax and terminology are mapped to a shared and agreed upon ontology, resulting in increased transparency, compliance and reuse, as well as automated translation between disparate systems.

"After the mapping step the data system is populated with data values comprised in the conceptual path and this at a location in the data system addressed by the application path.

"In a preferred embodiment the method comprises the initial step of generating the data system. This is possible by exploiting the structure of the given ontology system.

"In one embodiment the method further comprises the step of linking an additional data system acting as a source data system to the ontology system. This is achieved by mapping at least one application path of the additional data system to the at least one conceptual path of the ontology system. The other data system then acts as target data system.

"The ontology system is then shared, so that transformations between said data systems can be derived, i.e., between the source and the target data system. Also transformations between more than one source data system and more than one target data system can be envisaged. The transformations provide to read data into the ontology system from one data system. The at least one conceptual path of the ontology system is then populated with data values contained in the at least one application path of the additional data system. The transformation further also provides to write said data to a data system. Said data are written from conceptual paths of the ontology system to the (target) data system. The present invention allows for querying as well as for updating data systems.

"In a typical embodiment the additional data system has a structure different from the other data system. However, in a specific embodiment it is possible that their structure is the same. In one embodiment the schemas of the data system and the additional data system acting as source data system are the same. Using this approach a data system can be trimmed, enriched or its contents validated. Data system trimming results in a data system that contains only a subset of the data contained within. This can occur e.g., in a scenario where the data system contains information about customers, orders and invoices. By trimming the data system, a new data system can be produced that only contains the customer information. In enrichment additional data is added. E.g., when one wants to enrich the customer data with information from accounting, one can connect to the accounting data and add the additional data. Validation allows checking whether the data in the data system follows the rules as they were specified in the commitment. Since the rules are richer this offers better checks than validation of the data schema.

"The present invention provides conceptual reading and writing in any format and is as such not limited to relational database formats or XML messages. This allows true format translations between data systems of different formats. As already mentioned, the format translations may be provided between more than one source data system and/or more than one target data system. In a specific embodiment said source and target data systems are the same data system.

"In one aspect the present invention provides a data storage system for storing data instances in the ontology system. In one embodiment said storage data system is a relational database. Alternatively, said storage data system is memory.

"At least one conceptual path may contain one or more identifiers of one or more of the source and/or target and/or storage data systems, and the commitment layer comprises commitments for mapping said identifiers. A first identifier of a sending party may be provided and a second identifier of a receiving party.

"At least one conceptual path may contain metadata related to messaging or translation between the source and/or target and/or storage data systems, and the commitment layer comprises commitments for mapping the metadata. In a particular embodiment an identifier is provided of the parties that sent or received the metadata. The metadata may be a line number or a message sequence and may be used for security, logging, message ordering and message duplication.

"In an embodiment at least one conceptual path contains virtual concepts having no corresponding data value in the data system (or optionally in the source data system). Virtual concepts add to the real-world conception and may be needed for a better conceptual understanding.

"In a further embodiment mappings onto queries may be provided. Said queries may comprise calculations. The queries are preloaded with the values found in the data system. If those values cannot be found the queries are executed and calculated.

"In yet a further embodiment for one or more of the mappings the instances need to be manipulated after reading by executing a function.

"In another embodiment mappings onto functions may be provided. Said functions comprise procedural logic that performs operations on the data in the input and results in data as output for further processing in the translation. These functions can be custom, written in specific scripting language (e.g., Groovy). Mappings onto rules may be provided. Said rules check if the data in the input are consistent and correct. Custom rules can be written via said functions.

"In another embodiment said ontology system provides a transformation from at least a part of the logical representation of data in said source data system to at least a part of the logical representation of data in said target system.

"In yet a further aspect a program is provided executable on a programmable device containing instructions, which, when executed, perform the method as set out above.

"In a further aspect the invention relates to a device for populating a data system for use in a computer application, whereby the data system has a structure addressable by at least one application path. The device comprises means for receiving mapping information for mapping the at least one application path of the data system to at least one conceptual path of an ontology system, said at least one conceptual path addressing a part of the structure of the ontology system, and means for populating the data system based on the received mapping information."

For the URL and additional information on this patent, see: Trog, Damien; Christiaens, Stijn; De Leenheer, Pieter; Van De Maele, Felix Urbain Yolande; Meersman, Robert Alfons. Method and Device for Improved Ontology Engineering. U.S. Patent Number 8812553, filed April 29, 2010, and published online on August 19, 2014. Patent URL:

Keywords for this news article include: Engineering, Information Technology, Information and Data Systems.

Our reports deliver fact-based news of research and discoveries from around the world. Copyright 2014, NewsRx LLC


Source: Information Technology Newsweekly
