Patent Application Titled "Utilizing Classification and Text Analytics for Annotating Documents to Allow Quick Scanning" Published Online

August 19, 2014



By a News Reporter-Staff News Editor at Information Technology Newsweekly -- According to news reporting originating from Washington, D.C., by VerticalNews journalists, a patent application by the inventors Emanuel, Barton W. (Manassas, VA); Paulis, Mark W. (New York, NY); Roboff, Mark L. (New York, NY), filed on March 27, 2014, was made available online on August 7, 2014.

The assignee for this patent application is International Business Machines Corporation.

Reporters obtained the following quote from the background information supplied by the inventors: "The present invention relates generally to annotating documents, and in particular, to a method, apparatus, and article of manufacture for utilizing classification and text analytics to annotate lengthy documents to allow quick scanning.

"When faced with quickly scanning large documents or publications, humans may miss many important facts and fail to understand key points and issues. Understanding and retention of concepts can be improved by manual techniques such as highlighting key phrases and concepts, or making marginal notations. Such manual techniques take time for humans to perform. Productivity may be increased and manual effort reduced by automatically notating texts to highlight key concepts and to list salient facts."

In addition to obtaining background information on this patent application, VerticalNews editors also obtained the inventors' summary information for this patent application: "A computer-implemented method provides the ability to annotate a document. A document is obtained. The type and subject domain of the document are determined and an annotation strategy and domain model to load are determined based on the document type and subject domain respectively. The document is segmented into paragraphs and sections based on a document structure. A text analytics system provides annotations for each paragraph of the document based on the domain model and annotation strategy. Text in the document is annotated by applying the annotations to the original text of the document. The document (including the annotations) is then rendered.

"A system is utilized to annotate a document. The system includes a classifier, an annotation model, a text analytics system, and a custom viewer/renderer application. The classifier has domain and document-type taxonomies. The classifier determines a type of the document and a subject domain of the document. The annotation model has information that it uses to determine and drive an annotation strategy based on various document types. The text analytics system has multiple domain models, and loads the appropriate domain model based on the subject domain. The text analytics system is also configured to provide annotations of each paragraph of the document based on the domain model and annotation model. The custom viewer/renderer application annotates the document with the annotations and renders the document including the annotations.

"A computer program product annotates a document. The computer program product comprises a computer readable storage medium having computer readable program code embodied therewith. Computer readable program code is configured to obtain the document. Computer readable program code is configured to determine a type of the document. Computer readable program code is configured to determine a subject domain of the document. Computer readable program code is configured to determine an annotation strategy based on the type of document. Computer readable program code is configured to determine a domain model to load based on the subject domain. Computer readable program code is configured to segment the document into paragraphs and sections based on a document structure. Computer readable program code is configured to provide annotations for each paragraph of the document based on the domain model and annotation strategy. Computer readable program code is configured to annotate text in the document by applying the annotations to original text of the document. Computer readable program code is configured to render the document including the annotations.

BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS

"Referring now to the drawings in which like reference numbers represent corresponding parts throughout:

"FIG. 1 illustrates a pictorial representation of a network data processing system used in accordance with one or more embodiments of the invention;

"FIG. 2 illustrates a block diagram of a data processing system that may be implemented as a server in accordance with an embodiment of the present invention;

"FIG. 3 illustrates a block diagram of a data processing system in accordance with an embodiment of the present invention;

"FIG. 4 illustrates a system used to annotate a document in accordance with one or more embodiments of the invention; and

"FIG. 5 illustrates the logical flow for annotating a document in accordance with one or more embodiments of the invention."

For more information, see this patent application: Emanuel, Barton W.; Paulis, Mark W.; Roboff, Mark L. Utilizing Classification and Text Analytics for Annotating Documents to Allow Quick Scanning. Filed March 27, 2014 and posted August 7, 2014. Patent URL: http://appft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&u=%2Fnetahtml%2FPTO%2Fsearch-adv.html&r=369&p=8&f=G&l=50&d=PG01&S1=20140731.PD.&OS=PD/20140731&RS=PD/20140731
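
For readers who want a concrete picture, the flow described in the inventors' summary (classify the document, pick an annotation strategy and a domain model, segment into paragraphs, annotate each paragraph, then render) can be sketched in Python. This is a minimal illustration under stated assumptions and not the implementation described in the application: the keyword-counting classifier, the DOMAIN_MODELS and STRATEGIES tables, the asterisk highlighting, and every function name below are hypothetical stand-ins chosen for brevity.

```python
import re
from dataclasses import dataclass

# Hypothetical domain models: a few key terms per subject domain.
DOMAIN_MODELS = {
    "finance": ["revenue", "dividend", "earnings"],
    "medicine": ["dosage", "diagnosis", "symptom"],
}

# Hypothetical annotation strategies keyed by document type.
STRATEGIES = {
    "report": {"highlight": True, "list_facts": True},
    "memo": {"highlight": True, "list_facts": False},
}


@dataclass
class Annotation:
    paragraph: int  # index of the paragraph containing the term
    start: int      # character offset where the term begins
    end: int        # character offset just past the term
    term: str       # matched text


def classify(text):
    """Toy classifier (assumption): returns (document type, subject domain)."""
    doc_type = "memo" if text.lstrip().upper().startswith("MEMO") else "report"
    lowered = text.lower()
    scores = {d: sum(lowered.count(t) for t in terms)
              for d, terms in DOMAIN_MODELS.items()}
    # On a tie, max() keeps the first domain listed in DOMAIN_MODELS.
    return doc_type, max(scores, key=scores.get)


def segment(text):
    """Split the document into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def analyze(paragraphs, model):
    """Stand-in for the text analytics step: find domain terms per paragraph."""
    pattern = re.compile(
        r"\b(" + "|".join(map(re.escape, model)) + r")\b", re.IGNORECASE)
    return [Annotation(i, m.start(), m.end(), m.group(0))
            for i, para in enumerate(paragraphs)
            for m in pattern.finditer(para)]


def render(paragraphs, annotations, strategy):
    """Apply annotations to the original text and return the rendered result."""
    out = []
    for i, para in enumerate(paragraphs):
        # Work right to left so earlier offsets stay valid after each insertion.
        spans = sorted((a for a in annotations if a.paragraph == i),
                       key=lambda a: a.start, reverse=True)
        if strategy["highlight"]:
            for a in spans:
                para = (para[:a.start] + "**" + para[a.start:a.end]
                        + "**" + para[a.end:])
        out.append(para)
        if strategy["list_facts"] and spans:
            terms = sorted({a.term.lower() for a in spans})
            out.append("  [key terms: " + ", ".join(terms) + "]")
    return "\n\n".join(out)


def annotate_document(text):
    doc_type, domain = classify(text)
    strategy = STRATEGIES[doc_type]   # strategy follows the document type
    model = DOMAIN_MODELS[domain]     # domain model follows the subject domain
    paragraphs = segment(text)
    return render(paragraphs, analyze(paragraphs, model), strategy)
```

Calling annotate_document on a short two-paragraph text about revenue and dividends would select the "report" strategy and the "finance" model, wrap each matched term in asterisks, and append a key-terms line after each paragraph. A production system of the kind the application describes would presumably replace the keyword tables with trained classifiers and richer domain models.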

Keywords for this news article include: Information Technology, Information and Data Processing, International Business Machines Corporation.

Our reports deliver fact-based news of research and discoveries from around the world. Copyright 2014, NewsRx LLC


Source: Information Technology Newsweekly

