News Column

Patent Issued for Automated Patient/Document Identification and Categorization for Medical Data

June 27, 2014



By a News Reporter-Staff News Editor at Health & Medicine Week -- From Alexandria, Virginia, NewsRx journalists report that a patent by the inventors Chung, Stanley (Royersford, PA); Farooq, Faisal (King of Prussia, PA); Fung, Glenn (Madison, WI); Krishnapuram, Balaji (King of Prussia, PA); Rao, R. Bharat (Berwyn, PA); Rosales, Romer E. (Downingtown, PA); Weis, John (Collegeville, PA); Yu, Shipeng (Exton, PA), filed on September 28, 2010, was published online on June 10, 2014 (see also Siemens Medical Solutions USA, Inc.).

The patent's assignee for patent number 8751495 is Siemens Medical Solutions USA, Inc. (Malvern, PA).

News editors obtained the following quote from the background information supplied by the inventors: "The present invention relates to extracting and classifying medical data from a medical data storage source, and more particularly, to extracting and classifying medical data that is in unstructured form from such a source.

"In general, an electronic medical record (EMR) is a computerized legal medical record created in an organization that delivers care, such as a hospital or doctor's office. In an EMR, various data elements may be associated to a patient or a patient visit; for example, diagnosis codes, lab results, pharmacy, insurance, doctor notes, radiological images, genotypic information, etc. EMRs tend to be part of a local stand-alone health information system that allows storage, retrieval and manipulation of records.

"Data in an EMR is stored in structured or unstructured form. FIG. 1 shows an exemplary EMR 100 with structured and unstructured data. In FIG. 1, the patient's name 'John Doe' in field 110 and the examination date 'Jan. 1, 2007' in field 120 are examples of structured data. The medical report (e.g., doctor's note) 'Patient presents . . . ' in field 130 is an example of unstructured data. Other examples of structured data may include date of birth (mm/dd/yyyy), zip code (a five-digit number), smoke status (either yes/no), insurance type (either Medicare/Medicaid/private), or medication list (medication A, medication B . . . ). Other examples of unstructured data may include images, lab reports, biological sequences and other forms of written reports.

"The distinction between these two data types is that desired information can be easily extracted from structured data by using a standard database query language, such as Structured Query Language (SQL). This is so, because the format of the structured data is generally fixed and already known. In contrast, it is not easy to extract desired information from unstructured data. This is so, because the format of the unstructured data is generally not fixed or it is too generic.

"For example, with reference to FIG. 1, it is straightforward for a computer to determine the patient's name from the name of patient field 110, or the date of the patient's examination from the date of examination field 120, in both cases assuming the computer knows the formatting of fields 110 and 120. However, due to the freeform entry of data into the medical report field 130, it is not straightforward for a computer to determine what the patient's prescription is from field 130.

"As can be gleaned, unstructured data is an essential source of patient information. In fact, it is widely accepted that key clinical information in an EMR is stored in unstructured form. However, by their inherent nature discussed above, it is difficult to automatically extract useful information contained in unstructured data and make it available in a readily usable form. Such information is typically found through manual search."
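As an editorial illustration of the contrast the inventors draw, the following minimal Python sketch retrieves structured fields with a plain SQL query and then falls back to a hand-written regular expression for the free-text note. The table name `visits`, its columns, the wording of the sample note, and the regular expression are all assumptions made for this example; none of them comes from the patent.

```python
import re
import sqlite3

# Hypothetical structured record with fixed, known columns (illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (patient_name TEXT, exam_date TEXT, note TEXT)")
conn.execute(
    "INSERT INTO visits VALUES (?, ?, ?)",
    (
        "John Doe",
        "2007-01-01",
        "Patient presents with cough. Prescribed amoxicillin 500 mg twice daily.",
    ),
)

# Structured data: a standard SQL query returns each field directly.
name, exam_date, note = conn.execute(
    "SELECT patient_name, exam_date, note FROM visits"
).fetchone()

# Unstructured data: the prescription is buried in free text, so this sketch
# relies on a pattern that only works for this particular phrasing.
match = re.search(r"[Pp]rescribed\s+(\w+)\s+(\d+\s*mg)", note)
prescription = match.groups() if match else None

print(name, exam_date)
print(prescription)
```

The SQL query keeps working for any row because the schema is fixed, whereas the regular expression would miss a note that said "started on" instead of "prescribed", which is the difficulty the background section describes.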

As a supplement to the background information on this patent, NewsRx correspondents also obtained the inventors' summary information for this patent: "In an exemplary embodiment of the present invention, there is provided a method comprising: receiving a data source selection from a user or software application, the data source including medical information of a plurality of patients; receiving, from the user or software application, a data pattern that is related to a concept to be explored in the data source; querying the data source to find information that approximately matches the data pattern; and receiving the information from the data source, wherein the information includes unstructured data, assigning a classification to individual parts of the information based on the part's relationship to the data pattern, and outputting the classified information to the user or software application, and wherein the method is performed using a processor.

"The classified information is arranged in tabular form with a row containing an individual part of the information in one column and the part's classification in another column.

"The row further includes a link to the source containing the individual part of the information.

"The row further includes a numerical score indicating a strength of the classification.

"The method further comprises grouping individual parts of the information in adjacent rows, wherein the grouping is based on similarity of the individual parts to each other.

"An unstructured data search algorithm is used to find the information that approximately matches the data pattern.

"The data source includes electronic medical records, radiological images, or gene sequences.

"The unstructured data includes text, images or biological sequences.

"The classified information includes structured data.

"The concept is a medical question.

"The data pattern includes a keyword, regular expression or a context-free grammar.

"The data pattern includes an image part or an image filter.

"The data pattern includes genetic data.
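As an editorial illustration of the tabular output described in the summary, the sketch below approximately matches a keyword against free-text notes and emits one row per note containing the text, a classification, a numerical score, and a link to the source. It is not the patented method: the `emr://` links, the sample notes, the "match"/"no match" labels, the 0.8 threshold, and the use of Python's `difflib.SequenceMatcher` as the approximate-matching technique are all assumptions chosen for brevity.

```python
from difflib import SequenceMatcher

# Hypothetical data source: (source link, free-text note) pairs.
DOCUMENTS = [
    ("emr://patient/1/note/3", "History of diabetis mellitus, on metformin."),
    ("emr://patient/2/note/1", "No acute distress. Follow up in two weeks."),
]


def best_score(keyword, text):
    """Return the highest similarity between the keyword and any word in text."""
    words = [w.strip(".,;:").lower() for w in text.split()]
    return max(
        (SequenceMatcher(None, keyword.lower(), w).ratio() for w in words),
        default=0.0,
    )


def query(keyword, documents, threshold=0.8):
    """Build rows of (text, classification, score, source link)."""
    rows = []
    for link, text in documents:
        score = best_score(keyword, text)
        label = "match" if score >= threshold else "no match"
        rows.append((text, label, round(score, 2), link))
    return rows


rows = query("diabetes", DOCUMENTS)
for row in rows:
    print(" | ".join(str(cell) for cell in row))
```

Because the comparison is approximate, the misspelled "diabetis" in the first note still clears the threshold, while the second note does not.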

"In an exemplary embodiment of the present invention, there is provided a system comprising: a memory device for storing a program; a processor in communication with the memory device, the processor operative with the program to: receive a data source selection, the data source including medical information of a plurality of patients; receive a data pattern that is related to a concept to be explored in the data source; query the data source to find information that approximately matches the data pattern; and receive the information from the data source, wherein the information includes unstructured data, assign a classification to individual parts of the information based on the part's relationship to the data pattern, and output the classified information.

"The classification indicates when the individual part of the information is positive, negative or not applicable to the data pattern.

"The processor is further operative with the program to display the output on a graphical user interface (GUI).

"The classified information is browsable, editable, or processible via the GUI.
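As an editorial illustration of a three-way classification (positive, negative, or not applicable), the sketch below labels sentences against a regular-expression data pattern using a crude negation check. The patent text quoted here does not say how the classification is computed, so the cue list, the `classify` function, and the sample sentences are all hypothetical.

```python
import re

# Illustrative negation cues only; a real system would need far more than this.
NEGATION_CUES = ("no ", "denies ", "without ", "negative for ")


def classify(sentence, pattern):
    """Label a sentence as positive, negative, or not applicable to the pattern."""
    match = re.search(pattern, sentence, flags=re.IGNORECASE)
    if match is None:
        return "not applicable"
    # Look for a negation cue in the text leading up to the match.
    prefix = sentence[: match.start()].lower()
    if any(cue in prefix for cue in NEGATION_CUES):
        return "negative"
    return "positive"


sentences = [
    "Patient reports chest pain on exertion.",
    "Patient denies chest pain.",
    "Blood pressure 120/80.",
]
labels = [classify(s, r"chest\s+pain") for s in sentences]
for sentence, label in zip(sentences, labels):
    print(f"{label:15} {sentence}")
```

A sentence that never mentions the pattern is treated as not applicable rather than negative, which keeps "the finding is absent" separate from "the finding was not discussed".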

"In an exemplary embodiment of the present invention, there is provided a computer program product comprising: a non-transitory computer readable storage medium having computer readable program code embodied therewith, the computer readable program code comprising: computer readable program code configured to perform the steps of: querying a data source to find data that exactly or approximately matches a data pattern, wherein the data source includes medical information of a plurality of patients; and receiving the data from the data source, wherein the data includes unstructured data, assigning a classification or score to individual parts of the data based on the part's relationship to the data pattern, and outputting the classified/scored data.

"The data source or the data pattern is pre-determined.

"The data pattern is determined from a concept, the concept being related to a medical question.

BRIEF DESCRIPTION OF THE DRAWINGS

"FIG. 1 is an exemplary electronic medical record (EMR) including structured and unstructured data;

"FIG. 2 is a flowchart illustrating an exemplary embodiment of the present invention;

"FIG. 3 is a table illustrating an exemplary output of the present invention;

"FIG. 4 is a computer system in which an exemplary embodiment of the present invention may be implemented; and

"FIGS. 5A and 5B are part of the same screen-shot that illustrates an exemplary output of the present invention."

For additional information on this patent, see: Chung, Stanley; Farooq, Faisal; Fung, Glenn; Krishnapuram, Balaji; Rao, R. Bharat; Rosales, Romer E.; Weis, John; Yu, Shipeng. Automated Patient/Document Identification and Categorization for Medical Data. U.S. Patent Number 8751495, filed September 28, 2010, and published online on June 10, 2014. Patent URL: http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO1&Sect2=HITOFF&d=PALL&p=1&u=%2Fnetahtml%2FPTO%2Fsrchnum.htm&r=1&f=G&l=50&s1=8751495.PN.&OS=PN/8751495RS=PN/8751495

Keywords for this news article include: Software, Legal Issues, Records as Topic, Electronic Medical Records, Siemens Medical Solutions USA Inc.

Our reports deliver fact-based news of research and discoveries from around the world. Copyright 2014, NewsRx LLC





Source: Health & Medicine Week

