Patent Issued for Graph Search System and Method for Querying Loosely Integrated Data
International Business Machines Corporation
NewsRx.com
By a News Reporter-Staff News Editor at Information Technology Newsweekly -- From Alexandria, Virginia, VerticalNews journalists report that a patent by the inventors Balmin, Andrey (San Jose, CA); Hwang, Heasoo (La Jolla, CA); Pirahesh, Mir Hamid (San Jose, CA); Reinwald, Berthold (San Jose, CA), filed on March 22, 2008, was cleared and issued on December 4, 2012.
The patent's assignee for patent number 8326847 is International Business Machines Corporation (Armonk, NY).
News editors obtained the following quote from the background information supplied by the inventors: "The exponential growth in the amount and accessibility of data has raised many challenges in the field of information search and retrieval. These challenges are compounded by the heterogeneous nature of real world data, which may exist in a structured, semi-structured or unstructured state. The goal of much research has been the automatic or semiautomatic discovery of common entities and relationships across such disparate kinds of data. This may be done, for example, by crawling thousands of data sources, for example, on networks such as the internet. Another factor in the complexity of information search and retrieval is the multitude of ways of situational integration of data. One way to deal with these challenges is by using extensible data structures and creative ways for data retrieval across disparate data sources. In the case of the internet, one example is by crawling thousands of data sources and using search engines to index the crawled web documents.
"One approach to information retrieval is to model data as graphs of objects connected by relationships. However, it is not easy to formulate precise, yet flexible queries that will find different meaningful connections between objects in such graphs. Standard database query languages, such as XQuery, are too rigid, and require full knowledge of the database schema from the user. Conventional search systems have very limited functionality and typically only find objects that contain all the keywords in a search.
"An example of a query which illustrates the difficulties in dealing with relationships across disparate data is as follows. Consider a product manager looking for employees in a certain department who somehow (directly or indirectly) contributed to a shipped product. One approach may be to take the product plan data coming from a content repository and dynamically combine it with the company employee data to find employees. The product manager expects to find employees who, for example, owned components of the product, developed components, or consulted employees on the development of components.
"For the above-described search, the product manager is looking for data retrieval with 'high recall' rather than 'high precision', which is usually the case with users of search engines. Since large amounts of data may be related to the query, it is important to be able to perform the search quickly and efficiently and to be able to summarize the results, for example, by identifying the highest ranking objects and relationships individually, and aggregating the less important ones.
"Another challenge is in finding efficient and user-friendly ways to represent the results of the search, where the results may be voluminous and complex.
"Accordingly, there is a need for improved systems that can search across large volumes of heterogeneous, real world data. There is also a need for ways to formulate precise, yet flexible queries that will find meaningful connections between data objects. There is also a need for such techniques which are fast and do not require full knowledge of the database schema."
As a supplement to the background information on this patent, VerticalNews correspondents also obtained the inventors' summary information for this patent: "To overcome the limitations in the prior art briefly described above, the present invention provides a method, computer program product, and system for supporting flexible querying of graph datasets.
"In one embodiment of the present invention, a method of executing a query on linked data sources comprises: generating an instance graph expressing relationships between objects in the linked data sources; receiving a query including at least first and second search terms; executing the first search term on the instance graph; generating a summary graph using the results of the execution; and executing the second search term on the summary graph.
"In another embodiment of the present invention, a method of finding relationships between objects in a database comprises: generating an instance graph expressing relationships between objects in the linked data sources; receiving a query including at least first and second search terms; executing a first search term in a query by using the first term as a filter to derive a subset of the database; performing a relationship search that ranks each object in the instance graph with respect to the subset; generating a summary graph using the results of the execution; and executing the second search term on the summary graph.
"In another embodiment of the present invention, a system comprises: a plurality of databases; a query processor coupled to a databases, the query processor having a filter module which receives a query including a relationship search term; and a relationship search engine coupled to the query processor and receiving an instance graph from one of the databases, the relationship search engine processes the relationship search term on the instance graph to determine a ranking of objects in the instance graph that indicates how related the objects are to the relationship search term.
"In a further embodiment of the present invention, a computer program product comprising a computer usable medium having a computer readable program, wherein the computer readable program when executed on a computer causes the computer to: generate an instance graph expressing relationships between objects in the linked data sources; receive a query including at least first and second search terms; execute the first search term on the instance graph; generate a summary graph using the results of the execution; and execute the second search term on the summary graph.
"Various advantages and features of novelty, which characterize the present invention, are pointed out with particularity in the claims annexed hereto and form a part hereof. However, for a better understanding of the invention and its advantages, reference should be made to the accompanying descriptive matter together with the corresponding drawings which form a further part hereof, in which there are described and illustrated specific examples in accordance with the present invention."
For additional information on this patent, see: Balmin, Andrey; Hwang, Heasoo; Pirahesh, Mir Hamid; Reinwald, Berthold. Graph Search System and Method for Querying Loosely Integrated Data. U.S. Patent Number 8326847, filed March 22, 2008, and issued December 4, 2012. Patent URL: http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&p=21&u=%2Fnetahtml%2FPTO%2Fsearch-bool.html&r=1041&f=G&l=50&co1=AND&d=PTXT&s1=20121204.PD.&OS=ISD/20121204&RS=ISD/20121204
Keywords for this news article include: Information Technology, Information and Data Retrieval, International Business Machines Corporation.
Copyright 2012, NewsRx LLC