Conceptual structures in modern information retrieval

Claudio Carpineto

Summary

Motivated by a desire to go beyond keywords, researchers have long used conceptual structures to try to improve the effectiveness of information retrieval, without producing impressive results. However, things have changed considerably over the last few years. The growth of the web has favoured the emergence of new search applications, usage patterns, data formats, and interaction paradigms. Traditional information retrieval assumptions and techniques have thus been deeply questioned; for instance, it is inherently more difficult to retrieve the information of interest when user queries are very short and the collections being searched are highly heterogeneous, as is the case in web retrieval. Furthermore, a number of more challenging information-finding tasks have emerged that seem to require a better understanding of the meaning of queries and documents, and at least some ability to interpret and manipulate text data. These include, among others, question answering, information retrieval with structured queries, homepage finding, information retrieval from mobile devices, recommender systems, and mining of specialised collections. As a result, much current research in information retrieval focuses on exploiting a richer query or document context, from which to extract concepts or knowledge that may improve the system's retrieval effectiveness. Retrieval feedback, ontologies, XML, and web links are popular examples of contextual sources used for enhanced information retrieval. In this talk, I will consider the use of various forms of conceptual structures in several modern information retrieval tasks and discuss why they represent both a need and an opportunity for accomplishing such tasks.
Then I will present some research efforts that are under way at Fondazione Ugo Bordoni on the integration of statistical and conceptual text processing techniques for more effective information retrieval, including the use of concept data analysis for document ranking and mining.
