Novelty as a Form of Contextual Re-ranking: Efficient KLD Models and Mixture Models

Current Information Retrieval systems are often based on topicality: they estimate relevance by measuring the similarity between the user query and each document. Such systems ignore important contextual information and, in particular, rarely apply mechanisms to filter out redundant information. We interpret context here as the set of chunks of text, from the ranked set of documents, that the user has already seen. This is valuable contextual information for guiding the retrieval process in a way that avoids redundancy. It is desirable that the ranking of results be composed of material that is relevant but also novel, meaning that each document must provide the user with unseen information related to their need. In this work we study different novelty detection approaches that make good use of this contextual information. We show that these techniques can be applied effectively and efficiently at the sentence level.

keywords: