MIT CSAIL Research Abstracts

CSAIL Publications and Digital Archive header

Technical Reports

Work Products

Research Abstracts

Historical Collections

horizontal line

Research Abstracts - 2006
horizontal line

horizontal line

Ranking Learning

Giorgos Zacharia & Tomaso Poggio

The Problem

In this work we investigate regularized ranking algorithms for user preference modeling and information retrieval problems. We focus on learning ranking functions from choice based comparisons of the form “the user prefers choice i=j than all other choices i Î {1,n}". In other supervised information retrieval problems, the training data includes explicit or implicit relevance rankings, either by experts, or by empirical observations. In the choice based conjoint problem we only know which the winning combination is.

Motivation

In some application domains, like user preference modeling, and several information retrieval tasks, like content-word prediction, or fuzzy text matching, we want to learn how to predict the best candidate, given a possible set of competing candidate combinations of features. The competing candidates are usually filtered using unsupervised, or some other weaker ranking algorithms. In many of these ranking tasks, the available training data do not provide metric based comparisons between the candidates (like user ratings of how they would rank each possible product configuration), so we cannot approach the task as a supervised regression problem. Therefore, we need to create algorithms that learn from such choice based comparisons, and are robust to the noise introduced by user inconsistencies, heterogeneous user preference functions, and non-linear preferences.

Previous Work

Our work is related to preference modeling research in the market research (Carroll and Green 1995; Green and Srinivasan 1978; Green and Srinivasan 1990), and the machine learning community (Evgeniou et al 2005), and to supervised information retrieval research (Collins 2002; Herbrich et al 1999; Joachims 2002; Schultz and Joachims 2003). Several statistical approaches to information retrieval, and other ranking problems like preference modeling, assume that there is explicit metric information available (Cui and Curry 2004), or other side information like transitive rankings (Herbrich et al (1999), or frequency of clicks (Joachims 2002). In choice based data, we only know which combination is the highest ranking, among the available options.

Approach

We transform the problem to a binary classification task of pairwise comparisons, and use regularized loss functions to account for noise in the users’ choices. We incorporate prior knowledge, such as positivity constraints of the underlying features, and the symmetric nature of the comparisons (if "A" better than "B" and "B" is worse than "A"). We also incorporate information from seemingly unrelated tasks, in a semi-supervised learning approach, where the seemingly unrelated tasks provide additional training examples.

Progress

We compared our method with standard algorithms used by the marketing community for utility function estimation: logistic regression, polyhedral estimation, and hierarchical bayes (HB). We evaluate on standard, widely used, simulation data. The experiments show that SVM adapted to ranking problems handle noise and nonlinearities significantly better than the baseline approaches, and significantly faster than HB, the best performing baseline approach.

Future

We will evaluate our approach in domains beyond the user preference modeling problems. Specifically, we will evaluate the performance of our approaches in information retrieval tasks, such as content-word learning, and fuzzy text matching. For the supervised content-word learning task, we will use a large corpus of news articles, and their human generated summaries, to learn which words from the main article are likely to appear in the summary. Such algorithms can help develop corpus specific automatic summarizers. For the fuzzy text matching task we will learn how to select the right company record from a corpus of 15 million US business records, when the user query might misspell, abbreviate, or have incomplete name or address data. Such algorithms are useful for data cleansing and database integration tasks.

References:

[1] Douglas Carroll and Paul Green. “Psychometric Methods in Marketing Research: Part I, Conjoint Analysis”. Journal of Marketing Research, 32, November 1995, p. 385-391, 1995

[2] Paul Green and V. Srinivasan. “Conjoint Analysis in Consumer Research: Issues and Outlook” Consumer Research, 5, 2, p. 103-123, 1978

[3] Paul Green and V. Srinivasan, “Conjoint Analysis in Marketing: New Developments with Implications for Research and Practice” Journal of Marketing, 54, 4, p. 3-19. 1990

[4] Michael Collins. Ranking Algorithms for Named-Entity Extraction: Boosting and the Voted Perceptron. ACL, 2002

[5] Ralf Herbrich, Thore Graepel, and Klaus Obermayer, Large Margin Rank Boundaries for Ordinal Regression. Advances in Large Margin Classifiers, Smola et al. Eds., The MIT Press, 29-53, 1999

[6] Thorsten Joachims. Optimizing Search Engines Using Clickthrough Data, Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD), ACM, 2002.

[7] MAtthew Schultz, and Thorsten Joachims, Learning a Distance Metric from Relative Comparisons, Proceedings of the Conference on Advance in Neural Information Processing Systems (NIPS), 2003.

Computer Science and Artificial Intelligence Laboratory (CSAIL)
The Stata Center, Building 32 - 32 Vassar Street - Cambridge, MA 02139 - USA
tel:+1-617-253-0073 - publications@csail.mit.edu