Research Abstracts - 2006
Learning Seemingly Unrelated Tasks with Regularized Manifolds

Giorgos Zacharia & Tomaso Poggio

The Problem

In this work we investigate how to incorporate data available from seemingly unrelated tasks to improve task-specific learning models.

Motivation

Recent work in multitask learning (Evgeniou et al., 2005; Chapelle and Harchaoui, 2005; Girosi, 2003) has shown that regularized learning algorithms can use data from related tasks effectively, with nonlinear loss functions that penalize errors on the aggregate data less heavily than errors on the individual-specific data. The same approach applies to different regularized algorithms, including SVMs, Regularized Least Squares Classification (RLSC) (Rifkin, 2002), and Regularized Logistic Regression (Minka, 2001). In this work we extend these approaches with the graph Laplacian transformation, showing how to pose the same problem as a special case of semi-supervised learning with regularized manifolds (Belkin and Niyogi, 2004).

Previous Work

In previous work (Evgeniou, Boussios, and Zacharia, 2005) we introduced a combined-classifier approach that exploits information from the aggregate data set. The weighted aggregate information (with the weight estimated through cross-validation) improved the individual-specific models. That work focused on applications of user preference modeling and was evaluated on a widely available synthetic dataset generator (Toubia et al., 2004).

Approach

Regularized Laplacian algorithms (Belkin and Niyogi, 2004) have been used successfully in other semi-supervised learning settings. The loss function of Laplacian RLSC penalizes the weighted deviation of the estimated function for instances that fall close to each other in the geodesic space of a manifold (i.e., pairs connected with high weight in the manifold graph). The manifold is estimated on both the labeled and the unlabeled data. In our problem setting we already have an estimate of the labels for the additional instances, namely the choices another individual made in the particular instance. We therefore modify the Laplacian RLSC formulation of Belkin and Niyogi to use the actual labels for the additional instances, and we again penalize the seemingly unrelated information with a penalty term estimated by cross-validation on the training set.
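The idea above can be sketched in a few lines. The following is a minimal NumPy illustration, not the paper's implementation: it uses an RBF kernel, a heat-kernel graph Laplacian, and a diagonal weight matrix J that down-weights the loss on the seemingly unrelated (but labeled) instances by a factor μ; the closed-form solution mirrors the standard Laplacian RLSC normal equations, and all parameter values are illustrative.

```python
import numpy as np

def laplacian_rlsc(X, y, loss_weights, gamma_A=1e-2, gamma_I=1e-2, sigma=1.0):
    """Sketch of Laplacian RLSC where every instance is labeled, but
    seemingly unrelated instances carry a down-weighted loss.
    All names and default values here are illustrative assumptions."""
    n = X.shape[0]
    # RBF kernel matrix on all instances
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-D2 / (2.0 * sigma ** 2))
    # graph Laplacian L = D - W from a heat-kernel adjacency
    W = K.copy()
    np.fill_diagonal(W, 0.0)
    L = np.diag(W.sum(axis=1)) - W
    # per-instance loss weights: 1 for the individual's own data, mu otherwise
    J = np.diag(loss_weights)
    # weighted, manifold-regularized least squares in closed form:
    # alpha = (J K + gamma_A n I + gamma_I n L K)^{-1} J y
    A = J @ K + gamma_A * n * np.eye(n) + gamma_I * n * (L @ K)
    alpha = np.linalg.solve(A, J @ y)
    return alpha, K

# toy usage: 10 "own" comparisons (weight 1) plus 20 others (weight mu = 0.2)
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 5))
y = np.sign(rng.normal(size=30))
w = np.concatenate([np.ones(10), np.full(20, 0.2)])
alpha, K = laplacian_rlsc(X, y, w)
preds = np.sign(K @ alpha)
```

The only change relative to the standard semi-supervised formulation is in J: instead of zeroing out the loss on the extra instances, it keeps their actual labels at reduced weight μ.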

Progress

We evaluate our approach on the publicly available preference data provided by Sawtooth Software (Sawtooth). The dataset covers 100 individuals, each providing metric ratings for 10 product configurations described by five attributes. We transform the problem into choice-based comparisons by creating the vectors of differences between instances and labeling each comparison +1 or -1 for a winning or losing comparison, respectively. We subsample l=10 comparisons per individual and u={10,20,30,50,100} comparisons from the other 99 individuals. We report the results in Table 1 below.
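The ratings-to-choices transformation can be sketched as follows. This is an illustrative reading of the construction described above (the abstract does not specify how pairs are enumerated or how ties are handled), with hypothetical names:

```python
import numpy as np

def ratings_to_choices(X, r):
    """Turn metric product ratings into choice-based comparisons.
    X: (n, d) product attribute vectors; r: (n,) metric ratings.
    Each pair (i, j) with r[i] != r[j] yields the difference vector
    X[i] - X[j], labeled +1 if product i was rated higher, else -1.
    Pair enumeration and tie handling are illustrative assumptions."""
    diffs, labels = [], []
    n = len(r)
    for i in range(n):
        for j in range(i + 1, n):
            if r[i] == r[j]:
                continue  # a tie carries no preference information
            diffs.append(X[i] - X[j])
            labels.append(1.0 if r[i] > r[j] else -1.0)
    return np.array(diffs), np.array(labels)

# toy usage: 10 products with five attributes, as in the Sawtooth data
rng = np.random.default_rng(0)
X = rng.normal(size=(10, 5))
r = rng.integers(1, 8, size=10).astype(float)
D, y = ratings_to_choices(X, r)
```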

Table 1: Error rates of Laplacian RLSC experiments with 10 instances of individual-specific data, u instances of seemingly unrelated data, and weight μ on the loss contributed by the seemingly unrelated data.

            u=10     u=20     u=30     u=50     u=100
μ=0         17.50%   18.50%   18.38%   18.20%   17.54%
μ=0.000001  17.34%   19.46%   17.52%   18.11%   20.10%
μ=0.00001   18.30%   18.20%   17.54%   18.46%   18.10%
μ=0.0001    18.56%   18.76%   18.02%   17.73%   17.90%
μ=0.001     17.20%   18.12%   18.28%   17.87%   18.00%
μ=0.01      16.92%   17.52%   17.98%   17.70%   18.15%
μ=0.1       16.86%   16.68%   16.04%   15.58%   16.30%
μ=0.2       14.80%   14.68%   14.86%   14.89%   14.30%
μ=0.3       16.22%   16.76%   16.74%   16.57%   18.60%
μ=0.4       15.94%   16.54%   17.94%   17.93%   20.75%
μ=0.5       17.90%   16.64%   18.74%   19.48%   20.60%
μ=0.6       17.74%   20.20%   20.60%   22.38%   25.35%

As the results show, the optimal weight (μ=0.2) does not appear to depend on the amount of additional data, and adding more data from other users does not affect performance significantly.
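As noted in the Approach section, μ is estimated by cross-validation on the training set. A generic k-fold sketch of that selection step is below; the evaluator callback, fold count, and grid are illustrative assumptions, since the abstract does not specify them:

```python
import numpy as np

def select_mu(train_and_eval, mus, n_own, folds=5, seed=0):
    """Pick the unrelated-data weight mu by k-fold cross-validation over
    the individual's own n_own comparisons (names are hypothetical).
    train_and_eval(mu, train_idx, test_idx) must return an error rate."""
    rng = np.random.default_rng(seed)
    splits = np.array_split(rng.permutation(n_own), folds)
    mean_errors = []
    for mu in mus:
        fold_errs = []
        for f in range(folds):
            test_idx = splits[f]
            train_idx = np.concatenate(
                [splits[g] for g in range(folds) if g != f])
            fold_errs.append(train_and_eval(mu, train_idx, test_idx))
        mean_errors.append(np.mean(fold_errs))
    return mus[int(np.argmin(mean_errors))]

# toy usage with a dummy evaluator whose error is minimized at mu = 0.2
best = select_mu(lambda mu, tr, te: abs(mu - 0.2),
                 [0.0, 0.1, 0.2, 0.3], n_own=20)
```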

Future

We will further investigate the Laplacian approach with similar algorithms such as Laplacian SVMs, and work toward a mathematical understanding of the interaction between the intrinsic regularization penalty and the penalty term we introduced for the seemingly unrelated data. We will also apply the same approach to other applications with seemingly unrelated data, such as the Inner London Education Authority examination data.

References:

[1] Theodoros Evgeniou, Charles Micchelli, and Massimiliano Pontil. Learning multiple tasks with kernel methods. J. Machine Learning Research, 6: pp. 615--637. 2005.

[2] Olivier Chapelle and Zaid Harchaoui. A Machine Learning Approach to Conjoint Analysis. Advances in Neural Information Processing Systems 17, pp. 257-264, MIT Press, Cambridge, MA, USA, 2005.

[3] Federico Girosi. Demographic Forecasting. PhD Thesis, Harvard University, Cambridge, MA, USA, 2003.

[4] Ryan Rifkin. Everything Old Is New Again: A Fresh Look at Historical Approaches in Machine Learning. PhD Thesis, MIT, Cambridge, MA, USA, 2002.

[5] Tom Minka. Algorithms for maximum-likelihood logistic regression. Technical Report 758, Department of Statistics, Carnegie Mellon University. 2001

[6] Mikhail Belkin and Partha Niyogi. Semi-supervised Learning on Riemannian Manifolds. Machine Learning, 56 (Special Issue on Clustering), pp. 209-239, 2004.

[7] Theodoros Evgeniou, Constantinos Boussios, and Giorgos Zacharia. Generalized Robust Conjoint Estimation. Marketing Science, Vol. 24, No. 3, pp. 415-429, 2005.

[8] Olivier Toubia, John Hauser and Duncan Simester, "Polyhedral Methods for Adaptive Choice-Based Conjoint Analysis," Journal of Marketing Research, Vol. XLI, 116-131, 2004

[9] Sawtooth Software, Inc. HB-Reg: Hierarchical Bayes Regression. URL: http://www.sawtoothsoftware.com/hbreg.shtml

Computer Science and Artificial Intelligence Laboratory (CSAIL)
The Stata Center, Building 32 - 32 Vassar Street - Cambridge, MA 02139 - USA
tel:+1-617-253-0073 - publications@csail.mit.edu