Exercises for the course Machine Learning 2
Summer semester 2021
Abteilung Maschinelles Lernen
Institut für Softwaretechnik und theoretische Informatik
Fakultät IV, Technische Universität Berlin
Prof. Dr. Klaus-Robert Müller
Email: [email protected]

Exercise Sheet 2
Recall: For a sample of $d_1$- and $d_2$-dimensional data of size $N$, given as two data matrices $X \in \mathbb{R}^{d_1 \times N}$ and $Y \in \mathbb{R}^{d_2 \times N}$ (assumed to be centered), canonical correlation analysis (CCA) finds a one-dimensional projection maximizing the cross-correlation for constant auto-correlation. The primal optimization problem is:

$$\max_{w_x \in \mathbb{R}^{d_1},\, w_y \in \mathbb{R}^{d_2}} \; w_x^\top C_{xy} w_y \quad \text{subject to} \quad w_x^\top C_{xx} w_x = 1, \quad w_y^\top C_{yy} w_y = 1, \tag{1}$$

where $C_{xx} = \frac{1}{N} X X^\top \in \mathbb{R}^{d_1 \times d_1}$ and $C_{yy} = \frac{1}{N} Y Y^\top \in \mathbb{R}^{d_2 \times d_2}$ are the auto-covariance matrices of $X$ and $Y$ respectively, and $C_{xy} = \frac{1}{N} X Y^\top \in \mathbb{R}^{d_1 \times d_2}$ is the cross-covariance matrix of $X$ and $Y$.
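These quantities are straightforward to compute numerically. The following is a minimal sketch (not part of the exercise), assuming NumPy and randomly generated toy data in which the second modality is a noisy linear function of the first:

import numpy as np

rng = np.random.default_rng(0)
d1, d2, N = 3, 2, 100

# Toy data: Y is a noisy linear image of X so that a cross-correlation exists
X = rng.standard_normal((d1, N))
Y = rng.standard_normal((d2, d1)) @ X + 0.5 * rng.standard_normal((d2, N))

# Center both modalities along the sample axis, as assumed above
X = X - X.mean(axis=1, keepdims=True)
Y = Y - Y.mean(axis=1, keepdims=True)

Cxx = X @ X.T / N   # auto-covariance of X,  shape (d1, d1)
Cyy = Y @ Y.T / N   # auto-covariance of Y,  shape (d2, d2)
Cxy = X @ Y.T / N   # cross-covariance,      shape (d1, d2)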
Exercise 1: Primal CCA (10 + 5 P)
We have seen in the lecture that a solution of canonical correlation analysis can be found among the eigenvectors of the generalized eigenvalue problem:

$$\begin{pmatrix} 0 & C_{xy} \\ C_{yx} & 0 \end{pmatrix} \begin{pmatrix} w_x \\ w_y \end{pmatrix} = \lambda \begin{pmatrix} C_{xx} & 0 \\ 0 & C_{yy} \end{pmatrix} \begin{pmatrix} w_x \\ w_y \end{pmatrix}$$
(a) Show that among all eigenvectors $(w_x, w_y)$, the solution is the one associated with the highest eigenvalue.
(b) Show that if $(w_x, w_y)$ is a solution, then $(-w_x, -w_y)$ is also a solution of the CCA problem.
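Before proving (a) and (b), both claims can be checked numerically. Below is a hedged sketch on the toy data from above, using scipy.linalg.eigh for the symmetric-definite generalized eigenvalue problem; the small ridge added to the right-hand side is our own assumption to keep it positive definite:

import scipy.linalg

# Left-hand block matrix [[0, Cxy], [Cyx, 0]]; symmetric since Cyx = Cxy^T
lhs = np.block([[np.zeros((d1, d1)), Cxy],
                [Cxy.T, np.zeros((d2, d2))]])
# Right-hand block diagonal [[Cxx, 0], [0, Cyy]] with a tiny ridge
rhs = scipy.linalg.block_diag(Cxx, Cyy) + 1e-9 * np.eye(d1 + d2)

eigvals, eigvecs = scipy.linalg.eigh(lhs, rhs)  # eigenvalues in ascending order
w = eigvecs[:, -1]                              # eigenvector of the largest eigenvalue
wx, wy = w[:d1], w[d1:]

# Rescale so the constraints wx^T Cxx wx = 1 and wy^T Cyy wy = 1 hold
wx = wx / np.sqrt(wx @ Cxx @ wx)
wy = wy / np.sqrt(wy @ Cyy @ wy)

# Claim (a): the attained objective equals the largest eigenvalue
print(wx @ Cxy @ wy, eigvals[-1])
# Claim (b): flipping both signs changes neither objective nor constraints
print((-wx) @ Cxy @ (-wy))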
Exercise 2: Dual CCA (10 + 15 + 5 + 5 P)
In this exercise, we would like to derive the dual optimization problem.
(a) Show that it is always possible to find an optimal solution in the span of the data, that is,

$$w_x = X \alpha_x, \qquad w_y = Y \alpha_y$$

with some coefficient vectors $\alpha_x \in \mathbb{R}^N$ and $\alpha_y \in \mathbb{R}^N$.
(b) Show that the solution of the dual optimization problem is found in an eigenvector of the generalized eigenvalue problem

$$\begin{pmatrix} 0 & AB \\ BA & 0 \end{pmatrix} \begin{pmatrix} \alpha_x \\ \alpha_y \end{pmatrix} = \rho \begin{pmatrix} A^2 & 0 \\ 0 & B^2 \end{pmatrix} \begin{pmatrix} \alpha_x \\ \alpha_y \end{pmatrix}$$

where $A = X^\top X$ and $B = Y^\top Y$. (A numerical sketch of this eigenvalue problem is given after part (d).)
(c) Show that the solution of the dual is given by the eigenvector associated with the highest eigenvalue.
(d) Show how a solution to the original (primal) problem can be obtained from the solution of the generalized eigenvalue problem of the dual.
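The dual eigenvalue problem from part (b) and the primal recovery asked for in part (d) can be illustrated numerically. A hedged sketch, assuming the same toy data and variables as in the previous snippets; the ridge on the right-hand side is again our own assumption, since $A^2$ and $B^2$ are rank-deficient whenever $d_1, d_2 < N$:

A = X.T @ X   # (N, N) Gram matrix of X
B = Y.T @ Y   # (N, N) Gram matrix of Y

# [[0, AB], [BA, 0]] is symmetric because (AB)^T = BA for symmetric A and B
lhs_dual = np.block([[np.zeros((N, N)), A @ B],
                     [B @ A, np.zeros((N, N))]])
rhs_dual = scipy.linalg.block_diag(A @ A, B @ B) + 1e-6 * np.eye(2 * N)

rhos, alphas = scipy.linalg.eigh(lhs_dual, rhs_dual)
alpha = alphas[:, -1]                    # largest eigenvalue, cf. part (c)
alpha_x, alpha_y = alpha[:N], alpha[N:]

# Part (d): expand the dual coefficients back into primal weight vectors,
# using the span property from part (a)
wx_dual = X @ alpha_x
wy_dual = Y @ alpha_y

# Rescale so the constraints of Equation (1) are satisfied
wx_dual = wx_dual / np.sqrt(wx_dual @ Cxx @ wx_dual)
wy_dual = wy_dual / np.sqrt(wy_dual @ Cyy @ wy_dual)

# Up to sign (Exercise 1 (b)) and numerical tolerance, this should match
# the primal solution wx, wy computed in Exercise 1
print(wx_dual, wx)
print(wy_dual, wy)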
Exercise 3: CCA and Least Squares Regression (20 P)
Consider some supervised dataset with the inputs stored in a matrix $X \in \mathbb{R}^{D \times N}$ and the targets stored in a vector $y \in \mathbb{R}^N$. We assume that both our inputs and targets are centered. The least squares regression optimization problem is:

$$\min_{v \in \mathbb{R}^D} \; \| X^\top v - y \|^2$$
We would like to relate least squares regression and CCA, specifically their respective solutions $v$ and $(w_x, w_y)$.
(a) Show that if $X$ and $y$ are the two modalities of CCA (i.e. $X \in \mathbb{R}^{D \times N}$ and $y \in \mathbb{R}^{1 \times N}$), the first part of the CCA solution (i.e. the vector $w_x$) is equivalent to the least squares solution $v$ up to a scaling factor.
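This relationship lends itself to a numerical sanity check (an illustration, not a proof). The sketch below assumes a fresh toy target that is a noisy linear function of the inputs, reuses X from above with $D = d_1$, uses np.linalg.lstsq for the least squares problem, and recomputes CCA as in Exercise 1:

# Centered toy target: a noisy linear function of the inputs
y = X.T @ rng.standard_normal(d1) + 0.1 * rng.standard_normal(N)
y = y - y.mean()

# Least squares solution v of min ||X^T v - y||^2
v, *_ = np.linalg.lstsq(X.T, y, rcond=None)

# CCA with modalities X and y, where y is viewed as a 1 x N data matrix
Cxy_reg = (X @ y / N).reshape(-1, 1)   # cross-covariance, shape (D, 1)
Cyy_reg = np.array([[y @ y / N]])      # 1 x 1 auto-covariance of y

lhs_reg = np.block([[np.zeros((d1, d1)), Cxy_reg],
                    [Cxy_reg.T, np.zeros((1, 1))]])
rhs_reg = scipy.linalg.block_diag(Cxx, Cyy_reg) + 1e-9 * np.eye(d1 + 1)
_, vecs = scipy.linalg.eigh(lhs_reg, rhs_reg)
wx_reg = vecs[:d1, -1]

# Claim of (a): wx_reg and v are collinear, i.e. equal up to sign and scale
print(wx_reg / np.linalg.norm(wx_reg), v / np.linalg.norm(v))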
Exercise 4: Programming (30 P)
Download the programming files on ISIS and follow the instructions.