Mathematics for Data Science: Lecture 10

The document covers key concepts in mathematics for data science, focusing on the Cauchy-Schwarz inequality, orthogonal projections, and the Gram-Schmidt algorithm for generating orthonormal bases. It includes exercises to reinforce understanding of these concepts and discusses the definition of orthogonal matrices and QR decomposition. The lecture is presented by Nikhil Krishnan M from IIT Palakkad.


DS5004: Mathematics for Data Science

Lec 10

Nikhil Krishnan M

nikhil@[Link]
[Link]

Cauchy-Schwarz Inequality
• |⟨𝒙, 𝒚⟩|² ≤ ⟨𝒙, 𝒙⟩⟨𝒚, 𝒚⟩
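The inequality is easy to check numerically. A minimal sketch (assuming NumPy and the standard Euclidean inner product; the random vectors are illustrative, not from the lecture):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(5)
y = rng.standard_normal(5)

lhs = np.dot(x, y) ** 2            # |<x, y>|^2
rhs = np.dot(x, x) * np.dot(y, y)  # <x, x><y, y>
assert lhs <= rhs                  # Cauchy-Schwarz

# Equality holds when one vector is a scalar multiple of the other:
z = 3.0 * x
assert np.isclose(np.dot(x, z) ** 2, np.dot(x, x) * np.dot(z, z))
```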

Exercise
• Prove that |⟨𝒙, 𝒚⟩|² = ⟨𝒙, 𝒙⟩⟨𝒚, 𝒚⟩ if and only if one of the vectors is a scalar multiple of the other.

Orthogonal Projection Gives the Best Approximation!
• Let 𝑈 be a subspace of 𝑉.

• Orthonormal basis for 𝑈: {𝒖₁, 𝒖₂, …, 𝒖ₙ}

• Let 𝒗 ∈ 𝑉 and 𝒗̃ = ⟨𝒗, 𝒖₁⟩𝒖₁ + ⋯ + ⟨𝒗, 𝒖ₙ⟩𝒖ₙ

• Then for any 𝒖′ ∈ 𝑈, we have: ‖𝒗 − 𝒖′‖ ≥ ‖𝒗 − 𝒗̃‖

Exercise
• Recall our earlier discussion on least-squares solutions. How are orthogonal projections related to least squares?

Gram-Schmidt Algorithm
• Given 𝑆 = {𝒙₁, 𝒙₂, …, 𝒙ₙ} → an orthonormal basis for span(𝑆)

• For simplicity, assume that 𝑆 is linearly independent.

Gram-Schmidt Algorithm: Idea
• (𝑛 − 1) steps

• At step 𝑖, we already have an orthonormal basis 𝑆 = {𝒗₁, …, 𝒗ᵢ} for span({𝒙₁, …, 𝒙ᵢ})

• We will now express 𝒙ᵢ₊₁ = 𝒙̃ᵢ₊₁ + 𝒚ᵢ₊₁

• 𝒚ᵢ₊₁ is orthogonal to 𝒗₁, …, 𝒗ᵢ

• Take 𝒗ᵢ₊₁ = 𝒚ᵢ₊₁ / ‖𝒚ᵢ₊₁‖ and add it to 𝑆

Gram-Schmidt Algorithm
• Let 𝒗₁ = 𝒙₁ / ‖𝒙₁‖, 𝑆 = {𝒗₁}

• Step 𝑖 (𝑖 = 1, 2, …, 𝑛 − 1):

• 𝑆 = {𝒗₁, …, 𝒗ᵢ} is an orthonormal basis for span({𝒙₁, …, 𝒙ᵢ})

• 𝒙̃ᵢ₊₁ = ⟨𝒙ᵢ₊₁, 𝒗₁⟩𝒗₁ + ⋯ + ⟨𝒙ᵢ₊₁, 𝒗ᵢ⟩𝒗ᵢ

• 𝒚ᵢ₊₁ = 𝒙ᵢ₊₁ − 𝒙̃ᵢ₊₁ is orthogonal to 𝒗₁, …, 𝒗ᵢ

• Take 𝒗ᵢ₊₁ = 𝒚ᵢ₊₁ / ‖𝒚ᵢ₊₁‖, 𝑆 = 𝑆 ∪ {𝒗ᵢ₊₁}
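The steps above translate directly into code. A minimal sketch (assuming NumPy and linearly independent columns; `gram_schmidt` is an illustrative name, not from the lecture):

```python
import numpy as np

def gram_schmidt(X):
    """Orthonormalize the columns of X (assumed linearly independent)."""
    V = []
    for x in X.T:
        # x_tilde: projection of x onto the span of the basis built so far
        x_tilde = sum(np.dot(x, v) * v for v in V) if V else np.zeros_like(x)
        y = x - x_tilde                 # orthogonal to every v in V
        V.append(y / np.linalg.norm(y)) # normalize and add to the basis
    return np.column_stack(V)

X = np.array([[1.0, 1.0],
              [0.0, 1.0],
              [1.0, 0.0]])
Q = gram_schmidt(X)
assert np.allclose(Q.T @ Q, np.eye(2))  # columns are orthonormal
```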
Exercise
• How can the algorithm be tweaked for the general case where 𝑆 = {𝒙₁, 𝒙₂, …, 𝒙ₙ} is possibly linearly dependent?

Orthogonal matrices - Definition
• Let 𝐴 ∈ ℝⁿˣⁿ

• Orthogonal matrix: the columns of 𝐴 are orthonormal w.r.t. the Euclidean inner product
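A quick numerical illustration of the definition (assuming NumPy; the rotation matrix is an example of my choosing, not from the lecture): for an orthogonal 𝐴, the columns being orthonormal is equivalent to 𝐴ᵀ𝐴 = 𝐼, and such matrices preserve Euclidean norms.

```python
import numpy as np

# A 2-D rotation matrix is orthogonal: its columns are orthonormal
theta = np.pi / 4
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

assert np.allclose(A.T @ A, np.eye(2))  # columns orthonormal <=> A^T A = I

x = np.array([3.0, 4.0])
assert np.isclose(np.linalg.norm(A @ x), np.linalg.norm(x))  # norm preserved
```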

QR Decomposition

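QR decomposition factors 𝐴 = 𝑄𝑅, where 𝑄 has orthonormal columns (the vectors Gram-Schmidt produces from the columns of 𝐴, up to sign) and 𝑅 is upper triangular. A sketch using NumPy's built-in routine (the example matrix is illustrative, not from the lecture):

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [0.0, 1.0],
              [1.0, 0.0]])

# Reduced QR: Q is 3x2 with orthonormal columns, R is 2x2 upper triangular
Q, R = np.linalg.qr(A)

assert np.allclose(Q @ R, A)            # A = QR
assert np.allclose(Q.T @ Q, np.eye(2))  # columns of Q are orthonormal
assert np.allclose(R, np.triu(R))       # R is upper triangular
```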
Thank You!

