Linear Algebra for ML #5 | Singular Value Decomposition


5.1 Singular Value Decomposition (SVD)

  • Singular value decomposition (SVD) : a factorization of a real or complex matrix that generalizes the eigendecomposition (EVD) of a square normal matrix to any m×n matrix, via an extension of the polar decomposition.

    $A = U \Sigma V^T$
    1. $A \in \mathbb{R}^{m \times n}$ : a given rectangular matrix
    2. $U \in \mathbb{R}^{m \times m}$ : a matrix with orthonormal columns, providing an orthonormal basis of Col(A)
    3. $V \in \mathbb{R}^{n \times n}$ : a matrix with orthonormal columns, providing an orthonormal basis of Row(A)
    4. $\Sigma \in \mathbb{R}^{m \times n}$ : a diagonal matrix whose entries are in decreasing order, i.e., $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_{\min(m,n)}$
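
  • The factorization and the shapes above can be checked numerically. Below is a minimal sketch using NumPy's `np.linalg.svd` on an arbitrary example matrix (the sizes m and n are assumptions for illustration):

```python
import numpy as np

# An arbitrary rectangular matrix A in R^{m x n}
m, n = 5, 3
A = np.random.randn(m, n)

# Full SVD: U is m x m, Vt is n x n, s holds the min(m, n) singular values
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# Embed the singular values into the m x n diagonal matrix Sigma
Sigma = np.zeros((m, n))
Sigma[:len(s), :len(s)] = np.diag(s)

print(U.shape, Sigma.shape, Vt.shape)   # (5, 5) (5, 3) (3, 3)
print(np.allclose(A, U @ Sigma @ Vt))   # True: A = U Sigma V^T
print(np.all(np.diff(s) <= 0))          # True: sigma_1 >= sigma_2 >= ...
```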



5.2 SVD as Sum of Outer Products

  • SVD as Sum of Outer Products : the matrix $A$ can also be represented as a sum of outer products

$A = U \Sigma V^T = \sum_{i=1}^{n} \sigma_i u_i v_i^T$
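
  • A minimal numerical check of this outer-product form, using an arbitrary example matrix, might look like the following (the reduced SVD is used so that exactly n terms appear):

```python
import numpy as np

A = np.random.randn(4, 3)
U, s, Vt = np.linalg.svd(A, full_matrices=False)   # reduced SVD

# Accumulate sigma_i * u_i v_i^T and compare with A
A_sum = np.zeros_like(A)
for i in range(len(s)):
    A_sum += s[i] * np.outer(U[:, i], Vt[i, :])

print(np.allclose(A, A_sum))   # True: A equals the sum of rank-one terms
```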


  • From now on, we just want to show in this chapter (5.2) that the equivalence below is true, since it is used in the next chapter (5.3):

    $AV = U\Sigma \iff A = U\Sigma V^T$

  • Another perspective on SVD : we can easily find two orthonormal basis sets, $\{u_1, \dots, u_n\}$ for Col(A) and $\{v_1, \dots, v_n\}$ for Row(A), by using Gram-Schmidt orthogonalization.

  • Are these orthonormal basis sets unique? No. Can we then jointly find a pair of them such that

$Av_i = \sigma_i u_i$

  • Let us denote $U = [u_1\ u_2\ \cdots\ u_n] \in \mathbb{R}^{m \times n}$, $V = [v_1\ v_2\ \cdots\ v_n] \in \mathbb{R}^{n \times n}$, and $\Sigma = \mathrm{diag}(\sigma_1, \dots, \sigma_n) \in \mathbb{R}^{n \times n}$ (the reduced form of the SVD).

  • $AV = U\Sigma \iff [Av_1\ Av_2\ \cdots\ Av_n] = [\sigma_1 u_1\ \sigma_2 u_2\ \cdots\ \sigma_n u_n]$

  • $V^{-1} = V^T$, since $V \in \mathbb{R}^{n \times n}$ has orthonormal columns.

  • Thus $AV = U\Sigma \iff A = U\Sigma V^T$.
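
  • The equivalence derived above can be verified numerically. The sketch below uses the reduced SVD so that $U \in \mathbb{R}^{m \times n}$, $V \in \mathbb{R}^{n \times n}$, and $\Sigma = \mathrm{diag}(\sigma_1, \dots, \sigma_n)$, matching the notation of this chapter (the example matrix is arbitrary):

```python
import numpy as np

A = np.random.randn(5, 3)                          # m = 5, n = 3
U, s, Vt = np.linalg.svd(A, full_matrices=False)   # reduced SVD: U is 5x3, Vt is 3x3
V = Vt.T
Sigma = np.diag(s)

# Columns of U and V are orthonormal: U^T U = I and V^T V = I (so V^{-1} = V^T)
print(np.allclose(U.T @ U, np.eye(3)), np.allclose(V.T @ V, np.eye(3)))

# AV = U Sigma, i.e. A v_i = sigma_i u_i for every column i
print(np.allclose(A @ V, U @ Sigma))               # True
```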



5.3 Geometrical Interpretation of SVD

  • Geometrical interpretation : for every linear transformation (rectangular matrix) $A$, one can find an $n$-dimensional orthonormal basis $V$ of the row space and the correspondingly transformed orthonormal basis $U$ such that

$A = U\Sigma V^T \iff AV = U\Sigma$

  • Furthermore, the figure below shows an orthogonal vector set $\{x, y\}$ and its linearly transformed image $\{Ax, Ay\}$.

  • And you can see two geometrical features:

    1. There can be one or more orthogonal sets $\{x, y\}$ whose transformed images $\{Ax, Ay\}$ are also orthogonal.
    2. After the transformation by matrix $A$, the lengths of $x, y$ are scaled by scaling factors. These factors are called singular values (analogous to eigenvalues in EVD) and are denoted $\sigma_1, \sigma_2, \sigma_3, \dots$, starting with the largest value.

Figure. The orthogonal vector set $\{x, y\}$ and its linearly transformed image $\{Ax, Ay\}$. [reference]
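
  • To make the geometric picture concrete, the sketch below (with an arbitrary 2×2 matrix) applies $A$ to the orthonormal right singular vectors and confirms that their images stay orthogonal, with lengths $\sigma_1 \ge \sigma_2$:

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
U, s, Vt = np.linalg.svd(A)
v1, v2 = Vt[0], Vt[1]              # orthonormal right singular vectors

Av1, Av2 = A @ v1, A @ v2
print(np.isclose(Av1 @ Av2, 0.0))  # the transformed vectors stay orthogonal
print(np.allclose([np.linalg.norm(Av1), np.linalg.norm(Av2)], s))  # lengths are sigma_1, sigma_2
```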


5.4 Computing SVD

  • First, we form $AA^T$ and $A^TA$, and then write each in terms of the SVD of $A$ as follows:

$AA^T = U\Sigma V^T V\Sigma^T U^T = U\Sigma\Sigma^T U^T = U\Sigma^2 U^T$

$A^TA = V\Sigma^T U^T U\Sigma V^T = V\Sigma^T\Sigma V^T = V\Sigma^2 V^T$

  • From the above, you can see that the resulting equations have the same form as an EVD:
    1. $U$ is an orthogonal matrix obtained from the eigendecomposition of $AA^T$; its columns are called the left singular vectors.
    2. $\Sigma$ is a diagonal matrix whose diagonal entries are the square roots of the eigenvalues of $AA^T$ (equivalently, of $A^TA$).
    3. $V$ is an orthogonal matrix obtained from the eigendecomposition of $A^TA$; its columns are called the right singular vectors.
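
  • Following this recipe, a hand-rolled SVD can be sketched by eigendecomposing $A^TA$ for $V$ and the singular values, and then recovering $U$ from $Av_i = \sigma_i u_i$. This is for illustration only; it assumes $A$ has full column rank and is numerically less stable than `np.linalg.svd`:

```python
import numpy as np

A = np.random.randn(5, 3)                  # assume full column rank (true almost surely here)

# Eigendecomposition of A^T A gives V and the squared singular values
eigvals, V = np.linalg.eigh(A.T @ A)       # eigh returns eigenvalues in ascending order
order = np.argsort(eigvals)[::-1]          # reorder to decreasing
eigvals, V = eigvals[order], V[:, order]
s = np.sqrt(np.clip(eigvals, 0.0, None))   # sigma_i = sqrt(lambda_i)

# Recover U column by column from A v_i = sigma_i u_i
U = (A @ V) / s

print(np.allclose(A, U @ np.diag(s) @ V.T))                 # A = U Sigma V^T
print(np.allclose(s, np.linalg.svd(A, compute_uv=False)))   # matches NumPy's singular values
```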


5.5 Application of SVD

  • Recall SVD as a sum of outer products : the matrix $A$ can also be represented as the sum of outer products

$A = U\Sigma V^T = \sum_{i=1}^{n}\sigma_i u_i v_i^T = \sigma_1 u_1 v_1^T + \sigma_2 u_2 v_2^T + \sigma_3 u_3 v_3^T + \cdots$

  • Here, each $u_i v_i^T$ is an $m \times n$ matrix, so via SVD we can view the matrix $A$ as a sum of multiple rank-one layers.

  • That is, we can approximately reconstruct $A$ using only a few of the largest singular values rather than all of them.

  • And this can be used for dimensionality-reduction tasks such as image compression.
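
  • As a minimal sketch of the idea, the snippet below truncates the SVD of a random matrix standing in for a grayscale image (the rank values p are assumptions for illustration; image loading is omitted):

```python
import numpy as np

def reconstruct_rank_p(A, p):
    """Reconstruct A keeping only the p largest singular values."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :p] @ np.diag(s[:p]) @ Vt[:p, :]

# Stand-in for a grayscale image: any m x n matrix works here
A = np.random.randn(100, 80)
for p in (5, 20, 50):
    A_p = reconstruct_rank_p(A, p)
    err = np.linalg.norm(A - A_p) / np.linalg.norm(A)
    print(f"rank {p}: relative reconstruction error = {err:.3f}")
```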

Figure. Partial reconstruction of a matrix using SVD [reference]

Figure. Image compression using SVD (original, p=20, p=50, p=100) [reference]