Linear Algebra with Python and NumPy (II)

This post is a continuation of the previous post on using Python and the NumPy package for linear algebra.

We will briefly cover topics such as:

  • Solving systems of linear equations
  • Eigenvalues, eigenvectors and matrix spectral decomposition
  • Singular value decomposition (SVD)
  • Solving overdetermined linear systems using method of least-squares
  • Moore-Penrose pseudoinverse matrix
In [1]:
from numpy import *   # Import NumPy package which enables all the fun with algebra

1 Solving Linear Systems with a Regular Matrix

Assume we have a system of linear algebraic equations given by

$$ \mathbf{A} \mathbf{x} = \mathbf{b}, $$

where $\mathbf{A} \in \mathbb{C}^{n\times n}$ and $\mathbf{b} \in \mathbb{C}^{n}$. To find the solution $\mathbf{x}$, we can use the method numpy.linalg.solve. As we know from algebra classes, a unique solution exists if and only if $\mathbf{A}$ is a full-rank square matrix (also called a regular matrix), which is exactly what the mentioned method requires.

Example: Solve a system of linear equations given by

$$ \mathbf{A} = \left(\begin{array}{rrr} 0 & 2 & 1 \\ 3 & -1 & 2 \\ 1 & -1 & 1 \end{array}\right), \quad \mathbf{b} = \left(\begin{array}{r} 2 \\ -3 \\ -3 \end{array}\right), \quad \mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}.$$

In [2]:
A = matrix('0 2 1; 3 -1 2; 1 -1 1')
b = matrix('2; -3; -3')
print(A)
print('rank(A) =', linalg.matrix_rank(A))   # Verify that A is a full-rank matrix (rank(A) = n)
[[ 0  2  1]
 [ 3 -1  2]
 [ 1 -1  1]]
rank(A) = 3
In [3]:
x = linalg.solve(A,b)   # Get solution for the linear system A*x=b
print(x)
[[ 1.]
 [ 2.]
 [-2.]]
In [4]:
print(b-A*x)    # Verify that x is the valid solution for A*x=b
[[0.]
 [0.]
 [0.]]
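
If $\mathbf{A}$ is singular (rank-deficient), numpy.linalg.solve refuses to solve the system and raises a LinAlgError instead. A minimal sketch with an arbitrary singular matrix (not part of the example above; it relies on the `from numpy import *` from the first cell):

A_sing = matrix('1 2; 2 4')                   # rank 1, i.e. a singular matrix
try:
    print(linalg.solve(A_sing, matrix('1; 2')))
except linalg.LinAlgError as err:
    print('solve() failed:', err)             # NumPy reports a singular matrix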

2 Eigenvalues and Eigenvectors

Assume we have a linear transformation given by a square matrix $\mathbf{A} \in \mathbb{C}^{n\times n}$. Then matrix $\mathbf{A}$ has a scalar eigenvalue $\lambda$ associated with a non-zero eigenvector $\mathbf{v}$ if

$$ \mathbf{A} \mathbf{v} = \lambda \mathbf{v}.$$

The obvious geometric interpretation is that eigenvectors are the vectors whose direction is not affected by the transformation; they are only stretched (scaled) by the associated factor $\lambda$. Another fundamental application is connected with systems of linear differential equations, but that's another story.
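
As a small illustration of this interpretation, consider the shear matrix below (an arbitrary example, not taken from the text): a generic vector changes its direction under the transformation, while an eigenvector is only scaled.

S = matrix('1 1; 0 1')      # a shear transformation
v = matrix('1; 0')          # an eigenvector of S (eigenvalue 1)
w = matrix('1; 1')          # a generic vector
print(S*v)                  # parallel to v: direction preserved
print(S*w)                  # not parallel to w: direction changed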

If a matrix $\mathbf{A} \in \mathbb{C}^{n\times n}$ has $n$ linearly independent eigenvectors, then matrix $\mathbf{A}$ can be factorized as

$$ \mathbf{A} = \mathbf{U} \mathbf{D} \mathbf{U}^{-1}, $$

where $\mathbf{D}$ is a diagonal matrix containing the eigenvalues on its diagonal and the columns of $\mathbf{U}$ are the corresponding eigenvectors. Only diagonalizable matrices can be factorized in this way, which is also called spectral decomposition or eigendecomposition. If $\mathbf{A}$ is moreover normal (for example Hermitian, or real symmetric as in the example below), the eigenvectors can be chosen orthonormal; then $\mathbf{U}$ is a unitary matrix and the factorization becomes $ \mathbf{A} = \mathbf{U} \mathbf{D} \mathbf{U}^H $.
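
As a quick illustration of the difference between the two forms (a minimal sketch with an arbitrary non-symmetric, diagonalizable matrix): here the eigenvectors are not orthogonal, so $\mathbf{U}$ is not unitary and the reconstruction needs $\mathbf{U}^{-1}$ rather than $\mathbf{U}^H$.

B = matrix('2 1; 0 3')                    # diagonalizable, but not symmetric
λ, U = linalg.eig(B)
print(U * diag(λ) * linalg.inv(U))        # reconstructs B
print(U * diag(λ) * U.H)                  # does NOT reconstruct B, since U is not unitary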

Example: Find eigenvalues and eigenvectors for matrix

$$ \mathbf{A} = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 3 & 4 \\ 0 & 4 & 9 \end{pmatrix}.$$

In [5]:
A = matrix('2 0 0; 0 3 4; 0 4 9')
λ, U = linalg.eig(A)
print('Eigenvalues:')
print(λ)
print('\nEigenvectors:')
print(U)
Eigenvalues:
[11.  1.  2.]

Eigenvectors:
[[ 0.          0.          1.        ]
 [ 0.4472136   0.89442719  0.        ]
 [ 0.89442719 -0.4472136   0.        ]]

Verify that $ \mathbf{A} \mathbf{v} - \lambda \mathbf{v} = 0$ for each eigenvalue–eigenvector pair

In [6]:
eps = 0
for i in range(A.shape[0]):
    eps += sum(A*U[:,i] - λ[i]*U[:,i])   # accumulate the residual A·v - λ·v for the i-th eigenpair
print(eps)
1.6653345369377348e-16

Verify that $\mathbf{A} = \mathbf{U} \mathbf{D} \mathbf{U}^T $ (here $\mathbf{A}$ is real symmetric, so $\mathbf{U}$ is orthogonal and $\mathbf{U}^T = \mathbf{U}^{-1}$)

In [7]:
print( U * diag(λ) * U.T )
[[2. 0. 0.]
 [0. 3. 4.]
 [0. 4. 9.]]

3 Singular Value Decomposition (SVD)

Since the spectral decomposition $ \mathbf{A} = \mathbf{U} \mathbf{D} \mathbf{U}^H $ exists only for square (normal) matrices, an obvious question is whether it can be generalized to matrices of any shape. It can be proven that, if we allow two different unitary matrices instead of a single matrix $\mathbf{U}$, such a factorization always exists.

Assume we have a matrix $\mathbf{A} \in \mathbb{C}^{m\times n}$, ($m, n \in \mathbb{N}$). Then there exists a matrix factorization of $\mathbf{A}$, called singular value decomposition (SVD), of the form

$$ \mathbf{A} = \mathbf{U} \mathbf{\Sigma} \mathbf{V}^H, $$

where $\mathbf{\Sigma}$ is a diagonal matrix containing the so-called singular values $\sigma_1 \geq \sigma_2 \geq \dots \geq \sigma_n \geq 0$ on its diagonal. The matrices $\mathbf{U}$ and $\mathbf{V}$ are unitary, which means their columns are orthonormal. The columns of $\mathbf{U}$ and $\mathbf{V}$ are called the left-singular vectors and right-singular vectors of $\mathbf{A}$, respectively.

Using the NumPy package, the SVD can be computed by the method numpy.linalg.svd. It returns the matrix $\mathbf{U}$, the singular values $\sigma$ and the matrix $\mathbf{V}^H$ (note that $\mathbf{V}$ is returned in its conjugate-transposed form).
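
For instance, on a tall 3×2 matrix the shapes returned by numpy.linalg.svd look as follows (a minimal sketch using the same matrix as the example below; full_matrices=False gives the reduced decomposition used in this post):

A = matrix('14 2; 4 22; 16 13')
U, σ, VH = linalg.svd(A)                           # full SVD
Ur, σr, VHr = linalg.svd(A, full_matrices=False)   # reduced (economy-size) SVD
print(U.shape, σ.shape, VH.shape)      # (3, 3) (2,) (2, 2)
print(Ur.shape, σr.shape, VHr.shape)   # (3, 2) (2,) (2, 2)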

Example: Find singular value decomposition for matrix

$$ \mathbf{A} = \left(\begin{array}{rr} 14 & 2 \\ 4 & 22 \\ 16 & 13 \end{array}\right).$$

In [8]:
A = matrix('14 2; 4 22; 16 13')
U, σ, VH = linalg.svd(A, full_matrices=False)
print('Singular values σ:')
print(σ)
print('\nLeft-singular vectors:')
print(U)
print('\nRight-singular vectors:')
print(VH.H)
Singular values σ:
[30. 15.]

Left-singular vectors:
[[-0.33333333  0.66666667]
 [-0.66666667 -0.66666667]
 [-0.66666667  0.33333333]]

Right-singular vectors:
[[-0.6  0.8]
 [-0.8 -0.6]]

Verify that $\mathbf{A} = \mathbf{U} \mathbf{\Sigma} \mathbf{V}^H $

In [9]:
print( U * diag(σ) * VH )
[[14.  2.]
 [ 4. 22.]
 [16. 13.]]

4 Solving Linear Systems by the Method of Least Squares

Consider a system of linear equations $ \mathbf{A} \mathbf{x} = \mathbf{b}$, where $\mathbf{A} \in \mathbb{C}^{m\times n}$, $\mathbf{b} \in \mathbb{C}^{m}$ and $m > n$. Such a system is called overdetermined, because it has more equations than unknowns, and thus in general it has no exact solution. However, we can look for the least-squares solution that minimizes the Euclidean norm of the residual, that is,

$$ \min_{\mathbf{x}\in \mathbb{C}^n} \|\mathbf{b}-\mathbf{A}\mathbf{x}\|. $$

4.1 Least-squares solution for systems with a full-rank matrix

When $\mathbf{A}$ is a full-rank matrix, that is $\mathrm{rank} \, \mathbf{A} = \min(m,n)$, the least-squares solution can be found easily as the solution of the linear system (the so-called normal equations)

$$ \mathbf{A}^H \mathbf{A} \mathbf{x} = \mathbf{A}^H \mathbf{b}.$$

In that case, $\mathbf{A}^H \mathbf{A}$ is a positive-definite Hermitian matrix, so the least-squares solution exists and is unique. To obtain it, we can use a QR or Cholesky decomposition, or invert $\mathbf{A}^H \mathbf{A}$ directly as follows

$$ \mathbf{x} = (\mathbf{A}^H \mathbf{A})^{-1} \mathbf{A}^H \mathbf{b}.$$
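
A brief sketch of this approach on a small, arbitrary full-rank example (a straight-line fit with 4 equations and 2 unknowns; not part of the examples below):

A = matrix('1 1; 1 2; 1 3; 1 4')        # full-rank, 4 equations, 2 unknowns
b = matrix('1; 2; 2; 3')
x = linalg.solve(A.H*A, A.H*b)          # solve the normal equations A^H A x = A^H b
print(x)                                # least-squares solution
print(linalg.norm(b - A*x))             # Euclidean norm of the residual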

4.2 Least-squares solution for systems with a matrix of any rank

Let's discuss the most general case, where the matrix $\mathbf{A}$ may be rectangular and its rank may be arbitrary. In particular, when $\mathbf{A}$ is rank-deficient, the inverse of $ \mathbf{A}^H \mathbf{A}$ does not exist, so we need another solution method. Using the singular value decomposition $ \mathbf{A} = \mathbf{U} \mathbf{\Sigma} \mathbf{V}^H$, we can find that

$$ \|\mathbf{b}-\mathbf{A}\mathbf{x}\| = \|\mathbf{U}^H\mathbf{b}-\mathbf{U}^H\mathbf{A}\mathbf{x}\| = \|\mathbf{U}^H\mathbf{b} - \mathbf{\Sigma} \mathbf{V}^H\mathbf{x}\|.$$

If we denote

$$\mathbf{c}=\mathbf{U}^H\mathbf{b}, \quad \mathbf{y}=\mathbf{V}^H\mathbf{x},$$

we can see that $\|\mathbf{b}-\mathbf{A}\mathbf{x}\| $ is minimized if and only if $\|\mathbf{c} - \mathbf{\Sigma}\mathbf{y}\| $ is minimal. In this way we have transformed the least-squares problem with the matrix $\mathbf{A}$ and vector $\mathbf{b}$ into a problem with the diagonal matrix $\mathbf{\Sigma}$ and vector $\mathbf{c}$. Since $\mathbf{\Sigma}$ is diagonal, $\|\mathbf{c} - \mathbf{\Sigma}\mathbf{y}\|$ is minimal when $\sigma_i y_i = c_i$ for all $i \leq r = \mathrm{rank}\,\mathbf{A}$ (the remaining components of $\mathbf{y}$ can be chosen as zero), which we can write compactly as

$$\mathbf{y} = \mathbf{\Sigma}^{-1}\mathbf{c},$$

where $\mathbf{\Sigma}^{-1}$ denotes the (pseudo)inverse of $\mathbf{\Sigma}$ constructed below.

Thanks to that, we can find the least-squares solution for $\mathbf{x}$ as

$$\mathbf{x} = \mathbf{V}\mathbf{y} = \mathbf{V}\mathbf{\Sigma}^{-1}\mathbf{c} = \mathbf{V}\mathbf{\Sigma}^{-1}\mathbf{U}^H\mathbf{b}.$$

So, after computing the SVD we just need to determine the (pseudo)inverse of the matrix $\mathbf{\Sigma}$. Let's denote $r = \mathrm{rank}\,\mathbf{A}$; then $\mathbf{\Sigma}^{-1} \in \mathbb{R}^{n\times n}$ can be found as

$$\mathbf{\Sigma}^{-1} = \begin{pmatrix} \mathbf{\tilde{\Sigma}}^{-1} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} \end{pmatrix}, \quad \mathbf{\tilde{\Sigma}}^{-1} = \begin{pmatrix} 1/\sigma_1 & 0 & \dots & 0 \\ 0 & 1/\sigma_2 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & 1/\sigma_r \end{pmatrix}. $$

To make things more elegant, we can call the expression $\mathbf{V}\mathbf{\Sigma}^{-1}\mathbf{U}^H$ the Moore–Penrose pseudoinverse of the matrix $\mathbf{A}$ and denote it by $\mathbf{A}^+$. Then we can write the solution in a very compact form

$$\mathbf{x} = \mathbf{A}^+\mathbf{b}. $$

Example: Find a least-squares solution for the overdetermined system of linear equations given by

$$ \mathbf{A} = \left(\begin{array}{rrrr} 1 & 0 & -1 & 2 \\ 1 & 1 & 1 & -1 \\ 0 & -1 & -2 & 3 \\ 5 & 2 & -1 & 4 \\ -1 & 2 & 5 & -8 \end{array}\right), \quad \mathbf{b} = \left(\begin{array}{r} -1 \\ 2 \\ -3 \\ 1 \\ 7 \end{array}\right), \quad \mathbf{x} = \left(\begin{array}{r} x_1 \\ x_2 \\ x_3 \\ x_4 \end{array}\right).$$

In [10]:
A = matrix('1 0 -1 2; 1 1 1 -1; 0 -1 -2 3; 5 2 -1 4; -1 2 5 -8')
b = matrix('-1; 2; -3; 1; 7')
n = A.shape[1]
r = linalg.matrix_rank(A)
print(A)
print('n =', n)
print('rank(A) =', r)
[[ 1  0 -1  2]
 [ 1  1  1 -1]
 [ 0 -1 -2  3]
 [ 5  2 -1  4]
 [-1  2  5 -8]]
n = 4
rank(A) = 2

We can see that the rank of the matrix $\mathbf{A}$ is 2, which is lower than the number of unknowns, so the matrix is rank-deficient.

In [11]:
# SVD decomposition of matrix A
U, σ, VH = linalg.svd(A, full_matrices=False)
V = VH.H
print('Singular values:\n', σ)
print('\nLeft-singular vectors:')
print(U)
print('\nRight-singular vectors:')
print(V)

# Moore–Penrose pseudoinverse
sigma_inv = diag(hstack([1/σ[:r], zeros(n-r)]))
A_plus = V * sigma_inv * U.H
print('\nMoore–Penrose pseudoinverse of A:')
print(A_plus)

# Least-squares solution
x = A_plus * b
print('\nLeast-squares solution x:')
print(x)

# Error of solution ||b-A*x||
eps = linalg.norm(b-A*x)
print('\nError of the least-squares solution: ||b-A*x|| =', eps)
Singular values:
 [1.15923770e+01 5.44213162e+00 1.18203515e-15 1.59830883e-16]

Left-singular vectors:
[[-0.20752001 -0.08477441  0.63436269  0.67966601]
 [ 0.09883702 -0.30122044 -0.70236176  0.63677741]
 [-0.30635703  0.21644603  0.06315979  0.25652062]
 [-0.424886   -0.85676412  0.07512664 -0.2510953 ]
 [ 0.82023408 -0.34811766  0.30763414  0.06096691]]

Right-singular vectors:
[[-0.26339267 -0.79411857 -0.52472245  0.15705526]
 [ 0.10316178 -0.53791974  0.81669201  0.18169801]
 [ 0.46971622 -0.28172091 -0.05921667 -0.83456179]
 [-0.83627066  0.02552208  0.23275289 -0.49580852]]

Moore–Penrose pseudoinverse of A:
[[ 0.01708543  0.04170854 -0.02462312  0.13467337  0.0321608 ]
 [ 0.00653266  0.03065327 -0.0241206   0.08090452  0.04170854]
 [-0.0040201   0.01959799 -0.02361809  0.02713568  0.05125628]
 [ 0.01457286 -0.00854271  0.02311558  0.02663317 -0.06080402]]

Least-squares solution x:
[[ 0.5]
 [ 0.5]
 [ 0.5]
 [-0.5]]

Error of the least-squares solution: ||b-A*x|| = 2.3603665272841412e-15

So, using the SVD and the Moore–Penrose pseudoinverse, we have found the least-squares solution

$$ \mathbf{x} = \left(\begin{array}{r} 0.5\\ 0.5\\ 0.5\\ -0.5 \end{array}\right).$$

We can also utilize built-in methods from the NumPy package. To compute the Moore–Penrose pseudoinverse, there is the method numpy.linalg.pinv.

In [12]:
print('Moore–Penrose pseudoinverse by NumPy:')
print(linalg.pinv(A))
Moore–Penrose pseudoinverse by NumPy:
[[ 0.01708543  0.04170854 -0.02462312  0.13467337  0.0321608 ]
 [ 0.00653266  0.03065327 -0.0241206   0.08090452  0.04170854]
 [-0.0040201   0.01959799 -0.02361809  0.02713568  0.05125628]
 [ 0.01457286 -0.00854271  0.02311558  0.02663317 -0.06080402]]

NumPy even has a method numpy.linalg.lstsq to compute the least-squares solution of any linear system, whether under-, well-, or over-determined.

In [13]:
print('Least-squares solution by NumPy:')
print(linalg.lstsq(A,b,rcond=-1)[0])
Least-squares solution by NumPy:
[[ 0.5]
 [ 0.5]
 [ 0.5]
 [-0.5]]
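
Besides the solution itself, numpy.linalg.lstsq also returns the sum of squared residuals, the rank of $\mathbf{A}$ and its singular values; continuing with the $\mathbf{A}$ and $\mathbf{b}$ defined above, they can be unpacked as follows (a minimal sketch):

x, residuals, rank, sv = linalg.lstsq(A, b, rcond=-1)
print(rank)        # rank of A (2 in this example)
print(sv)          # singular values of A
print(residuals)   # empty here, because A is rank-deficient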

5 Conclusion

As we can see, Python and the NumPy package can be successfully used even for some advanced tasks in linear algebra. Also, the Jupyter notebook offers a truly convenient way to write mathematical descriptions and test live code for various numerical methods.
