Matrix factorizations based on eigenvalues
In this category, we have two kinds of factorizations on square matrices: Spectral and Schur decompositions (although, technically, a spectral decomposition is a special case of Schur decomposition). The objective of both is initially to present the eigenvalues of one or several matrices simultaneously, although they have quite different applications.
Spectral decomposition
We consider the following four cases:
- Given a square matrix
A
, we seek all vectorsv
(right eigenvectors) that satisfy A ● v = m ● v for some real or complex valuem
(the corresponding eigenvalues). If all eigenvectors are different, we collect them as the columns of matrixV
(that happens to be invertible). Their corresponding eigenvalues are stored in the same order as the diagonal entries of a diagonal matrixD
. We can then realizeA
as the product A = V ● D ● V-1. We refer to this decomposition as an ordinary eigenvalue problem. - Given a square matrix
A
, we seek all vectorsv
(left eigenvectors) that satisfy v ● A = m ● v for the eigenvaluesm
. As before, if all eigenvectors are different, they are collected in matrixV
; their corresponding eigenvalues are collected in the diagonal matrixD
. The matrixA
can then be decomposed as the product A = V ● D ● V-1. We also refer to this factorization as an ordinary eigenvalue problem. The eigenvalues are the same as in the previous case. - Given square matrices
A
andB
with the same size, we seek all vectorsv
(generalized right eigenvectors) that satisfy m ● A ● v = n ● B ● v for some real or complex valuesm
andn
. The ratios r = n/m, when they are computable, are called generalized eigenvalues. The eigenvectors are collected as columns of matrixV
, and their corresponding generalized eigenvaluesr
collected in a diagonal matrixD
. We can then realize the relation betweenA
andB
by the identity A = B ● V ● D ● V-1. We refer to this identity as a generalized eigenvalue problem. - For the same case as before, if we seek vectors
v
(generalized left eigenvectors) and valuesm
andn
that satisfy m ● v ● A = n ● v ● B, we have another similar decomposition. We again refer to this factorization as a generalized eigenvalue problem.
The following functions in the modules scipy.linalg
and scipy.sparse.linalg
help us to compute eigenvalues and eigenvectors:
Constructor |
Description |
---|---|
|
Ordinary/generalized eigenvalue problem |
|
Eigenvalues for ordinary/generalized eigenvalue problem |
|
Ordinary/generalized eigenvalue problem. Hermitian/symmetric matrix |
|
Eigenvalues for ordinary/generalized eigenvalue problem; Hermitian/symmetric matrix |
|
Ordinary eigenvalue problem; Hermitian/symmetric band matrix |
|
Eigenvalues for ordinary eigenvalue problem; Hermitian/symmetric band matrix |
|
Find k eigenvalues and eigenvectors |
|
Find |
|
Ordinary/generalized eigenvalue problem with optional preconditioning |
For any kind of eigenvalue problem where the matrices are not symmetric or banded, we use the function eig
, which is a wrapper for the LAPACK
routines GEEV
and GGEV
(the latter for generalized eigenvalue problems). The function eigvals
is syntactic sugar for a case of eig
that only outputs the eigenvalues, but not the eigenvectors. To report whether we require left of right eigenvectors, we use the optional Boolean parameters left
and right
. By default, left
is set to False
and right
to True
, hence offering right eigenvectors.
For eigenvalue problems with non-banded real symmetric or Hermitian matrices, we use the function eigh
, which is a wrapper for the LAPACK routines of the form *EVR
, *GVD
, and *GV
. We are given the choice to output as many eigenvalues as we want, with the optional parameter eigvals
. This is a tuple of integers that indicate the indices of the lowest and the highest eigenvalues required. If omitted, all eigenvalues are returned. In such a case, it is possible to perform the computation with a much faster algorithm based on divide and conquer techniques. We may indicate this choice with the optional Boolean parameter turbo
(by default set to False
).
If we wish to report only eigenvalues, we can set the optional parameter eigvals_only
to True
, or use the corresponding syntactic sugar eighvals
.
The last case that we contemplate in the scipy.linalg
module is that of the eigenvalue problem of a banded real symmetric or Hermitian matrix. We use the function eig_banded
, making sure that the input matrices are in the AB format. This function is a wrapper for the LAPACK
routines *EVX
.
For extremely large matrices, the computation of eigenvalues is often computationally impossible. If these large matrices are sparse, it is possible to calculate a few eigenvalues with two iterative algorithms, namely the Implicitly Restarted Arnoldi
and the
Implicitly Restarted Lanczos methods (the latter for symmetric or Hermitian matrices). The module scipy.sparse.linalg
has two functions, eigs
and eigsh
, which are wrappers to the ARPACK
routines *EUPD
that perform them. We also have the function lobpcg
that performs another iterative algorithm, the Locally Optimal Block Preconditioned Conjugate Gradient method. This function accepts a Preconditioner, and thus has the potential to converge more rapidly to the desired eigenvalues.
We will illustrate the usage of all these functions with an interesting matrix: Andrews
. It was created in 2003 precisely to benchmark memory-efficient algorithms for eigenvalue problems. It is a real symmetric sparse matrix with size 60,000 × 60,000 and 760,154 non-zero entries. It can be downloaded from the Sparse Matrix Collection at www.cise.ufl.edu/research/sparse/matrices/Andrews/Andrews.html.
For this example, we downloaded the matrix in the Matrix Market format Andrews.mtx
. Note that the matrix is symmetric, and the file only provides data on or below the main diagonal. After collecting all this information, we ensure that we populate the upper triangle too:
In [1]: import numpy as np, scipy.sparse as spsp, \ ...: scipy.sparse.linalg as spspla In [2]: np.set_printoptions(suppress=True, precision=6) In [3]: rows, cols, data = np.loadtxt("Andrews.mtx", skiprows=14, ...: unpack=True); \ ...: rows-=1; \ ...: cols-=1 In [4]: A = spsp.csc_matrix((data, (rows, cols)), \ ...: shape=(60000,60000)); \ ...: A = A + spsp.tril(A, k=1).transpose()
We compute first the top largest five eigenvalues in absolute value. We call the function eigsh
, with the option which='LM'
.
In [5]: %time eigvals, v = spspla.eigsh(A, 5, which='LM') CPU times: user 3.59 s, sys: 104 ms, total: 3.69 s Wall time: 3.13 s In [6]: print eigvals [ 69.202683 69.645958 70.801108 70.815224 70.830983]
We may compute the smallest eigenvalues in terms of the absolute value too, by switching to the option which='SM'
:
In [7]: %time eigvals, v = spspla.eigsh(A, 5, which='SM') CPU times: user 19.3 s, sys: 532 ms, total: 19.8 s Wall time: 16.7 s In [8]: print eigvals [ 10.565523 10.663114 10.725135 10.752737 10.774503]
Tip
The routines in ARPACK
are not very efficient at finding small eigenvalues. It is usually preferred to apply the shift-invert mode in this case for better performance. For information about this procedure, read the description in www.caam.rice.edu/software/ARPACK/UG/node33.html, or the article by R. B. Lehoucq, D. C. Sorensen, and C. Yang, ARPACK USER GUIDE: Solution of Large Scale Eigenvalue Problems by Implicitly Restarted Arnoldi Methods. SIAM, Philadelphia, PA, 1998.
Note
The function eigsh
allows us to perform shift-invert mode by indicating a value close to the required eigenvalues. If we have a good guess, as offered by the previous step, we may apply this procedure with the option sigma
, and a strategy with the option mode. In this case, we also need to provide a linear operator instead of a matrix. The time of execution is much slower, but the results are much more precise in general (although the given example would not suggest so!).
In [9]: A = spspla.aslinearoperator(A) In [10]: %time spspla.eigsh(A, 5, sigma=10.0, mode='cayley') CPU times: user 2min 5s, sys: 916 ms, total: 2min 6s Wall time: 2min 6s In [11]: print eigvals [ 10.565523 10.663114 10.725135 10.752737 10.774503]
Schur decomposition
There are four cases:
- Complex Schur decomposition for a square matrix
A
with complex coefficients. We can realizeA
as the product A = U ● T ● UH of a unitary matrixU
with an upper triangular matrixT
, and the Hermitian transpose ofU
. We callT
the complex Schur form ofA
. The entries in the diagonal ofT
are the eigenvalues ofA
. - Real Schur decomposition for a square matrix
A
with real coefficients. If all the eigenvalues of the matrix are real valued, then we may realize the matrix as the product A = V ● S ● VT of an orthonormal matrixV
with a block-upper triangular matrixS
, and the transpose ofV
. The blocks inS
are either of size 1 × 1 or 2 × 2. If the block is 1 × 1, the value is one of the real eigenvalues ofA
. Any 2 × 2 blocks represents a pair of complex conjugate eigenvalues ofA
. We callS
the real Schur form ofA
. - Complex generalized Schur decomposition of two square matrices
A
andB
. We can simultaneously factorize them to the form A = Q ● S ● ZH and B = Q ● T ● ZH with the same unitary matricesQ
andZ
. The matricesS
andT
are both upper triangular, and the ratios of their diagonal elements are precisely the generalized eigenvalues ofA
andB
. - Real generalized Schur decomposition of two real-valued square matrices
A
andB
. Simultaneous factorization of both can be achieved in the form A = Q ● S ● ZT and B = Q ● T ● ZT for the same orthogonal matricesQ
andZ
. The matricesS
andT
are block-upper triangular, with blocks of size 1 × 1 and 2 × 2. With the aid of these blocks, we can find the generalized eigenvalues ofA
andB
.
There are four functions in the module scipy.linalg
that provide us with tools to compute any of these decompositions:
Constructor |
Description |
---|---|
|
Schur decomposition of a matrix |
|
Convert from real Schur form to complex Schur form |
|
Generalized Schur decomposition of two matrices |
|
Hessenberg form of a matrix |
The function hessenberg
gives us the first step in the computation of any Schur decomposition. This is a factorization of any square matrix A
in the form A = Q ● U ● QH, where Q
is unitary and U
is an upper Hessenberg matrix (all entries are zero below the sub-diagonal). The algorithm is based on the combination of the LAPACK
routines GEHRD
, GEBAL
(to compute U
), and the BLAS
routines GER
, GEMM
(to compute Q
).
The functions schur
and qz
are wrappers to the LAPACK
routines GEES
and GGES
, to compute the normal and generalized Schur decompositions (respectively) of square matrices. We choose whether to report complex or real decompositions on the basis of the optional parameter output (which we set to 'real'
or 'complex'
). We also have the possibility of sorting the eigenvalues in the matrix representation. We do so with the optional parameter sort
, with the following possibilities:
None
: If we do not require any sorting. This is the default.'lhp'
: In the left-hand plane.'rhp'
: In the right-hand plane'iuc'
: Inside the unit circle'ouc'
: Outside the unit circlefunc
: Any callable function calledfunc
can be used to provide the users with their own sorting