Creation of matrices and linear operators
In the first part of this chapter, we are going to focus on the effective creation of matrices. We start by recalling different ways to construct a basic matrix as an instance of the ndarray class, including an enumeration of all the special matrices already included in NumPy and SciPy. We proceed to examine the possibilities of constructing complex matrices from basic ones. We review the same concepts within the matrix class. Next, we explore in detail the different ways to input sparse matrices. We finish the section with the construction of linear operators.
Note
We assume familiarity with ndarray creation in NumPy, as well as data types (dtype), indexing, routines for the combination of two or more arrays, array manipulation, and the extraction of information from these objects. In this chapter, we will focus on the functions, methods, and routines that are significant to matrices alone. We will disregard operations whose outputs have no translation into linear algebra equivalents. For a primer on ndarray, we recommend browsing through Chapter 2, Top-Level SciPy of Learning SciPy for Numerical and Scientific Computing, Second Edition. For a quick review of linear algebra, we recommend Hoffman and Kunze, Linear Algebra 2nd Edition, Pearson, 1971.
Constructing matrices in the ndarray class
We may create matrices from data as ndarray instances in three different ways: manually from standard input, by assigning to each entry a value from a function, or by retrieving the data from external files.
Constructor | Description
--- | ---
numpy.array(object) | Create a matrix from the data in object
numpy.diag(v, k) | Create a diagonal matrix with the entries of array v on its k-th diagonal
numpy.fromfunction(function, shape) | Create a matrix by executing a function over each coordinate
numpy.fromfile(fname) | Create a matrix from a text or binary file (basic)
numpy.loadtxt(fname) | Create a matrix from a text file (advanced)
Let us create a few example matrices to illustrate some of the functions defined in the previous table. As before, we start an IPython session:
```
In [1]: import numpy as np, matplotlib.pyplot as plt, \
   ...: scipy.linalg as spla, scipy.sparse as spsp, \
   ...: scipy.sparse.linalg as spspla
In [2]: A = np.array([[1,2],[4,16]]); \
   ...: A
Out[2]:
array([[ 1,  2],
       [ 4, 16]])
In [3]: B = np.fromfunction(lambda i,j: (i-1)*(j+1),
   ...:                     (3,2), dtype=int); \
   ...: print B
[[-1 -2]
 [ 0  0]
 [ 1  2]]
In [4]: np.diag((1j,4))
Out[4]:
array([[ 0.+1.j,  0.+0.j],
       [ 0.+0.j,  4.+0.j]])
```
Special matrices with predetermined zeros and ones can be constructed with the following functions:
Constructor | Description
--- | ---
numpy.empty(shape) | Array of a given shape, entries not initialized
numpy.eye(N, M, k) | 2-D array with ones on the k-th diagonal, and zeros elsewhere
numpy.identity(n) | Identity array
numpy.ones(shape) | Array with all entries equal to one
numpy.zeros(shape) | Array with all entries equal to zero
numpy.tri(N, M, k) | Array with ones at and below the given diagonal, zeros otherwise
Tip
All these constructions, except numpy.tri, have a companion function xxx_like that creates an ndarray with the requested characteristics and with the same shape and data type as another source ndarray:
```
In [5]: np.empty_like(A)
Out[5]:
array([[140567774850560, 140567774850560],
       [     4411734640, 562954363882576]])
```
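Two of these companions, zeros_like and ones_like, behave as follows (a quick sketch using the array A defined above):

```python
# Both results inherit A's shape (2, 2) and integer dtype.
print np.zeros_like(A)   # [[0 0]
                         #  [0 0]]
print np.ones_like(A)    # [[1 1]
                         #  [1 1]]
```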
Of notable importance are arrays constructed as numerical ranges.
Constructor | Description
--- | ---
numpy.arange(start, stop, step) | Evenly spaced values within an interval
numpy.linspace(start, stop, num) | Evenly spaced numbers over an interval
numpy.logspace(start, stop, num) | Evenly spaced numbers on a log scale
numpy.meshgrid(x1, x2, ...) | Coordinate matrices from two or more coordinate vectors
numpy.mgrid | nd_grid instance returning a dense multi-dimensional meshgrid
numpy.ogrid | nd_grid instance returning an open multi-dimensional meshgrid
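A brief sketch of the first four constructors (the argument values below are our own choices):

```python
# Evenly spaced values: arange excludes the right endpoint,
# linspace includes it by default.
print np.arange(0, 10, 2)     # [0 2 4 6 8]
print np.linspace(0, 1, 5)    # [ 0.    0.25  0.5   0.75  1.  ]
print np.logspace(0, 3, 4)    # [    1.    10.   100.  1000.]

# Coordinate matrices from two coordinate vectors.
X, Y = np.meshgrid(np.arange(3), np.arange(2))
print X    # [[0 1 2]
           #  [0 1 2]]
print Y    # [[0 0 0]
           #  [1 1 1]]
```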
Special matrices with numerous applications in linear algebra can be easily called from within NumPy and the module scipy.linalg.
Constructor | Description
--- | ---
scipy.linalg.circulant(c) | Circulant matrix generated by 1-D array c
scipy.linalg.companion(a) | Companion matrix of the polynomial with coefficients coded by a
scipy.linalg.hadamard(n) | Sylvester's construction of a Hadamard matrix of size n × n; n must be a power of 2
scipy.linalg.hankel(c, r) | Hankel matrix with first column c and last row r
scipy.linalg.hilbert(n) | Hilbert matrix of size n × n
scipy.linalg.invhilbert(n) | The inverse of a Hilbert matrix of size n × n
scipy.linalg.leslie(f, s) | Leslie matrix with fecundity array f and survival coefficients s
scipy.linalg.pascal(n) | n × n truncations of the Pascal matrix of binomial coefficients
scipy.linalg.toeplitz(c, r) | Toeplitz array with first column c and first row r
numpy.vander(x) | Van der Monde matrix of array x
For instance, one fast way to obtain all binomial coefficients of orders up to a large number (the corresponding Pascal triangle) is by means of a Pascal matrix. The following example shows how to compute these coefficients up to order 13:
```
In [6]: print spla.pascal(13, kind='lower')
```
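Each entry of this lower triangular matrix is a binomial coefficient: the entry at row i and column j equals C(i, j). As a small check (our own addition, not part of the original session):

```python
# C(10, 4) = 210 can be read directly off the Pascal matrix.
P = spla.pascal(13, kind='lower')
print P[10, 4]    # 210
```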
Besides these basic constructors, we can always stack arrays in different ways:
Constructor | Description
--- | ---
numpy.concatenate((A1, A2, ...)) | Join matrices together
numpy.hstack((A1, A2, ...)) | Stack matrices horizontally
numpy.vstack((A1, A2, ...)) | Stack matrices vertically
numpy.tile(A, reps) | Repeat a matrix a certain number of times (given by reps)
scipy.linalg.block_diag(A1, A2, ...) | Create a block diagonal array
Let us observe some of these constructors in action:
```
In [7]: np.tile(A, (2,3))   # 2 rows, 3 columns of copies of A
Out[7]:
array([[ 1,  2,  1,  2,  1,  2],
       [ 4, 16,  4, 16,  4, 16],
       [ 1,  2,  1,  2,  1,  2],
       [ 4, 16,  4, 16,  4, 16]])
In [8]: spla.block_diag(A,B)
Out[8]:
array([[ 1,  2,  0,  0],
       [ 4, 16,  0,  0],
       [ 0,  0, -1, -2],
       [ 0,  0,  0,  0],
       [ 0,  0,  1,  2]])
```
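The plain stacking constructors from the table behave analogously; a brief sketch with the matrices A and B defined earlier:

```python
# vstack glues matrices with the same number of columns, one on top of
# the other; hstack glues matrices with the same number of rows.
print np.vstack((A, B))   # 5 x 2: the rows of A followed by the rows of B
print np.hstack((A, A))   # 2 x 4: two copies of A side by side
```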
Constructing matrices in the matrix class
For the matrix class, the usual way to create a matrix directly is to invoke either numpy.mat or numpy.matrix. Observe how much more comfortable the syntax of numpy.matrix is than that of numpy.array when creating a matrix similar to A. With this syntax, values separated by commas belong to the same row of the matrix, and a semi-colon indicates a change of row. Notice the casting to the matrix class too!
```
In [9]: C = np.matrix('1,2;4,16'); \
   ...: C
Out[9]:
matrix([[ 1,  2],
        [ 4, 16]])
```
These two functions also transform any ndarray into matrix. There is a third function that accomplishes this task: numpy.asmatrix:
```
In [10]: np.asmatrix(A)
Out[10]:
matrix([[ 1,  2],
        [ 4, 16]])
```
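Note that numpy.asmatrix does not copy the data of its argument; it is equivalent to matrix(data, copy=False), so the result shares memory with the source array (a quick sketch to confirm this behavior):

```python
# Writes through the matrix view are visible in A, and vice versa.
Am = np.asmatrix(A)
Am[0, 0] = 101
print A[0, 0]    # 101
A[0, 0] = 1      # restore A to its original value
```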
For arrangements of matrices composed of blocks, besides the common stack operations for ndarray described before, we have the extremely convenient function numpy.bmat. Note the similarity with the syntax of numpy.matrix; in particular, commas signify horizontal concatenation and semi-colons signify vertical concatenation:
```
In [11]: np.bmat('A;B')
Out[11]:
matrix([[ 1,  2],
        [ 4, 16],
        [-1, -2],
        [ 0,  0],
        [ 1,  2]])
In [12]: np.bmat('A,C;C,A')
Out[12]:
matrix([[ 1,  2,  1,  2],
        [ 4, 16,  4, 16],
        [ 1,  2,  1,  2],
        [ 4, 16,  4, 16]])
```
Constructing sparse matrices
There are seven different ways to input sparse matrices. Each format is designed to make a specific problem or operation more efficient. Let us go over them in detail:
Method | Name | Optimal use
--- | --- | ---
bsr_matrix | Block Sparse Row | Efficient arithmetic, provided the matrix contains blocks.
coo_matrix | Coordinate | Fast and efficient construction format. Efficient methods to convert to the CSC and CSR formats.
csc_matrix | Compressed Sparse Column | Efficient matrix arithmetic and column slicing. Relatively fast matrix-vector product.
csr_matrix | Compressed Sparse Row | Efficient matrix arithmetic and row slicing. Fastest to perform matrix-vector products.
dia_matrix | Diagonal storage | Efficient for construction and storage if the matrix contains long diagonals of non-zero entries.
dok_matrix | Dictionary of keys | Efficient incremental construction and access of individual matrix entries.
lil_matrix | Row-based linked list | Flexible slicing. Efficient for changes to matrix sparsity.
They can be populated in up to five ways, three of which are common to every sparse matrix format:
- Any generic matrix can be cast to a sparse format. The lil format is the most effective for this method:

```
In [13]: A_coo = spsp.coo_matrix(A); \
   ....: A_lil = spsp.lil_matrix(A)
```
- A sparse matrix in one format can be cast to any other specific sparse format:

```
In [14]: A_csr = spsp.csr_matrix(A_coo)
```
- Empty sparse matrices of any shape can be constructed by indicating the shape and dtype:

```
In [15]: M_bsr = spsp.bsr_matrix((100,100), dtype=int)
```
They all have several different extra input methods, each specific to their storage format.
- Fancy indexing: As we would do with any generic matrix. This is only possible with the LIL or DOK formats:
```
In [16]: M_lil = spsp.lil_matrix((100,100), dtype=int)
In [17]: M_lil[25:75, 25:75] = 1
In [18]: M_bsr[25:75, 25:75] = 1
NotImplementedError                       Traceback (most recent call last)
<ipython-input-18-d9fa1001cab8> in <module>()
----> 1 M_bsr[25:75, 25:75] = 1
[...]/scipy/sparse/bsr.pyc in __setitem__(self, key, val)
    297
    298     def __setitem__(self,key,val):
--> 299         raise NotImplementedError
    300
    301     ######################
NotImplementedError:
```
- Dictionary of keys: This input system is most effective when we create, update, or search each element one at a time. It is efficient only for the LIL and DOK formats:
```
In [19]: M_dok = spsp.dok_matrix((100,100), dtype=int)
In [20]: position = lambda i, j: ((i<j) & ((i+j)%10==0))
In [21]: for i in range(100):
   ....:     for j in range(100):
   ....:         M_dok[i,j] = position(i,j)
   ....:
```
- Data, rows, and columns: This is common to four formats: BSR, COO, CSC, and CSR. This is the method of choice to import sparse matrices from the Matrix Market Exchange format, as illustrated at the beginning of the chapter. A minimal sketch of this method appears right after this list.

Tip

With the data, rows, and columns input method, it is a good idea to always include the option shape in the construction. If it is not provided, the size of the matrix will be inferred from the largest coordinates in the rows and columns, possibly resulting in a smaller matrix than required.

- Data, indices, and pointers: This is common to three formats: BSR, CSC, and CSR. It is the method of choice to import sparse matrices from the Rutherford-Boeing Exchange format.
Note
The Rutherford-Boeing Exchange format is an updated version of the Harwell-Boeing format. It stores the matrix as three vectors: pointers_v, indices_v, and data. The row indices of the entries of the jth column are located in positions pointers_v(j) through pointers_v(j+1)-1 of the vector indices_v. The corresponding values of the matrix are located at the same positions of the vector data.
Let us show by example how to read an interesting matrix in the Rutherford-Boeing matrix exchange format, Pajek/football. This 35 × 35 matrix with 118 non-zero entries can be found in the collection at www.cise.ufl.edu/research/sparse/matrices/Pajek/football.html.
It is an adjacency matrix for the network of all the national football teams that attended the FIFA World Cup held in France in 1998. Each node in the network represents one country (or national football team), and the links show which countries exported players to other countries.
This is a printout of the football.rb file:
The header of the file (the first four lines) contains important information:
- The first line provides us with the title of the matrix, Pajek/football; 1998; L. Krempel; ed: V. Batagelj, and a numerical key for identification purposes, MTRXID=1474.
- The second line contains four integer values: TOTCRD=12 (the number of lines containing significant data after the header; see In [24]), PTRCRD=2 (the number of lines containing pointer data), INDCRD=5 (the number of lines containing indices data), and VALCRD=5 (the number of lines containing the non-zero values of the matrix). Note that it must be TOTCRD = PTRCRD + INDCRD + VALCRD.
- The third line indicates the matrix type, MXTYPE=(iua), which in this case stands for an integer matrix, unsymmetrical, in compressed column form. It also indicates the number of rows and columns (NROW=35, NCOL=35), and the number of non-zero entries (NNZERO=118). The last entry of this line is not used in the case of a compressed column form, and is usually set to zero.
- The fourth line contains the Fortran formats for the data in the following lines: PTRFMT=(20I4) for the pointers, INDFMT=(26I3) for the indices, and VALFMT=(26I3) for the non-zero values.
We proceed to open the file for reading, store each line after the header in a Python list, and extract from the relevant lines of the file the data we require to populate the vectors indptr, indices, and data. We finish by creating the corresponding sparse matrix, called football, in the CSR format, with the data, indices, and pointers method:
In [22]: f = open("football.rb", 'r'); \ ....: football_list = list(f); \ ....: f.close() In [23]: football_data = np.array([]) In [24]: for line in range(4, 4+12): ....: newdata = np.fromstring(football_list[line], sep=" ") ....: football_data = np.append(football_data, newdata) ....: In [25]: indptr = football_data[:35+1] - 1; \ ....: indices = football_data[35+1:35+1+118] - 1; \ ....: data = football_data[35+1+118:] In [26]: football = spsp.csr_matrix((data, indices, indptr), ....: shape=(35,35))
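As a quick sanity check (our own addition, not part of the original session), the shape and nnz attributes of the result should agree with the NROW, NCOL, and NNZERO values announced in the header:

```python
# The parsed matrix should be 35 x 35 and store exactly 118 entries.
print football.shape    # (35, 35)
print football.nnz      # 118
```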
At this point, it is possible to visualize the network with its associated graph, with the help of a Python module called networkx. We obtain the following diagram, depicting the different countries as nodes. Each arrow between the nodes indicates that the originating country has exported players to the receiving country:
Note
networkx is a Python module to deal with complex networks. For more information, visit the project pages at networkx.github.io.
One way to accomplish this task is as follows:
```
In [27]: import networkx
In [28]: G = networkx.DiGraph(football)
In [29]: f = open("football_nodename.txt"); \
   ....: m = list(f); \
   ....: f.close()
In [30]: def rename(x): return m[x]
In [31]: G = networkx.relabel_nodes(G, rename)
In [32]: pos = networkx.spring_layout(G)
In [33]: networkx.draw_networkx(G, pos, alpha=0.2, node_color='w',
   ....:                        edge_color='b')
```
The module scipy.sparse borrows from NumPy some interesting concepts to create constructors and special matrices:
Constructor | Description
--- | ---
scipy.sparse.diags(diagonals, offsets) | Sparse matrix from diagonals
scipy.sparse.rand(m, n, density) | Random sparse matrix of prescribed density
scipy.sparse.eye(m, n, k) | Sparse matrix with ones in the main diagonal
scipy.sparse.identity(n) | Identity sparse matrix of size n × n
Both functions, diags and rand, deserve examples to show their syntax. We will start with a sparse matrix of size 14 × 14 with two diagonals: the main diagonal contains 1s, and the diagonal below it contains 2s. We also create a random matrix with the function scipy.sparse.rand. This matrix has size 5 × 5, with 25 percent non-zero elements (density=0.25), and is crafted in the LIL format:
```
In [34]: diagonals = [[1]*14, [2]*13]
In [35]: print spsp.diags(diagonals, [0,-1]).todense()
[[ 1.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.]
 [ 2.  1.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.]
 [ 0.  2.  1.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.]
 [ 0.  0.  2.  1.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.]
 [ 0.  0.  0.  2.  1.  0.  0.  0.  0.  0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  2.  1.  0.  0.  0.  0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.  2.  1.  0.  0.  0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.  0.  2.  1.  0.  0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.  0.  0.  2.  1.  0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.  0.  0.  0.  2.  1.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  2.  1.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  2.  1.  0.  0.]
 [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  2.  1.  0.]
 [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  2.  1.]]
In [36]: S_25_lil = spsp.rand(5, 5, density=0.25, format='lil')
In [37]: S_25_lil
Out[37]:
<5x5 sparse matrix of type '<type 'numpy.float64'>'
    with 6 stored elements in LInked List format>
In [38]: print S_25_lil
  (0, 0)    0.186663044982
  (1, 0)    0.127636181284
  (1, 4)    0.918284870518
  (3, 2)    0.458768884701
  (3, 3)    0.533573291684
  (4, 3)    0.908751420065
In [39]: print S_25_lil.todense()
[[ 0.18666304  0.          0.          0.          0.        ]
 [ 0.12763618  0.          0.          0.          0.91828487]
 [ 0.          0.          0.          0.          0.        ]
 [ 0.          0.          0.45876888  0.53357329  0.        ]
 [ 0.          0.          0.          0.90875142  0.        ]]
```
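The other two constructors in the table are equally direct; a brief sketch (the arguments are our own choices):

```python
# A 3 x 3 sparse identity matrix, and a rectangular sparse "eye"
# with the ones shifted to the first superdiagonal (k=1).
print spsp.identity(3, dtype=int).todense()
print spsp.eye(3, 4, k=1).todense()
```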
Similar to the way we combined ndarray instances, we have some clever ways to combine sparse matrices to construct more complex objects:
Constructor | Description
--- | ---
scipy.sparse.bmat(blocks) | Sparse matrix from sparse sub-blocks
scipy.sparse.hstack(blocks) | Stack sparse matrices horizontally
scipy.sparse.vstack(blocks) | Stack sparse matrices vertically
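A short sketch of these constructors in action, reusing the matrix A_coo created earlier (the block layout is our own choice):

```python
# Block matrix [[A, 0], [0, A]] from sparse sub-blocks; None stands for
# a zero block of the appropriate size.
print spsp.bmat([[A_coo, None], [None, A_coo]]).todense()

# Horizontal and vertical stacking of sparse matrices.
print spsp.hstack([A_coo, A_coo]).todense()
print spsp.vstack([A_coo, A_coo]).todense()
```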
Linear operators
A linear operator is basically a function that takes as input a column vector and outputs another column vector, by left multiplication of the input with a matrix. Although technically, we could represent these objects just by handling the corresponding matrix, there are better ways to do this.
Constructor | Description
--- | ---
scipy.sparse.linalg.LinearOperator(shape, matvec) | Common interface for performing matrix-vector products
scipy.sparse.linalg.aslinearoperator(A) | Return A as a LinearOperator
In the scipy.sparse.linalg module, we have a common interface that handles these objects: the LinearOperator class. This class has only two attributes and three methods:
- shape: The shape of the representing matrix
- dtype: The data type of the matrix
- matvec: To perform multiplication of a matrix with a vector
- rmatvec: To perform multiplication by the conjugate transpose of a matrix with a vector
- matmat: To perform multiplication of a matrix with another matrix
Its usage is best explained through an example. Consider two functions that take vectors of size 3 and output vectors of size 4, by left multiplication with two respective matrices of size 4 × 3. We could very well define these functions with lambda expressions:
```
In [40]: H1 = np.matrix("1,3,5; 2,4,6; 6,4,2; 5,3,1"); \
   ....: H2 = np.matrix("1,2,3; 1,3,2; 2,1,3; 2,3,1")
In [41]: L1 = lambda x: H1.dot(x); \
   ....: L2 = lambda x: H2.dot(x)
In [42]: print L1(np.ones(3))
[[  9.  12.  12.   9.]]
In [43]: print L2(np.tri(3,3))
[[ 6.  5.  3.]
 [ 6.  5.  2.]
 [ 6.  4.  3.]
 [ 6.  4.  1.]]
```
Now, one issue arises when we try to add/subtract these two functions, or multiply any of them by a scalar. Technically, it should be as easy as adding/subtracting the corresponding matrices, or multiplying them by any number, and then performing the required left multiplication again. But that is not the case.
For instance, we would like to write (L1+L2)(v) instead of L1(v) + L2(v). Unfortunately, doing so will raise an error:

```
TypeError: unsupported operand type(s) for +: 'function' and 'function'
```
Instead, we may instantiate the corresponding linear operators and manipulate them at will, as follows:
```
In [44]: Lo1 = spspla.aslinearoperator(H1); \
   ....: Lo2 = spspla.aslinearoperator(H2)
In [45]: Lo1 - 6 * Lo2
Out[45]: <4x3 _SumLinearOperator with dtype=float64>
In [46]: print Lo1 * np.ones(3)
[  9.  12.  12.   9.]
In [47]: print (Lo1-6*Lo2) * np.tri(3,3)
[[-27. -22. -13.]
 [-24. -20.  -6.]
 [-24. -18. -16.]
 [-27. -20.  -5.]]
```
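The rmatvec method listed previously is available on these objects as well; it performs multiplication by the conjugate transpose (a quick sketch with the real matrix H1, for which this reduces to multiplication by its transpose):

```python
# rmatvec computes H1^H * v; for the real matrix H1 this is H1.T * v,
# so each output entry is a column sum of H1.
print Lo1.rmatvec(np.ones(4))    # [ 14.  14.  14.]
```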
Linear operators are a great advantage when the amount of information needed to describe the product with the related matrix is less than the amount of memory needed to store the non-zero elements of the matrix.
For instance, a permutation matrix is a square binary matrix (of ones and zeros) with exactly one nonzero entry in each row and each column. Consider a large permutation matrix, say 1024 × 1024, formed by four blocks of size 512 × 512: a zero block followed horizontally by an identity block, on top of an identity block followed horizontally by another zero block. We may store this matrix in three different ways:
```
In [47]: P_sparse = spsp.diags([[1]*512, [1]*512], [512,-512], \
   ....:                       dtype=int)
In [48]: P_dense = P_sparse.todense()
In [49]: mv = lambda v: np.roll(v, len(v)/2, axis=0)  # axis=0 so that
   ....:     # matmat also rolls 2-D inputs row-wise, not flattened
In [50]: P_lo = spspla.LinearOperator((1024,1024), matvec=mv, \
   ....:                              matmat=mv, dtype=int)
```
In the sparse case, P_sparse, we may think of this as the storage of just 1024 integer numbers. In the dense case, P_dense, we are technically storing 1,048,576 integer values. In the case of the linear operator, it actually looks like we are not storing anything! The function mv that indicates how to perform the multiplications has a much smaller footprint than any of the related matrices. This is also reflected in the execution times of the multiplications with these objects:
```
In [51]: %timeit P_sparse * np.ones(1024)
10000 loops, best of 3: 29.7 µs per loop
In [52]: %timeit P_dense.dot(np.ones(1024))
100 loops, best of 3: 6.07 ms per loop
In [53]: %timeit P_lo * np.ones(1024)
10000 loops, best of 3: 25.4 µs per loop
```