15 Math Concepts Every Data Scientist Should Know

Understand and learn how to apply the math behind data science algorithms

Author: David Hoyle
Product type: Paperback
Published: Aug 2024
Publisher: Packt
ISBN-13: 9781837634187
Length: 510 pages
Edition: 1st Edition

Matrices as transformations

Matrices are typically applied to vectors and other matrices through the process of matrix multiplication. However, the dry mechanics of matrix multiplication tend to hide what a matrix really represents and what matrix multiplication does. We aim to shed light on what matrices really are in this chapter. We’ll start by covering the basics of matrix multiplication and then show how matrices represent transformations.

Matrix multiplication

If we have a matrix <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math> of size <math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mrow><mi>M</mi><mo>×</mo><mi>K</mi></mrow></mrow></math>, and a matrix <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>B</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math> of size <math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mrow><mi>K</mi><mo>×</mo><mi>N</mi></mrow></mrow></math>, then we can multiply those two matrices together to get a new matrix <math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mrow><munder><munder><mi>C</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><mo>=</mo><munder><munder><mi>A</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><munder><munder><mi>B</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder></mrow></mrow></math>, which is of size <math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mrow><mi>M</mi><mo>×</mo><mi>N</mi></mrow></mrow></math>. The matrix element <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:msub><mml:mrow><mml:mi>C</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:math> is calculated as follows:

<math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><mrow><mrow><msub><mi>C</mi><mrow><mi>i</mi><mi>j</mi></mrow></msub><mo>=</mo><mspace width="0.25em" /><mrow><munderover><mo>∑</mo><mrow><mi>k</mi><mo>=</mo><mn>1</mn></mrow><mi>K</mi></munderover><msub><mi>A</mi><mrow><mi>i</mi><mi>k</mi></mrow></msub></mrow><msub><mi>B</mi><mrow><mi>k</mi><mi>j</mi></mrow></msub></mrow></mrow></math>

Eq. 10

In this example, we are multiplying an <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>M</mml:mi><mml:mo>×</mml:mo><mml:mi>K</mml:mi></mml:math> matrix by a <math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mrow><mi>K</mi><mo>×</mo><mi>N</mi></mrow></mrow></math> matrix, in that order. Schematically, we can write this as follows:

<math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><mrow><mrow><mtable columnwidth="auto" columnalign="center" rowspacing="1.0000ex" rowalign="baseline baseline"><mtr><mtd><mrow><mi>M</mi><mo>×</mo><mi>N</mi></mrow></mtd></mtr><mtr><mtd><mrow><mtext>Matrix</mtext><munder><munder><mi>C</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder></mrow></mtd></mtr></mtable><mo>=</mo><mtable columnwidth="auto" columnalign="center" rowspacing="1.0000ex" rowalign="baseline baseline"><mtr><mtd><mrow><mi>M</mi><mo>×</mo><mi>K</mi></mrow></mtd></mtr><mtr><mtd><mrow><mtext>Matrix</mtext><munder><munder><mi>A</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder></mrow></mtd></mtr></mtable><mo>×</mo><mtable columnwidth="auto" columnalign="center" rowspacing="1.0000ex" rowalign="baseline baseline"><mtr><mtd><mrow><mi>K</mi><mo>×</mo><mi>N</mi></mrow></mtd></mtr><mtr><mtd><mrow><mtext>Matrix</mtext><munder><munder><mi>B</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder></mrow></mtd></mtr></mtable></mrow></mrow></math>

Eq. 11

From this, it is clear that the “inner” dimensions in this example match, both being <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>K</mml:mi></mml:math>. To multiply two matrices together, the inner dimensions must match when we write the multiplication out in this schematic way. If they do not match, we cannot multiply the matrices together. For example, we cannot multiply a <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mn>10</mml:mn><mml:mo>×</mml:mo><mml:mn>4</mml:mn></mml:math> matrix by a <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mn>6</mml:mn><mml:mo>×</mml:mo><mml:mn>7</mml:mn></mml:math> matrix, because the inner dimensions, 4 and 6, differ.
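
As a minimal sketch of Eq. 10, assuming matrices are stored as plain Python lists of rows, the loops below compute each matrix element and raise an error when the inner dimensions differ. The function name `matmul` and the example matrices are illustrative choices, not something defined in the text.

```python
def matmul(A, B):
    """Multiply an M x K matrix A by a K x N matrix B using Eq. 10."""
    M, K = len(A), len(A[0])
    if len(B) != K:
        # The inner dimensions must match, otherwise the product is undefined.
        raise ValueError(
            f"inner dimensions do not match: {M}x{K} times {len(B)}x{len(B[0])}"
        )
    N = len(B[0])
    C = [[0] * N for _ in range(M)]
    for i in range(M):
        for j in range(N):
            # C_ij = sum over k of A_ik * B_kj
            C[i][j] = sum(A[i][k] * B[k][j] for k in range(K))
    return C


A = [[1, 2, 3],
     [4, 5, 6]]      # 2 x 3
B = [[1, 0],
     [0, 1],
     [2, 2]]         # 3 x 2
C = matmul(A, B)     # 2 x 2
print(C)
```

Passing a 10 × 4 matrix and a 6 × 7 matrix to this sketch raises a `ValueError`, mirroring the rule stated above. With NumPy arrays, the same product is usually written `A @ B`.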

The inner product as matrix multiplication

In fact, if we think of a <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>d</mml:mi></mml:math>-dimensional column vector <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math> as a <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>d</mml:mi><mml:mo>×</mml:mo><mml:mn>1</mml:mn></mml:math> matrix, and its transpose, <math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msup><munder><mi>a</mi><mo stretchy="true">_</mo></munder><mi mathvariant="normal">⊤</mi></msup></mrow></math>, which is a <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>d</mml:mi></mml:math>-dimensional row vector, as a <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mn>1</mml:mn><mml:mo>×</mml:mo><mml:mi>d</mml:mi></mml:math> matrix, then multiplying those two matrices together as <math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mrow><msup><munder><mi>a</mi><mo stretchy="true">_</mo></munder><mi mathvariant="normal">⊤</mi></msup><munder><mi>a</mi><mo stretchy="true">_</mo></munder></mrow></mrow></math> gives a <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mn>1</mml:mn><mml:mo>×</mml:mo><mml:mn>1</mml:mn></mml:math> result, or in other words, a scalar. The value of that scalar is calculated using the right-hand side of Eq. 10 for matrix multiplication and gives the same calculation as Eq. 2 for the inner product. In other words, we have the following:

<math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><mrow><mrow><msup><munder><mi>a</mi><mo stretchy="true">_</mo></munder><mi mathvariant="normal">⊤</mi></msup><munder><mi>a</mi><mo stretchy="true">_</mo></munder><mo>=</mo><munder><mi>a</mi><mo stretchy="true">_</mo></munder><mo>⋅</mo><munder><mi>a</mi><mo stretchy="true">_</mo></munder></mrow></mrow></math>

Eq. 12

More generally, if we have two vectors <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math> and <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:mi>b</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math> of the same length, then <math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mrow><msup><munder><mi>a</mi><mo stretchy="true">_</mo></munder><mi mathvariant="normal">⊤</mi></msup><munder><mi>b</mi><mo stretchy="true">_</mo></munder><mo>=</mo><munder><mi>a</mi><mo stretchy="true">_</mo></munder><mo>⋅</mo><munder><mi>b</mi><mo stretchy="true">_</mo></munder></mrow></mrow></math>. Let’s look at the last formula more schematically, as shown in Figure 3.2:

Figure 3.2: Matrix multiplication of a row vector and column vector is the same as the inner product


The left-hand side of the schematic equation in Figure 3.2 is a matrix multiplication, but if we calculate that matrix multiplication by hand, we get the expression on the right-hand side of the figure, which is just the inner product <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder><mml:mo>⋅</mml:mo><mml:munder underaccent="false"><mml:mrow><mml:mi>b</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math>.
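
To make this concrete, here is a small sketch that treats a row vector as a 1 × d matrix and a column vector as a d × 1 matrix, multiplies them with Eq. 10, and compares the single resulting element with the inner product computed directly. The helper `matmul` and the example vectors are assumptions made for illustration only.

```python
def matmul(A, B):
    """Matrix product of two lists-of-rows, following Eq. 10."""
    K = len(B)
    return [[sum(A[i][k] * B[k][j] for k in range(K))
             for j in range(len(B[0]))]
            for i in range(len(A))]


a = [1, 2, 3]
b = [4, 5, 6]

a_row = [a]                  # a^T as a 1 x d matrix
b_col = [[x] for x in b]     # b as a d x 1 matrix

product = matmul(a_row, b_col)                 # a 1 x 1 matrix
inner = sum(x * y for x, y in zip(a, b))       # the inner product a . b

print(product, inner)
```

The only element of the 1 × 1 product is the same number as the inner product, which is the statement made in Figure 3.2.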

Matrix multiplication as a series of inner products

We can extend this connection between inner products and matrix multiplication by looking again at the right-hand side of Eq. 10. The matrix element <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:msub><mml:mrow><mml:mi>C</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:math> looks like an inner product between a vector formed from the <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:msup><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi><mml:mi>h</mml:mi></mml:mrow></mml:msup></mml:math> row of matrix <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math> and the <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:msup><mml:mrow><mml:mi>j</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi><mml:mi>h</mml:mi></mml:mrow></mml:msup></mml:math> column of matrix <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>B</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math>. 
That means we can represent the matrix multiplication <math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mrow><munder><munder><mi>A</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><munder><munder><mi>B</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder></mrow></mrow></math> schematically as follows:

Figure 3.3: Matrix elements resulting from a matrix multiplication can be viewed as inner product calculations


In fact, this schematic is how I remember how to do matrix multiplication, not the dry formula given in Eq. 10.
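
The row-times-column picture translates almost directly into code. The sketch below, written in plain Python with illustrative names (`dot`, `matmul_by_inner_products`) and example matrices of our own choosing, builds each element of the product as the inner product of a row of the first matrix with a column of the second.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(u, v))


def matmul_by_inner_products(A, B):
    """Element (i, j) is the inner product of row i of A and column j of B."""
    columns_of_B = list(zip(*B))   # transpose B so its columns are easy to iterate over
    return [[dot(row, col) for col in columns_of_B] for row in A]


A = [[1, 2],
     [3, 4]]
B = [[5, 6],
     [7, 8]]
C = matmul_by_inner_products(A, B)
print(C)
```

For instance, the top-left element is the inner product of the first row of `A`, (1, 2), with the first column of `B`, (5, 7), which is 1 × 5 + 2 × 7 = 19.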

Matrix multiplication is not commutative

The matrix multiplication <math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mrow><munder><munder><mi>A</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><munder><munder><mi>B</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder></mrow></mrow></math> is, of course, the matrix counterpart of the ordinary multiplication of real numbers that we are familiar with from school. Even the notation <math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mrow><munder><munder><mi>A</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><munder><munder><mi>B</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder></mrow></mrow></math> gives the impression that matrix multiplication will follow the same rules and patterns as ordinary multiplication. This is not the case. There are some subtleties and nuances with matrix multiplication. One of these subtleties to be aware of is that the order of the matrices matters. 
In general, for two different matrices <math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><munder><munder><mi>A</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder></mrow></math> and <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>B</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math>, we have <math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mrow><munder><munder><mi>A</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><munder><munder><mi>B</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><mo>≠</mo><munder><munder><mi>B</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><munder><munder><mi>A</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder></mrow></mrow></math>. We say that matrix multiplication is not commutative.

To see this more concretely, let’s take an explicit example. Consider these two matrices:

<math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><mrow><mrow><munder><munder><mi>A</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><mo>=</mo><mfenced open="(" close=")"><mtable columnspacing="0.8000em" columnwidth="auto auto" columnalign="center center" rowspacing="1.0000ex" rowalign="baseline baseline"><mtr><mtd><mn>1</mn></mtd><mtd><mn>4</mn></mtd></mtr><mtr><mtd><mn>0</mn></mtd><mtd><mrow><mo>−</mo><mn>2</mn></mrow></mtd></mtr></mtable></mfenced><mo>,</mo><mspace width="1em" /><munder><munder><mi>B</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><mo>=</mo><mfenced open="(" close=")"><mtable columnspacing="0.8000em" columnwidth="auto auto" columnalign="center center" rowspacing="1.0000ex" rowalign="baseline baseline"><mtr><mtd><mn>2</mn></mtd><mtd><mn>1</mn></mtd></mtr><mtr><mtd><mn>1</mn></mtd><mtd><mn>1</mn></mtd></mtr></mtable></mfenced></mrow></mrow></math>

Eq. 13

You can confirm for yourself, by doing the matrix multiplications by hand, that the following apply:

<math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><mrow><mrow><munder><munder><mi>A</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><munder><munder><mi>B</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><mo>=</mo><mfenced open="(" close=")"><mtable columnspacing="0.8000em" columnwidth="auto auto" columnalign="center center" rowspacing="1.0000ex" rowalign="baseline baseline"><mtr><mtd><mn>6</mn></mtd><mtd><mn>5</mn></mtd></mtr><mtr><mtd><mrow><mo>−</mo><mn>2</mn></mrow></mtd><mtd><mrow><mo>−</mo><mn>2</mn></mrow></mtd></mtr></mtable></mfenced><mo>,</mo><mspace width="1em" /><munder><munder><mi>B</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><munder><munder><mi>A</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><mo>=</mo><mfenced open="(" close=")"><mtable columnspacing="0.8000em" columnwidth="auto auto" columnalign="center center" rowspacing="1.0000ex" rowalign="baseline baseline"><mtr><mtd><mn>2</mn></mtd><mtd><mn>6</mn></mtd></mtr><mtr><mtd><mn>1</mn></mtd><mtd><mn>2</mn></mtd></mtr></mtable></mfenced></mrow></mrow></math>

Eq. 14

Obviously, there are special cases where matrix multiplication does commute, for example, the trivial case when <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder><mml:mo>=</mml:mo><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>B</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math>, in which case <math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mrow><munder><munder><mi>A</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><munder><munder><mi>B</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><mo>=</mo><munder><munder><mi>B</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><munder><munder><mi>A</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><mo>=</mo><msup><munder><munder><mi>A</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><mn>2</mn></msup></mrow></mrow></math>. 
There are also cases when <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math> and <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>B</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math> are different and yet we have <math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mrow><munder><munder><mi>A</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><munder><munder><mi>B</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><mo>=</mo><munder><munder><mi>B</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><munder><munder><mi>A</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder></mrow></mrow></math>. 
These special cases require extra conditions on the properties of the matrices <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math> and <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>B</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math>, but for now, we can say that in general, <math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mrow><munder><munder><mi>A</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><munder><munder><mi>B</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><mo>≠</mo><munder><munder><mi>B</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><munder><munder><mi>A</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder></mrow></mrow></math>, so be careful not to assume that two matrices commute.
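
If you would rather check Eq. 14 in code than by hand, the following sketch multiplies the matrices from Eq. 13 in both orders. The helper `matmul` is an illustrative plain-Python implementation of Eq. 10, and the pair of diagonal matrices at the end is our own example of two different matrices that happen to commute; it is not taken from the text.

```python
def matmul(A, B):
    """Matrix product of two lists-of-rows, following Eq. 10."""
    K = len(B)
    return [[sum(A[i][k] * B[k][j] for k in range(K))
             for j in range(len(B[0]))]
            for i in range(len(A))]


# The matrices from Eq. 13
A = [[1, 4],
     [0, -2]]
B = [[2, 1],
     [1, 1]]

AB = matmul(A, B)
BA = matmul(B, A)
print(AB, BA, AB == BA)

# Assumed example of a commuting pair: two different diagonal matrices
D1 = [[2, 0],
      [0, 3]]
D2 = [[5, 0],
      [0, 7]]
print(matmul(D1, D2) == matmul(D2, D1))
```

The two products of `A` and `B` differ, in agreement with Eq. 14, while the two diagonal matrices give the same product in either order.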

The outer product as a matrix multiplication

Just as we showed that the inner product between two vectors could be written as a matrix multiplication, we can do the same for calculating the outer product between two vectors. If we have an <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>M</mml:mi></mml:math>-component real-valued column vector <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math>, we can think of it as an <math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mrow><mi>M</mi><mo>×</mo><mn>1</mn></mrow></mrow></math> matrix. Likewise, if we have an <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>N</mml:mi></mml:math>-component real-valued row vector <math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msup><munder><mi>b</mi><mo stretchy="true">_</mo></munder><mi mathvariant="normal">⊤</mi></msup></mrow></math>, then we can think of it as a <math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mrow><mn>1</mn><mo>×</mo><mi>N</mi></mrow></mrow></math> matrix. We can then multiply these two matrices together to get <math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mrow><munder><mi>a</mi><mo stretchy="true">_</mo></munder><mspace width="0.25em" /><msup><munder><mi>b</mi><mo stretchy="true">_</mo></munder><mi mathvariant="normal">⊤</mi></msup></mrow></mrow></math>. 
From the rules of matrix multiplication, this is an <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>M</mml:mi><mml:mo>×</mml:mo><mml:mi>N</mml:mi></mml:math> matrix whose <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:math> matrix element is given by <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:msub><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi>b</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:math>, which is the same as we get when we calculate the outer product, <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder><mml:mo>⊗</mml:mo><mml:munder underaccent="false"><mml:mrow><mml:mi>b</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math>, between the vectors <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math> and <math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><munder><mi>b</mi><mo stretchy="true">_</mo></munder></mrow></math>. 
Because of this, we almost always use the more succinct notation <math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mrow><munder><mi>a</mi><mo stretchy="true">_</mo></munder><mspace width="0.25em" /><msup><munder><mi>b</mi><mo stretchy="true">_</mo></munder><mi mathvariant="normal">⊤</mi></msup></mrow></mrow></math> to denote the outer product <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder><mml:mo>⊗</mml:mo><mml:munder underaccent="false"><mml:mrow><mml:mi>b</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math> when <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math> and <math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><munder><mi>b</mi><mo stretchy="true">_</mo></munder></mrow></math> are real-valued. Schematically, we have the following:

<math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><mrow><mrow><munder><mi>a</mi><mo stretchy="true">_</mo></munder><mspace width="0.25em" /><msup><munder><mi>b</mi><mo stretchy="true">_</mo></munder><mi mathvariant="normal">⊤</mi></msup><mspace width="0.25em" /><mo>=</mo><mspace width="0.25em" /><mfenced open="(" close=")"><mtable columnwidth="auto" columnalign="center" rowspacing="1.0000ex 1.0000ex" rowalign="baseline baseline baseline"><mtr><mtd><msub><mi>a</mi><mn>1</mn></msub></mtd></mtr><mtr><mtd><mo>⋮</mo></mtd></mtr><mtr><mtd><msub><mi>a</mi><mi>M</mi></msub></mtd></mtr></mtable></mfenced><mspace width="0.25em" /><mo>×</mo><mspace width="0.25em" /><mfenced open="(" close=")"><mtable columnspacing="0.8000em 0.8000em" columnwidth="auto auto auto" columnalign="center center center" rowalign="baseline"><mtr><mtd><msub><mi>b</mi><mn>1</mn></msub></mtd><mtd><mo>…</mo></mtd><mtd><msub><mi>b</mi><mi>N</mi></msub></mtd></mtr></mtable></mfenced><mo>=</mo><mspace width="0.25em" /><mfenced open="(" close=")"><mtable columnspacing="0.8000em 0.8000em" columnwidth="auto auto auto" columnalign="center center center" rowspacing="1.0000ex 1.0000ex" rowalign="baseline baseline baseline"><mtr><mtd><mrow><msub><mi>a</mi><mn>1</mn></msub><msub><mi>b</mi><mn>1</mn></msub></mrow></mtd><mtd><mo>⋯</mo></mtd><mtd><mrow><msub><mi>a</mi><mn>1</mn></msub><msub><mi>b</mi><mi>N</mi></msub></mrow></mtd></mtr><mtr><mtd><mo>⋮</mo></mtd><mtd><mo>⋱</mo></mtd><mtd><mo>⋮</mo></mtd></mtr><mtr><mtd><mrow><msub><mi>a</mi><mi>M</mi></msub><msub><mi>b</mi><mn>1</mn></msub></mrow></mtd><mtd><mo>⋯</mo></mtd><mtd><mrow><msub><mi>a</mi><mi>M</mi></msub><msub><mi>b</mi><mi>N</mi></msub></mrow></mtd></mtr></mtable></mfenced><mo>=</mo><munder><mi>a</mi><mo stretchy="true">_</mo></munder><mo>⊗</mo><munder><mi>b</mi><mo stretchy="true">_</mo></munder></mrow></mrow></math>

Eq. 15

You’ll recall that we could also calculate the outer product, <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:mi>b</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder><mml:mo>⊗</mml:mo><mml:munder underaccent="false"><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math>, from the vectors <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math> and <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:mi>b</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math>. You will have guessed that we can also write this outer product as the matrix multiplication <math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mrow><munder><mi>b</mi><mo stretchy="true">_</mo></munder><msup><munder><mi>a</mi><mo stretchy="true">_</mo></munder><mi mathvariant="normal">⊤</mi></msup></mrow></mrow></math> when <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math> and <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:mi>b</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math> are real-valued. 
Again, this notation is more commonly used to represent the outer product, rather than <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:mi>b</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder><mml:mo>⊗</mml:mo><mml:munder underaccent="false"><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math>.
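
The sketch below illustrates both outer products for real-valued vectors, under the assumption that vectors are plain Python lists. The function name `outer_as_matmul` and the example vectors are ours. It builds the M × 1 and 1 × N matrices explicitly, applies Eq. 10, and compares the result with the element-wise definition from Eq. 15.

```python
def outer_as_matmul(a, b):
    """Compute a b^T by multiplying an M x 1 matrix by a 1 x N matrix."""
    a_col = [[x] for x in a]      # a as an M x 1 matrix
    b_row = [list(b)]             # b^T as a 1 x N matrix
    M, N = len(a_col), len(b_row[0])
    # The inner dimension is 1, so the sum in Eq. 10 has a single term.
    return [[sum(a_col[i][k] * b_row[k][j] for k in range(1))
             for j in range(N)]
            for i in range(M)]


a = [1, 2, 3]    # M = 3
b = [4, 5]       # N = 2

ab_T = outer_as_matmul(a, b)     # 3 x 2 matrix, the outer product of a with b
ba_T = outer_as_matmul(b, a)     # 2 x 3 matrix, the outer product of b with a

direct = [[a_i * b_j for b_j in b] for a_i in a]     # elements a_i * b_j
transposed = [list(col) for col in zip(*ab_T)]       # transpose of a b^T

print(ab_T == direct)
print(ba_T == transposed)
```

Note that the two outer products have different shapes when M and N differ: in this example, one is 3 × 2 and the other is 2 × 3, and each is the transpose of the other.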

Multiplying multiple matrices together

Once we know how to multiply two matrices together, it is a simple matter to multiply many matrices together – we simply take them two at a time. For example, if we have <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>N</mml:mi><mml:mo>×</mml:mo><mml:mi>N</mml:mi></mml:math> matrices <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math>, <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>B</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math>, and <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>C</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math>, then their product can be calculated via the following:

<math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><mrow><mrow><munder><munder><mi>A</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><mspace width="0.125em" /><munder><munder><mi>B</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><mspace width="0.125em" /><munder><munder><mi>C</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><mspace width="0.25em" /><mo>=</mo><mspace width="0.25em" /><munder><munder><mi>A</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><mo>×</mo><mfenced open="(" close=")"><mrow><munder><munder><mi>B</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><mspace width="0.125em" /><munder><munder><mi>C</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder></mrow></mfenced><mspace width="0.25em" /><mo>=</mo><mspace width="0.25em" /><mfenced open="(" close=")"><mrow><munder><munder><mi>A</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><mspace width="0.125em" /><munder><munder><mi>B</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder></mrow></mfenced><mo>×</mo><munder><munder><mi>C</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder></mrow></mrow></math>

Eq. 16

We can either multiply <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>B</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math> and <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>C</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math> together first and then multiply the result by <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math>, or we can multiply <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math> and <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>B</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math> together first and then use the result to multiply <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:munder 
underaccent="false"><mml:mrow><mml:mi>C</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math>. Either way, we get the same result. This means matrix multiplication is associative.
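As a quick numerical check of Eq. 16, the following sketch multiplies three small matrices in both groupings. It assumes NumPy is available, and the matrix values are arbitrary illustrative choices rather than anything taken from the text.

```python
import numpy as np

# Three arbitrary 2x2 integer matrices (illustrative values only).
A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [1, 0]])
C = np.array([[2, 0], [1, 3]])

bc_first = A @ (B @ C)  # multiply B and C first, then multiply the result by A
ab_first = (A @ B) @ C  # multiply A and B first, then use the result to multiply C

print(bc_first)
print(ab_first)
print(np.array_equal(bc_first, ab_first))  # both groupings give the same matrix
```

With integer entries the two results match exactly. For floating-point matrices, rounding can introduce tiny differences between the two groupings, so a tolerance-based comparison such as `np.allclose` is usually the safer check.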

Transforming a vector by matrix multiplication

So far, we have learned about vectors and matrices and their basic properties. We have also learned how to multiply matrices together. We have even seen how we can consider a vector as a special kind of matrix. This immediately raises the question of what happens if we multiply a vector by a matrix – what do we get and what does that multiplication represent?

Consider an <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>M</mml:mi><mml:mo>×</mml:mo><mml:mi>N</mml:mi></mml:math> matrix <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math> and an <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>N</mml:mi></mml:math>-component column vector <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:mi>b</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math>. As we can think of the vector <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:mi>b</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math> as an <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>N</mml:mi><mml:mo>×</mml:mo><mml:mn>1</mml:mn></mml:math> matrix, we can clearly multiply <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:mi>b</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math> by <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:munder 
underaccent="false"><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math> using the rules of matrix multiplication. In fact, we get the following:

<math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><mrow><mrow><munder><munder><mi>A</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><munder><mi>b</mi><mo stretchy="true">_</mo></munder><mo>=</mo><mfenced open="(" close=")"><mtable columnspacing="0.8000em 0.8000em" columnwidth="auto auto auto" columnalign="center center center" rowspacing="1.0000ex 1.0000ex" rowalign="baseline baseline baseline"><mtr><mtd><msub><mi>A</mi><mn>11</mn></msub></mtd><mtd><mo>⋯</mo></mtd><mtd><msub><mi>A</mi><mrow><mn>1</mn><mi>N</mi></mrow></msub></mtd></mtr><mtr><mtd><mo>⋮</mo></mtd><mtd><mo>⋱</mo></mtd><mtd><mo>⋮</mo></mtd></mtr><mtr><mtd><msub><mi>A</mi><mrow><mi>M</mi><mn>1</mn></mrow></msub></mtd><mtd><mo>⋯</mo></mtd><mtd><msub><mi>A</mi><mrow><mi>M</mi><mi>N</mi></mrow></msub></mtd></mtr></mtable></mfenced><mo>×</mo><mfenced open="(" close=")"><mtable columnwidth="auto" columnalign="center" rowspacing="1.0000ex 1.0000ex" rowalign="baseline baseline baseline"><mtr><mtd><msub><mi>b</mi><mn>1</mn></msub></mtd></mtr><mtr><mtd><mo>⋮</mo></mtd></mtr><mtr><mtd><msub><mi>b</mi><mi>N</mi></msub></mtd></mtr></mtable></mfenced><mo>=</mo><mfenced open="(" close=")"><mtable columnwidth="auto" columnalign="center" rowspacing="1.0000ex 1.0000ex" rowalign="baseline baseline 
baseline"><mtr><mtd><mrow><msub><mi>A</mi><mn>11</mn></msub><msub><mi>b</mi><mn>1</mn></msub><mo>+</mo><msub><mi>A</mi><mn>12</mn></msub><msub><mi>b</mi><mn>2</mn></msub><mo>+</mo><mo>⋯</mo><mo>+</mo><msub><mi>A</mi><mrow><mn>1</mn><mi>N</mi></mrow></msub><msub><mi>b</mi><mi>N</mi></msub></mrow></mtd></mtr><mtr><mtd><mo>⋮</mo></mtd></mtr><mtr><mtd><mrow><msub><mi>A</mi><mrow><mi>M</mi><mn>1</mn></mrow></msub><msub><mi>b</mi><mn>1</mn></msub><mo>+</mo><msub><mi>A</mi><mrow><mi>M</mi><mn>2</mn></mrow></msub><msub><mi>b</mi><mn>2</mn></msub><mo>+</mo><mo>⋯</mo><mo>+</mo><msub><mi>A</mi><mrow><mi>M</mi><mi>N</mi></mrow></msub><msub><mi>b</mi><mi>N</mi></msub></mrow></mtd></mtr></mtable></mfenced></mrow></mrow></math>

Eq. 17

The result is an <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>M</mml:mi><mml:mo>×</mml:mo><mml:mn>1</mml:mn></mml:math> matrix, that is, an <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>M</mml:mi></mml:math>-component column vector. So, multiplying a vector by a matrix gives us another vector, whose components are given by the expressions inside the brackets on the right-hand side of Eq. 17. These components are (in general) different from those of vector <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:mi>b</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math>, so the effect of multiplying a vector by a matrix is to transform that vector. From this, we can conclude that matrices represent transformations.

If we look more closely at the individual expressions in the vector on the right-hand side of Eq. 17, we can see that each component in the new vector is a linear combination of the components in the old vector <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:mi>b</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math>. So, the matrix <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math> represents a linear transformation. The individual matrix elements <math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mrow><msub><mi>A</mi><mn>11</mn></msub><mo>,</mo><msub><mi>A</mi><mn>12</mn></msub></mrow></mrow></math>, and so on tell us the weights in those linear combinations that give us the components of the new vector. In other words, the individual matrix elements encode the details of the linear transformation.
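To make the linear combinations in Eq. 17 concrete, here is a minimal sketch, assuming NumPy, that compares the built-in matrix-vector product with the weighted sums written out explicitly. The values of `A` and `b` are illustrative assumptions, and `by_hand` is a name introduced only for this example.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])   # M = 2 rows, N = 3 columns
b = np.array([1, 0, -1])    # N = 3 components

# Matrix-vector product computed by NumPy.
product = A @ b

# The same product written as in Eq. 17: component i of the result is
# A[i, 0]*b[0] + A[i, 1]*b[1] + ... + A[i, N-1]*b[N-1].
M, N = A.shape
by_hand = np.array([sum(A[i, j] * b[j] for j in range(N)) for i in range(M)])

print(product)
print(by_hand)
```

Note that NumPy indexes from 0, whereas the equations in the text index from 1, and that a one-dimensional array is used here to play the role of the column vector.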

One thing we haven’t spoken about yet is what effect the relative sizes of <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>M</mml:mi></mml:math> and <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>N</mml:mi></mml:math> have. If <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>M</mml:mi><mml:mo>=</mml:mo><mml:mi>N</mml:mi></mml:math>, then multiplying an <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>N</mml:mi></mml:math>-component vector <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:mi>b</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math> by the <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>N</mml:mi><mml:mo>×</mml:mo><mml:mi>N</mml:mi></mml:math> matrix <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math> gives us another <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>N</mml:mi></mml:math>-component vector. Although we have transformed the vector, we have, in this case, stayed within the same <math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>N</mi></mrow></math>-dimensional space. 
However, if <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>M</mml:mi><mml:mo>&lt;</mml:mo><mml:mi>N</mml:mi></mml:math>, then our new vector has fewer components than the starting vector <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:mi>b</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math>, and so we have reduced the dimensionality. Alternatively, if <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>M</mml:mi><mml:mo>&gt;</mml:mo><mml:mi>N</mml:mi></mml:math>, our new vector has more components than we started with, and we have increased the dimensionality.
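The effect of the relative sizes of M and N can be seen by inspecting the shape of the output. The sketch below assumes NumPy and uses matrices filled with ones purely as placeholders. Only the shapes matter here, and the dictionary name `shapes` is introduced just for this illustration.

```python
import numpy as np

N = 3
b = np.array([1, 2, 3])  # an N-component vector

# Record the shape of A @ b for M < N, M = N, and M > N.
shapes = {}
for M in (2, 3, 4):
    A = np.ones((M, N), dtype=int)  # placeholder M x N matrix
    shapes[M] = (A @ b).shape

print(shapes)  # {2: (2,), 3: (3,), 4: (4,)}
```

The output vector always has M components: fewer than the input when M is 2, the same number when M is 3, and more when M is 4.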

In all the previous examples, we have been multiplying a column vector by a matrix. But we can equally multiply a matrix by a row vector. Let’s stick with our vector <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:mi>b</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math>, but use its row vector form <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:msup><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>b</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mrow><mml:mi mathvariant="normal">⊤</mml:mi></mml:mrow></mml:msup></mml:math>. Now we can think of <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:msup><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>b</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mrow><mml:mi mathvariant="normal">⊤</mml:mi></mml:mrow></mml:msup></mml:math> as a <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mn>1</mml:mn><mml:mo>×</mml:mo><mml:mi>N</mml:mi></mml:math> matrix. 
So, if we have an <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>N</mml:mi><mml:mo>×</mml:mo><mml:mi>M</mml:mi></mml:math> matrix <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>C</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math>, then we can perform the matrix multiplication <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:msup><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>b</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mrow><mml:mi mathvariant="normal">⊤</mml:mi></mml:mrow></mml:msup><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>C</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math>, and we get a <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mn>1</mml:mn><mml:mo>×</mml:mo><mml:mi>M</mml:mi></mml:math> matrix out of it, that is, an <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>M</mml:mi></mml:math>-component row vector. 
As you might expect, this new <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>M</mml:mi></mml:math>-component vector is just a linear transformation of our starting <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>N</mml:mi></mml:math>-component vector <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:msup><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>b</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mrow><mml:mi mathvariant="normal">⊤</mml:mi></mml:mrow></mml:msup></mml:math>, with the details of the linear transformation encoded in the matrix elements <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:msub><mml:mrow><mml:mi>C</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:math>.
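The row-vector case can be sketched in the same way. The example below assumes NumPy and stores the row vector explicitly as a 1 × N two-dimensional array so that the shapes mirror the text. The entries of `b_row` and `C` are illustrative assumptions.

```python
import numpy as np

b = np.array([1, 2, 3])
b_row = b.reshape(1, -1)   # row vector form, stored as a 1 x N matrix (N = 3)

C = np.array([[1, 0],
              [0, 1],
              [1, 1]])     # N x M matrix with N = 3, M = 2

# Row vector on the left, matrix on the right: (1 x N) @ (N x M) -> (1 x M)
row_result = b_row @ C

print(row_result)        # [[4 5]]
print(row_result.shape)  # (1, 2)
```

Each entry of the result is a weighted sum of the components of the row vector, with the weights taken from the corresponding column of `C`.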

Finally, we should highlight that because matrix multiplication is a linear transformation, applying a matrix <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math> to a sum of vectors gives the same result as summing the results of applying <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math> to each vector individually. In more detail, we have the following:

<math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><mrow><mrow><munder><munder><mi>A</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><mo>×</mo><mfenced open="(" close=")"><mrow><msub><munder><mi>b</mi><mo stretchy="true">_</mo></munder><mn>1</mn></msub><mo>+</mo><msub><munder><mi>b</mi><mo stretchy="true">_</mo></munder><mn>2</mn></msub><mo>+</mo><mo>⋯</mo><mo>+</mo><msub><munder><mi>b</mi><mo stretchy="true">_</mo></munder><mi>K</mi></msub></mrow></mfenced><mo>=</mo><munder><munder><mi>A</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><msub><munder><mi>b</mi><mo stretchy="true">_</mo></munder><mn>1</mn></msub><mo>+</mo><munder><munder><mi>A</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><msub><munder><mi>b</mi><mo stretchy="true">_</mo></munder><mn>2</mn></msub><mo>+</mo><mo>⋯</mo><mo>+</mo><munder><munder><mi>A</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><msub><munder><mi>b</mi><mo stretchy="true">_</mo></munder><mi>K</mi></msub></mrow></mrow></math>

Eq. 18

We will make use of this fact shortly.
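Eq. 18 is easy to check numerically. The following is a minimal sketch, assuming NumPy, with K = 3 arbitrarily chosen vectors. The names `vectors`, `lhs`, and `rhs` are introduced only for this illustration.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])

# K = 3 illustrative two-component vectors b_1, b_2, b_3.
vectors = [np.array([1, 0]), np.array([0, 1]), np.array([2, -1])]

# Left-hand side of Eq. 18: add the vectors first, then apply A.
lhs = A @ sum(vectors)

# Right-hand side of Eq. 18: apply A to each vector, then add the results.
rhs = sum(A @ v for v in vectors)

print(lhs)  # [3 9]
print(rhs)  # [3 9]
```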

The identity matrix

Now that we have learned that matrix multiplication represents the linear transformation of vectors, let’s look at some special cases of transformations. Consider the <math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mrow><mi>N</mi><mo>×</mo><mi>N</mi></mrow></mrow></math> matrix <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:msub><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>I</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mrow><mml:mi>N</mml:mi></mml:mrow></mml:msub></mml:math> given here:

<math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><mrow><mrow><msub><munder><munder><mi>I</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><mi>N</mi></msub><mspace width="0.25em" /><mo>=</mo><mspace width="0.25em" /><mfenced open="(" close=")"><mtable columnspacing="0.8000em 0.8000em 0.8000em 0.8000em" columnwidth="auto auto auto auto auto" columnalign="center center center center center" rowspacing="1.0000ex 1.0000ex 1.0000ex 1.0000ex" rowalign="baseline baseline baseline baseline baseline"><mtr><mtd><mn>1</mn></mtd><mtd><mn>0</mn></mtd><mtd><mn>0</mn></mtd><mtd><mo>…</mo></mtd><mtd><mn>0</mn></mtd></mtr><mtr><mtd><mn>0</mn></mtd><mtd><mn>1</mn></mtd><mtd><mn>0</mn></mtd><mtd><mo>…</mo></mtd><mtd><mn>0</mn></mtd></mtr><mtr><mtd><mo>⋮</mo></mtd><mtd><mo>⋮</mo></mtd><mtd><mn>1</mn></mtd><mtd><mo>…</mo></mtd><mtd><mo>⋮</mo></mtd></mtr><mtr><mtd><mn>0</mn></mtd><mtd><mn>0</mn></mtd><mtd><mo>…</mo></mtd><mtd><mo>⋱</mo></mtd><mtd><mn>0</mn></mtd></mtr><mtr><mtd><mn>0</mn></mtd><mtd><mn>0</mn></mtd><mtd><mn>0</mn></mtd><mtd><mo>…</mo></mtd><mtd><mn>1</mn></mtd></mtr></mtable></mfenced></mrow></mrow></math>

Eq. 19

The matrix <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:msub><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>I</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mrow><mml:mi>N</mml:mi></mml:mrow></mml:msub></mml:math> has 1 for each matrix element along its diagonal and 0 everywhere else. Now, what is the effect of multiplying by <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:msub><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>I</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mrow><mml:mi>N</mml:mi></mml:mrow></mml:msub></mml:math>? Let’s try it. Consider an <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>N</mml:mi></mml:math>-component column vector <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math>. 
If we multiply <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math> by <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:msub><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>I</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mrow><mml:mi>N</mml:mi></mml:mrow></mml:msub></mml:math>, we get the result shown here:

<math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><mrow><mrow><msub><munder><munder><mi>I</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><mi>N</mi></msub><munder><mi>a</mi><mo stretchy="true">_</mo></munder><mo>=</mo><msub><munder><munder><mi>I</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><mi>N</mi></msub><mo>×</mo><mfenced open="(" close=")"><mtable columnwidth="auto" columnalign="center" rowspacing="1.0000ex 1.0000ex" rowalign="baseline baseline baseline"><mtr><mtd><msub><mi>a</mi><mn>1</mn></msub></mtd></mtr><mtr><mtd><mo>⋮</mo></mtd></mtr><mtr><mtd><msub><mi>a</mi><mi>N</mi></msub></mtd></mtr></mtable></mfenced><mo>=</mo><mfenced open="(" close=")"><mtable columnwidth="auto" columnalign="center" rowspacing="1.0000ex 1.0000ex" rowalign="baseline baseline baseline"><mtr><mtd><mrow><mn>1</mn><mo>×</mo><msub><mi>a</mi><mn>1</mn></msub><mo>+</mo><mn>0</mn><mo>×</mo><msub><mi>a</mi><mn>2</mn></msub><mo>+</mo><mo>⋯</mo><mo>+</mo><mn>0</mn><mo>×</mo><msub><mi>a</mi><mi>N</mi></msub></mrow></mtd></mtr><mtr><mtd><mo>⋮</mo></mtd></mtr><mtr><mtd><mrow><mn>0</mn><mo>×</mo><msub><mi>a</mi><mn>1</mn></msub><mo>+</mo><mn>0</mn><mo>×</mo><msub><mi>a</mi><mn>2</mn></msub><mo>+</mo><mo>⋯</mo><mo>+</mo><mn>1</mn><mo>×</mo><msub><mi>a</mi><mi>N</mi></msub></mrow></mtd></mtr></mtable></mfenced><mo>=</mo><mfenced open="(" close=")"><mtable columnwidth="auto" columnalign="center" rowspacing="1.0000ex 1.0000ex" rowalign="baseline baseline baseline"><mtr><mtd><msub><mi>a</mi><mn>1</mn></msub></mtd></mtr><mtr><mtd><mo>⋮</mo></mtd></mtr><mtr><mtd><msub><mi>a</mi><mi>N</mi></msub></mtd></mtr></mtable></mfenced><mo>=</mo><munder><mi>a</mi><mo stretchy="true">_</mo></munder></mrow></mrow></math>

Eq. 20

So, multiplying any vector <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math> by <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:msub><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>I</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mrow><mml:mi>N</mml:mi></mml:mrow></mml:msub></mml:math> just gives us back <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math> itself. We haven’t done anything to the starting vector. The transformation represented by <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:msub><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>I</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mrow><mml:mi>N</mml:mi></mml:mrow></mml:msub></mml:math> is just the identity transformation, which leaves vectors untouched. Hence, <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:msub><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>I</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mrow><mml:mi>N</mml:mi></mml:mrow></mml:msub></mml:math> is called the identity matrix. 
Or, more specifically, it is the <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>N</mml:mi></mml:math>-component identity matrix because it operates on <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>N</mml:mi></mml:math>-component vectors.
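The calculation in Eq. 20 can be reproduced in a few lines. This sketch assumes NumPy and uses `np.eye` to build the identity matrix. The choice N = 4 and the entries of `a` are illustrative.

```python
import numpy as np

N = 4
I_N = np.eye(N, dtype=int)     # 1 on the diagonal, 0 everywhere else
a = np.array([7, -2, 0, 5])    # an arbitrary N-component vector

result = I_N @ a

print(I_N)
print(result)                     # [ 7 -2  0  5]
print(np.array_equal(result, a))  # True: the vector is left untouched
```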

It is a simple matter to confirm, via a similar calculation to the previous one, that if we reverse the order of the calculation, so that we multiply <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:msub><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>I</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mrow><mml:mi>N</mml:mi></mml:mrow></mml:msub></mml:math> by a row vector <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:msup><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mrow><mml:mi mathvariant="normal">⊤</mml:mi></mml:mrow></mml:msup></mml:math>, we leave the row vector unchanged. In terms of math notation, we have the following:

<math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><mrow><mrow><msup><munder><mi>a</mi><mo stretchy="true">_</mo></munder><mi mathvariant="normal">⊤</mi></msup><msub><munder><munder><mi>I</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><mi>N</mi></msub><mo>=</mo><msup><munder><mi>a</mi><mo stretchy="true">_</mo></munder><mi mathvariant="normal">⊤</mi></msup></mrow></mrow></math>

Eq. 21

Now remember that when we explained matrix multiplication as a series of inner products, we learned that we could think of a matrix as a set of column vectors, so it is not surprising that when we multiply an <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>N</mml:mi><mml:mo>×</mml:mo><mml:mi>N</mml:mi></mml:math> matrix <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>B</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math> by <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:msub><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>I</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mrow><mml:mi>N</mml:mi></mml:mrow></mml:msub></mml:math>, we leave the matrix untouched. In terms of the math, we have the following:

<math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><mrow><mrow><msub><munder><munder><mi>I</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><mi>N</mi></msub><munder><munder><mi>B</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><mo>=</mo><munder><munder><mi>B</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder></mrow></mrow></math>

Eq. 22

Again, if we multiply them in the opposite order, we also leave the matrix <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>B</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math> unchanged, so in terms of the math, we have the following:

<math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><mrow><mrow><munder><munder><mi>B</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><msub><munder><munder><mi>I</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><mi>N</mi></msub><mo>=</mo><munder><munder><mi>B</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder></mrow></mrow></math>

Eq. 23
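Eqs. 21 to 23 can be checked together. The sketch below assumes NumPy, and the entries of `B` and `a_row` are arbitrary illustrative values.

```python
import numpy as np

I_3 = np.eye(3, dtype=int)

B = np.array([[2, -1, 0],
              [4,  3, 5],
              [1,  1, 1]])
a_row = np.array([[7, -2, 5]])   # row vector stored as a 1 x 3 matrix

row_result = a_row @ I_3   # Eq. 21: the row vector is unchanged
left = I_3 @ B             # Eq. 22: identity on the left
right = B @ I_3            # Eq. 23: identity on the right

print(np.array_equal(row_result, a_row))  # True
print(np.array_equal(left, B))            # True
print(np.array_equal(right, B))           # True
```

Matrix multiplication does not commute in general, so it is worth noting that the identity matrix is a special case where the order of multiplication makes no difference.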

The inverse matrix

If the identity matrix <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:msub><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>I</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mrow><mml:mi>N</mml:mi></mml:mrow></mml:msub></mml:math> leaves an <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>N</mml:mi><mml:mo>×</mml:mo><mml:mi>N</mml:mi></mml:math> matrix untouched, we can think of it as the matrix analog of multiplying a number by 1. For any number <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>a</mml:mi></mml:math> on the real number line, we have <math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mrow><mi>a</mi><mo>×</mo><mn>1</mn><mo>=</mo><mi>a</mi></mrow></mrow></math> and <math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mrow><mn>1</mn><mo>×</mo><mi>a</mi><mo>=</mo><mi>a</mi></mrow></mrow></math>. The number 1 here is called the identity element. 
For a nonzero number <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>a</mml:mi></mml:math>, we also have the concept of its reciprocal, <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:msup><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup></mml:math>, which is the number we multiply <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>a</mml:mi></mml:math> by to get the identity element, so that <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:msup><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup></mml:math> is defined by the following relationship:

<math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><mrow><mrow><msup><mi>a</mi><mrow><mo>−</mo><mn>1</mn></mrow></msup><mi>a</mi><mo>=</mo><mi>a</mi><msup><mi>a</mi><mrow><mo>−</mo><mn>1</mn></mrow></msup><mo>=</mo><mn>1</mn></mrow></mrow></math>

Eq. 24

For an <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>N</mml:mi><mml:mo>×</mml:mo><mml:mi>N</mml:mi></mml:math> matrix <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math>, we have an analogous concept – the inverse matrix of <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math>, which is denoted by the symbol <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:msup><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup></mml:math>. 
The matrix <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:msup><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup></mml:math> is an <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>N</mml:mi><mml:mo>×</mml:mo><mml:mi>N</mml:mi></mml:math> matrix and, as you might have guessed, is defined as the matrix we multiply <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math> by to get the identity element, the matrix <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:msub><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>I</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mrow><mml:mi>N</mml:mi></mml:mrow></mml:msub></mml:math> in this case. So, <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:msup><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup></mml:math> is defined by the following relationship:

<math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><mrow><mrow><msup><munder><munder><mi>A</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><mrow><mo>−</mo><mn>1</mn></mrow></msup><mspace width="0.125em" /><munder><munder><mi>A</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><mspace width="0.25em" /><mo>=</mo><munder><munder><mi>A</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><mspace width="0.125em" /><msup><munder><munder><mi>A</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><mrow><mo>−</mo><mn>1</mn></mrow></msup><mo>=</mo><msub><munder><munder><mi>I</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><mi>N</mi></msub></mrow></mrow></math>

Eq. 25

Conceptually, we can think of <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:msup><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup></mml:math> as playing a similar role and having similar properties to the reciprocal <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:msup><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup></mml:math> in ordinary arithmetic. Just like the reciprocal in ordinary arithmetic, the inverse matrix can be extremely useful in simplifying mathematical expressions by canceling other terms out.

Note that the inverse matrix is only defined for square matrices. Non-square matrices do not have a proper inverse. However, not all square matrices necessarily have an inverse. That is, there are some square matrices, <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math>, for which there are no solutions, <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:msup><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup></mml:math>, to the relation in Eq. 25. We will talk more about that later when we introduce eigen-decompositions of a square matrix.
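As a small illustration of a square matrix with no inverse, the following sketch asks NumPy to invert a matrix whose second row is twice its first. The matrix `S` is a hypothetical example chosen for this illustration, not one taken from the text.

```python
import numpy as np

# Hypothetical singular matrix: the second row is twice the first row,
# so there is no matrix that satisfies Eq. 25 for it.
S = np.array([[1.0, 2.0],
              [2.0, 4.0]])

try:
    np.linalg.inv(S)
    singular_detected = False
except np.linalg.LinAlgError:
    # NumPy signals a non-invertible (singular) matrix with LinAlgError
    singular_detected = True

print(singular_detected)

# The determinant of a singular matrix is zero (up to numerical precision)
print(np.linalg.det(S))
```

A non-square array passed to `np.linalg.inv` also raises `LinAlgError`, which matches the statement that only square matrices can have an inverse.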

More examples of matrices as transformations

Let’s look at another specific example of a matrix and understand its effect as a transformation. Consider the <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mn>2</mml:mn><mml:mo>×</mml:mo><mml:mn>2</mml:mn></mml:math> matrix here:

<math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><mrow><mrow><munder><munder><mi>A</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><mo>=</mo><mfenced open="(" close=")"><mtable columnspacing="0.8000em" columnwidth="auto auto" columnalign="center center" rowspacing="1.0000ex" rowalign="baseline baseline"><mtr><mtd><mfrac><mn>1</mn><msqrt><mn>2</mn></msqrt></mfrac></mtd><mtd><mrow><mo>−</mo><mfrac><mn>1</mn><msqrt><mn>2</mn></msqrt></mfrac></mrow></mtd></mtr><mtr><mtd><mfrac><mn>1</mn><msqrt><mn>2</mn></msqrt></mfrac></mtd><mtd><mfrac><mn>1</mn><msqrt><mn>2</mn></msqrt></mfrac></mtd></mtr></mtable></mfenced></mrow></mrow></math>

Eq. 26

Matrix <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math> operates on two-component vectors that live in a two-dimensional plane, which we can think of as the usual <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>y</mml:mi><mml:mo>)</mml:mo></mml:math> plane. What transformation does this matrix represent? To find out, let’s look at the effect of <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math> on a specific vector. We’ll start with the unit vector that points along the <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>x</mml:mi></mml:math> axis. In column vector form, this vector is as follows:

<math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><mrow><mfenced open="(" close=")"><mfrac linethickness="0pt"><mn>1</mn><mn>0</mn></mfrac></mfenced></mrow></math>

Eq. 27

All other vectors representing points on the <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>x</mml:mi></mml:math> axis are just multiples of the vector in Eq. 27. Now, what is the effect of <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math> on this vector? It is easy to compute, and we find the following:

<math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><mrow><mrow><munder><munder><mi>A</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><mfenced open="(" close=")"><mfrac linethickness="0pt"><mn>1</mn><mn>0</mn></mfrac></mfenced><mo>=</mo><mfenced open="(" close=")"><mtable columnspacing="0.8000em" columnwidth="auto auto" columnalign="center center" rowspacing="1.0000ex" rowalign="baseline baseline"><mtr><mtd><mfrac><mn>1</mn><msqrt><mn>2</mn></msqrt></mfrac></mtd><mtd><mrow><mo>−</mo><mfrac><mn>1</mn><msqrt><mn>2</mn></msqrt></mfrac></mrow></mtd></mtr><mtr><mtd><mfrac><mn>1</mn><msqrt><mn>2</mn></msqrt></mfrac></mtd><mtd><mfrac><mn>1</mn><msqrt><mn>2</mn></msqrt></mfrac></mtd></mtr></mtable></mfenced><mo>×</mo><mfenced open="(" close=")"><mfrac linethickness="0pt"><mn>1</mn><mn>0</mn></mfrac></mfenced><mo>=</mo><mfenced open="(" close=")"><mfrac linethickness="0pt"><mstyle scriptlevel="+1"><mfrac><mn>1</mn><msqrt><mn>2</mn></msqrt></mfrac></mstyle><mstyle scriptlevel="+1"><mfrac><mn>1</mn><msqrt><mn>2</mn></msqrt></mfrac></mstyle></mfrac></mfenced></mrow></mrow></math>

Eq. 28

The new vector on the right-hand side of Eq. 28 represents a point in the <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>y</mml:mi><mml:mo>)</mml:mo></mml:math> plane that has identical and positive <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>x</mml:mi></mml:math> and <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>y</mml:mi></mml:math> components. In other words, it represents a 45° anti-clockwise rotation of our starting point, which was on the <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>x</mml:mi></mml:math> axis.

Let’s look at the effect of <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math> on another vector. This time we’re going to choose a vector that represents a point on the <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>y</mml:mi></mml:math> axis. In column vector form, this vector is as follows:

<math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><mrow><mfenced open="(" close=")"><mfrac linethickness="0pt"><mn>0</mn><mn>1</mn></mfrac></mfenced></mrow></math>

Eq. 29

All other vectors representing points on the <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>y</mml:mi></mml:math> axis are just multiples of the vector in Eq. 29. The effect of <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math> on this vector is as follows:

<math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><mrow><mrow><munder><munder><mi>A</mi><mo stretchy="true">_</mo></munder><mo stretchy="true">_</mo></munder><mspace width="0.25em" /><mfenced open="(" close=")"><mfrac linethickness="0pt"><mn>0</mn><mn>1</mn></mfrac></mfenced><mo>=</mo><mfenced open="(" close=")"><mtable columnspacing="0.8000em" columnwidth="auto auto" columnalign="center center" rowspacing="1.0000ex" rowalign="baseline baseline"><mtr><mtd><mfrac><mn>1</mn><msqrt><mn>2</mn></msqrt></mfrac></mtd><mtd><mrow><mo>−</mo><mfrac><mn>1</mn><msqrt><mn>2</mn></msqrt></mfrac></mrow></mtd></mtr><mtr><mtd><mfrac><mn>1</mn><msqrt><mn>2</mn></msqrt></mfrac></mtd><mtd><mfrac><mn>1</mn><msqrt><mn>2</mn></msqrt></mfrac></mtd></mtr></mtable></mfenced><mo>×</mo><mfenced open="(" close=")"><mfrac linethickness="0pt"><mn>0</mn><mn>1</mn></mfrac></mfenced><mo>=</mo><mfenced open="(" close=")"><mfrac linethickness="0pt"><mrow><mo>−</mo><mstyle scriptlevel="+1"><mfrac><mn>1</mn><msqrt><mn>2</mn></msqrt></mfrac></mstyle></mrow><mstyle scriptlevel="+1"><mfrac><mn>1</mn><msqrt><mn>2</mn></msqrt></mfrac></mstyle></mfrac></mfenced></mrow></mrow></math>

Eq. 30

The new vector on the right-hand side of Eq. 30 represents a point in the second quadrant of the (x, y) plane, and again represents a 45° anti-clockwise rotation of our starting point on the <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>y</mml:mi></mml:math> axis. The effect of matrix <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math> on the vectors <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mfenced separators="|"><mml:mrow><mml:mtable><mml:mtr><mml:mtd><mml:mn>1</mml:mn></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mn>0</mml:mn></mml:mtd></mml:mtr></mml:mtable></mml:mrow></mml:mfenced></mml:math> and <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mfenced separators="|"><mml:mrow><mml:mtable><mml:mtr><mml:mtd><mml:mn>0</mml:mn></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mn>1</mml:mn></mml:mtd></mml:mtr></mml:mtable></mml:mrow></mml:mfenced></mml:math> is illustrated schematically in Figure 3.4:

Figure 3.4: Schematic illustration of the effect of matrix A

Now, any two-dimensional vector can be written as a sum of the two vectors we have just studied. To show this, consider the following:

<math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><mrow><mrow><mfenced open="(" close=")"><mfrac linethickness="0pt"><mi>x</mi><mi>y</mi></mfrac></mfenced><mo>=</mo><mfenced open="(" close=")"><mfrac linethickness="0pt"><mi>x</mi><mn>0</mn></mfrac></mfenced><mo>+</mo><mfenced open="(" close=")"><mfrac linethickness="0pt"><mn>0</mn><mi>y</mi></mfrac></mfenced><mo>=</mo><mi>x</mi><mfenced open="(" close=")"><mfrac linethickness="0pt"><mn>1</mn><mn>0</mn></mfrac></mfenced><mo>+</mo><mi>y</mi><mfenced open="(" close=")"><mfrac linethickness="0pt"><mn>0</mn><mn>1</mn></mfrac></mfenced></mrow></mrow></math>

Eq. 31

We have seen that the effect of <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math> on both <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mfenced separators="|"><mml:mrow><mml:mfrac linethickness="0pt"><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:mfrac></mml:mrow></mml:mfenced></mml:math> and <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mfenced separators="|"><mml:mrow><mml:mfrac linethickness="0pt"><mml:mrow><mml:mn>0</mml:mn></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:mfrac></mml:mrow></mml:mfenced></mml:math> is a 45° anti-clockwise rotation. Because matrix multiplication is linear, applying <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math> to the sum in Eq. 31 gives the same weighted sum of the two rotated basis vectors, so the effect of <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math> on any two-dimensional vector is also a 45° anti-clockwise rotation. Therefore, as a transformation, <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math> is a matrix that represents a 45° anti-clockwise rotation.
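The rotation claim can be checked numerically. The sketch below compares the matrix from Eq. 26 with the standard two-dimensional rotation matrix for an angle of 45°, and then applies it to an arbitrary vector. The test vector `v` and all variable names are assumptions made for this illustration.

```python
import numpy as np

# The matrix from Eq. 26
s = 1.0 / np.sqrt(2.0)
A = np.array([[s, -s],
              [s,  s]])

# Standard 2-D rotation matrix for an anti-clockwise angle theta
theta = np.pi / 4.0  # 45 degrees in radians
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
matches_rotation = np.allclose(A, R)

# Apply A to an arbitrary (hypothetical) vector in the (x, y) plane
v = np.array([3.0, 1.0])
w = A @ v

# Angle of each vector measured anti-clockwise from the x axis
angle_before = np.arctan2(v[1], v[0])
angle_after = np.arctan2(w[1], w[0])
angle_change_deg = np.degrees(angle_after - angle_before)

# A pure rotation should leave the length of the vector unchanged
length_preserved = np.isclose(np.linalg.norm(w), np.linalg.norm(v))

print(matches_rotation, angle_change_deg, length_preserved)
```

The angle changes by 45° while the length stays the same, which is what we expect from a pure rotation.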

Since any two-dimensional vector can be written as a linear combination of the vectors <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mfenced separators="|"><mml:mrow><mml:mfrac linethickness="0pt"><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:mfrac></mml:mrow></mml:mfenced></mml:math> and <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mfenced separators="|"><mml:mrow><mml:mfrac linethickness="0pt"><mml:mrow><mml:mn>0</mml:mn></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:mfrac></mml:mrow></mml:mfenced></mml:math>, these two vectors are called basis vectors – they provide a basis from which we can construct all other two-dimensional vectors. These two vectors are also orthogonal to each other. In geometric terms, this means they are at right-angles to each other – this is obvious in this example because one vector lies along the <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>x</mml:mi></mml:math> axis while the other lies along the <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>y</mml:mi></mml:math> axis. In algebraic terms, orthogonality means the inner product between the two vectors is 0. Basis vectors don’t have to be orthogonal to each other. 
For example, the two vectors <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mfenced separators="|"><mml:mrow><mml:mfrac linethickness="0pt"><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:mfrac></mml:mrow></mml:mfenced></mml:math> and <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mfenced separators="|"><mml:mrow><mml:mfrac linethickness="0pt"><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:mfrac></mml:mrow></mml:mfenced></mml:math> can also be used to describe any point on the <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>y</mml:mi><mml:mo>)</mml:mo></mml:math> plane. However, when basis vectors are orthogonal, they are easy to work with. Moving along one orthogonal basis vector does not change how far along we are on another orthogonal basis vector. For example, moving along the <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>x</mml:mi></mml:math> axis does not affect where we are on the <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>y</mml:mi></mml:math> axis. This means we can apply calculations along one orthogonal basis vector without having to worry about what is happening in terms of the other basis vectors. This makes orthogonal basis vectors very convenient to work with – a fact we will make use of when we move on to decompositions of matrices in the next section.
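To make the non-orthogonal case concrete, the following sketch finds the weights needed to express a point in terms of the basis vectors (1, 0) and (1, 1). The target point `p` is a hypothetical choice, and solving a small linear system with `np.linalg.solve` is one possible way to get the weights, not necessarily the approach used in the book's notebook.

```python
import numpy as np

# Columns of B are the non-orthogonal basis vectors (1, 0) and (1, 1)
B = np.array([[1.0, 1.0],
              [0.0, 1.0]])

# A hypothetical target point in the (x, y) plane
p = np.array([3.0, 2.0])

# Solve B @ c = p for the weights c on each basis vector
c = np.linalg.solve(B, p)

# Rebuild the point from the weighted basis vectors as a check
p_rebuilt = c[0] * B[:, 0] + c[1] * B[:, 1]

# The inner product of the two basis vectors is non-zero,
# confirming that they are not orthogonal
inner = B[:, 0] @ B[:, 1]

print(c, p_rebuilt, inner)
```

Because the basis vectors are not orthogonal, the weights have to be found jointly by solving a system of equations rather than one at a time.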

Given a set of orthogonal basis vectors <math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mrow><msub><munder><mi>v</mi><mo stretchy="true">_</mo></munder><mn>1</mn></msub><mo>,</mo><msub><munder><mi>v</mi><mo stretchy="true">_</mo></munder><mn>2</mn></msub><mo>,</mo><mo>…</mo><mo>,</mo><msub><munder><mi>v</mi><mo stretchy="true">_</mo></munder><mi>d</mi></msub></mrow></mrow></math> in a <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>d</mml:mi></mml:math>-dimensional space, we can easily work out how to represent any vector <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math> in terms of those basis vectors. Say we have a vector <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:munder underaccent="false"><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:math> and we want to write it as follows:

<math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><mrow><mrow><munder><mi>a</mi><mo stretchy="true">_</mo></munder><mo>=</mo><msub><mi>α</mi><mn>1</mn></msub><msub><munder><mi>v</mi><mo stretchy="true">_</mo></munder><mn>1</mn></msub><mo>+</mo><msub><mi>α</mi><mn>2</mn></msub><msub><munder><mi>v</mi><mo stretchy="true">_</mo></munder><mn>2</mn></msub><mo>+</mo><mo>⋯</mo><mo>+</mo><msub><mi>α</mi><mi>d</mi></msub><msub><munder><mi>v</mi><mo stretchy="true">_</mo></munder><mi>d</mi></msub></mrow></mrow></math>

Eq. 32

Then, we can work out the values of the weights <math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mrow><msub><mi>α</mi><mn>1</mn></msub><mo>,</mo><msub><mi>α</mi><mn>2</mn></msub><mo>,</mo><mo>⋯</mo><mo>,</mo><msub><mi>α</mi><mi>d</mi></msub></mrow></mrow></math> by taking the inner product of both sides of Eq. 32 with each of the basis vectors <math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mrow><msub><munder><mi>v</mi><mo stretchy="true">_</mo></munder><mn>1</mn></msub><mo>,</mo><msub><munder><mi>v</mi><mo stretchy="true">_</mo></munder><mn>2</mn></msub><mo>,</mo><mo>…</mo><mo>,</mo><msub><munder><mi>v</mi><mo stretchy="true">_</mo></munder><mi>d</mi></msub></mrow></mrow></math>. Doing so, we get the following:

<math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><mrow><mrow><msup><munder><mi>a</mi><mo stretchy="true">_</mo></munder><mi mathvariant="normal">⊤</mi></msup><msub><munder><mi>v</mi><mo stretchy="true">_</mo></munder><mi>i</mi></msub><mo>=</mo><msub><mi mathvariant="normal">α</mi><mn>1</mn></msub><msubsup><munder><mi>v</mi><mo stretchy="true">_</mo></munder><mn>1</mn><mi mathvariant="normal">⊤</mi></msubsup><msub><munder><mi>v</mi><mo stretchy="true">_</mo></munder><mi>i</mi></msub><mo>+</mo><msub><mi mathvariant="normal">α</mi><mn>2</mn></msub><msubsup><munder><mi>v</mi><mo stretchy="true">_</mo></munder><mn>2</mn><mi mathvariant="normal">⊤</mi></msubsup><msub><munder><mi>v</mi><mo stretchy="true">_</mo></munder><mi>i</mi></msub><mo>+</mo><mo>…</mo><mo>+</mo><msub><mi mathvariant="normal">α</mi><mi>d</mi></msub><msubsup><munder><mi>v</mi><mo stretchy="true">_</mo></munder><mi>d</mi><mi mathvariant="normal">⊤</mi></msubsup><msub><munder><mi>v</mi><mo stretchy="true">_</mo></munder><mi>i</mi></msub></mrow></mrow></math>

Eq. 33

Since <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:msub><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math> is, by definition, orthogonal to all the other basis vectors except <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:msub><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math> itself, then the inner products <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:msubsup><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="normal">⊤</mml:mi></mml:mrow></mml:msubsup><mml:msub><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>0</mml:mn></mml:math>, unless <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>j</mml:mi><mml:mo>=</mml:mo><mml:mi>i</mml:mi></mml:math>. Plugging this fact into the preceding equation, we get the following:

<math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><mrow><mrow><msup><munder><mi>a</mi><mo stretchy="true">_</mo></munder><mi mathvariant="normal">⊤</mi></msup><msub><munder><mi>v</mi><mo stretchy="true">_</mo></munder><mi>i</mi></msub><mo>=</mo><msub><mi mathvariant="normal">α</mi><mi>i</mi></msub><msubsup><munder><mi>v</mi><mo stretchy="true">_</mo></munder><mi>i</mi><mi mathvariant="normal">⊤</mi></msubsup><msub><munder><mi>v</mi><mo stretchy="true">_</mo></munder><mi>i</mi></msub><mspace width="0.25em" /><mspace width="0.25em" /><mspace width="0.25em" /><mo>⇒</mo><mspace width="0.25em" /><mspace width="0.25em" /><mspace width="0.25em" /><msub><mi mathvariant="normal">α</mi><mi>i</mi></msub><mspace width="0.25em" /><mo>=</mo><mspace width="0.25em" /><mfrac><mrow><msup><munder><mi>a</mi><mo stretchy="true">_</mo></munder><mi mathvariant="normal">⊤</mi></msup><msub><munder><mi>v</mi><mo stretchy="true">_</mo></munder><mi>i</mi></msub></mrow><mrow><msubsup><munder><mi>v</mi><mo stretchy="true">_</mo></munder><mi>i</mi><mi mathvariant="normal">⊤</mi></msubsup><msub><munder><mi>v</mi><mo stretchy="true">_</mo></munder><mi>i</mi></msub></mrow></mfrac></mrow></mrow></math>

Eq. 34

So, we can easily work out the required weights. If the basis vectors are all of unit length, so that <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:msubsup><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="normal">⊤</mml:mi></mml:mrow></mml:msubsup><mml:msub><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:math> for every value of <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>i</mml:mi></mml:math>, then the expression in Eq. 34 for the weights becomes even simpler. It becomes <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:msub><mml:mrow><mml:mi mathvariant="normal">α</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msup><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mrow><mml:mi mathvariant="normal">⊤</mml:mi></mml:mrow></mml:msup><mml:msub><mml:mrow><mml:munder underaccent="false"><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mo>_</mml:mo></mml:munder></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math>. A set of orthogonal basis vectors that are all of unit length is called orthonormal and forms an orthonormal basis. Using an orthonormal basis to represent our vectors is extremely convenient. 
In the next section of this chapter, we will show how an orthonormal basis can be extracted from any matrix and therefore can be used as an extremely convenient way of working with that matrix. But for now, let’s look at how to do some of those matrix multiplications and transformations in a code example.
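The weight formula in Eq. 34 can be sketched in a few lines of NumPy. The orthogonal basis `V` (stored one basis vector per row) and the vector `a` below are hypothetical values chosen for illustration.

```python
import numpy as np

# Hypothetical orthogonal (but not unit-length) basis of a 3-dimensional space.
# Each row of V is one basis vector v_i.
V = np.array([[1.0,  1.0, 0.0],
              [1.0, -1.0, 0.0],
              [0.0,  0.0, 2.0]])

# Hypothetical vector to be expressed in this basis
a = np.array([2.0, 3.0, -1.0])

# Eq. 34: alpha_i = (a . v_i) / (v_i . v_i)
alpha = (V @ a) / np.sum(V * V, axis=1)

# Eq. 32: rebuild a as the weighted sum of the basis vectors
a_rebuilt = alpha @ V

# Rescale each basis vector to unit length to get an orthonormal basis.
# The weights then reduce to plain inner products, alpha_i = a . v_i
U = V / np.linalg.norm(V, axis=1, keepdims=True)
alpha_unit = U @ a
a_rebuilt_unit = alpha_unit @ U

print(alpha, a_rebuilt, alpha_unit, a_rebuilt_unit)
```

In both cases, each weight is computed independently of the others, which is the convenience that orthogonality provides.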

Matrix transformation code example

For our code examples, we’ll use the built-in functions of the NumPy package. All the code examples that follow (and additional ones) can be found in the Code_Examples_Chap3.ipynb Jupyter notebook in the GitHub repository.

First, we’ll use the numpy.matmul function to multiply two matrices together:

import numpy as np
# Create 3x3 matrices
A = np.array([[1.0, 2.0, 1.0], [-2.5, 1.0, 0.0], [3.0, 1.0, 1.5]])
B = np.array([[1.0, -1.0, -1.0], [5, 2.0, 3.0], [3.0, 1.0, 2.0]])
# Multiply the matrices together
np.matmul(A, B)

The preceding code produces the following output:

array([[14. ,  4. ,  7. ],
       [ 2.5,  4.5,  5.5],
       [12.5,  0.5,  3. ]])
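The same function can be used to check the behavior of the identity matrix described earlier in this section. The sketch below builds a 3x3 identity matrix with `np.eye` and also uses Python's `@` operator, which performs the same matrix multiplication as `np.matmul` for these arrays. The variable names are assumptions made for this illustration.

```python
import numpy as np

A = np.array([[1.0, 2.0, 1.0], [-2.5, 1.0, 0.0], [3.0, 1.0, 1.5]])

# 3x3 identity matrix
I3 = np.eye(3)

# Multiplying by the identity on either side should leave A untouched
left = np.matmul(I3, A)
right = np.matmul(A, I3)
unchanged = np.allclose(left, A) and np.allclose(right, A)

# The @ operator gives the same result as np.matmul here
same_as_matmul = np.allclose(A @ I3, right)

print(unchanged, same_as_matmul)
```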

We can use the same NumPy function to multiply a vector by a matrix:

# Create a 4-dimensional vector
a = np.array([1.0, 2.0, 3.0, -2.0])
# Create a 3x4 matrix
A = np.array([
    [1.0, 1.0, 0.0, 1.0], [-2.0, 2.5, 1.5, 3.0], 
    [0.0, 1.0, 1.0, 4.0]])
# We'll use the matrix multiplication function to calculate
# A*a
np.matmul(A, a)

We get the following output:

array([ 1. ,  1.5, -3. ])
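Multiplication in the other order, a vector times a matrix, uses the same function. In the sketch below, the matrix and the 4-dimensional vector are the ones from the example above, while the 3-dimensional vector `b` is a hypothetical addition. NumPy treats a one-dimensional array as a column vector when it is the second argument and as a row vector when it is the first.

```python
import numpy as np

# The 3x4 matrix and 4-dimensional vector from the example above
A = np.array([[1.0, 1.0, 0.0, 1.0],
              [-2.0, 2.5, 1.5, 3.0],
              [0.0, 1.0, 1.0, 4.0]])
a = np.array([1.0, 2.0, 3.0, -2.0])

# Hypothetical 3-dimensional vector for multiplying from the left
b = np.array([1.0, 0.0, -1.0])

# Matrix times vector: (3x4) times (4,) gives a 3-dimensional result
right = np.matmul(A, a)

# Vector times matrix: (3,) times (3x4) gives a 4-dimensional result
left = np.matmul(b, A)

print(right.shape, left.shape)
print(left)
```

The shapes must be compatible in each order: `np.matmul(a, A)` would raise an error here because a 4-dimensional vector cannot multiply a 3x4 matrix from the left.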

The NumPy package even has an in-built function for calculating the inverse of a matrix, as the following code demonstrates:

# Create a 4x4 square matrix
A = np.array( [[1, 2, 3, 4],
               [2, 1, 2, 1],
               [0, 1, 3, 2],
               [1, 1, 2, 2]])
# Calculate and store the inverse matrix
Ainv = np.linalg.inv(A)
# Multiply the matrix by its inverse.
# We should get the identity matrix 
#[[1,0,0,0], [0,1,0,0], [0,0,1,0], [0,0,0,1]]
# up to numerical precision
np.matmul(Ainv, A)
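Since the product is only equal to the identity matrix up to numerical precision, an exact equality test can fail. One possible check, sketched below, compares the product with `np.eye(4)` using `np.allclose`. It also uses the inverse to cancel the matrix from an equation of the form A x = b, where the right-hand side `b` is a hypothetical vector chosen for illustration.

```python
import numpy as np

# The 4x4 matrix from the example above
A = np.array([[1, 2, 3, 4],
              [2, 1, 2, 1],
              [0, 1, 3, 2],
              [1, 1, 2, 2]], dtype=float)
Ainv = np.linalg.inv(A)

# Check both orderings in Eq. 25 against the identity matrix,
# allowing for small floating-point errors
I4 = np.eye(4)
left_ok = np.allclose(Ainv @ A, I4)
right_ok = np.allclose(A @ Ainv, I4)

# Use the inverse to cancel A from A x = b, giving x = Ainv b
b = np.array([1.0, 0.0, 2.0, -1.0])  # hypothetical right-hand side
x = Ainv @ b
solves = np.allclose(A @ x, b)

print(left_ok, right_ok, solves)
```

For solving linear systems in practice, `np.linalg.solve(A, b)` is usually preferred over forming the inverse explicitly, but the explicit inverse mirrors the algebra in this section more directly.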

These simple code examples of matrix transformations bring this section neatly to a close, so let’s recap what we have learned in this section.

What we learned

In this section, we have learned the following:

  • How to multiply matrices together
  • How to multiply a vector by a matrix and vice versa
  • What the identity matrix is and its effect on any other matrix
  • What the inverse of a matrix is and why it is useful
  • How a matrix represents a linear transformation
  • How sets of orthonormal vectors provide a convenient basis on which we can express any other vector

Having learned the basics of matrix multiplication and how matrices represent transformations, we’ll now learn some standard ways of representing or decomposing matrices. These decompositions help us to understand in more detail the effect of a matrix and provide convenient ways to work with and manipulate matrices.
