One of the best ways to solve linear equations is by a systematic method known as elimination. This is a method that allows us to systematically eliminate variables and use substitution to solve equations.
Let's take a look at two equations with two variables, as follows:
After elimination, this becomes the following:
As we can see, the x variable is no longer in the second equation. We can plug the y value back into the first equation and solve for x. Doing this, we find that x = 3 and y = 1.
We call this triangular factorization. There are two types—lower triangular and upper triangular. We solve the upper triangular system from top to bottom using a process known as back substitution, and this works for systems of any size.
While this is an effective method, it is not fail-proof. We could come across a scenario where we have more equations than variables, or more variables than equations, which are unsolvable. Or, we could have a scenario such as 0x = 7, and, as we know, dividing by zero is impossible.
Let's solve three equations with three variables, as follows:
We will use upper triangular factorization and eliminate variables, starting with y and then z. Let's start by putting this into our matrix form, as follows:
For our purposes and to make things simpler, we will drop v, the column vector, and get the following result:
Then, exchange row 2 and row 3 with each other, like this:
Then, add row 2 and row 1 together to eliminate the first value in row 2, like this:
Next, multiply row 1 by 3/2 and subtract it from row 3, like this:
Finally, multiply row 2 by 6 and subtract it from row 3, like this:
As you can notice, the values in the matrix now form a triangle pointing upward, which is why we call it upper triangular. By substituting the values back into the previous equation backward (from bottom to top), we can solve, and find that , , and .
In summary, becomes , as illustrated here:
Note: The values across the diagonal in the triangular factorized matrix are called pivots, and when factorized, the values below the diagonal are all zeros.
To check that our found solution is right, we solve , using our found values for x, y, and z, like this:
This then becomes the following equation:
And as we can see, the left-hand side is equal to the right-hand side.
After upper triangular factorization, an arbitrary 4x4 matrix will look like this:
We could take this a step further and factorize the upper triangular matrix until we end up with a matrix that contains only the pivot values along the diagonal, and zeros everywhere else. This resulting matrix P essentially fully solves the problem for us without us having to resort to forward or backward substitution, and it looks like this:
But as you can tell, there are a lot of steps involved in getting us from A to P.
There is one other very important factorization method called lower-upper (LU) decomposition. The way it works is we factorize A into an upper triangular matrix U, and record the steps of Gaussian elimination in a lower triangular matrix L, such that .
Let's revisit the matrix we upper-triangular factorized before and put it into LU factorized form, like this:
If we multiply the two matrices on the right, we will get the original matrix A. But how did we get here? Let's go through the steps, as follows:
- We start with , so that the following applies:
- We add -1 to what was the identity matrix at l2,1 to represent the operation (row 2)-(-1)(row 1), so it becomes the following:
- We then add to the matrix at l3,1 to represent the operation, so it becomes the following:
- We then add 6 to the matrix at l3,2 to represent the operation (row 3)-6(row 2), so it becomes the following:
This is the LU factorized matrix we saw earlier.
You might now be wondering what this has to do with solving , which is very valid. The elimination process tends to work quite well, but we have to additionally apply all the operations we did on A to b as well, and this involves extra steps. However, LU factorization is only applied to A.
Let's now take a look at how we can solve our system of linear equations using this method.
For simplicity, we drop the variables vector and write A and b as follows:
But even this can get cumbersome to write as we go, so we will instead write it in the following way for further simplicity:
We then multiply both sides by and get the following result:
This tells us that , and we already know from the preceding equation that (so ). And by using back substitution, we can find the vector v.
In the preceding example, you may have noticed some new notation that I have not yet introduced, but not to worry—we will observe all the necessary notation and operations in the next section.