Applied math basics

When we talk about mathematics as it relates to deep learning and AI, we're often talking about linear algebra. Linear algebra is a branch of continuous mathematics that involves the study of vector spaces and the operations performed within them. If you remember back to grade-school algebra, algebra in general deals with unknown variables. With linear algebra, we extend this study to linear systems with an arbitrary number of dimensions, which is what makes it a form of continuous mathematics.

AI relies on a basic building block: the tensor. Within AI, these mathematical objects store the information that flows through ANNs and allows them to operate; they are the data structures used throughout AI. As we will see, a tensor has a rank, which essentially tells us how many indices are needed to address its data (for a matrix, a row index and a column index).
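As a quick illustration (a minimal sketch using NumPy, which we rely on throughout this chapter), the rank of an object corresponds to the number of indices needed to pick out a single element, which NumPy reports through the ndim attribute:

import numpy as np

scalar = np.array(5)                 ## Rank 0: no index needed
vector = np.array([1, 2, 3])         ## Rank 1: one index (position)
matrix = np.array([[1, 2], [3, 4]])  ## Rank 2: two indices (row, column)

print(scalar.ndim, vector.ndim, matrix.ndim)  ## 0 1 2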

While many problems in deep learning are not formally linear problems, the basic building blocks of matrices and tensors are the primary data structures for solving, optimizing, and approximating within an ANN.

Want to see how linear algebra can help us from a programmatic standpoint? Take a look at the following code block:

import numpy as np

## Element-wise multiplication without utilizing linear algebra techniques

x = [1, 2, 3]
y = [4, 5, 6]

product = []
for i in range(len(x)):
    product.append(x[i] * y[i])

## Element-wise multiplication utilizing linear algebra techniques

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])
x * y

We can eliminate these laborious loops by simply utilizing NumPy's built-in linear algebra functions. When you think of AI, and the thousands upon thousands of operations that have to be computed at the runtime of an application, vectorized operations like this add up to real performance gains. In the following sections, we'll review these fundamental concepts in both mathematical notation and Python.
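To see the speed difference concretely, here is a rough timing sketch (the array size of one million elements is an arbitrary choice for illustration, and the exact timings will vary by machine):

import time
import numpy as np

x_list = list(range(1_000_000))
y_list = list(range(1_000_000))

## Element-wise multiplication with a pure Python loop
start = time.perf_counter()
product = [a * b for a, b in zip(x_list, y_list)]
loop_time = time.perf_counter() - start

## The same operation, vectorized with NumPy
x = np.arange(1_000_000)
y = np.arange(1_000_000)
start = time.perf_counter()
product = x * y
numpy_time = time.perf_counter() - start

print(loop_time, numpy_time)  ## The NumPy version is typically far faster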

Each of the following examples uses the Python package NumPy, imported as import numpy as np.

The building blocks – scalars, vectors, matrices, and tensors

In the following section, we'll introduce the fundamental types of linear algebra objects that are used throughout AI applications: scalars, vectors, matrices, and tensors.

Scalars

Scalars are nothing but single real numbers, which can take the form of an integer or a floating-point value. In Python, we create a scalar by simply assigning it:

my_scalar = 5
my_scalar = 5.098

Vectors

Vectors are one-dimensional arrays of numbers. Geometrically, they store the direction and magnitude of change from a point. We'll see how this works in machine learning algorithms when we discuss principal component analysis (PCA) in the next few pages. Vectors in Python are created as NumPy array objects:

my_vector = np.array([5,6])

Vectors can be written in several ways, for example as a row or as a column:

x = (x_1, x_2, \ldots, x_n) \quad \text{or} \quad x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
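Since a vector stores a direction and a magnitude, we can recover both with NumPy (a small sketch; np.linalg.norm computes the Euclidean length):

my_vector = np.array([5, 6])
magnitude = np.linalg.norm(my_vector)  ## The vector's length
direction = my_vector / magnitude      ## A unit vector pointing the same way
print(magnitude, direction)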

Matrices

Matrices are two-dimensional arrays of numbers arranged in rows and columns. Typically, the row index of a matrix is denoted by i, while the column index is denoted by j.

Matrices are represented as:

A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}, \quad \text{where } a_{ij} \text{ is the entry in row } i \text{ and column } j

We can easily create matrices in Python as NumPy arrays, much like we can with vectors:

matrix = np.array([[5,6], [6,9]])

The only difference is that we are nesting an additional vector inside the array to create the matrix's second row.
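To make the row (i) and column (j) indexing concrete, here is a short sketch (note that NumPy counts both indices from zero):

matrix = np.array([[5, 6], [6, 9]])
print(matrix.shape)  ## (2, 2): two rows, two columns
print(matrix[0, 1])  ## Row i=0, column j=1 -> 6
print(matrix[1])     ## The entire second row -> [6 9]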

Tensors

While you may have heard of vectors and matrices before, the name tensor may be new. A tensor is a generalization of a matrix; tensors come in different sizes, or ranks, which measure how many dimensions they have.

Tensors are three (or more)-dimensional lists; you can think of them as multi-dimensional grids of numbers, such as a cube. Tensors also have a defining transformation property: when the coordinate system in which a tensor is expressed changes, the tensor's components must transform along with it. Any rank 2 tensor can be represented as a matrix, but not all matrices are automatically rank 2 tensors; a matrix only qualifies if its entries obey this transformation property. As we'll see, this will come into play with neural networks in the next chapter. We can create a tensor in Python as follows:

tensor = np.array([[[1,2,3,4]], [[2,5,6,3]], [[7,6,3,4]]])

Within the context of AI, tensors can represent things such as word embeddings or weights in a neural network. We'll talk about these more as we encounter them in upcoming chapters.
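As a quick check (a small sketch), NumPy exposes the rank and dimensions of the tensor we just created:

print(tensor.ndim)   ## 3: a rank 3 object
print(tensor.shape)  ## (3, 1, 4): three blocks, each holding one row of four numbers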

Matrix math

The basic operations of an ANN are based on matrix math. In this section, we'll be reviewing the basic operations that you need to know to understand the mechanics of ANNs.

Scalar operations

Scalar operations involve a vector (or matrix) and a scalar. To perform an operation with a scalar on a matrix, simply apply the scalar to every item in the matrix:

\begin{bmatrix} 1 & 2 \\ 1 & 2 \end{bmatrix} + 2 = \begin{bmatrix} 3 & 4 \\ 3 & 4 \end{bmatrix}

In Python, we would simply do the following:

matrix = np.array([[1,2], [1,2]])
new_matrix = matrix + 2  ## array([[3, 4], [3, 4]])
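The same pattern applies to the other arithmetic operators (NumPy calls this broadcasting), as this brief sketch shows:

matrix * 3  ## array([[3, 6], [3, 6]])
matrix - 1  ## array([[0, 1], [0, 1]])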

Element-wise operations

In element-wise operations, position matters. Values that correspond positionally are combined to create a new value.

To add and/or subtract matrices or vectors, the corresponding entries are combined:

\begin{bmatrix} a & b \\ c & d \end{bmatrix} + \begin{bmatrix} e & f \\ g & h \end{bmatrix} = \begin{bmatrix} a+e & b+f \\ c+g & d+h \end{bmatrix}

And in Python:

vector_one = np.array([[1,2],[3,4]])
vector_two = np.array([[5,6],[7,8]])
vector_one + vector_two
## You should see:
array([[ 6,  8], [10, 12]])
vector_one - vector_two
## You should see:
array([[-4, -4], [-4, -4]])

There are two forms of multiplication that we may perform with vectors: the dot product and the Hadamard product.

The dot product is a special case of a more general mathematical principle known as an inner product, and is rooted in larger theories of geometry that are used across the physical and computational sciences. When we take the dot product of two vectors, the output is a scalar:

x \cdot y = \sum_{i=1}^{n} x_i y_i

Dot products are a workhorse in machine learning. Think about a basic operation: let's say we're doing a simple classification problem where we want to know if an image contains a cat or a dog. If we did this with a neural network, it would look as follows:

y = f(w \cdot x + b)

Here, y is our classification, cat or dog. We determine y by utilizing a network represented by f, where the input is x, while w and b represent a weight and a bias factor (don't worry, we'll explain these in more detail in the coming chapter!). Our x and w are both vectors, and we need to output a scalar y, which represents either cat or dog. We can only do this by taking the dot product of w and x.
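As a rough sketch of that computation (the sigmoid activation and the specific numbers here are illustrative assumptions, not a model from this book):

def f(z):
    ## A sigmoid squashes the weighted sum into a value between 0 and 1
    return 1 / (1 + np.exp(-z))

x = np.array([0.5, 0.8, 0.1])  ## Illustrative input features
w = np.array([0.4, 0.3, 0.9])  ## Illustrative weights
b = 0.1                        ## Illustrative bias

y = f(np.dot(w, x) + b)  ## A single scalar between 0 and 1
print(y)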

Relating back to our example, if this function were presented with an unknown image, taking the dot product will tell us how similar in direction the new vector is to the cat vector (a) or the dog vector (b) by the measure of the angle (\theta) between them:

u \cdot v = \|u\| \, \|v\| \cos\theta

If the vector is closer to the direction of the cat vector (a), we'll classify the image as containing a cat. If it's closer to the dog vector (b), we'll classify it as containing a dog. In deep learning, a more complex version of this scenario is performed over and over; it's the core of how ANNs work.
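A small sketch of that comparison (the cat, dog, and unknown vectors here are made-up placeholders):

def cosine_similarity(u, v):
    ## cos(theta) = (u . v) / (|u| * |v|)
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

cat = np.array([0.9, 0.1])      ## Hypothetical "cat" direction
dog = np.array([0.2, 0.8])      ## Hypothetical "dog" direction
unknown = np.array([0.8, 0.3])  ## Hypothetical new image vector

if cosine_similarity(unknown, cat) > cosine_similarity(unknown, dog):
    print("cat")
else:
    print("dog")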

In Python, we can take the dot product of two vectors by using a built-in function from numpy, np.dot():

## Dot Product
vector_one = np.array([1,2,3])
vector_two = np.array([2,3,4])
np.dot(vector_one,vector_two) ## This should give us 20
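For one-dimensional arrays, NumPy's @ operator computes the same inner product:

vector_one @ vector_two  ## Also gives us 20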

The Hadamard product, on the other hand, outputs a vector:

x \circ y = (x_1 y_1, x_2 y_2, \ldots, x_n y_n)

The Hadamard product is element-wise, meaning that each number in the new matrix is the product of the corresponding numbers from the original matrices. Looking back to Python, we can easily perform this operation with the simple * operator:

vector_one = np.array([1,2,3])
vector_two = np.array([2,3,4])
vector_one * vector_two
## You should see:
array([ 2, 6, 12])

Now that we've scratched the surface of basic matrix operations, let's take a look at how probability theory can aid us in the artificial intelligence field.
