Figure 1: Going from a single number to an n-dimension tensor
This article is an excerpt from the book Deep Reinforcement Learning Hands-On - Second Edition by Maxim Lapan. This book is an updated and expanded version of the bestselling guide to the very latest RL tools and techniques. In this article, we’ll discuss the fundamental building block of all DL toolkits, tensor.
If you're familiar with the NumPy library, then you already know that its central purpose is the handling of multi-dimensional arrays in a generic way. In NumPy, such arrays aren't called tensors, but they are in fact tensors. Tensors are used very widely in scientific computations as generic storage for data. For example, a color image could be encoded as a 3D tensor with dimensions of width, height, and color plane.
Apart from dimensions, a tensor is characterized by the type of its elements. There are eight types supported by PyTorch: three float types (16-bit, 32-bit, and 64-bit) and five integer types (8-bit signed, 8-bit unsigned, 16-bit, 32-bit, and 64-bit). Tensors of different types are represented by different classes, with the most commonly used being torch.FloatTensor
(corresponding to a 32-bit float), torch.ByteTensor
(an 8-bit unsigned integer), and torch.LongTensor
(a 64-bit signed integer). The rest can be found in the PyTorch documentation.
There are three ways to create a tensor in PyTorch:
To give you examples of these methods, let's look at a simple session:
>>> import torch >>> import numpy as np >>> a = torch.FloatTensor(3, 2) >>> a tensor([[4.1521e+09, 4.5796e-41], [ 1.9949e-20, 3.0774e-41], [ 4.4842e-44, 0.0000e+00]])
Here, we imported both PyTorch and NumPy and created an uninitialized tensor of size 3×2. By default, PyTorch allocates memory for the tensor, but doesn't initialize it with anything. To clear the tensor's content, we need to use its operation:
>> a.zero_() tensor([[ 0., 0.], [ 0., 0.], [ 0., 0.]])
There are two types of operation for tensors: inplace and functional. Inplace operations have an underscore appended to their name and operate on the tensor's content. After this, the object itself is returned. The functional equivalent creates a copy of the tensor with the performed modification, leaving the original tensor untouched. Inplace operations are usually more efficient from a performance and memory point of view.
Another way to create a tensor by its constructor is to provide a Python iterable (for example, a list or tuple), which will be used as the contents of the newly created tensor:
>>> torch.FloatTensor([[1,2,3],[3,2,1]]) tensor([[ 1., 2., 3.], [ 3., 2., 1.]])
Here we are creating the same tensor with zeroes using NumPy:
>>> n = np.zeros(shape=(3, 2)) >>> n array([[ 0., 0.], [ 0., 0.], [ 0., 0.]]) >>> b = torch.tensor(n) >>> b tensor([[ 0., 0.], [ 0., 0.], [ 0., 0.]], dtype=torch.float64)
The torch.tensor
method accepts the NumPy array as an argument and creates a tensor of appropriate shape from it. In the preceding example, we created a NumPy array initialized by zeros, which created a double (64-bit float) array by default. So, the resulting tensor has the DoubleTensor
type (which is shown in the preceding example with the dtype value). Usually, in DL, double precision is not required and it adds an extra memory and performance overhead. The common practice is to use the 32-bit float type, or even the 16-bit float type, which is more than enough. To create such a tensor, you need to specify explicitly the type of NumPy array:
>>> n = np.zeros(shape=(3, 2), dtype=np.float32) >>> torch.tensor(n) tensor([[ 0., 0.], [ 0., 0.], [ 0., 0.]])
As an option, the type of the desired tensor could be provided to the torch.tensor function in the dtype argument. However, be careful, since this argument expects to get a PyTorch type specification, not the NumPy one. PyTorch types are kept in the torch package, for example, torch.float32
and torch.uint8
.
>>> n = np.zeros(shape=(3,2)) >>> torch.tensor(n, dtype=torch.float32) tensor([[ 0., 0.], [ 0., 0.], [ 0., 0.]])
In this article, we saw a quick overview of tensor, the fundamental building block of all DL toolkits. We talked about tensor and how to create it in the PyTorch library. Discover ways to increase efficiency of RL methods both from theoretical and engineering perspective with the book Deep Reinforcement Learning Hands-on, Second Edition by Maxim Lapan.
Maxim Lapan is a deep learning enthusiast and independent researcher. He has spent 15 years working as a software developer and systems architect. His projects have ranged from low-level Linux kernel driver development to performance optimization and the design of distributed applications working on thousands of servers.
With his areas of expertise including big data, machine learning, and large parallel distributed HPC and non-HPC systems, Maxim is able to explain complicated concepts using simple words and vivid examples. His current areas of interest are practical applications of deep learning, such as deep natural language processing and deep reinforcement learning. Maxim lives in Moscow, Russian Federation, with his family.