PyTorch tensor practical work, inspired by the Python Data Science Handbook
In this practical work, we will set up a Python environment and discover PyTorch tensors. Tensors are the PyTorch version of NumPy ndarrays, with additional capabilities specific to GPU use and automatic differentiation. Most of the methods available on NumPy arrays have an equivalent for PyTorch tensors, but the name may differ.
Anaconda: Anaconda is a Python distribution with great machine learning integration that works on Linux, Windows and Mac OS X. Using it instead of the default Python distribution of the OS will allow us to have finer control without needing administrator privileges.
During this course, we will use virtual environments. A virtual
environment is a tool that helps to keep the dependencies required by
different projects separate by creating isolated Python
environments for them. It also provides a way to easily reproduce the
environment required to run a specific piece of code. This is one of
the most important tools that most developers use, even outside
of the data science world. In this course we will use Anaconda virtual
environments instead of virtualenv or pipenv because Anaconda
environments are able to keep track of packages installed with
conda (the Anaconda package manager).
Download the Python 3.7 version of Anaconda from this address.
Launch the installation shell script.
> chmod u+x Anaconda3-2019.10-Linux-x86_64.sh
> ./Anaconda3-2019.10-Linux-x86_64.sh
The installer will ask you if it should modify your .bashrc to add
the Anaconda folders to the PATH; answer yes.
Now that Anaconda is installed, we will create a virtual environment in which we will set up all our libraries. The documentation for Anaconda virtual environment management is available here.
I have already created a virtual environment containing all the libraries that we will use during this course. Download the export file here and import the environment using the following command:
conda env create -f deep_nlp.yml
We can use torch.tensor to create tensors (arrays) from Python lists:
>>> import torch
>>> torch.tensor([1,2,3,4,5])
tensor([1, 2, 3, 4, 5])
All the values in a PyTorch tensor have to be of the same type, or be compatible via up-casting:
>>> torch.tensor([3.14, 2, 3, 4])
tensor([3.1400, 2.0000, 3.0000, 4.0000])
PyTorch tensors can be multi-dimensional. Here we create a 2-dimensional tensor using a list of lists.
>>> a = torch.tensor([range(i, i + 3) for i in [2, 4]])
>>> a
tensor([[2, 3, 4],
[4, 5, 6]])
>>> a.ndim
2
>>> a.shape
torch.Size([2, 3])
>>> a.dtype
torch.int64
a.ndim tells us that we have created a 2D tensor. a.shape tells us
that the first dimension is of length 2 (rows) and the second one
of length 3 (columns). a.dtype tells us that the tensor
contains integers stored on 64 bits. It is important to note that
these are not Python arbitrary precision integers: in order to perform
fast computations on large quantities of data, PyTorch internally uses
low-level types (more explanations on this subject here).
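If the default type does not suit your needs, you can choose the dtype explicitly at creation time instead of relying on type inference; a minimal sketch:
>>> torch.tensor([1, 2, 3], dtype = torch.float32)
tensor([1., 2., 3.])
>>> torch.tensor([1, 2, 3], dtype = torch.float32).dtype
torch.float32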
There are many efficient ways to create a variety of PyTorch tensors, most of them identical to their NumPy counterparts described here. The official documentation for PyTorch tensor creation is available here.
Exercises, from ML+:
- Create a 1D tensor of numbers from 0 to 9
tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
- Create a 3×3 PyTorch tensor of all True values
tensor([[True, True, True],
[True, True, True],
[True, True, True]])
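If you want to check your answers, here is one possible solution for each exercise (many other constructors would work just as well):
>>> torch.arange(10)
tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> torch.ones((3, 3), dtype = torch.bool)
tensor([[True, True, True],
        [True, True, True],
        [True, True, True]])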
Now that we can create PyTorch tensors, let's learn how to access the values they contain.
For 1D tensors, indexing elements can be done similarly to Python lists:
>>> a = torch.tensor([5, 0, 3, 3, 7, 9])
>>> a
tensor([5, 0, 3, 3, 7, 9])
>>> a[0]
tensor(5)
>>> a[2]
tensor(3)
>>> a[-1]
tensor(9)
>>> a[-2]
tensor(7)
In multi-dimensional tensors, items can be accessed using a comma-separated tuple of indices:
>>> a = torch.tensor([[3, 5, 2, 4], [7, 6, 8, 8], [1, 6, 7, 7]])
>>> a
tensor([[3, 5, 2, 4],
[7, 6, 8, 8],
[1, 6, 7, 7]])
>>> a[0]
tensor([3, 5, 2, 4])
>>> a[0][2]
tensor(2)
>>> a[0, 2]
tensor(2)
>>> a[-1, -2]
tensor(7)
>>> a[-1, -2] = 10
>>> a
tensor([[ 3, 5, 2, 4],
[ 7, 6, 8, 8],
[ 1, 6, 10, 7]])
Like with usual Python lists, we can create subtensors using the
slice notation. PyTorch slicing follows that of the standard
Python list; to access a slice of a tensor x, use
x[start:stop:step]
If any of these are unspecified, they default to start = 0,
stop = size of the dimension, step = 1.
>>> x = torch.arange(10)
>>> x
tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> x[:5] # first five elements
tensor([0, 1, 2, 3, 4])
>>> x[5:] # elements after index 5
tensor([5, 6, 7, 8, 9])
>>> x[4:7] # middle sub-array
tensor([4, 5, 6])
>>> x[::2] # every other element
tensor([0, 2, 4, 6, 8])
>>> x[1::2] # every other element, starting at index 1
tensor([1, 3, 5, 7, 9])
As of today, PyTorch does not support using a negative step
size to flip a
tensor. The same behavior can be obtained by using torch.flip
although this operation creates a copy of the tensor and not a view
(this will be explained later).
>>> torch.flip(x, dims = (0,))
tensor([9, 8, 7, 6, 5, 4, 3, 2, 1, 0])
We can also slice multidimensional tensors. Like with indexing, we
specify the slicing indices for each dimension, separated by commas.
>>> x = torch.tensor([[12, 5, 2, 4], [ 7, 6, 8, 8], [ 1, 6, 7, 7]])
>>> x
tensor([[12, 5, 2, 4],
[ 7, 6, 8, 8],
[ 1, 6, 7, 7]])
>>> x[:2, :3] # first two rows, first three columns
tensor([[12, 5, 2],
[ 7, 6, 8]])
>>> x[:, ::2] # all rows (: is a slice with default start and stop), every other column
tensor([[12, 2],
[ 7, 8],
[ 1, 7]])
An extremely important concept to keep in mind when working with
tensor slices is that they return views rather than copies of the
tensor data. When slicing a Python list, the slice is a copy of the
original list; this is not the case with torch.tensor slices.
>>> x = torch.tensor([[12, 5, 2, 4], [ 7, 6, 8, 8], [ 1, 6, 7, 7]])
>>> x
tensor([[12, 5, 2, 4],
[ 7, 6, 8, 8],
[ 1, 6, 7, 7]])
>>> x2 = x[:2, :2]
>>> x2
tensor([[12, 5],
[ 7, 6]])
>>> x2[0, 0] = 99
>>> x2
tensor([[99, 5],
[ 7, 6]])
>>> x
tensor([[99, 5, 2, 4],
[ 7, 6, 8, 8],
[ 1, 6, 7, 7]])
Keeping this fact in mind will spare you from terrible debugging sessions.
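If you actually need an independent copy of a subtensor, you can call the .clone() method on the slice; a minimal sketch:
>>> x = torch.tensor([[12, 5], [7, 6]])
>>> x2 = x[:1, :1].clone()   # clone() copies the data instead of sharing it
>>> x2[0, 0] = 99
>>> x                        # the original tensor is left untouched
tensor([[12,  5],
        [ 7,  6]])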
Exercises: All the exercises use the following tensor:
>>> x = torch.tensor([[12, 5, 2, 4], [ 7, 6, 8, 8], [ 1, 6, 7, 7]])
>>> x
tensor([[12, 5, 2, 4],
[ 7, 6, 8, 8],
[ 1, 6, 7, 7]])
- Select the last line of x
tensor([1, 6, 7, 7])
- Slice the two sevens on the last line of x
tensor([7, 7])
- (harder) Slice and reverse the lines and the columns of the top right rectangle
tensor([[8, 8, 6],
[4, 2, 5]])
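One possible solution for each exercise (remember that negative step sizes are not supported, hence torch.flip for the last one):
>>> x[-1]
tensor([1, 6, 7, 7])
>>> x[-1, -2:]
tensor([7, 7])
>>> torch.flip(x[:2, 1:], dims = (0, 1))
tensor([[8, 8, 6],
        [4, 2, 5]])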
Another useful type of operation is the reshaping of tensors. The most
flexible way of doing this is with the view method (PyTorch also
provides reshape, which behaves similarly but may copy the data when
needed). For example, if you want to put the numbers 1 through 9 in a
3x3 grid, you can do the following:
>>> torch.arange(1, 10).view(3, 3)
tensor([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
You can use -1 as a wildcard when reshaping: PyTorch will deduce the
correct value from the number of elements of the tensor.
>>> torch.arange(1, 10).view(3, -1)
tensor([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
A common thing to do when working with tensors is to flatten them
using .view(-1).
>>> x = torch.arange(1, 10).view(3, -1)
>>> x
tensor([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
>>> x.view(-1)
tensor([1, 2, 3, 4, 5, 6, 7, 8, 9])
Another common reshaping pattern is the conversion of a
one-dimensional tensor into a two-dimensional row or column
matrix. This can be done with the view method, or more easily by
making use of the None keyword within a slice operation:
>>> x = torch.tensor([1, 2, 3])
>>> x.view(1, 3) # Row vector via view
tensor([[1, 2, 3]])
>>> x[None, :] # Row vector via None
tensor([[1, 2, 3]])
>>> x.view(3, 1) # Column vector via view
tensor([[1],
[2],
[3]])
>>> x[:, None] # Column vector via None
tensor([[1],
[2],
[3]])
Exercises:
- Create a 3D tensor containing the numbers from 1 to 27 with shape (3, 3, 3)
tensor([[[ 1, 2, 3],
[ 4, 5, 6],
[ 7, 8, 9]],
[[10, 11, 12],
[13, 14, 15],
[16, 17, 18]],
[[19, 20, 21],
[22, 23, 24],
[25, 26, 27]]])
- Create the following tensor
tensor([[ 0, 2],
[ 4, 6],
[ 8, 10],
[12, 14],
[16, 18]])
- Using the answer to question 2, create the following tensor
tensor([[16, 18],
[12, 14],
[ 8, 10],
[ 4, 6],
[ 0, 2]])
- Using the answer to question 2, create the following tensor
tensor([[18, 16],
[14, 12],
[10, 8],
[ 6, 4],
[ 2, 0]])
- (harder) Create the following tensor
tensor([ 2, 1, 4, 3, 6, 5, 8, 7, 10, 9, 12, 11])
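If you want to compare with your own answers, here is one possible solution for each question (other combinations of arange, view and torch.flip would work too):
>>> torch.arange(1, 28).view(3, 3, 3)                    # Question 1
>>> a = torch.arange(0, 20, 2).view(-1, 2)               # Question 2
>>> torch.flip(a, dims = (0,))                           # Question 3: reverse the rows
>>> torch.flip(a, dims = (0, 1))                         # Question 4: reverse rows and columns
>>> torch.flip(torch.arange(1, 13).view(-1, 2), dims = (1,)).reshape(-1)  # Question 5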
It is possible to combine multiple tensors into one, and to conversely split a single tensor into multiple tensors.
First, to concatenate multiple 1D tensors, we can simply do the following:
>>> x = torch.tensor([1, 2, 3])
>>> y = torch.tensor([4, 5, 6])
>>> torch.cat([x, y])
tensor([1, 2, 3, 4, 5, 6])
>>> torch.cat([x, y, x])
tensor([1, 2, 3, 4, 5, 6, 1, 2, 3])
We can also concatenate multidimensional tensors by specifying the axis (dimension) along which we want to perform the concatenation:
>>> x = torch.arange(6).view(2, 3)
>>> x
tensor([[0, 1, 2],
[3, 4, 5]])
>>> y = torch.arange(6, 12).view(2, 3)
>>> y
tensor([[ 6, 7, 8],
[ 9, 10, 11]])
>>> torch.cat([x, y], dim = 0) # Concatenate along dimension 0 (rows)
tensor([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11]])
>>> torch.cat([x, y]) # The default concatenation dimension is 0
tensor([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11]])
>>> torch.cat([x, y], dim = 1) # Concatenate along dimension 1 (columns)
tensor([[ 0, 1, 2, 6, 7, 8],
[ 3, 4, 5, 9, 10, 11]])
Exercises: All the exercises use the following tensors:
>>> x = torch.tensor([1, 2, 3])
>>> y = torch.arange(6).view(2, 3)
>>> y
tensor([[0, 1, 2],
[3, 4, 5]])
- Concatenate x and y to create the following tensor. Be careful of the shapes of the tensors you are manipulating.
tensor([[1, 2, 3],
[0, 1, 2],
[3, 4, 5]])
- (harder) Using x, y and torch.cat, create the following tensor
tensor([[0, 1, 2, 1, 2, 3],
[3, 4, 5, 1, 2, 3]])
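Possible solutions; the key point is to first turn x into a 2D tensor so that the shapes are compatible:
>>> torch.cat([x.view(1, -1), y])
tensor([[1, 2, 3],
        [0, 1, 2],
        [3, 4, 5]])
>>> torch.cat([y, torch.cat([x.view(1, -1), x.view(1, -1)])], dim = 1)
tensor([[0, 1, 2, 1, 2, 3],
        [3, 4, 5, 1, 2, 3]])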
Multiple aggregation methods are available in PyTorch. Here are a few examples:
>>> x = torch.arange(10, dtype = torch.float32) # Have to specify a floating-point type to compute mean and std
>>> x
tensor([0., 1., 2., 3., 4., 5., 6., 7., 8., 9.])
>>> x.sum()
tensor(45.)
>>> x.mean()
tensor(4.5000)
>>> x.std()
tensor(3.0277)
>>> x.min()
tensor(0.)
>>> x.max()
tensor(9.)
Similarly to the torch.cat function, we can specify the axis (dimension)
along which we want to perform the computation.
>>> x = torch.arange(12, dtype = torch.float32).reshape(3, 4)
>>> x
tensor([[ 0., 1., 2., 3.],
[ 4., 5., 6., 7.],
[ 8., 9., 10., 11.]])
>>> x.sum(dim = 0) # Sum along the rows, we have one result per column
tensor([12., 15., 18., 21.])
>>> x.sum(dim = 1) # Sum along the columns, we have one result per row
tensor([ 6., 22., 38.])
>>> x.mean(dim = 0) # Mean along the rows, we have one result per column
tensor([4., 5., 6., 7.])
>>> x.mean(dim = 1) # Mean along the columns, we have one result per row
tensor([1.5000, 5.5000, 9.5000])
>>> x.min(dim = 0) # Minimum along the rows, we get two tensors, the first one with
# the value of the min for each column, the second one with the
# index of the row in which the minimum is.
torch.return_types.min(
values=tensor([0., 1., 2., 3.]),
indices=tensor([0, 0, 0, 0]))
>>> x.min(dim = 1) # Minimum along the columns, we get two tensors, the first one with
# the value of the min for each row, the second one with the
# index of the column in which the minimum is.
torch.return_types.min(
values=tensor([0., 4., 8.]),
indices=tensor([0, 0, 0]))
Exercise:
- Compute, for each column, the mean of the values of the rows with an even index in the following tensor.
>>> x = torch.arange(42, dtype = torch.float32).view(6, 7)
>>> x
tensor([[ 0., 1., 2., 3., 4., 5., 6.],
[ 7., 8., 9., 10., 11., 12., 13.],
[14., 15., 16., 17., 18., 19., 20.],
[21., 22., 23., 24., 25., 26., 27.],
[28., 29., 30., 31., 32., 33., 34.],
[35., 36., 37., 38., 39., 40., 41.]])
The result should be
tensor([14., 15., 16., 17., 18., 19., 20.])
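One possible solution: select the rows with even indices using a slice with a step of 2, then aggregate along the rows:
>>> x[::2].mean(dim = 0)
tensor([14., 15., 16., 17., 18., 19., 20.])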
Now that we know the basic manipulations on PyTorch tensors, we can start performing operations between them.
When tensors have identical shapes, binary operations are performed on an element-by-element basis:
>>> a = torch.tensor([1, 2, 3])
>>> b = torch.tensor([5, 5, 5])
>>> a + b
tensor([6, 7, 8])
Broadcasting allows these types of binary operations to be performed on tensors of different shapes. For example, we can simply add a scalar (think of it as a zero-dimensional tensor) to a tensor:
>>> a + 5
tensor([6, 7, 8])
You can think of this as an operation that stretches or duplicates the
value 5
into the tensor [5, 5, 5]
, and adds the results. The
advantage of PyTorch's broadcasting is that this duplication of values
does not actually take place, but it is a useful mental model as we
think about broadcasting.
>>> a
tensor([1, 2, 3])
>>> M = torch.ones((3, 3))
>>> M
tensor([[1., 1., 1.],
[1., 1., 1.],
[1., 1., 1.]])
>>> M + a
tensor([[2., 3., 4.],
[2., 3., 4.],
[2., 3., 4.]])
Here the one-dimensional tensor a is stretched, or broadcast,
across the rows in order to match the shape of M.
Broadcasting is a powerful tool that follows precise rules. An in-depth explanation of these rules can be found here.
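As a quick illustration of these rules, both operands can be stretched at the same time; here a (3, 1) column and a (1, 3) row are broadcast to a common (3, 3) shape:
>>> u = torch.arange(3).view(3, 1)
>>> v = torch.arange(3).view(1, 3)
>>> u + v
tensor([[0, 1, 2],
        [1, 2, 3],
        [2, 3, 4]])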
A typical use case of broadcasting is to normalize the values of a tensor by subtracting the mean and dividing by the standard deviation. In the following example, let's assume that the values of each column are observations of a different variable. We would like to compute the mean and standard deviation of each column and then perform the normalization.
>>> x = torch.randn(3, 5) # Randomly generate x using a normal distribution
>>> x
tensor([[-1.6168, 0.4247, 0.6664, -0.9946, -1.7289],
[ 0.9720, -0.7299, 0.7018, 0.7963, -1.0579],
[ 0.0678, -0.3628, 0.5733, -0.7313, 0.2808]])
>>> x.mean(dim = 0) # Compute the mean for each column
tensor([-0.1923, -0.2227, 0.6472, -0.3099, -0.8353])
>>> x.std(dim = 0) # Compute the standard deviation for each column
tensor([1.3139, 0.5899, 0.0663, 0.9670, 1.0232])
>>> (x - x.mean(dim = 0)) / x.std(dim = 0) # Standardize each column with its own
# mean and standard deviation
tensor([[-1.0842, 1.0974, 0.2900, -0.7081, -0.8733],
[ 0.8862, -0.8599, 0.8230, 1.1440, -0.2175],
[ 0.1980, -0.2375, -1.1129, -0.4358, 1.0909]])
Exercises:
All the exercises use the following tensor:
>>> a = torch.arange(10).view(-1, 2)
>>> a
tensor([[0, 1],
[2, 3],
[4, 5],
[6, 7],
[8, 9]])
- Add 2 to the first column and -1 to the second column of a
tensor([[ 2, 0],
[ 4, 2],
[ 6, 4],
[ 8, 6],
[10, 8]])
- Multiply the values of the first column of a by the maximum of the values of the second column. The result should be:
tensor([[ 0, 1],
[18, 3],
[36, 5],
[54, 7],
[72, 9]])
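Possible solutions, both relying on broadcasting; the second one works on a copy so that a itself is left unchanged:
>>> a + torch.tensor([2, -1])   # broadcast a (2,)-shaped tensor over the rows
tensor([[ 2,  0],
        [ 4,  2],
        [ 6,  4],
        [ 8,  6],
        [10,  8]])
>>> b = a.clone()
>>> b[:, 0] = b[:, 0] * b[:, 1].max()
>>> b
tensor([[ 0,  1],
        [18,  3],
        [36,  5],
        [54,  7],
        [72,  9]])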
PyTorch also provides binary comparison operators that output boolean tensors:
>>> a = torch.tensor([1, 2, 3, 4])
>>> b = torch.tensor([3, 2, 1, 4])
>>> a == b
tensor([False, True, False, True])
>>> a != b
tensor([ True, False, True, False])
>>> a > b
tensor([False, False, True, False])
>>> a <= b
tensor([ True, True, False, True])
These operators support broadcasting:
>>> a = torch.tensor([1, 2, 3, 4])
>>> a == 2
tensor([False, True, False, False])
>>> a % 2 == 0
tensor([False, True, False, True])
We can also combine boolean tensors using boolean operators:
>>> a = torch.arange(1, 11)
>>> a
tensor([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
>>> a % 2 == 0
tensor([False, True, False, True, False, True, False, True, False, True])
>>> a % 5 == 0
tensor([False, False, False, False, True, False, False, False, False, True])
>>> ~(a % 5 == 0)
tensor([ True, True, True, True, False, True, True, True, True, False])
>>> (a % 2 == 0) | (a % 5 == 0)
tensor([False, True, False, True, True, True, False, True, False, True])
>>> (a % 2 == 0) & (a % 5 == 0)
tensor([False, False, False, False, False, False, False, False, False, True])
A very powerful PyTorch feature is the ability to select elements of a tensor using boolean masks. The mask should have the same shape as the tensor (or a shape compatible with it).
>>> a = torch.arange(10)
>>> a
tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> a % 3 == 0 # Multiples of 3 mask
tensor([ True, False, False, True, False, False, True, False, False, True])
>>> a[a % 3 == 0] # Select the multiples of 3 using the mask
tensor([0, 3, 6, 9])
>>> a % 2 == 0 # Multiples of 2 mask
tensor([ True, False, True, False, True, False, True, False, True, False])
>>> ~(a % 2 == 0) # Numbers that are not multiples of 2
tensor([False, True, False, True, False, True, False, True, False, True])
>>> a[(a % 3 == 0) & ~(a % 2 == 0)] # Select the elements that are multiple of 3
# but not of 2
tensor([3, 9])
Boolean masking also works in multiple dimensions:
>>> a = torch.arange(12).reshape(3, 4)
>>> a
tensor([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
>>> a >= 2
tensor([[False, False, True, True],
[ True, True, True, True],
[ True, True, True, True]])
>>> a[a >= 2]
tensor([ 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])
>>> a < 10
tensor([[ True, True, True, True],
[ True, True, True, True],
[ True, True, False, False]])
>>> a[a < 10]
tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> a[(a >= 2) & (a < 10)]
tensor([2, 3, 4, 5, 6, 7, 8, 9])
Exercises: All the exercises use the following tensor:
>>> a = torch.arange(16).reshape(4, 4)
>>> a
tensor([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
- Select and sum the multiples of 3 in a, the result should be 45.
- Subtract the mean of the values strictly greater than 8 and not multiple of 4 from the whole tensor. The result should be:
tensor([[-12., -11., -10.,  -9.],
        [ -8.,  -7.,  -6.,  -5.],
        [ -4.,  -3.,  -2.,  -1.],
        [  0.,   1.,   2.,   3.]])
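Possible solutions using boolean masks (note the conversion to float before taking the mean, as .mean() is not defined for integer tensors):
>>> a[a % 3 == 0].sum()
tensor(45)
>>> a - a[(a > 8) & (a % 4 != 0)].float().mean()
tensor([[-12., -11., -10.,  -9.],
        [ -8.,  -7.,  -6.,  -5.],
        [ -4.,  -3.,  -2.,  -1.],
        [  0.,   1.,   2.,   3.]])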