Practical work 1

PyTorch tensor practical work, inspired by the Python Data Science Handbook.

In this practical work, we will set up a Python environment and discover PyTorch tensors. Tensors are the PyTorch equivalent of NumPy ndarrays, with additional capabilities specific to GPU use and automatic differentiation. Most of the methods available on NumPy arrays have an equivalent for PyTorch tensors, although the name may differ.

Anaconda setup

Anaconda is a Python distribution with great machine learning integration that works on Linux, Windows and Mac OS X. Using it instead of the default Python distribution of the OS gives us finer control, without requiring administrator privileges.

During this course, we will use virtual environments. A virtual environment is a tool that helps to keep the dependencies required by different projects separate, by creating an isolated Python environment for each of them. It also provides a way to easily reproduce the environment required to run a specific piece of code. This is one of the most important tools that developers use, even outside of the data science world. In this course we will use Anaconda virtual environments instead of virtualenv or pipenv because Anaconda environments are able to keep track of packages installed with conda (the Anaconda package manager).

Download the Python 3.7 version of Anaconda from this address.

Launch the installation shell script.

> chmod u+x Anaconda3-2019.10-Linux-x86_64.sh
> ./Anaconda3-2019.10-Linux-x86_64.sh

The installer will ask you whether it should modify your .bashrc to add the Anaconda folders to the PATH; answer yes.
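
Once the installation is complete, reload your shell configuration (or open a new terminal) so that the PATH modification takes effect:

> source ~/.bashrc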

Now that Anaconda is installed, we will create a virtual environment in which we will set up all our libraries. The documentation for Anaconda virtual environment management is available here.

I have already created a virtual environment containing all the libraries that we will use during this course. Download the export file here and import the environment using the following command:

conda env create -f deep_nlp.yml
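
Once the environment is created, activate it (assuming it is named deep_nlp, as the name of the yml file suggests):

conda activate deep_nlp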

Creation of PyTorch tensors

We can use torch.tensor to create tensors (arrays) from Python lists:

>>> import torch
>>> torch.tensor([1,2,3,4,5])
tensor([1, 2, 3, 4, 5])

All the values in a PyTorch tensor have to be of the same type, or be compatible via up-casting:

>>> torch.tensor([3.14, 2, 3, 4])
tensor([3.1400, 2.0000, 3.0000, 4.0000])
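
If you do not want to rely on up-casting, you can explicitly set the type of the tensor at creation time using the dtype argument:

>>> torch.tensor([1, 2, 3], dtype = torch.float64)
tensor([1., 2., 3.], dtype=torch.float64)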

PyTorch tensors can be multi-dimensional. Here we create a 2-dimensional tensor using a list of ranges.

>>> a = torch.tensor([range(i, i + 3) for i in [2, 4]])
>>> a
tensor([[2, 3, 4],
        [4, 5, 6]])

>>> a.ndim
2

>>> a.shape
torch.Size([2, 3])

>>> a.dtype
torch.int64

a.ndim tells us that we have created a 2D tensor. a.shape tells us that the first dimension is of length 2 (rows) and the second of length 3 (columns). a.dtype tells us that the tensor contains integers stored on 64 bits. It is important to note that these are not Python arbitrary precision integers: in order to perform fast computations on large quantities of data, PyTorch internally uses low level types (more explanations on this subject here).
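
An existing tensor can also be converted to another type using the .to method, which returns a new tensor:

>>> a.to(torch.float32)
tensor([[2., 3., 4.],
        [4., 5., 6.]])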

There are many efficient ways to create a variety of PyTorch tensors, most of them identical to the NumPy counterparts described here. The official documentation for PyTorch tensor creation is available here.
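
For instance, here is a small (non-exhaustive) sample of common creation functions:

>>> torch.zeros(2, 3)      # 2x3 tensor filled with 0.
tensor([[0., 0., 0.],
        [0., 0., 0.]])

>>> torch.ones(3)          # 1D tensor filled with 1.
tensor([1., 1., 1.])

>>> torch.eye(2)           # 2x2 identity matrix
tensor([[1., 0.],
        [0., 1.]])

>>> torch.arange(0, 10, 2) # Values from 0 (inclusive) to 10 (exclusive) with a step of 2
tensor([0, 2, 4, 6, 8])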

Exercises, from ML+:

  • Create a 1D tensor of numbers from 0 to 9
tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
  • Create a 3×3 PyTorch tensor of all True’s
tensor([[True, True, True],
        [True, True, True],
        [True, True, True]])

Indexing

Now that we can create PyTorch tensors, let's learn how to access the values they contain.

For 1D tensors, indexing elements can be done similarly to Python lists:

>>> a = torch.tensor([5, 0, 3, 3, 7, 9])
>>> a
tensor([5, 0, 3, 3, 7, 9])

>>> a[0]
tensor(5)

>>> a[2]
tensor(3)

>>> a[-1]
tensor(9)

>>> a[-2]
tensor(7)

In multi-dimensional tensors, items can be accessed using a comma-separated tuple of indices:

>>> a = torch.tensor([[3, 5, 2, 4], [7, 6, 8, 8], [1, 6, 7, 7]])
>>> a
tensor([[3, 5, 2, 4],
        [7, 6, 8, 8],
        [1, 6, 7, 7]])

>>> a[0]
tensor([3, 5, 2, 4])

>>> a[0][2]
tensor(2)

>>> a[0, 2]
tensor(2)

>>> a[-1, -2]
tensor(7)

>>> a[-1, -2] = 10
>>> a
tensor([[ 3,  5,  2,  4],
        [ 7,  6,  8,  8],
        [ 1,  6, 10,  7]])

Slicing

As with regular Python lists, we can create subtensors using the slice notation. PyTorch slicing follows that of the standard Python list; to access a slice of a tensor x, use this:

x[start:stop:step]

If any of these are unspecified, they default to the values start = 0, stop = size of dimension, step = 1.

>>> x = torch.arange(10)
>>> x
tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> x[:5] # first five elements
tensor([0, 1, 2, 3, 4])
>>> x[5:] # elements from index 5 onwards
tensor([5, 6, 7, 8, 9])
>>> x[4:7] # middle sub-array
tensor([4, 5, 6])
>>> x[::2] # every other element
tensor([0, 2, 4, 6, 8])
>>> x[1::2] # every other element, starting at index 1
tensor([1, 3, 5, 7, 9])

As of today, PyTorch does not support using a negative step size to flip a tensor. The same behavior can be obtained by using torch.flip, although this operation creates a copy of the tensor and not a view (this will be explained later).

>>> torch.flip(x, dims = (0,))
tensor([9, 8, 7, 6, 5, 4, 3, 2, 1, 0])

We can also slice multidimensional tensors. As with indexing, we specify the slicing indices for each dimension, separated by commas.

>>> x = torch.tensor([[12,  5,  2,  4], [ 7,  6,  8,  8], [ 1,  6,  7,  7]])
>>> x
tensor([[12,  5,  2,  4],
        [ 7,  6,  8,  8],
        [ 1,  6,  7,  7]])

>>> x[:2, :3] # first two rows, first three columns
tensor([[12,  5,  2],
        [ 7,  6,  8]])

>>> x[:, ::2] # all rows (: is a slicing with default start and stop), every other column
tensor([[12,  2],
        [ 7,  8],
        [ 1,  7]])

An extremely important concept to keep in mind when working with tensor slices is that they return views rather than copies of the tensor data. When slicing a Python list, the slice is a copy of the original list; this is not the case with torch.tensor slices.

>>> x = torch.tensor([[12,  5,  2,  4], [ 7,  6,  8,  8], [ 1,  6,  7,  7]])
>>> x
tensor([[12,  5,  2,  4],
        [ 7,  6,  8,  8],
        [ 1,  6,  7,  7]])

>>> x2 = x[:2, :2]
>>> x2
tensor([[12,  5],
        [ 7,  6]])

>>> x2[0, 0] = 99
>>> x2
tensor([[99,  5],
        [ 7,  6]])

>>> x
tensor([[99,  5,  2,  4],
        [ 7,  6,  8,  8],
        [ 1,  6,  7,  7]])

Keeping this fact in mind will spare you from terrible debugging sessions.
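
If you do need an independent copy of a subtensor, you can use the .clone method; modifying the clone leaves the original tensor untouched:

>>> x3 = x[:2, :2].clone()
>>> x3[0, 0] = 42
>>> x3
tensor([[42,  5],
        [ 7,  6]])

>>> x # x is left unchanged
tensor([[99,  5,  2,  4],
        [ 7,  6,  8,  8],
        [ 1,  6,  7,  7]])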

Exercises: All the exercises use the following array:

>>> x = torch.tensor([[12,  5,  2,  4], [ 7,  6,  8,  8], [ 1,  6,  7,  7]])
>>> x
tensor([[12,  5,  2,  4],
        [ 7,  6,  8,  8],
        [ 1,  6,  7,  7]])
  1. Select the last line of x
tensor([1, 6, 7, 7])
  2. Slice the two sevens on the last line of x
tensor([7, 7])
  3. (harder) Slice and reverse the lines and the columns of the top right rectangle
tensor([[8, 8, 6],
        [4, 2, 5]])

Shape manipulation

Another useful type of operation is the reshaping of tensors. The most flexible way of doing this is with the view method. For example, if you want to put the numbers 1 through 9 in a 3×3 grid, you can do the following:

>>> torch.arange(1, 10).view(3, 3)
tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])

You can use -1 as a wildcard when reshaping: PyTorch will deduce the correct value from the number of elements of the tensor.

>>> torch.arange(1, 10).view(3, -1)
tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])

A common thing to do when working with tensors is to flatten them using .view(-1).

>>> x = torch.arange(1, 10).view(3, -1)
>>> x
tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])

>>> x.view(-1)
tensor([1, 2, 3, 4, 5, 6, 7, 8, 9])
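
The flatten method produces the same result:

>>> x.flatten()
tensor([1, 2, 3, 4, 5, 6, 7, 8, 9])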

Another common reshaping pattern is the conversion of a one-dimensional tensor into a two-dimensional row or column matrix. This can be done with the view method, or more easily by making use of the None keyword within a slice operation:

>>> x = torch.tensor([1, 2, 3])

>>> x.view(1, 3) # Row vector via view
tensor([[1, 2, 3]])

>>> x[None, :] # Row vector via None
tensor([[1, 2, 3]])

>>> x.view(3, 1) # Column vector via view
tensor([[1],
        [2],
        [3]])

>>> x[:, None] # Column vector via None
tensor([[1],
        [2],
        [3]])
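
Indexing with None is equivalent to calling the unsqueeze method, which inserts a new dimension of size 1 at the given position:

>>> x.unsqueeze(0) # Same result as x[None, :]
tensor([[1, 2, 3]])

>>> x.unsqueeze(1) # Same result as x[:, None]
tensor([[1],
        [2],
        [3]])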

Exercises:

  1. Create a 3D tensor containing the numbers from 1 to 27 with shape (3, 3, 3)
tensor([[[ 1,  2,  3],
         [ 4,  5,  6],
         [ 7,  8,  9]],

        [[10, 11, 12],
         [13, 14, 15],
         [16, 17, 18]],

        [[19, 20, 21],
         [22, 23, 24],
         [25, 26, 27]]])
  2. Create the following tensor
tensor([[ 0,  2],
        [ 4,  6],
        [ 8, 10],
        [12, 14],
        [16, 18]])
  3. Using the answer to question 2, create the following tensor
tensor([[16, 18],
        [12, 14],
        [ 8, 10],
        [ 4,  6],
        [ 0,  2]])
  4. Using the answer to question 2, create the following tensor
tensor([[18, 16],
        [14, 12],
        [10,  8],
        [ 6,  4],
        [ 2,  0]])
  5. (harder) Create the following tensor
tensor([ 2,  1,  4,  3,  6,  5,  8,  7, 10,  9, 12, 11])

Combination

It is possible to combine multiple tensors into one, and, conversely, to split a single tensor into multiple tensors.

First, to concatenate multiple 1D tensors, we can simply do the following:

>>> x = torch.tensor([1, 2, 3])
>>> y = torch.tensor([4, 5, 6])

>>> torch.cat([x, y])
tensor([1, 2, 3, 4, 5, 6])

>>> torch.cat([x, y, x])
tensor([1, 2, 3, 4, 5, 6, 1, 2, 3])

We can also concatenate multidimensional tensors by specifying the axis (dimension) along which we want to perform the concatenation:

>>> x = torch.arange(6).view(2, 3)
>>> x
tensor([[0, 1, 2],
        [3, 4, 5]])

>>> y = torch.arange(6, 12).view(2, 3)
>>> y
tensor([[ 6,  7,  8],
        [ 9, 10, 11]])

>>> torch.cat([x, y], dim = 0) # Concatenate along dimension 0 (rows)
tensor([[ 0,  1,  2],
        [ 3,  4,  5],
        [ 6,  7,  8],
        [ 9, 10, 11]])

>>> torch.cat([x, y]) # The default concatenation dimension is 0
tensor([[ 0,  1,  2],
        [ 3,  4,  5],
        [ 6,  7,  8],
        [ 9, 10, 11]])

>>> torch.cat([x, y], dim = 1) # Concatenate along dimension 1 (columns)
tensor([[ 0,  1,  2,  6,  7,  8],
        [ 3,  4,  5,  9, 10, 11]])
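
A closely related function is torch.stack, which concatenates tensors along a new dimension instead of an existing one:

>>> torch.stack([x, y]) # The result has shape (2, 2, 3)
tensor([[[ 0,  1,  2],
         [ 3,  4,  5]],

        [[ 6,  7,  8],
         [ 9, 10, 11]]])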

Exercises: All the exercises use the following tensors:

>>> x = torch.tensor([1, 2, 3])
>>> y = torch.arange(6).view(2, 3)
>>> y
tensor([[0, 1, 2],
        [3, 4, 5]])
  1. Concatenate x and y to create the following tensor. Be careful with the shapes of the tensors you are manipulating.
tensor([[1, 2, 3],
        [0, 1, 2],
        [3, 4, 5]])
  2. (harder) Using x, y and torch.cat, create the following tensor
tensor([[0, 1, 2, 1, 2, 3],
        [3, 4, 5, 1, 2, 3]])

Aggregations

Multiple aggregation methods are available in PyTorch. Here are a few examples:

>>> x = torch.arange(10, dtype = torch.float32) # We have to specify a floating point type to compute the mean and std
>>> x
tensor([0., 1., 2., 3., 4., 5., 6., 7., 8., 9.])

>>> x.sum()
tensor(45.)

>>> x.mean()
tensor(4.5000)

>>> x.std()
tensor(3.0277)

>>> x.min()
tensor(0.)

>>> x.max()
tensor(9.)

Similarly to the torch.cat function, we can specify the dimension along which we want to perform the computation.

>>> x = torch.arange(12, dtype = torch.float32).reshape(3, 4)
>>> x
tensor([[ 0.,  1.,  2.,  3.],
        [ 4.,  5.,  6.,  7.],
        [ 8.,  9., 10., 11.]])

>>> x.sum(dim = 0) # Sum along the rows, we have one result by column
tensor([12., 15., 18., 21.])

>>> x.sum(dim = 1) # Sum along the columns, we have one result by row
tensor([ 6., 22., 38.])

>>> x.mean(dim = 0) # Mean along the rows, we have one result by column
tensor([4., 5., 6., 7.])

>>> x.mean(dim = 1) # Mean along the columns, we have one result by row
tensor([1.5000, 5.5000, 9.5000])

>>> x.min(dim = 0) # Minimum along the rows, we get two tensors, the first one with
                   # the value of the min for each column, the second one with the
                   # index of the row in which the minimum is.
torch.return_types.min(
values=tensor([0., 1., 2., 3.]),
indices=tensor([0, 0, 0, 0]))

>>> x.min(dim = 1) # Minimum along the columns, we get two tensors, the first one with
                   # the value of the min for each row, the second one with the
                   # index of the column in which the minimum is.
torch.return_types.min(
values=tensor([0., 4., 8.]),
indices=tensor([0, 0, 0]))
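
If only the indices are needed, the argmin and argmax methods return them directly:

>>> x.argmin(dim = 0) # For each column, the row index of its minimum
tensor([0, 0, 0, 0])

>>> x.argmax(dim = 1) # For each row, the column index of its maximum
tensor([3, 3, 3])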

Exercise:

  1. Compute, for each column, the mean of the values of the rows with even index in the following tensor.
>>> x = torch.arange(42, dtype = torch.float32).view(6, 7)
>>> x
tensor([[ 0.,  1.,  2.,  3.,  4.,  5.,  6.],
        [ 7.,  8.,  9., 10., 11., 12., 13.],
        [14., 15., 16., 17., 18., 19., 20.],
        [21., 22., 23., 24., 25., 26., 27.],
        [28., 29., 30., 31., 32., 33., 34.],
        [35., 36., 37., 38., 39., 40., 41.]])

The result should be

tensor([14., 15., 16., 17., 18., 19., 20.])

Broadcasting

Now that we know basic manipulations on PyTorch tensors, we can start making operations between them.

When tensors have identical shapes, binary operations are performed on an element-by-element basis:

>>> a = torch.tensor([1, 2, 3])
>>> b = torch.tensor([5, 5, 5])
>>> a + b
tensor([6, 7, 8])

Broadcasting allows these types of binary operations to be performed on tensors of different sizes. For example, we can simply add a scalar (think of it as a zero-dimensional tensor) to a tensor:

>>> a + 5
tensor([6, 7, 8])

You can think of this as an operation that stretches or duplicates the value 5 into the tensor [5, 5, 5], and adds the results. The advantage of PyTorch's broadcasting is that this duplication of values does not actually take place, but it is a useful mental model as we think about broadcasting.

>>> a
tensor([1, 2, 3])

>>> M = torch.ones((3, 3))
>>> M
tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])

>>> M + a
tensor([[2., 3., 4.],
        [2., 3., 4.],
        [2., 3., 4.]])

Here the one-dimensional tensor a is stretched, or broadcast, across a dimension in order to match the shape of M.

Broadcasting is a powerful tool that follows precise rules. An in-depth explanation of these rules can be found here.

A typical use case of broadcasting is to normalize the values of a tensor by subtracting the mean and dividing by the standard deviation. In the following example, let's assume that the values of each column are observations of a different variable. We would like to compute the mean and standard deviation for each column and then perform the normalization.

>>> x = torch.randn(3, 5) # Randomly generate x using a normal distribution
>>> x
tensor([[-1.6168,  0.4247,  0.6664, -0.9946, -1.7289],
        [ 0.9720, -0.7299,  0.7018,  0.7963, -1.0579],
        [ 0.0678, -0.3628,  0.5733, -0.7313,  0.2808]])

>>> x.mean(dim = 0) # Compute the mean for each column
tensor([-0.1923, -0.2227,  0.6472, -0.3099, -0.8353])

>>> x.std(dim = 0) # Compute the standard deviation for each column
tensor([1.3139, 0.5899, 0.0663, 0.9670, 1.0232])

>>> (x - x.mean(dim = 0)) / x.std(dim = 0) # Standardize each column with its own
                                           # mean and standard deviation
tensor([[-1.0842,  1.0974,  0.2900, -0.7081, -0.8733],
        [ 0.8862, -0.8599,  0.8230,  1.1440, -0.2175],
        [ 0.1980, -0.2375, -1.1129, -0.4358,  1.0909]])
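
We can quickly check that the standardization worked: the mean of each column of the result should now be numerically 0 (and its standard deviation 1). As x was generated randomly, the exact floating point residuals will vary from run to run:

>>> x_std = (x - x.mean(dim = 0)) / x.std(dim = 0)
>>> x_std.mean(dim = 0).abs().max() < 1e-6 # The column means are numerically 0
tensor(True)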

Exercises:

All the exercises use the following tensor:

>>> a = torch.arange(10).view(-1, 2)
>>> a
tensor([[0, 1],
        [2, 3],
        [4, 5],
        [6, 7],
        [8, 9]])
  1. Add 2 to the first column and -1 to the second column of a
tensor([[ 2,  0],
        [ 4,  2],
        [ 6,  4],
        [ 8,  6],
        [10,  8]])
  2. Multiply the values of the first column of a by the maximum of the values of the second column. The result should be:
tensor([[ 0,  1],
        [18,  3],
        [36,  5],
        [54,  7],
        [72,  9]])

Comparisons, Masks and Boolean logic

PyTorch also provides binary comparison operators that output boolean tensors:

>>> a = torch.tensor([1, 2, 3, 4])
>>> b = torch.tensor([3, 2, 1, 4])
>>> a == b
tensor([False,  True, False,  True])

>>> a != b
tensor([ True, False,  True, False])

>>> a > b
tensor([False, False,  True, False])

>>> a <= b
tensor([ True,  True, False,  True])

These operators support broadcasting:

>>> a = torch.tensor([1, 2, 3, 4])
>>> a == 2
tensor([False,  True, False, False])

>>> a % 2 == 0
tensor([False,  True, False,  True])

We can also combine boolean tensors using boolean operators:

>>> a = torch.arange(1, 11)
>>> a
tensor([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

>>> a % 2 == 0
tensor([False,  True, False,  True, False,  True, False,  True, False,  True])

>>> a % 5 == 0
tensor([False, False, False, False,  True, False, False, False, False,  True])

>>> ~(a % 5 == 0)
tensor([ True,  True,  True,  True, False,  True,  True,  True,  True, False])

>>> (a % 2 == 0) | (a % 5 == 0)
tensor([False,  True, False,  True,  True,  True, False,  True, False,  True])

>>> (a % 2 == 0) & (a % 5 == 0)
tensor([False, False, False, False, False, False, False, False, False,  True])

A very powerful PyTorch feature is the ability to select elements of a tensor using boolean masks. The mask should have the same shape as the tensor (or a broadcast-compatible one).

>>> a = torch.arange(10)
>>> a
tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

>>> a % 3 == 0 # Multiples of 3 mask
tensor([ True, False, False,  True, False, False,  True, False, False,  True])

>>> a[a % 3 == 0] # Select the multiples of 3 using the mask
tensor([0, 3, 6, 9])

>>> a % 2 == 0 # Multiples of 2 mask
tensor([ True, False,  True, False,  True, False,  True, False,  True, False])

>>> ~(a % 2 == 0) # Numbers that are not multiples of 2
tensor([False,  True, False,  True, False,  True, False,  True, False,  True])

>>> a[(a % 3 == 0) & ~(a % 2 == 0)] # Select the elements that are multiple of 3
                                    # but not of 2
tensor([3, 9])

Boolean masking also works in multiple dimensions:

>>> a = torch.arange(12).reshape(3, 4)
>>> a
tensor([[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]])

>>> a >= 2
tensor([[False, False,  True,  True],
        [ True,  True,  True,  True],
        [ True,  True,  True,  True]])

>>> a[a >= 2]
tensor([ 2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

>>> a < 10
tensor([[ True,  True,  True,  True],
        [ True,  True,  True,  True],
        [ True,  True, False, False]])

>>> a[a < 10]
tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

>>> a[(a >= 2) & (a < 10)]
tensor([2, 3, 4, 5, 6, 7, 8, 9])
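
A related function is torch.where, which builds a new tensor by choosing element-wise between two tensors according to a boolean mask, preserving the shape:

>>> torch.where(a >= 2, a, torch.zeros_like(a)) # Keep the values >= 2, set the others to 0
tensor([[ 0,  0,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]])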

Exercises: All the exercises use the following tensor:

>>> a = torch.arange(16).reshape(4, 4)
>>> a
tensor([[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11],
        [12, 13, 14, 15]])
  1. Select and sum the multiples of 3 in a. The result should be 45.

  2. Subtract the mean of the values that are strictly greater than 8 and not multiples of 4 from the whole tensor. The result should be:

tensor([[-12., -11., -10.,  -9.],
        [ -8.,  -7.,  -6.,  -5.],
        [ -4.,  -3.,  -2.,  -1.],
        [  0.,   1.,   2.,   3.]])