PyTorch is an open source machine learning library for Python. In this tutorial, we will provide an introduction to the main PyTorch features: the tensor library and autograd, the automatic differentiation package.

##### Tensor Library

The core data structure in PyTorch is the tensor, a multi-dimensional array like NumPy’s ndarrays, but with GPU support.

First, we need to import the torch package.

```python
import torch
```

###### Tensor operations

Creating a tensor filled with zeros.

```python
>>> t = torch.zeros(2, 3)
>>> t

 0  0  0
 0  0  0
[torch.FloatTensor of size 2x3]
```

Creating a tensor with random numbers drawn from a uniform distribution on the interval [0, 1).

```python
>>> t = torch.rand(2, 3)
>>> t

 0.4576  0.9765  0.1219
 0.4172  0.5013  0.3238
[torch.FloatTensor of size 2x3]
```

Reshaping a tensor.

```python
>>> t = torch.Tensor([1, 2, 3, 4])
>>> t.view(2, 2)

 1  2
 3  4
[torch.FloatTensor of size 2x2]
```

Matrix multiplication.

```python
>>> t1 = torch.Tensor([[1, 2], [3, 4]])
>>> t2 = torch.Tensor([[10], [30]])
>>> t1.mm(t2)

  70
 150
[torch.FloatTensor of size 2x1]
```

Element-wise matrix multiplication.

```python
>>> t1 = torch.Tensor([[1, 2], [3, 4]])
>>> t1.mul(t1)

  1   4
  9  16
[torch.FloatTensor of size 2x2]
```

The transpose of a matrix.

```python
>>> t1 = torch.Tensor([[1, 2, 3], [4, 5, 6]])
>>> t1

 1  2  3
 4  5  6
[torch.FloatTensor of size 2x3]

>>> t1.t()

 1  4
 2  5
 3  6
[torch.FloatTensor of size 3x2]
```

Slicing a tensor.

```python
>>> t = torch.Tensor([[1, 2], [3, 4]])
>>> t[:, -1]

 2
 4
[torch.FloatTensor of size 2]
```

###### GPU support

PyTorch uses separate backends for CPU and GPU.

Using a GPU in PyTorch is easy: we just call the cuda method to move a tensor to the GPU.

```python
t = t.cuda()
```
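The call fails on machines without a CUDA device, so a common defensive pattern (a minimal sketch, using torch.cuda.is_available) is:

```python
import torch

t = torch.zeros(2, 3)

# Move the tensor to the GPU only when a CUDA device is present;
# otherwise keep computing on the CPU.
if torch.cuda.is_available():
    t = t.cuda()
```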

###### NumPy bridge

Importing NumPy.

```python
import numpy as np
```

Creating an ndarray from a tensor.

```python
>>> t = torch.zeros(2, 3)
>>> t.numpy()
array([[ 0.,  0.,  0.],
       [ 0.,  0.,  0.]], dtype=float32)
```

Creating a tensor from a NumPy ndarray.

```python
>>> a = np.ones((1, 2))
>>> t = torch.from_numpy(a)
>>> t

 1  1
[torch.DoubleTensor of size 1x2]
```

Going back and forth between a PyTorch tensor and a NumPy ndarray is very efficient because they share the same underlying data.
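A quick sketch of that sharing in action: an in-place update on either side is immediately visible through the other.

```python
import numpy as np
import torch

a = np.ones(3)
t = torch.from_numpy(a)

# Both objects view the same buffer, so mutating the ndarray
# shows up in the tensor...
a[0] = 5

# ...and an in-place tensor op (note the trailing underscore)
# shows up in the ndarray.
t.add_(1)

# a and t are now both [6, 2, 2]
```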

##### Autograd Library

We need to import Variable from the autograd package.

```python
from torch.autograd import Variable
```

Autograd is an automatic differentiation library which lets you compute derivatives for all operations on tensors. To support automatic differentiation, PyTorch introduces Variables: wrappers around tensors that provide the ability to define a computational graph, so that we can automatically compute gradients. You can retrieve the tensor held by a Variable using the data attribute.

```python
>>> t = torch.ones(2, 3)
>>> v = Variable(t)
>>> v.data

 1  1  1
 1  1  1
[torch.FloatTensor of size 2x3]
```

A Variable represents a node in a computational graph and is able to accumulate a gradient, which is stored in its `grad` attribute. To create a Variable that supports gradient accumulation, we need to specify requires_grad=True.

```python
>>> t = torch.ones(2, 3)
>>> v = Variable(t, requires_grad=True)
```

###### Computation Graph

PyTorch uses a define-by-run paradigm: you run computations on actual tensors, and the graph is generated on the fly. Let’s define and run a simple computation graph.

```python
>>> x = Variable(torch.FloatTensor([2]), requires_grad=True)
>>> y = x**2 + 4*x
```

We can now compute gradients by calling the backward method; the grad attribute then contains the computed gradients.

```python
>>> y.backward()
>>> x.grad
Variable containing:
 8
[torch.FloatTensor of size 1]
```

A dynamic computation graph gives us a lot of flexibility: we can use control-flow primitives like `for` and `if` to define the graph at runtime, no special DSL is needed, and debugging PyTorch code is as easy as debugging Python code.
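As an illustration (the loop bound here is arbitrary, chosen just for the example), here is a graph whose shape is decided by ordinary Python control flow at runtime:

```python
import torch
from torch.autograd import Variable

x = Variable(torch.FloatTensor([2]), requires_grad=True)

# Plain Python `for` and `if` extend the graph as the code runs:
# each iteration that passes the check multiplies by x again.
y = x
for _ in range(3):
    if float(y.data[0]) < 100:
        y = y * x

y.backward()
# All three iterations ran, so y = x**4 = 16,
# and x.grad holds dy/dx = 4*x**3 = 32.
```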