Hello there! In today's lesson, we delve deeper into PyTorch Tensors and their fundamental operations such as addition and multiplication. We will also learn about broadcasting, a powerful feature in PyTorch that makes it possible to perform operations between tensors of different shapes. Let's get started!
In our previous lesson, we learned about PyTorch Tensors, the basic building blocks in PyTorch that hold data. They resemble NumPy arrays but come with additional features, such as GPU acceleration, that are essential for machine learning. Now, we will learn the different operations that can be performed on tensors and better understand how these computations happen deep inside neural networks.
Fundamentally, tensor operations include arithmetic operations such as addition, subtraction, multiplication, and division. Two essential forms of multiplication for tensors are element-wise multiplication and matrix multiplication. We can also perform operations between a scalar and a tensor, or between tensors of different shapes, thanks to a technique called broadcasting.
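Subtraction and division don't get their own walkthroughs below, so here is a minimal sketch of both up front. The tensors `x` and `y` are illustrative values invented for this sketch and are not reused in the examples that follow:

```python
import torch

# Two small example tensors (illustrative values)
x = torch.tensor([[10.0, 20.0], [30.0, 40.0]])
y = torch.tensor([[1.0, 2.0], [4.0, 5.0]])

# Element-wise subtraction: torch.sub() or the - operator
print(torch.sub(x, y))  # [[9., 18.], [26., 35.]]
print(x - y)            # same result

# Element-wise division: torch.div() or the / operator
print(torch.div(x, y))  # [[10., 10.], [7.5, 8.]]
print(x / y)            # same result
```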
In tensor addition, we simply add the corresponding elements of the two tensors. PyTorch provides the `torch.add()` function, which takes two tensors as inputs and returns a new tensor holding their element-wise sum. Using the `+` operator is just as easy.
Let's go over an example:
```python
import torch

# Creating two tensors
tensor_a = torch.tensor([[1, 2], [3, 4]], dtype=torch.int32)
tensor_b = torch.tensor([[5, 6], [7, 8]], dtype=torch.int32)

# Tensor addition
tensor_sum = torch.add(tensor_a, tensor_b)
print(f"Tensor Addition:\n{tensor_sum}")
```
The output of the above code will be:
```text
Tensor Addition:
tensor([[ 6,  8],
        [10, 12]], dtype=torch.int32)
```
This output shows the result of adding two tensors element-wise: each element from `tensor_a` is added to the corresponding element in `tensor_b`, resulting in a new tensor of the same shape.
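As noted above, the `+` operator performs the same element-wise addition. A quick sketch, reusing `tensor_a`, `tensor_b`, and `tensor_sum` from the example above (the variable name `tensor_sum_op` is ours):

```python
# The + operator is equivalent to torch.add for tensors
tensor_sum_op = tensor_a + tensor_b
print(f"Addition with + operator:\n{tensor_sum_op}")

# torch.equal confirms both approaches produce identical tensors
print(torch.equal(tensor_sum, tensor_sum_op))  # True
```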
Next, let's learn about tensor multiplication. One type is element-wise multiplication, where each element in one tensor is multiplied by the corresponding element in the other tensor. We can use the `torch.mul()` function or the `*` operator for this.
The other type is matrix multiplication. This is the matrix multiplication from linear algebra, where each entry of the result is the dot product of a row of the first matrix and a column of the second. For this, we use the `torch.matmul()` function.
Below is Python code demonstrating these operations:
```python
# Element-wise Multiplication
tensor_product = torch.mul(tensor_a, tensor_b)
print(f"Element-wise Multiplication:\n{tensor_product}")

# Matrix Multiplication
tensor_c = torch.tensor([[1], [2]], dtype=torch.int32)  # 2x1 tensor
tensor_matmul = torch.matmul(tensor_a, tensor_c)
print(f"Matrix Multiplication:\n{tensor_matmul}")
```
The output of the above code will be:
```text
Element-wise Multiplication:
tensor([[ 5, 12],
        [21, 32]], dtype=torch.int32)

Matrix Multiplication:
tensor([[ 5],
        [11]], dtype=torch.int32)
```
These outputs demonstrate two different multiplication techniques. In element-wise multiplication, each element of `tensor_a` is multiplied by the corresponding element of `tensor_b`. In matrix multiplication, the dot product between the 2x2 `tensor_a` and the 2x1 `tensor_c` results in a new tensor whose shape follows the rules of matrix products.
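As a side note, Python's `@` operator is shorthand for the same matrix multiplication, and inspecting the `.shape` attribute makes the (2, 2) x (2, 1) -> (2, 1) rule explicit. A brief sketch, reusing the tensors defined above:

```python
# The @ operator is shorthand for torch.matmul
tensor_matmul_op = tensor_a @ tensor_c
print(tensor_matmul_op)  # same values as torch.matmul(tensor_a, tensor_c)

# Shapes follow the (n, m) x (m, p) -> (n, p) rule
print(tensor_a.shape, tensor_c.shape, tensor_matmul_op.shape)
# torch.Size([2, 2]) torch.Size([2, 1]) torch.Size([2, 1])
```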
Broadcasting is a technique that allows PyTorch to perform operations on tensors of different shapes. It essentially extends the smaller tensor to match the shape of the larger tensor so that operations can be performed element-wise.
Broadcasting comes in handy when performing operations between a tensor and a scalar, or between tensors of different shapes, such as adding a vector to each row of a matrix.
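If you want to check whether two shapes are broadcast-compatible before running an operation, recent versions of PyTorch provide `torch.broadcast_shapes`, which returns the resulting shape or raises an error. A small sketch (this helper is not part of the main example below):

```python
# torch.broadcast_shapes reports the shape an operation would produce
print(torch.broadcast_shapes((2, 2), (2, 1)))  # torch.Size([2, 2])
print(torch.broadcast_shapes((2, 2), (2,)))    # torch.Size([2, 2])
```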
Here's how you can accomplish broadcasting in PyTorch:
```python
# Broadcasted Addition (Tensor + scalar)
tensor_add_scalar = tensor_a + 5
print(f"Broadcasted Addition (Adding scalar value):\n{tensor_add_scalar}")

# Broadcasted Addition between tensors of different shapes (same as torch.add)
broadcasted_sum = tensor_a + tensor_c
print(f"Broadcasted Addition:\n{broadcasted_sum}")

# Broadcasted Multiplication between tensors of different shapes (same as torch.mul)
broadcasted_mul = tensor_a * tensor_c
print(f"Broadcasted Multiplication:\n{broadcasted_mul}")
```
The output of the above code will be:
```text
Broadcasted Addition (Adding scalar value):
tensor([[6, 7],
        [8, 9]], dtype=torch.int32)

Broadcasted Addition:
tensor([[2, 3],
        [5, 6]], dtype=torch.int32)

Broadcasted Multiplication:
tensor([[1, 2],
        [6, 8]], dtype=torch.int32)
```
These results illustrate how broadcasting allows operations between tensors of differing shapes. In the first operation, a scalar is added to every element of `tensor_a`, demonstrating broadcasting with a scalar. The second and third outputs show broadcasting between two tensors of different shapes, where `tensor_c` is broadcast to match the shape of `tensor_a`, allowing element-wise addition and multiplication. The addition here is equivalent to calling `torch.add`, and the multiplication to calling `torch.mul`; both functions perform broadcasting internally.
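Broadcasting works along the other axis as well. Here is a final sketch of the "add a vector to each row of a matrix" case mentioned earlier; the 1x2 row vector `row` is a new illustrative tensor, not one from the examples above:

```python
# A 1x2 row vector is broadcast across both rows of the 2x2 tensor_a
row = torch.tensor([[10, 20]], dtype=torch.int32)
print(tensor_a + row)
# tensor([[11, 22],
#         [13, 24]], dtype=torch.int32)
```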
Congratulations! You're now familiar with basic tensor operations in PyTorch: addition, multiplication, and broadcasting. We've walked through and implemented each of these, especially broadcasting, a crucial concept that comes up often in practice. This will serve as a stepping stone toward understanding how neural networks work and how to implement them in PyTorch. Happy learning!