Hello and welcome! In this lesson, we'll dive into the process of training a neural network model in PyTorch. You'll learn how to import the necessary modules, prepare data, define a loss function (also called a criterion) and an optimizer, and set up and run a training loop. By the end of this lesson, you'll be able to train a neural network model using PyTorch.
Our example will be training a neural network to predict if a soccer team is likely to win based on average goals scored by the team and average goals conceded by the opponent.
Before starting, let's briefly refresh our knowledge of training a neural network. Training a model is the process of learning the weight parameters that minimize the error on the training data. The process involves passing data through the model (forward propagation), computing the loss (how far the model's prediction is from the actual value), and then adjusting the weights using this loss (backward propagation).
To do this in PyTorch, we will need our training data, a defined model, a loss function, and an optimizer for adjusting the weights.
Let's proceed with our example code to see these steps in practice.
At the core of supervised learning techniques, we need input data (features) and output data (targets/labels). In our scenario, the input represents the average goals scored by a soccer team and the average goals conceded by their opponent during the season. The output is binary, indicating whether the team is likely to win a match against this specific opponent (1) or not (0).
Let's create our input and output tensor data using the `torch.tensor()` function.
```python
import torch

# Input features [Average Goals Scored, Average Goals Conceded by Opponent]
X = torch.tensor([
    [3.0, 0.5], [1.0, 1.0], [0.5, 2.0], [2.0, 1.5],
    [3.5, 3.0], [2.0, 2.5], [1.5, 1.0], [0.5, 0.5],
    [2.5, 0.8], [2.1, 2.0], [1.2, 0.5], [0.7, 1.5]
], dtype=torch.float32)

# Target outputs [1 if the team is likely to win, 0 otherwise]
y = torch.tensor([[1], [0], [0], [1], [1], [0], [1], [0], [1], [0], [1], [0]], dtype=torch.float32)
```
It is important to note that we've used `dtype=torch.float32` for both `X` and `y`, as our loss function (Binary Cross-Entropy) requires the target tensor `y` to be in floating-point format. Other loss functions may require different data types, so it's crucial to ensure compatibility.
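For contrast, here is a minimal sketch with toy tensors (unrelated to our soccer data) showing that `nn.BCELoss` wants float targets while `nn.CrossEntropyLoss`, a common multi-class loss, wants integer class indices:

```python
import torch
import torch.nn as nn

# BCELoss expects probabilities and float targets of the same shape
bce = nn.BCELoss()
probs = torch.tensor([0.9, 0.2], dtype=torch.float32)
targets = torch.tensor([1.0, 0.0], dtype=torch.float32)  # float, as required
print(bce(probs, targets))

# CrossEntropyLoss, by contrast, expects integer class indices as targets
ce = nn.CrossEntropyLoss()
logits = torch.tensor([[2.0, 0.5], [0.3, 1.5]])  # raw scores for 2 classes
labels = torch.tensor([0, 1], dtype=torch.long)  # long (int64), not float
print(ce(logits, labels))
```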
We need to define our neural network model. For this binary classification task, we'll use a simple feedforward neural network with an input layer, one hidden layer, and an output layer.
```python
import torch.nn as nn

# Define the model using nn.Sequential
model = nn.Sequential(
    nn.Linear(2, 10),   # input layer: 2 features -> 10 hidden units
    nn.ReLU(),          # non-linear activation
    nn.Linear(10, 1),   # hidden layer -> single output
    nn.Sigmoid()        # squash the output to a probability in (0, 1)
)
```
To train our neural network model, we need to define a `criterion` (also known as a loss function) and an `optimizer`.
The criterion measures how far the model's predictions are from the actual output. PyTorch provides several loss function classes, and for our binary classification task, we'll use the Binary Cross-Entropy (BCE) Loss.
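To make this concrete, BCE for a predicted probability p and label y is -[y·log(p) + (1 − y)·log(1 − p)], averaged over the samples. Here's a minimal sketch with made-up values that checks the formula by hand against `nn.BCELoss`:

```python
import torch
import torch.nn as nn

p = torch.tensor([0.8, 0.3], dtype=torch.float32)  # predicted probabilities
t = torch.tensor([1.0, 0.0], dtype=torch.float32)  # true labels

# Manual BCE: mean of -[t*log(p) + (1 - t)*log(1 - p)]
manual = -(t * torch.log(p) + (1 - t) * torch.log(1 - p)).mean()
builtin = nn.BCELoss()(p, t)
print(manual.item(), builtin.item())  # both ≈ 0.2899
```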
The optimizer is used to update the model parameters (weights and biases) based on the gradients computed during backpropagation. PyTorch offers several optimization algorithms under `torch.optim`. In this example, we'll use Adam with a learning rate of 0.1.
```python
import torch.optim as optim

criterion = nn.BCELoss()                            # Binary Cross-Entropy loss
optimizer = optim.Adam(model.parameters(), lr=0.1)  # Adam optimizer, learning rate 0.1
```
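Adam is just one of the optimizers available; as a sketch of an alternative (not the setup used in this lesson), you could swap in classic stochastic gradient descent from the same module:

```python
# Alternative sketch: plain SGD with momentum instead of Adam
sgd_optimizer = optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
```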
Training a neural network typically involves iteratively passing data through our model, calculating the loss, and backpropagating the loss to update our model. This process is repeated for a certain number of epochs. An epoch is one complete pass through the entire training dataset.
Let's demonstrate this with our example code.
```python
# Train the model for 50 epochs
for epoch in range(50):
    model.train()          # Set the model to training mode

    optimizer.zero_grad()  # Zero the gradients for this iteration

    outputs = model(X)     # Forward pass: compute predictions

    loss = criterion(outputs, y)  # Compute the loss

    loss.backward()        # Backward pass: compute the gradient of the loss

    optimizer.step()       # Optimize the model parameters based on the gradients

    if (epoch + 1) % 10 == 0:  # Print every 10 epochs
        print(f"Epoch {epoch+1}, Loss: {loss.item()}")  # Print epoch loss
```
Within the training loop, the same sequence of operations runs once per epoch. Here's a breakdown of each step:
- Model Training Mode: `model.train()` places the model in training mode, enabling behaviors (such as dropout) that should only be active during training. It's good practice to keep it inside the loop so the model is consistently in training mode at the start of each epoch, especially if you switch to evaluation mode at some point during training.
- Reset Gradients: `optimizer.zero_grad()` resets the gradients of the model parameters; this is crucial because gradients accumulate by default in PyTorch (see the sketch after this list).
- Forward Pass: `outputs = model(X)` computes the model's predictions based on the current state of the model parameters.
- Loss Calculation: The loss is calculated by comparing the model's predictions (`outputs`) to the true labels (`y`) using the pre-defined loss function `criterion`, which in this case is `nn.BCELoss()`.
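To see why resetting gradients matters, here is a standalone sketch using a toy tensor (not our model) that demonstrates PyTorch's default accumulation behavior:

```python
import torch

w = torch.tensor([1.0], requires_grad=True)

# First backward pass: d(2*w)/dw = 2
(2 * w).sum().backward()
print(w.grad)  # tensor([2.])

# A second backward pass without zeroing ADDS to the stored gradient
(2 * w).sum().backward()
print(w.grad)  # tensor([4.]) -- accumulated, not replaced

w.grad.zero_()  # this reset is what optimizer.zero_grad() performs for each parameter
print(w.grad)  # tensor([0.])
```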
Following the loss calculation, the backward pass is initiated:
- Backward Pass: `loss.backward()` computes the gradients of the loss with respect to each parameter (see the inspection sketch after this list).
- Parameter Update: The optimizer uses these gradients in `optimizer.step()` to adjust the model's parameters, reducing the loss for the next iteration.
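If you're curious what `loss.backward()` actually produces, here's a small sketch (assuming it runs right after the backward pass inside the loop) that inspects the gradient stored on each parameter:

```python
# After loss.backward(), every parameter holds a gradient tensor in .grad
for name, param in model.named_parameters():
    print(name, tuple(param.grad.shape))
```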
This sequence is repeated for a predefined number of epochs (50 in this case), allowing the model to iteratively learn and improve its performance on the training data.
While training, it's helpful to keep track of the loss at each epoch (or at certain intervals). This allows us to see how our model is improving over time or if it's struggling. Here, we print the loss every 10 epochs.
```python
for epoch in range(50):
    # Training steps...

    if (epoch + 1) % 10 == 0:  # Every 10 epochs (range starts from 0)
        print(f"Epoch {epoch+1}, Loss: {loss.item()}")  # Print epoch loss
```
The output of the above code will be:
```
Epoch 10, Loss: 0.171359583735466
Epoch 20, Loss: 0.03882661089301109
Epoch 30, Loss: 0.012083499692380428
Epoch 40, Loss: 0.005638712551444769
Epoch 50, Loss: 0.0035258333664387465
```
The steadily decreasing loss values across epochs indicate that the model's performance is improving as it learns from the training data.
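Once training is done, you can put the model to use. Here's a quick sketch with a hypothetical match-up (the input values are invented for illustration):

```python
# Hypothetical new match-up: team scores 2.5 goals on average,
# opponent concedes 1.0 on average
model.eval()           # switch to evaluation mode
with torch.no_grad():  # disable gradient tracking for inference
    prob = model(torch.tensor([[2.5, 1.0]], dtype=torch.float32))
print(f"Predicted win probability: {prob.item():.3f}")
```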
Congratulations! You've just walked through the process of training a neural network model in PyTorch. You've learned to:
- Prepare input and target tensors
- Define a model, a loss function, and an optimizer
- Set up and run a training loop
- Monitor your model's progress during training
To reinforce your understanding and mastery of these concepts, work through the upcoming practice exercises. They will give you the opportunity to apply what you've learned to new scenarios and solidify your PyTorch proficiency.
And remember, successful machine learning is a process of iteration and refinement. Don't be discouraged if your first models aren't perfect. With practice and experience, your models (and your skills!) will continue to improve, so keep learning and experimenting!