Welcome back! Previously, we trained a multi-class classification PyTorch model using our prepared Wine dataset. Now, let's delve into evaluating the model's performance. In this context, the goal is to discern how well our model generalizes from what it learned during training to unfamiliar, unseen data. This lesson digs into loss functions, accuracy computation, and performance interpretation, all key components of the model evaluation process. Additionally, we will visualize the training and validation losses using `matplotlib` to gain better insight into the learning process and model performance.
Before diving into model evaluation, let's briefly recap the process of building and training our PyTorch model using the Wine dataset. Below is the code snippet from the previous lesson that sets up our model, defines the loss function and optimizer, and trains the model over 150 epochs.
```python
import torch
import torch.nn as nn
import torch.optim as optim
from data_preprocessing import load_preprocessed_data

# Load preprocessed data
X_train, X_test, y_train, y_test = load_preprocessed_data()

# Define the model
model = nn.Sequential(
    nn.Linear(13, 10),
    nn.ReLU(),
    nn.Linear(10, 10),
    nn.ReLU(),
    nn.Linear(10, 3)
)

# Define criterion and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Train the model
num_epochs = 150
history = {'loss': [], 'val_loss': []}
for epoch in range(num_epochs):
    model.train()
    optimizer.zero_grad()
    outputs = model(X_train)
    loss = criterion(outputs, y_train)
    loss.backward()
    optimizer.step()
    history['loss'].append(loss.item())

    model.eval()
    with torch.no_grad():
        outputs_val = model(X_test)
        val_loss = criterion(outputs_val, y_test)
        history['val_loss'].append(val_loss.item())
```
This snippet includes:

- A model defined with `nn.Sequential`.
- `nn.CrossEntropyLoss` as the loss function and `Adam` as the optimizer.

With the model trained, we are now in a position to evaluate its performance in a more detailed manner.
During training, we calculated the loss at each epoch for both the training set and the test set (used here as a validation set), and stored these values in our `history` dictionary. This helped us monitor the model's performance and ensure it was not overfitting to the training data.
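As a quick illustration (not part of the original training code), you can inspect the recorded losses at regular intervals. This sketch assumes the `history` dictionary and `num_epochs` from the training loop above; the printed values will vary from run to run:

```python
# Inspect the recorded losses every 25 epochs (values vary per run)
for epoch in range(0, num_epochs, 25):
    print(f"Epoch {epoch + 1:3d}: "
          f"train loss = {history['loss'][epoch]:.4f}, "
          f"val loss = {history['val_loss'][epoch]:.4f}")
```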
Now, to evaluate our fully trained model, we again calculate the test loss and compute the accuracy, which is the fraction of correct predictions over total predictions. We'll use `torch.no_grad()` to disable gradient calculations, as they are not needed during evaluation, and the `accuracy_score` function from `sklearn.metrics` to compute accuracy.
Here's the code for performing these evaluations in PyTorch:
```python
from sklearn.metrics import accuracy_score

# Set the model to evaluation mode
model.eval()

# Disable gradient calculation
with torch.no_grad():
    # Input the test data into the model
    outputs = model(X_test)
    # Calculate the Cross Entropy Loss
    test_loss = criterion(outputs, y_test).item()
    # Choose the class with the highest value as the predicted output
    _, predicted = torch.max(outputs, 1)
    # Calculate the accuracy
    test_accuracy = accuracy_score(y_test, predicted)

print(f'Test Accuracy: {test_accuracy:.4f}, Test Loss: {test_loss:.4f}')
```
In this code, we first set the model to evaluation mode using `model.eval()`, which switches layers like dropout and batch normalization from their training behavior to their inference behavior. Next, we disable gradient calculation with `torch.no_grad()` to save memory and computational resources. We then pass the test data through the model to generate the outputs, and calculate the Cross Entropy Loss from these outputs to evaluate the model's performance.
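Our Wine model happens to contain no such layers, so `eval()` does not change its outputs here, but calling it is a good habit. The toy sketch below, using a standalone `nn.Dropout` layer that is not part of our model, shows the behavioral difference `eval()` controls:

```python
import torch
import torch.nn as nn

dropout = nn.Dropout(p=0.5)
x = torch.ones(1, 6)

dropout.train()    # training mode: randomly zeroes elements, scales the rest by 1/(1-p)
print(dropout(x))  # e.g. tensor([[2., 0., 2., 2., 0., 0.]]) -- varies per call

dropout.eval()     # evaluation mode: dropout is a no-op
print(dropout(x))  # tensor([[1., 1., 1., 1., 1., 1.]])
```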
We also obtain the predicted class labels by selecting the class with the highest value using `torch.max(outputs, 1)`, which returns two tensors: the maximum value along dimension 1 (which we discard by assigning it to `_`, as it's not needed) and the index of that maximum, which corresponds to the predicted class.
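To make the mechanics of `torch.max` concrete, here is a tiny sketch with made-up logits for three samples and three classes; the values are purely illustrative, not outputs of our trained model:

```python
import torch

# Hypothetical raw outputs (logits) for 3 samples and 3 classes
logits = torch.tensor([[ 2.1, -0.3,  0.4],
                       [-1.2,  0.8,  0.1],
                       [ 0.2,  0.5,  3.0]])

values, indices = torch.max(logits, 1)
print(values)   # tensor([2.1000, 0.8000, 3.0000]) -- the max value per row
print(indices)  # tensor([0, 1, 2]) -- the predicted class per row
```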
Finally, we compute the test accuracy by comparing the predicted labels with the true labels using the `accuracy_score` function from `sklearn.metrics`. The output values for test accuracy and test loss provide quantitative measures of our model's performance on unseen test data, offering insight into its generalization capability.
```
Test Accuracy: 0.9259, Test Loss: 0.4211
```
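As a side note, nothing forces the use of `sklearn` here: since `predicted` and `y_test` are both tensors, accuracy can also be computed directly in PyTorch. A minimal equivalent sketch:

```python
# Equivalent accuracy computation in pure PyTorch:
# fraction of positions where predicted and true labels match
test_accuracy = (predicted == y_test).float().mean().item()
print(f'Test Accuracy: {test_accuracy:.4f}')
```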
Visualizing the loss data during model evaluation is crucial, as it helps us understand the learning progress of our model over time. By plotting the training and validation loss, we can identify patterns such as overfitting or underfitting, providing valuable insights for fine-tuning the model. Using `matplotlib`, a widely used plotting library in Python, we will graph the loss history recorded during training to visually assess the model's performance.
Here's how we plot the loss data:
```python
import matplotlib.pyplot as plt

# Plot the recorded training and validation loss
epochs = range(1, num_epochs + 1)
train_loss = history['loss']
val_loss = history['val_loss']

plt.figure(figsize=(8, 5))
plt.plot(epochs, train_loss, label='Training Loss')
plt.plot(epochs, val_loss, label='Validation Loss')
plt.title('Model Loss During Training')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend()
plt.show()
```
The generated plot might look like this:
In the plot, both the training loss and the validation loss decrease steadily, which means the model was learning well from the data. The two curves stay close together, so the model wasn't just memorizing the training data; it was actually learning to generalize, which means it should perform well on new, similar data. The smooth decrease in loss also tells us that the training process was stable, and overall, the model trained quite successfully.
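If you prefer a numeric check alongside the visual one, a simple heuristic is to compare the final training and validation losses: a large gap hints at overfitting. The sketch below is a rough illustration with an arbitrarily chosen threshold, not a standard diagnostic:

```python
# Rough overfitting check: compare final training and validation loss
gap = history['val_loss'][-1] - history['loss'][-1]
if gap > 0.2:  # threshold chosen arbitrarily for illustration
    print(f"Validation loss exceeds training loss by {gap:.4f}: possible overfitting.")
else:
    print(f"Loss gap is {gap:.4f}: no strong sign of overfitting.")
```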
In this lesson, we explored different aspects of model evaluation, including calculating loss, predicting labels, and determining accuracy. We also learned how to visualize our loss data using `matplotlib` to assess the training progress. Upcoming practice exercises will give you hands-on experience in applying these concepts. Remember, mastering PyTorch requires a balanced mix of understanding and application. Happy coding!
For your convenience, here is the helper code snippet for loading and preprocessing the Wine dataset, which you can use to ensure your data is properly prepared for training and evaluation in PyTorch:
```python
import torch
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

def load_preprocessed_data():
    # Load the Wine dataset
    wine = load_wine()
    X, y = wine.data, wine.target

    # Split the dataset into training and testing sets
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, stratify=y)

    # Scale the features
    scaler = StandardScaler().fit(X_train)
    X_train_scaled = scaler.transform(X_train)
    X_test_scaled = scaler.transform(X_test)

    # Convert to PyTorch tensors
    X_train_tensor = torch.tensor(X_train_scaled, dtype=torch.float32)
    X_test_tensor = torch.tensor(X_test_scaled, dtype=torch.float32)
    y_train_tensor = torch.tensor(y_train, dtype=torch.long)
    y_test_tensor = torch.tensor(y_test, dtype=torch.long)

    return X_train_tensor, X_test_tensor, y_train_tensor, y_test_tensor
```
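As a quick sanity check, you can call the loader and inspect the tensor shapes. The Wine dataset has 178 samples with 13 features each, so a 70/30 split yields roughly 124 training and 54 test rows (the exact counts come from `train_test_split`):

```python
# Quick sanity check of the preprocessed tensors
X_train, X_test, y_train, y_test = load_preprocessed_data()
print(X_train.shape, y_train.shape)  # e.g. torch.Size([124, 13]) torch.Size([124])
print(X_test.shape, y_test.shape)    # e.g. torch.Size([54, 13]) torch.Size([54])
print(y_train.unique())              # tensor([0, 1, 2]) -- the three wine classes
```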