Lesson 3
Hyperparameter Tuning Using GridSearchCV
Lesson Overview

Welcome to today's lesson on Hyperparameter Tuning Using GridSearchCV! Our goal is to optimize a Gradient Boosting model to predict Tesla ($TSLA) stock prices more accurately. This lesson will guide you through the process of hyperparameter tuning using GridSearchCV, focusing on understanding key hyperparameters, setting up a hyperparameter grid, and implementing GridSearchCV to find the best model parameters.

Brief Revision of Loading and Preparing the Dataset

Before diving into hyperparameter tuning, let's quickly revise how we load and prepare our dataset. We start by loading the Tesla dataset, adding technical indicators, and splitting the data into training and testing sets.

Here's a quick overview of the code:

Python
import pandas as pd
from datasets import load_dataset

# Load dataset
tesla = load_dataset('codesignal/tsla-historic-prices')
tesla_df = pd.DataFrame(tesla['train'])

# Feature Engineering
tesla_df['SMA_5'] = tesla_df['Adj Close'].rolling(window=5).mean()
tesla_df['SMA_10'] = tesla_df['Adj Close'].rolling(window=10).mean()
tesla_df['EMA_5'] = tesla_df['Adj Close'].ewm(span=5, adjust=False).mean()
tesla_df['EMA_10'] = tesla_df['Adj Close'].ewm(span=10, adjust=False).mean()

# Drop NaN values created by moving averages
tesla_df.dropna(inplace=True)

# Select features and target
features = tesla_df[['Open', 'High', 'Low', 'Close', 'Volume', 'SMA_5', 'SMA_10', 'EMA_5', 'EMA_10']].values
target = tesla_df['Adj Close'].shift(-1).dropna().values  # Predicting next day's close price
features = features[:-1]  # Align features and target arrays correctly for time series forecasting

# Splitting the dataset into training and testing sets
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(features, target, test_size=0.25, random_state=42)

The code above loads the Tesla historic prices dataset, applies feature engineering to add technical indicators like Simple Moving Averages (SMA) and Exponential Moving Averages (EMA), and removes the NaN values created by the rolling calculations. It then selects the relevant features and the target variable and splits the data into training and testing sets. The line target = tesla_df['Adj Close'].shift(-1).dropna().values defines the target as the next day's closing price. The line features = features[:-1] ensures that the features and target arrays are aligned correctly for this time series forecasting task, where each row of features is paired with the following day's closing price.
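
To see why this alignment step matters, here is a tiny illustrative sketch using made-up prices rather than the Tesla data. Shifting the series by -1 makes each row's target the next day's value, which leaves the final row without a target, so we drop the last feature row to match.

Python
import pandas as pd

# Toy example: the target is "tomorrow's" price, so the last row has no target
prices = pd.Series([10.0, 11.0, 12.0, 13.0])
tomorrow = prices.shift(-1)          # [11.0, 12.0, 13.0, NaN]
target = tomorrow.dropna().values    # 3 target values
features = prices.values[:-1]        # keep the first 3 rows to stay aligned
print(features, target)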

Introduction to Hyperparameter Tuning

Hyperparameters are configuration settings used to tune how our models learn. Examples include learning_rate, n_estimators, and max_depth in Gradient Boosting. Proper hyperparameter tuning can significantly improve model performance.
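
To make this concrete, here is a minimal sketch of training a single GradientBoostingRegressor with manually chosen hyperparameters; the specific values are illustrative rather than tuned, and the training and test arrays come from the split above.

Python
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error

# One model with hand-picked (untuned) hyperparameters
baseline = GradientBoostingRegressor(
    learning_rate=0.1,   # contribution of each tree to the final prediction
    n_estimators=100,    # number of boosting stages (trees)
    max_depth=3,         # maximum depth of each tree
    random_state=42
)
baseline.fit(X_train, y_train)
print("Baseline MSE:", mean_squared_error(y_test, baseline.predict(X_test)))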

Imagine you're trying to make the perfect soup. Hyperparameter tuning is like adjusting the seasoning to get the best flavor. Just as too much salt or too little pepper can ruin the dish, poor hyperparameter choices can cause our model to underperform.

The downside of this exhaustive approach, however, is that it takes much more time, since every combination of hyperparameters has to be tested.

Setting up a Hyperparameter Grid

To find the best hyperparameters, we'll need to test various combinations. This is where the hyperparameter grid comes in. We define a set of values to test for each hyperparameter.

Here are the key hyperparameters we'll tune:

  • learning_rate: This controls the contribution of each tree to the final prediction. A smaller learning rate means the model learns more slowly but can achieve better performance with proper tuning.
  • n_estimators: This is the number of boosting stages (trees) to be used in the model. More boosting stages can improve performance but may also lead to overfitting.
  • max_depth: This determines the maximum depth of the trees. Deeper trees can capture more complex patterns but may also overfit the training data.

Here's how to set up a hyperparameter grid:

Python
# Setting up the hyperparameter grid
param_grid = {
    'learning_rate': [0.01, 0.1],
    'n_estimators': [100, 200],
    'max_depth': [3, 4]
}

In this param_grid dictionary, the keys are hyperparameter names and the values are lists of settings to try. Every combination of learning_rate, n_estimators, and max_depth will be tested, which here gives 2 × 2 × 2 = 8 combinations.
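
If you want to see exactly which combinations the grid expands to, here is a quick optional sketch using itertools.product; GridSearchCV performs this expansion internally, so this is only for illustration.

Python
from itertools import product

# Enumerate every combination in the grid - the same set GridSearchCV will try
keys = list(param_grid.keys())
for values in product(*param_grid.values()):
    print(dict(zip(keys, values)))
# Prints 8 dictionaries, one per combination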

Implementing GridSearchCV

GridSearchCV automates the process of hyperparameter tuning by searching for the best combination of parameters in our grid.

Here's how to implement GridSearchCV:

Python
from sklearn.model_selection import GridSearchCV
from sklearn.ensemble import GradientBoostingRegressor

# Instantiate the GridSearchCV object
model = GridSearchCV(GradientBoostingRegressor(random_state=42), param_grid, cv=3)

# Fit the model to the training data
model.fit(X_train, y_train)

In the code above, we first import the necessary libraries. We then instantiate the GridSearchCV object with a GradientBoostingRegressor and our predefined param_grid. The cv=3 parameter specifies that 3-fold cross-validation should be used, meaning the data will be split into three subsets, and the model will be trained and validated three times, each time using a different subset for validation and the remaining subsets for training. This helps ensure the model's performance is robust and not dependent on a particular train-test split. Finally, we fit the GridSearchCV object to the training data, which involves training multiple models using different hyperparameter combinations and selecting the best one based on cross-validation results.
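
With 8 parameter combinations and 3 folds, the search fits 24 models before refitting the best combination on the full training set. If you want to see how each combination scored, the fitted GridSearchCV object exposes a cv_results_ attribute; loading it into a DataFrame is an optional inspection step, not required for the rest of the lesson.

Python
import pandas as pd

# Inspect the cross-validated score of every parameter combination
results = pd.DataFrame(model.cv_results_)
print(results[['param_learning_rate', 'param_n_estimators', 'param_max_depth',
               'mean_test_score', 'rank_test_score']])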

Evaluating and Interpreting Results

Once GridSearchCV has found the best parameters, we need to evaluate and interpret the results.

Python
# Print the best parameters found
print("Best parameters found:", model.best_params_)
# Output:
# Best parameters found: {'learning_rate': 0.1, 'max_depth': 3, 'n_estimators': 100}

Now, using the combination of hyperparameters that resulted in the best model performance, let's calculate the error on the test set:

Python
# Predict with the best estimator
best_model = model.best_estimator_
predictions = best_model.predict(X_test)

# Calculate and print Mean Squared Error
from sklearn.metrics import mean_squared_error
mse = mean_squared_error(y_test, predictions)
print("Mean Squared Error with best params:", mse)
# Output:
# Mean Squared Error with best params: 22.27547097230719

This value reflects the model's prediction error with the optimized hyperparameters; a lower MSE indicates more accurate predictions.
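
Because MSE is expressed in squared price units, it can be easier to interpret the error as a root mean squared error (RMSE), which is back in the same units as the stock price. A small optional sketch:

Python
import numpy as np

# RMSE is in the same units as the target (price), so it is easier to interpret
rmse = np.sqrt(mse)
print("Root Mean Squared Error with best params:", rmse)
# Roughly 4.72 for the MSE shown above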

Visualizing Predictions

Visualizing predictions helps us understand how well our model is performing and identify any patterns or discrepancies between the actual and predicted values. By plotting the actual values against the predicted values, we can visually assess the model's accuracy and spot areas where the predictions may be off. This is crucial for interpreting the effectiveness of our hyperparameter tuning and understanding the model's behavior.

Python
# Plotting predictions vs actual values
import matplotlib.pyplot as plt

plt.figure(figsize=(10, 6))
plt.scatter(range(len(y_test)), y_test, label='Actual', alpha=0.7)
plt.scatter(range(len(y_test)), predictions, label='Predicted', alpha=0.7)
plt.title('Actual vs Predicted Values with Tuned Hyperparameters')
plt.xlabel('Sample Index')
plt.ylabel('Value')
plt.legend()
plt.show()

Here, we visualize the comparison between actual values and predictions. The closer these points are together, the better the model's predictive performance.

Lesson Summary

Great job! You've now learned how to use GridSearchCV for hyperparameter tuning to optimize a Gradient Boosting model. This process involves defining a hyperparameter grid, implementing GridSearchCV, and evaluating the results. Applying these techniques will significantly enhance your model's performance and ensure more accurate predictions.

Enjoy this lesson? Now it's time to practice with Cosmo!
Practice is how you turn knowledge into actual skills.