Demystifying AdaBoost: A Practical Guide to Strengthening Predictive Models

Lesson 3

Introduction

Hello, and welcome to our journey into the AdaBoost algorithm! AdaBoost, an abbreviation for Adaptive Boosting, is a crucial ensemble learning method employed in machine learning. Using Python, we'll build an AdaBoost model from scratch and learn how to boost prediction accuracy by combining multiple weak learners into a powerful one.

Understanding Boosting and AdaBoost

First, let's define our terms. Boosting is a technique in which several weak learners are combined to create a strong learner, thereby improving our predictive model. AdaBoost largely follows the same principle. However, it introduces an important twist: it adapts by focusing more on instances that were incorrectly predicted in previous iterations by assigning them higher weights.

Consider a multiphase bank loan approval process to illustrate this concept. Each phase in this process acts as a weak learner. The first phase might be a credit score check, followed by an employment history verification in the second phase, and so on. Collectively, these weak learners form a strong learner that decides on loan approval.

Implementation of AdaBoost: Step 1

Now, let's bring AdaBoost to life with Python.

We begin by initializing the AdaBoost class, specifying the parameters (including the number of learners and the learning rate), and initializing lists to store the models and their weights:

Python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

class AdaBoost:
    def __init__(self, num_learners=10, learning_rate=1):
        self.num_learners = num_learners
        self.learning_rate = learning_rate
        self.models = []
        self.model_weights = []

Implementation of AdaBoost: Step 2

The fit method trains the learners iteratively in sequence. The later learners adjust to focus more on instances wrongly predicted by the earlier ones.

Python
def fit(self, X, y):
    M, N = X.shape
    W = np.ones(M) / M  # Initialize weights
    y = y * 2 - 1  # Convert y to {-1, 1}
    ...

AdaBoost works with {-1, 1} labels rather than {0, 1} because this simplifies the math. The line y = y * 2 - 1 assumes the incoming labels are 0 and 1 and maps them to -1 and 1. With this encoding, the product of a true label and a prediction is +1 for a correctly classified instance and -1 for a misclassified one, and the final prediction is simply the sign of the weighted sum of the learners' outputs. During training, the algorithm raises the weights of misclassified samples relative to correctly classified ones, so later learners concentrate on the hard cases.
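To make the label encoding concrete, here is a minimal sketch. The labels and predictions are made up for illustration and do not come from a trained model.

```python
import numpy as np

# Hypothetical labels and weak-learner predictions, invented for this example.
y01 = np.array([0, 1, 1, 0, 1])        # original {0, 1} labels
y = y01 * 2 - 1                        # same labels mapped to {-1, 1}
pred = np.array([-1, 1, -1, -1, 1])    # a weak learner's output in {-1, 1}

agreement = y * pred                   # +1 where correct, -1 where wrong
misclassified = pred != y              # boolean mask of the wrong predictions

print(y)
print(agreement)
print(misclassified)
```

Only the third instance is misclassified, so it is the only one with an agreement of -1 and a True entry in the mask.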

Python
    ...
    for _ in range(self.num_learners):
        tree = DecisionTreeClassifier(max_depth=1)
        tree.fit(X, y, sample_weight=W)

        pred = tree.predict(X)
        error = W.dot(pred != y)
        if error > 0.5:
            break
        ...

In the AdaBoost algorithm, if error exceeds 0.5, it means our weak classifier is performing worse than a random guess. So, we halt boosting to avoid incorporating its output, which doesn't contribute any value or improvement to our model.
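The weighted error can be checked by hand on a tiny example. The sketch below uses made-up weights, labels, and predictions. It also shows why an error above 0.5 is worse than guessing: flipping every prediction of such a learner would give an error below 0.5.

```python
import numpy as np

# Hypothetical values, chosen only to illustrate the computation.
W = np.array([0.1, 0.2, 0.3, 0.4])     # sample weights, summing to 1
y = np.array([1, -1, 1, -1])           # true labels in {-1, 1}
pred = np.array([1, 1, 1, -1])         # wrong only on the second instance

error = W.dot(pred != y)               # sum of weights of misclassified instances
flipped_error = W.dot(-pred != y)      # error of the learner with every prediction inverted

print(error)
print(flipped_error)
```

Here the only mistake carries weight 0.2, so the weighted error is 0.2, while the inverted learner's error is the complement, 0.8.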

Python
        ...
        error = max(error, 1e-10)  # Avoid division by zero when a learner makes no mistakes
        beta = self.learning_rate * np.log((1 - error) / error)  # Compute beta
        W = W * np.exp(beta * (pred != y))  # Update weights

        W = W / W.sum()  # Normalize weights

        self.models.append(tree)
        self.model_weights.append(beta)

Note how the weights are initialized. np.ones(M) creates an M-element array of ones, and dividing by M sets each weight to 1/M, so the weights sum to 1. This is a uniform distribution over the data instances, which means the first learner treats all instances as equally important.
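A quick sketch of what uniform initialization implies, using made-up labels and predictions: with equal weights, the weighted error is just the ordinary fraction of misclassified instances.

```python
import numpy as np

M = 5                                   # hypothetical number of training instances
W = np.ones(M) / M                      # uniform weights, each equal to 1/M

# Invented labels and predictions for illustration.
y = np.array([1, 1, -1, -1, 1])
pred = np.array([1, -1, -1, -1, -1])    # wrong on the second and fifth instances

weighted_error = W.dot(pred != y)       # weighted error under uniform weights
plain_error = np.mean(pred != y)        # ordinary misclassification rate

print(W.sum(), weighted_error, plain_error)
```

Both error values come out to 0.4 (two mistakes out of five). They only diverge in later rounds, once the weights stop being uniform.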

Then we compute beta using the formula $\beta = \eta \cdot \log\left(\frac{1-\varepsilon}{\varepsilon}\right)$, where:

$\eta$ is the learning rate.
$\varepsilon$ is the error rate (calculated as error = W.dot(pred != y)), which gives the sum of the weights of the instances that were incorrectly predicted.

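The instances' weights are then updated based on beta and the errors: each misclassified instance's weight is multiplied by $e^{\beta}$, and the weights are renormalized to sum to 1, so the wrongly predicted instances carry more weight in subsequent iterations.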
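One round of the update can be traced on a toy example. The sketch below assumes four instances with uniform weights, a learning rate of 1, and a single misclassified instance. All values are invented for illustration.

```python
import numpy as np

learning_rate = 1.0                                # eta in the formula
W = np.ones(4) / 4                                 # uniform starting weights
misclassified = np.array([False, False, True, False])  # assumed mistakes of one learner

error = W.dot(misclassified)                       # 0.25
beta = learning_rate * np.log((1 - error) / error) # log(3)

W_new = W * np.exp(beta * misclassified)           # scale up only the misclassified weight
W_new = W_new / W_new.sum()                        # renormalize so the weights sum to 1

print(beta)
print(W_new)
```

After the update, the misclassified instance holds half of the total weight (0.5) and each of the other three holds 1/6, so the next learner is pushed to get that instance right.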

Implementation of AdaBoost: Step 3

The AdaBoost predict method makes the final prediction by a weighted majority vote over the predictions of all the weak learners.

Python
def predict(self, X):
    Hx = sum(beta * h.predict(X) for h, beta in zip(self.models, self.model_weights))  # Weighted aggregate of predictions
    return (Hx > 0).astype(int)  # Weighted majority vote, mapped back to {0, 1} labels

This function calculates the final prediction of the AdaBoost algorithm. It does so by taking a weighted sum of predictions from each trained model (self.models), with the weights (self.model_weights) signifying the performance of each model. The better a model, the higher its weight.
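The effect of the weights on the vote is easiest to see with toy learners. In the sketch below, ThresholdStump is a hypothetical stand-in for a fitted weak learner, and the model weights are invented. Neither comes from the lesson's training code.

```python
import numpy as np

class ThresholdStump:
    """Hypothetical weak learner: predicts +1 if one feature exceeds a threshold, else -1."""
    def __init__(self, feature, threshold):
        self.feature = feature
        self.threshold = threshold

    def predict(self, X):
        return np.where(X[:, self.feature] > self.threshold, 1, -1)

models = [ThresholdStump(0, 0.0), ThresholdStump(1, 0.0), ThresholdStump(0, 1.0)]
model_weights = [1.2, 0.5, 0.4]         # assumed beta values, one per learner

X = np.array([[0.5, -1.0],
              [-0.5, 1.0],
              [2.0, 1.0]])

# Weighted sum of the learners' {-1, 1} votes, then thresholded at zero.
Hx = sum(beta * h.predict(X) for h, beta in zip(models, model_weights))
labels = (Hx > 0).astype(int)           # mapped to {0, 1} labels here

# Unweighted vote count, for comparison.
simple_vote = sum(h.predict(X) for h in models)

print(Hx)
print(labels)
print(simple_vote)
```

For the first row, two of the three learners vote -1, yet the weighted sum is positive (1.2 - 0.5 - 0.4 = 0.3) because the most reliable learner votes +1. An unweighted majority would have chosen the other class.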

Application of AdaBoost on Synthetic Data

Next, we'll test our AdaBoost model on a synthetic data set. We create this data set using the make_classification function from sklearn and divide it into training and test datasets.

Python
data = make_classification(n_samples=1000)  # Creates a synthetic dataset
X = data[0]
y = data[1]

# Split data into training and testing datasets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

ada = AdaBoost(num_learners=10, learning_rate=0.5)  # Initialize AdaBoost model
ada.fit(X_train, y_train)  # Train the model

Model Evaluation & Performance Analysis

Finally, we evaluate our AdaBoost classifier by testing the accuracy of its predictions on the test data set.

Python
pred = ada.predict(X_test)
print('AdaBoost accuracy:', accuracy_score(y_test, pred))  # Accuracy as correct predictions over total predictions

The accuracy score, which is the ratio of correct predictions to the total number of predictions, is commonly used in classification problems as a performance metric.
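As a sanity check on the definition, accuracy can also be computed directly with NumPy. The label arrays below are made up for illustration.

```python
import numpy as np

# Hypothetical true labels and predictions, invented for this example.
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])

correct = np.sum(y_true == y_pred)      # number of matching predictions
accuracy = correct / len(y_true)        # ratio of correct to total predictions

print(correct, accuracy)
```

Six of the eight predictions match, giving an accuracy of 0.75.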

Lesson Summary and Practice

Excellent! You've successfully learned and implemented the AdaBoost algorithm in Python. We've traversed the fascinating terrain of AdaBoost, walked through the code line by line, created a synthetic dataset, and assessed prediction accuracy. Looking ahead, practice exercises will solidify your learning. Keep exploring, enjoy practicing, and never stop learning!
