Lesson 3

Hello, and welcome to our journey into the AdaBoost algorithm! AdaBoost, an abbreviation for Adaptive Boosting, is a crucial ensemble learning method employed in machine learning. Using Python, we'll build an AdaBoost model from scratch and learn how to boost prediction accuracy by combining multiple weak learners into a powerful one.

First, let's define our terms. Boosting is a technique in which several weak learners are combined to create a strong learner, thereby improving our predictive model. AdaBoost largely follows the same principle. However, it introduces an important twist: it adapts by focusing more on instances that were incorrectly predicted in previous iterations by assigning them higher weights.

Consider a multiphase bank loan approval process to illustrate this concept. Each phase in this process acts as a weak learner. The first phase might be a credit score check, followed by an employment history verification in the second phase, and so on. Collectively, these weak learners form a strong learner that decides on loan approval.

Now, let's bring AdaBoost to life with Python.

We begin by initializing the AdaBoost class, specifying the parameters (including the number of learners and the learning rate), and initializing lists to store the models and their weights:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

class AdaBoost:
    def __init__(self, num_learners=10, learning_rate=1):
        self.num_learners = num_learners    # Maximum number of weak learners
        self.learning_rate = learning_rate  # Scales each learner's weight
        self.models = []                    # Trained weak learners
        self.model_weights = []             # One weight (beta) per learner
```

The `fit` method trains the learners one after another. Later learners focus more on the instances that earlier ones predicted wrongly.

```python
def fit(self, X, y):
    M, N = X.shape
    W = np.ones(M) / M  # Initialize weights
    y = y * 2 - 1       # Convert y to {-1, 1}
    ...
```

The AdaBoost algorithm uses `{-1, 1}` labels instead of `{0, 1}` to simplify computing errors and aggregating predictions. With these labels, the product of the true label and a prediction is +1 for a correctly classified observation and -1 for a misclassified one, and the ensemble's output is simply the sign of the weighted sum of the learners' predictions. During training, the algorithm increases the weights of misclassified samples, so that after normalization the correctly classified ones carry relatively less weight.
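As a small illustration of the label mapping, the sketch below converts made-up `{0, 1}` labels to `{-1, 1}` and compares them with made-up weak learner predictions. The arrays are hypothetical values chosen for the example, not output from the model in this lesson.

```python
import numpy as np

# Hypothetical labels and weak learner predictions, for illustration only
y01 = np.array([0, 1, 1, 0, 1])       # original {0, 1} labels
y = y01 * 2 - 1                        # mapped to {-1, 1}
pred = np.array([-1, 1, -1, -1, 1])    # made-up predictions in {-1, 1}

agreement = y * pred                   # +1 where correct, -1 where wrong
misclassified = pred != y              # boolean mask of the wrong predictions

print(y)              # labels after the mapping
print(agreement)      # sign tells us which predictions were right
print(misclassified)  # True only at the misclassified position
```

Only the third sample is misclassified here, so it is the only one whose weight would grow in the next round.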

```python
    ...
    for _ in range(self.num_learners):
        tree = DecisionTreeClassifier(max_depth=1)  # A decision stump as the weak learner
        tree.fit(X, y, sample_weight=W)

        pred = tree.predict(X)
        error = W.dot(pred != y)  # Weighted error: sum of weights of misclassified samples
        if error > 0.5:
            break
        ...
```

In the AdaBoost algorithm, a weighted error above 0.5 means the weak classifier is performing worse than random guessing on the current weighting of the data. In that case we halt boosting rather than add a learner that contributes no improvement to our model.
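To see why the check uses the weighted error rather than the plain misclassification rate, here is a minimal sketch with hypothetical labels, predictions, and sample weights. The numbers are invented so that the two measures disagree.

```python
import numpy as np

# Hypothetical values, for illustration only
y = np.array([1, -1, 1, 1, -1])
pred = np.array([1, -1, -1, 1, 1])
W = np.array([0.1, 0.1, 0.4, 0.2, 0.2])   # sample weights, summing to 1

unweighted = np.mean(pred != y)            # fraction of samples that are wrong
error = W.dot(pred != y)                   # total weight of the wrong samples

print(unweighted, error)
print("stop boosting" if error > 0.5 else "keep learner")
```

The learner gets only two of five samples wrong, but those two samples carry most of the weight, so the weighted error exceeds 0.5 and the loop would stop.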

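```python
        ...
        error = max(error, 1e-10)  # Guard (added): avoid division by zero if a stump is perfect
        beta = self.learning_rate * np.log((1 - error) / error)  # Compute beta
        W = W * np.exp(beta * (pred != y))  # Update weights: only misclassified samples grow

        W = W / W.sum()  # Normalize weights so they sum to 1 again

        self.models.append(tree)
        self.model_weights.append(beta)
```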

Note how the weights are initialized. `np.ones(M)` creates an M-element array of ones, and dividing by M makes each weight equal to 1/M, so all weights sum to 1. This is a uniform distribution of weights across all data instances, which means the first learner treats all instances as equally important.

Then we compute `beta` using the formula $\beta=\eta \cdot \log \left(\frac{1-\varepsilon}{\varepsilon}\right)$, where:

- $\eta$ is the learning rate.
- $\varepsilon$ is the weighted error rate, calculated as `error = W.dot(pred != y)`, which gives the sum of the weights of the instances that were incorrectly predicted.
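The formula can be tried on a few error values to get a feel for it. The helper name `learner_weight` below is hypothetical and not part of the lesson's class; it is a sketch of the same expression using only the standard library.

```python
import math

def learner_weight(error, learning_rate=1.0):
    """beta = learning_rate * log((1 - error) / error), valid for 0 < error < 1."""
    return learning_rate * math.log((1 - error) / error)

# Evaluate the formula for a strong, a mediocre, and a chance-level learner
for eps in (0.1, 0.3, 0.5):
    print(f"error={eps:.1f} -> beta={learner_weight(eps):.3f}")
```

A learner at exactly 0.5 error gets a weight of zero, and the weight grows as the error shrinks, so more accurate learners have more say in the final vote.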

The instances' weights are then updated based on `beta` and the errors, so that the wrongly predicted instances carry more weight in subsequent iterations.
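One round of this update can be worked through by hand. The sketch below assumes four samples with uniform weights, a learning rate of 1, and a hypothetical learner that misclassifies only the last sample.

```python
import numpy as np

# Hypothetical setup: four samples, only the last one is misclassified
W = np.ones(4) / 4
miss = np.array([False, False, False, True])

error = W.dot(miss)                    # 0.25
beta = np.log((1 - error) / error)     # log(3), with learning rate 1 (assumed)

W_new = W * np.exp(beta * miss)        # multiply only the misclassified weight
W_new = W_new / W_new.sum()            # renormalize so the weights sum to 1

print(W_new)
```

After normalization the single misclassified sample holds half of the total weight, while the three correct samples share the other half, so the next learner is pushed to get that sample right.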

The AdaBoost `predict` method makes the final prediction by a weighted majority vote over the predictions of all the weak learners.

```python
def predict(self, X):
    # Weighted aggregate of the weak learners' {-1, 1} predictions
    Hx = sum(beta * h.predict(X) for h, beta in zip(self.models, self.model_weights))
    return np.where(Hx > 0, 1, 0)  # Weighted majority vote, mapped back to {0, 1} labels
```

This function calculates the final prediction of the AdaBoost algorithm. It does so by taking a weighted sum of predictions from each trained model (`self.models`), with the weights (`self.model_weights`) signifying the performance of each model. The better a model, the higher its weight.
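A toy example shows how the weighted vote can differ from a simple head count. The predictions and `betas` below are hypothetical numbers, not values produced by training.

```python
import numpy as np

# Hypothetical predictions of three weak learners on four samples, in {-1, 1}
preds = np.array([
    [ 1,  1, -1, -1],
    [-1,  1,  1, -1],
    [-1, -1,  1,  1],
])
betas = np.array([1.5, 0.4, 0.6])           # made-up learner weights

Hx = betas @ preds                          # weighted sum per sample
weighted = np.where(Hx > 0, 1, 0)           # weighted majority vote as {0, 1}
simple = np.where(preds.sum(axis=0) > 0, 1, 0)  # unweighted majority, for comparison

print(weighted)
print(simple)
```

On the first sample two of the three learners vote -1, yet the weighted vote is positive because the single dissenting learner has the largest weight.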

Next, we'll test our AdaBoost model on a synthetic data set. We create this data set using the `make_classification` function from `sklearn` and divide it into training and test datasets.

```python
X, y = make_classification(n_samples=1000)  # Creates a synthetic dataset (features, labels)

# Split data into training and testing datasets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

ada = AdaBoost(num_learners=10, learning_rate=0.5)  # Initialize AdaBoost model
ada.fit(X_train, y_train)  # Train the model
```

Finally, we evaluate our AdaBoost classifier by testing the accuracy of its predictions on the test data set.

```python
pred = ada.predict(X_test)
print('AdaBoost accuracy:', accuracy_score(y_test, pred))  # Accuracy as correct predictions over total predictions
```

The accuracy score, which is the ratio of correct predictions to the total number of predictions, is commonly used in classification problems as a performance metric.
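The ratio itself is easy to compute by hand. The sketch below uses hypothetical labels and predictions and plain NumPy. For inputs like these, it should give the same number as `accuracy_score`.

```python
import numpy as np

# Hypothetical true labels and predictions, for illustration only
y_true = np.array([1, 0, 1, 1, 0])
y_pred = np.array([1, 0, 0, 1, 1])

correct = y_true == y_pred     # True where the prediction matches the label
accuracy = correct.mean()      # correct predictions / total predictions

print(accuracy)                # 3 of the 5 predictions match
```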

Excellent! You've successfully learned and implemented the AdaBoost algorithm in Python. We've traversed the fascinating terrain of AdaBoost, walked through the code line by line, created a synthetic dataset, and assessed prediction accuracy. Looking ahead, practice exercises will solidify your learning. Keep exploring, enjoy practicing, and never stop learning!