Lesson 4

Welcome! Today, we are diving into a fascinating metric in machine learning called **AUC-ROC**.

Our goal for this lesson is to understand what AUC (Area Under the Curve) and ROC (Receiver Operating Characteristic) are, how to calculate and interpret the AUC-ROC metric, and how to visualize the ROC curve using Python. Ready to explore? Let's get started!

**ROC (Receiver Operating Characteristic)**: This graph shows the performance of a classification model at different threshold settings. It plots the True Positive Rate (`TPR`) against the False Positive Rate (`FPR`). In this context, a **threshold** is a value that determines the cutoff point for classifying a positive versus a negative outcome based on the model's predicted probabilities. For example, if the threshold is set to 0.5, any predicted probability above 0.5 is classified as positive, and anything below is classified as negative. By varying this threshold, we generate different True Positive and False Positive rates, which are then used to plot the ROC curve.
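For instance, here is a minimal sketch, using made-up probabilities, of how a single threshold turns predicted scores into class labels (scores at or above the threshold become class 1, matching the `>=` rule used in the code later in this lesson):

```python
import numpy as np

# Hypothetical predicted probabilities from a classifier
probs = np.array([0.2, 0.55, 0.5, 0.9, 0.35])

# With a threshold of 0.5, scores at or above 0.5 become class 1
labels = (probs >= 0.5).astype(int)
print(labels)  # [0 1 1 1 0]
```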

Imagine you have a medical test used to detect a particular disease. **True Positive Rate (TPR)** measures how effective the test is at correctly identifying patients who have the disease (true positives). **False Positive Rate (FPR)**, on the other hand, measures how often the test incorrectly indicates the disease in healthy patients (false positives).

$\text{TPR} = \frac{\text{True Positives (TP)}}{\text{True Positives (TP)} + \text{False Negatives (FN)}}$

$\text{FPR} = \frac{\text{False Positives (FP)}}{\text{False Positives (FP)} + \text{True Negatives (TN)}}$
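As a quick worked example with made-up counts, suppose the test is run on 100 sick and 200 healthy patients:

```python
# Hypothetical counts for the medical test example
tp, fn = 80, 20    # sick patients: correctly detected vs. missed
fp, tn = 10, 190   # healthy patients: false alarms vs. correctly cleared

tpr = tp / (tp + fn)  # 80 / 100 = 0.8
fpr = fp / (fp + tn)  # 10 / 200 = 0.05
print(tpr, fpr)       # 0.8 0.05
```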

Note that:

- When the threshold is set to `1`, we classify all values as negatives, resulting in both TPR and FPR being 0.
- When the threshold is set to `0`, we classify all values as positives, resulting in both TPR and FPR being 1.

That means that the ROC curve will always start at point (0, 0) and end at point (1, 1).

Visualizing the ROC curve helps us understand model performance at different thresholds. Let's look at a Python code snippet to see these concepts in action. We'll manually calculate the ROC data and then plot it using `matplotlib`.

```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn.metrics import confusion_matrix

# Sample binary classification dataset
y_true = np.array([0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1])
y_scores = np.array([0.0, 0.4, 0.5, 0.8, 0.4, 0.8, 0.5, 0.8, 0.7, 0.5, 1])

# Get unique thresholds
thresholds = np.sort(np.unique(y_scores))

# Initialize lists to hold TPR and FPR values
tpr = []
fpr = []

# Calculate TPR and FPR for each threshold
for thresh in thresholds:
    y_pred = (y_scores >= thresh).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

    tpr.append(tp / (tp + fn))  # True Positive Rate
    fpr.append(fp / (fp + tn))  # False Positive Rate

# Plotting ROC curve
plt.plot(fpr, tpr, marker='.')
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('ROC Curve')
plt.show()
```

In the example above, `y_true` represents the true labels, and `y_scores` is an array with the predicted probabilities.

- **Import Libraries**: We use `numpy` for numerical operations, `matplotlib` for plotting, and `confusion_matrix` from `sklearn.metrics` to compute confusion matrix values.
- **Define True Labels and Scores**: `y_true` holds the binary class labels (`0` for class 0, `1` for class 1), and `y_scores` contains the predicted probabilities.
- **Get Unique Thresholds**: Extract unique threshold values from `y_scores` using `np.sort` and `np.unique`.
- **Initialize TPR and FPR Lists**: These lists will collect True Positive Rate (`TPR`) and False Positive Rate (`FPR`) values for each threshold.
- **Calculate TPR and FPR for Each Threshold**: Iterate over the thresholds, make predictions based on the current threshold, and compute `tn`, `fp`, `fn`, and `tp` using `confusion_matrix`. Use these values to compute `TPR` and `FPR` at each threshold and append them to their respective lists.
- **Plot the Curve**: Use `matplotlib.pyplot` to plot these values. `plt.plot(fpr, tpr, marker='.')` plots the ROC curve with points marked by dots.
- **Add Labels**: Add labels to the x- and y-axes and a title with `plt.xlabel('False Positive Rate')`, `plt.ylabel('True Positive Rate')`, and `plt.title('ROC Curve')`.

Running this code, you'll see a graph (the ROC curve) showing how `TPR` and `FPR` change with different threshold values.

**AUC (Area Under the Curve)**: This single number summary indicates how well the model distinguishes between the two classes. An AUC of 1 means perfect distinction, while an AUC of 0.5 means the model's predictions are no better than random guessing.
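An equivalent way to read AUC: it is the probability that the model gives a randomly chosen positive example a higher score than a randomly chosen negative one. Below is a minimal sketch of that pairwise-ranking view, reusing `y_true` and `y_scores` from the code above, with ties counted as half a correct ranking:

```python
# Estimate AUC by checking every (positive, negative) pair of scores
pos_scores = y_scores[y_true == 1]
neg_scores = y_scores[y_true == 0]

correct = 0.0
for p in pos_scores:
    for n in neg_scores:
        if p > n:
            correct += 1.0   # positive ranked above negative
        elif p == n:
            correct += 0.5   # tie counts as half
auc_estimate = correct / (len(pos_scores) * len(neg_scores))
print(auc_estimate)  # 0.6166..., matching roc_auc_score(y_true, y_scores) later in this lesson
```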

Why AUC-ROC is Useful:

- **Useful for Imbalanced Classes**: AUC is particularly useful when you have imbalanced classes. While accuracy can be misleading, AUC gives a better measure of model performance by focusing on the balance between `TPR` and `FPR` (see the sketch after this list).
- **Threshold Independence**: AUC-ROC evaluates the model's performance across all classification threshold values, giving a comprehensive overview compared to metrics like precision or recall, which are threshold-dependent.
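To make the first point concrete, here is a small hypothetical sketch: on a dataset with 95% negatives, a model that assigns the same score to everything reaches 95% accuracy at a 0.5 threshold, yet its AUC of 0.5 shows it has no discriminative power at all:

```python
import numpy as np
from sklearn.metrics import accuracy_score, roc_auc_score

# Hypothetical imbalanced dataset: 95 negatives, 5 positives
y_true_imb = np.array([0] * 95 + [1] * 5)

# A useless model that gives every example the same score
y_scores_useless = np.zeros(100)

# At a 0.5 threshold it predicts "negative" for everyone
y_pred_imb = (y_scores_useless >= 0.5).astype(int)
print(accuracy_score(y_true_imb, y_pred_imb))        # 0.95 -- looks impressive
print(roc_auc_score(y_true_imb, y_scores_useless))   # 0.5  -- no better than guessing
```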

Let's define another set of predictions which are more accurate:

```python
y_scores_better = np.array([0.0, 0.2, 0.6, 0.4, 0.9, 0.7, 0.9, 0.8, 0.9, 0.5, 1])
```

Now let's plot the ROC curve for both sets of predictions. This time we will use a simpler way to calculate the `TPR` and `FPR` lists, using the `roc_curve` function from `sklearn`:

```python
from sklearn.metrics import roc_curve

# Calculate ROC curve for first set of scores
fpr1, tpr1, _ = roc_curve(y_true, y_scores)

# Calculate ROC curve for second set of scores
fpr2, tpr2, _ = roc_curve(y_true, y_scores_better)

# Plotting both ROC curves
plt.plot(fpr1, tpr1, marker='.', label='model 1')
plt.plot(fpr2, tpr2, marker='.', label='model 2')
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('ROC Curve')
plt.legend()
plt.show()
```

The orange curve (Model 2) encloses a larger area than the blue one (Model 1), which indicates better performance of the corresponding model.

Let's look at how to calculate the AUC-ROC score using the `roc_auc_score` function from `sklearn.metrics`:

```python
from sklearn.metrics import roc_auc_score

# Calculate AUC-ROC for the first set of scores
auc_roc_1 = roc_auc_score(y_true, y_scores)
print(f"AUC-ROC (Model 1): {auc_roc_1}")  # AUC-ROC (Model 1): 0.6166666666666667

# Calculate AUC-ROC for the second set of scores
auc_roc_2 = roc_auc_score(y_true, y_scores_better)
print(f"AUC-ROC (Model 2): {auc_roc_2}")  # AUC-ROC (Model 2): 0.9666666666666668
```

Running this code, you'll see the output `AUC-ROC (Model 1): 0.6166...` and `AUC-ROC (Model 2): 0.9666...`, indicating that the second model is better at distinguishing between the classes.

In this lesson, we learned about **AUC-ROC**, an essential metric for evaluating binary classification models. We understood its components: the ROC curve and the AUC value. We also saw how to calculate these metrics using Python and `sklearn.metrics`, and how to visualize the ROC curve using `matplotlib`.

Understanding and interpreting **AUC-ROC** helps us evaluate how well our classification model can distinguish between different classes. By visualizing the ROC curve, we can see our model's performance at various threshold values, which is invaluable for model selection and tuning.
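As one illustration of threshold tuning (not covered in the code above), you could reuse the `roc_curve` output to pick the threshold with the largest gap between `TPR` and `FPR`, a criterion known as Youden's J statistic. A minimal sketch under that assumption, continuing with the arrays defined earlier:

```python
# Pick the threshold that maximizes TPR - FPR (Youden's J statistic)
fpr_b, tpr_b, thr_b = roc_curve(y_true, y_scores_better)
best_idx = np.argmax(tpr_b - fpr_b)
print(f"Best threshold: {thr_b[best_idx]}")  # ~0.8 for the scores above
```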

Now it's your turn! In the practice section, you'll get hands-on experience calculating **AUC-ROC**. This practice will solidify your understanding and help you apply what you've learned to real-world scenarios. Enjoy the practice, and remember: learning by doing is key!