Welcome! Today, we are diving into a fascinating metric in machine learning called AUC-ROC.
Our goal for this lesson is to understand what AUC (Area Under the Curve) and ROC (Receiver Operating Characteristic) are, how to calculate and interpret the AUC-ROC metric, and how to visualize the ROC curve using Python. Ready to explore? Let's get started!
ROC (Receiver Operating Characteristic): This graph shows the performance of a classification model at different threshold settings. It plots the True Positive Rate (TPR) against the False Positive Rate (FPR). In this context, a threshold is a value that determines the cutoff point for classifying a positive versus a negative outcome based on the model's predicted probabilities. For example, if the threshold is set to 0.5, any predicted probability above 0.5 is classified as positive, and anything below is classified as negative. By varying this threshold, we generate different True Positive and False Positive Rates, which are then used to plot the ROC curve.
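As a minimal sketch of how a threshold turns predicted probabilities into class labels (the probabilities below are made up purely for illustration):

```python
import numpy as np

# Hypothetical predicted probabilities from some classifier
probabilities = np.array([0.10, 0.35, 0.50, 0.62, 0.91])

threshold = 0.5
# Probabilities at or above the threshold become class 1, the rest class 0
labels = (probabilities >= threshold).astype(int)
print(labels)  # [0 0 1 1 1]
```

Raising the threshold makes the classifier stricter (fewer positives); lowering it makes it more permissive (more positives).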
Imagine you have a medical test used to detect a particular disease. True Positive Rate (TPR) measures how effective the test is at correctly identifying patients who have the disease (true positives). False Positive Rate (FPR), on the other hand, measures how often the test incorrectly indicates the disease in healthy patients (false positives).
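In terms of confusion-matrix counts, TPR = TP / (TP + FN) and FPR = FP / (FP + TN). A quick sketch with made-up counts for the medical-test example:

```python
# Hypothetical confusion-matrix counts for the medical-test example
tp, fn = 80, 20   # sick patients correctly flagged vs. missed
fp, tn = 10, 90   # healthy patients incorrectly flagged vs. correctly cleared

tpr = tp / (tp + fn)  # True Positive Rate (sensitivity)
fpr = fp / (fp + tn)  # False Positive Rate
print(tpr, fpr)  # 0.8 0.1
```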
Note that:
- If the threshold is set to 1 (above every predicted probability), we classify all values as negatives, resulting in both TPR and FPR being 0.
- If the threshold is set to 0 (at or below every predicted probability), we classify all values as positives, resulting in both TPR and FPR being 1.
That means that the ROC curve always starts at point (0, 0) and ends at point (1, 1); a quick check of both extremes follows below.
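Here is a minimal check of those two extremes, using made-up labels and probabilities (all strictly between 0 and 1) and the convention that a probability at or above the threshold counts as positive:

```python
import numpy as np

y_true = np.array([0, 1, 0, 1])          # toy labels (made up)
y_prob = np.array([0.2, 0.4, 0.6, 0.8])  # toy predicted probabilities

def rates(threshold):
    """Return (TPR, FPR) when scores >= threshold are classified as positive."""
    y_pred = (y_prob >= threshold).astype(int)
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    tn = np.sum((y_pred == 0) & (y_true == 0))
    return float(tp / (tp + fn)), float(fp / (fp + tn))

print(rates(1.0))  # (0.0, 0.0): every sample classified negative
print(rates(0.0))  # (1.0, 1.0): every sample classified positive
```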
Visualizing the ROC curve helps us understand model performance at different thresholds. Let's look at a Python code snippet to see these concepts in action. We'll manually calculate the ROC data and then plot it using matplotlib.
```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn.metrics import confusion_matrix

# Sample binary classification dataset
y_true = np.array([0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1])
y_scores = np.array([0.0, 0.4, 0.5, 0.8, 0.4, 0.8, 0.5, 0.8, 0.7, 0.5, 1])

# Get unique thresholds
thresholds = np.sort(np.unique(y_scores))

# Initialize lists to hold TPR and FPR values
tpr = []
fpr = []

# Calculate TPR and FPR for each threshold
for thresh in thresholds:
    y_pred = (y_scores >= thresh).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

    tpr.append(tp / (tp + fn))  # True Positive Rate
    fpr.append(fp / (fp + tn))  # False Positive Rate

# Plotting ROC curve
plt.plot(fpr, tpr, marker='.')
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('ROC Curve')
plt.show()
```
In the example above, y_true represents the true labels, and y_scores is an array with the predicted probabilities. Step by step, the code does the following:
- We import numpy for numerical operations, matplotlib for plotting, and confusion_matrix from sklearn.metrics to compute confusion matrix values.
- y_true holds the binary class labels (0 for class 0, 1 for class 1), and y_scores contains the predicted probabilities.
- We collect the unique thresholds from y_scores using np.sort and np.unique.
- We initialize empty lists to hold the True Positive Rate (TPR) and False Positive Rate (FPR) values for each threshold.
- For each threshold, we form the predictions, extract tn, fp, fn, and tp using confusion_matrix, use these values to compute TPR and FPR at that threshold, and append them to their respective lists.
- Finally, we use matplotlib.pyplot to plot these values: plt.plot(fpr, tpr, marker='.') plots the ROC curve with points marked by dots, and the plot is labeled with plt.xlabel('False Positive Rate'), plt.ylabel('True Positive Rate'), and plt.title('ROC Curve').

Running this code, you'll see a graph (ROC curve) showing how TPR and FPR change with different threshold values.
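One small detail about the manual loop above: because the thresholds come only from the observed scores, the lowest threshold (0.0) classifies everything as positive, but there is no threshold above the largest score, so the plotted curve reaches (1, 1) but never quite reaches (0, 0). If you want the manual curve to include that endpoint, one possible tweak (my own addition, not part of the snippet above) is to append a threshold larger than any score:

```python
import numpy as np

# Same scores as in the snippet above
y_scores = np.array([0.0, 0.4, 0.5, 0.8, 0.4, 0.8, 0.5, 0.8, 0.7, 0.5, 1])

# Append a threshold larger than any score so that one pass of the loop
# classifies every sample as negative, producing the missing (0, 0) point.
thresholds = np.append(np.sort(np.unique(y_scores)), np.inf)
print(thresholds)  # ends with inf: the "classify nothing as positive" case
```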
AUC (Area Under the Curve): This single number summary indicates how well the model distinguishes between the two classes. An AUC of 1 means perfect distinction, while an AUC of 0.5 means the model's predictions are no better than random guessing.
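Geometrically, the AUC is literally the area under this curve. As a quick sanity check (my own addition, continuing from the fpr and tpr lists computed in the manual snippet above), we can measure that area with sklearn.metrics.auc, which applies the trapezoidal rule and accepts the points in the decreasing-FPR order our loop produced:

```python
from sklearn.metrics import auc

# Area under the manually computed curve, via the trapezoidal rule
auc_manual = auc(fpr, tpr)
print(auc_manual)  # roughly 0.617 for the data above
```

This value should agree with the roc_auc_score result we compute later in this lesson.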
Why AUC-ROC is Useful: It evaluates the model across every possible classification threshold rather than at a single cutoff, and it accounts for both TPR and FPR.

Let's define another set of predictions which are more accurate:
```python
y_scores_better = np.array([0.0, 0.2, 0.6, 0.4, 0.9, 0.7, 0.9, 0.8, 0.9, 0.5, 1])
```
And plot the ROC curve for both sets of predictions. This time we will use a simpler way to calculate the TPR and FPR lists – using the roc_curve function from sklearn:
```python
from sklearn.metrics import roc_curve

# Calculate ROC curve for first set of scores
fpr1, tpr1, _ = roc_curve(y_true, y_scores)

# Calculate ROC curve for second set of scores
fpr2, tpr2, _ = roc_curve(y_true, y_scores_better)

# Plotting both ROC curves
plt.plot(fpr1, tpr1, marker='.', label='model 1')
plt.plot(fpr2, tpr2, marker='.', label='model 2')
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('ROC Curve')
plt.legend()
plt.show()
```
The orange curve (Model 2) has a greater area under it than the blue one (Model 1), which indicates better performance of the corresponding model.
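A side note on the API (continuing from the snippet above): roc_curve returns three arrays, and the third one, discarded as _ above, holds the decision thresholds that produced each (FPR, TPR) point, which can be handy when you want to pick an operating threshold:

```python
# Inspect the thresholds sklearn used for the first model
fpr1, tpr1, thresholds1 = roc_curve(y_true, y_scores)
for f, t, th in zip(fpr1, tpr1, thresholds1):
    print(f"threshold={th:.2f}  FPR={f:.2f}  TPR={t:.2f}")
```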
Let's look at how to calculate the AUC-ROC score using the roc_auc_score function from sklearn.metrics:
```python
from sklearn.metrics import roc_auc_score

# Calculate AUC-ROC for the first set of scores
auc_roc_1 = roc_auc_score(y_true, y_scores)
print(f"AUC-ROC (Model 1): {auc_roc_1}")  # AUC-ROC (Model 1): 0.6166666666666667

# Calculate AUC-ROC for the second set of scores
auc_roc_2 = roc_auc_score(y_true, y_scores_better)
print(f"AUC-ROC (Model 2): {auc_roc_2}")  # AUC-ROC (Model 2): 0.9666666666666668
```
Running this code, you'll see an output like AUC-ROC (Model 1): 0.6166 and AUC-ROC (Model 2): 0.96666, indicating that the second model is better at distinguishing between the classes.
In this lesson, we learned about AUC-ROC, an essential metric for evaluating binary classification models. We understood its components: the ROC curve and the AUC value. We also saw how to calculate these metrics using Python and sklearn.metrics, and how to visualize the ROC curve using matplotlib.
Understanding and interpreting AUC-ROC helps us evaluate how well our classification model can distinguish between different classes. By visualizing the ROC curve, we can see our model's performance at various threshold values, which is invaluable for model selection and tuning.
Now it's your turn! In the practice section, you'll get hands-on experience calculating AUC-ROC. This practice will solidify your understanding and help you apply what you've learned to real-world scenarios. Enjoy the practice, and remember: learning by doing is key!