Welcome! In this lesson, we'll dive deep into Lasso Regression, a powerful technique for making predictions and reducing overfitting by adding a penalty to the regression model.
Regression models help us understand relationships between variables and make predictions. We'll talk about Lasso Regression and show how it works using simple Python code. By the end of the lesson, you'll know how to use it in your own projects.
Imagine you're trying to predict the price of a house based on factors like area, number of bedrooms, and location. Including too many unnecessary factors can make predictions less accurate. Regularization helps by penalizing unnecessary factors.
Lasso Regression stands for "Least Absolute Shrinkage and Selection Operator." It adds a penalty for large coefficients, shrinking some of them all the way to zero and thereby selecting only the most important features.
Here's the mathematical function for ordinary linear regression:

$$\hat{y} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n$$

- $\hat{y}$: The predicted value.
- $\beta_0$: The intercept term.
- $\beta_1, \dots, \beta_n$: The coefficients for each feature.
- $x_1, \dots, x_n$: The feature values.

In Lasso Regression, we modify the model's objective by including a penalty term weighted by $\lambda$ (lambda). The model minimizes:

$$\sum_{i=1}^{m} \left(y_i - \hat{y}_i\right)^2 + \lambda \sum_{j=1}^{n} \left|\beta_j\right|$$

The penalty term helps shrink the coefficients of less important features to zero, effectively selecting only the important ones.

Note that while the regularization strength is commonly denoted as $\lambda$, the `Lasso` regression object in sklearn denotes it as `alpha`.
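To make that objective concrete, here is a minimal sketch that computes the Lasso cost by hand with NumPy. The function name `lasso_cost` and the tiny data are purely illustrative, and note that sklearn's own formulation scales the squared-error term slightly differently (it averages it and halves it), so the numbers won't match sklearn exactly:

```python
import numpy as np

def lasso_cost(X, y, beta, intercept, lam):
    """Sum of squared errors (here averaged) plus an L1 penalty on the coefficients."""
    residuals = y - (intercept + X @ beta)
    mse = np.mean(residuals ** 2)          # squared-error term
    l1_penalty = lam * np.sum(np.abs(beta))  # lambda * sum of |coefficients|
    return mse + l1_penalty

# Illustrative toy data: 3 samples, 2 features
X = np.array([[1.0, 2.0], [2.0, 0.0], [3.0, 1.0]])
y = np.array([3.0, 2.0, 4.0])
beta = np.array([1.0, 0.5])

cost = lasso_cost(X, y, beta, intercept=0.0, lam=0.1)
print(cost)
```

Increasing `lam` makes large coefficients more expensive, which is exactly the pressure that drives some of them to zero.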
We'll use `numpy` for numerical operations, `LinearRegression` and `Lasso` from `sklearn.linear_model` for our regression models, and `load_diabetes` to load a sample dataset. We'll also use `train_test_split` to divide the dataset into training and testing sets.
```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split

# Load and split dataset
X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
```
Now, let's create and train both Linear Regression and Lasso Regression models, as we did in the previous lessons. Once again, we'll compare their results using the Mean Squared Error (MSE) metric.
```python
from sklearn.metrics import mean_squared_error

# Train Linear Regression model
linear_model = LinearRegression()
linear_model.fit(X_train, y_train)
lin_pred = linear_model.predict(X_test)

# Train Lasso Regression model
lasso_model = Lasso(alpha=0.1)
lasso_model.fit(X_train, y_train)
lasso_pred = lasso_model.predict(X_test)

# Calculate and compare MSE
lin_mse = mean_squared_error(y_test, lin_pred)
lasso_mse = mean_squared_error(y_test, lasso_pred)

print(f"Linear Regression MSE: {lin_mse}")
print(f"Lasso Regression MSE: {lasso_mse}")
# Linear Regression MSE: 2900.193628493483
# Lasso Regression MSE: 2798.1934851697188
```
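The `alpha=0.1` used above is a reasonable starting point, but in practice you would tune it. As one hedged sketch of how that might look, scikit-learn's `LassoCV` fits Lasso along a grid of `alpha` values and picks the best one by cross-validation (the `cv=5` setting here is just an example choice, not part of the lesson's code):

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LassoCV
from sklearn.model_selection import train_test_split

# Same dataset and split as before
X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# LassoCV tries a grid of alphas and selects the one with the best 5-fold CV score
lasso_cv = LassoCV(cv=5, random_state=42)
lasso_cv.fit(X_train, y_train)

print(f"Best alpha found: {lasso_cv.alpha_}")
```

The fitted `lasso_cv` object can then be used for prediction directly, already refit on the full training set with the selected `alpha`.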
We've trained both models; now let's examine their coefficients and intercepts to understand and compare their outputs.
```python
# Print Linear Regression coefficients and intercept
print(f"Linear Regression Coefficients: {linear_model.coef_}, Intercept: {linear_model.intercept_}")
# Linear Regression Coefficients: [  37.90402135 -241.96436231  542.42875852  347.70384391 -931.48884588
#   518.06227698  163.41998299  275.31790158  736.1988589    48.67065743], Intercept: 151.34560453985995

# Print Lasso Regression coefficients and intercept
print(f"Lasso Regression Coefficients: {lasso_model.coef_}, Intercept: {lasso_model.intercept_}")
# Lasso Regression Coefficients: [   0.         -152.66477923  552.69777529  303.36515791  -81.36500664
#   -0.         -229.25577639    0.          447.91952518   29.64261704], Intercept: 151.57485282893947
```
Examining the coefficients helps us determine which features significantly influence our predictions and which are less important (some coefficients may be zero in Lasso Regression).
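As a quick illustration of that selection effect, you can partition the dataset's feature names by whether the Lasso model kept or zeroed their coefficient. This snippet is a small sketch that refits the same `alpha=0.1` model on the same split; the variable names `selected` and `dropped` are illustrative:

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Lasso
from sklearn.model_selection import train_test_split

# Same dataset and split as before, but keeping the feature names
data = load_diabetes()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=42
)

lasso_model = Lasso(alpha=0.1)
lasso_model.fit(X_train, y_train)

# Group feature names by whether their coefficient survived the L1 penalty
selected = [n for n, c in zip(data.feature_names, lasso_model.coef_) if c != 0]
dropped = [n for n, c in zip(data.feature_names, lasso_model.coef_) if c == 0]
print("Kept features:   ", selected)
print("Dropped features:", dropped)
```

Listing the zeroed features like this makes Lasso's built-in feature selection easy to read off at a glance.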
In today's lesson, we explored Lasso Regression:

- Introduction to Lasso Regression: We learned why it's useful and how it helps with feature selection.
- Setting Up and Loading Data: We discussed the necessary libraries, loaded a dataset, and split it into training and testing sets.
- Training Models: We trained both Linear Regression and Lasso Regression models using Scikit-Learn.
- Model Interpretation: We learned how to examine and compare the coefficients and intercepts of both models.

Now that you understand Lasso Regression and its applications, it's time to practice! In the practice session, you'll get hands-on experience working with Lasso Regression, reinforcing the concepts learned today. Happy coding!