Lesson 2

Welcome to our exciting second class in the **Regression and Gradient Descent** series! In the previous lesson, we covered *Simple Linear Regression*. Now, we're transitioning toward **Multiple Linear Regression**, a powerful tool for examining the relationship between a dependent variable and several independent variables.

Consider a case where we need to predict house prices, which undoubtedly depend on multiple factors, such as location, size, and the number of rooms. Multiple Linear Regression accounts for these simultaneous predictors. In today's lesson, you'll learn how to implement this concept in Python!

**Multiple Linear Regression** builds upon the concept of Simple Linear Regression, accounting for more than one independent variable.

Let's recall the Simple Linear Regression equation:

$y = \beta_0 + \beta_1x$

For Multiple Linear Regression, we add multiple independent variables, $x_1, x_2, \ldots, x_m$:

$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \ldots + \beta_m x_m$

Suppose we have $n$ data points (equations), each with $m$ features ($x$ values). Then $\mathbf{X}$ would look like:

$\mathbf{X} = \begin{bmatrix} 1 & x_{1,1} & x_{1,2} & \ldots & x_{1,m} \\ 1 & x_{2,1} & x_{2,2} & \ldots & x_{2,m} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n,1} & x_{n,2} & \ldots & x_{n,m} \\ \end{bmatrix}$

Each row represents the $m$ features for a single data point. Notice how we include a column of 1's to represent the intercept (also called bias) of each equation.

For each row (equation), there is a corresponding y value. So y looks like:

$\mathbf{y} = \begin{bmatrix} y_{1} \\ y_{2} \\ \vdots \\ y_{n} \end{bmatrix}$

The Normal Equation (covered later in this lesson) results in a vector of coefficients, one for the intercept and one per feature:

$\beta = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_{m} \end{bmatrix}$

Now, for any set of features $x_{1}$ through $x_{m}$, we can predict the $\hat{y}$ value as:

$\hat{y} = (1 \cdot \beta_0) + (\beta_1 \cdot x_{1}) + (\beta_2 \cdot x_{2}) + \ldots + (\beta_m \cdot x_{m})$
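As a quick illustration of this prediction formula, here is a minimal sketch for a single data point with two features. The coefficient and feature values are made up for the example and do not come from any fitted model.

```python
import numpy as np

# Hypothetical coefficients [beta_0, beta_1, beta_2], chosen only for illustration
beta = np.array([2.0, 0.5, -1.0])

# Hypothetical feature values [x_1, x_2] for a single data point
x = np.array([4.0, 3.0])

# Term-by-term form: beta_0 + beta_1 * x_1 + beta_2 * x_2
y_hat_sum = beta[0] + beta[1] * x[0] + beta[2] * x[1]

# Equivalent dot-product form: prepend 1 to the features so beta_0 is included
y_hat_dot = np.dot(np.concatenate(([1.0], x)), beta)

print(y_hat_sum, y_hat_dot)
```

Both forms give the same prediction (here $2 + 0.5 \cdot 4 - 1 \cdot 3 = 1$), which is why prepending a 1 to the features lets a single dot product handle the intercept.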

To calculate all the predictions at once, we take the dot product of $\mathbf{X}$ and $\beta$:

$\hat{\mathbf{y}} = \begin{bmatrix} \hat{y}_{1} \\ \hat{y}_{2} \\ \vdots \\ \hat{y}_{n} \end{bmatrix} = \begin{bmatrix} 1 & x_{1,1} & x_{1,2} & \ldots & x_{1,m} \\ 1 & x_{2,1} & x_{2,2} & \ldots & x_{2,m} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n,1} & x_{n,2} & \ldots & x_{n,m} \\ \end{bmatrix} \begin{bmatrix} \beta_{0} \\ \beta_{1} \\ \vdots \\ \beta_{m} \end{bmatrix} = \mathbf{X} \cdot \beta$

To implement Multiple Linear Regression, we'll leverage some Linear Algebra concepts. Using the Normal Equation, we can calculate the coefficients for our regression equation:

$\beta = (X^T X)^{-1} X^T y$

Where $X$ is a matrix of features and $y$ is a vector of the target variable values. Like Simple Linear Regression, residuals (the differences between actual and predicted values) play a significant role. The smaller these residuals, the better the model fits.
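For readers curious about where the Normal Equation comes from, here is a brief derivation sketch. It writes $S(\beta)$ for the sum of squared residuals (a symbol introduced here only for the derivation) and assumes $X^T X$ is invertible, which the formula itself also requires. Setting the gradient of $S$ to zero gives the coefficients that minimize the residuals:

$\begin{aligned} S(\beta) &= (y - X\beta)^T (y - X\beta) \\ \nabla_\beta S(\beta) &= -2 X^T (y - X\beta) = 0 \\ X^T X \beta &= X^T y \\ \beta &= (X^T X)^{-1} X^T y \end{aligned}$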

Let's roll up our sleeves and start coding! We'll primarily rely on `NumPy` to handle numerical operations and matrices.

First, we set up our dataset:

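```python
import numpy as np

# Feature matrix: 5 data points (rows), 3 features (columns)
X = np.array([[73, 67, 43],
              [91, 88, 64],
              [87, 134, 58],
              [102, 43, 37],
              [69, 96, 70]], dtype='float32')

# Target vector: one value per data point
y = np.array([56, 81, 119, 22, 103], dtype='float32')
```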

Next, we calculate our vector of coefficients, $\beta$, using the Normal Equation:

- Enhance our feature matrix, $X$, with an extra column of ones to account for the intercept.

```python
# Prepend a column of ones so the first coefficient acts as the intercept
ones = np.ones(shape=(len(X), 1))
X = np.append(ones, X, axis=1)
```

- Compute the coefficients $\beta$ using the Normal Equation.

```python
beta = np.linalg.inv(X.T.dot(X)).dot(X.T).dot(y)
```

We could also use the `@` operator instead of `.dot`. You may choose the one you find more comfortable:

```python
beta = np.linalg.inv(X.T @ X) @ X.T @ y
```
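As a side note, explicitly inverting a matrix is often avoided in practice; a common alternative is to solve the linear system $X^T X \beta = X^T y$ directly with `np.linalg.solve`. The sketch below is not part of the lesson's method: it repeats the dataset so that it runs on its own, and the names `X_design`, `beta_inv`, and `beta_solve` are chosen here purely for illustration.

```python
import numpy as np

# Same dataset as in the lesson, repeated so this sketch is self-contained
X = np.array([[73, 67, 43],
              [91, 88, 64],
              [87, 134, 58],
              [102, 43, 37],
              [69, 96, 70]], dtype='float32')
y = np.array([56, 81, 119, 22, 103], dtype='float32')

# Design matrix with a leading column of ones for the intercept
X_design = np.append(np.ones((len(X), 1)), X, axis=1)

# Normal Equation with an explicit inverse, as in the lesson
beta_inv = np.linalg.inv(X_design.T @ X_design) @ X_design.T @ y

# Alternative: solve (X^T X) beta = X^T y without forming the inverse
beta_solve = np.linalg.solve(X_design.T @ X_design, X_design.T @ y)

# Check whether the two approaches agree up to floating-point tolerance
print(np.allclose(beta_inv, beta_solve))
```

On a small, well-behaved dataset like this one, the two approaches should agree closely; the difference tends to matter more when $X^T X$ is close to singular.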

After completing our model, we need to evaluate its performance. We employ the coefficient of determination ($R^2$ score) for that. It indicates how well our model fits the data. Let's recall it:

$R^2 = 1 - \frac{SS_{residuals}}{SS_{total}}$

Here, $SS_{residuals}$ denotes the residual sum of squares, and $SS_{total}$ is the total sum of squares:

$SS_{residuals} = \sum_{i=1}^{n} (y_i - \hat{y_i})^2$,

where $y_i$ represents the observed values and $\hat{y_i}$ represents the values predicted by the regression model.

$SS_{total} = \sum_{i=1}^{n} (y_i - \bar{y})^2$,

where $y_i$ represents the observed values and $\bar{y}$ stands for the mean of the observed values.

A higher $R^2$ value (closer to 1) indicates a better model fit.
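To build some intuition for the formula, here is a small sketch on toy data. The helper `r2` and the sample values are hypothetical, introduced only for this illustration. It checks two reference cases: predictions that match the observations exactly, and predictions that always equal the mean of the observations.

```python
import numpy as np

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_residuals / SS_total."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_residuals = np.sum(np.square(y_true - y_pred))
    ss_total = np.sum(np.square(y_true - np.mean(y_true)))
    return 1 - ss_residuals / ss_total

# Toy observations, made up for the example (their mean is 2.5)
y_true = [1.0, 2.0, 3.0, 4.0]

perfect = r2(y_true, y_true)                  # predictions match exactly
baseline = r2(y_true, [2.5, 2.5, 2.5, 2.5])   # always predict the mean

print(perfect, baseline)
```

A perfect fit yields $R^2 = 1$, while always predicting the mean yields $R^2 = 0$, so the score can be read as how much better the model does than that mean-only baseline. The same computation is applied to our model's predictions next.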

```python
predictions = X.dot(beta)
ss_residuals = np.sum(np.square(y - predictions))
ss_total = np.sum(np.square(y - np.mean(y)))
r2_score = 1 - (ss_residuals / ss_total)

print("R^2 Score:", r2_score)  # Output: R^2 Score: 0.9992
```

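The $R^2$ score is very close to one, meaning the model fits this data almost perfectly! Keep in mind that with only five data points and four coefficients a close fit is expected, so this score says little about how the model would perform on new data.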

Congratulations on mastering **Multiple Linear Regression**! You've effectively bridged the gap from concept to implementation, designing a regression model in Python from scratch.

Prepare for the upcoming lesson to delve more deeply into Regression Analysis. Meanwhile, make sure to practice and refine your newly acquired skills!