Lesson 5

Welcome to today's lesson on **Evaluating a Model with TensorFlow**. In this lesson, we'll explore how to evaluate the performance of a model that we previously trained using **TensorFlow**. Specifically, we will use the `evaluate()` function provided by TensorFlow to assess how well our model performs on unseen data. Model evaluation is an essential step in the machine learning pipeline because it helps us gauge how effective our model is and how well it generalizes to new data. We will also discuss the importance of splitting our data into training and testing sets for robust model evaluation. After this lesson, you should have a good understanding of how to perform model evaluation and interpret the results to fine-tune your model.

Before we dive into model evaluation, imagine we have a dataset containing the study habits of a group of students. More specifically, we have data on the number of hours each student studied and the amount of sleep they got.

```python
import numpy as np

# Example data: hours studied, hours slept
X = np.array([
    [4, 6], [5, 7], [2, 8], [1, 3], [3, 4], [0, 5],
    [1, 1], [2, 4], [3, 5], [5, 5], [0, 4], [4, 4],
])
```

The dataset has 12 observations, and each observation has two features: hours studied and hours slept. We are using this data to predict whether a student passes (denoted as `1`) or fails (denoted as `0`) their exam. For the sake of simplicity, we've already labeled our data.

```python
# Labels: 1 if passed, 0 if failed
y = np.array([[1], [1], [1], [0], [0], [0], [0], [0], [1], [1], [0], [1]])
```

We would like to build a model that takes in these two features and outputs a prediction of whether a student is likely to pass or fail.

In machine learning, it's crucial that we have two sets of data: a training set and a testing set. Our model learns from the training set, and we evaluate its performance on the testing set. We can use the `train_test_split` function from scikit-learn's `model_selection` module to divide our data.

The `test_size` parameter specifies the proportion of the dataset to include in the test split. The `random_state` parameter seeds the random shuffling applied before the split, so the same value produces the same partition on every run.

```python
from sklearn.model_selection import train_test_split

# Split the dataset into 80% training and 20% testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
```
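To make the idea of a seeded, shuffled split concrete, here is a minimal dependency-free sketch. It is not scikit-learn's implementation: the helper name `holdout_split` is hypothetical, and the sketch assumes the test-set size is rounded up, which is why a 20% request on 12 samples yields 3 test rows and 9 training rows. The particular rows selected will differ from those chosen by `train_test_split`.

```python
import math
import random

def holdout_split(X, y, test_size=0.2, seed=42):
    """Shuffle row indices with a fixed seed, then hold out a fraction for testing."""
    n_samples = len(X)
    n_test = math.ceil(test_size * n_samples)  # assumption: round the test size up
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded, so the split is reproducible
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    X_train = [X[i] for i in train_idx]
    X_test = [X[i] for i in test_idx]
    y_train = [y[i] for i in train_idx]
    y_test = [y[i] for i in test_idx]
    return X_train, X_test, y_train, y_test

# Same study-habit data as above, written as plain lists
X = [[4, 6], [5, 7], [2, 8], [1, 3], [3, 4], [0, 5],
     [1, 1], [2, 4], [3, 5], [5, 5], [0, 4], [4, 4]]
y = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 1]

X_train, X_test, y_train, y_test = holdout_split(X, y)
print(len(X_train), len(X_test))  # 9 3
```

Calling `holdout_split` again with the same `seed` returns the same partition, which is the role `random_state` plays in the lesson's code.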

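We chose a split of 80% training and 20% testing, which is a common choice in machine learning projects. With only 12 observations, scikit-learn rounds the test set up to 3 samples, leaving 9 for training. Now that our data is ready, let's move back to our model.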

Our TensorFlow model is a simple neural network consisting of a Sequential model with two Dense layers. The first Dense layer has 5 neurons and uses the 'relu' activation function, while the second (output) layer has one neuron and uses the 'sigmoid' activation function.

The 'relu' activation function is one of the most commonly used activation functions due to its simplicity and efficiency, while the 'sigmoid' function is typically used in the output layer for binary classification problems because it maps the output to a value between 0 and 1. After building our model, we need to compile and train it so that it is ready for evaluation.

```python
import tensorflow as tf

# Initializing the model
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(2,)),
    tf.keras.layers.Dense(5, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

# Compiling the model with 'accuracy' as chosen metric
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Training the model with the training data
model.fit(X_train, y_train, epochs=10, verbose=0)
```
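For reference, the two activation functions used in the model can be written out by hand. This is a plain-Python sketch of the standard mathematical definitions, not TensorFlow's implementation; the function names `relu` and `sigmoid` here are local helpers.

```python
import math

def relu(z):
    """ReLU: pass positive values through unchanged, clamp negatives to zero."""
    return max(0.0, z)

def sigmoid(z):
    """Sigmoid: squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

print(relu(-2.0), relu(3.0))  # 0.0 3.0
print(sigmoid(0.0))           # 0.5
```

Because the sigmoid output always lies between 0 and 1, the output neuron's value can be read as the predicted probability that a student passes.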

In the compilation step, we chose `accuracy` as the metric to evaluate our model's performance. This choice directly influences our evaluation because the `model.evaluate()` function returns the loss followed by the metrics chosen during compilation. Other metrics could have been used to provide more detailed insights into model performance, such as `precision`, which measures how many of the positive predictions made by the model are actually correct.
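The difference between accuracy and precision is easiest to see on a small example. The sketch below computes both by hand; the label lists are made up for illustration and are not taken from the lesson's dataset, and the helper functions are hypothetical stand-ins rather than the Keras metric classes.

```python
def accuracy(y_true, y_pred):
    """Fraction of all predictions that match the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

def precision(y_true, y_pred):
    """Fraction of predicted positives that are truly positive."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    pred_pos = sum(1 for p in y_pred if p == 1)
    return true_pos / pred_pos if pred_pos else 0.0

# Made-up labels and predictions, for illustration only
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0]

print(accuracy(y_true, y_pred))   # 4 of 6 predictions are correct
print(precision(y_true, y_pred))  # 2 of 3 predicted passes are real passes
```

Accuracy counts every prediction, while precision looks only at the cases where the model predicted a pass, so the two can diverge when the model's errors are concentrated in one class.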

In the training step, we set `verbose=0` to suppress progress output during training. This can be useful when running multiple experiments or when you want to keep the output clean and uncluttered. In this lesson, we are focusing on the output of the evaluation step instead.

To evaluate our trained model, we use the `evaluate()` function. This function returns the loss value and metric values for our model in test mode. The metric, in this case, is the `accuracy` metric we specified during model compilation.

```python
# Evaluate the model on the test data
test_loss, test_accuracy = model.evaluate(X_test, y_test)
print(f'\nTest accuracy: {test_accuracy}, Test loss: {test_loss}')
```

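The output of the above code will look similar to the following; exact values vary between runs because the network's weights are initialized randomly: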

```text
Test accuracy: 0.6666666865348816, Test loss: 0.7558897137641907
```

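This output provides the test accuracy and loss, indicating how well the model is performing on unseen data. An accuracy of approximately 67% is reported in this example, along with the corresponding test loss. With a test set of only three samples, that accuracy corresponds to two correct predictions out of three, so it is a coarse estimate.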

While this indicates that the model has learned to some extent and is able to make correct predictions about two-thirds of the time, further improvement is needed. This could involve tuning hyperparameters, using a more complex model architecture, collecting more training data, or applying techniques like regularization to avoid overfitting. Interpreting these metrics is crucial, as they guide us in understanding the current limitations of our model and the steps we need to take to enhance its performance.
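To see where the two reported numbers come from, the sketch below computes binary cross-entropy and accuracy by hand from predicted probabilities. The probabilities are hypothetical, not the model's actual outputs, and the clipping constant `eps` and the 0.5 decision threshold are assumptions chosen to mirror common defaults.

```python
import math

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all samples."""
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip so log(0) never occurs
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(y_true)

def accuracy_at_threshold(y_true, y_prob, threshold=0.5):
    """Turn probabilities into 0/1 predictions, then score them."""
    y_pred = [1 if p > threshold else 0 for p in y_prob]
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

# Hypothetical labels and predicted probabilities for three test samples
y_true = [1, 0, 1]
y_prob = [0.8, 0.3, 0.4]

print(binary_crossentropy(y_true, y_prob))
print(accuracy_at_threshold(y_true, y_prob))  # 2 of 3 predictions are correct
```

The loss penalizes confident wrong predictions more heavily than hesitant ones, so it can change even when accuracy stays the same. That makes it a useful second signal when the test set is small.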

In today's lesson, we've looked at a crucial machine learning practice: evaluating a model. We've seen how to split the data into training and testing sets, and why this is important for assessing a model's performance. We've also learned how to use TensorFlow's `evaluate` function to calculate the loss and accuracy of the model on the testing set.

So what's next? Practice! As we always highlight, practice is key to mastery. In the following exercises, you'll have the opportunity to work with TensorFlow models by training and evaluating them. This will reinforce the knowledge you've gained today and help you become more comfortable with the process of training and evaluating models. Happy learning!