Greetings, Machine Learning enthusiast! Let's take a deep dive into the fascinating realm of neural networks. Specifically, we'll focus on constructing a neural network using the Pythonic TensorFlow library.
Neural networks attract global interest due to their ability to learn patterns from vast volumes of complex data, mimicking the way the human brain learns. They have broken the barriers of conventional computing and ushered in a new era where machines can handle complex tasks that were once exclusive to human intellect.
In today's tutorial, we'll explore how to build these intelligent systems by leveraging TensorFlow's robust functionalities. Our goal is to give you a behind-the-scenes look at the inner workings of these systems.
TensorFlow, an open-source library developed by the Google Brain Team, serves as a powerful tool for numerical computations, making it a popular choice for large-scale machine learning.
Let's familiarize ourselves with the three pillars of TensorFlow's architecture:

- Tensors: These are essentially multi-dimensional arrays with a uniform data type, and they act as the heart of TensorFlow.
- Computation Graphs: Classic TensorFlow (1.x) operates using 'lazy execution': it first designs a computational graph representing various tensor operations, which is then executed later. In TensorFlow 2.x, operations run eagerly by default, and graphs are built explicitly (for example, with `tf.function`) when you want them.
- Sessions: In TensorFlow 1.x, computations don't manifest instantaneously; they run within a scope called a session. TensorFlow 2.x drops sessions in favor of eager execution.
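To make the tensor and graph concepts concrete, here is a minimal sketch (assuming TensorFlow 2.x, where eager execution is the default):

```python
import tensorflow as tf

# A tensor is a multi-dimensional array with a uniform dtype
t = tf.constant([[1.0, 2.0], [3.0, 4.0]])
print(t.shape, t.dtype)  # (2, 2) <dtype: 'float32'>

# In TensorFlow 2.x, operations execute eagerly (immediately)
print(tf.reduce_sum(t))  # tf.Tensor(10.0, shape=(), dtype=float32)

# A computation graph can still be built explicitly with tf.function,
# which traces the Python function into a graph
@tf.function
def double(x):
    return x * 2.0

print(double(t))  # [[2. 4.] [6. 8.]]
```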
TensorFlow provides numerous utilities and methods to design, train, and execute neural networks, making it a powerful asset in any neural network project. It manages the low-level nuances, allowing you to concentrate more on improving your model.
Before we dive into creating a neural network, let's understand its structure. Neural networks consist of a collection of interconnected artificial neurons, or "nodes." They typically comprise three types of layers:

- Input Layer: receives the raw data; in diagrams its nodes are often labeled I1, I2, I3, and so on.
- Hidden Layers: intermediate layers that transform the inputs on their way to the output.
- Output Layer: produces the final prediction; its nodes are often labeled O1, O2, and so on.

Each layer plays a significant role, similar to neurons in our brain. When neurons in our brains receive inputs, they process them and generate outputs. A similar mechanism occurs in a neural network. The number of input nodes in the input layer depends on the data we are working with, and the number of output nodes in the output layer depends on the type of prediction we are making.
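To make that concrete, here's a minimal sketch (using the same scikit-learn Digits dataset we'll load in the code below) showing how the input and output sizes fall straight out of the data:

```python
from sklearn import datasets

digits = datasets.load_digits()

# Each image is 8x8 pixels, so flattening yields 64 features,
# which means 64 input nodes
n_inputs = digits.images[0].size

# There are ten digit classes (0-9), so we need 10 output nodes
n_outputs = len(set(digits.target))

print(n_inputs, n_outputs)  # 64 10
```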
As for the number and size of the hidden layers, that's where the art of machine learning comes in. There is no single best way to construct a neural network, though there are best practices. It comes down to developing a deep understanding of the dataset and experimenting continuously.
Next, we have "activation functions," one of the most critical aspects of neural networks. An activation function decides whether (and how strongly) a neuron contributes to the next layer based on its input.
In the next section, you'll see an example where `relu` and `softmax` are used as activation functions. `relu` allows positive values to pass through while replacing negative values with zeros. `softmax`, on the other hand, transforms a list of numbers into a probability distribution.
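Conceptually, here's what the two functions compute. This is a minimal NumPy sketch of the standard formulas, not the TensorFlow implementations themselves:

```python
import numpy as np

def relu(x):
    # Positive values pass through unchanged; negatives become zero
    return np.maximum(0.0, x)

def softmax(x):
    # Subtracting the max first is a standard numerical-stability trick
    e = np.exp(x - np.max(x))
    return e / e.sum()

scores = np.array([-1.0, 0.5, 2.0])
print(relu(scores))     # [0.  0.5 2. ]
print(softmax(scores))  # three probabilities that sum to 1.0
```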
Next time, we'll go a little deeper into the mathematical representation of neural networks, but for now, let's see how to create them in TensorFlow.
Let's start by defining a neural network using TensorFlow. We'll utilize TensorFlow's Keras Sequential API to create a Sequential model. This model allows for the easy construction of a neural network in which each layer has exactly one input tensor and one output tensor.
Consider the following:
```python
# Import the necessary libraries
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from tensorflow.keras.utils import to_categorical
import tensorflow as tf

# Load the Digits dataset
digits = datasets.load_digits()

# Split the data into features and target labels
X = digits.images
y = digits.target

# Flatten the images
n_samples = len(X)
X = X.reshape((n_samples, -1))

# Normalize the data
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Convert labels to one-hot encoding
y_categorical = to_categorical(y)

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y_categorical, test_size=0.3, random_state=42)

# Define the model
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(64, activation='relu', input_shape=(X_train.shape[1],)),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')  # 10 classes for digits 0-9
])
```
Let's break down the key pieces:

- `tf.keras.models.Sequential()`: initializes our network as a Sequential model.
- `tf.keras.layers.Dense()`: creates a densely connected layer, meaning every neuron in the layer connects to every output of the previous layer.
- `64`: the number of neurons in the layer.
- `activation='relu'`: `relu` stands for "Rectified Linear Unit." It's a simple function that allows positive values to pass through while turning negative values into zero.
- `input_shape=(X_train.shape[1],)`: specifies the shape of the input the model will receive. It must be specified for the first layer in a Sequential model.
- `tf.keras.layers.Dense(10, activation='softmax')`: our final layer has 10 neurons, one for each of the digits 0-9. The `softmax` activation turns the layer's outputs into a probability distribution over the digits.

Given how large this network architecture is, it's hard to visualize, but worth a try. In the image below you can see the three layers that are 64 neurons wide (including the input layer) and the output layer with 10 neurons. Because these are dense layers, there are 64 * 64 = 4096 edges between each pair of 64-wide layers, so each individual connection is hard to see.
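If you'd like to generate a similar diagram yourself, Keras includes a plotting utility. Here is a small sketch (note: it requires the pydot package plus a system Graphviz installation, and this isn't necessarily how the image above was produced):

```python
# Requires: pip install pydot, plus a system-level Graphviz install
tf.keras.utils.plot_model(
    model,
    to_file='model.png',    # writes the diagram to disk
    show_shapes=True,       # annotate layers with input/output shapes
    show_layer_names=True,  # label each layer (dense, dense_1, ...)
)
```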
Let's further examine our model structure using the `summary()` method. It provides a quick overview of the model's architecture: the layers, their output shapes, and the number of parameters.
```python
model.summary()
```

Output:

```
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #
=================================================================
 dense (Dense)               (None, 64)                4160

 dense_1 (Dense)             (None, 64)                4160

 dense_2 (Dense)             (None, 10)                650

=================================================================
Total params: 8970 (35.04 KB)
Trainable params: 8970 (35.04 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
```
The `model.summary()` method provides detailed model information, notably the `Param #` column in the output, which shows the number of trainable parameters in each layer. You might notice that we said there are 4096 edges between the 64-neuron layers, yet the summary reports 4160 trainable parameters between the dense layers. Where is the difference coming from? It comes from the bias term associated with each neuron. More about weights and biases in the next lesson!
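As a quick sanity check, we can reproduce those counts with plain arithmetic: a Dense layer has one weight per input-neuron pair plus one bias per neuron:

```python
# Dense layer parameters = (inputs * neurons) weights + neurons biases
input_dim = 64  # flattened 8x8 digit images

hidden1 = input_dim * 64 + 64  # 4096 weights + 64 biases = 4160
hidden2 = 64 * 64 + 64         # 4160
output  = 64 * 10 + 10         # 650

print(hidden1 + hidden2 + output)  # 8970, matching model.summary()
```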
Congratulations on making it this far in this insightful lesson on neural networks with TensorFlow! Today, we reviewed the fundamentals of TensorFlow, delved into the structure of neural networks, and built a deep learning model using TensorFlow.
By now, you have learned about:

- The fundamentals of TensorFlow: tensors, computation graphs, and sessions
- The structure of neural networks and how to build one with TensorFlow's Keras Sequential API
Now it's time to solidify your newfound understanding with some hands-on practice exercises. These exercises encourage active learning and will establish a strong foundation for advanced neural network concepts. Good luck, and enjoy your journey with neural networks!