Welcome! This lesson explores the world of autoencoders, neural networks designed to learn efficient encodings of input data. During this interactive session, we'll familiarize you with the autoencoder architecture, focusing on its encoder and decoder components, and show how to implement them in Python with the Keras API. We'll also show how to train and apply an autoencoder for digit image reconstruction using scikit-learn's digits dataset.
Autoencoders are a type of neural network that learns to compress the input data into a lower-dimensional space and then reconstruct the original input from this compressed version. They are widely used for tasks such as dimensionality reduction, denoising, and anomaly detection.
Imagine we have a house, and we want to build an exact copy of the house across town. We could go back and forth between the old and new house to get the details of the house, but this process is time-consuming and inefficient. Instead, we could use an autoencoder to create a blueprint of the house that captures all the essential details. We then use this blueprint to build the new house, saving time and effort.
The two major components of an autoencoder - the encoder and the decoder - help compress the input data into a latent space and reconstruct the original input from the compressed version. Autoencoders are often employed for tasks such as dimensionality reduction and denoising.
Before implementing the components, preprocessing the data is our first step. We'll use sklearn's load_digits() function to load the digits dataset and normalize it.
```python
from sklearn.datasets import load_digits
from sklearn.preprocessing import MinMaxScaler

data = load_digits().data
scaler = MinMaxScaler()
data = scaler.fit_transform(data)
```
The load_digits() function loads the digits dataset, which contains 8x8 pixel images of handwritten digits, flattened into 64 features per image. We normalize the data using the MinMaxScaler so that all features fall within the same range, from 0 to 1. This step is crucial for training the autoencoder effectively.
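As an optional sanity check (not shown in the original lesson), you can confirm that the scaled values fall in the expected range and that the data has the shape we expect:

```python
# Quick sanity check: after MinMaxScaler, every feature should lie in [0, 1]
print('Min pixel value:', data.min())  # expected: 0.0
print('Max pixel value:', data.max())  # expected: 1.0
print('Data shape:', data.shape)       # (1797, 64): 1797 images, 64 features each
```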
The encoder will compress the digit images into a lower-dimensional space, while the decoder will attempt to reconstruct the original images from this compressed version. The autoencoder will learn to minimize the reconstruction error, improving its performance over time.
Once the data is ready, we move on to implementing the autoencoder components in Keras. The encoder transforms the input data into a latent-space representation. The decoder then reverses this process, attempting to recreate the original input. Note that in previous lessons we used the Sequential model to build neural networks. For autoencoders, however, we'll use the Keras functional API, which allows us to define more complex architectures. With this approach, we create an input layer and then stack additional layers on top of it by calling each layer on the output of the previous one.
```python
from tensorflow.keras.layers import Input, Dense

# Create an input layer with 64 units to match the number of features in the digit images
input_img_layer = Input(shape=(64,))

# Stack a layer with 32 neurons and ReLU activation on top of the input layer (the encoder)
encoded_layer = Dense(32, activation='relu')(input_img_layer)

# Stack a layer with 64 neurons and sigmoid activation on top of the encoded layer
# to reconstruct the original input (the decoder)
decoded_layer = Dense(64, activation='sigmoid')(encoded_layer)
```
The Input shape for the encoder layer is 64, which matches the number of features in our digit images. The encoder is followed by the decoder layer, which reconstructs the original 64-feature input from the condensed output of the encoder. Once the encoder and decoder layers are defined, we can create the autoencoder model by tying the input and output layers together. We'll see this in the next section.
Now that we've defined the encoder and decoder components, we can compile and train the autoencoder model. The autoencoder will learn to compress the input data into a lower-dimensional space and then reconstruct the original input from this compressed version. To do that, we first create the autoencoder model by tying the input and output layers together with the Model class. Next, we compile the autoencoder, specifying the optimizer and the loss function: 'adam' is an adaptive optimizer that adjusts the learning rate during training, while 'binary_crossentropy' serves as our reconstruction loss, a suitable choice since the normalized pixel values lie between 0 and 1.
```python
from tensorflow.keras.models import Model

# Build the autoencoder by tying the input layer to the decoder's output
autoencoder = Model(input_img_layer, decoded_layer)
autoencoder.compile(optimizer='adam', loss='binary_crossentropy')

# Train the autoencoder to reconstruct its own input
autoencoder.fit(data, data, epochs=100, batch_size=256, shuffle=True)
```
With the autoencoder compiled, we train it using the fit method. Notice that the same data is passed as both the input and the target, because the autoencoder learns to reconstruct its own input. Training adjusts the autoencoder's weights to reduce the reconstruction error, improving the model's performance over the training epochs.
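One optional way to confirm that training is actually reducing the reconstruction error is to capture the History object that fit returns and plot the loss per epoch. This is a sketch, not part of the original lesson; it assumes you replace the plain fit call from the snippet above with the assignment shown here (calling fit again on the same model would simply continue training it):

```python
import matplotlib.pyplot as plt

# Capture the training history (replaces the plain fit call from the earlier snippet)
history = autoencoder.fit(data, data, epochs=100, batch_size=256, shuffle=True)

# Plot the binary cross-entropy loss recorded after each epoch
plt.plot(history.history['loss'])
plt.xlabel('Epoch')
plt.ylabel('Reconstruction loss')
plt.title('Autoencoder training loss')
plt.show()
```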
Once the autoencoder is trained, we'll apply it to the data and visualize its effectiveness in reconstructing the input.
```python
reconstructed_data = autoencoder.predict(data)
```
We cross-verify the autoencoder's performance by visualizing a set of actual digit images and their reconstructed versions generated by the autoencoder:
```python
import matplotlib.pyplot as plt

n = 10
plt.figure(figsize=(20, 4))
plt.suptitle('Original vs Reconstructed')
for i in range(n):
    # Display original
    ax = plt.subplot(2, n, i + 1)
    plt.imshow(data[i].reshape(8, 8))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)

    # Display reconstruction as generated by our autoencoder
    ax = plt.subplot(2, n, i + 1 + n)
    plt.imshow(reconstructed_data[i].reshape(8, 8))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
plt.show()
```
If our autoencoder is effectively trained, the reconstructed images should show digits whose shapes closely match those in the original images.
In the above code, we display the original digit images on the top row and their reconstructed versions on the bottom row. The reconstructed images should closely resemble the original images, indicating that the autoencoder has learned to compress and reconstruct the input data effectively.
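Beyond visual inspection, you can also quantify how close the reconstructions are to the originals. Here is a minimal sketch (not part of the original lesson) using the mean squared error over all pixels:

```python
import numpy as np

# Average squared difference between original and reconstructed pixel values;
# lower values indicate better reconstructions
mse = np.mean((data - reconstructed_data) ** 2)
print('Mean squared reconstruction error:', mse)
```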
One of the major applications of an autoencoder is dimensionality reduction. By extracting the encoder, you can transform the input features into a lower-dimensional space, making the data easier to visualize and process.
Since we've already trained our autoencoder model, the encoder will have learned to compress the input images effectively. We can use this learned knowledge to transform our 64-dimensional images into a 32-dimensional encoded space.
To get the encoder, we create a new model using the autoencoder's input and the encoded output as follows:
```python
# Encoder Model
encoder = Model(input_img_layer, encoded_layer)

# Apply the encoder to the data
encoded_data = encoder.predict(data)

print('Shape of the original data:', data.shape)    # Output: (1797, 64)
print('Shape of encoded_data:', encoded_data.shape)  # Output: (1797, 32)
```
In the above code, we first define a new Model that shares the same input layer as the original autoencoder but outputs only the encoder's 32-dimensional encoding. We then transform our data by calling predict on the encoder model. Now, encoded_data is a new representation of our original input data in the latent space defined by our encoder. This is similar to how Principal Component Analysis (PCA) compresses data into a lower-dimensional space. Unlike PCA, however, the encoder is not restricted to linear transformations of the features: it can capture complex, non-linear relationships, making it a powerful tool for dimensionality reduction and a stepping stone to more complex tasks in deep learning.
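To make the comparison concrete, here is an optional, illustrative sketch that reduces the same data to 32 dimensions with scikit-learn's PCA; the names pca and pca_data are introduced only for this example:

```python
from sklearn.decomposition import PCA

# PCA also maps the 64 features down to 32 dimensions, but only through a
# linear projection, whereas the encoder can learn non-linear mappings
pca = PCA(n_components=32)
pca_data = pca.fit_transform(data)

print('Shape of PCA-reduced data:', pca_data.shape)  # (1797, 32)
print('Variance explained by 32 components: {:.2%}'.format(pca.explained_variance_ratio_.sum()))
```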
This compressed representation can be beneficial for various tasks, such as feature reduction, data visualization, or training other machine learning models, where having fewer features can reduce computation, as illustrated in the sketch below.
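As a hypothetical illustration of that last point (not part of the original lesson), you could train a simple classifier on the 32-dimensional encoded features; the labels variable and the choice of LogisticRegression below are assumptions made just for this sketch:

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Hypothetical example: use the digit labels to train a classifier
# on the 32-dimensional encoded features instead of the raw 64 pixels
labels = load_digits().target

clf = LogisticRegression(max_iter=1000)
scores = cross_val_score(clf, encoded_data, labels, cv=5)
print('Cross-validated accuracy on encoded features: {:.3f}'.format(scores.mean()))
```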
We've successfully explored the architecture behind autoencoders, implemented one using Python and the Keras API, compiled and trained our model, and visualized its effectiveness in digit image reconstruction. Brace yourself for the next level of autoencoder adventures: up next are exercises to solidify these concepts. Keep practicing, and happy coding!