Welcome to our exploration of "Understanding Activation Functions". Activation functions are crucial components in neural networks that shape each neuron's output. In this lesson, we'll walk through the theory and Python implementations of five specific activation functions: the step function, the sigmoid function, the Rectified Linear Unit (ReLU), the hyperbolic tangent (tanh), and the softplus function. Let's begin our journey through the realm of neural networks.
Let's unravel the role of activation functions in neural networks. They play a vital part in determining a neuron's output. Picturing them as computational gates can be helpful: a gate passes a signal through when the input crosses a threshold and stays quiet otherwise. Along the way, we'll explore the five activation functions listed above, starting with the sketch below.
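To make this concrete, here is a minimal sketch of how an activation function fits into a single neuron's computation. The inputs, weights, and bias are made up purely for illustration, and we use the step function, which we formally introduce next:

```python
import numpy as np

def step_function(x):
    # Fires (returns 1) only when the input reaches the threshold of 0
    return 1 if x >= 0 else 0

# Illustrative values only: two inputs, their weights, and a bias
inputs = np.array([0.5, -1.2])
weights = np.array([0.8, 0.4])
bias = 0.1

# The neuron computes a weighted sum, then the activation decides its output
weighted_sum = np.dot(inputs, weights) + bias  # 0.02
output = step_function(weighted_sum)           # 1, so the neuron "fires"
print(weighted_sum, output)
```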
At the start of our expedition, let's explore the step function, also known as the threshold function. This basic activation function works like a switch: if the input value is greater than or equal to a threshold (here, zero), the function returns 1; otherwise, it returns 0.
Thanks to this simple rule, implementing it in Python is straightforward:
```python
def step_function(x):
    return 1 if x >= 0 else 0
```
To see this in practice, let's generate a quick visualization:
```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(-10, 10, 100)
y = [step_function(i) for i in x]
plt.plot(x, y)
plt.show()
```
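Since our step_function handles one value at a time, we built y with a list comprehension. If you'd rather apply it to the whole NumPy array at once, a vectorized sketch like the one below (the name step_function_vectorized is just ours, and it reuses the x defined above) does the same job with np.where:

```python
def step_function_vectorized(x):
    # Returns 1 wherever x >= 0 and 0 elsewhere, element-wise
    return np.where(x >= 0, 1, 0)

y = step_function_vectorized(x)  # no Python loop needed
plt.plot(x, y)
plt.show()
```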
The sigmoid function maps any input value to a result between 0 and 1, producing an S-shaped curve. It shines when predicting probabilities in binary classification problems.
Here's its succinct implementation in Python:
```python
import numpy as np

def sigmoid_function(x):
    return 1 / (1 + np.exp(-x))
```
To study this function's behavior, let's sketch a plot:
```python
y = [sigmoid_function(i) for i in x]
plt.plot(x, y)
plt.show()
```
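Because the sigmoid squashes every input into the range (0, 1), its output can be read as a probability. As a small illustration (the raw score of 2.5 is just a made-up model output, not from any real classifier):

```python
logit = 2.5  # hypothetical raw score produced by a model
probability = sigmoid_function(logit)
print(probability)                     # roughly 0.92
print(1 if probability >= 0.5 else 0)  # predicted class: 1
```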
The Rectified Linear Unit (ReLU) helps mitigate the vanishing gradient problem. It returns the input value itself if it is positive; otherwise, it returns zero.
Here's the ReLU function implemented in Python:
```python
def relu_function(x):
    return x if x > 0 else 0
```
Visualizing the ReLU function:
```python
y = [relu_function(i) for i in x]
plt.plot(x, y)
plt.show()
```
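As with the step function, relu_function works on one value at a time. A common vectorized alternative, sketched below with an illustrative name relu_vectorized, uses np.maximum and reuses the x array from earlier:

```python
def relu_vectorized(x):
    # Element-wise: keeps positive values, clamps negatives to 0
    return np.maximum(0, x)

y = relu_vectorized(x)
plt.plot(x, y)
plt.show()
```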
The tanh function is an S-shaped curve akin to the sigmoid function, but it maps input values to the range -1 to 1. Because its output is zero-centered, it is better suited to inputs that can be both positive and negative.
Defining this function in Python looks as follows:
```python
def tanh_function(x):
    return (2 / (1 + np.exp(-2 * x))) - 1
```
Illustrating this function's behavior through a plot:
```python
y = [tanh_function(i) for i in x]
plt.plot(x, y)
plt.show()
```
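NumPy also ships a built-in np.tanh, so in practice you rarely need to write the formula by hand. A quick sanity check, sketched here over a handful of values, confirms our implementation matches it:

```python
values = np.linspace(-5, 5, 11)
print(np.allclose([tanh_function(v) for v in values], np.tanh(values)))  # True
```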
The softplus function is a smooth approximation of the ReLU function and helps address the dying ReLU problem: it is differentiable everywhere, including at zero, and its gradient is never exactly zero.
Expressing it in Python is simple:
```python
def softplus_function(x):
    return np.log(1 + np.exp(x))
```
Visualizing the softplus function illustrates its characteristic strictly positive outputs:
```python
y = [softplus_function(i) for i in x]
plt.plot(x, y)
plt.show()
```
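To see the "smooth approximation to ReLU" claim in action, and to avoid the overflow that np.exp(x) can hit for large positive inputs, here is a sketch using np.logaddexp (since log(1 + e^x) = log(e^0 + e^x)); the name softplus_stable is ours, and the plot reuses x and relu_function from earlier:

```python
def softplus_stable(x):
    # log(1 + exp(x)) rewritten as logaddexp(0, x) to avoid overflow for large x
    return np.logaddexp(0, x)

y_softplus = [softplus_stable(i) for i in x]
y_relu = [relu_function(i) for i in x]

# Softplus hugs ReLU for large |x| but stays smooth and strictly positive near 0
plt.plot(x, y_softplus, label="softplus")
plt.plot(x, y_relu, label="ReLU")
plt.legend()
plt.show()
```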
We've successfully navigated the theory and application of five key activation functions. Thanks to the practical examples and visualizations, we now have a solid understanding of these essential functions that lay the foundation for neural networks.
Are you excited for some hands-on practice exercises with these functions? We hope you are. Let's delve into the practice!