In this section, we'll start our discussion by explaining what a matrix is. From a mathematical perspective, a matrix is a 2-dimensional array of numbers, which may be real or complex. Matrices are versatile mathematical objects that find extensive use in areas such as physics and engineering. In Python, the NumPy library significantly simplifies working with matrices.
But why are matrices so fundamentally important in data science and machine learning? Consider an image. To a human, it's a visual depiction of an object or a scene; to a computer, it's an array of pixel values: a matrix. For tasks like image recognition, these matrices are processed and analyzed, so understanding how to manipulate matrices is invaluable for any data scientist or machine learning engineer.
To get started, let's import the NumPy library:

```python
import numpy as np
```
With NumPy, we can effortlessly create a matrix using the `array` function. We simply need to provide a 2-dimensional list as a parameter:

```python
A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(A)
"""
[[1 2 3]
 [4 5 6]
 [7 8 9]]
"""
```
In the above snippet, `A` is a 3x3 matrix comprising three rows and three columns. Each row can be thought of as a 3-dimensional vector, with each number acting as a coordinate in 3-dimensional space.
One convenient aspect of matrices in NumPy is how intuitively they can be manipulated. The `+`, `-`, and `*` operators are applied element-wise, i.e., to corresponding elements from each matrix. Note that `*` therefore performs element-wise multiplication (the Hadamard product), which is different from the mathematical matrix product; we'll compute the latter with `np.dot()` later on. This element-wise behavior greatly simplifies these operations, which would otherwise require explicit loops.
For illustration, we'll perform some basic operations on matrices `A` and `B`:

```python
B = np.array([[9, 8, 7], [6, 5, 4], [3, 2, 1]])

C = A + B
print(C)
"""
[[10 10 10]
 [10 10 10]
 [10 10 10]]
"""
```
In the above snippet, we're adding matrices `A` and `B` to produce matrix `C`. Each element in `C` is the sum of the corresponding elements in `A` and `B`. Similarly, we can carry out subtraction and element-wise multiplication:

```python
D = A - B
print(D)
"""
[[-8 -6 -4]
 [-2  0  2]
 [ 4  6  8]]
"""

E = A * B
print(E)
"""
[[ 9 16 21]
 [24 25 24]
 [21 16  9]]
"""
```
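As a supplementary sketch of the distinction mentioned earlier, here is a side-by-side contrast of element-wise multiplication and the true matrix product; the `@` operator (equivalent to `np.dot()` for 2-dimensional arrays) performs the mathematical multiplication:

```python
# Element-wise product: multiplies corresponding entries
print(A * B)
"""
[[ 9 16 21]
 [24 25 24]
 [21 16  9]]
"""

# Mathematical matrix product: rows of A dotted with columns of B
print(A @ B)
"""
[[ 30  24  18]
 [ 84  69  54]
 [138 114  90]]
"""
```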
Unlike the basic operations above, matrix division isn't as straightforward. Instead, we use the concept of a matrix inverse. The inverse of a matrix `A` is defined as a matrix that, when multiplied with `A`, yields the identity matrix. The identity matrix plays the same role for matrices that the number 1 plays for ordinary numbers: any matrix multiplied by the identity matrix returns the original matrix.
In NumPy, we use the `np.linalg.inv()` function to find the inverse of a matrix:

```python
F = np.linalg.inv(E)  # Finding the inverse of matrix E
print(F)
"""
[[-0.73611111  0.88888889 -0.65277778]
 [ 1.33333333 -1.66666667  1.33333333]
 [-0.65277778  0.88888889 -0.73611111]]
"""
```
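As a quick sanity check (an addition for clarity, not part of the original listing), we can confirm that multiplying `E` by its inverse yields the identity matrix, up to floating-point rounding:

```python
# E @ F should be (numerically) the 3x3 identity matrix
print(np.allclose(E @ F, np.eye(3)))  # True
```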
However, keep one caveat in mind: not all matrices have an inverse. Only square matrices with a nonzero determinant are invertible; matrices that lack an inverse are known as singular (or degenerate) matrices. Therefore, when executing matrix inversion, it's important to handle exceptions.
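For example, here is a minimal sketch of such exception handling; `np.linalg.inv()` raises `np.linalg.LinAlgError` when given a singular matrix such as our `A`:

```python
try:
    A_inv = np.linalg.inv(A)  # A is singular, so this raises an exception
except np.linalg.LinAlgError:
    print("Matrix is singular and cannot be inverted.")
```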
To resolve the issue with singular and degenerate matrices, NumPy provides the `np.linalg.pinv()` function, which computes the (Moore-Penrose) pseudoinverse of a matrix.

The difference between these two functions is that `np.linalg.inv(a)` can only be used for square matrices that are invertible, while `np.linalg.pinv(a)` can be used for any matrix. If the original matrix is singular or non-square, `np.linalg.inv(a)` will raise an error, whereas `np.linalg.pinv(a)` will still return a result. The pseudoinverse is a generalization of the inverse concept, providing a kind of 'best guess' at an inverse even when a true inverse doesn't exist.
```python
# A is singular, so np.linalg.inv(A) would raise an exception
FP = np.linalg.pinv(A)  # Finding the pseudoinverse of matrix A
print(FP)
"""
[[-6.38888889e-01 -1.66666667e-01  3.05555556e-01]
 [-5.55555556e-02  1.26893721e-16  5.55555556e-02]
 [ 5.27777778e-01  1.66666667e-01 -1.94444444e-01]]
"""
```
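One way to see in what sense the pseudoinverse is a 'best guess' (an extra check, not from the original lesson) is the defining Moore-Penrose property that `A @ pinv(A) @ A` recovers `A`, which holds even for singular matrices:

```python
# The pseudoinverse satisfies A @ FP @ A == A (up to rounding),
# even though A has no true inverse
print(np.allclose(A @ FP @ A, A))  # True
```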
If you have ever rotated an image on your phone or computer, you have come close to the concept of a transpose: a 90-degree rotation can be implemented as a transpose followed by a flip. In matrix terminology, transposing a matrix means flipping it over its main diagonal, which swaps its row and column indices. This technique is especially useful in data analysis when we want to interchange rows and columns.
In NumPy, we can use either the `np.transpose()` function or the `T` attribute to find the transpose of a matrix:

```python
G = np.transpose(A)
print(G)
"""
[[1 4 7]
 [2 5 8]
 [3 6 9]]
"""

H = A.T
print(H)
"""
[[1 4 7]
 [2 5 8]
 [3 6 9]]
"""
```
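To connect this back to the image-rotation analogy, here is a small illustrative sketch (not part of the original lesson) showing that NumPy's `np.rot90()` rotation is exactly a transpose followed by a vertical flip:

```python
# A 90-degree counterclockwise rotation equals transposing, then flipping vertically
print(np.array_equal(np.rot90(A), np.flipud(A.T)))  # True
```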
Matrix operations aren't just for academic purposes; they play a pivotal role in machine learning. For instance, the weights of the nodes in an artificial neural network are stored as matrices, and each layer computes matrix products between its inputs and those weights. These matrix operations drive forward and backward propagation, the fundamental steps in training deep learning models.
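To make this concrete, here is a minimal sketch of a single dense-layer forward pass; the layer sizes, weight values, and the ReLU activation are illustrative choices of ours, not something prescribed by the lesson:

```python
# Forward pass of one dense layer: output = activation(X @ W + b)
X = np.array([[0.5, 1.0, -0.5]])        # one input sample with 3 features
W = np.array([[0.2, -0.3],
              [0.4,  0.1],
              [-0.5, 0.6]])             # weights: 3 inputs -> 2 neurons
b = np.array([0.1, -0.2])               # one bias per neuron

z = X @ W + b                           # weighted sum (matrix product)
output = np.maximum(z, 0)               # ReLU activation
print(output)                           # [[0.85 0.  ]]
```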
Now that we understand the theory, let's put these matrix operations into action. For this, we'll use a simple example to simulate the preprocessing of a slice of our dataset:

```python
# Assuming these are two features from our dataset
feature_1 = np.array([[123], [456], [789]])
feature_2 = np.array([[321], [654], [987]])

# Combine the two features into one matrix
data_features = np.hstack((feature_1, feature_2))
print(data_features)
"""
[[123 321]
 [456 654]
 [789 987]]
"""
```
In the above example, we used the `np.hstack()` function to stack two feature arrays horizontally. It combined `feature_1` and `feature_2` side by side into a single 2-dimensional array, or matrix.
With this, we can conveniently perform operations on our dataset. For instance, let's normalize our data so that all feature values fall within a specific range. This is a crucial preprocessing step in machine learning, as it ensures no single feature dominates the others purely because of its range of values. Here we divide by `np.linalg.norm()`, which for a matrix defaults to the Frobenius norm (the square root of the sum of all squared entries):

```python
normalized_data_features = data_features / np.linalg.norm(data_features)
print(normalized_data_features)
"""
[[0.08022761 0.2093745 ]
 [0.2974292  0.42657609]
 [0.51463079 0.64377768]]
"""
```
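If you want to verify what `np.linalg.norm()` is doing here, this quick check (an addition for clarity, not from the original lesson) recomputes the Frobenius norm by hand:

```python
# Frobenius norm: square every entry, sum them, take the square root
frobenius = np.sqrt(np.sum(data_features ** 2))
print(np.isclose(frobenius, np.linalg.norm(data_features)))  # True
```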
While the normalization above keeps all values within the `[0, 1]` range, sometimes you might need Min-Max Normalization instead, which guarantees that the output always contains both `0` and `1`: the minimum value maps to `0`, the maximum maps to `1`, and everything else falls in between:

```python
normalized_data_features_minmax = (data_features - np.min(data_features)) / (np.max(data_features) - np.min(data_features))
print(normalized_data_features_minmax)
"""
[[0.         0.22916667]
 [0.38541667 0.61458333]
 [0.77083333 1.        ]]
"""
```
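One practical note: the snippet above uses the global minimum and maximum of the whole matrix. In real preprocessing pipelines, features are usually scaled per column so that each feature independently spans `[0, 1]`; a sketch of that variant (our addition, not the lesson's example) looks like this:

```python
# Min-Max normalization applied column by column (per feature)
col_min = data_features.min(axis=0)
col_max = data_features.max(axis=0)
per_feature_minmax = (data_features - col_min) / (col_max - col_min)
print(per_feature_minmax)
"""
[[0.  0. ]
 [0.5 0.5]
 [1.  1. ]]
"""
```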
Now that we understand how these matrix operations function, let's apply them in a machine learning context. In machine learning, weights are often initialized and represented as matrices or tensors, and the computation of the weighted sum, or dot product, is a staple operation. In fact, every layer of a deep learning model performs a variant of a dot product. So, let's illustrate this scenario in the following code snippet:

```python
# Simulate the weights of features
weights = np.array([0.4, 0.6])

# Calculate the weighted sum of features
weighted_sum_features = np.dot(data_features, weights)
print(weighted_sum_features)
# Output: [241.8 574.8 907.8]
```
In the example above, we computed the product of our processed feature matrix and the weights vector using `np.dot()`. Since `data_features` is 3x2 and `weights` has two entries, the result contains one weighted sum per row. This operation is fundamental to many machine learning models, highlighting the close relationship between matrix operations and machine learning.
Today, we delved into an essential topic in NumPy: matrix operations. These operations provide the groundwork for handling machine learning datasets. We examined matrix addition, subtraction, and multiplication, learned how to find the transpose, inverse, and pseudoinverse of a matrix, and took on practical tasks like normalizing data and calculating the weighted sum of inputs.
Fantastic job in keeping pace with this lesson! Next, we have a series of exercises that will enable you to practice what you have learned and further solidify your understanding of matrix operations. This crucial practice will take you a step further in mastering machine learning. Let's dive in!