Let's delve into Python's NumPy library and focus on its centerpiece: arrays. NumPy, short for 'Numerical Python', specializes in efficient computations on arrays. For numerical work, NumPy arrays are typically faster and more memory-efficient than built-in Python structures such as lists.
The power of NumPy lies in its fast computations on large data arrays, which makes it a staple of data analysis. Before we start, let's import it:
```python
# Import NumPy as 'np' in Python
import numpy as np
```
`np` is the conventional alias for `numpy`.
NumPy arrays, like a sorted shopping list, allow for swift computations and quick access to elements. Let's create a simple one-dimensional NumPy array:
```python
# Creating a one-dimensional (1D) numpy array
array_1d = np.array([1, 2, 3, 4, 5])
print(array_1d)  # prints: [1 2 3 4 5]
```
This code creates a five-element array.
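As a small illustration of quick element access and swift computation, the sketch below reads individual elements by index and then doubles every element in one expression. The variable names `first`, `last`, and `doubled` are illustrative only.

```python
import numpy as np

array_1d = np.array([1, 2, 3, 4, 5])

first = array_1d[0]      # first element (indexing starts at 0)
last = array_1d[-1]      # negative indices count from the end
doubled = array_1d * 2   # arithmetic applies to every element at once

print(first, last)  # 1 5
print(doubled)      # [ 2  4  6  8 10]
```

Note that no explicit loop is needed to double the values; the multiplication is applied element by element.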
We can create multi-dimensional arrays, much as we would write a multi-day shopping list. Here, each sublist `[]` forms a row in the final array:
```python
# Two-dimensional (2D) numpy array
array_2d = np.array([[1, 2, 3], [4, 5, 6]])
print(array_2d)
# prints:
# [[1 2 3]
#  [4 5 6]]
```
Each row in the output corresponds to a sublist in the input list.
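To see the correspondence between sublists and rows directly, here is a minimal sketch that indexes into the 2D array. The name `rows` is introduced only for this example.

```python
import numpy as np

rows = [[1, 2, 3], [4, 5, 6]]
array_2d = np.array(rows)

print(array_2d[0])     # [1 2 3]  (first row, built from the first sublist)
print(array_2d[1])     # [4 5 6]  (second row, built from the second sublist)
print(array_2d[1, 2])  # 6        (row index 1, column index 2)
```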
We can apply the same principle to create a three-dimensional array:
```python
# Three-dimensional (3D) numpy array
array_3d = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
print(array_3d)
# prints:
# [[[ 1  2  3]
#   [ 4  5  6]]
#
#  [[ 7  8  9]
#   [10 11 12]]]
```
NumPy arrays come with built-in properties that describe the structure and type of the data they hold. Three of the most useful are `size`, `shape`, and `dtype`.
Let's start with `size`. This property gives the total number of elements in the array. Elements can be numbers, strings, etc. It is especially useful with multi-dimensional arrays, where manually counting elements can be tedious.
```python
array_3d = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
print("Size:", array_3d.size)  # Size: 12
```
The array above is a 3D array that contains two 2D arrays. Each 2D array has two rows, and each row has three elements. Therefore, the total number of elements is `2 * 2 * 3 = 12`.
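The same count can be reproduced by hand. The sketch below uses Python's built-in `len` at each nesting level and compares the product with `size`; the names `blocks`, `rows`, and `columns` are illustrative.

```python
import numpy as np

array_3d = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])

blocks = len(array_3d)          # 2 two-dimensional arrays
rows = len(array_3d[0])         # 2 rows in each of them
columns = len(array_3d[0][0])   # 3 elements in each row

print(blocks * rows * columns)  # 12
print(array_3d.size)            # 12
```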
Next, we have `shape`, which gives us the array's dimensions. The `shape` property returns a tuple: the number of items in the tuple is the number of dimensions of the array, and each item is the length of the array along that dimension.
```python
print("Shape:", array_3d.shape)  # Shape: (2, 2, 3)
```
In the example above, the shape `(2, 2, 3)` is a tuple of three values, which indicates that our array is 3D: it contains two 2D arrays, each of which holds two rows, each of which holds three elements.
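For comparison, here is a short sketch of `shape` on lower-dimensional arrays. It also uses the `ndim` property, which this lesson has not introduced; it reports the number of dimensions, which is the same as the length of the shape tuple.

```python
import numpy as np

array_1d = np.array([1, 2, 3, 4, 5])
array_2d = np.array([[1, 2, 3], [4, 5, 6]])

print(array_1d.shape)       # (5,)    one dimension of length 5
print(array_2d.shape)       # (2, 3)  two rows, three columns
print(len(array_2d.shape))  # 2
print(array_2d.ndim)        # 2       (ndim is not covered in this lesson)
```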
Lastly, there is `dtype`, which stands for data type. This property tells us what kind of elements the array stores: integers, floats, strings, etc.
```python
print("Data type:", array_3d.dtype)  # Data type: int64
```
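For our example, the data type is `int64` because our array contains only integers. (The default integer type is platform-dependent, so some systems, such as Windows with NumPy versions before 2.0, report `int32` instead.) If the array had held floating-point numbers, the `dtype` would have reflected that.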
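To illustrate how `dtype` follows the contents, the sketch below builds a few small arrays with different element types. The array names are made up for this example.

```python
import numpy as np

float_array = np.array([1.5, 2.0, 3.25])
mixed_array = np.array([1, 2, 3.5])   # integers and a float together
text_array = np.array(["a", "bc"])

print(float_array.dtype)      # float64
print(mixed_array.dtype)      # float64 (the integers are converted to floats)
print(text_array.dtype.kind)  # U (a Unicode string type)
```

An array holds a single data type, so mixing integers with a float yields an all-float array.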
Understanding these properties is vital for effectively working with NumPy arrays, as they provide information about the array's structure and content.
Great job! We have learned how to create basic and multi-dimensional NumPy arrays and examined the properties of arrays. Now, let's move on to some exercises to practice these concepts and prepare for future data analysis challenges.