Hello, Data Explorer! Today's lesson will focus on "selecting data" in NumPy arrays—our goal is to master integer and Boolean indexing and slicing. So, let's jump in!
In this section, we'll discuss "Indexing". It's akin to item numbering in lists, and Python implements it with a zero-based index system. Let's see this principle in action.
    import numpy as np

    # Let's form a 1D array and select, say, the 4th element
    array_1d = np.array([2, 3, 5, 7, 11, 13, 17, 19, 23, 29])
    fourth_element = array_1d[3]  # Selected element: 7

    # To grab the last element, use a negative index
    last_element = array_1d[-1]  # Selected element: 29
Essentially, indexing is a numeric system for items in an array—relatively straightforward!
What about higher dimensions? NumPy arrays can range from 1D to N dimensions. To access a specific element, we use the index pair `(i, j)` for a 2D array, the index triple `(i, j, k)` for a 3D array, and so forth.
    # Form a 2D array and get the element at row 2 (index 1) and column 1 (index 0)
    array_2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
    element = array_2d[1, 0]  # Selected element: 4
In this example, we selected the first element of the second row in a 2D-array—it's pretty simple!
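The same idea carries over to three dimensions. Here is a minimal sketch using a made-up array (the name `array_3d` and its shape are only for illustration), showing the index triple `(i, j, k)` in action:

```python
import numpy as np

# A hypothetical 3D array: 2 blocks, each with 3 rows and 4 columns,
# filled with the numbers 0..23
array_3d = np.arange(24).reshape(2, 3, 4)

# Index triple (i, j, k): block 1, row 0, column 2
element = array_3d[1, 0, 2]
print(element)  # Output: 14

# Negative indices work along every axis
last_element = array_3d[-1, -1, -1]
print(last_element)  # Output: 23

# Supplying fewer indices than dimensions returns a sub-array
first_row_of_second_block = array_3d[1, 0]
print(first_row_of_second_block)  # Output: [12 13 14 15]
```

Each index in the triple picks a position along one axis, from the outermost axis to the innermost.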
Are you ready for some magic? Enter "Boolean Indexing", which functions like a 'Yes/No' filter, with 'Yes' representing True and 'No' for False.
    # A boolean index for even numbers
    array_1d = np.array([8, 4, 7, 3, 4, 11])
    even_index = array_1d % 2 == 0
    even_numbers = array_1d[even_index]
    print(even_numbers)  # Output: [8 4 4]
Alternatively, we can put the condition directly inside the `[]` brackets:
    # A boolean index for even numbers
    array_1d = np.array([8, 4, 7, 3, 4, 11])
    even_numbers = array_1d[array_1d % 2 == 0]
    print(even_numbers)  # Output: [8 4 4]
Voila! Now, we can filter data based on custom conditions.
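To see why this works, it can help to look at the 'Yes/No' filter itself. The short sketch below reuses the array from the examples above and prints the intermediate Boolean array before applying it:

```python
import numpy as np

array_1d = np.array([8, 4, 7, 3, 4, 11])

# The comparison is evaluated element by element and yields a Boolean array
even_index = array_1d % 2 == 0
print(even_index)        # Output: [ True  True False False  True False]
print(even_index.dtype)  # Output: bool

# Indexing with the mask keeps only the positions marked True
print(array_1d[even_index])  # Output: [8 4 4]
```

The mask has one True/False entry per element, and only the elements at True positions survive the selection.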
Now that we've mastered the simple 'Yes/No' binary filter system, let's up the ante with "Complex Conditions in Boolean Indexing". This method refines our filtering process further, allowing us to set more detailed restrictions.
Imagine, for instance, that we want to create an index for even numbers greater than five. We'll merge two conditions to yield this result:
    # A combined boolean index for even numbers > 5
    array_1d = np.array([8, 4, 7, 3, 4, 11])
    even_numbers_greater_than_five = array_1d[(array_1d % 2 == 0) & (array_1d > 5)]
    print(even_numbers_greater_than_five)  # Output: [8]
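In this query, we used the ampersand (`&`) to signify intersection, that is, we're selecting numbers that are both even AND larger than five. Note that Python's plain `and` operator won't work here: it expects a single True/False value rather than an array of them, so NumPy raises a ValueError. Each condition also needs its own parentheses, because `&` binds more tightly than comparison operators such as `==` and `>`.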
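As a quick illustration of that note, the sketch below tries the `and` keyword on the same array and catches the resulting error, then shows two element-wise alternatives. The variable names are only for this example:

```python
import numpy as np

array_1d = np.array([8, 4, 7, 3, 4, 11])

# Python's `and` needs a single True/False value, but each condition is a
# whole Boolean array, so NumPy raises a ValueError
and_failed = False
try:
    result = array_1d[(array_1d % 2 == 0) and (array_1d > 5)]
except ValueError as error:
    and_failed = True
    print("`and` failed:", error)

# The element-wise & operator works; np.logical_and is an equivalent spelling
with_ampersand = array_1d[(array_1d % 2 == 0) & (array_1d > 5)]
with_function = array_1d[np.logical_and(array_1d % 2 == 0, array_1d > 5)]
print(with_ampersand)  # Output: [8]
print(with_function)   # Output: [8]
```

Both working versions compare the two conditions position by position, which is exactly what the filter needs.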
Similarly, we can use the pipe operator (`|`) to signify union, that is, selecting numbers that are either even OR larger than five:
    # A combined boolean index for even numbers or numbers > 5
    array_1d = np.array([8, 4, 7, 3, 4, 11])
    even_numbers_or_numbers_greater_than_five = array_1d[(array_1d % 2 == 0) | (array_1d > 5)]
    print(even_numbers_or_numbers_greater_than_five)  # Output: [8 4 7 4 11]
Awesome, right? This additional filtering layer empowers us to be more specific and intentional about the data we select.
NumPy arrays can be sliced with the same syntax as regular Python lists. As a quick recap, the syntax is `start:stop:step`, where `start` is the first index to select (inclusive), `stop` is the index at which the selection ends (exclusive), and `step` is the stride between selected indices. For example, with `step=1` every element in the range is selected, and with `step=2` every other element is skipped.
Let's take a look at simple examples:
    # Select elements at index 0, 1, 2
    array_1d = np.array([1, 2, 3, 4, 5, 6])
    first_three = array_1d[0:3]
    print(first_three)  # Output: [1 2 3]
Note that slicing is inclusive on the left and exclusive on the right.
Here is another example, this time with a `step` parameter:
    # Select elements at odd indices 1, 3, 5, ...
    array_1d = np.array([1, 2, 3, 4, 5, 6])
    every_second = array_1d[1:6:2]
    print(every_second)  # Output: [2 4 6]
In this case, we select every second element by starting at index `1` and using `step=2`.
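A few more slicing patterns are worth a quick look. The sketch below covers omitted bounds and a negative step, which behave as they do for lists, plus one difference that is easy to miss: a NumPy slice is a view of the original array rather than a copy.

```python
import numpy as np

array_1d = np.array([1, 2, 3, 4, 5, 6])

# Omitted start/stop default to the beginning/end of the array
first_three = array_1d[:3]
from_index_three = array_1d[3:]
every_other = array_1d[::2]
print(first_three)       # Output: [1 2 3]
print(from_index_three)  # Output: [4 5 6]
print(every_other)       # Output: [1 3 5]

# A negative step walks through the array backwards
reversed_array = array_1d[::-1]
print(reversed_array)    # Output: [6 5 4 3 2 1]

# Unlike a list slice, a NumPy slice is a view: writing to it changes the original
view = array_1d[0:3]
view[0] = 100
print(array_1d)          # Output: [100   2   3   4   5   6]

# Call .copy() when an independent array is needed
independent = array_1d[0:3].copy()
independent[0] = -1
print(array_1d[0])       # Output: 100
```

If the slice should not affect the source array, taking a `.copy()` first is the safer choice.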
Congratulations! We've traversed the landscape of NumPy arrays, delving into indexing, Boolean indexing, and slicing. Now, we'll dive into some hands-on exercises. After all, practice makes perfect. Let's go forth, data explorers!