Lesson 2
Unpacking Data Selection Techniques in NumPy Arrays
Lesson Overview and Plan

Hello, Data Explorer! Today's lesson will focus on "selecting data" in NumPy arrays—our goal is to master integer and Boolean indexing and slicing. So, let's jump in!

Understanding Indexing in 1D Arrays

In this section, we'll discuss "Indexing". It's akin to item numbering in lists, and Python implements it with a zero-based index system. Let's see this principle in action.

Python
1import numpy as np 2 3# Let's form a 1D array and select, say, the 4th element 4array_1d = np.array([2, 3, 5, 7, 11, 13, 17, 19, 23, 29]) 5fourth_element = array_1d[3] # Selected element: 7 6 7# To grab the last element, use a negative index 8last_element = array_1d[-1] # Selected element: 29

Essentially, indexing is a numeric system for items in an array—relatively straightforward!

Understanding Indexing in Multi-dimensional Arrays

Are you mindful of higher dimensions? NumPy arrays can range from 1D to N-dimensions. To access specific elements, we use the index pair (i,j) for a 2D array, (i,j,k) for a 3D array, and so forth.

Python
1# Form a 2D array and get the element at row 2 (indexed 1) and column 1 (indexed 0) 2array_2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]) 3element = array_2d[1,0] # Selected element: 4

In this example, we selected the first element of the second row in a 2D-array—it's pretty simple!

Understanding Boolean Indexing

Are you ready for some magic? Enter "Boolean Indexing", which functions like a 'Yes/No' filter, with 'Yes' representing True and 'No' for False.

Python
1# A boolean index for even numbers 2array_1d = np.array([8, 4, 7, 3, 4, 11]) 3even_index = array_1d % 2 == 0 4even_numbers = array_1d[even_index] 5print(even_numbers) # Output: [8 4 4]

Or we can put the condition directly into [] brackets:

Python
1# A boolean index for even numbers 2array_1d = np.array([8, 4, 7, 3, 4, 11]) 3even_numbers = array_1d[array_1d % 2 == 0] 4print(even_numbers) # Output: [8 4 4]

Voila! Now, we can filter data based on custom conditions.

Understanding Complex Conditions in Boolean Indexing

Now that we've mastered the simple 'Yes/No' binary filter system, let's up the ante with "Complex Conditions in Boolean Indexing". This method refines our filtering process further, allowing us to set more detailed restrictions.

Imagine, for instance, that we want to create an index for even numbers greater than five. We'll merge two conditions to yield this result:

Python
1# A combined boolean index for even numbers > 5 2array_1d = np.array([8, 4, 7, 3, 4, 11]) 3even_numbers_greater_than_five = array_1d[(array_1d % 2 == 0) & (array_1d > 5)] 4print(even_numbers_greater_than_five) # Output: [8]

In this query, we used the ampersand (&) to signify intersection - i.e., we're selecting numbers that are both even AND larger than five. Note, that simple and operator won't work here.

Similarly, we can use the pipe operator (|) to signify union - i.e., selecting numbers that are either even OR larger than five:

Python
1# A combined boolean index for even numbers or numbers > 5 2array_1d = np.array([8, 4, 7, 3, 4, 11]) 3even_numbers_or_numbers_greater_than_five = array_1d[(array_1d % 2 == 0) | (array_1d > 5)] 4print(even_numbers_or_numbers_greater_than_five) # Output: [8 4 7 4 11]

Awesome, right? This additional filtering layer empowers us to be more specific and intentional about the data we select.

A Look at Slicing in NumPy Arrays

NumPy arrays could be sliced in the same manner as the regular python list. Let's make a quick recall. The syntax is start:stop:step, where start is the first index to choose (inclusive), stop is the last index to choose (exclusive), and step defines the step of the selection. For example, if the step=1, each element will be selected, and if step=2 – every other one will be skipped.

Let's take a look at simple examples:

Python
1# Select elements at index 0, 1, 2 2array_1d = np.array([1, 2, 3, 4, 5, 6]) 3first_three = array_1d[0:3] 4print(first_three) # Output: [1 2 3]

Note that slicing is inclusive on the left and exclusive on the right.

Another example with a step parameter:

Python
1# Select elements at odd indices 1, 3, 5, ... 2array_1d = np.array([1, 2, 3, 4, 5, 6]) 3every_second = array_1d[1:6:2] 4print(every_second) # Output: [2 4 6]

In this case, we choose every second element, by starting with 1 and using step=2.

Lesson Summary

Congratulations! We've traversed the landscape of NumPy arrays, delving into indexing, Boolean indexing, and slicing. Now, we'll dive into some hands-on exercises. After all, practice makes perfect. Let's go forth, data explorers!

Enjoy this lesson? Now it's time to practice with Cosmo!
Practice is how you turn knowledge into actual skills.