Lesson 6

Navigating DataFrames with Index Columns and Data Location in Pandas

Introduction and Lesson Overview

Welcome, future data analysts! Today, we're tackling index columns and locating elements in a Pandas DataFrame. We'll learn how to handle index columns, locate specific data, and strengthen our understanding of DataFrames. Ready, set, code!

Understanding the Index Column in a Pandas DataFrame

In a Pandas DataFrame, every row has an index label, much like the numbers on books in a library. When a DataFrame is created without an explicit index, Pandas assigns a default one that numbers the rows starting from 0. Let's look at an example:

```python
import pandas as pd

data = {
    "Name": ["John", "Anna", "Peter", "Linda"],
    "Age": [28, 24, 35, 32],
    "City": ["New York", "Paris", "Berlin", "London"]
}

df = pd.DataFrame(data)

print(df)
"""Output:
    Name  Age      City
0   John   28  New York
1   Anna   24     Paris
2  Peter   35    Berlin
3  Linda   32    London
"""
```

The numbers on the left are the default index.
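To make the default index a little more concrete, the short sketch below inspects it directly through the `df.index` attribute. The variable names `default_index` and `index_labels` are only illustrative, and the DataFrame is a trimmed-down version of the one above.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["John", "Anna", "Peter", "Linda"],
    "Age": [28, 24, 35, 32],
})

# With no explicit index, pandas uses a RangeIndex that counts rows from 0.
default_index = df.index
index_labels = default_index.tolist()

print(default_index)  # RangeIndex(start=0, stop=4, step=1)
print(index_labels)   # [0, 1, 2, 3]
```

The labels 0 through 3 are the same numbers shown on the left of the printed DataFrame.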

Setting and Modifying the Index Column

Occasionally, we might need to establish a custom index. Pandas' set_index() method lets us use an existing column as the index. To go back to the default numeric index, we use reset_index().

To better understand these functions, let's consider an example in which we create an index using unique IDs:

```python
df['ID'] = [101, 102, 103, 104]   # Adding unique IDs
df.set_index('ID', inplace=True)  # Setting 'ID' as index

print(df)
"""Output:
      Name  Age      City
ID                       
101   John   28  New York
102   Anna   24     Paris
103  Peter   35    Berlin
104  Linda   32    London
"""
```

In this example, the ID column is now displayed as the index. Let's reset the index to restore the default numeric index; ID then becomes a regular column again:

```python
df.reset_index(inplace=True)  # Resetting index

print(df)
"""Output:
    ID   Name  Age      City
0  101   John   28  New York
1  102   Anna   24     Paris
2  103  Peter   35    Berlin
3  104  Linda   32    London
"""
```

By setting the inplace parameter to True, we ask Pandas to modify the original df DataFrame directly. Otherwise, Pandas returns a new DataFrame with the index reset and leaves the original df untouched. The same applies to set_index().
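As a rough illustration of the difference, the sketch below calls set_index() and reset_index() without inplace=True and keeps the returned DataFrames in new variables. The names `indexed` and `restored` are made up for this example.

```python
import pandas as pd

df = pd.DataFrame({
    "ID": [101, 102, 103, 104],
    "Name": ["John", "Anna", "Peter", "Linda"],
})

# Without inplace=True, set_index() returns a new DataFrame.
indexed = df.set_index("ID")

print(indexed.index.tolist())  # [101, 102, 103, 104]
print(df.index.tolist())       # [0, 1, 2, 3] -> the original is unchanged

# reset_index() likewise returns a new DataFrame, with 'ID' as a regular column.
restored = indexed.reset_index()
print(restored.columns.tolist())  # ['ID', 'Name']
```

Assigning the result to a new variable keeps the original DataFrame available, which can be handy when you want to compare the two versions.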

Locating Elements in a DataFrame

Let's consider a DataFrame with a custom index. If you want to select a specific row based on its index label (for example, ID = 102), you can use loc, which selects by label:

```python
import pandas as pd

data = {
    "Name": ["John", "Anna", "Peter", "Linda"],
    "Age": [28, 24, 35, 32],
    "City": ["New York", "Paris", "Berlin", "London"]
}

df = pd.DataFrame(data)
df['ID'] = [101, 102, 103, 104]   # Adding unique IDs
df.set_index('ID', inplace=True)  # Setting 'ID' as index

print(df.loc[102])
'''Output:
Name     Anna
Age        24
City    Paris
Name: 102, dtype: object
'''
```

Selecting Multiple Rows with `loc`

To select multiple rows, pass a list of index labels:

```python
print(df.loc[[102, 104]])

'''Output:
      Name  Age    City
ID                     
102   Anna   24   Paris
104  Linda   32  London
'''
```

As you can see, passing a list of labels to loc returns a DataFrame containing the selected subset of the original rows.

Selecting Multiple Columns with `loc`

To select specific columns for these rows, pass a list of column labels as the second argument:

```python
print(df.loc[[102, 104], ['Name', 'Age']])
'''Output:
      Name  Age
ID             
102   Anna   24
104  Linda   32
'''
```

You can also select all rows for specific columns by passing : in place of the index labels:

```python
print(df.loc[:, ['Name', 'Age']])
'''Output:
      Name  Age
ID             
101   John   28
102   Anna   24
103  Peter   35
104  Linda   32
'''
```

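The same row-and-column pattern also works with single labels instead of lists. The sketch below is a small, self-contained variation on the examples above; the variable names `anna_city` and `ages` are illustrative only, and the index is built directly with pd.Index rather than with set_index().

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Name": ["John", "Anna", "Peter", "Linda"],
        "Age": [28, 24, 35, 32],
        "City": ["New York", "Paris", "Berlin", "London"],
    },
    index=pd.Index([101, 102, 103, 104], name="ID"),
)

# One row label plus one column label returns a single value.
anna_city = df.loc[102, "City"]
print(anna_city)  # Paris

# A single column label (not a list) returns a Series instead of a DataFrame.
ages = df.loc[:, "Age"]
print(ages.tolist())  # [28, 24, 35, 32]
```

Passing a list such as ['Age'] would keep the result as a one-column DataFrame, while the bare label 'Age' gives a Series.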
Using `iloc` for Location by Index Position

The iloc indexer lets us select elements in a DataFrame by integer position rather than by label. It works like loc, but it expects zero-based row positions. For example, we can select the row at position 3, which is the fourth row because counting starts at 0:

```python
print(df.iloc[3])
'''Output:
Name     Linda
Age         32
City    London
Name: 104, dtype: object
'''
```

You can also use slicing here. As with Python lists, the end position is excluded, so 1:3 returns the rows at positions 1 and 2:

```python
print(df.iloc[1:3])
'''Output:
      Name  Age    City
ID                     
102   Anna   24   Paris
103  Peter   35  Berlin
'''
```

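One detail worth seeing side by side is how slicing differs between the two indexers: iloc slices exclude the end position, while loc slices include the end label. The following sketch rebuilds the same DataFrame and compares them; `by_position`, `by_label`, and `cell` are names chosen just for this illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Name": ["John", "Anna", "Peter", "Linda"],
        "Age": [28, 24, 35, 32],
        "City": ["New York", "Paris", "Berlin", "London"],
    },
    index=pd.Index([101, 102, 103, 104], name="ID"),
)

# iloc slices by position: positions 1 and 2 (position 3 is excluded).
by_position = df.iloc[1:3]

# loc slices by label: labels 102 through 103, with the end label included.
by_label = df.loc[102:103]

print(by_position.index.tolist())  # [102, 103]
print(by_label.index.tolist())     # [102, 103]

# iloc also accepts a column position: row position 3, column position 0.
cell = df.iloc[3, 0]
print(cell)  # Linda
```

Both slices return the same two rows here, even though the end values differ, because one counts positions and the other matches labels.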
Lesson Summary and Next Steps

That's it! We've covered the index column, how to set it, and how to locate data in a DataFrame. Exciting exercises are up next. Let's practice and strengthen the skills you've learned today. Let the fun begin!
