Hello there, future data expert! In this lesson, we're diving into the realm of data analysis in R. Our focus will be on extracting data from vectors and matrices. Let's start by recalling what vectors and matrices are. Think of them as data containers: a vector holds a single row or column of data, while a matrix stores data in both rows and columns, much like shelves.
We'll be selecting data from vectors and matrices, mastering skills that are fundamental when dealing with real-life data, which often comes as long lists and large tables of values. Are you ready to delve into data manipulation in R? Let's set sail!
Vectors and matrices in R are crucial data structures. Think of a vector as a line of data holding values in a single dimension. Here's how you can create a numeric vector:
```r
# Create a numeric vector
ages <- c(35, 22, 48, 50, 27, 36, 25)
print(ages)
```
A matrix, on the other hand, is more akin to a table, where data is stored in rows and columns. A matrix can be created with the `matrix()` function, where we specify a data vector and the number of rows.
```r
# Create a matrix
age_height <- matrix(c(35, 22, 48, 50, 27, 36, 25, 175, 160, 180, 185, 168, 175, 170), nrow = 7)
print(age_height)
```
We can also specify the number of columns using `ncol`:
```r
# Create a matrix with ncol specified
age_height <- matrix(c(35, 22, 48, 50, 27, 36, 25, 175, 160, 180, 185, 168, 175, 170), ncol = 2)
print(age_height)
```
We can also specify both for clarity:
```r
# Create a matrix with ncol and nrow specified
age_height <- matrix(c(35, 22, 48, 50, 27, 36, 25, 175, 160, 180, 185, 168, 175, 170), ncol = 2, nrow = 7)
print(age_height)
```
In this case, `ncol * nrow` should be equal to the length of the provided data vector.
The output of all three code snippets looks like this:
```
     [,1] [,2]
[1,]   35  175
[2,]   22  160
[3,]   48  180
[4,]   50  185
[5,]   27  168
[6,]   36  175
[7,]   25  170
```
This table contains people's ages and heights. Each row is one person: the first value in the row is that person's age, and the second is their height.
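To see why the first seven values end up in the first column, here is a small sketch. It relies on the fact that `matrix()` fills the matrix column by column by default. The names `values`, `n_rows`, and `n_cols` are made up for this illustration and are not part of the lesson's code.

```r
# Data vector: 7 ages followed by 7 heights
values <- c(35, 22, 48, 50, 27, 36, 25, 175, 160, 180, 185, 168, 175, 170)

# Illustrative names for the dimensions we intend to use
n_rows <- 7
n_cols <- 2

# The dimensions should account for every value in the data vector
print(length(values) == n_rows * n_cols) # Output: TRUE

age_height <- matrix(values, nrow = n_rows, ncol = n_cols)

# The matrix is filled column by column, so column 1 holds the first 7 values
print(age_height[, 1]) # Output: 35 22 48 50 27 36 25
print(dim(age_height)) # Output: 7 2
```

Checking the length up front is a cheap way to catch a missing or extra value before it ends up in the wrong cell.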
In R, vectors and matrices have positions. The position of data values in vectors is one-dimensional and starts from 1. In matrices, data is positioned in rows and columns. So, how do you select these? The answer is simple – use their position, which is called an 'index'!
```r
ages <- c(35, 22, 48, 50, 27, 36, 25)

# Select the fifth person's age
print(ages[5]) # Output: 27
```
```r
age_height <- matrix(c(35, 22, 48, 50, 27, 36, 25, 175, 160, 180, 185, 168, 175, 170), nrow = 7)
# Select the height (second column) of the third person
print(age_height[3, 2]) # Output: 180
```
A matrix in R is indexed by `[row, column]`.
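A related pattern, sketched here, is leaving one of the two indices blank, which selects everything along that dimension. The variable names `third_person` and `all_ages` are only illustrative.

```r
age_height <- matrix(c(35, 22, 48, 50, 27, 36, 25,
                       175, 160, 180, 185, 168, 175, 170), nrow = 7)

# Blank column index: every column of row 3 (age and height of person 3)
third_person <- age_height[3, ]
print(third_person) # Output: 48 180

# Blank row index: every row of column 1 (all ages)
all_ages <- age_height[, 1]
print(all_ages) # Output: 35 22 48 50 27 36 25
```

Note that the comma stays in place even when an index is blank, since it tells R which dimension you mean.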
It's time for some hands-on learning! Accessing several consecutive elements of a vector is straightforward: specify the starting and ending indices, separated by a colon.
```r
ages <- c(35, 22, 48, 50, 27, 36, 25)
# Select multiple elements from a vector
print(ages[2:4]) # Output: 22 48 50
```
In matrices, data can be selected from specific columns or rows by stating the row and column indices:
```r
age_height <- matrix(c(35, 22, 48, 50, 27, 36, 25, 175, 160, 180, 185, 168, 175, 170), nrow = 7)
# Selecting ages and heights of the second and third people
print(age_height[2:3, ])
```
The output contains the second and third people's data. Note that the rows of the result are renumbered starting from 1:
```
     [,1] [,2]
[1,]   22  160
[2,]   48  180
```
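The colon is a shortcut for building a vector of consecutive positions, and any vector of positions can serve as an index. The sketch below, which goes slightly beyond the examples above, uses `c()` to pick elements that are not next to each other.

```r
ages <- c(35, 22, 48, 50, 27, 36, 25)
age_height <- matrix(c(35, 22, 48, 50, 27, 36, 25,
                       175, 160, 180, 185, 168, 175, 170), nrow = 7)

# 2:4 expands to the positions 2, 3, 4
print(2:4) # Output: 2 3 4

# A vector of positions built with c() selects non-consecutive elements
print(ages[c(1, 4, 7)]) # Output: 35 50 25

# The same works for matrices: heights (column 2) of the first and last person
print(age_height[c(1, 7), 2]) # Output: 175 170
```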
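One common pitfall in data selection is an index that is out of bounds, which happens if you try to access a position beyond the size of your vector or matrix. The two structures react differently: a vector quietly returns `NA`, while a matrix stops with an error.
```r
# Vector: no error, but the result is NA (a missing value)
print(ages[10]) # Output: NA

# Matrix: produces an error
print(age_height[, 10]) # Error: subscript out of bounds
```
To avoid both, just be sure that your index isn't larger than your data container! You can check the size with `length()` for a vector and `dim()` for a matrix.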
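One way to guard against out-of-range positions is to compare the index with the container's size before using it. The sketch below is only an illustration: `is_valid_index` is a hypothetical helper written for this example, not a built-in R function. It also shows what R does with an out-of-range index: a vector returns `NA`, while a matrix raises an error, which `tryCatch()` can intercept.

```r
ages <- c(35, 22, 48, 50, 27, 36, 25)
age_height <- matrix(c(35, 22, 48, 50, 27, 36, 25,
                       175, 160, 180, 185, 168, 175, 170), nrow = 7)

# Hypothetical helper: TRUE only when i is a valid position in vector x
is_valid_index <- function(x, i) {
  i >= 1 && i <= length(x)
}

print(length(ages))             # Output: 7
print(is_valid_index(ages, 5))  # Output: TRUE
print(is_valid_index(ages, 10)) # Output: FALSE

# A vector index past the end gives NA instead of an error
print(is.na(ages[10])) # Output: TRUE

# A matrix column past the end raises an error; capture its message
result <- tryCatch(
  age_height[, 10],
  error = function(e) conditionMessage(e)
)
print(result) # Output: "subscript out of bounds"
```

For matrices, `nrow()` and `ncol()` return the number of rows and columns, so the same kind of comparison can be made for each dimension.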
Congratulations! You've now learned the concept of vectors and matrices in R, as well as how to select data from them, and how to avoid common errors.
Next, you should apply these skills in practice exercises to solidify your understanding. Brace yourself as we delve deeper into data analysis! See you in the next lesson.