In this lesson, we will explore working with CSV files, a prevalent format used for data storage and interchange. By the end of this lesson, you will know how to read data from CSV files, identify rows and columns separated by commas, and convert string data into integers for further manipulation. This lesson builds on your existing knowledge of file parsing in C# and introduces new techniques to enhance your data-handling capabilities.
CSV stands for Comma-Separated Values and is a format that stores tabular data in plain text. Each line represents a data row, and columns are separated by commas, allowing for easy storage and interpretation.
Imagine you have a CSV file named `data.csv`:

```text
Name,Age,Occupation
John,28,Engineer
Alice,34,Doctor
Bob,23,Artist
```
In this file:
- The first line contains the headers: `Name`, `Age`, and `Occupation`.
- Each subsequent line contains data for an individual, with values separated by commas.
Understanding the structure of CSV files is crucial as it guides us on how to parse the data effectively in our programming environment.
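As a rough illustration of that structure, the sketch below inlines the sample rows and separates the header line from the data rows. The `CsvShape.SplitRow` helper is a hypothetical name introduced only for this example, and it assumes no field contains a comma of its own.

```csharp
using System;

// Sample content mirroring data.csv, inlined so the sketch runs on its own.
string[] lines =
{
    "Name,Age,Occupation",
    "John,28,Engineer",
    "Alice,34,Doctor",
    "Bob,23,Artist"
};

// The first line holds the column names; every other line is a data row.
string[] headers = CsvShape.SplitRow(lines[0]);
Console.WriteLine($"Columns: {string.Join(" | ", headers)}"); // Columns: Name | Age | Occupation
Console.WriteLine($"Data rows: {lines.Length - 1}");          // Data rows: 3

// Hypothetical helper, not part of the lesson's code.
static class CsvShape
{
    // Splits one CSV line on commas; assumes no field contains a comma itself.
    public static string[] SplitRow(string line) => line.Split(',');
}
```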
To effectively manage the data lines from a CSV file, we'll define a class to represent a row of data with properties.
```csharp
// Define a class to hold a row of data
class Person
{
    public string? Name { get; set; }
    public int? Age { get; set; }
    public string? Occupation { get; set; }
}
```
Here, the `Person` class includes three properties: `Name`, `Age`, and `Occupation`, one for each column in our CSV file.
Let's start reading the CSV file using `File.ReadAllLines` to handle file input.

```csharp
var lines = File.ReadAllLines("data.csv").Skip(1); // Skip the header
```
Here, `File.ReadAllLines` reads every line of the file into a string array, and `Skip(1)` (a LINQ extension method from the `System.Linq` namespace) drops the header line `Name,Age,Occupation`, since it is not data meant for processing.
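If the file might be missing or contain blank lines, a slightly more defensive read can help. The following is a minimal sketch, not the lesson's own method: `CsvFile.ReadDataRows` and the temporary file name are hypothetical, and it swaps in `File.ReadLines`, which streams lines lazily instead of loading them all at once.

```csharp
using System;
using System.IO;
using System.Linq;

// Write a small sample file so this sketch is self-contained;
// in the lesson, data.csv is assumed to exist already.
string path = Path.Combine(Path.GetTempPath(), "data_sample.csv");
File.WriteAllLines(path, new[]
{
    "Name,Age,Occupation",
    "John,28,Engineer",
    "Alice,34,Doctor"
});

string[] rows = CsvFile.ReadDataRows(path);
Console.WriteLine(rows.Length); // 2
Console.WriteLine(rows[0]);     // John,28,Engineer

File.Delete(path);

// Hypothetical helper, not part of the lesson's code.
static class CsvFile
{
    // Returns every non-blank line after the header,
    // or an empty array when the file does not exist.
    public static string[] ReadDataRows(string path)
    {
        if (!File.Exists(path))
        {
            return Array.Empty<string>();
        }

        return File.ReadLines(path)                              // stream lines lazily
                   .Skip(1)                                      // drop the header row
                   .Where(line => !string.IsNullOrWhiteSpace(line)) // ignore blank lines
                   .ToArray();
    }
}
```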
To parse each line of the CSV file, we use C#'s `Split` method to extract the individual data fields:
- Source String: The line read from the file, which contains the row of data.
- Split Method: Called on this source string to split the comma-separated values into an array of strings. Each element of this array represents a data column.
- Storing Data: Each piece of data is then accessed by index and stored in a new `Person` object.
Here's how we use it in our code:
```csharp
var persons = new List<Person>(); // Holds one Person per data row

foreach (var line in lines)
{
    var columns = line.Split(','); // Split line into columns
    var person = new Person
    {
        Name = columns[0],
        Age = int.Parse(columns[1]), // Convert string to integer
        Occupation = columns[2]
    };
    persons.Add(person); // Add the person object to the list
}
```
In this code snippet:
- `line.Split(',')`: Splits each line on commas and stores the result in the array `columns`.
- `int.Parse(columns[1])`: Converts the string-based age to an integer, allowing us to handle numerical computations later if necessary.
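Note that `int.Parse` throws a `FormatException` when the text is not a valid integer, and indexing `columns[2]` throws when a row has too few fields. One possible way to tolerate such rows is sketched below using `int.TryParse`. The `RowParser.ParsePerson` helper and the malformed sample rows are assumptions made for illustration, not part of the lesson's code.

```csharp
using System;
using System.Collections.Generic;

string[] lines =
{
    "John,28,Engineer",
    "Alice,thirty-four,Doctor", // age is not a number
    "Bob,23"                    // missing the Occupation column
};

var persons = new List<Person>();
foreach (var line in lines)
{
    Person? person = RowParser.ParsePerson(line);
    if (person is not null)
    {
        persons.Add(person);
    }
    else
    {
        Console.WriteLine($"Skipped malformed row: {line}");
    }
}
Console.WriteLine($"Parsed {persons.Count} of {lines.Length} rows"); // Parsed 1 of 3 rows

// Same shape as the lesson's Person class.
class Person
{
    public string? Name { get; set; }
    public int? Age { get; set; }
    public string? Occupation { get; set; }
}

// Hypothetical helper, not part of the lesson's code.
static class RowParser
{
    // Returns a Person for a well-formed row, or null when the row
    // does not have exactly three columns or the age is not an integer.
    public static Person? ParsePerson(string line)
    {
        var columns = line.Split(',');
        if (columns.Length != 3)
        {
            return null;
        }
        if (!int.TryParse(columns[1].Trim(), out int age))
        {
            return null;
        }
        return new Person
        {
            Name = columns[0].Trim(),
            Age = age,
            Occupation = columns[2].Trim()
        };
    }
}
```

Whether to skip, log, or reject the whole file on a bad row depends on the application; skipping is used here only to keep the sketch short.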
To ensure the CSV data is correctly parsed and stored in `Person` objects, we can print the contents of the `persons` list:

```csharp
Console.WriteLine("Parsed CSV Data:");
foreach (var person in persons)
{
    Console.WriteLine($"{person.Name} {person.Age} {person.Occupation}");
}
```
The expected console output looks like this, confirming accurate parsing:

```text
Parsed CSV Data:
John 28 Engineer
Alice 34 Doctor
Bob 23 Artist
```
This output indicates that each line from the CSV has been successfully converted into a `Person` object.
In this lesson, we covered how to parse CSV files in C#, focusing on reading comma-delimited data and converting string data to integers. You've learned how to define a data structure for holding parsed rows and how to use C#'s file-reading and string-manipulation methods to handle CSV content.
As you move on to practice exercises, remember to verify the correctness of your parsed data and reflect on potential applications of what you've learned. Keep up the excellent work and continue exploring more advanced data-handling techniques in C#.