Lesson 3
Parsing CSV Files in C#
Introduction and Context Setting

In this lesson, we will explore working with CSV files, a prevalent format used for data storage and interchange. By the end of this lesson, you will be able to read data from CSV files, identify rows and columns separated by commas, and convert string data into integers for further manipulation. This lesson builds on your existing knowledge of file parsing in C# and introduces new techniques to enhance your data-handling capabilities.

Understanding CSV Structure and Delimiter

CSV stands for Comma-Separated Values and is a format that stores tabular data in plain text. Each line represents a data row, and columns are separated by commas, allowing for easy storage and interpretation.

Imagine you have a CSV file named data.csv:

Name,Age,Occupation
John,28,Engineer
Alice,34,Doctor
Bob,23,Artist

In this file:

  • The first line contains the headers: Name, Age, and Occupation.
  • Each subsequent line contains data for an individual, with values separated by commas.

Understanding the structure of CSV files is crucial as it guides us on how to parse the data effectively in our programming environment.
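To make the structure concrete, here is a minimal sketch that separates the header from the data rows. It assumes the sample contents are held in a string rather than read from disk, and the variable names (csvText, headers, dataRows) are illustrative only.

```csharp
using System;
using System.Linq;

// Assumption: the sample data.csv contents are kept in a string
// so that this sketch runs without a file on disk.
string csvText = "Name,Age,Occupation\nJohn,28,Engineer\nAlice,34,Doctor\nBob,23,Artist";

// Each line is one row; the first row is the header.
string[] rows = csvText.Split('\n');
string[] headers = rows[0].Split(',');
string[] dataRows = rows.Skip(1).ToArray();

Console.WriteLine($"Columns: {headers.Length}");    // Columns: 3
Console.WriteLine($"Data rows: {dataRows.Length}"); // Data rows: 3
Console.WriteLine(string.Join(" | ", headers));     // Name | Age | Occupation
```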

Setting Up The Data Structure for CSV Parsing

To manage the data lines from a CSV file, we'll define a class whose properties represent the columns of a single row.

C#
// Define a class to hold a row of data
class Person
{
    public string? Name { get; set; }
    public int? Age { get; set; }
    public string? Occupation { get; set; }
}

Here, the Person class includes three properties: Name, Age, and Occupation, one for each column in our CSV file. The ? on each type marks the property as nullable, so it can hold null until a value is assigned.

Reading CSV Data

Let's start reading the CSV file using File.ReadAllLines to handle file input.

C#
var lines = File.ReadAllLines("data.csv").Skip(1); // Skip the header

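File.ReadAllLines reads the whole file into a string array with one element per line. Skip(1) then drops the header line Name,Age,Occupation, since it is not data meant for processing. File belongs to the System.IO namespace, and Skip is a LINQ extension method from System.Linq.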
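The reading step can be tried in isolation with a sketch like the one below. It assumes the sample rows are first written to a temporary file (the path variable is illustrative) so the example runs anywhere. It also calls ToList(), which the lesson's snippet does not, to turn the lazy sequence returned by Skip into a list.

```csharp
using System;
using System.IO;
using System.Linq;

// Assumption: write the sample rows to a temporary file so the sketch is self-contained.
string path = Path.Combine(Path.GetTempPath(), "data.csv");
File.WriteAllLines(path, new[]
{
    "Name,Age,Occupation",
    "John,28,Engineer",
    "Alice,34,Doctor",
    "Bob,23,Artist"
});

// ReadAllLines returns string[]; Skip(1) yields a lazy IEnumerable<string>,
// so ToList() materializes the data rows once.
var lines = File.ReadAllLines(path).Skip(1).ToList();

Console.WriteLine(lines.Count); // 3
Console.WriteLine(lines[0]);    // John,28,Engineer
```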

Parsing Each Line

In parsing each line of the CSV file, we utilize C#'s Split method to extract individual data fields:

  1. Source String: The line read from the file, which contains the row of data.

  2. Split Method: Called on the source string to break the comma-separated values into an array of strings. Each element of this array holds the value of one column.

  3. Storing Data: Pieces of data are then individually accessed using indices and stored in a new Person object.

Here's how we use it in our code:

C#
// List to collect the parsed rows (List<T> is in System.Collections.Generic)
var persons = new List<Person>();

foreach (var line in lines)
{
    var columns = line.Split(','); // Split line into columns
    var person = new Person
    {
        Name = columns[0],
        Age = int.Parse(columns[1]), // Convert string to integer
        Occupation = columns[2]
    };
    persons.Add(person); // Add the person object to the list
}

In this code snippet:

  • line.Split(','): Executes the split operation for each line based on commas, storing the result in an array columns.

  • int.Parse(columns[1]): Converts the string-based age to an integer, allowing us to handle numerical computations later if necessary.
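Note that int.Parse throws a FormatException when the text is not a valid integer. If the input may contain bad values, one possible alternative is int.TryParse, which reports failure through its return value instead. The sketch below is not part of the lesson's code; ParseAge is a hypothetical helper and the second sample line is invented to show the failure case.

```csharp
using System;

// Hypothetical helper: returns the age, or null when the field is not a valid integer.
static int? ParseAge(string field) =>
    int.TryParse(field, out int age) ? age : (int?)null;

// The second line is an invented example of a malformed age field.
string[] sampleLines = { "John,28,Engineer", "Alice,unknown,Doctor" };

foreach (var line in sampleLines)
{
    string[] columns = line.Split(',');
    int? age = ParseAge(columns[1]);
    string ageText = age.HasValue ? age.Value.ToString() : "invalid age";
    Console.WriteLine($"{columns[0]}: {ageText}");
}
// Prints:
// John: 28
// Alice: invalid age
```

Returning a nullable int fits the Person class, whose Age property is already declared as int?.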

Verifying Parsed Output

To ensure the CSV data is correctly parsed and stored in Person objects, we can print the contents of the persons list:

C#
Console.WriteLine("Parsed CSV Data:");
foreach (var person in persons)
{
    Console.WriteLine($"{person.Name} {person.Age} {person.Occupation}");
}

The console output should look like this, confirming accurate parsing:

Plain text
Parsed CSV Data:
John 28 Engineer
Alice 34 Doctor
Bob 23 Artist

This output indicates that each line from the CSV has been successfully converted into a Person object.
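For reference, the individual steps of this lesson can be assembled into one runnable program, sketched below. It assumes a project that supports top-level statements (C# 9 or later) and creates data.csv itself so that it does not depend on an existing file. Everything else follows the snippets shown earlier.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Assumption: create the sample file here so the program is self-contained.
File.WriteAllLines("data.csv", new[]
{
    "Name,Age,Occupation",
    "John,28,Engineer",
    "Alice,34,Doctor",
    "Bob,23,Artist"
});

var persons = new List<Person>();

// Read every line, skip the header, and map each row to a Person.
foreach (var line in File.ReadAllLines("data.csv").Skip(1))
{
    var columns = line.Split(',');
    persons.Add(new Person
    {
        Name = columns[0],
        Age = int.Parse(columns[1]), // Convert string to integer
        Occupation = columns[2]
    });
}

Console.WriteLine("Parsed CSV Data:");
foreach (var person in persons)
{
    Console.WriteLine($"{person.Name} {person.Age} {person.Occupation}");
}

// Type declarations must follow the top-level statements.
class Person
{
    public string? Name { get; set; }
    public int? Age { get; set; }
    public string? Occupation { get; set; }
}
```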

Summary and Preparing for Practice

In this lesson, we covered how to parse CSV files in C#, focusing on reading data with commas as delimiters and converting string data to integers. You've learned how to define a class that holds the parsed data, how to read the file with File.ReadAllLines, and how to split each line into fields with Split.

As you move on to practice exercises, remember to verify the correctness of your parsed data and reflect on potential applications of what you've learned. Keep up the excellent work and continue exploring more advanced data-handling techniques in C#.
