Reading CSV files in Python

In this article, we will learn how to read data from CSV files of different formats in Python.

Reading different types of CSV files

In Python, we use the csv.reader() function of the csv module to read a CSV file. Here, we will show you how to read CSV files in different formats: files that use the default comma (,) delimiter, files with quoted fields ("") and files with a custom delimiter such as a pipe (|).

Normal CSV file

We have a CSV file called people.csv that uses the default delimiter, a comma (,), with the following data:

SN, Name, City
1, John, Washington
2, Eric, Los Angeles
3, Brad, Texas

Example 1: Read people.csv file, where delimiter is comma (,)

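import csv

# The with statement closes the file automatically, so no close() call is needed.
# newline='' is the setting recommended by the csv module documentation.
with open('people.csv', 'r', newline='') as csvFile:
    reader = csv.reader(csvFile)
    for row in reader:
        print(row)  # each row is a list of strings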

When we run the above program, the output will be

['SN', ' Name', ' City']
['1', ' John', ' Washington']
['2', ' Eric', ' Los Angeles']
['3', ' Brad', ' Texas']

In the above program, we open the people.csv file and pass the file object to csv.reader(). Then, we print each row, which the reader returns as a list of strings, one string per column.
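
Since the first row of people.csv holds the column names, it is often handy to separate it from the data rows. The following is a minimal sketch, assuming the content of people.csv is held in an in-memory string so that it runs without a file on disk. The io.StringIO stand-in and the names sample, header and rows are illustrative and not part of Example 1.

```python
import csv
import io

# Stand-in for people.csv (assumption: same content as the file above).
sample = "SN, Name, City\n1, John, Washington\n2, Eric, Los Angeles\n3, Brad, Texas\n"

with io.StringIO(sample) as csvFile:
    reader = csv.reader(csvFile)
    header = next(reader)  # the first row: column names
    rows = list(reader)    # the remaining rows: data

print(header)
print(len(rows))
```

Calling next() on the reader consumes one row, so the loop or list() that follows only sees the data rows.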


CSV files with Initial Spaces

As you can see in Example 1, the spaces after the delimiter ended up in the output too. The csv module has a solution for this kind of file.

We can read the CSV file and remove these whitespaces by registering a new dialect with the csv.register_dialect() function of the csv module. A dialect describes the format of the CSV file that is to be read.

A dialect has a parameter, skipinitialspace, which removes the whitespace that immediately follows the delimiter. Its default value is False, but in our situation we have to set it to True.

We will use the same people.csv file as in Example 1, with the following data:

SN, Name, City
1, John, Washington
2, Eric, Los Angeles
3, Brad, Texas

Example 2: Read people.csv file, where the delimiter is followed by spaces

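import csv

# Register a dialect that skips the whitespace following each delimiter.
csv.register_dialect('myDialect',
                     delimiter=',',
                     skipinitialspace=True)

# The with statement closes the file automatically, so no close() call is needed.
with open('people.csv', 'r', newline='') as csvFile:
    reader = csv.reader(csvFile, dialect='myDialect')
    for row in reader:
        print(row)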

When we run the above program, the output will be

['SN', 'Name', 'City']
['1', 'John', 'Washington']
['2', 'Eric', 'Los Angeles']
['3', 'Brad', 'Texas']

In the above program, we registered a new dialect with delimiter=',' and skipinitialspace=True, which tells the reader to ignore the whitespace immediately following each delimiter.
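
Registering a dialect is not the only route: csv.reader() also accepts formatting parameters such as skipinitialspace as keyword arguments. Below is a small sketch of that alternative, assuming a shortened copy of the people.csv data held in an in-memory string (the sample and rows names are illustrative).

```python
import csv
import io

# Shortened stand-in for people.csv (assumption, for illustration only).
sample = "SN, Name, City\n1, John, Washington\n"

# skipinitialspace is passed straight to csv.reader instead of via a dialect.
reader = csv.reader(io.StringIO(sample), skipinitialspace=True)
rows = list(reader)
print(rows)
```

A registered dialect is mostly worthwhile when the same settings are reused in several places; for a one-off read, keyword arguments are usually enough.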

*Note: A dialect is a named set of formatting parameters for reading and writing CSV, represented in the csv module by the Dialect class. It allows you to create, store, and re-use various formatting parameters for your data.
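
To see how registered dialects are stored and looked up, the sketch below registers a dialect, checks that it appears in csv.list_dialects(), inspects it with csv.get_dialect() and removes it again. The dialect name spacedCommas is hypothetical and chosen only for this illustration.

```python
import csv

# 'spacedCommas' is a hypothetical name used only in this sketch.
csv.register_dialect('spacedCommas', delimiter=',', skipinitialspace=True)

names = csv.list_dialects()                # also contains built-ins such as 'excel'
dialect = csv.get_dialect('spacedCommas')  # look the dialect up by name

print('spacedCommas' in names)
print(dialect.delimiter, dialect.skipinitialspace)

# Remove the dialect from the registry once it is no longer needed.
csv.unregister_dialect('spacedCommas')
```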


CSV file with quotes

We can read a CSV file with quoted fields by registering a new dialect with the csv.register_dialect() function of the csv module.

Here, we have a quotes.csv file with the following data:

"SN", "Name", "Quotes"
1, Buddha, "What we think we become"
2, Mark Twain, "Never regret anything that made you smile"
3, Oscar Wilde, "Be yourself everyone else is already taken"

Example 3: Read quotes.csv file, where delimiter is comma(,) but with quotes

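import csv

# skipinitialspace=True lets the reader see the opening quote
# that follows the space after each delimiter.
csv.register_dialect('myDialect',
                     delimiter=',',
                     quoting=csv.QUOTE_ALL,
                     skipinitialspace=True)

with open('quotes.csv', 'r', newline='') as f:
    reader = csv.reader(f, dialect='myDialect')
    for row in reader:
        print(row[2])  # third column: the quote text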

When we run the above program, the output will be

Quotes
What we think we become
Never regret anything that made you smile
Be yourself everyone else is already taken

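In the above program, we register a dialect named myDialect and print the third column of each row. The reader strips the surrounding double quotes because the default quotechar is '"' and skipinitialspace=True removes the space in front of each opening quote. The quoting=csv.QUOTE_ALL setting mainly matters when writing, where it makes csv.writer() quote every field; it does not change how this file is read.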
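
The effect of skipinitialspace on quoted fields is easiest to see when the quoted text itself contains a comma. The sketch below parses one line with and without the setting. The line is a modified version of the first data row of quotes.csv (a comma was added inside the quote for illustration), and the names line, plain and skipped are assumptions of this sketch.

```python
import csv
import io

# Modified sample row: note the comma inside the quoted text.
line = '1, Buddha, "What we think, we become"\n'

# Default settings: the quote follows a space, so it is not treated as a field quote.
plain = next(csv.reader(io.StringIO(line)))

# With skipinitialspace=True the quoted field is recognised and kept whole.
skipped = next(csv.reader(io.StringIO(line), skipinitialspace=True))

print(plain)
print(skipped)
```

Without the setting, the line is split into four fields and the quote characters stay in the data; with it, the line yields the expected three fields.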


CSV files with Custom Delimiters

We can read a CSV file that has a custom delimiter by registering a new dialect with the help of csv.register_dialect().

Here, we have the following data in a file called dialects.csv:

"pencil"|"eraser"|"sharpner"
"book"|"chair"|"table"
"apple"|"mango"|"grapes"

Example 4: Read dialects.csv file

import csv

# Use a pipe (|) instead of a comma as the column separator.
csv.register_dialect('myDialect', delimiter='|')

with open('dialects.csv', 'r', newline='') as f:
    reader = csv.reader(f, dialect='myDialect')
    for row in reader:
        print(row)

When we run the above program, the output will be

['pencil', 'eraser', 'sharpner']
['book', 'chair', 'table']
['apple', 'mango', 'grapes']

In the above program, we register a new dialect named myDialect with delimiter='|', so that a pipe (|) is treated as the column separator.
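
Quoting still works with a custom delimiter: a pipe that appears inside a quoted field is kept as data instead of splitting the field. The following sketch assumes a two-line sample modelled on dialects.csv, where the value "pen|ink" was made up for illustration.

```python
import csv
import io

# Sample modelled on dialects.csv; "pen|ink" is an invented value
# that contains the delimiter inside a quoted field.
sample = '"pencil"|"eraser"|"sharpner"\n"pen|ink"|"chair"|"table"\n'

reader = csv.reader(io.StringIO(sample), delimiter='|')
rows = list(reader)
print(rows)
```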


Reading CSV file into a dictionary

To read a CSV file into a dictionary, we can use the DictReader() class of the csv module. It works like reader(), but maps each row to a dictionary. The keys are given by the fieldnames parameter; if it is omitted, the values in the first row of the file are used as the keys.
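
When a file has no header row, the keys can be supplied through the fieldnames parameter. Here is a minimal sketch, assuming headerless data held in an in-memory string; the sample and records names are illustrative.

```python
import csv
import io

# Headerless sample data (assumption: same columns as people.csv).
sample = "1,John,Washington\n2,Eric,Los Angeles\n"

# fieldnames supplies the keys, so the first line is treated as data.
reader = csv.DictReader(io.StringIO(sample), fieldnames=['SN', 'Name', 'City'])
records = [dict(row) for row in reader]
print(records)
```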

Example 5: Read people.csv file into a dictionary

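import csv

# The with statement closes the file automatically, so no close() call is needed.
with open("people.csv", 'r', newline='') as csvfile:
    reader = csv.DictReader(csvfile)  # keys are taken from the first row
    for row in reader:
        print(dict(row))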

When we run the above program, the output will be

{'SN': '1', ' Name': ' John', ' City': ' Washington'}
{'SN': '2', ' Name': ' Eric', ' City': ' Los Angeles'}
{'SN': '3', ' Name': ' Brad', ' City': ' Texas'}

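In the above program, we use DictReader() to read the people.csv file and map each row to a dictionary whose keys come from the header row. Then, we use dict() to convert each row to a regular dictionary before printing it. The keys and values keep their leading spaces because skipinitialspace is not set.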

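If we remove the dict() function from the above program and only use print(row), the output is unchanged on Python 3.8 and later, where each row is already a regular dictionary. On Python 3.6 and 3.7, the output will be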

OrderedDict([('SN', '1'), (' Name', ' John'), (' City', ' Washington')])
OrderedDict([('SN', '2'), (' Name', ' Eric'), (' City', ' Los Angeles')])
OrderedDict([('SN', '3'), (' Name', ' Brad'), (' City', ' Texas')])

We can also register a new dialect and use it with DictReader(). Suppose we have a people_data.csv file in the following format:

SN| Name| City
1| John| Washington
2| Eric| Los Angeles
3| Brad| Texas

Example 6: Read people_data.csv into a dictionary by registering a new dialect

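import csv

# Pipe-delimited data with a space after each delimiter.
csv.register_dialect('myDialect',
                     delimiter='|',
                     skipinitialspace=True)

# The with statement closes the file automatically, so no close() call is needed.
with open("people_data.csv", 'r', newline='') as csvfile:
    reader = csv.DictReader(csvfile, dialect='myDialect')
    for row in reader:
        print(row)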

When we run the above program on Python 3.6 or 3.7, the output will be

OrderedDict([('SN', '1'), ('Name', 'John'), ('City', 'Washington')])
OrderedDict([('SN', '2'), ('Name', 'Eric'), ('City', 'Los Angeles')])
OrderedDict([('SN', '3'), ('Name', 'Brad'), ('City', 'Texas')])

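*Note: In Python 3.6 and 3.7, DictReader() returns each row as an OrderedDict, a dictionary subclass that remembers the order in which its contents are added. From Python 3.8 onward, rows are returned as regular dict objects, which also preserve insertion order, so the rows above would be printed as {'SN': '1', 'Name': 'John', 'City': 'Washington'} and so on.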