Dealing with Excel Spreadsheets using Python

3 min read
Dealing with Excel Spreadsheets using Python

One day at work, I was asked to read data from an excel spreadsheet provided by a third-party. The file contain hundreds of lines detailing responses of a SOAP server. This spreadsheet is updated regularly and my task was to have the SOAP client verify responses from that spreadsheet and perform designated actions.

I used xlrd to read the spreadsheets. It was very easy and straightforward. To open an excel spreadsheet, call the open_workbook() function.

#!/usr/bin/env python
import xlrd

workbook = xlrd.open_workbook('Lotto Results.xlsx')

To get the details about the number of worksheets and the worksheet names, we can use nsheets and sheet_names().

print(workbook.nsheets)  # number of worksheets
print(workbook.sheet_names())  # list of worksheet names

Output:

5
['Lotto 642', 'Mega Lotto 645', 'Super Lotto 649', 'Grand Lotto 655', 'Ultra Lotto 658']

There are 2 ways to access worksheets. The first way is to specify the index number. The other method is to specify the sheet name.

worksheet1 = workbook.sheet_by_index(0)
worksheet2 = workbook.sheet_by_name('Mega Lotto 645')

To iterate over the rows of the worksheets, use get_rows().

for row in worksheet1.get_rows():
    print(row)

Output:

[text:'LOTTO GAME', text:'COMBINATIONS', text:'DRAW DATE', text:'JACKPOT', text:'WINNERS']
[text:'Lotto 6/42', text:'06-41-07-22-39-33', xldate:39816.0, number:9568598.4, number:0.0]
[text:'Lotto 6/42', text:'18-42-39-20-22-29', xldate:39819.0, number:14573811.6, number:2.0]
[text:'Lotto 6/42', text:'31-04-01-24-12-41', xldate:39823.0, number:4536991.8, number:0.0]
[text:'Lotto 6/42', text:'08-38-01-40-34-32', xldate:39826.0, number:9185963.4, number:0.0]
[text:'Lotto 6/42', text:'31-02-06-15-14-04', xldate:39830.0, number:13915504.8, number:3.0]
[text:'Lotto 6/42', text:'02-31-27-10-37-05', xldate:39833.0, number:3731112.0, number:0.0]
[text:'Lotto 6/42', text:'20-21-22-03-15-02', xldate:39837.0, number:8096659.2, number:1.0]
[text:'Lotto 6/42', text:'29-32-41-38-05-08', xldate:39840.0, number:3814696.8, number:0.0]
[text:'Lotto 6/42', text:'30-41-39-17-07-38', xldate:39844.0, number:8385651.0, number:1.0]

Note that each item in the list are Cell objects. To get the values of each cell, use cell.value.

for row in worksheet1.get_rows():
    print([cell.value for cell in row])

Output:

['LOTTO GAME', 'COMBINATIONS', 'DRAW DATE', 'JACKPOT', 'WINNERS']
['Lotto 6/42', '06-41-07-22-39-33', 39816.0, 9568598.4, 0.0]
['Lotto 6/42', '18-42-39-20-22-29', 39819.0, 14573811.6, 2.0]
['Lotto 6/42', '31-04-01-24-12-41', 39823.0, 4536991.8, 0.0]
['Lotto 6/42', '08-38-01-40-34-32', 39826.0, 9185963.4, 0.0]
['Lotto 6/42', '31-02-06-15-14-04', 39830.0, 13915504.8, 3.0]
['Lotto 6/42', '02-31-27-10-37-05', 39833.0, 3731112.0, 0.0]
['Lotto 6/42', '20-21-22-03-15-02', 39837.0, 8096659.2, 1.0]
['Lotto 6/42', '29-32-41-38-05-08', 39840.0, 3814696.8, 0.0]
['Lotto 6/42', '30-41-39-17-07-38', 39844.0, 8385651.0, 1.0]

How about writing Excel spreadsheets?

For this, we will be using the Python XlsxWriter library. To open a workbook for writing, use the Workbook class.

#!/usr/bin/env python
import xlsxwriter


workbook = xlsxwriter.Workbook('output.xlsx')

Next step is to add a worksheet.

worksheet = workbook.add_worksheet('Amortization Schedule')

In writing rows, we need to specify the row number and column number where the data will be inserted. For example, if we want to put the header at row=0 and col=0

worksheet.write_row(row=0, col=0, data=['Number', 'Amount', 'Interest', 'Principal', 'Balance'])

To write multiple rows, increment row number, otherwise, the previous rows will be overwritten.

row = 0
for number, amount, interest, principal, balance in amortization_schedule(150000, 0.1, 36):
    row += 1
    worksheet.write_row(row=row, col=0, data=[number, amount, interest, principal, balance])

Do not forget to call close() to save the spreadsheet.

workbook.close()

Any shorter way?

When using pandas dataframe, simply use from_excel() and to_excel().

import pandas as pd

df = pd.read_excel('Lotto Results.xlsx', sheet_name='Lotto 642')  # read excel spreadsheet

df.to_excel('output.xlsx', sheet_name='Sheet')  # write to spreadsheet

Conclusion

With the existence of xlrd and XlsxWriter, working with excel spreadsheets becomes an easy task. The main advantage of using Python instead of Excel (program) when dealing with spreadsheets is that we can use libraries like pandas and matplotlib.

Read more articles like this in the future by buying me a coffee!

Buy me a coffeeBuy me a coffee