Python openpyxl

This is a tutorial on the Python library, openpyxl.

The Python library openpyxl deals with the powerful spreadsheet application, Microsoft Excel. The openpyxl module allows your Python programs to read and modify Excel spreadsheet files.

Python openpyxl really shines when it comes to automating tasks for Microsoft Excel. For instance, you might have to job of going through thousands of spreadsheets to extract a certain piece of information. Instead of spending weeks slaving away doing such a tedious job. Why not write a simple python program in space of a few hours to do the job?


Excel Terminology

We’ll quickly go through some Excel terminology here for those who aren’t very familiar.

An Excel spreadsheet document (also called workbook) is saved with the .xlsx extension. Each Document can contain several sheets. These sheets are comprised of rows and columns. The row’s numbering starts from 1. Whereas the column numbering starts from A. A box at a particular row and column is called a cell. The sheet that the user is current viewing/using is known as the Active sheet.

Installation

openpyxl is not a standard Python Library hence it has to be downloaded and installed separately. Install it using the command prompt as shown in the Getting started guide. To check if it downloaded correctly, try importing it into your python program.

import openpyxl

Opening Excel Documents

To open Excel documents, you need to use the openpyxl.load_workbook() Function. It takes as an input parameter the name of the Excel document.

excel = openpyxl.load_workbook("example.xlsx")

Once you have connected to the document successfully, you need to open one of the many sheets in the documents. We will be using the workbook object that is currently contained within the excel variable. Using the below code we can print out a list of all the sheets in the workbook.

excel = openpyxl.load_workbook("example.xlsx")
print(excel.sheetnames)

Output:

['Sheet1', 'Sheet2', 'Sheet3']

Now that we have the sheet names on hand, we can connect directly to them. Below is the full code for this process.

import openpyxl

excel = openpyxl.load_workbook("example.xlsx")
sheet = excel["Sheet1"]

Reading Excel Documents

Now that we’ve connected ourselves to a sheet, we can begin transferring data between our program and the sheet.

Python Openpyxl Example(s)

We can access individual cells, and change their values accordingly. Remember to use the save() function to save the Excel file after modifying it. You can either create a new file by using a new name, or save to the same file by using that file’s name.

import openpyxl

excel = openpyxl.load_workbook("example.xlsx")
sheet = excel["Sheet1"]

#Creating cell objects
a = sheet["A1"]
b = sheet["D2"]
c = sheet["F7"]

#Printing cell values
print(a.value)
print(b.value)
print(c.value)

#Directly obtaining values
print(sheet["A8"].value)

You can also directly access cells using it’s column and row number as well.

import openpyxl

excel = openpyxl.load_workbook("example.xlsx")
sheet = excel["Sheet1"]

#Creating cell values
a = sheet.cell(row=1, column=2)
b = sheet.cell(row=3, column=6)

#Printing cell values
print(a.value)
print(b.value)

#Directly obtaining values
print(sheet.cell(row=4, column=5).value)

The second method of accessing cells can be used to access many cells at once. We can do so using a loop approach. We’ll be using for loops for this example.

import openpyxl

excel = openpyxl.load_workbook("example.xlsx")
sheet = excel["Sheet1"]

#Gives max column number which contains data
col = sheet.max_column

#Gives max column number which contains data
row = sheet.max_row

for i in range(col):
    for j in range(row):
        print(sheet.cell(row = j+1, column = i+1).value)

Since the range function returns values starting from 0, and Excel columns and rows start from 1, we added 1 into values of j and i to counter this.


Inserting Data into Excel Documents

Inserting data is more or less the same as reading data. You simply have to reverse the process.

Let’s start by changing individual values in the Excel spreadsheet.

import openpyxl

excel = openpyxl.load_workbook("example.xlsx")
sheet = excel["Sheet1"]

#Creating cell objects
a = sheet["A3"]
b = sheet.cell(row = 2, column = 4)

#Changing Values
a.value = 5
b.value = 10

#Directly changing values
sheet["A5"].value = 5
sheet.cell(row = 1, column = 7).value = 10

The example below is designed to replace every cell in the Spreadsheet with a random integer value ranging from 1 to 10. While not necessary for this this guide, you can check up on the Random module here.

import openpyxl
import random

excel = openpyxl.load_workbook("example.xlsx")
sheet = excel["Sheet1"]

col = sheet.max_column
row = sheet.max_row

for i in range(col):
    for j in range(row):
        print(random.randint(1,10))
        sheet.cell(row = j+1, column = i+1).value = random.randint(1,10)

excel.save("example.xlsx")


This marks the end of Python openpyxl Guide. Here’s a link back to the main Python Libraries Article. Questions regarding content can be asked in the comments section below.

Subscribe
Notify of
guest
0 Comments
Inline Feedbacks
View all comments