The Python library openpyxl deals with the powerful spreadsheet application, Microsoft Excel. The openpyxl module allows your Python programs to read and modify Excel spreadsheet files.

Python openpyxl really shines when it comes to automating tasks for Microsoft Excel. For instance, you might have to job of going through thousands of spreadsheets to extract a certain piece of information. Instead of spending weeks slaving away doing such a tedious job. Why not write a simple python program in space of a few hours to do the job?

### Excel Terminology

We’ll quickly go through some Excel terminology here for those who aren’t very familiar.

An Excel spreadsheet document (also called workbook) is saved with the .`xlsx`

extension. Each Document can contain several **sheets**. These sheets are comprised of rows and columns. The row’s numbering starts from **1**. Whereas the column numbering starts from **A**. A box at a particular row and column is called a **cell**. The sheet that the user is current viewing/using is known as the **Active sheet**.

### Installation

**openpyxl** is not a standard Python Library hence it has to be downloaded and installed separately. Install it using the command prompt as shown in the Getting started guide. To check if it downloaded correctly, try importing it into your python program.

`import openpyxl`

### Opening Excel Documents

To open Excel documents, you need to use the `openpyxl.load_workbook()`

Function. It takes as an input parameter the name of the Excel document.

`excel = openpyxl.load_workbook("example.xlsx")`

Once you have connected to the document successfully, you need to open one of the many sheets in the documents. We will be using the workbook object that is currently contained within the `excel`

variable. Using the below code we can print out a list of all the sheets in the workbook.

excel = openpyxl.load_workbook("example.xlsx") print(excel.sheetnames)

Output:

`['Sheet1', 'Sheet2', 'Sheet3']`

Now that we have the sheet names on hand, we can connect directly to them. Below is the full code for this process.

import openpyxl excel = openpyxl.load_workbook("example.xlsx") sheet = excel["Sheet1"]

## Reading Excel Documents

Now that we’ve connected ourselves to a sheet, we can begin transferring data between our program and the sheet.

We can access individual cells, and change their values accordingly. Remember to use the `save()`

function to save the Excel file after modifying it. You can either create a new file by using a new name, or save to the same file by using that file’s name.

import openpyxl excel = openpyxl.load_workbook("example.xlsx") sheet = excel["Sheet1"] #Creating cell objects a = sheet["A1"] b = sheet["D2"] c = sheet["F7"] #Printing cell values print(a.value) print(b.value) print(c.value) #Directly obtaining values print(sheet["A8"].value)

You can also directly access cells using it’s column and row number as well.

import openpyxl excel = openpyxl.load_workbook("example.xlsx") sheet = excel["Sheet1"] #Creating cell values a = sheet.cell(row=1, column=2) b = sheet.cell(row=3, column=6) #Printing cell values print(a.value) print(b.value) #Directly obtaining values print(sheet.cell(row=4, column=5).value)

The second method of accessing cells can be used to access many cells at once. We can do so using a loop approach. We’ll be using for loops for this example.

import openpyxl excel = openpyxl.load_workbook("example.xlsx") sheet = excel["Sheet1"] #Gives max column number which contains data col = sheet.max_column #Gives max column number which contains data row = sheet.max_row for i in range(col): for j in range(row): print(sheet.cell(row = j+1, column = i+1).value)

Since the range function returns values starting from 0, and Excel columns and rows start from 1, we added 1 into values of j and i to counter this.

### Inserting Data into Excel Documents

Inserting data is more or less the same as reading data. You simply have to reverse the process.

Let’s start by changing individual values in the Excel spreadsheet.

import openpyxl excel = openpyxl.load_workbook("example.xlsx") sheet = excel["Sheet1"] #Creating cell objects a = sheet["A3"] b = sheet.cell(row = 2, column = 4) #Changing Values a.value = 5 b.value = 10 #Directly changing values sheet["A5"].value = 5 sheet.cell(row = 1, column = 7).value = 10

The example below is designed to replace every cell in the Spreadsheet with a random integer value ranging from 1 to 10. While not necessary for this this guide, you can check up on the Random module here.

import openpyxl import random excel = openpyxl.load_workbook("example.xlsx") sheet = excel["Sheet1"] col = sheet.max_column row = sheet.max_row for i in range(col): for j in range(row): print(random.randint(1,10)) sheet.cell(row = j+1, column = i+1).value = random.randint(1,10) excel.save("example.xlsx")

This marks the end of Python openpyxl Guide. Here’s a link back to the main Python Libraries Article.