Python: Excel Read and Write


Python uses openpyxl module to read, write and modify Excel files. This module acts as an interface between our python script and the excel file.

What is Excel File, Workbook, Row, and column?

An Excel file called as Workbook is a collection of sheets called as Worksheets. Each sheet is further divided into rows and columns.

The intersection of a row and a column is called a Cell. Cells hold the data of an Excel sheet.

How Python read & write to excel file?

In order to work with an Excel sheet, we first need to load that excel workbook into our program and then fetch the required worksheet from the workbook object. Once the workbook will load successfully you can point to any sheet in the workbook and then read and write to cell based on selected rows and columns.

In the further section, you will get to know about openpyxl module methods and attributes for each type of case to fetch data or update data to an excel file.

Load Workbook

load_workbook method can be used to load a workbook.

import openpyxl as xl
work_book=xl.load_workbook(r'Employee.xlsx')

Workbook Sheet Names

Once the workbook load successfully, Then you can get the list of sheet names or index the sheets to fetch the worksheet.

  • Sheet Names : sheetnames attribute of workbook object is used to display all the sheet names available in the workbook.
print('Sheet names:',work_book.sheetnames)
  • Specific Sheet: By indexing the sheet name can use to fetch data from specific sheet
work_sheet=work_book['Emp']

Accessing the cells

A cell in a worksheet can be accessed in two ways. 

  • The cell method of worksheet object can be used, we need to pass the row and column numbers as arguments to this method. 
    Note: The row and column numbers will begin from 1.
cell1=work_sheet.cell(1,1)
  • The worksheet object can be indexed with the cell reference to retrieve the corresponding cell object. such as ‘A2’ can be used to select the cell from 2nd row 1st column.
cell2=work_sheet['A2']

Note: The above two methods will return cell objects. The value stored in the cell can be retrieved using the value attribute of the cell object.

cell1.value
cell2.value

Maximum Rows and Columns

To check the maximum number of rows and columns with data in a given sheet. we can use max_row and max_column attributes of the worksheet object.

work_sheet.max_row
work_sheet.max_column

Sheet Slicing

In order to extract a portion of sheet, we need to use slicing. ws[‘A1′:’C4’] is used to extract cells that belong to rows 1 to 4 and columns ‘A’ to ‘C’.

cells=ws['A1':'C4']
for row in cells:
    for cell in row:
        print(cell.value,end=' ')
    print()

This will return a tuple of tuples each representing a row. Every row tuple will consist of cells that represent the columns.

Special Indexing

To retrieve individual rows and individual columns of the worksheet, we are going to use the respective indices of the rows and columns.

Example: ws[1] will retrieve all the cells that belong to 1st row. ws[‘A’] will retrieve all the cells that belong to 1st column.

cells=ws[1]
print('Items of row 1 are:',end=' ')
for cell in cells:
    print(cell.value, end=' ')
cells=ws['B']
print('\nItems of column B are:')
for cell in cells:
    print(cell.value)

This will return a tuple of cells.

Accessing Particular Records

Let us see how to retrieve the record of employee with id 3

import openpyxl as xl
wb=xl.load_workbook(r'Employee.xlsx')
ws=wb['Emp']
cells=tuple()
for row_num in range(2,ws.max_row+1):
    if ws.cell(row_num,1).value==3: 
        cells=ws[row_num]
        break
if cells:
    for cell in cells:
        print(cell.value,end=' ')
else:
    print('Employee record not found')

Note: Iterating is starting from the second row, since the excel sheet has headers in its first row.

Accessing Particular Column

Columns can be accessed in two ways.

  1. If we know the position of the columns, then we can access it using the column reference. 2nd column in a worksheet can be accessed as WS[‘B’]
  2. If we don’t know the column, then we need to validate using the headers and retrieve all the cells.

The below example displays names of all the employees

import openpyxl as xl
wb=xl.load_workbook(r'Employee.xlsx')
ws=wb['Emp']
cells=tuple()
for col_num in range(1,ws.max_column+1):
    if ws.cell(1,col_num).value.lower().strip()=='name': 
        for row_num in range(2,ws.max_row+1):
            cells+=ws.cell(row_num,col_num),
        break
if cells:
    for cell in cells:
        print(cell.value)
else:
    print('No Employees present')

Deleting a specific record

We can delete a record using delete_rows method of worksheet object. It accepts two parameters, 

  1. index – The index at which, we need to delete the row
  2. num – number of rows to be deleted

The below code is used to delete the record of employee with id 3.

import openpyxl as xl
filepath=r'Employee.xlsx'
wb=xl.load_workbook(filepath)
ws=wb['Emp']
for row_num in range(1,ws.max_row+1):
    if ws.cell(row_num,1).value==3: 
        ws.delete_rows(row_num,1)
        wb.save(filepath)
        print('Employee deleted')
        break
else:
    print('Employee not found')

Note: Any changes made will be persisted if and only if the save method of workbook object is invoked. Else, the changes will be lost

Deleting a specific column

We can delete a column using delete_cols method of worksheet object. It accepts two parameters, 

  1. index – The index at which, we need to delete the row
  2. num – number of rows to be deleted

The below code is used to delete salary column.

import openpyxl as xl
filepath=r'Employee.xlsx'
wb=xl.load_workbook(filepath)
ws=wb['Emp']
for col_num in range(1,ws.max_column+1):
    if ws.cell(1,col_num).value.lower().strip()=='salary': 
        ws.delete_cols(col_num,1)
        wb.save(filepath)
        print('Column deleted')
        break
else:
    print('Column not found')

Adding a new record

 To add a new record we need to use append method of worksheet object. append method accepts a list of items which represent a row.

The below code is used to add new employees to the existing sheet.

import openpyxl as xl
filepath=r'Employee.xlsx'
wb=xl.load_workbook(filepath)
ws=wb['Emp']
ws.append([4,'Bharti',10000])
wb.save(filepath)

Creating a new workbook

In order to create a workbook we need to create an object of Workbook class. The following example will help you to create a workbook,create a sheet, remove a sheet. Insert data into it and save it.

import openpyxl as xl
wb=xl.Workbook()
print(wb.sheetnames)
wb.create_sheet(title='Emp', index=0)
print(wb.sheetnames)
wb.remove(wb['Sheet'])
print(wb.sheetnames)
ws=wb['Emp']
 
ws['A1']="Name"
ws['B1']='Salary'
ws.append(['Saurabh',23400.0])
ws.append(['Gaurav',21529.12])
wb.save(r'Employee1.xlsx')

append method is used to insert a new row into the worksheet.
save method is used to save the current workbook.
create_sheet is used to create a new sheet.
remove_sheet is used to remove an existing sheet.

Summary

In this blog we have covered following sections to read and write to excel through Python by using openpyxl module:

  • Load data from workbook
  • Loading complete column and row data
  • Sheet slicing
  • Deleting/Displaying a particular row and column data
  • Create a new work book
  • Add and remove a work sheet from workbook
  • Insert data to worksheet

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s