Menu
Python

Lab 15: Excel File Management using Python

Python

Lab 15: Excel File Management using Python

Opening existing file, adding data, saving Data in a new Workbook

To perform operation on an excel file we need to download and install a library called openpyxl

 pip3 install openpyxl
Collecting openpyxl
  Using cached openpyxl-3.0.4-py2.py3-none-any.whl (241 kB)
Requirement already satisfied: jdcal in c:\users\lenovo\appdata\local\programs\python\python38\lib\site-packages (from openpyxl) (1.4.1)
Requirement already satisfied: et-xmlfile in c:\users\lenovo\appdata\local\programs\python\python38\lib\site-packages (from openpyxl) (1.0.1)
Installing collected packages: openpyxl
Successfully installed openpyxl-3.0.4

To use open an existing excel file and perform excel operations in the file, we need to use load_workbook() function and Workbook() class in our script:

  • from openpyxl import load_workbook(), Workbook This command import only the load_workbook function and Workbook() class from the openpyxl library
from openpyxl import load_workbook,Workbook 

To open the excel file and perform excel operations on the file using python, we need to create an object of the file using load_workbook() function and make the same object an instance of the Workbook() class

  • load_workbook() function creates an object to read the excel file
  • Workbook() creates an instance of the Workbook class of the same object to perform excel operations.
mywb = load_workbook('students1.xlsx') 
mywb = Workbook()

To work with sheetnames in the workbook, the following commands are used

  • sheetnames Attribute returns the existing sheet names in the workbook
  • create_sheet() Function creates a new sheet in the excel workbook at a specified position
mywb.sheetnames
mywb.create_sheet('accounts',2)

To select a sheet in the workbook and edit the values in cells of the sheet, the following commands are used:

  • sheet_object = workbook_object['sheetname'] Creates an object for the sheetname specified using the workbook_object
  • cell_object = sheet_object['cell_block'] Creates an object for the cell which needs to be edited specified using the sheet_object
  • cell_object.value Attribute returns the data stored in the cell
  • cell_object.value = "data" Appends “data” into the cell_object
mysh = mywb['accounts']
mycell = mysh['B6']
mycell.value
mycell.value = "RST Forum"

To select multiple cells and append data in all cells, the following commands are used

  • block_object = sheet_object[block_range] Creates an object for multiple block cells of a sheet specified in the block_range
  • for variable_row in block_range: Iterates every row using for loop
  • for variable_cell in variable_row: Iterates every cell in iterated row
  • variable_cell.value = "data" Appends data to the selected cell object
  • workbook_object.save(filename) Saves the edited data in the workbook_object in the specified excel file
myblock = mysh['F1:H10']
for eachrow in myblock:
	for eachcell in eachrow:
		eachcell.value = 'PyExcel'
mywb.save('newstudents.xlsx')

Read saved data from Excel file using Python

To read saved data from an excel workbook, we need to load the workbook using an object in python:

  • path_object = path of the excel workbook Give the location of the file
  • workbook_object = load_workbook(path_object) To use the workbook in python, workbook_object is created
path = "/home/rst/students1.xlsx"
students1 = openpyxl.load_workbook(path) 

To select the active sheet in the workbook, an object is create with the help of the active attribute of the load_workbook function

active_sheet_object = workbook_object.active Attribute loads active sheet in the memory and can be accessed using the active_sheet_object

active_sheet = students1.active

Another way of selecting a cell is to pass the row and column number that can provide the location information of the cell. The first row and column number start from 1 and not 0. Cell object can be created using the following function:

  • cell_object= active_sheet_object.cell(row = integer, column = integer) Function creates a cell object by using it’s row and column number.
cell = active_sheet.cell(row = 1, column = 1)

To print the value of the cell

  • cell_object.value Attribute returns the value of the cell
print("Value of the cell:", cell.value)
('Value of the cell:', u'Roll ')

To print the total number of rows

  • active_sheet_object.max_row Attribute returns total rows in the active sheet
max_row = active_sheet.max_row
print("Total number of rows", active_sheet.max_row)
('Total number of rows', 4)

To print the total number of columns

  • active_sheet_object.max_column Attribute returns total columns in the active sheet
max_col = active_sheet.max_column
print("Total number of columns", active_sheet.max_column)
('Total number of columns', 5)

To print all column names

  • for variable in range(1, max_col_object + 1): Iterates every column from 1st column till last column
  • cell_columns_object = active_sheet_object.cell(row = integer, column = variable)Function selects every cell in the iterated column and can be accessed using the cell_columns_object variable
  • cell_columns_object.value Attribute returns the value of the iterated cell_columns_object
for i in range(1, max_col + 1):
 cells_column = active_sheet.cell(row = 1, column = i)
 print("Names of the columns", cells_column.value)
('Names of the columns', u'Roll ')
('Names of the columns', u'Name')
('Names of the columns', u'Subject')
('Names of the columns', u'Batch Date')
('Names of the columns', u'Attendance')

To print all values of first column

  • for variable in range(1, max_row_object + 1): Iterates every row from 1st row till last row
  • cell_row_object = active_sheet_object.cell(row = variable, column = integer )Function selects every cell in the iterated row and can be accessed using the cell_row_object variable
  • cell_row_object.value Attribute returns the value of the iterated cell_row_object
for i in range(1, max_row + 1):
 cells_row = active_sheet.cell(row = i, column = 1)
 print("Values of the 1st column:", cells_row.value)
('Values of the 1st column:', u'Roll ')
('Values of the 1st column:', 10)
('Values of the 1st column:', 20)
('Values of the 1st column:', 30)

To print a particular row value

for variable in range(1, max_col_object + 1): Iterates every column from 1st column till last column
cell_object = active_sheet_object.cell(row = integer, column = variable) Function selects every cell in the iterated column and can be accessed using the cell_object variable
cell_object.value Attribute returns the value of the iterated cell_columns_object

for i in range(1, max_col):
 cell_obj = active_sheet.cell(row = 2, column = i)
 print("Value of the row 2, column i:",cell_obj.value)
('Value of the row 2, column i:', 10)
('Value of the row 2, column i:', u'John Doe')
('Value of the row 2, column i:', u'Python')
('Value of the row 2, column i:', u'1st July')
('Value of the row 2, column i:', u'Present')

Create and save data in a new Workbook

To create an excel workbook using python, we need to create an object using of the Workbook() class of the openpyxl library:

  • from openpyxl import Workbook Imports Workbook() class from openpyxl library
from openpyxl import Workbook

To create a new object of the Workbook() class, following command is used

new_object = Classname() Creates an instance of the class.

 new_mywb = Workbook()

To select and change the title of the active sheet, the following command is used

  • active_sheet_object = new_object.active Attribute selects the active sheet and can be accessed using the active_sheet_object variable
  • title_object = active_sheet_object.title Attribute selects the title of the active sheet and can be accesses using the title_object variable
  • print("active sheet title:" + title_object) Function prints the title of the active sheet
  • new_title = "Python" New title of string data-type
  • active_sheet_object.title = new_title Assigns the title attribute the value of new_title variable
  • print("sheet name is renamed as: " + title_object) Function prints the updated title of the active sheet
my_sheet = new_wb.active 
title = my_sheet.title 
print("active sheet title: " + title) 
title = "python"
my_sheet.title = title  
print("sheet name is renamed as: " + title)
active sheet title: Sheet
sheet name is renamed as: python

To insert values in cells of the active sheet, the following commands are used

  • Cell objects also have row, column and coordinate attributes that provide location information for the cell.
  • Note: The first row or column integer is 1, not 0. Cell object is created by using sheet object’s cell() method.
  • cell_object = active_sheet_object.cell(row = integer, column =integer)
  • cell_object "RST" Writes values to cells.
  • cell_object = active_sheet_object['A2'] Once have a Worksheet object, one can access a cell object by its name also. A2 means column = 1 & row = 2.
  • cell_object = active_sheet_object ['B2'] B2 means column = 2 & row = 2.
c1 = my_sheet.cell(row = 1, column = 1)
c1.value = "RST"
  
c2 = my_sheet.cell(row= 1 , column = 2) 
c2.value = "Forum"
  
c3 = my_sheet['A2'] 
c3.value = "Cisco"
  
c4 = my_sheet['B2'] 
c4.value = "Red Hat"

To save the new workbook object data in file, the following commands are used new_workbook_object.save('filename') Saves and writes the object to the specified file name.

newwb.save("prac.xlsx")

Python Script to change cell dimensions in Excel Workbook

To change the cell dimensions in an excel sheet, the following commands are used:

  • active_sheet_object.row_dimensions[Row Number].height = value Changes the height of the specified row with the value
  • active_sheet_object.column_dimensions[Column Name].width = value Changes the width of the column with the value
sheet.row_dimensions[1].height = 70
sheet.column_dimensions['B'].width = 20

Python Script to merge cells in Excel Workbook

To merge and unmerge the cells in an excel sheet, the following commands are used

  • merge cell from A2 to D4 i.e. A2, B2, C2, D2, A3, B3, C3, D3, A4, B4, C4 and D4 .
  • merge cell from C6 to D6
  • active_sheet_object.merge_cells('Range of cells to merge') A2:D4′ merges 12 cells into a single cell.
  • active_sheet_object.cell(row = integer, column = integer).value = 'data'Write data into merged cells
  • workbook_object.save('filename') Save and write the edited object to an excel file.
sheet.merge_cells('A2:D4')
sheet.cell(row = 2, column = 1).value = 'Twelve cells join together.'
sheet.merge_cells('C6:D6')
sheet.cell(row = 6, column = 3).value = 'Two merge cells.'
 wb.save('mergeXL.xlsx')


Python Script to unmerge cells in Excel Workbook

To unmerge the cells in an excel sheet, the following commands are used

  • unmerge cells from A2 to D4 i.e. A2, B2, C2, D2, A3, B3, C3, D3, A4, B4, C4 and D4 .
  • unmerge cells from C6 to D6
  • active_sheet_object.unmerge_cells('Range of cells to unmerge') A2:D4′ merges 12 cells into a single cell.
  • active_sheet_object.cell(row = integer, column = integer).value = 'data'Write data into merged cells
  • workbook_object.save('filename') Save and write the edited object to an excel file.
sheet.unmerge_cells('A2:D4')
sheet.unmerge_cells('C6:D6')
wb.save('unmergeXL.xlsx')