How to create Zip File Backups in Python?

Note: this article only explains how to create backups using Zip files in Python. For an in-depth explanation of what Zip files are, and of the zipfile library, read our Python Zip File Article.

Code

Here is the code for the backup system. See the line-by-line breakdown below for an explanation.

from zipfile import ZipFile, ZIP_DEFLATED
import os

extension = input('Input file extension: ')

Zippy = ZipFile("Backup.zip", 'w')

for folder, subfolders, files in os.walk("C:\\Users"):
    for subfolder in subfolders:
        path = folder + "\\" + subfolder  # unused; see Side Notes
    for x in files:
        if x.endswith(extension):  # keep only files with the requested extension
            filepath = folder + "\\" + x
            print(filepath)
            Zippy.write(filepath, compress_type=ZIP_DEFLATED)
Zippy.close()

Line 1: Imports only what the script needs from the zipfile library, chiefly the ZipFile class. When you use just one or two names from a library, importing them directly avoids clutter.

Line 2: Imports the os module. We’ll be using this module to deal with the files and directories in the operating system.

Line 4: Takes input from the user with the input function. The user is expected to enter an appropriate file extension like .txt or .pdf.

Line 6: Opens a ZipFile named “Backup.zip” in write mode ('w'), which replaces any existing file of that name. Since no file path is specified, it is saved in the location the script is executed from, also known as the current working directory.

Line 8: This line calls the os.walk function and loops over its results. On each iteration, os.walk yields three values: the path of the current folder, a list of the names of its subfolders, and a list of the names of its files. It works through every folder beneath the specified directory, visiting each one once. Inside the inner loops, though, the same folder value is paired with each of its subfolders and files in turn, so you’ll often see it repeated.
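To see what os.walk hands back on each iteration, here is a minimal sketch that builds a small temporary directory tree and walks it. The names docs, readme.txt and notes.txt are made up for illustration and are not part of the backup script.

```python
import os
import tempfile

# Build a small throwaway directory tree so the example runs anywhere.
# "docs", "readme.txt" and "notes.txt" are illustrative names only.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "docs"))
for relative in ("readme.txt", os.path.join("docs", "notes.txt")):
    with open(os.path.join(root, relative), "w") as handle:
        handle.write("sample")

# Each iteration describes one directory:
# its path, the names of its subfolders, and the names of its files.
visited = []
for folder, subfolders, files in os.walk(root):
    visited.append((os.path.relpath(folder, root), sorted(subfolders), sorted(files)))
    print(folder, subfolders, files)
```

The loop body runs once per directory, which is why the backup script needs a second, inner loop to reach the individual file names.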

Line 9: Remember, the os.walk function returns a list of subfolders. Hence we need to iterate over each of them individually before doing anything.

Line 10: The purpose of this line is to build the path of each subfolder from the parent folder and the subfolder name. The result is a directory path, so it never includes the name of a file. For instance, for the file C:\Users\CodersLegacy\testfile.txt, the directory in question is C:\Users\CodersLegacy. The script never uses this value afterwards.

Line 11: Just like we did with subfolders, here we will iterate over each file individually.

Line 12: Uses the endswith string method, which checks whether the string ends with the characters passed to it. Here, we use it to check the file extension. The comparison is case-sensitive, so .txt will not match a file name ending in .TXT.
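As a possible refinement of the extension check, the sketch below ignores letter case and accepts several extensions at once. The helper name has_extension is hypothetical and not part of the script above; it relies on the fact that endswith also accepts a tuple of suffixes.

```python
def has_extension(filename, extensions):
    """Return True if filename ends with any of the given extensions, ignoring case."""
    # str.endswith accepts a tuple of suffixes, so several extensions
    # can be checked in a single call.
    normalized = tuple(ext.lower() for ext in extensions)
    return filename.lower().endswith(normalized)


print(has_extension("report.PDF", [".pdf", ".txt"]))  # True
print(has_extension("photo.png", [".pdf", ".txt"]))   # False
```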

Line 13: If there was a match, we create the full file path by adding the file name to it as well.
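Joining the folder and the file name by hand works, but os.path.join is a common alternative that picks the right separator for you. The following is a small sketch with placeholder values ("Users", "CodersLegacy" and "testfile.txt" are only examples).

```python
import os

# "Users", "CodersLegacy" and "testfile.txt" are placeholder values for illustration.
folder = os.path.join("Users", "CodersLegacy")
filename = "testfile.txt"

# os.path.join inserts the separator that suits the operating system running the script.
filepath = os.path.join(folder, filename)
print(filepath)

# os.path.split reverses the step, separating the directory from the file name.
directory, name = os.path.split(filepath)
print(directory, name)
```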

Line 14: Prints the full file path. This is only for debugging and observation purposes and is not required.

Line 15: Writes the file at the given file path to the ZipFile which we created earlier. The zipfile library offers several compression methods (ZIP_STORED, ZIP_DEFLATED, ZIP_BZIP2 and ZIP_LZMA); here we use ZIP_DEFLATED.

Line 16: Closes the ZipFile we opened. Closing writes the archive’s final records to disk and releases the file, so the backup is not complete until this line runs.
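If you would rather not call close yourself, a ZipFile can also be opened in a with block, which closes the archive automatically. This is a minimal sketch rather than a replacement for the script above; it works in a temporary directory, and sample.txt is an illustrative file name.

```python
import os
import tempfile
from zipfile import ZipFile, ZIP_DEFLATED

# Work in a throwaway directory; "sample.txt" is an illustrative file name.
workdir = tempfile.mkdtemp()
source = os.path.join(workdir, "sample.txt")
with open(source, "w") as handle:
    handle.write("hello " * 100)

archive_path = os.path.join(workdir, "Backup.zip")

# The with block closes the archive automatically,
# even if an exception is raised inside it.
with ZipFile(archive_path, "w") as archive:
    archive.write(source, arcname="sample.txt", compress_type=ZIP_DEFLATED)

# Reopen the finished archive to confirm what was stored.
with ZipFile(archive_path) as archive:
    stored_names = archive.namelist()
print(stored_names)
```

The arcname argument sets the name the file gets inside the archive, which keeps the temporary directory path out of the Zip file.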

Side Notes:

The filepath variable contains the directory name concatenated with the file name, which gives the complete file path.
Double backslashes (“\\”) are needed in an ordinary string because “\” is a special character. The first “\” “escapes” the second “\”.
The path variable remains unused. In fact, it was not necessary to include the for loop for subfolders.
compress_type is an optional parameter; it does not have to be included. If it is left out, the file is stored without compression unless a compression method was given when the ZipFile was opened. compresslevel is another optional parameter that takes values from 0 – 9 for ZIP_DEFLATED and 1 – 9 for ZIP_BZIP2.
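To get a feel for what the compression parameters do, the sketch below zips the same file twice, once without compression and once with ZIP_DEFLATED at the highest level, and compares the archive sizes. The helper archive_size and the file names are hypothetical, and the exact sizes will depend on your data.

```python
import os
import tempfile
from zipfile import ZipFile, ZIP_DEFLATED, ZIP_STORED

# Create a highly repetitive sample file in a throwaway directory.
workdir = tempfile.mkdtemp()
source = os.path.join(workdir, "repetitive.txt")
with open(source, "w") as handle:
    handle.write("backup " * 10000)


def archive_size(compress_type, compresslevel=None):
    """Zip the sample file with the given settings and return the archive size in bytes."""
    archive_path = os.path.join(workdir, "test.zip")
    with ZipFile(archive_path, "w") as archive:
        archive.write(source, arcname="repetitive.txt",
                      compress_type=compress_type, compresslevel=compresslevel)
    return os.path.getsize(archive_path)


stored_size = archive_size(ZIP_STORED)                       # no compression
deflated_size = archive_size(ZIP_DEFLATED, compresslevel=9)  # strongest deflate setting
print(stored_size, deflated_size)
```

Repetitive text like this compresses very well, so the second number should be far smaller than the first. Files that are already compressed, such as most images and videos, will show little difference.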

This marks the end of the Python Zip Files Backup Guide. In the event you are able to devise a better backup system, let us know and we’ll see if we can put it up here. Any questions can be asked in the comments section below.