PyAutoGUI Screenshot and Image Recognition

PyAutoGUI is mostly known for its mouse and keyboard automation. However, it also has a range of other abilities, including taking a full or partial screenshot of the screen and some simple but powerful image recognition features.


Taking Screenshots in PyAutoGUI

Let’s first take a look at how we can take Screenshots in PyAutoGUI. There are two ways we can take a screenshot, depending on our requirements.

If we need an image representation of our screen for use in our program, then we can simply call the screenshot() function in the following manner.

img = pyautogui.screenshot()

This returns a Pillow Image object (Pillow, the maintained fork of PIL, is a Python imaging library) that we can use further in our program. We will later explore some image recognition functions, where this might come in handy.

If you want to create and save an image file using the screenshot, then pass in a string containing the filename as shown below.

img = pyautogui.screenshot("screenshot.png")

The above code saves the screenshot as screenshot.png in the current working directory, and still returns the Image object.
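If you take screenshots repeatedly, a fixed filename gets overwritten each time. Below is a minimal sketch of one way around that, using a timestamp in the filename. The helper names timestamped_name and save_screenshot are our own (they are not part of PyAutoGUI), and pyautogui is imported inside the function so that the naming helper can be used on a machine without a display.

```python
from datetime import datetime


def timestamped_name(prefix="screenshot", when=None):
    """Build a filename such as 'screenshot_20240102_030405.png'."""
    when = when or datetime.now()
    return f"{prefix}_{when:%Y%m%d_%H%M%S}.png"


def save_screenshot(prefix="screenshot"):
    """Capture the full screen, save it under a timestamped name, return the name."""
    import pyautogui  # imported here so the helper above works without a display

    filename = timestamped_name(prefix)
    pyautogui.screenshot(filename)  # saves the file and also returns the Image
    return filename
```

Calling save_screenshot("desktop") from a desktop session should then write a new file on every call, as long as the calls are at least a second apart.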


How to Screenshot a portion of the Screen

Taking a full-sized screenshot is not instantaneous, and takes a little time (about 100 milliseconds on a 1920 x 1080 screen). If you only need a portion of the screen, you can save memory and processing time by telling PyAutoGUI the location and dimensions of the portion that you want.

The example below illustrates how we can do this.

img = pyautogui.screenshot(region=(0, 0, 400, 300))

The above code returns an image object that contains a 400×300 section of your screen, starting from the top-left corner. The first two values in the region tuple specify the origin point (left, top) of the partial screenshot. The third and fourth values specify its width and height.
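The region does not have to start at the top-left corner. As a rough sketch, here is how we might compute a region centered on the screen. The helper names centered_region and screenshot_center are hypothetical (our own, for illustration), while pyautogui.size() is the PyAutoGUI call that returns the screen width and height.

```python
def centered_region(screen_width, screen_height, width, height):
    """Return (left, top, width, height) for a box centered on the screen."""
    # Never ask for more pixels than the screen has
    width = min(width, screen_width)
    height = min(height, screen_height)
    left = (screen_width - width) // 2
    top = (screen_height - height) // 2
    return (left, top, width, height)


def screenshot_center(width=400, height=300):
    """Capture a width x height area from the middle of the primary screen."""
    import pyautogui  # imported lazily; needs a desktop session

    screen_width, screen_height = pyautogui.size()
    region = centered_region(screen_width, screen_height, width, height)
    return pyautogui.screenshot(region=region)
```

On a 1920 x 1080 screen, centered_region(1920, 1080, 400, 300) works out to (760, 390, 400, 300).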


Image Location and Recognition

PyAutoGUI offers five functions for image recognition.

  1. locateOnScreen(image) -> Returns the (left, top, width, height) box of the first instance of the image found on the screen.
  2. locateCenterOnScreen(image) -> Returns the x and y coordinates of the center of the first instance of the image found on the screen.
  3. locateAllOnScreen(image) -> Returns a generator that yields a (left, top, width, height) tuple for each instance of the image found on the screen.
  4. locate(needleImage, haystackImage, grayscale=False) -> Returns the (left, top, width, height) box of the first instance of needleImage found in haystackImage.
  5. locateAll(needleImage, haystackImage, grayscale=False) -> Returns a generator that yields a (left, top, width, height) tuple for each location where needleImage is found in haystackImage.
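To see how these functions relate to each other, here is a small sketch that searches a screenshot with locateAll() and converts each box to its center point, which is the kind of value locateCenterOnScreen() gives for the first match. The helper names box_center and find_all_centers are our own, not PyAutoGUI functions. Passing grayscale=True can speed up matching, at the cost of possible false positives.

```python
def box_center(box):
    """Return the (x, y) center of a (left, top, width, height) box."""
    left, top, width, height = box
    return (left + width // 2, top + height // 2)


def find_all_centers(needle_path):
    """Return the center of every match of needle_path in a fresh screenshot."""
    import pyautogui  # imported lazily; needs a desktop session

    haystack = pyautogui.screenshot()
    # locateAll yields boxes lazily, so collect them in a list first.
    # Depending on the installed version, "no match" may raise an
    # exception here instead of producing an empty list.
    boxes = list(pyautogui.locateAll(needle_path, haystack, grayscale=True))
    return [box_center(box) for box in boxes]
```

For example, a box of (10, 20, 100, 50) has its center at (60, 45).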

Let’s take a look at a simple example. Here we have a Chrome icon that we will attempt to locate on our screen.

Finding a Chrome Icon using PyAutoGUI

The code below will attempt to locate this image and move the mouse over it.

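import pyautogui

# Search the screen for the reference image
box = pyautogui.locateOnScreen("chrome_icon.png")
print(box)    # Box contains left, top, width and height values

# Move the mouse to the top-left corner of the match over one second
pyautogui.moveTo(box.left, box.top, duration=1)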

Open the image somewhere on your screen, and make sure it isn’t being covered by any other window. It must be visible on screen in order for PyAutoGUI to find it. If the image is not found, the locate functions return None in older versions of PyAutoGUI. The PyAutoGUI documentation notes that from version 0.9.41 onward they raise an ImageNotFoundException instead.
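Some PyAutoGUI versions signal a missing image by raising ImageNotFoundException rather than returning None, so code that should run on both needs to handle each case. Below is one possible wrapper, offered as a sketch. The name locate_or_none is hypothetical, and the exception is matched by its class name so the wrapper does not need to import pyautogui itself.

```python
def locate_or_none(locator, image, **kwargs):
    """Call a locate function and return None when the image is not found."""
    try:
        return locator(image, **kwargs)
    except Exception as exc:
        # Matched by name so this works without importing pyautogui here
        if type(exc).__name__ == "ImageNotFoundException":
            return None
        raise  # anything else is a real error, so let it propagate


# Example use from a desktop session:
#   import pyautogui
#   box = locate_or_none(pyautogui.locateOnScreen, "chrome_icon.png")
#   if box is not None:
#       pyautogui.moveTo(box.left, box.top, duration=1)
```

With this wrapper, a simple "is not None" check is enough regardless of how the installed version reports a missing image.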

The other functions operate very similarly. A personal favorite of mine is the locateCenterOnScreen() function, as it returns the coordinates of the center of the image that it finds.

import pyautogui

x, y = pyautogui.locateCenterOnScreen("chrome_icon.png")
pyautogui.moveTo(x, y, duration=1)
print(x, y)
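In practice the image is often not on screen yet when the script starts, for example while a window is still opening. Here is a rough sketch of a polling helper that keeps trying until a match appears or a timeout passes. The name wait_for and its timeout and interval parameters are our own assumptions, not part of PyAutoGUI.

```python
import time


def wait_for(locator, image, timeout=10.0, interval=0.5):
    """Poll locator(image) until it returns a match or timeout seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = locator(image)
        except Exception as exc:
            # Treat "image not found" as "not yet"; re-raise anything else
            if type(exc).__name__ != "ImageNotFoundException":
                raise
            result = None
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)


# Example use from a desktop session:
#   import pyautogui
#   point = wait_for(pyautogui.locateCenterOnScreen, "chrome_icon.png")
#   if point is not None:
#       pyautogui.moveTo(point[0], point[1], duration=1)
```

The helper always makes at least one attempt, so a timeout of 0 behaves like a single guarded call.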

For a more practical implementation of these Image Recognition functions, check out our “Automating a Calculator with PyAutoGUI” tutorial.


PyAutoGUI with OpenCV

If your code still isn’t working, refer to our article on how PyAutoGUI can be used with OpenCV to detect partial matches. For example, you could look for a 90% match, or a 70% match. Using this option helps avoid scenarios where the image is not found due to minor pixel differences.

This is commonly observed when using JPG files, as they have been compressed and slightly modified in the process (on the pixel level).
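As a quick illustration of the idea, the locate functions accept a confidence keyword argument when OpenCV (the opencv-python package) is installed. The sketch below tries a strict match first and then loosens it step by step. The helper name locate_with_fallback and the chosen confidence levels are our own assumptions, picked only for the example.

```python
def locate_with_fallback(locator, image, confidences=(0.9, 0.8, 0.7)):
    """Try each confidence level in order; return (match, confidence) or (None, None)."""
    for confidence in confidences:
        try:
            match = locator(image, confidence=confidence)
        except Exception as exc:
            # Treat "image not found" as a miss; re-raise anything else
            if type(exc).__name__ != "ImageNotFoundException":
                raise
            match = None
        if match is not None:
            return match, confidence
    return None, None


# Example use from a desktop session with OpenCV installed:
#   import pyautogui
#   box, used = locate_with_fallback(pyautogui.locateOnScreen, "chrome_icon.png")
```

Returning the confidence that succeeded makes it easier to notice when a match only works at a low threshold, which is usually a sign that the reference image should be recaptured (ideally as a PNG).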


This marks the end of the PyAutoGUI Screenshot and Image recognition tutorial. Any suggestions or contributions for CodersLegacy are more than welcome. Questions regarding the tutorial content can be asked in the comments section below.
