Selenium: How to scroll to the Bottom of the Page (Python)

In this tutorial, we will explore how to use Selenium in Python to scroll to the bottom of a webpage gradually. Scrolling is often required when dealing with dynamically loading content or capturing data from a website that requires scrolling to access all information.

Common examples of such sites include YouTube Shorts, where new videos load as you scroll down, and e-commerce sites that display a lot of products and lazy-load their images (loading each image only when you scroll down to it).

Many such sites exist, so learning how to automate scrolling is a useful skill.


Prerequisites

Before you begin, make sure you have Python and the necessary packages installed. You can install the required packages using the following:

pip install selenium webdriver_manager

Setting up Selenium with ChromeDriver

First, let’s set up Selenium with ChromeDriver. We’ll use webdriver_manager to automatically download and manage the ChromeDriver executable (the module also supports other browsers; see its documentation for details).

from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.chrome.service import Service as ChromeService
from selenium import webdriver
import time

# Set up ChromeDriver
service = ChromeService(executable_path=ChromeDriverManager().install())
driver = webdriver.Chrome(service=service)

Next, we load up the webpage we wish to scrape (or automate). We will be using Wikipedia as an example, as its main page is usually lengthy enough to demonstrate the scroll effect.

driver.get("https://en.wikipedia.org/wiki/Main_Page")

You may replace this with any other URL. The scrolling technique we will learn in this tutorial is not URL-specific.


Scrolling Down Gradually

Now, let’s define a JavaScript function to scroll down gradually. This function will be executed using the driver’s execute_script method, which takes JavaScript code in the form of a string and runs it in the current page.
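As a small illustration of execute_script on its own, the sketch below reads a few scroll values back into Python and passes a Python value into a script. The helper names get_scroll_metrics and scroll_by are made up for this example, and driver is assumed to be the WebDriver object created earlier.

```python
def get_scroll_metrics(driver):
    """Return (viewport_height, scroll_y, page_height) for the current page."""
    # A JavaScript array returned by the script arrives in Python as a list.
    return tuple(driver.execute_script(
        "return [window.innerHeight, window.scrollY, "
        "Math.max(document.documentElement.scrollHeight, document.body.scrollHeight)];"
    ))


def scroll_by(driver, pixels):
    """Scroll down by `pixels`."""
    # Extra Python arguments are exposed to the script as `arguments[0]`, `arguments[1]`, ...
    driver.execute_script("window.scrollBy(0, arguments[0]);", pixels)
```

A script only hands a value back to Python if it uses a return statement, which is why the first helper starts its script with return and the second does not.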

scroll_script = """
    function scrollToBottom() {
        var scrollHeight = Math.max(document.documentElement.scrollHeight, document.body.scrollHeight);
        var scrollStep = 200;  // Pixels scrolled per step (larger = faster)
        var scrollInterval = 100;  // Milliseconds to wait between steps (larger = slower)

        function scroll() {
            window.scrollBy(0, scrollStep);
            var reachedBottom = window.innerHeight + window.scrollY >= scrollHeight;

            if (!reachedBottom) {
                setTimeout(scroll, scrollInterval);
            }
        }

        scroll();
    }

    scrollToBottom();
"""

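driver.execute_script(scroll_script)

# execute_script returns immediately; the scrolling itself keeps running
# in the browser via setTimeout. Wait for it to finish before closing the
# browser (increase this value for longer pages).
time.sleep(10)
driver.quit()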

Adjust the values of scrollStep and scrollInterval to control how far each step scrolls (in pixels) and how long the script waits between steps (in milliseconds), respectively. If your internet connection is slow, or the website is heavy/slow, you may want to slow the scroll down (by lowering scrollStep or raising scrollInterval) so that the site has time to load alongside the scroll.
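To get a feel for how these two values interact, here is a rough back-of-the-envelope calculation in Python. The function name estimate_scroll_seconds and the page dimensions are hypothetical, and the estimate ignores the time the page needs to load or render new content.

```python
import math


def estimate_scroll_seconds(page_height, viewport_height,
                            scroll_step=200, scroll_interval_ms=100):
    """Roughly estimate how long the gradual scroll takes, in seconds."""
    # Distance left to cover once the first screenful is already visible.
    distance = max(page_height - viewport_height, 0)
    # Number of scrollBy calls needed to cover that distance.
    steps = math.ceil(distance / scroll_step)
    return steps * scroll_interval_ms / 1000


# Hypothetical page: 5000px tall, viewed in an 800px-high window.
print(estimate_scroll_seconds(5000, 800))            # default settings
print(estimate_scroll_seconds(5000, 800, 100, 200))  # smaller step, longer interval
```

With these made-up numbers, halving the step and doubling the interval makes the scroll take about four times as long, which is the kind of slowdown a heavy page may need.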

This code may seem a little complicated, but just remember there is only one “core” line of code here, the window.scrollBy function.

window.scrollBy(0, scrollStep)

This function is responsible for the actual “scrolling”. The rest of the code is simply built around this line, to ensure we don’t scroll beyond the limit of the page. The window object contains useful information about the page which can be used to determine when we have reached the end.
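The same idea can also be driven from Python rather than from a JavaScript timer. The following is only a sketch of that alternative, not the approach used above: the function name scroll_to_bottom and its parameters are assumptions for this example. It re-reads the page height on every step, which may help on pages that grow as content loads, and it caps the number of steps so that an endlessly growing page cannot loop forever.

```python
import time


def scroll_to_bottom(driver, step=200, pause=0.1, max_steps=500):
    """Scroll down in small steps from Python; return the number of steps taken."""
    at_bottom_js = (
        "return window.innerHeight + window.scrollY >= "
        "Math.max(document.documentElement.scrollHeight, document.body.scrollHeight);"
    )
    for steps_taken in range(1, max_steps + 1):
        # The "core" line: scroll down by `step` pixels.
        driver.execute_script("window.scrollBy(0, arguments[0]);", step)
        time.sleep(pause)  # give lazy-loaded content a moment to appear
        # The height is re-read on each step, so a page that got taller is handled.
        if driver.execute_script(at_bottom_js):
            return steps_taken
    return max_steps  # gave up: the page kept growing or max_steps was too small
```

Because each step is a separate call from Python, this version blocks until the scrolling is done, at the cost of one extra round trip to the browser per step.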

var reachedBottom = window.innerHeight + window.scrollY >= scrollHeight;

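The above line is an expression which evaluates to true if we have reached the bottom of the page. It checks whether the sum of the viewport height (window.innerHeight) and the current scroll position (window.scrollY) is equal to or greater than the total scrollable height. Note that scrollHeight is measured once, when the script starts, so content that loads later and makes the page taller is not accounted for.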
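Because execute_script returns as soon as the script has been started, the same expression can also be polled from Python to detect when the in-browser scrolling has finished. This is a minimal sketch under that assumption; the names AT_BOTTOM_JS and wait_until_bottom are invented for this example.

```python
import time

# Same check as in the scroll script, but returned to Python as a boolean.
AT_BOTTOM_JS = (
    "return window.innerHeight + window.scrollY >= "
    "Math.max(document.documentElement.scrollHeight, document.body.scrollHeight);"
)


def wait_until_bottom(driver, timeout=30.0, poll=0.25):
    """Poll the page until it is scrolled to the bottom; False if `timeout` expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if driver.execute_script(AT_BOTTOM_JS):
            return True
        time.sleep(poll)
    return False
```

A call such as wait_until_bottom(driver) would go between running the scroll script and driver.quit(). On pages that keep growing after the scroll script has measured their height, this check may never become true, so the timeout is what ends the wait in that case.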


This marks the end of the Selenium Tutorial on How to scroll to the Bottom of the Page. Any suggestions or contributions for CodersLegacy are more than welcome. Questions regarding the tutorial content can be asked in the comments section below.
