How to Find Elements by Class in BeautifulSoup

BeautifulSoup is a Python library used for web scraping and parsing HTML and XML documents. One of the most commonly used features of BeautifulSoup is the ability to find elements in an HTML document based on their class attribute. In this tutorial, we will learn how to find elements by class in BeautifulSoup, including how to find elements with multiple classes.

Before we get started, make sure you have installed BeautifulSoup by running the following command in your command prompt or terminal:

pip install beautifulsoup4

Finding elements by class using the find_all() method

The most common way to find elements by class in BeautifulSoup is to use the find_all() method. This method searches the HTML document for all elements with a specific class and returns a list of all matching elements.

The basic syntax for using the find_all() method to find elements by class is:

soup.find_all('tag', class_='class_name')

In the above syntax, replace tag with the HTML tag you want to search for, class_name with the name of the class you want to find, and soup with the BeautifulSoup object that represents the HTML document you want to search.


Example

Here is an example that demonstrates how to use the find_all() method to find all elements with a specific class in an HTML document.

In the example below, we first create a BeautifulSoup object from an HTML document that contains three <p> elements. Two of them have the class “first”, and one has the class “second”. We then use the find_all() method to find all <p> elements with the class “first” and print the text content of each matching element.

from bs4 import BeautifulSoup

html = """
<div class="example">
    <p class="first">First paragraph</p>
    <p class="second">Second paragraph</p>
    <p class="first">Another first paragraph</p>
</div>
"""

soup = BeautifulSoup(html, 'html.parser')

# find all elements with class "first"
first_elements = soup.find_all('p', class_='first')

# print the text content of each "first" element
for elem in first_elements:
    print(elem.text)

Output:

First paragraph
Another first paragraph
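
The tag name is optional, and the class_ keyword is not the only way to pass a class. The following is a minimal sketch that reuses the HTML from the example above; the variable names (any_tag, via_attrs, first_match) are illustrative and not part of the original example. The trailing underscore in class_ exists because class is a reserved word in Python.

```python
from bs4 import BeautifulSoup

html = """
<div class="example">
    <p class="first">First paragraph</p>
    <p class="second">Second paragraph</p>
    <p class="first">Another first paragraph</p>
</div>
"""

soup = BeautifulSoup(html, 'html.parser')

# search every tag type for the class, not just <p>
any_tag = soup.find_all(class_='first')

# same search as in the example, written with an attrs dictionary
via_attrs = soup.find_all('p', attrs={'class': 'first'})

# find() returns only the first match (or None if nothing matches)
first_match = soup.find('p', class_='first')

print([elem.text for elem in any_tag])
print([elem.text for elem in via_attrs])
print(first_match.text)
```

For this HTML, both find_all() calls should return the same two paragraphs, while find() returns only “First paragraph”.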

Finding elements by multiple classes

You can also use the find_all() method to match against several classes at once. To do this, pass a list of class names to the class_ parameter of the find_all() method, like so:

soup.find_all('tag', class_=['class1', 'class2'])

Here is an example in which we look for <p> elements that have either the “first” or the “second” class. A <p> element that has both classes is also returned.

from bs4 import BeautifulSoup

html = """
<div class="example">
    <p class="first second">First paragraph</p>
    <p class="second">Second paragraph</p>
    <p class="first">Another first paragraph</p>
</div>
"""

soup = BeautifulSoup(html, 'html.parser')

# find all elements with either class "first" or class "second"
elements = soup.find_all('p', class_=['first', 'second'])

# print the text content of each matching element
for elem in elements:
    print(elem.text)

Output:

First paragraph
Second paragraph
Another first paragraph

Sometimes you may want only the <p> elements that have both the “first” and “second” classes. In other words, we want a specific combination of classes. To do this, we can do the following:

from bs4 import BeautifulSoup

html = """
<div class="example">
    <p class="first second">First paragraph</p>
    <p class="second">Second paragraph</p>
    <p class="first">Another first paragraph</p>
    <p class="first third">First and third paragraph</p>
</div>
"""

soup = BeautifulSoup(html, 'html.parser')

# find all elements with both class "first" and "second"
elements = soup.find_all('p', class_='first second')

# print the text content of each matching element
for elem in elements:
    print(elem.text)

Output:

First paragraph

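To summarize, we didn’t use a list of classes. Instead, we passed a single string containing both classes, separated by a single space, to the class_ parameter. Note that this form matches the exact string value of the class attribute, so the order of the classes matters: class_='second first' would not match the first paragraph here.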
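
Because the single-string form compares against the exact value of the class attribute, it can miss elements that list the same classes in a different order or carry extra classes. One alternative, sketched here rather than taken from the original tutorial, is BeautifulSoup's select() method with a CSS selector such as p.first.second. The HTML below is a hypothetical variation of the earlier example, and the variable names exact and both are illustrative.

```python
from bs4 import BeautifulSoup

html = """
<div class="example">
    <p class="first second">First paragraph</p>
    <p class="second first">Reversed order paragraph</p>
    <p class="first second third">Three classes paragraph</p>
    <p class="first">Another first paragraph</p>
</div>
"""

soup = BeautifulSoup(html, 'html.parser')

# exact string match: only class="first second" qualifies
exact = soup.find_all('p', class_='first second')

# CSS selector: any <p> that has both classes, in any order,
# with or without additional classes
both = soup.select('p.first.second')

print([elem.text for elem in exact])
print([elem.text for elem in both])
```

Here the find_all() call should return only “First paragraph”, while select() should also return the reversed-order and three-class paragraphs. Which behavior is preferable depends on whether you need the exact attribute value or just the presence of both classes.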


This marks the end of the How to Find elements by Class in BeautifulSoup Tutorial. Any suggestions or contributions for CodersLegacy are more than welcome. Questions regarding the article content can be asked in the comments section below.
