Python BeautifulSoup: Extract Text from HTML Tags

Here's Python code that finds a heading tag (<h1> in this example) by its exact text and extracts the text of the <p> tags that come after it:

from bs4 import BeautifulSoup

# Read the HTML file
with open("index.html") as file:
    html = file.read()

# Create a BeautifulSoup object
soup = BeautifulSoup(html, "html.parser")

# Specify the heading you want to find
heading = "Heading 1"

# Find the <h1> tag whose text matches the heading exactly
h_tag = soup.find("h1", string=heading)
if h_tag is None:
    raise SystemExit(f"No <h1> tag with the text '{heading}' was found")

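# Collect the <p> tags under the heading, stopping at the next <h1>
# (find_all_next("p") alone would also return paragraphs from later sections)
p_tags = []
for tag in h_tag.find_all_next(["h1", "p"]):
    if tag.name == "h1":
        break
    p_tags.append(tag)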

# Get the text of the <p> tags
p_texts = [p.get_text() for p in p_tags]

# Print the results
print(f"The &#x3C;h&#x3E; tag found is &#x27;{h_tag.text}&#x27;")
print("The text of all the &#x3C;p&#x3E; tags under that &#x3C;h&#x3E; tag:")
print("\n".join(p_texts))

Make sure to have the BeautifulSoup library installed (pip install beautifulsoup4) before running this code. Also, adjust the file path in the open() function to match the location of your HTML file.
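
To try the idea without an HTML file, the same steps can be wrapped in a small function that takes the markup as a string. This is only a minimal sketch: the function name paragraphs_under_heading, the tag_name parameter, and the sample markup are assumptions made for illustration. It stops collecting at the next heading of the same level, so paragraphs that belong to later sections are left out.

```python
from bs4 import BeautifulSoup


def paragraphs_under_heading(html, heading, tag_name="h1"):
    """Return the text of the <p> tags between a heading and the next heading of the same level."""
    soup = BeautifulSoup(html, "html.parser")
    h_tag = soup.find(tag_name, string=heading)
    if h_tag is None:
        return []  # no heading with that exact text
    texts = []
    # Walk forward in document order and stop at the next heading of the same level
    for tag in h_tag.find_all_next([tag_name, "p"]):
        if tag.name == tag_name:
            break
        texts.append(tag.get_text(strip=True))
    return texts


# Hypothetical sample markup, used only to exercise the function
sample_html = """
<h1>Heading 1</h1>
<p>First paragraph.</p>
<div><p>Second paragraph.</p></div>
<h1>Heading 2</h1>
<p>Other section.</p>
"""

print(paragraphs_under_heading(sample_html, "Heading 1"))
```

Because the search walks the document in order rather than only looking at siblings, a <p> nested inside a <div> is still picked up. Returning an empty list for a missing heading is one possible design choice; raising an error would work just as well.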
