Welcome! In this lesson, we will take our first steps into the world of gathering data from the Web in Python using the `requests` library. You will learn how to retrieve web pages and display their content. Let's get our hands dirty with `requests`!
In modern web development, data exchange between the client (your web browser or application) and the server (where the data is stored) is handled through HTTP requests. We generally use four types of requests, namely GET, POST, PUT, and DELETE, for fetching, sending, updating, and deleting data respectively. But for now, let's focus on the GET request, which we use to fetch data, such as the HTML code of a web page.
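As a quick illustration of how these verbs map onto the `requests` API, the sketch below prepares (but does not send) one request per method, so it runs without any network access. The URL is just the example site used throughout this lesson:

```python
import requests

# Prepare (but do not send) one request per HTTP method, to see how
# each verb maps onto the requests API. No network access is needed.
for method in ("GET", "POST", "PUT", "DELETE"):
    prepared = requests.Request(method, "http://quotes.toscrape.com").prepare()
    print(prepared.method, prepared.url)
```

In everyday code you rarely build requests this way; you simply call the matching convenience function, such as `requests.get` or `requests.post`.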
Python provides us with a wonderful library, `requests`, to handle these HTTP requests with ease. The `requests` library abstracts the complexities of making HTTP requests behind a simple API, allowing you to send HTTP requests with just a few lines of code.
Now that we understand the concept of HTTP requests, let's move on to fetching a web page's content using Python `requests`.
```python
import requests

url = 'http://quotes.toscrape.com'
response = requests.get(url)
```
Here, we have imported the `requests` library and then used the `get` function to send a GET request to the URL `http://quotes.toscrape.com`. The response from the server is stored in the variable `response`.
How do we know whether our fetch operation was successful? It's quite simple: we check the HTTP response status code. A status code of 200 means the request succeeded. Codes in the 400-499 range indicate a client-side error, and codes in the 500-599 range indicate a server-side error.
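These ranges can be captured in a few lines of plain Python. The helper below is not part of `requests`; it is just a sketch of how the numeric ranges map to outcomes:

```python
def describe_status(code):
    """Classify an HTTP status code by its numeric range."""
    if 200 <= code < 300:
        return "success"
    if 400 <= code < 500:
        return "client-side error"
    if 500 <= code < 600:
        return "server-side error"
    return "other (informational or redirect)"

print(describe_status(200))  # success
print(describe_status(404))  # client-side error
print(describe_status(503))  # server-side error
```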
Our `response` object has an attribute `ok`, which returns `True` if our request was successful (status code less than 400). Let's write some code to validate this:
```python
if response.ok:
    print("Content fetched successfully!")
```
The output of the above code will be:
```text
Content fetched successfully!
```
This output confirms that the content was successfully fetched from the provided URL.
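If you want to see how `ok` tracks the status code without hitting the network, you can build bare `Response` objects by hand. Setting `status_code` directly like this is purely for illustration; real code gets a fully populated response back from `requests.get`:

```python
import requests

# Offline sketch: `ok` is True for status codes below 400, False otherwise.
for code in (200, 301, 404, 503):
    resp = requests.models.Response()
    resp.status_code = code
    print(code, resp.ok)
```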
So far so good. But we haven't done much with the content we've fetched. Let's print it out.
```python
print(response.text[:500])  # Display the first 500 characters of the webpage content
```
The output will be:
```text
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Quotes to Scrape</title>
    <link rel="stylesheet" href="/static/bootstrap.min.css">
    <link rel="stylesheet" href="/static/main.css">
</head>
<body>
    <div class="container">
        <div class="row header-box">
            <div class="col-md-8">
                <h1>
                    <a href="/" style="text-decoration: none">Quotes to Scrape</a>
                </h1>
            </div>
            <div class="col-md
```
This output shows the HTML content of the Quotes to Scrape webpage, demonstrating how the `requests` library can fetch the HTML data from a website.
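Note that `response.text` is an ordinary Python string: `requests` decodes the raw response bytes using the encoding it detects (which you can override via `response.encoding`). The sketch below fakes a response body through the private `_content` attribute purely to demonstrate that decoding; real code never sets it and simply reads `.text` off the result of `requests.get`:

```python
import requests

# Offline sketch: response bodies arrive as bytes; `.text` decodes them
# using `resp.encoding`. Setting `_content` here is for demonstration only.
resp = requests.models.Response()
resp.status_code = 200
resp._content = "<title>Quotes to Scrape</title> café".encode("utf-8")
resp.encoding = "utf-8"
print(resp.text)
```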
By running the entire snippet:
```python
import requests

# Fetch content from a website
url = 'http://quotes.toscrape.com'
response = requests.get(url)

if response.ok:
    print("Content fetched successfully!")
    print(response.text[:500])  # Display the first 500 characters of the webpage content
else:
    print("Failed to fetch content.")
```
And voila! You can now fetch and display content from a web page using the Python `requests` library.
You now understand how to fetch data from a web page using Python `requests` - a basic yet crucial skill that applies in many areas, such as writing a web scraper or interacting with an API.
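In real scrapers and API clients you will usually want a timeout and explicit error handling on top of the basic pattern above. Here is one possible sketch of a small helper; the function name and default timeout are our own choices, not part of `requests`:

```python
import requests

def fetch(url, timeout=10):
    """Fetch a page, raising an exception for HTTP errors or network failures."""
    response = requests.get(url, timeout=timeout)  # fail fast if the server hangs
    response.raise_for_status()  # raise HTTPError for 4xx/5xx status codes
    return response.text

# Usage (requires network access):
# html = fetch('http://quotes.toscrape.com')
# print(html[:500])
```

Passing a `timeout` prevents your program from blocking forever on an unresponsive server, and `raise_for_status()` turns silent HTTP failures into exceptions you can catch.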
Well done! You've mastered the basic concept of fetching web page data using Python's `requests` library. The more you practice, the better you'll get, so try fetching content from different URLs and have fun browsing through the HTML content. Keep coding, keep exploring, and see you in the next lesson!