Downloading web pages
Downloading a web page comes down to making an HTTP GET
request against a URL. This is the fundamental operation of any web browser.
Let's quickly recap this operation, which has three distinct elements:
- Using the HTTP protocol. This deals with the way the request is structured.
- Using the GET method, which is the most common HTTP method. We'll see more in the Accessing web APIs recipe.
- A full URL describing the address of the page, including the server (for example: mypage.com) and the path (for example: /page).
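To make the three elements concrete, here is a minimal sketch of how they combine into the text of a request. The helper name build_get_request is hypothetical and only for illustration, and the output is a simplified HTTP/1.1 request with a single Host header; real clients send more headers than this.

```python
from urllib.parse import urlsplit


def build_get_request(url):
    """Return the text of a minimal HTTP/1.1 GET request for url.

    Hypothetical helper for illustration only; a real HTTP client
    adds further headers (User-Agent, Accept, and so on).
    """
    parts = urlsplit(url)
    # An empty path means the root of the server.
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return (
        f"GET {target} HTTP/1.1\r\n"   # method, path, and protocol version
        f"Host: {parts.netloc}\r\n"    # the server part of the URL
        "\r\n"                         # blank line ends the headers
    )


print(build_get_request("http://mypage.com/page"))
```

The method and the path end up on the first line, while the server part of the URL travels in the Host header.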
That request is routed across the internet to the server, which processes it and sends back a response. This response contains a status code, typically 200 if everything went fine, and a body with the result, which is normally the text of an HTML page.
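The request and response cycle described above can be sketched with the Python standard library alone. This example is an assumption-laden illustration rather than the recipe's own code: it starts a throwaway local server (the PageHandler class, the PAGE content, and the fetch and demo functions are all hypothetical names) so that it runs without network access, then performs a GET request and inspects the status code and body.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical page content served by the local test server.
PAGE = b"<html><body><h1>Hello</h1></body></html>"


class PageHandler(BaseHTTPRequestHandler):
    """Tiny stand-in for a web server: answers GET /page with HTML."""

    def do_GET(self):
        if self.path == "/page":
            self.send_response(200)
            self.send_header("Content-Type", "text/html; charset=utf-8")
            self.send_header("Content-Length", str(len(PAGE)))
            self.end_headers()
            self.wfile.write(PAGE)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        pass  # keep the example output quiet


def fetch(url):
    """Make a GET request and return (status code, body as text)."""
    # An empty ProxyHandler avoids routing the local request via a proxy.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url, timeout=5) as response:
        return response.status, response.read().decode("utf-8")


def demo():
    # Port 0 asks the operating system for any free port.
    server = HTTPServer(("127.0.0.1", 0), PageHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        return fetch(f"http://127.0.0.1:{port}/page")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    status, body = demo()
    print(status)
    print(body)
```

Against a real site, only the fetch function would be needed, called with the full URL of the page.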
Most of this is handled automatically by the HTTP client used to perform the request. We...