Learning Python Networking

Learning Python Networking: Utilize Python 3 to get network applications up and running quickly and easily

Authors: Samuel B Washington, Dr. M. O. Faruque Sarker, Sam Washington
$29.99 $43.99
3.7 (7 Ratings)
eBook Jun 2015 320 pages 1st Edition
eBook
$29.99 $43.99
Paperback
$54.99
Subscription
Free Trial
Renews at $19.99p/m

What do you get with eBook?

  • Instant access to your Digital eBook purchase
  • Download this book in EPUB and PDF formats
  • Access this title in our online reader with advanced features
  • DRM FREE - Read whenever, wherever and however you want

Chapter 2. HTTP and Working with the Web

The Hypertext Transfer Protocol (HTTP) is probably the most widely used application layer protocol. It was originally developed to allow academics to share HTML documents. Nowadays, it is used as the core protocol of innumerable applications across the Internet, and it is the principal protocol of the World Wide Web.

In this chapter, we will cover the following topics:

  • The HTTP protocol structure
  • Using Python for talking to services through HTTP
  • Downloading files
  • HTTP capabilities, such as compression and cookies
  • Handling errors
  • URLs
  • The Python standard library urllib package
  • Kenneth Reitz's third-party Requests package

The urllib package is the recommended Python standard library package for HTTP tasks. The standard library also has a low-level module called http. Although this offers access to almost all aspects of the protocol, it has not been designed for everyday use. The urllib package has a simpler interface, and it deals with everything...

Request and response


HTTP is an application layer protocol, and it is almost always used on top of TCP. The HTTP protocol has been deliberately defined to use a human-readable message format, but it can still be used for transporting arbitrary binary data.

An HTTP exchange consists of two elements: a request, made by the client, which asks the server for a particular resource specified by a URL, and a response, sent by the server, which supplies the resource that the client has asked for. If the server can't provide the resource that the client has requested, then the response will contain information about the failure.

This order of events is fixed in HTTP. All interactions are initiated by the client. The server never sends anything to the client without the client explicitly asking for it.

This chapter will teach you how to use Python as an HTTP client. We will learn how to make requests to servers and then interpret their responses. We will look at writing server-side applications in Chapter...

Requests with urllib


We have already seen some examples of HTTP exchanges while discussing the RFC downloaders in Chapter 1, Network Programming and Python. The urllib package is broken into several submodules for dealing with the different tasks that we may need to perform when working with HTTP. For making requests and receiving responses, we employ the urllib.request module.

Retrieving the contents of a URL is a straightforward process when done using urllib. Load your Python interpreter and do the following:

>>> from urllib.request import urlopen
>>> response = urlopen('http://www.debian.org')
>>> response
<http.client.HTTPResponse object at 0x7fa3c53059b0>
>>> response.readline()
b'<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">\n'

We use the urllib.request.urlopen() function for sending a request and receiving a response for the resource at http://www.debian.org, in this case an HTML page. We...
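As a hedged sketch of the same request made from a script rather than the interpreter, the following block starts a throwaway local server so that it runs without Internet access. The HelloHandler class and the fetch_first_line() helper are hypothetical names made up for this illustration; pointing fetch_first_line() at a real URL should work the same way.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that serves one small HTML page."""

    def do_GET(self):
        body = b"<!DOCTYPE html>\n<html><body>Hello</body></html>\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def fetch_first_line(url):
    """Send a GET request and return the first line of the body as bytes."""
    # The with statement closes the connection when the block ends.
    with urlopen(url, timeout=5) as response:
        return response.readline()


# Port 0 asks the operating system for any free port.
server = HTTPServer(("127.0.0.1", 0), HelloHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
try:
    first_line = fetch_first_line("http://127.0.0.1:%d/" % server.server_port)
finally:
    server.shutdown()
    server.server_close()

print(first_line)
```

Using the response as a context manager is a design choice rather than a requirement; it simply makes sure the underlying socket is released.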

Response objects


Let's take a closer look at our response object. We can see from the preceding example that urlopen() returns an http.client.HTTPResponse instance. The response object gives us access to the data of the requested resource, and the properties and the metadata of the response. To view the URL for the response that we received in the previous section, do this:

>>> response.url
'http://www.debian.org'

We get the data of the requested resource through a file-like interface using the readline() and read() methods. We saw the readline() method in the previous section. This is how we use the read() method:

>>> response = urlopen('http://www.debian.org')
>>> response.read(50)
b'g="en">\n<head>\n  <meta http-equiv="Content-Type" c'

The read() method returns the specified number of bytes from the data. Here it's the first 50 bytes. A call to the read() method with no argument will return all the data in one go.
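For large responses, a common pattern is to call read() repeatedly with a fixed size until it returns an empty bytes object. The sketch below assumes only a file-like object with a read() method; an io.BytesIO instance stands in for the response here so that the example runs offline, and the read_in_chunks() helper is a hypothetical name.

```python
import io


def read_in_chunks(stream, chunk_size=50):
    """Yield successive chunks from a file-like object until it is exhausted."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # an empty bytes object signals the end of the data
            break
        yield chunk


# Stand-in for a response body: 120 bytes of placeholder data.
fake_body = io.BytesIO(b"x" * 120)
chunk_sizes = [len(chunk) for chunk in read_in_chunks(fake_body)]
print(chunk_sizes)
```

The same generator should accept the object returned by urlopen(), since that object offers the same read() interface.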

The file-like interface is limited...

Status codes


What if we wanted to know whether anything unexpected had happened to our request? Or what if we wanted to know whether our response contained any data before we read the data out? Maybe we're expecting a large response, and we want to quickly see if our request has been successful without reading the whole response.

HTTP responses provide a means for us to do this through status codes. We can read the status code of a response by using its status attribute.

>>> response.status
200

Status codes are integers that tell us how the request went. The 200 code informs us that everything went fine.

There are a number of codes, and each one conveys a different meaning. According to their first digit, status codes are classified into the following groups:

  • 100: Informational

  • 200: Success

  • 300: Redirection

  • 400: Client error

  • 500: Server error

A few of the more frequently encountered codes and their messages are as follows:

  • 200: OK

  • 404: Not Found

  • 500: Internal Server Error
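The grouping by first digit can be sketched in a few lines. The describe_status() helper and the STATUS_CLASSES table below are hypothetical names for this illustration; the standard phrases come from the standard library's http.HTTPStatus enumeration.

```python
from http import HTTPStatus

# The five groups listed above, keyed by the first digit of the code.
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}


def describe_status(code):
    """Return a short description such as '404 Not Found (Client error)'."""
    group = STATUS_CLASSES.get(code // 100, "Unknown")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        # Codes that the standard library does not know about.
        phrase = "Unknown"
    return "%d %s (%s)" % (code, phrase, group)


descriptions = [describe_status(code) for code in (200, 404, 500)]
for line in descriptions:
    print(line)
```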

The official...

Handling problems


Status codes help us to see whether our response was successful or not. Any code in the 200 range indicates a success, whereas any code in either the 400 range or the 500 range indicates failure.

Status codes should always be checked so that our program can respond appropriately if something goes wrong. The urllib package helps us in checking the status codes by raising an exception if it encounters a problem.

Let's go through how to catch these and handle them usefully. For this, try the following command block:

>>> import urllib.error
>>> from urllib.request import urlopen
>>> try:
...     urlopen('http://www.ietf.org/rfc/rfc0.txt')
... except urllib.error.HTTPError as e:
...     print('status:', e.code)
...     print('reason:', e.reason)
...     print('url:', e.url)
...
status: 404
reason: Not Found
url: http://www.ietf.org/rfc/rfc0.txt

Here we've requested RFC 0, which doesn't exist. So the server has returned a 404 status code, and urllib has spotted this...
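In a script, it can be convenient to wrap this check in a function. The following is a minimal sketch under stated assumptions: check_url() and MissingHandler are hypothetical names, and a local server that answers every request with 404 is used so that the example does not depend on a remote site. The extra URLError branch covers the case where no HTTP response arrives at all.

```python
import threading
import urllib.error
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen


class MissingHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that answers every GET request with 404."""

    def do_GET(self):
        self.send_error(404)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def check_url(url):
    """Return (status, reason) for a URL, whether the request succeeds or not."""
    try:
        with urlopen(url, timeout=5) as response:
            return response.status, response.reason
    except urllib.error.HTTPError as e:
        # The server answered, but with a 4xx or 5xx status code.
        return e.code, e.reason
    except urllib.error.URLError as e:
        # No HTTP response at all, for example a refused connection.
        return None, str(e.reason)


server = HTTPServer(("127.0.0.1", 0), MissingHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
try:
    result = check_url("http://127.0.0.1:%d/rfc0.txt" % server.server_port)
finally:
    server.shutdown()
    server.server_close()

print(result)
```

HTTPError is a subclass of URLError, so the more specific exception has to be caught first.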

HTTP headers


Requests and responses are made up of two main parts: headers and a body. We briefly saw some HTTP headers when we used our TCP RFC downloader in Chapter 1, Network Programming and Python. Headers are the lines of protocol-specific information that appear at the beginning of the raw message that is sent over the TCP connection. The body is the rest of the message. It is separated from the headers by a blank line. The body is optional; its presence depends on the type of request or response. Here's an example of an HTTP request:

GET / HTTP/1.1
Accept-Encoding: identity
Host: www.debian.com
Connection: close
User-Agent: Python-urllib/3.4

The first line is called the request line. It is comprised of the request method, which is GET in this case, the path to the resource, which is / here, and the HTTP version, 1.1. The rest of the lines are request headers. Each line is comprised of a header name followed by a colon and a header value. The request in the preceding output only contains...
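To make the structure concrete, here is a rough sketch that splits the request shown above into its request line, headers, and body. The parse_request() helper is a hypothetical name written for this illustration, not part of urllib; it assumes the lines are terminated by CRLF, as they are on the wire, and it ignores edge cases such as repeated header names.

```python
# The example request from above, written with explicit CRLF line endings.
RAW_REQUEST = (
    "GET / HTTP/1.1\r\n"
    "Accept-Encoding: identity\r\n"
    "Host: www.debian.com\r\n"
    "Connection: close\r\n"
    "User-Agent: Python-urllib/3.4\r\n"
    "\r\n"
)


def parse_request(raw):
    """Split a raw HTTP request into (method, path, version, headers, body)."""
    # A blank line separates the headers from the optional body.
    head, _, body = raw.partition("\r\n\r\n")
    request_line, *header_lines = head.split("\r\n")
    method, path, version = request_line.split(" ")
    headers = {}
    for line in header_lines:
        # Each header is a name, a colon, and a value.
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return method, path, version, headers, body


method, path, version, headers, body = parse_request(RAW_REQUEST)
print(method, path, version)
print(headers)
```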

Customizing requests


To make use of the functionality that headers provide, we add headers to a request before sending it. To do this, we can't just use urlopen(). We need to follow these steps:

  • Create a Request object

  • Add headers to the request object

  • Use urlopen() to send the request object

We're going to learn how to customize a request for retrieving a Swedish version of the Debian home page. We will use the Accept-Language header, which tells the server our preferred language for the resource it returns. Note that not all servers hold versions of resources in multiple languages, so not all servers will respond to Accept-Language.

First, we create a Request object:

>>> from urllib.request import Request
>>> req = Request('http://www.debian.org')

Next we add the header:

>>> req.add_header('Accept-Language', 'sv')

The add_header() method takes the name of the header and the contents of the header as arguments. The Accept-Language header takes two-letter...
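The first two steps can be checked without sending anything over the network. In this sketch, build_request() is a hypothetical helper; the point to notice is that Request stores header names in capitalized form, so the header added as Accept-Language is looked up as Accept-language.

```python
from urllib.request import Request


def build_request(url, language):
    """Create a Request object that asks for the given language."""
    req = Request(url)
    req.add_header("Accept-Language", language)
    return req


req = build_request("http://www.debian.org", "sv")

# Request normalizes header names with str.capitalize().
stored_headers = req.header_items()
language = req.get_header("Accept-language")
print(stored_headers)
print(language)
```

Passing the resulting object to urlopen() in place of a URL string would then send the request with the extra header.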

Content negotiation


Content compression with the Accept-Encoding header and language selection with the Accept-Language header are examples of content negotiation, where the client specifies its preferences regarding the format and the content of the requested resource. The following headers can also be used for this:

  • Accept: For requesting a preferred file format

  • Accept-Charset: For requesting the resource in a preferred character set

There are additional aspects to the content negotiation mechanism, but because it's inconsistently supported and it can become quite involved, we won't be covering it in this chapter. RFC 7231 contains all the details that you need. Take a look at sections such as 3.4, 5.3, 6.4.1, and 6.5.6, if you find that your application requires this.

Content types

HTTP can be used as a transport for any type of file or data. The server can use the Content-Type header in a response to inform the client about the type of data that it has sent in the body. This is the primary...

User agents


Another request header worth knowing about is the User-Agent header. Any client that communicates using HTTP can be referred to as a user agent. RFC 7231 suggests that user agents should use the User-Agent header to identify themselves in every request. What goes in there is up to the software that makes the request, though it usually comprises a string that identifies the program and version, and possibly the operating system and the hardware that it's running on. For example, the user agent for my current version of Firefox is shown here:

Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20140722 Firefox/24.0 Iceweasel/24.7.0

It is a single long string. As you can probably decipher, I'm running Iceweasel (Debian's version of Firefox) version 24 on a 64-bit Linux system. User agent strings aren't intended for identifying individual users. They only identify the product that was used for making the request.
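As a small sketch, the default user agent that urllib sends can be read from an opener's addheaders list, and a custom value can be set on a Request object. The ExampleClient/0.1 string is a made-up value for illustration only.

```python
from urllib.request import Request, build_opener

# An opener carries the headers that urllib adds to every request by default.
default_agent = dict(build_opener().addheaders)["User-agent"]

# A header given to Request overrides the default for that one request.
req = Request(
    "http://www.debian.org",
    headers={"User-Agent": "ExampleClient/0.1"},  # hypothetical client name
)
custom_agent = req.get_header("User-agent")

print(default_agent)
print(custom_agent)
```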

We can view the user agent...

Cookies


A cookie is a small piece of data that the server sends in a Set-Cookie header as a part of the response. The client stores cookies locally and includes them in any future requests that are sent to the server.

Servers use cookies in various ways. They can add a unique ID to them, which enables them to track a client as it accesses different areas of a site. They can store a login token, which will automatically log the client in, even if the client leaves the site and then accesses it later. They can also be used for storing the client's user preferences or snippets of personalizing information, and so on.

Cookies are necessary because the server has no other way of tracking a client between requests. HTTP is called a stateless protocol. It doesn't contain an explicit mechanism for a server to know for sure that two requests have come from the same client. Without cookies to allow the server to add some uniquely identifying information to the requests, things such as shopping carts...
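The round trip can be sketched with the standard library's http.cookies module: parse the value of a Set-Cookie header, then build the Cookie header that a client would send back. The header value and the cookie_header_from() helper below are hypothetical, and the sketch ignores domains, paths, and expiry dates, which the http.cookiejar module is designed to handle for real clients.

```python
from http.cookies import SimpleCookie


def cookie_header_from(set_cookie_value):
    """Build a Cookie request header value from a Set-Cookie response value."""
    jar = SimpleCookie()
    jar.load(set_cookie_value)
    # Only the name=value pairs are sent back; attributes such as Path are not.
    return "; ".join(
        "%s=%s" % (name, morsel.value) for name, morsel in jar.items()
    )


# A made-up header value of the kind a server might send.
set_cookie = "session_id=abc123; Path=/; HttpOnly"
cookie_header = cookie_header_from(set_cookie)
print(cookie_header)
```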

Redirects


Sometimes servers move their content around. They also make some content obsolete and put up new stuff in a different location. Sometimes they'd like us to use the more secure HTTPS protocol instead of HTTP. In all these cases, they may get traffic that asks for the old URLs, and in all these cases they'd probably prefer to be able to automatically send visitors to the new ones.

The 300 range of HTTP status codes is designed for this purpose. These codes indicate to the client that further action is required on their part to complete the request. The most commonly encountered action is to retry the request at a different URL. This is called a redirect.

We'll learn how this works when using urllib. Let's make a request:

>>> req = Request('http://www.gmail.com')
>>> response = urlopen(req)

Simple enough, but now, look at the URL of the response:

>>> response.url
'https://accounts.google.com/ServiceLogin?service=mail&passive=true&rm=false...'

This...
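The same effect can be reproduced offline. In this hedged sketch, a local server (the hypothetical MovedHandler class) answers requests for /old with a 302 status and a Location header, and urlopen() follows the redirect without any extra code on our part.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen


class MovedHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: /old redirects to /new, which serves a page."""

    def do_GET(self):
        if self.path == "/old":
            self.send_response(302)
            self.send_header("Location", "/new")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"new location\n"
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


server = HTTPServer(("127.0.0.1", 0), MovedHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
try:
    requested_url = "http://127.0.0.1:%d/old" % server.server_port
    with urlopen(requested_url, timeout=5) as response:
        # The response describes the final request, not the original one.
        final_url = response.url
        final_status = response.status
finally:
    server.shutdown()
    server.server_close()

print(requested_url, "->", final_url, final_status)
```

Comparing the URL that was requested with response.url is a simple way to detect that a redirect took place.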

URLs


Uniform Resource Locators, or URLs, are fundamental to the way in which the web operates, and they have been formally described in RFC 3986. A URL represents a resource on a given host. How URLs map to the resources on the remote system is entirely at the discretion of the system admin. URLs can point to files on the server, or the resources may be dynamically generated when a request is received. What the URL maps to doesn't matter, though, as long as the URLs work when we request them.

URLs are comprised of several sections. Python uses the urllib.parse module for working with URLs. Let's use Python to break a URL into its component parts:

>>> from urllib.parse import urlparse
>>> result = urlparse('http://www.python.org/dev/peps')
>>> result
ParseResult(scheme='http', netloc='www.python.org', path='/dev/peps', params='', query='', fragment='')

The urllib.parse.urlparse() function interprets our URL and recognizes http as the scheme, www.python.org...
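The parsed result is a named tuple, so its parts can be read by attribute, and a URL can be rebuilt from it. The sketch below uses only urlparse() and urlunparse() from urllib.parse; switching the scheme to https is just an illustration of modifying one component.

```python
from urllib.parse import urlparse, urlunparse

result = urlparse("http://www.python.org/dev/peps")

# Each component is available as an attribute of the result.
scheme, netloc, path = result.scheme, result.netloc, result.path

# urlunparse() reverses the operation.
rebuilt = urlunparse(result)

# Named tuples are immutable; _replace() returns a modified copy.
secure = urlunparse(result._replace(scheme="https"))

print(scheme, netloc, path)
print(rebuilt)
print(secure)
```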

Description

If you're a Python developer or a system administrator with Python experience and you're looking to take your first steps in network programming, then this book is for you. Basic knowledge of Python is assumed.

What you will learn

  • Develop an understanding of network stacks and the power of encapsulation
  • Design high-performance network server applications
  • Implement socket-based network applications using asynchronous models
  • Build client applications for major web APIs, including Amazon S3 and Twitter
  • Interact with email servers using SMTP, POP3, and IMAP protocols
  • Deal with remote network servers using SSH, FTP, SNMP, SMB/CIFS, and LDAP protocols
  • Work with IP addresses including GeoIP lookups
  • Download objects from the Web and craft custom HTTP requests with urllib and the Requests library

Product Details

Publication date: Jun 17, 2015
Length: 320 pages
Edition: 1st
Language: English
ISBN-13: 9781784391157

Table of Contents

11 Chapters
1. Network Programming and Python
2. HTTP and Working with the Web
3. APIs in Action
4. Engaging with E-mails
5. Interacting with Remote Systems
6. IP and DNS
7. Programming with Sockets
8. Client and Server Applications
9. Applications for the Web
A. Working with Wireshark
Index

Customer reviews

Rating distribution
3.7 (7 Ratings)
5 star 42.9%
4 star 14.3%
3 star 28.6%
2 star 0%
1 star 14.3%

Top Reviews

homie, Jan 10, 2016, 5 stars
Great book for those who are new to Network programming.
Amazon Verified review

Keep Learning, Nov 19, 2017, 5 stars
If you are not clear/scared/confused about the topics revolving around Networking (IP Address, Protocols, Sockets, Security etc.) then this book will make sure you are not the same person after completing it. A great introduction to Networking for any Coder. It creates a good foundation, the topics covered are sufficient for any Mobile or WebDeveloper or any other developer to get the basics correct. Later if required they can explore more from other resources. Python is always are great language to learn any new topic or domain because 1. Python is easy and fun. 2. Python has very good set of libraries to make your life easy.
Amazon Verified review

Timoteo, Sep 23, 2016, 5 stars
This is a great introduction to network programming. I have no experience in the topic, but I felt the author walked through each topic very thoroughly. I would highly recommend this book for anyone interested in networking. I do recommend being at an intermediate-level python programmer, as it will be easy to follow the programming. I will buy his companion networking cookbook as well.
Amazon Verified review

Nick K, Jan 05, 2016, 4 stars
Sure I thought it's quite useful. For specialist knowledge I needed a specialist book. This does just that. Highly recommended.
Amazon Verified review

Ryan, Jun 25, 2018, 3 stars
Most of the content is okay, I guess. But you're going to spend most of your time making the code from the book actually compile. Also, the errata sucks and the author is apparently not available to verify errata submissions. meh..
Amazon Verified review