An introduction to TCP/IP networks
The Internet protocol suite, often referred to as TCP/IP, is a set of protocols designed to work together to provide end-to-end transmission of messages across interconnected networks.
The following discussion is based on Internet Protocol version 4 (IPv4). Since the Internet has run out of IPv4 addresses, a new version, IPv6, has been developed, which is intended to resolve this situation. However, although IPv6 is being used in a few areas, its deployment is progressing slowly and a majority of the Internet will likely be using IPv4 for a while longer. We'll focus on IPv4 in this section, and then we will discuss the relevant changes in IPv6 in the second part of this chapter.
TCP/IP is specified in documents called Requests for Comments (RFCs), which are published by the Internet Engineering Task Force (IETF). RFCs cover a wide range of standards, and TCP/IP is just one of these. They are freely available on the IETF's website, which can be found at www.ietf.org/rfc.html. Each RFC has a number; IPv4 is documented by RFC 791, and other relevant RFCs will be mentioned as we progress.
Note that you won't learn how to set up your own network in this chapter because that's a big topic and unfortunately, somewhat beyond the scope of this book. But, it should enable you at least to have a meaningful conversation with your network support people!
IP addresses
So, let's get started with something you're likely to be familiar with, that is, IP addresses. They typically look something like this:
203.0.113.12
An IP address is actually a single 32-bit number, but it is usually written as shown in the preceding example: four decimal numbers separated by dots. The numbers are sometimes called octets or bytes because each one represents 8 bits of the 32-bit number. As such, each octet can only take values from 0 to 255, so valid IP addresses range from 0.0.0.0 to 255.255.255.255. This way of writing IP addresses is called dot-decimal notation.
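To make the link between dot-decimal notation and the underlying 32-bit number concrete, here is a minimal sketch. The helper name dotted_to_int is our own, chosen for illustration; the result is cross-checked against Python's standard ipaddress module.

```python
import ipaddress

def dotted_to_int(address):
    """Convert a dot-decimal IPv4 string into its 32-bit integer value by hand.

    dotted_to_int is a hypothetical helper written for this example.
    """
    octets = [int(part) for part in address.split(".")]
    if len(octets) != 4 or any(not 0 <= octet <= 255 for octet in octets):
        raise ValueError("not a valid IPv4 address: %r" % address)
    value = 0
    for octet in octets:
        # Shift the running value left by 8 bits and append the next octet.
        value = (value << 8) | octet
    return value

addr = "203.0.113.12"
by_hand = dotted_to_int(addr)
by_module = int(ipaddress.IPv4Address(addr))

print(by_hand == by_module)      # both routes give the same number
print(format(by_hand, "032b"))   # the same address as 32 binary digits
```

Reading the binary string in groups of eight digits gives back the four octets.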
IP addresses perform two main functions. They are as follows:
- They uniquely address each device that is connected to a network
- They help the traffic to be routed between networks
You may have noticed that the network-connected devices that you use have IP addresses assigned to them. Each IP address that is assigned to a network device is unique and no two devices can share an IP address.
Network interfaces
You can find out what IP addresses have been assigned to your computer by running ip addr (or ipconfig /all on Windows) on a terminal. In Chapter 6, IP and DNS, we'll see how to do this when using Python.
If we run one of these commands, then we can see that the IP addresses are assigned to our device's network interfaces. On Linux, these will have names, such as eth0; on Windows these will have phrases, such as Ethernet adapter Local Area Connection.
You will get the following output when you run the ip addr command on Linux:
$ ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether b8:27:eb:5d:7f:ae brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.4/24 brd 192.168.0.255 scope global eth0
       valid_lft forever preferred_lft forever
In the preceding example, the IP addresses for the interfaces appear after the word inet.
An interface is a device's physical connection to its network media. It could be a network card that connects to a network cable, or a radio that uses a specific wireless technology. A desktop computer may only have a single interface for a network cable, whereas a smartphone is likely to have at least two interfaces, one for connecting to Wi-Fi networks and one for connecting to mobile networks that use 4G or other technologies.
An interface is usually assigned only one IP address, and each interface in a device has a different IP address. So, going back to the purposes of IP addresses discussed in the preceding section, we can now more accurately say that their first main function is to uniquely address each device's connection to a network.
Every device has a virtual interface called the loopback interface, which you can see in the preceding listing as interface 1. This interface doesn't actually connect to anything outside the device, and only the device itself can communicate with it. While this may sound a little redundant, it's actually very useful when it comes to local network application testing, and it can also be used as a means of inter-process communication. The loopback interface is often referred to as localhost, and it is almost always assigned the IP address 127.0.0.1.
Assigning IP addresses
IP addresses can be assigned to a device by a network administrator in one of two ways: statically, where the device's operating system is manually configured with the IP address, or dynamically, where the device's operating system is configured by using the Dynamic Host Configuration Protocol (DHCP).
When using DHCP, as soon as the device first connects to a network, it is automatically allocated an address by a DHCP server from a predefined pool. Some network devices, such as home broadband routers, provide a DHCP server out of the box; otherwise, a DHCP server must be set up by a network administrator. DHCP is widely deployed, and it is particularly useful for networks where different devices may frequently connect and disconnect, such as public Wi-Fi hotspots or mobile networks.
IP addresses on the Internet
The Internet is a huge IP network, and every device that sends data over it is assigned an IP address.
The IP address space is managed by an organization called the Internet Assigned Numbers Authority (IANA). IANA decides the global allocation of the IP address ranges and assigns blocks of addresses to Regional Internet Registries (RIRs) worldwide, who then allocate address blocks to countries and organizations. The receiving organizations have the freedom to allocate the addresses from their assigned blocks as they like within their own networks.
There are some special IP address ranges. IANA has defined ranges of private addresses. These ranges will never be assigned to any organization, and as such these are available for anyone to use for their networks. The private address ranges are as follows:
- 10.0.0.0 to 10.255.255.255
- 172.16.0.0 to 172.31.255.255
- 192.168.0.0 to 192.168.255.255
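As a rough sketch, we can test whether an address falls inside one of these three ranges with Python's standard ipaddress module. The ranges are written here in the slash (prefix) notation that is explained later in this chapter, and the names PRIVATE_BLOCKS and in_private_range are our own. We check the three blocks explicitly rather than relying on the module's is_private attribute, because that attribute also covers other special-purpose ranges.

```python
import ipaddress

# The three private ranges listed above, written as address blocks.
PRIVATE_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),      # 10.0.0.0 to 10.255.255.255
    ipaddress.ip_network("172.16.0.0/12"),   # 172.16.0.0 to 172.31.255.255
    ipaddress.ip_network("192.168.0.0/16"),  # 192.168.0.0 to 192.168.255.255
]

def in_private_range(address):
    """Return True if address lies in one of the three private ranges.

    in_private_range is a hypothetical helper written for this example.
    """
    addr = ipaddress.ip_address(address)
    return any(addr in block for block in PRIVATE_BLOCKS)

for candidate in ("192.168.0.4", "172.31.255.255", "172.32.0.0", "203.0.113.12"):
    print(candidate, in_private_range(candidate))
```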
You may be thinking that if anybody can use them, then wouldn't that mean that devices on the Internet will end up using the same addresses, thereby breaking IP's unique addressing property? This is a good question, and this problem has been avoided by forbidding traffic from private addresses from being routed over the public Internet. Wherever a network using private addresses needs to communicate with the public Internet, a technique called Network Address Translation (NAT) is used, which essentially makes the traffic from the private network appear to be coming from a single valid public Internet address, and this effectively hides the private addresses from the Internet. We'll discuss NAT later on.
If you inspect the output of ip addr or ipconfig /all on your home network, then you will find that your devices are using private range addresses, which would have been assigned to them by your broadband router through DHCP.
Packets
We'll be talking about network traffic in the following sections, so let's get an idea of what it is.
Many protocols, including the principal protocols in the Internet protocol suite, employ a technique called packetization to help manage data while it's being transmitted across a network.
When a packetizing protocol is given some data to transmit, it breaks it up into small units (sequences of bytes, typically a few thousand bytes long) and then prefixes each unit with some protocol-specific information. The prefix is called a header, and the prefix and data together form a packet. The data within a packet is often called its payload.
What a packet contains is shown in the following figure:
Some protocols use alternative terms for packets, such as frames, but we'll stick with the term packets for now. The header includes all the information that the protocol implementation running on another device needs to be able to interpret what the packet is and how to handle it. For example, the information in an IP packet header includes the source IP address, the destination IP address, the total length of the packet, and the checksum of the data in the header.
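To illustrate the idea, here is a toy packetizer. It is only a sketch: the two-field header (a sequence number and a payload length) is invented for this example and does not correspond to any real protocol, and the tiny payload size is chosen so that the output stays readable.

```python
import struct

# Hypothetical header layout: 16-bit sequence number, 16-bit payload length,
# both in network (big-endian) byte order.
HEADER = struct.Struct("!HH")

def packetize(data, max_payload=4):
    """Split data into packets, each made of a header followed by a payload."""
    packets = []
    for seq, start in enumerate(range(0, len(data), max_payload)):
        payload = data[start:start + max_payload]
        packets.append(HEADER.pack(seq, len(payload)) + payload)
    return packets

def reassemble(packets):
    """Rebuild the original data, using the headers to restore the order."""
    chunks = {}
    for packet in packets:
        seq, length = HEADER.unpack(packet[:HEADER.size])
        chunks[seq] = packet[HEADER.size:HEADER.size + length]
    return b"".join(chunks[seq] for seq in sorted(chunks))

packets = packetize(b"hello, network")
print(packets)
# The header carries enough information to cope with out-of-order arrival.
print(reassemble(list(reversed(packets))))
```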
Once created, the packets are sent onto the network, where they are independently routed to their destination. Sending the data in packets has several advantages, including multiplexing (where more than one device can send data over the network at once), rapid notification of errors that may occur on the network, congestion control, and dynamic re-routing.
Protocols may call upon other protocols to handle their packets for them, passing their packets to the second protocol for delivery. When both the protocols employ packetization, nested packets result, as shown in the following figure:
This is called encapsulation, and as we'll see shortly, it is a powerful mechanism for structuring network traffic.
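The nesting can be sketched in a few lines. Both header layouts here are made up for illustration (a pair of port numbers for the inner packet, a pair of 4-byte addresses for the outer one); real protocol headers carry many more fields.

```python
import struct

def wrap(fmt, header_fields, payload):
    """Prefix payload with a header packed according to fmt (hypothetical helper)."""
    return struct.pack(fmt, *header_fields) + payload

message = b"GET /"

# Inner packet: an invented transport header holding source and destination ports.
inner = wrap("!HH", (50000, 80), message)

# Outer packet: an invented network header holding two 4-byte addresses.
# The whole inner packet, header included, becomes the outer payload.
source = bytes([192, 168, 0, 4])
destination = bytes([203, 0, 113, 12])
outer = wrap("!4s4s", (source, destination), inner)

# The receiver peels the layers off in the reverse order.
unwrapped_inner = outer[struct.calcsize("!4s4s"):]
unwrapped_message = unwrapped_inner[struct.calcsize("!HH"):]
print(unwrapped_message)
```

Note that the outer protocol never needs to understand the inner header; to it, the inner packet is just bytes to deliver.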
Networks
A network is a discrete collection of connected network devices. Networks can vary greatly in scale, and they can be made up of smaller networks. Your network-connected devices at home or the network-connected computers in a large office building are examples of networks.
There are quite a few ways of defining a network, some loose, some very specific. Depending on the context, networks can be defined by physical boundaries, administrative boundaries, institutional boundaries, or network technology boundaries.
For this section, we're going to start with a simplified definition of a network, and then work toward a more specific definition, in the form of IP subnets.
So for our simplified definition, our common defining feature of a network will be that all devices on the network share a single point of connection to the rest of the Internet. In some large or specialized networks, you will find that there is more than one point of connection, but for the sake of simplicity we'll stick to a single connection here.
This connection point is called a gateway, and usually it takes the form of a special network device called a router. The job of a router is to direct traffic between networks. It sits between two or more networks and is said to sit at the boundary of these networks. It always has two or more network interfaces; one for each network it is attached to. A router contains a set of rules called a routing table, which tells it how to direct the packets that are passing through it onwards, based on the packets' destination IP addresses.
The gateway forwards the packets to another router, which is said to be upstream, and is usually located at the network's Internet Service Provider (ISP). The ISP's router falls into a second category of routers, that is, it sits outside the networks described earlier, and routes traffic between network gateways. These routers are run by ISPs and other communications entities. They are generally arranged in tiers, and the upper regional tiers route the traffic for some large sections of countries or continents and form the Internet's backbone.
Because these routers can sit between many networks, their routing tables can become very extensive and they need to be updated continuously. A simplified illustration is shown in the following diagram:
The preceding diagram gives us an idea of the arrangement. Each ISP gateway connects an ISP network to the regional routers, and each home broadband router has a home network connected to it. In the real world, this arrangement gets more complicated as one goes toward the top. ISPs will often have more than one gateway connecting them to the regional routers, and some of these will also themselves be acting as regional routers. Regional routers also have more tiers than shown here, and they have many connections between one another, which are in arrangements that are much more complicated than this simple hierarchy. A rendering of a section of the Internet from data gathered in 2005 provides a beautiful illustration of just how complex this becomes. It can be found at http://en.wikipedia.org/wiki/Internet_backbone#/media/File:Internet_map_1024.jpg.
Routing with IP
We mentioned that routers are able to route traffic toward a destination network, and implied that this is somehow done by using IP addresses and routing tables. But what's really going on here?
One perhaps obvious method for routers to determine the correct router to forward traffic to would be to program every router's routing table with a route for every IP address. However, in practice, with 4 billion plus IP addresses and constantly changing network routes, this turns out to be a completely infeasible method.
So, how is routing done? The answer lies in another property of IP addresses. An IP address can be interpreted as being made up of two logical parts: a network prefix and a host identifier. The network prefix uniquely identifies the network a device is on, and the device can use this to determine how to handle traffic that it generates, or receives for forwarding. The network prefix is the first n bits of the IP address when it's written out in binary (remember an IP address is really just a 32-bit number). The n bits are supplied by the network administrator as a part of a device's network configuration at the same time that it is given its IP address.
You'll see that n is written in one of two ways. It can simply be appended to the IP address, separated by a slash, as follows:
192.168.0.186/24
This is called CIDR notation. Alternatively, it can be written as a subnet mask, which is sometimes just called a netmask. This is the way in which you will usually see n being specified in a device's network configuration. A subnet mask is a 32-bit number written in dot-decimal notation, just like an IP address.
255.255.255.0
This subnet mask is equivalent to /24. We get n from it by looking at it in binary. A few examples are as follows:
255.0.0.0       = 11111111 00000000 00000000 00000000 = /8
255.192.0.0     = 11111111 11000000 00000000 00000000 = /10
255.255.255.0   = 11111111 11111111 11111111 00000000 = /24
255.255.255.240 = 11111111 11111111 11111111 11110000 = /28
n is simply the number of 1 bits in the subnet mask. (It's always the leftmost bits that are set to 1 because this allows us to quickly get the network prefix in binary by doing a bitwise AND operation on the IP address and the subnet mask.)
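Both operations are short enough to try out directly. The following sketch uses the standard ipaddress module only to convert between dot-decimal strings and integers; the function names are our own, and prefix_length assumes it is given a well-formed mask (it does not check that the 1 bits are contiguous).

```python
import ipaddress

def prefix_length(mask):
    """Count the 1 bits in a dot-decimal subnet mask to obtain n."""
    return bin(int(ipaddress.IPv4Address(mask))).count("1")

def network_prefix(address, mask):
    """Bitwise AND of an address and a mask, returned in dot-decimal notation."""
    addr_int = int(ipaddress.IPv4Address(address))
    mask_int = int(ipaddress.IPv4Address(mask))
    return str(ipaddress.IPv4Address(addr_int & mask_int))

print(prefix_length("255.255.255.240"))                      # 28
print(network_prefix("192.168.0.186", "255.255.255.0"))      # 192.168.0.0
print(network_prefix("192.168.0.186", "255.255.255.240"))    # 192.168.0.176
```

The last line shows that the same address belongs to a different, smaller network when a longer prefix is used.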
So, how does this help in routing? When a network device generates network traffic that needs to be sent across a network, it first compares the destination's IP address with its own network prefix. If the destination IP address has the same network prefix as that of the sending device, then the sending device will recognise that the destination device is on the same network and, therefore, it can then send the traffic directly to it. If the network prefixes differ, then it will send the message to its default gateway, which will forward it on towards the receiving device.
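The sending device's decision can be sketched as follows. This is a simplification of what an operating system does, and the function name next_hop and the gateway address 192.168.0.1 are assumptions made for the example.

```python
import ipaddress

def next_hop(own_interface, destination, default_gateway):
    """Decide where a sending device hands its traffic.

    own_interface is the sender's address with its prefix, for example
    "192.168.0.186/24". Returns the destination itself when it shares the
    sender's network prefix, and the default gateway otherwise.
    """
    interface = ipaddress.ip_interface(own_interface)
    dest = ipaddress.ip_address(destination)
    if dest in interface.network:
        return destination       # same network: deliver directly
    return default_gateway       # different network: let the gateway forward it

print(next_hop("192.168.0.186/24", "192.168.0.4", "192.168.0.1"))
print(next_hop("192.168.0.186/24", "203.0.113.12", "192.168.0.1"))
```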
When a router receives traffic that has to be forwarded, it first checks whether the destination IP address matches the network prefix of any of the networks that it's connected to. If that is the case, then it will send the message directly to the destination device on that network. If not, it will consult its routing table. If it finds a matching rule, then it sends the message to the router that it found listed, and if there are no explicit rules defined, then it will send the traffic to its own default gateway.
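A router's version of the same decision might look like the sketch below. All of the addresses and the table contents are invented for illustration. One detail goes beyond the description above and should be treated as our assumption: when several routing table rules match, the sketch picks the one with the longest prefix, which is the usual convention.

```python
import ipaddress

def forward(destination, connected, routing_table, default_gateway):
    """Return ("direct", address) or ("via", router) for a destination.

    connected is a list of directly attached networks; routing_table maps a
    network prefix to the router that handles it. Both are hypothetical.
    """
    dest = ipaddress.ip_address(destination)

    # 1. Directly connected network: deliver straight to the destination.
    for network in connected:
        if dest in ipaddress.ip_network(network):
            return ("direct", destination)

    # 2. Routing table: collect matching rules, prefer the most specific one
    #    (longest prefix; an assumption, see the note above).
    matches = [
        (ipaddress.ip_network(prefix), router)
        for prefix, router in routing_table.items()
        if dest in ipaddress.ip_network(prefix)
    ]
    if matches:
        _network, router = max(matches, key=lambda match: match[0].prefixlen)
        return ("via", router)

    # 3. No explicit rule: fall back to the router's own default gateway.
    return ("via", default_gateway)

connected = ["192.168.0.0/24", "192.168.1.0/24"]
routing_table = {
    "172.16.0.0/12": "192.168.1.254",
    "172.16.5.0/24": "192.168.1.253",
}
for target in ("192.168.1.7", "172.16.5.9", "172.20.0.1", "203.0.113.12"):
    print(target, forward(target, connected, routing_table, "192.168.0.1"))
```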
When we create a network with a given network prefix, the bits to the right of the network prefix in the 32 bits of the IP address are available for assignment to the network devices. We can calculate the number of available addresses by raising 2 to the power of the number of available bits. For example, in a /28 network prefix, we have 4 bits left, which means that 16 addresses are available. In reality, we are able to assign fewer addresses, since two of the addresses in the calculated range are always reserved. These are the first address in the range, which is called the network address, and the last address in the range, which is called the broadcast address.
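We can check this arithmetic with the standard ipaddress module. The /28 network used here, 192.168.0.176/28, is just an example chosen from the private ranges.

```python
import ipaddress

subnet = ipaddress.ip_network("192.168.0.176/28")

host_bits = 32 - subnet.prefixlen   # bits to the right of the prefix
total = 2 ** host_bits              # size of the whole range
hosts = list(subnet.hosts())        # excludes network and broadcast addresses

print(host_bits, total, subnet.num_addresses)
print("network address:  ", subnet.network_address)
print("broadcast address:", subnet.broadcast_address)
print("assignable:", len(hosts), "from", hosts[0], "to", hosts[-1])
```

So a /28 gives us 16 addresses in total, of which 14 can be assigned to devices.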
This range of addresses, which is identified by its network prefix, is called a subnet. Subnets are the basic unit of assignment when IANA, an RIR or an ISP allocates IP address blocks to organizations. Organizations assign subnets to their various networks.
Organizations can further partition their addresses into subnets simply by employing a longer network prefix than the one they had been assigned. They might do this either to make more efficient use of their addresses or to create a hierarchy of networks, which can be delegated across the organization.
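As a small sketch of this kind of partitioning, suppose an organization has been assigned the block 203.0.113.0/24 (an address range reserved for documentation, used here purely as an example) and wants four equally sized networks. Lengthening the prefix by 2 bits, from /24 to /26, does the job:

```python
import ipaddress

assigned = ipaddress.ip_network("203.0.113.0/24")

# Each extra prefix bit halves the subnet size: /24 -> /26 yields 4 subnets.
departments = list(assigned.subnets(new_prefix=26))

for subnet in departments:
    print(subnet, "holds", subnet.num_addresses, "addresses")
```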
DNS
We've discussed connecting to network devices by using IP addresses. However, unless you work with networks or in systems administration, it is unlikely that you will get to see an IP address very often, even though many of us use the Internet every day. When we browse the web or send an e-mail, we usually connect to servers using host names or domain names. These must somehow map to the servers' IP addresses. But how is this done?
Documented as RFC 1035, the Domain Name System (DNS) is a globally distributed database of mappings between hostnames and IP addresses. It is an open and hierarchical system with many organizations choosing to run their own DNS servers. DNS is also a protocol, which devices use to query DNS servers for resolving hostnames to IP addresses (and vice versa).
The nslookup tool comes with most Linux and Windows systems and it lets us query DNS on the command line, as follows:
$ nslookup python.org
Server:         192.168.0.4
Address:        192.168.0.4#53

Non-authoritative answer:
Name:   python.org
Address: 104.130.43.121
Here, we determined that the python.org host has the IP address 104.130.43.121. DNS distributes the work of looking up hostnames by using a hierarchical system of caching servers. When connecting to a network, your network device will be given a local DNS server, either through DHCP or by manual configuration, and it will query this local server when doing DNS lookups. If that server doesn't know the IP address, then it will query its own configured higher tier server, and so on until an answer can be found. ISPs run their own DNS caching servers, and broadband routers often act as caching servers as well. In this example, my device's local server is 192.168.0.4.
A device's operating system usually handles DNS, and it provides a programming interface, which applications use to ask it to resolve hostnames and IP addresses. Python provides an interface for this, which we'll discuss in Chapter 6, IP and DNS.
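As a brief preview, the following sketch asks the operating system's resolver for the IPv4 addresses of a name through the standard socket module. The function name resolve_ipv4 is our own. Looking up a real hostname needs a working network connection and the answer will vary, so the call that is actually run here uses an address literal, which the resolver simply hands back.

```python
import socket

def resolve_ipv4(hostname):
    """Return the distinct IPv4 addresses the OS resolver reports for hostname."""
    results = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each result is (family, type, proto, canonname, sockaddr); for IPv4 the
    # sockaddr is an (address, port) pair.
    return sorted({sockaddr[0] for *_ignored, sockaddr in results})

print(resolve_ipv4("127.0.0.1"))
# With network access, something like resolve_ipv4("python.org") would query
# DNS through the locally configured server.
```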
The protocol stack or why the Internet is like a cake
The Internet Protocol is a member of the set of protocols that make up the Internet protocol suite. Each protocol in the suite has been designed to solve specific problems in networking. We just saw how IP solves the problems of addressing and routing.
The core protocols in the suite are designed to work together within a stack. That is, each protocol occupies a layer within the stack, and the other protocols are situated above and below that layer. So, it is layered just like a cake. Each layer provides a specific service to the layers above it, while hiding the complexity of its own operation from them, following the principle of encapsulation. Ideally, each layer only interfaces with the layer below it in order to benefit from the entire range of the problem solving powers of all the layers below.
Python provides modules for interfacing with different protocols. As the protocols employ encapsulation, we typically only need to work with one module to leverage the power of the underlying stack, thus avoiding the complexity of the lower layers.
The TCP/IP suite defines four layers, although five layers are often used for clarity. These are given in the following table:
Layer | Name | Example protocols |
---|---|---|
5 | Application layer | HTTP, SMTP, IMAP |
4 | Transport layer | TCP, UDP |
3 | Network layer | IP |
2 | Data-link layer | Ethernet, PPP, FDDI |
1 | Physical layer | - |
Layers 1 and 2 correspond to the first layer of the TCP/IP suite. These two bottom layers deal with the low level network infrastructure and services.
Layer 1 corresponds to the physical media of the network, such as a cable or a Wi-Fi radio. Layer 2 provides the service of getting the data from one network device to another, directly connected network device. This layer can employ all sorts of layer 2 protocols, such as Ethernet or PPP, as long as the Internet Protocol in layer 3 can ask it to get the data to the next device in the network by using any type of available physical medium.
We don't need to concern ourselves with the two lowest layers, since we will rarely need to interface with them when using Python. Their operation is almost always handled by the operating system and the network hardware.
Layer 3 is variously called the Network layer and the Internet layer. It exclusively employs the Internet Protocol. As we have already seen, it has been tasked primarily with internetwork addressing and routing. Again, we don't typically directly interface with this layer in Python.
Layers 4 and 5 are more interesting for our purposes.
Layer 4 – TCP and UDP
Layer 4 is the first layer that we may want to work with in Python. This layer can employ one of two protocols: the Transmission Control Protocol (TCP) and the User Datagram Protocol (UDP). Both of these provide the common service of end-to-end transportation of data between applications on different network devices.
Network ports
Although IP facilitates the transport of data from one network device to another, it doesn't provide us with a way of letting the destination device know what it should do with the data once it receives it. One possible solution to this would be to program every process running on the destination device to check all of the incoming data to see if they are interested in it, but this would quickly lead to obvious performance and security problems.
TCP and UDP provide the answer by introducing the concept of ports. A port is an endpoint, which is attached to one of the IP addresses assigned to the network device. Ports are claimed by a process running on the device, and the process is then said to be listening on that port. Ports are represented by a 16-bit number, so that each IP address on a device has 65,535 possible ports that the processes can claim (port number 0 is reserved). Ports can only be claimed by one process at a time, even though a process can claim more than one port at a time.
When a message is sent over the network through TCP or UDP, the sending application sets the destination port number in the header of the TCP or UDP packet. When the message arrives at the destination, the TCP or UDP protocol implementation running on the receiving device reads the port number and then delivers the message payload to the process that is listening on that port.
Port numbers need to be known before the messages are sent. The main mechanism for this is convention. In addition to managing the IP address space, it is also the responsibility of IANA to manage the assignment of port numbers to network services.
A service is a class of application, for example a web server, or a DNS server, which is usually tied to an application protocol. Ports are assigned to services rather than specific applications, because it gives service providers the flexibility to choose what kind of software they want to use to provide a service, without having to worry about the users who would need to look up and connect to a new port number simply because the server has started using Apache instead of IIS, for example.
Most operating systems contain a copy of this list of services and their assigned port numbers. On Linux, this is usually found at /etc/services, and on Windows this is usually found at c:\windows\system32\drivers\etc\services. The complete list can also be viewed online at http://www.iana.org/assignments/port-numbers.
TCP and UDP packet headers may also include a source port number. This is optional for UDP, but mandatory for TCP. The source port number tells the receiving application on the server where it should send replies to when sending data back to the client. Applications can specify the source port that they wish to use, or if a source port has not been specified for TCP, then one is assigned randomly by the operating system when the packet is sent. Once the OS has a source port number, it assigns it to the calling application and starts listening on it for a reply. If a reply is received on that port, then the received data is passed to the sending application.
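The following sketch shows ports in action using two UDP sockets on the loopback interface, so nothing leaves the machine. The roles "server" and "client" and the message contents are ours. Binding to port 0 asks the operating system to choose any free port, which keeps the example from clashing with ports that are already in use.

```python
import socket

# The "server" claims a port on the loopback interface and listens on it.
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))          # port 0: let the OS pick a free port
server.settimeout(2)
server_port = server.getsockname()[1]

# The "client" never chooses a source port; the OS assigns one when it sends.
client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.settimeout(2)
client.sendto(b"ping", ("127.0.0.1", server_port))
client_port = client.getsockname()[1]

# The server reads the source address and port from the incoming packet and
# uses them to send its reply back to the right process.
data, (source_ip, source_port) = server.recvfrom(1024)
server.sendto(b"pong", (source_ip, source_port))
reply, _sender = client.recvfrom(1024)

print("server port:", server_port, "client source port:", client_port)
print(data, reply)

client.close()
server.close()
```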
So, both TCP and UDP provide an end-to-end transport for the application data through the provision of ports, and both of them employ the Internet Protocol to get the data to the destination device. Now, let's look at their features.
UDP
UDP is documented as RFC 768. It is deliberately uncomplicated: it provides no services other than those that we described in the previous section. It just takes the data that we want to send, packetizes it with the destination port number (and optional source port number), and hands it off to the local Internet Protocol implementation for delivery. Applications on the receiving end see the data in the same discrete chunks in which it was packetized.
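The point about discrete chunks can be seen with a short loopback experiment. This is a sketch with made-up message contents; on the loopback interface both datagrams can be expected to arrive, which is not something UDP promises on a real network.

```python
import socket

receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))        # port 0: let the OS pick a free port
receiver.settimeout(2)
address = receiver.getsockname()

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"first message", address)
sender.sendto(b"second", address)

# Each recvfrom() call returns exactly one datagram, never a blend of two,
# even though the 4096-byte buffer could easily hold both.
chunks = [receiver.recvfrom(4096)[0] for _ in range(2)]
print(chunks)

sender.close()
receiver.close()
```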
Both IP and UDP are what are called connectionless protocols. This means that they attempt to deliver their packets on a best effort basis, but if something goes wrong, then they will just shrug their metaphorical shoulders and move on to delivering the next packet. There is no guarantee that our packets will reach their destinations, and no error notification if a delivery fails. If the packets do make it, then there is no guarantee that they will do so in the same order as they were sent. It's up to a higher layer protocol or the sending application to determine if the packets have arrived and whether to handle any problems. These are protocols in the fire-and-forget style.
The typical applications of UDP are internet telephony and video streaming. DNS queries are also transported using UDP.
We'll now look at UDP's more dependable sibling, TCP, and then discuss the differences, and why applications may choose to use one or the other.
TCP
The Transmission Control Protocol is documented as RFC 793. As opposed to UDP, TCP is a connection-based protocol. In such a protocol, no data is sent until the server and the client have performed an initial exchange of control packets. This exchange is called a handshake. This establishes a connection, and from then on data can be sent. The receiving party acknowledges each data packet that it receives by sending back a packet called an ACK. As such, TCP always requires that the packets include a source port number, because it depends on the continual two-way exchange of messages.
From an application's point of view, the key difference between UDP and TCP is that the application no longer sees the data in discrete chunks; the TCP connection presents the data to the application as a continuous, seamless stream of bytes. This makes things much simpler if we are sending messages that are larger than a typical packet. However, it means that we need to start thinking about framing our messages. While with UDP we can rely on its packetization to provide a means of doing this, with TCP we must decide on a mechanism for unambiguously determining where our messages start and end. We'll see more about this in Chapter 8, Client and Server Applications.
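One common framing mechanism, shown here only as a sketch, is to prefix every message with its length. The 4-byte prefix and the function names are our own choices rather than anything that TCP prescribes, and the example works on plain byte strings to stand in for data arriving from a TCP connection in arbitrary pieces.

```python
import struct

# Our own convention: a 4-byte unsigned length in network byte order.
LENGTH = struct.Struct("!I")

def frame(message):
    """Prefix a message with its length so the receiver can find its end."""
    return LENGTH.pack(len(message)) + message

def unframe(buffer):
    """Split complete messages off the front of buffer.

    Returns (messages, remaining_bytes); the remainder is kept until more
    of the stream arrives.
    """
    messages = []
    while len(buffer) >= LENGTH.size:
        (length,) = LENGTH.unpack(buffer[:LENGTH.size])
        end = LENGTH.size + length
        if len(buffer) < end:
            break                      # message body not fully received yet
        messages.append(buffer[LENGTH.size:end])
        buffer = buffer[end:]
    return messages, buffer

stream = frame(b"hello") + frame(b"world!")

# Simulate the stream arriving split at an arbitrary point (byte 12), which
# falls in the middle of the second message's length prefix.
first, rest = unframe(stream[:12])
second, leftover = unframe(rest + stream[12:])
print(first, second, leftover)
```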
TCP provides the following services:
- In-order delivery
- Receipt acknowledgment
- Error detection
- Flow and congestion control
Data sent through TCP is guaranteed to get delivered to the receiving application in the order that it was sent in. The receiving TCP implementation buffers the received packets on the receiving device and then waits until it can deliver them in the correct order before passing them to the application.
Because the data packets are acknowledged, sending applications can be sure that the data is arriving and that it is okay to continue sending the data. If an ACK is not received for a sent packet within a set time period, then the packet will be resent. If there's still no response, then TCP will keep resending the packet at increasing intervals, until a second, longer timeout period expires. At this point, it will give up and notify the calling application that it has encountered a problem.
The TCP header includes a checksum of the header data and the payload. This allows the receiver to verify whether a packet's contents have been modified during the transmission.
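The checksum algorithm that TCP uses is the 16-bit ones' complement sum described in RFC 1071. The sketch below implements that sum over an arbitrary byte string; it is a simplification, because a real TCP checksum is also computed over a pseudo-header containing fields such as the source and destination IP addresses, which we leave out. The sample bytes are the ones used in RFC 1071's worked example.

```python
def internet_checksum(data):
    """Return the 16-bit ones' complement of the ones' complement sum of data."""
    if len(data) % 2:
        data += b"\x00"                # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]   # add each 16-bit word
    while total >> 16:
        # Fold any carry out of the top back into the low 16 bits.
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

payload = bytes.fromhex("0001f203f4f5f6f7")
checksum = internet_checksum(payload)
print(hex(checksum))

# The receiver recomputes over the data plus the checksum field; an intact
# packet yields 0, while a flipped bit yields something else.
intact = internet_checksum(payload + checksum.to_bytes(2, "big")) == 0
corrupted = bytes([payload[0] ^ 0x01]) + payload[1:]
damaged = internet_checksum(corrupted + checksum.to_bytes(2, "big")) != 0
print(intact, damaged)
```

A checksum like this catches accidental corruption, but it is not a defence against deliberate tampering, since anyone who alters the data can recompute it.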
TCP also includes algorithms which ensure that traffic is not sent too quickly for the receiving device to process, and these algorithms also infer network conditions and regulate the transmission rate to avoid network congestion.
Together these services provide a robust and reliable transport system for application data. This is one of the reasons many popular higher level protocols, such as HTTP, SMTP, SSH, and IMAP, depend on TCP.
UDP versus TCP
Given the features of TCP, you may be wondering what the use of a connectionless protocol like UDP is. Well, the Internet is still a pretty reliable network, and most of the packets do get delivered. The connectionless protocols are useful where the minimum transfer overhead is required, and where the occasional dropped packet is not a big deal. TCP's reliability and congestion control come at the cost of needing additional packets and round-trips, and the introduction of deliberate delays when packets are lost in order to prevent congestion. These can drastically increase latency, which is the arch-nemesis of real-time services, while not providing any real benefit for them. A few dropped packets might result in a transient glitch or a drop in signal quality in a media stream, but as long as the packets keep coming, the stream can usually recover.
UDP is also the main protocol that is used for DNS, which is interesting because most DNS queries fit inside a single packet, so TCP's streaming abilities aren't generally needed. DNS is also usually configured such that it does not depend upon a reliable connection. Most devices are configured with multiple DNS servers, and it's usually quicker to resend a query to a second server after a short timeout rather than wait for a TCP back-off period to expire.
The choice between UDP and TCP comes down to the message size, whether latency is an issue, and how much of TCP's functionality the application wants to perform itself.
Layer 5 – The application layer
Finally, we come to the top of the stack. The application layer is deliberately left open in the Internet protocol suite, and it's really a catch-all for any protocol that is developed by application developers on top of TCP or UDP (or even IP, though these are rarer). Application layer protocols include HTTP, SMTP, IMAP, DNS, and FTP.
Protocols may even become their own layers, where an application protocol is built on top of another application protocol. An example of this is the Simple Object Access Protocol (SOAP), which defines an XML-based protocol that can be used over almost any transport, including HTTP and SMTP.
Python has standard library modules for many application layer protocols and third-party modules for many more. If we write low-level server applications, then we will be more likely to be interested in TCP and UDP, but if not, then application layer protocols are the ones we'll be working with, and we'll be looking at some of them in detail over the next few chapters.
On to Python!
Well, that's it for our rundown of the TCP/IP stack. We'll move on to the next section of this chapter, where we'll look at how to start using Python and how to work with some of the topics we've just covered.