Analyzing enterprise application behavior with Wireshark 2

One of the important things that you can use Wireshark for is application analysis and troubleshooting. When an application slows down, it can be due to the LAN (quite uncommon in a wired LAN), the WAN service (common, due to insufficient bandwidth or high delay), or slow servers or clients. It can also be due to a slow or problematic application.

The purpose of this article is to get into the details of how applications work, and provide relevant guidelines and recipes for isolating and solving these problems. In the first recipe, we will learn how to find out and categorize applications that work over our network. Then, we will go through various types of applications to see how they work, how networks influence their behavior, and what can go wrong.

Further, we will learn how to use Wireshark in order to resolve and troubleshoot common applications that are used in an enterprise network. These are Microsoft Terminal Server and Citrix, databases, and Simple Network Management Protocol (SNMP).

This is an excerpt from Network Analysis using Wireshark 2 Cookbook - Second Edition written by Nagendra Kumar Nainar, Yogesh Ramdoss, Yoram Orzach.

Find out what is running over your network

The first thing to do when monitoring a new network is to find out what is running over it. There are various types of applications and network protocols, and they can influence and interfere with each other when all of them are running over the network.

In some cases, you will have different VLANs, different Virtual Routing and Forwarding instances (VRFs), or servers that are connected to virtual ports in a blade server. Ultimately, everything is running on the same infrastructure, and they can influence each other.

VRFs and VLANs are often confused. Their purpose is similar, but they are configured in different places. VLANs are configured in the LAN to provide network separation at OSI layer 2, while VRFs are multiple routing table instances that coexist in the same router. This is a layer 3 operation that separates different customers' networks. VRFs are generally seen in service provider environments that use Multi-Protocol Label Switching (MPLS) to provide layer 3 connectivity to different customers over the same router network, in such a way that no customer can see any other customer's network.
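To make the idea of multiple routing tables concrete, here is a minimal sketch of per-VRF lookups with longest-prefix matching. The VRF names, prefixes, and next-hop labels are all made up for illustration; a real router does this in its forwarding plane, not in Python.

```python
import ipaddress

# Hypothetical VRFs: both customers use the same 10.1.0.0/16 range,
# yet each lookup only ever sees its own routing table.
vrfs = {
    "customer_a": {"10.1.0.0/16": "to-site-A1", "10.1.2.0/24": "to-site-A2"},
    "customer_b": {"10.1.0.0/16": "to-site-B1"},
}

def lookup(vrf_name, destination):
    """Return the next hop for destination inside one VRF (longest prefix wins)."""
    addr = ipaddress.ip_address(destination)
    best = None
    for prefix, next_hop in vrfs[vrf_name].items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return best[1] if best else None
```

The same destination address resolves differently depending on the VRF it is looked up in, which is why overlapping customer address ranges do not collide.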

In this recipe, we will see how to get to the details of what is running over the network, and the applications that can slow it down.

The term blade server refers to a server enclosure, which is a chassis with server shelves on the front and LAN switches on the back. Vendors use different names for it; for example, IBM calls it BladeCenter and HP calls it BladeSystem.

Getting ready

When you get into a new network, the first thing to do is connect Wireshark and sniff the applications and protocols that are running over it. Make sure you follow these points:

When you are required to monitor a server, port-mirror it and see what is running on its connection to the network.

When you are required to monitor a remote office, port-mirror the router port that connects you to the WAN connection. Then, check what is running over it.

When you are required to monitor a slow connection to the internet, port-mirror it to see what is going on there.

In this recipe, we will see how to use the Wireshark tools for analyzing what is running and what can cause problems.

How to do it...

For analyzing, follow these steps:

Connect Wireshark using one of the options mentioned in the previous section.

You can use the following tools:
- Navigate to Statistics | Protocol Hierarchy to view the protocols that run over the network and the percentage of the total traffic
- Navigate to Statistics | Conversations to see who is talking and what protocols are used

In the Protocol Hierarchy feature, you will get a window that will help you analyze which protocols are running over the network. It is shown in the following screenshot:

analyzing-enterprise-application-behavior-with-wireshark-2-img-0

In the preceding screenshot, you can see the protocol distribution:

Ethernet: IP, Logical-Link Control (LLC), and configuration test protocol (loopback)

Internet Protocol Version 4: UDP, TCP, Protocol Independent Multicast (PIM), Internet Group Management Protocol (IGMP), and Generic Routing Encapsulation (GRE)

If you click on the + sign, all the underlying protocols will be shown.

To see a specific protocol throughput, click down to the protocols as shown in the following screenshot. You will see the application average throughput during the capture (HTTP in this example):

analyzing-enterprise-application-behavior-with-wireshark-2-img-1

Clicking on the + sign to the left of HTTP will open a list of protocols that run over HTTP (XML, MIME, JavaScript, and more) and their average throughput during the capture period.
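The percentages and average throughput shown in Protocol Hierarchy boil down to simple arithmetic. The following is a rough sketch of that arithmetic only, assuming you have exported a list of (protocol, frame size) pairs; the function name and sample data are hypothetical. It counts each packet under a single protocol, whereas Wireshark counts a packet under every layer it contains.

```python
from collections import defaultdict

def protocol_shares(packets, capture_seconds):
    """packets: iterable of (protocol_name, frame_bytes) pairs."""
    totals = defaultdict(int)
    for proto, size in packets:
        totals[proto] += size
    grand_total = sum(totals.values())
    report = {}
    for proto, byte_count in totals.items():
        report[proto] = {
            # Share of all captured bytes that belong to this protocol
            "percent_bytes": 100.0 * byte_count / grand_total,
            # Average throughput over the whole capture period
            "avg_bits_per_sec": 8.0 * byte_count / capture_seconds,
        }
    return report

# Made-up sample: four packets captured over 10 seconds
sample = [("http", 1500), ("http", 1500), ("dns", 100), ("snmp", 900)]
report = protocol_shares(sample, capture_seconds=10)
```

Here HTTP accounts for 3,000 of the 4,000 captured bytes, so it is the first candidate to look at when the line is loaded.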

There's more...

In some cases (especially when you need to prepare management reports), you are required to provide a graphical picture of the network statistics. There are various sources available for this, for example:

Etherape (for Linux): http://etherape.sourceforge.net/

Compass (for Windows): http://download.cnet.com/Compass-Free/3000-2085_4-75447541.html?tag=mncol;1

Analyzing Microsoft Terminal Server and Citrix communications problems

Microsoft Terminal Server, which uses Remote Desktop Protocol (RDP), and Citrix MetaFrame, which uses the Independent Computing Architecture (ICA) protocol, are widely used for local and remote connectivity for PCs and thin clients. The important thing to remember about these types of applications is that they transfer screen changes over the network. If there are only a few changes, they will require low bandwidth. If there are many changes, they will require high bandwidth.

Another thing is that the traffic in these applications is highly asymmetric. Downstream traffic takes from tens of Kbps up to several Mbps, while the upstream traffic will be at most several Kbps. When working with these applications, don't forget to design your network according to this.

In this recipe, we will see some typical problems of these applications and how to locate them. For convenience, we will write Microsoft Terminal Server to mean all applications in this category, including Citrix MetaFrame.

Getting ready

When you suspect slow performance with Microsoft Terminal Server, first check with the user what the problem is. Then, connect Wireshark to the network with a port-mirror to the complaining client or to the server.

How to do it...

For locating a problem when Microsoft Terminal Server is involved, start with going to the users and asking questions. Follow these steps:

When users complain about a slow network, ask them a simple question: Do they see the slowness in the data presented on the screen or when they switch between windows?

If they say that the switch between windows is very fast, it is not a Microsoft Terminal Server problem. Microsoft Terminal Server problems will cause slow window changes, picture freezes, slow scrolling of graphical documents, and so on.

If they say that they are trying to generate a report (when the software is running over Microsoft Terminal Server), but the report is generated after a long period of time, this is a database problem and not Microsoft Terminal Server or Citrix.

When a user works with Microsoft Terminal Server over a high-delay communication line and types very fast, they might experience delays with the characters. This is because Microsoft Terminal Server is transferring window changes, and with high delays, these window changes will be transferred slowly.

When measuring the communication line with Wireshark:
- Use I/O graphs to monitor the line
- Use filters to monitor the upstream and the downstream directions
- Configure bits per second on the y axis

You will get the following screenshot:

analyzing-enterprise-application-behavior-with-wireshark-2-img-2

In the preceding screenshot, you can see a typical traffic pattern with high downstream and very low upstream traffic. Notice that the Y-Axis is configured to Bits/Tick, which equals bits per second when the tick interval is one second. Between 485s and 500s, you can see that the throughput reached the maximum. This is when applications will slow down and users will start to feel screen freezes, menus that move very slowly, and so on.
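What the I/O graph plots can be sketched in a few lines: bits per one-second bucket, split by direction, with a check for seconds where the line is close to full. This is only an illustration, assuming packets exported as (timestamp, source IP, frame size); the addresses, the 90 percent threshold, and the function names are assumptions, not Wireshark settings.

```python
from collections import defaultdict

def bits_per_second(packets, server_ip):
    """packets: iterable of (timestamp_seconds, src_ip, frame_bytes)."""
    down = defaultdict(int)  # server -> client (screen updates)
    up = defaultdict(int)    # client -> server (keyboard and mouse)
    for ts, src, size in packets:
        bucket = int(ts)     # one-second tick
        if src == server_ip:
            down[bucket] += size * 8
        else:
            up[bucket] += size * 8
    return dict(down), dict(up)

def saturated_seconds(series, line_bps, threshold=0.9):
    """Seconds in which throughput reached threshold * line capacity."""
    return sorted(sec for sec, bps in series.items() if bps >= threshold * line_bps)

# Made-up sample: the server is 10.0.0.1, the client is 10.0.0.20
packets = [
    (0.1, "10.0.0.1", 1000),
    (0.5, "10.0.0.1", 1000),
    (0.7, "10.0.0.20", 100),
    (1.2, "10.0.0.1", 250),
]
down, up = bits_per_second(packets, server_ip="10.0.0.1")
busy = saturated_seconds(down, line_bps=16000)
```

The seconds returned by `saturated_seconds` correspond to the flat top of the graph, where users start to feel freezes.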

When a Citrix ICA client connects to a presentation server, it uses TCP ports 2598 or 1494.
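The upstream and downstream filters for the I/O graph can be built from these port numbers. Below is a small helper that only assembles Wireshark display filter strings; the helper name and the IP addresses are hypothetical, and you would paste the resulting strings into the graph's filter fields yourself.

```python
def ica_direction_filters(client_ip, server_ip, ports=(1494, 2598)):
    """Return (downstream, upstream) display filter strings for one client."""
    port_expr = " || ".join(f"tcp.port == {p}" for p in ports)
    # Downstream: screen updates from the server to the client
    downstream = f"ip.src == {server_ip} && ip.dst == {client_ip} && ({port_expr})"
    # Upstream: keyboard and mouse events from the client to the server
    upstream = f"ip.src == {client_ip} && ip.dst == {server_ip} && ({port_expr})"
    return downstream, upstream

# Example addresses only
down_filter, up_filter = ica_direction_filters("10.0.0.20", "10.0.0.1")
```

For RDP, the same helper can be called with `ports=(3389,)`, which is the default RDP port.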

When monitoring Microsoft Terminal Server servers, don't forget that the clients access the server with Microsoft Terminal Server, and the server accesses the application with another client that is installed on the server. The performance problem can come from Microsoft Terminal Server or from the application.

If the problem is a Microsoft Terminal Server problem, it is necessary to figure out whether it is a network problem or a system problem:
- Check the network with Wireshark to see if there are any loads. Loads such as the one shown in the previous screenshot can be solved by simply increasing the bandwidth of the communication lines.
- Check the server's performance. Applications like Microsoft Terminal Server are mostly memory consuming, so check mostly for memory (RAM) issues.

How it works...

Microsoft Terminal Server, Citrix MetaFrame, and similar applications simply transfer window changes over the network. From your client (a PC with a software client or a thin client), you connect to the terminal server, and the terminal server runs various clients that are used to connect from it to other servers. In the following screenshot, you can see the principle of terminal server operation:

analyzing-enterprise-application-behavior-with-wireshark-2-img-3

There's more...

From the terminal server vendors, you will hear that their applications improve two things. They will say that it improves manageability of clients because you don't have to manage PCs and software for every user; you simply install everything on the server, and if something fails, you fix it on the server. They will also say that traffic over the network will be reduced.

Well, I will not get into the first argument, since it is not our subject, but I strongly reject the second one. When working with a terminal client, your traffic entirely depends on what you are doing:

When working with text/character-based applications, for example, some Enterprise Resource Planning (ERP) screens, you type in and read data. When working with the terminal client, you will connect to the terminal server that will connect to the database server. Depending on the database application you are working with, the terminal server can improve performance significantly or not improve it at all. We will discuss this in the database section. Here, you can expect a load of tens to hundreds of Kbps.

If you are working with regular office documents such as Word, PowerPoint, and so on, it entirely depends on what you are doing. Working with a simple Word document will require tens to hundreds of Kbps. Working with PowerPoint will require hundreds of Kbps to several Mbps, and when you present the PowerPoint file with full screen (the F5 function), the throughput can jump up to 8 to 10 Mbps.

Browsing the internet will take between hundreds of Kbps and several Mbps, depending on what you do over it. High-resolution movies over terminal server to the internet: well, just don't do it.
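The ranges above can be turned into a rough sizing check before you test for real. The per-activity numbers below are assumed values picked from inside the ranges mentioned in the text, not measurements, and the activity names and user mix are made up; replace them with figures from your own captures.

```python
# Assumed peak load per user in Kbps, chosen from the ranges in the text
ASSUMED_PEAK_KBPS = {
    "erp_text": 100,
    "word": 100,
    "powerpoint_edit": 2000,
    "powerpoint_fullscreen": 10000,
    "browsing": 2000,
}

def required_kbps(user_mix):
    """user_mix: dict of activity name -> number of concurrent users."""
    return sum(ASSUMED_PEAK_KBPS[activity] * users for activity, users in user_mix.items())

def fits_line(user_mix, line_kbps):
    """True if the assumed peak load fits into the communication line."""
    return required_kbps(user_mix) <= line_kbps

# Hypothetical remote office: 10 ERP users, 5 Word users, 1 presenter
office = {"erp_text": 10, "word": 5, "powerpoint_fullscreen": 1}
needed = required_kbps(office)
```

In this made-up office, a single full-screen presentation needs more bandwidth than all the other users together, which is the kind of surprise a test is meant to catch.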

Before you implement any terminal environment, test it. I once had a software house that wanted their logo (at the top-right corner of the user window) to be very clear and striking. They refreshed it 10 times a second, which caused the 2 Mbps communication line to be blocked. You never know what you don't test!

Analyzing the database traffic and common problems

Some of you may wonder why we have this section here. After all, databases are considered to be a completely different branch in the IT environment. There are databases and applications on one side and the network and infrastructure on the other side. That is correct, and we are not supposed to debug databases; there are DBAs for this. But through the information that runs over the network, we can see some issues that can help the DBAs solve the relevant problems.

In most of the cases, the IT staff will come to us first because people blame the network for everything. We will have to make sure that the problems are not coming from the network and that's it. In a minority of the cases, we will see some details on the capture file that can help the DBAs with what they are doing.

Getting ready

When the IT team comes to us complaining about the slow network, there are some things to do just to verify that the network is not the cause. Follow the instructions in the following section to rule out a slow network.

How to do it...

In the case of database problems, follow these steps:

When you get complaints about the slow network responses, start asking these questions:
- Is the problem local or global? Does it occur only in the remote offices or also in the center? When the problem occurs in the entire network, it is not a WAN bandwidth issue.
- Does it happen the same for all clients? If not, there might be a specific problem that happens only with some users because only those users are running a specific application that causes the problem.
- Is the communication line between the clients and the server loaded? What is the application that loads them?
- Do all applications work slowly, or is it only the application that works with the specific database? Maybe some PCs are old and tired, or maybe a server is running out of resources.

When we are done with the questionnaire, let's start our work:
1. Open Wireshark and start capturing packets. You can configure port-mirror to a specific PC, the server, a VLAN, or a router that connects to a remote office in which you have the clients.
2. Look at the TCP events (expert info). Do they happen on the entire communication link, on specific IP address/addresses, or on specific TCP port number/numbers? This will help you isolate the problem and check whether it is on a specific link, server, or application.
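The isolation step can be sketched as a simple grouping exercise: count the TCP events per server endpoint and per client, and see where they concentrate. The tuple layout, addresses, and port 1433 (used here only as an example database port) are assumptions; the events would come from whatever you export from the expert info window.

```python
from collections import Counter

def events_by_server(events):
    """events: iterable of (client_ip, server_ip, server_port, event_name)."""
    return Counter((server_ip, port) for _client, server_ip, port, _name in events)

def events_by_client(events):
    """Count events per client IP to spot a single problematic PC."""
    return Counter(client for client, _server, _port, _name in events)

# Made-up sample of expert info events
events = [
    ("10.0.0.20", "10.0.0.1", 1433, "retransmission"),
    ("10.0.0.21", "10.0.0.1", 1433, "retransmission"),
    ("10.0.0.20", "10.0.0.1", 1433, "duplicate_ack"),
    ("10.0.0.22", "10.0.0.2", 80, "retransmission"),
]
worst_server = events_by_server(events).most_common(1)[0]
worst_client = events_by_client(events).most_common(1)[0]
```

If most events point at one server and port, look at that server or application; if they are spread evenly across all endpoints, the link itself is the better suspect.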

When measuring traffic on a connection to the internet, you will get many retransmissions and duplicate ACKs to websites, mail servers, and so on. This is the internet. In an organization, you should expect 0.1 to 0.5 percent of retransmissions. When connecting to the internet, you can expect much higher numbers.
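As a quick sanity check on these numbers, the retransmission percentage is just retransmitted packets divided by total TCP packets. The sketch below uses the 0.5 percent upper figure from the text as its default bound; the function names and packet counts are made up.

```python
def retransmission_percent(total_tcp_packets, retransmitted_packets):
    """Percentage of TCP packets that were retransmissions."""
    if total_tcp_packets == 0:
        return 0.0
    return 100.0 * retransmitted_packets / total_tcp_packets

def within_lan_expectation(percent, upper_bound=0.5):
    """True if the rate is at or below the expected in-house level."""
    return percent <= upper_bound

# Example counts only: 60 retransmissions in 20,000 TCP packets
rate = retransmission_percent(20000, 60)
```

A rate well above the bound inside the organization is worth investigating; on an internet link, the same rate may be perfectly normal.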

But there are some network issues that can influence database behavior. In the following example, we see the behavior of a client that works with the server over a communication line with a round trip delay of 35 to 40 ms.

We are looking at the TCP stream number 8 (1) and the connection started with TCP SYN/SYN-ACK/ACK. I've set this as a reference (2). We can see that the entire connection took 371 packets (3):

analyzing-enterprise-application-behavior-with-wireshark-2-img-4

The connection continues, and we can see time intervals of around 35 ms between DB requests and responses:

analyzing-enterprise-application-behavior-with-wireshark-2-img-5

Since we have 371 packets travelling back and forth, 371 x 35 ms gives us around 13 seconds. Add to this some retransmissions that might happen and some inefficiencies, and this leads to a user waiting for 10 to 15 seconds and more for a database query.
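The arithmetic here is worth keeping as a small helper so that you can replay it with other packet counts and delays. It follows the same simplification as the text, counting one delay interval per packet, so treat the result as an estimate; the function name and the 40-packet comparison are assumptions for illustration.

```python
def estimated_wait_seconds(packets, delay_ms):
    """Rough wait time: one delay interval per packet, as in the text."""
    return packets * delay_ms / 1000.0

low = estimated_wait_seconds(371, 35)   # lower end of the measured delay
high = estimated_wait_seconds(371, 40)  # upper end of the measured delay

# Hypothetical: the same query reduced to 40 packets
reduced = estimated_wait_seconds(40, 35)
```

With 371 packets, the estimate lands between roughly 13 and 15 seconds, while the hypothetical 40-packet version would take under two seconds on the same line. This is why reducing the number of packets matters more than adding bandwidth.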

In this case, you should consult the DBA on how to significantly reduce the number of packets that run over the network, or you can move to another way of access, for example, terminal server or web access.

Another problem that can happen is that you will have a software issue that will reflect in the capture file. If you have a look at the following screenshot, you will see that there are five retransmissions, and then a new connection is opened from the client side. It looks like a TCP problem but it occurs only in a specific window in the software. It is simply a software procedure that stopped processing, and this stopped the TCP from responding to the client:

analyzing-enterprise-application-behavior-with-wireshark-2-img-6

How it works...

Well, how databases work has always been a miracle to me. Our task is to find out how they influence the network, and this is what we've learned in this section.

There's more...

When you right-click on one of the packets in the database client-to-server session and choose Follow | TCP Stream, a window with the conversation will open. It can be helpful to the DBA to see what is running over the network.

When you are facing delay problems, for example, when working over cellular lines, over the internet, or over international connections, direct database client-to-server access will not always be efficient enough. You might need to move to web or terminal access to the database.

An important issue is how the database works. If the client is accessing the database server, and the database server is using files shared from another server, the client-to-server part may work well while the problems come from the database server's access to the shared files on the file server. Make sure that you know all these dependencies before starting with your tests.

And most importantly, make sure you have very professional DBAs among your friends. One day, you will need them!

Analyzing SNMP

SNMP is a well-known protocol that is used to monitor and manage different types of devices in a network by collecting data and statistics at regular intervals. Beyond just monitoring, it can also be used to configure and modify settings with appropriate authorization given to SNMP servers. Devices that typically support SNMP are switches, routers, servers, workstations, hosts, VoIP Phones, and many more.

It is important to know that there are three versions of SNMP: SNMPv1, SNMPv2c, and SNMPv3. The later versions improve on SNMPv1: v2c adds more efficient bulk retrieval, and v3 adds authentication and encryption.

SNMP consists of three components:

The device being managed (referred to as managed device).

SNMP Agent. This is a piece of software running on the managed device that collects the data from the device and stores it in a database, referred to as the Management Information Base (MIB). The agent answers polls from the server on UDP port 161 and, as configured, sends events and traps to the server on UDP port 162.

SNMP server, also called Network Management Server (NMS). This is a server that communicates with all the agents in the network to collect the exported data and build a central repository. The SNMP server provides access to the IT staff managing the network; they can monitor, manage, and configure the network remotely.

It is very important to be aware that some of the MIBs implemented in a device could be vendor-specific. Almost all vendors publish the MIBs implemented in their devices.

Getting ready

Generally, the complaints we get from the network management team are about not getting any statistics or traps from one or more devices for a specific interval, or having no visibility of those devices at all. Follow the instructions in the following section to analyze and troubleshoot these issues.

How to do it...

In the case of SNMP problems, follow these steps.

When you get complaints about SNMP, start asking these questions:

Is this a new managed device that has been brought into the network recently? In other words, did the SNMP in the device ever work properly?
- If this is a new device, talk to the relevant device administrator and/or check the SNMP-related configurations, such as community strings.
- If the SNMP configuration looks correct, make sure that the configured NMS IP address is correct, and also check the relevant credentials.
- If SNMP v3 is in use, which supports encryption, make sure to check encryption-related settings such as transport methods.
- If the settings and configuration look valid and correct, make sure the managed devices have connectivity with the NMS, which can be verified by simple ICMP pings.

If it is a managed device that has been working properly and didn't report any statistics or alerts for a specific duration:
- Did the device in discussion have any issues in the control plane or management plane that stopped it from exporting SNMP statistics? Please be aware that for most devices in the network, SNMP is a least-priority protocol, which means that if a device has a higher-priority process to work on, it will hold the SNMP requests and responses in the queue.
- Is the issue experienced only for a specific device or for multiple devices in the network?
- Did the network (between managed device and NMS) experience any issue? For example, during any layer 2 spanning-tree convergence, traffic loss could occur between the managed device and SNMP server, by which NMS would lose visibility to the managed devices.

In the capture for this example, an SNMP server with IP address 172.18.254.139 is performing an SNMP walk with a sequence of GET-NEXT-REQUEST messages to a workstation with IP address 10.81.64.22, which in turn responds with GET-RESPONSE. For simplicity, the Wireshark display filter used for these captures is snmp.

The workstation is enabled with SNMP v2c, with community string public.

Let's discuss some of the commonly seen failure scenarios.

Polling a managed device with a wrong SNMP version

As I mentioned earlier, the workstation is enabled with v2c, but when the NMS polls the device with the wrong SNMP version, it doesn't get any response. So, it is very important to make sure that the managed devices are polled with the correct SNMP version.
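In a capture, this failure shows up as requests that never get a matching response. One way to spot them is to pair requests and responses by request ID, as in the sketch below. The dictionary layout, the lowercase PDU type names, and the sample data are assumptions about how you exported the packets, not a Wireshark format.

```python
def unanswered_requests(pdus):
    """pdus: list of dicts with 'type', 'request_id', and 'version' keys."""
    pending = {}
    for pdu in pdus:
        if pdu["type"] in ("get-request", "get-next-request"):
            pending[pdu["request_id"]] = pdu
        elif pdu["type"] == "get-response":
            # A response clears the request with the same ID
            pending.pop(pdu["request_id"], None)
    return list(pending.values())

# Made-up sample: the v2c poll is answered, the v1 poll is not
pdus = [
    {"type": "get-next-request", "request_id": 101, "version": "v2c"},
    {"type": "get-response", "request_id": 101, "version": "v2c"},
    {"type": "get-next-request", "request_id": 102, "version": "v1"},
]
missing = unanswered_requests(pdus)
```

If all the unanswered requests share one version and the answered ones share another, the version mismatch is the likely cause.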

Polling a managed device with a wrong MIB object ID (OID)

In the following example, the NMS is polling the managed device to get the number of bytes sent out on its interfaces. The MIB OID for this byte count is .1.3.6.1.2.1.2.2.1.16, which is ifOutOctets. The managed device in discussion has two interfaces, mapped to OIDs .1.3.6.1.2.1.2.2.1.16.1 and .1.3.6.1.2.1.2.2.1.16.2. When the NMS polls the device to check the statistics for the third interface (which is not present), it returns a noSuchInstance error.
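The relationship between the column OID and the per-interface instances can be illustrated with a toy agent table. This is not an SNMP implementation: the counter values are made up, and `snmp_get` is a hypothetical stand-in for a real poll that simply looks up the OID in a dictionary.

```python
IF_OUT_OCTETS = "1.3.6.1.2.1.2.2.1.16"  # ifOutOctets column OID

# Toy agent with two interfaces; the counter values are invented
agent_table = {
    f"{IF_OUT_OCTETS}.1": 1234567,
    f"{IF_OUT_OCTETS}.2": 7654321,
}

def instance_oid(if_index):
    """Append the interface index to the column OID."""
    return f".{IF_OUT_OCTETS}.{if_index}"

def snmp_get(oid):
    """Stand-in for a poll: return the value, or noSuchInstance if absent."""
    return agent_table.get(oid.lstrip("."), "noSuchInstance")

first = snmp_get(instance_oid(1))
third = snmp_get(instance_oid(3))
```

Polling index 3 fails because no such row exists in the table, not because the column OID itself is wrong.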

How it works...

As you have learned in the earlier sections, SNMP is a very simple and straightforward protocol, and all its related information on standards and MIB OIDs is readily available on the internet.

There's more...

Here are some of the websites with good information about SNMP and MIB OIDs:

Microsoft TechNet SNMP: https://technet.microsoft.com/en-us/library/cc776379(v=ws.10).aspx

Cisco IOS MIB locator: http://mibs.cloudapps.cisco.com/ITDIT/MIBS/servlet/index

We have learned to perform enterprise-level network analysis with real-world examples like Analyzing Microsoft Terminal Server and Citrix communications problems. Get to know more about security and network forensics from our book Network Analysis using Wireshark 2 Cookbook - Second Edition.
