Instant Debian - Build a Web Server

Build strong foundations for your future-ready web application using the universal operating system, Debian

Product type Paperback
Published in Sep 2013
Publisher Packt
ISBN-13 9781849518840
Length 74 pages
Edition 1st Edition
Author Jose Miguel Parrella

Monitoring your server's operation (Medium)


Monitoring is a part of any operation management practice. As with any other monitoring discipline, you will need to choose which Key Performance Indicators (KPIs) are applicable to your business and application, and which thresholds/ranges are acceptable and unacceptable for you.

Getting ready

Yes, you can do basic real-time monitoring by running command-line tools yourself. Let's face it, a lot of systems administrators still log in via SSH to their servers and run top for a reason: immediate snapshots have immediate value in the monitoring process. But you can also install a monitoring product and map the values over time. A mix of the two approaches will be valuable while managing a Debian system.

Some of the monitoring products out there, such as Munin, Nagios, and Zenoss, have default values for most of the usual metrics monitored for web servers; however, you need to perform test runs that might span a couple of weeks or so, with different types of loads, to understand your acceptable ranges.
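
One way to turn a test run into an acceptable range is to take percentiles of the samples you collected. The following is only a sketch of that idea, not something the products above do for you in this form: the function names and the 5th/95th percentile cut-offs are assumptions, and you should pick bounds that suit your own KPIs.

```python
import statistics


def acceptable_range(samples, low_pct=5, high_pct=95):
    """Return (low, high) bounds taken from percentiles of the test-run samples.

    low_pct and high_pct are hypothetical defaults; choose what fits your KPI.
    """
    # quantiles(n=100) returns the 99 cut points for percentiles 1..99.
    cuts = statistics.quantiles(samples, n=100, method="inclusive")
    return cuts[low_pct - 1], cuts[high_pct - 1]


def is_acceptable(value, bounds):
    """True when a new measurement falls inside the range learned from the test run."""
    low, high = bounds
    return low <= value <= high
```

You would feed acceptable_range with, say, two weeks of ping RTT samples, and then check each new reading with is_acceptable.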

How it works…

The following are some examples of KPIs for your web server on Debian:

  1. Ping round-trip time (RTT), which, on a lower scale, helps determine network outages, and on a higher scale, helps understand latency issues to your server.

  2. Network interface throughput, which helps understand capacity and usage.

  3. Disk usage, memory usage, and CPU usage. Also, tools such as vmstat will help you do real-time analysis. The three metrics combined will help you find bottlenecks in your application, which in most cases will be I/O bound. The independent metrics will just give you an idea whether you need to add capacity or not.

  4. TCP response times, which help measure the consistency of the time the network stack takes to respond to a request on the port where your server is running (both the web and the database servers if you're using TCP connections).

  5. HTTP request/response times, which help measure the consistency of the time the web server takes to respond to a request. More advanced monitoring of this includes expected responses, specific URIs, and form workloads for testing (which may be useful for detecting defacements, for example).

  6. Database query response times, which help find bottlenecks in your application. In teams with DBAs, this is usually done by the DBA as part of a performance optimization effort, but a DevOps team might add it to their monitoring plate.
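
To make the TCP and HTTP response time KPIs more concrete, here is a minimal sketch of how each could be sampled from a script. It uses only the Python standard library; the function names and the 5-second timeout are assumptions, and a monitoring product would normally take these measurements for you.

```python
import socket
import time
import urllib.request


def tcp_connect_time(host, port, timeout=5.0):
    """Return the seconds taken to open a TCP connection to host:port."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass  # Connection established; close it right away.
    return time.perf_counter() - start


def http_response_time(url, timeout=5.0):
    """Return (status, seconds) for one GET request, including reading the body."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()
        status = response.status
    return status, time.perf_counter() - start
```

Calling tcp_connect_time("www.example.com", 80) several times and comparing the results gives a rough idea of consistency; http_response_time does the same one layer up, and its status value is a starting point for checking expected responses.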

How to do it…

The easiest way to get a monitoring glimpse of your server's operation is to log in via SSH and interpret the output of certain commands.

  1. Run vmstat. You will get a fairly cryptic snapshot of the following:

    • procs-r: It indicates processes waiting for run time. It is a potential indicator of CPU-bound applications. The lower, the better; 0 is best (test under load).

    • procs-b: It indicates processes in uninterruptible sleep, which is not good. 0 is best.

    • memory-swpd: It indicates swapped memory. Here, 0 is normal. Having any amount here is bad because processes will have to wait while their pages are read back from disk. You can also tweak the swappiness (tendency to swap). For swap in/swap out (si/so), the lower the better; they will be 0 if you have no swapped memory.

    • memory-cache: It indicates cached memory. It is generally good as it will avoid a slower I/O.

    • memory-free: It indicates free memory. Linux tends to put otherwise idle memory to work as cache, so a low value here is not always an indicator of anything in particular; if it drops to 0, though, you're running out of memory.

    • io-bi/bo: It indicates blocks in/out of disk. Here, lower is better. High numbers that increase without control or pattern during loads are indicative of an I/O-bound application. Either find the I/O bottlenecks and remove them, or invest in faster storage or a different storage architecture. Also see the CPU waiting time (wa), which is the time spent waiting for I/O (here, lower is better).

    • cpu-us/id: It indicates the CPU usage versus the idle time. Idle means responsiveness but also underconsumption of CPU power. A fair, consistent usage amount is a good indicator of a stable load.

  2. You can use vmstat n to get a sample every n seconds. For example, vmstat 5 is good when you're doing a load test to see where your app's bottlenecks are. Here is an example from a sample web server when running httperf. The first two samples are before the test, and the next two are during the test:

Can you spot the bottleneck? Yes, it's disk I/O. You can see swap is good, and the CPU and memory usage rates are good as well, thus no need to increase capacity.

This setup runs on a VM (specifically, a VPS), which tends to have slow storage (some people prefer to use network-based storage to avoid using the disk drivers of their hypervisors).
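
If you would rather not read the vmstat columns by eye, the reading guide above can be sketched as a small parser. This is an illustration only: the function names are hypothetical, and the wa_limit threshold is an assumption that you should replace with a value from your own test runs.

```python
def parse_vmstat(output):
    """Parse vmstat text into a list of {column name: int} dicts, one per sample."""
    columns = None
    samples = []
    for line in output.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "r" and "swpd" in fields:
            # Column header line ("r b swpd free ..."); vmstat repeats it periodically.
            columns = fields
        elif columns and len(fields) == len(columns):
            try:
                samples.append(dict(zip(columns, map(int, fields))))
            except ValueError:
                pass  # Not a numeric sample line.
    return samples


def findings(sample, wa_limit=20):
    """Return warnings for one sample, following the reading guide above.

    wa_limit (percent of CPU time spent waiting for I/O) is a hypothetical threshold.
    """
    notes = []
    if sample.get("b", 0) > 0:
        notes.append("processes in uninterruptible sleep")
    if sample.get("si", 0) > 0 or sample.get("so", 0) > 0:
        notes.append("swapping in or out")
    if sample.get("wa", 0) > wa_limit:
        notes.append("CPU is waiting on I/O")
    return notes
```

The text to parse could come from subprocess.run(["vmstat", "5", "4"], capture_output=True, text=True).stdout, which takes four samples five seconds apart.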

  1. You can use jnettop (called as jnettop -i <interface name>) to check the bandwidth usage of each TCP connection. You can see aggregates and check whether the HTTP requests or the SQL connections are using all of your available bandwidth. There are several strategies to increase the bandwidth, which we cover in this book under the Using proxies, caches, and clusters to scale your architecture recipe.

  2. As an alternative, you can use tcptrack, which tracks the TCP state as well; for network load, though, we prefer jnettop, which looks like the following:

  3. Previously we used httperf to simulate a load scenario. Now let's suppose you want to use it to actually measure the response time of your web server, simulating 10 connections using the command httperf --hog --server=www.example.com --num-conns=10. It should look like the following screenshot:

    Our application is able to handle 2.4 req/s (given the 10 connection workload and the fact that we ran this locally), or conversely it takes about 0.4 seconds to reply to one request. This operation used 0.9 KB/s of bandwidth. For production environments, you would want to set up a more complex test environment with remote computers simulating hundreds or thousands of connections.

  4. Measuring the response time for a SQL query is trivial. If you can write your query on the command line (for example, with mysql -e or psql -c), you can wrap the entire statement in a time call:

    time (mysql -u root -p tsa -e 'select count(*) from token')
    

Take a look at the user and sys values; since this statement prompts for a password, real is artificially higher. Also notice that this measurement includes the time necessary to start the mysql binary, connect, and so on, so it might be biased; for single queries, the mysql console already gives you an execution time in seconds. You could also compare the value over time by wrapping everything in a watch statement. But soon you will find out that the query response time depends on a lot of variables such as server load, I/O load, and so on, and that it is more efficient to focus on the queries that are consistently slow.
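
The same real/user/sys breakdown can be collected from a script, which is handy if you want to record it over time. The sketch below is a rough equivalent of the shell's time, not a replacement for it; the function name is hypothetical, and it assumes the command can run without prompting (for mysql, credentials kept in an option file such as ~/.my.cnf instead of -p).

```python
import os
import subprocess
import time


def time_command(argv):
    """Run argv once and return real, user and sys seconds, like the shell's time."""
    before = os.times()
    start = time.perf_counter()
    subprocess.run(argv, stdout=subprocess.DEVNULL, check=True)
    real = time.perf_counter() - start
    after = os.times()
    return {
        "real": real,
        # CPU time consumed by child processes that have been waited for.
        "user": after.children_user - before.children_user,
        "sys": after.children_system - before.children_system,
    }
```

For the query above, the call would be time_command(["mysql", "tsa", "-e", "select count(*) from token"]). As with time, the result still includes the cost of starting the client and connecting.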

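  1. If using MySQL, edit /etc/mysql/my.cnf and uncomment the log_slow_queries directive. Queries taking longer than long_query_time seconds to complete will get logged to the file named in that directive. (Newer MySQL releases replace log_slow_queries with slow_query_log and slow_query_log_file.) Then you, your programmers, and your DBA can sit down and work on those queries.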

  2. If using Postgres, edit /etc/postgresql/9.1/main/postgresql.conf and set log_min_duration_statement to a value (for example, 250ms).

  3. Restart your database with sudo service mysql restart or sudo service postgresql restart and start taking a look at the logs.
