Chapter 2. Take Tracing for a HotROD Ride
A picture is worth a thousand words. So far, we have only talked about distributed tracing in abstract terms. In this chapter, we are going to look at concrete examples of the diagnostic and troubleshooting tools provided by a tracing system. We are going to use Jaeger (pronounced "yay-ger"), an open source distributed tracing system originally created by Uber Technologies [1] and now hosted with the Cloud Native Computing Foundation [2].
The chapter will:
- Introduce HotROD, an example application provided by the Jaeger project, which is built with microservices and instrumented with the OpenTracing API (we will discuss OpenTracing in detail in Chapter 4, Instrumentation Basics with OpenTracing)
- Use Jaeger's user interface to understand the architecture and the data flow of the HotROD application
- Compare standard logging output of the application with contextual logging capabilities of distributed tracing
- Investigate and attempt to fix the root causes of latency in the application
- Demonstrate the distributed context propagation features of OpenTracing
All relevant screenshots and code snippets are included in this chapter, but you are strongly encouraged to try running the example and explore the features of the web UIs, in order to better understand the capabilities of distributed tracing solutions like Jaeger.
Both the Jaeger backend and the demo application can be run as downloadable binaries for macOS, Linux, and Windows, as Docker containers, or directly from the source code. Since Jaeger is an actively developed project, by the time you read this book, some of the code organization or distributions may have changed. To ensure you are following the same steps as described in this chapter, we are going to use Jaeger version 1.6.0, released in July 2018.
Running from prepackaged binaries
Downloading prepackaged binaries is likely the easiest way to get started, since it requires no additional setup or installation. All Jaeger backend components, as well as the HotROD demo application, are available as executable binaries for macOS, Linux, and Windows from GitHub: https://github.com/jaegertracing/jaeger/releases/tag/v1.6.0/. For example, to run the binaries on macOS, download the archive jaeger-1.6.0-darwin-amd64.tar.gz and extract its content into a directory:
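For example, something along these lines will do (the name of the directory created by the archive may differ):

```
$ tar xvzf jaeger-1.6.0-darwin-amd64.tar.gz
$ cd jaeger-1.6.0-darwin-amd64    # directory name may vary
```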
This archive includes the production-grade binaries for the Jaeger backend, namely jaeger-query, jaeger-agent, and jaeger-collector, which we will not use in this chapter. We only need the all-in-one packaging of the Jaeger backend, jaeger-standalone, which combines all the backend components into one executable, with no additional dependencies.
The Jaeger backend listens on half a dozen different ports, so if you run into port conflicts, you may need to find out which other software listens on the same ports and shut it down temporarily. The risk of port conflicts is greatly reduced if you run Jaeger all-in-one as a Docker container.
The executable example-hotrod is the HotROD demo application we will be using in this chapter to demonstrate the features of distributed tracing.
Running from Docker images
Like most software designed for the cloud-native era, Jaeger components are distributed in the form of Docker container images, so we recommend installing a working Docker environment. Please refer to the Docker documentation (https://docs.docker.com/install/) for installation instructions. To quickly verify your Docker setup, run the following command:
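Any small image works as a smoke test; for instance, Docker's stock hello-world image (used here purely as a stand-in, it has nothing to do with Jaeger):

```
$ docker run --rm hello-world
```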
You may first see some Docker output as it downloads the container images, followed by the program's output:
Running from the source code
The HotROD demo application contains some standard tracing instrumentation, as well as a number of custom instrumentation techniques that give more insight into the application's behavior and performance. We will discuss these techniques in this chapter and show some code examples. If you want to dive deeper into the code and maybe tweak it, we recommend downloading the full source code.
Go language development environment
The HotROD application, just like Jaeger itself, is implemented in the Go language, thus a working Go v1.10 or later development environment is required to run it from source. Please refer to the documentation (https://golang.org/doc/install) for installation instructions.
The HotROD application is located in the examples directory of the main Jaeger backend repository, https://github.com/jaegertracing/jaeger/. If you have Git installed, you can check out the source code as follows:
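With Go 1.10 and glide, the code needs to live under $GOPATH, so a checkout along these lines works:

```
$ mkdir -p $GOPATH/src/github.com/jaegertracing
$ cd $GOPATH/src/github.com/jaegertracing
$ git clone https://github.com/jaegertracing/jaeger.git jaeger
$ cd jaeger
$ git checkout v1.6.0
```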
Alternatively, you can download the source code bundle from the Jaeger release page (https://github.com/jaegertracing/jaeger/releases/tag/v1.6.0/), and make sure that the code is extracted into the $GOPATH/src/github.com/jaegertracing/jaeger/ directory.
Jaeger 1.6 uses the github.com/Masterminds/glide utility as the dependency manager, so you will need to install it:
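One way to install it, taken from glide's own documentation (a go get or a package manager works as well):

```
$ curl https://glide.sh/get | sh
```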
After installing glide, run it to download the libraries that Jaeger depends on:
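Assuming the checkout location from above:

```
$ cd $GOPATH/src/github.com/jaegertracing/jaeger
$ glide install
```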
Now you should be able to build and run the HotROD binary:
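A go run invocation along these lines should work; the entry-point path reflects the 1.6 source layout:

```
$ go run ./examples/hotrod/main.go all
```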
It is also possible to run the Jaeger backend from the source code. However, it requires an additional setup of Node.js in order to compile the static assets for the UI, which may not even work on an OS like Windows, so I do not recommend it for this chapter's examples.
Before we run the demo application, let's make sure we can run the Jaeger backend to collect the traces, as otherwise we might get a lot of error logs. A production installation of the Jaeger backend would consist of many different components, including some highly scalable databases like Cassandra or Elasticsearch. For our experiments, we do not need that complexity or even the persistence layer. Fortunately, the Jaeger distribution includes a special component called all-in-one just for this purpose. It runs a single process that embeds all other components of a normal Jaeger installation, including the web user interface. Instead of a persistent storage, it keeps all traces in memory.
If you are using Docker, you can run Jaeger all-in-one with the following command:
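Based on the flags described next, the command looks like this (the 1.6 image tag is an assumption; any 1.6.x tag of jaegertracing/all-in-one should behave the same):

```
$ docker run -d --name jaeger \
    -p 6831:6831/udp \
    -p 16686:16686 \
    -p 14268:14268 \
    jaegertracing/all-in-one:1.6
```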
The -d flag makes the process run in the background, detached from the terminal. The --name flag sets a name by which this process can be located by other Docker containers. We also use the -p flag to expose three ports on the host network that the Jaeger backend is listening to.
The first port, 6831/udp, is used to receive tracing data from applications instrumented with Jaeger tracers, and the second port, 16686, is where we can find the web UI. We also map the third port, 14268, in case we have issues with UDP packet limits and need to use HTTP transport for sending traces (discussed shortly).
The process listens to other ports as well, for example, to accept traces in other formats, but they are not relevant for our exercise. Once the container starts, open http://127.0.0.1:16686/ in the browser to access the UI.
If you chose to download the binaries instead of Docker images, you can run the executable named jaeger-standalone, without any arguments, which will listen on the same ports. jaeger-standalone is the binary used to build the jaegertracing/all-in-one Docker image (in later versions of Jaeger, it has been renamed jaeger-all-in-one).
We removed some fields of the log statements (level, timestamp, and caller) to improve readability.
Since the all-in-one binary runs the Jaeger backend with an in-memory database, which is initially empty, there is not much to see in the UI right away. However, the Jaeger backend has self-tracing enabled, so if we reload the home page a few times, we will see the Services dropdown in the top-left corner display jaeger-query, which is the name of the microservice running the UI component. We can now hit the Search button to find some traces, but let's first run the demo application to get more interesting traces.
HotROD is a mock-up "ride-sharing" application (ROD stands for Rides on Demand) that is maintained by the Jaeger project. We will discuss its architecture later, but first let's try to run it. If you are using Docker, you can run it with this command:
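Based on the flags explained below, the command looks roughly like this (the image tag, the port mappings, and the exact spelling of the tracer flag are our best guesses for the 1.6 release; the -p flag simply publishes the HotROD ports, including the web frontend on 8080):

```
$ docker run --rm -it --link jaeger \
    -p 8080-8083:8080-8083 \
    jaegertracing/example-hotrod:1.6 all \
    --jaeger-agent.host-port=jaeger:6831
```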
Let's quickly review what is going on with this command:
- The --rm flag instructs Docker to automatically remove the container once the program exits.
- The -it flags attach the container's standard input and output streams to the terminal.
- The --link flag tells Docker to make the hostname jaeger available inside the container's networking namespace and resolve it to the Jaeger backend we started earlier.
- The string all, after the image name, is the command to the HotROD application, instructing it to run all microservices from the same process. It is possible to run each microservice as a separate process, even on different machines, to simulate a truly distributed application, but we leave this as an exercise for you.
- The final flag tells the HotROD application to configure the tracer to send data to UDP port 6831 on the hostname jaeger.
To run HotROD from downloaded binaries, run the following command:
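Assuming you are in the directory where the binaries were extracted:

```
$ ./example-hotrod all
```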
If we are running both the Jaeger all-in-one and the HotROD application from the binaries, they bind their ports directly to the host network and are able to find each other without any additional configuration, due to the default values of the flags.
Sometimes users experience issues with getting traces from the HotROD application due to the default UDP settings in the OS. Jaeger client libraries batch up to 65,000 bytes per UDP packet, which is still a safe number to send via the loopback interface (that is, localhost) without packet fragmentation. However, macOS, for example, has a much lower default for the maximum datagram size. One alternative to adjusting the OS settings is to use the HTTP protocol between the Jaeger clients and the Jaeger backend. This can be done by passing the following flag to the HotROD application:
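As we recall the Jaeger 1.6 CLI, the same host-port flag also accepts an HTTP collector URL; treat the exact flag name as an assumption and confirm it with the help command shown later in this chapter:

```
--jaeger-agent.host-port=http://localhost:14268/api/traces
```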
Or, if using the Docker networking namespace:
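Again assuming the same flag, but pointing at the linked jaeger container:

```
--jaeger-agent.host-port=http://jaeger:14268/api/traces
```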
Once the HotROD process starts, the logs written to the standard output will show the microservices starting several servers on different ports (for better readability, we removed the timestamps and references to the source files):
Let's navigate to the application's web frontend at http://127.0.0.1:8080/.
We have four customers, and by clicking one of the four buttons, we summon a car to arrive at the customer's location, perhaps to pick up a product and deliver it elsewhere. Once a request for a car is sent to the backend, it responds with the car's license plate number, T757183C, and the expected time of arrival of two minutes:
There are a few bits of debugging information we see on the screen:
- In the top-left corner, there is a web client id: 6480. This is a random session ID assigned by the JavaScript UI. If we reload the page, we get a different session ID.
- In the brackets after the car information, we see a request ID, req: 6480-1. This is a unique ID assigned by the JavaScript UI to each request it makes to the backend, composed of the session ID and a sequence number.
- The last bit of debugging data, latency: 772ms, is measured by the JavaScript UI and shows how long the backend took to respond.
This additional information has no impact on the behavior of the application but will be useful when we investigate performance problems.
Now that we have seen what the HotROD application does, we may want to know how it is architected. After all, maybe all those servers we saw in the logs are just for show, and the whole application is simply a JavaScript frontend. Rather than asking someone for a design document, wouldn't it be great if our monitoring tools could build the architecture diagram automatically, by observing the interactions between the services? That's exactly what distributed tracing systems like Jaeger can do. That request for a car we executed earlier has provided Jaeger with enough data to connect the dots.
Let's go to the Dependencies page in the Jaeger UI. At first, we will see a tiny diagram titled Force Directed Graph, but we can ignore it, as that particular view is really designed for showing architectures that contain hundreds or even thousands of microservices. Instead, click on the DAG tab (Directed Acyclic Graph), which shows an easier-to-read graph. The graph layout is non-deterministic, so your view may have the second-level nodes in a different order than in the following screenshot:
As it turns out, the single HotROD binary is actually running four microservices and, apparently, two storage backends: Redis and MySQL. The storage nodes are not actually real: they are simulated by the application as internal components, but the top four microservices are indeed real. We saw each of them logging the network addresses of the servers they run. The frontend microservice serves the JavaScript UI and makes RPC calls to the other three microservices.
The graph also shows the number of calls that were made to handle the single request for a car; for example, the route service was called 10 times, and there were 14 calls to Redis.
We have learned that the application consists of several microservices. What exactly does the request flow look like? It is time to look at the actual trace. Let's go to the Search page in the Jaeger UI. Under the Find Traces caption, the Services dropdown contains the names of the services we saw in the dependency diagram. Since we know that frontend is the root service, let's choose it and click the Find Traces button.
The system found two traces and displayed some metadata about them, such as the names of different services that participated in the traces, and the number of spans each service emitted to Jaeger. We will ignore the second trace that represents the request to load the JavaScript UI and focus on the first trace, named frontend: HTTP GET /dispatch. This name is a concatenation of the service name frontend and the operation name of the top-level span, in this case HTTP GET /dispatch.
On the right side, we see that the total duration of the trace was 757.73ms. This is shorter than the 772ms we saw in the HotROD UI, which is not surprising because the latter was measured from the HTTP client side by JavaScript, while the former was reported by the Go backend. The 14.27ms difference between these numbers can be attributed to the network latency. Let's click on the trace title bar.
The timeline view shows a typical view of a trace as a time sequence of nested spans, where a span represents a unit of work within a single service. The top-level span, also called the root span, represents the handling of the main HTTP request from the JavaScript UI by the frontend service (server span), which in turn called the customer service, which in turn called a MySQL database. The width of the spans is proportional to the time a given operation took. This may represent a service doing some work or waiting on a downstream call.
From this view, we can see how the application handles a request:
- The frontend service receives the external HTTP GET request at its /dispatch endpoint.
- The frontend service makes an HTTP GET request to the /customer endpoint of the customer service.
- The customer service executes a SELECT SQL statement in MySQL. The results are returned back to the frontend service.
- Then the frontend service makes an RPC request, Driver::findNearest, to the driver service. Without drilling deeper into the trace details, we cannot tell which RPC framework is used to make this request, but we can guess it is not HTTP (it is actually made over TChannel [1]).
- The driver service makes a series of calls to Redis. Some of those calls show a red circle with an exclamation point, indicating failures.
- After that, the frontend service executes a series of HTTP GET requests to the /route endpoint of the route service.
- Finally, the frontend service returns the result to the external caller (for example, the UI).
We can tell all of this pretty much just by looking at the high-level Gantt chart presented by the end-to-end tracing tool.
We now have a pretty good idea about what the HotROD application does, if not exactly how it does it. For example, why does the frontend service call the /customer endpoint of the customer service? Of course, we can look at the source code, but we are trying to approach this from the point of view of application monitoring. One direction we could take is to look at the logs the application writes to its standard output (Figure 2.7).
It is quite difficult to follow the application logic from these logs, and that is while only a single request was being executed by the application.
We are also lucky that the logs from four different microservices are combined in a more-or-less consistent stream. Imagine many concurrent requests going through the system and through microservices running in different processes! The logs would become nearly useless in that case. So, let's take a different approach. Let's view the logs collected by the tracing system. For example, click on the root span to expand it and then click on the Logs (18) section to expand and see the logs (18 refers to the number of log statements captured in this span). These logs give us more insight into what the /dispatch endpoint was doing (Figure 2.8):
- It called the customer service to retrieve some information about the customer, given customer_id=123.
- It then retrieved N available drivers closest to the customer location (115,277). From the Gantt chart, we know this was done by calling the driver service.
- Finally, it called the route service to find the shortest route between the driver location (indicated as "pickup") and the customer location (indicated as "drop-off").
Let's close the root span and open another one; specifically, one of the failed calls to Redis (Figure 2.9). The span has a tag error=true, which is why the UI highlighted it as failed. The log statement explains the nature of the error as "Redis timeout." The log also includes the driver_id that the driver service was attempting to retrieve from Redis. All these details may provide very useful information during debugging.
A distinctive feature of a tracing system is that it only shows the logs that happened during the execution of a given request. We call these logs contextualized because they are captured not only in the context of a specific request, but also in the context of a specific span within the trace for that request.
In the traditional log output, these log statements would have been mixed with a lot of other statements from parallel requests, but in the tracing system, they are neatly isolated to the service and span where they are relevant. Contextualized logs allow us to focus on the behavior of the application, without worrying about logs from other parts of the program or from other concurrent requests.
As we can see, using a combination of a Gantt chart, span tags, and span logs, the end-to-end tracing tool lets us easily understand the architecture and data flow of the application, and enables us to zoom in on the details of individual operations.
Let's expand a couple more spans.
In the customer span, we can see a tag http.url showing that the request to the /customer endpoint had a parameter customer=123, as well as two logs narrating the execution during that span. In the mysql span, we see an sql.query tag showing the exact SQL query that was executed, SELECT * FROM customer WHERE customer_id=123, and a log about acquiring some lock.
What is the difference between a span tag and a span log? Both annotate the span with some contextual information. Tags typically apply to the whole span, while logs represent events that happened during the span's execution. A log always has a timestamp that falls within the span's start-end time interval. The tracing system does not explicitly track causality between logged events, the way it tracks causality between spans, because it can be inferred from the timestamps.
An astute reader will notice that the /customer span records the URL of the request twice, in the http.url tag and in the first log. The latter is actually redundant but was captured in the span because the code logged this information using the normal logging facility, which we will discuss later in this chapter.
The OpenTracing Specification [3] defines semantic data conventions that prescribe certain well-known tag names and log fields for common scenarios. Instrumentation is encouraged to use those names to ensure that the data reported to the tracing system is well defined and portable across different tracing backends.
Identifying sources of latency
So far, we have not discussed the performance characteristics of the HotROD application. If we refer to Figure 2.6, we can easily make the following conclusions:
- The call to the customer service is on the critical path because no other work can be done until we get back the customer data, which includes the location to which we need to dispatch the car.
- The driver service retrieves the N nearest drivers given the customer's location and then queries Redis for each driver's data in sequence, which can be seen in the staircase pattern of the Redis GetDriver spans. If these operations could be done in parallel, the overall latency could be reduced by almost 200ms.
- The calls to the route service are not sequential, but not fully parallel either. We can see that, at most, three requests can be in progress at a time, and as soon as one of them ends, another request starts. This behavior is typical when we use a fixed-size executor pool.
What happens if we make many requests to the backend simultaneously? Let's go to the HotROD UI and click on one of the buttons repeatedly (and quickly).
As we can see, the more requests that are being processed concurrently, the longer it takes for the backend to respond. Let's take a look at the trace of the longest request. We can do this in two ways: the first is to simply search for all traces and pick the one with the highest latency, represented by the longest cyan-colored title bar (Figure 2.14):
Another way is to search by tags or logs on the span. The root span emits a final log, where it records the license plate number of the closest car as one of the log fields:
The Jaeger backend indexes all spans by both tags and log fields, and we can find that trace by specifying driver=T796774C in the Tags search box:
This trace took 1.43 seconds, about 90% longer than our first trace, which took only 757ms (measured from the server side). Let's open it and investigate what is different:
The most apparent difference is that the database query (the mysql span) takes a lot longer than before: 1s instead of 323ms. Let's expand that span and try to find out why:
In the log entries of the span, we see that execution was blocked waiting for a lock for more than 700ms. This is clearly a bottleneck in the application, but before we dive into that, let's look at the first log record, evidently emitted before getting blocked on the lock: Waiting for lock behind 3 transactions. Blockers=[6480-4 6480-5 6480-6].
It tells us how many other requests were already queued for this lock, and even gives us the identity of those requests. It is not too hard to imagine a lock implementation that keeps track of how many goroutines are blocked, but where would it get the identity of the requests?
If we expand the previous span for the customer service, we can see that the only data passed to it via an HTTP request was the customer ID 392. In fact, if we inspect every span in the trace, we will not find any remote call where the request ID, like 6480-5, was passed as a parameter.
This magic appearance of blocking request IDs in the logs is due to a custom instrumentation in HotROD that makes use of a distributed context propagation mechanism, which is called baggage in the OpenTracing API.
As we will see in Chapter 3, Distributed Tracing Fundamentals, end-to-end tracing works because tracing instrumentation is designed to propagate certain metadata across thread, component, and process boundaries, throughout the whole distributed call graph. Trace and span IDs are examples of such metadata. Another example is baggage, which is a general-purpose key-value store embedded in every inter-process request. HotROD's JavaScript UI stores session ID and request ID in the baggage before making the request to the backend, and that baggage is transparently made available via OpenTracing instrumentation to every service involved in handling the request, without the need to pass that information explicitly as request parameters.
It is an extremely powerful technique that can be used to propagate various useful pieces of information (such as tenancy) in the context of a single request throughout the architecture, without needing to change every service to understand what they are propagating. We will discuss more examples of using metadata propagation for monitoring, profiling, and other use cases in Chapter 10, Distributed Context Propagation.
In our example, knowing the identities of the requests stuck in the queue ahead of our slow request allows us to find traces for those requests, and analyze them as well. In real production systems, this could lead to unexpected discoveries, such as a long-running request spoiling a lot of other requests that are normally very fast. Later in this chapter, we will see another example of using baggage.
Now that we know that the mysql call gets stuck on a lock, we can easily fix it. As we mentioned earlier, the application does not actually use the MySQL database, just a simulation of it, and the lock is meant to represent a single database connection shared between multiple goroutines. We can find the code in the file examples/hotrod/services/customer/database.go:
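The following is a simplified, self-contained sketch of the pattern described next; the type and field names are illustrative rather than the exact ones from database.go:

```go
package customer

import (
	"context"
	"errors"
	"time"
)

// ctxLock stands in for HotROD's context-aware mutex (sketched later in this
// section): Lock takes a context so the lock can read tracing baggage from it.
type ctxLock interface {
	Lock(ctx context.Context)
	Unlock()
}

type database struct {
	lock         ctxLock
	lockDisabled bool              // toggled by the -M flag discussed later
	queryDelay   time.Duration     // reduced by the -D flag discussed later
	customers    map[string]string // customer_id -> customer name
}

// Get simulates "SELECT * FROM customer WHERE customer_id=?" executed over a
// single shared database connection, represented by the lock.
func (d *database) Get(ctx context.Context, customerID string) (string, error) {
	if !d.lockDisabled {
		// simulate a misconfigured connection pool that hands out
		// only one connection at a time
		d.lock.Lock(ctx)
		defer d.lock.Unlock() // released when this function returns
	}

	time.Sleep(d.queryDelay) // simulate the latency of the SQL query

	if name, ok := d.customers[customerID]; ok {
		return name, nil
	}
	return "", errors.New("invalid customer ID")
}
```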
If the locking behavior is not disabled via configuration, we acquire the lock before simulating the SQL query delay. The statement defer d.lock.Unlock() releases the lock when we exit the surrounding function.
Notice how we pass the ctx parameter to the lock object. context.Context is a standard way in Go to pass request-scoped data throughout an application. The OpenTracing span is stored in the Context, which allows the lock to inspect it and retrieve the JavaScript UI's request ID from the baggage. The code for this custom implementation of a mutex can be found in the source file examples/hotrod/pkg/tracing/mutex.go.
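A minimal sketch of such a context-aware mutex, written against the OpenTracing Go API, might look like this (the real mutex.go differs in details, such as also tagging the span with the session):

```go
package tracing

import (
	"context"
	"fmt"
	"sync"

	"github.com/opentracing/opentracing-go"
	"github.com/opentracing/opentracing-go/log"
)

// Mutex behaves like sync.Mutex, but it is context-aware: it reads a request
// identifier from the OpenTracing baggage and logs, into the current span,
// which requests are already waiting for the same lock.
type Mutex struct {
	SessionBaggageKey string

	realLock    sync.Mutex
	holder      string
	waiters     []string
	waitersLock sync.Mutex
}

// Lock acquires the lock, first logging the identities of the requests
// queued ahead of us (found via baggage) into the active span.
func (m *Mutex) Lock(ctx context.Context) {
	var session string
	if span := opentracing.SpanFromContext(ctx); span != nil {
		session = span.BaggageItem(m.SessionBaggageKey)

		m.waitersLock.Lock()
		if len(m.waiters) > 0 {
			span.LogFields(log.String("event", fmt.Sprintf(
				"Waiting for lock behind %d transactions. Blockers=%v",
				len(m.waiters), m.waiters)))
		}
		m.waiters = append(m.waiters, session)
		m.waitersLock.Unlock()
	}

	m.realLock.Lock()
	m.holder = session
}

// Unlock removes the current holder from the waiters list and releases the lock.
func (m *Mutex) Unlock() {
	m.waitersLock.Lock()
	for i, w := range m.waiters {
		if w == m.holder {
			m.waiters = append(m.waiters[:i], m.waiters[i+1:]...)
			break
		}
	}
	m.waitersLock.Unlock()
	m.realLock.Unlock()
}
```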
We can see that the mutex behavior is protected by a configuration parameter. Fortunately, the HotROD application exposes command-line flags to change these configuration parameters. We can find the flags by running the HotROD binary with the help command:
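For example, with the downloaded binary (the Docker image accepts the same argument after the image name):

```
$ ./example-hotrod help
```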
The flags that control parameters for latency-affecting logic all start with the --fix prefix. In this case, we want the flag --fix-disable-db-conn-mutex, or -M for short, to disable the blocking behavior. We also want to reduce the default 300ms latency of the simulated database queries, controlled by the flag -D, to make it easier to see the results of this optimization.
Let's restart the HotROD application using these flags, to pretend that we fixed the code to use a connection pool with enough capacity that our concurrent requests do not have to compete for connections (the logs are again trimmed for better readability):
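For example, using the short flags described above (the 100ms value is our choice for the reduced query delay; it matches the timings we will observe in the traces below):

```
$ ./example-hotrod all -M -D 100ms
```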
We can see in the logs that the changes are taking effect. To see how it works out, reload the HotROD web page and repeat the experiment of issuing many simultaneous requests by clicking one of the buttons many times in quick succession.
The latency still increases as we add more requests to the system, but it no longer grows as dramatically as with the single database bottleneck from before. Let's look at one of the longer traces again.
As expected, the mysql span stays at around 100ms, regardless of the load. The driver span is not expanded, but it takes the same time as before. The interesting change is in the route calls, which now take more than 50% of the total request time. Previously, we saw these requests executing in parallel, three at a time, but now we often see only one at a time, and even a gap right after the frontend-to-driver call when no requests to the route service are running. Clearly, we have contention with other goroutines over some limited resource, and we can also see that the gaps happen between the spans of the frontend service, which means the bottleneck is not in the route service, but in how the frontend service calls it.
Let's look at the function getRoutes() in the source file services/frontend/best_eta.go:
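The following is a simplified sketch of that function's structure; the stand-in types (routeClient, workerPool, and so on) are illustrative and do not match the repository's exact definitions:

```go
package frontend

import (
	"context"
	"sync"
)

// Minimal stand-ins for the real HotROD types; names are approximate.
type Location string

type Customer struct{ Location Location }

type Driver struct {
	DriverID string
	Location Location
}

type Route struct{ ETA int }

type routeClient interface {
	FindRoute(ctx context.Context, pickup, dropoff Location) (*Route, error)
}

type workerPool interface{ Execute(task func()) }

type routeResult struct {
	driver string
	route  *Route
	err    error
}

type bestETA struct {
	route routeClient
	pool  workerPool
}

// getRoutes asks the route service for a route for every driver, running the
// calls on a fixed-size worker pool and waiting for all of them to finish.
func (eta *bestETA) getRoutes(ctx context.Context, customer *Customer, drivers []Driver) []routeResult {
	results := make([]routeResult, 0, len(drivers))
	wg := sync.WaitGroup{}
	routesLock := sync.Mutex{}
	for _, dd := range drivers {
		wg.Add(1)
		driver := dd // capture loop variable for the closure
		// the pool runs the function on one of its worker goroutines
		eta.pool.Execute(func() {
			route, err := eta.route.FindRoute(ctx, driver.Location, customer.Location)
			routesLock.Lock()
			results = append(results, routeResult{driver: driver.DriverID, route: route, err: err})
			routesLock.Unlock()
			wg.Done()
		})
	}
	wg.Wait() // block until each spawned function has called wg.Done()
	return results
}
```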
This function receives a customer record (with an address) and a list of drivers (with their current locations), then calculates the expected time of arrival (ETA) for each driver. It calls the route service for each driver inside an anonymous function executed via a pool of goroutines, by passing the function to eta.pool.Execute().
Since all these functions are executed asynchronously, we track their completion with the wait group, wg, which implements a countdown latch: for every new function, we increment its count with wg.Add(1), and then we block on wg.Wait() until each of the spawned functions calls wg.Done().
As long as we have enough executors (goroutines) in the pool, we should be able to run all the calculations in parallel. The size of the executor pool is defined in services/config/config.go:
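In Jaeger 1.6, the setting looked roughly like this (the constant name is from memory and may differ; the value of three is the one the text refers to):

```go
var (
	// RouteWorkerPoolSize is the size of the goroutine pool used by the
	// frontend service to call the route service.
	RouteWorkerPoolSize = 3
)
```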
The default value of three explains why we saw, at most, three parallel executions in the very first trace we inspected. Let's change it to 100 (goroutines in Go are cheap) using the -W command-line flag, and restart HotROD:
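For example, reusing the earlier flags, with the assumed 100ms query delay:

```
$ ./example-hotrod all -M -D 100ms -W 100
```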
One more time, reload the HotROD web UI and repeat the experiment. We have to click on the buttons really quickly now because the requests return back in less than half a second.
If we look at one of these new traces, we will see that, as expected, the calls from the frontend to the route service are all done in parallel now, thus minimizing the overall request latency. We leave the final optimization of the driver service as an exercise for you.
Resource usage attribution
In this last example, we will discuss a technique that is not, strictly speaking, a functionality commonly provided by distributed tracing systems, but rather a side effect of the distributed context propagation mechanism that underlies all tracing instrumentation and can be relied upon in applications instrumented for tracing. We saw an example of this earlier when we discussed an implicit propagation of the frontend request ID used to tag transactions that were blocking in a mutex queue. In this section, we will discuss the use of metadata propagation for resource usage attribution.
Resource usage attribution is an important function in large organizations, especially for capacity planning and efficiency improvements. We can define it as a process of measuring the usage of some resource, such as CPU cores, or disk space, and attributing it to some higher-level business parameter, such as a product or a business line. For example, consider a company that has two lines of business that together require 1,000 CPU cores to operate, as maybe measured by some cloud provider.
Let's assume the company projects that one line of business will grow 10% next year, while the other will grow 100%. Let's also assume, for simplicity, that the hardware needs are proportional to the size of each business. We are still not able to predict how much extra capacity the company will need because we do not know how those current 1,000 CPU cores are attributed to each business line.
If the first business line is actually responsible for consuming 90% of the hardware, then its hardware needs will increase from 900 to 990 cores, and the second business line's needs will increase from 100 to 200 CPU cores, for a total of an extra 190 cores across both business lines. On the other hand, if the current needs of the business lines are split 50/50, then the total capacity requirement for next year will be 500 * 1.1 + 500 * 2.0 = 1,550 cores.
The main difficulty in resource usage attribution stems from the fact that most technology companies use shared resources for running their business. Consider such products as Gmail and Google Docs. Somewhere, at the top level of the architecture, they may have dedicated pools of resources, for example, load balancers and web servers, but the lower we go down the architecture, the more shared resources we usually find.
At some point, the dedicated resource pools, like web servers, start accessing shared resources like Bigtable, Chubby, Google File System, and so on. It is often inefficient to partition those lower layers of the architecture into distinct subsets in order to support multi-tenancy. If we require all requests to explicitly carry the tenant information as a parameter, for example, tenant="gmail" or tenant="docs", then we can accurately report resource usage by the business line. However, such a model is very rigid and hard to extend if we want to break down the attribution by a different dimension, as we need to change the APIs of every single infrastructure layer to pass that extra dimension. We will now discuss an alternative solution that relies on metadata propagation.
We have seen in the HotROD demo that the calculation of the shortest route performed by the route service is a relatively expensive operation (probably CPU intensive). It would be nice if we could calculate how much CPU time we spend per customer. However, the route service is an example of a shared infrastructure resource that is further down the architectural layers from the point where we know about the customer. It does not need to know about the customer in order to calculate the shortest route between two points. Passing the customer ID to the route service just to measure the CPU usage would be poor API design. Instead, we can use the distributed metadata propagation built into the tracing instrumentation. In the context of a trace, we know for which customer the system is executing the request, and we can use metadata (baggage) to transparently pass that information throughout the architecture layers, without changing all the services to accept it explicitly.
If we want to perform CPU usage aggregation by some other dimension, say, the session ID from the JavaScript UI, we can do so without changing all the components as well.
To demonstrate this approach, the route service contains the code to attribute the CPU time of calculations to the customer and session IDs, which it reads from the baggage. In the services/route/server.go file, we can see this code:
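The sketch below shows the shape of that code using the standard expvar package and the OpenTracing Go API; the variable names and the exact expvar map names are approximations of what the repository uses:

```go
package route

import (
	"context"
	"expvar"
	"time"

	"github.com/opentracing/opentracing-go"
)

// Two expvar maps: CPU seconds spent in route calculation, keyed by
// customer ID and by UI session ID (map names approximate Jaeger 1.6's).
var routeCalcByCustomer = expvar.NewMap("route.calc.by.customer.sec")
var routeCalcBySession = expvar.NewMap("route.calc.by.session.sec")

// stats ties each expvar map to the baggage key that provides its dimension.
var stats = []struct {
	expvar  *expvar.Map
	baggage string
}{
	{routeCalcByCustomer, "customer"},
	{routeCalcBySession, "session"},
}

// updateCalcStats attributes the elapsed calculation time to the customer
// and session found in the baggage of the current span, if any.
func updateCalcStats(ctx context.Context, delay time.Duration) {
	span := opentracing.SpanFromContext(ctx)
	if span == nil {
		return
	}
	delaySec := float64(delay/time.Millisecond) / 1000.0
	for _, s := range stats {
		key := span.BaggageItem(s.baggage)
		if key != "" {
			s.expvar.AddFloat(key, delaySec)
		}
	}
}
```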
As with the instrumented mutex we saw earlier, we don't pass any customer/session IDs because they can be retrieved from baggage via the context. The code actually uses some static configuration to know which baggage items to extract and how to report the metrics.
This code uses the expvar package ("exposed variables") from Go's standard library. It provides a standard interface to global variables that can be used to accumulate statistics about the application, such as operation counters, and it exposes these variables in JSON format via an HTTP endpoint, /debug/vars.
The expvar variables can be standalone primitives, like float and string, or they can be grouped into named maps for more dynamic statistics. In the preceding code, we define two maps: one keyed by customer ID and another by session ID, and combine them in the stats structure (an array of anonymous structs) with the names of the metadata attributes that contain the corresponding ID.
The updateCalcStats() function first converts the elapsed time of an operation into seconds as a float value, then iterates through the stats array and checks if the span metadata contains the desired key ("customer" or "session"). If the baggage item with that key is not empty, it increments the counter for that key by calling AddFloat() on the respective expvar.Map to aggregate the time spent on computing the route.
If we navigate to http://127.0.0.1:8083/debug/vars, which is one of the endpoints exposed by the HotROD application, we can see route.calc.by.* entries that give us the breakdown of time (in seconds) spent in calculation on behalf of each customer, and some UI sessions.
This approach is very flexible. If necessary, this static definition of the stats array can be easily moved to a configuration file to make the reporting mechanism even more flexible. For example, if we wanted to aggregate data by another dimension, say the type of web browser (not that it would make a lot of sense), we would need to add one more entry to the configuration and make sure that the frontend service captures the browser type as a baggage item.
The key point is that we do not need to change anything in the rest of the services. In the HotROD demo, the frontend and route services are very close to each other, so if we had to change the API, it would not be a major undertaking. However, in real-life situations, the service where we may want to calculate resource usage can be many layers down the stack, and changing the APIs of all the intermediate services, just to pass an extra resource usage aggregation dimension, is simply not feasible. By using distributed context propagation, we vastly minimize the number of changes needed. In Chapter 10, Distributed Context Propagation, we will discuss other uses of metadata propagation.
In a production environment, using the expvar module is not the best approach, since the data is stored individually in each service instance. However, our example has no hard dependency on the expvar mechanism. We could have easily used a real metrics API and had our resource usage statistics aggregated in a central metrics system like Prometheus.
This chapter introduced a demo application, HotROD, that is instrumented for distributed tracing, and by tracing that application with Jaeger, an open source distributed tracing system, demonstrated the following features common to most end-to-end tracing systems:
- Distributed transaction monitoring: Jaeger records the execution of individual requests across the whole stack of microservices, and presents them as traces.
- Performance and latency optimization: Traces provide very simple visual guides to performance issues in the application. The practitioners of distributed tracing often say that fixing the performance issue is the easy part, but finding it is the hard part.
- Root cause analysis: The highly contextualized nature of the information presented in the traces allows us to quickly narrow down to the parts of the execution responsible for a problem (for example, the timeouts when calling Redis, or the mutex queue blocking).
- Service dependency analysis: Jaeger aggregates multiple traces and constructs a service graph that represents the architecture of the application.
- Distributed context propagation: The underlying mechanism of metadata propagation for a given request enables not only tracing, but various other features, such as resource usage attribution.
In the next chapter, we will review more of the theoretical underpinnings of distributed tracing, the anatomy of a tracing solution, different implementation techniques that have been proposed in the industry and academia, and various trade-offs that implementors of tracing infrastructure need to keep in mind when making certain architectural decisions.