You're reading from Modern Distributed Tracing in .NET A practical guide to observability and performance analysis for microservices

Product type Paperback

Published in Jun 2023

Publisher Packt

ISBN-13 9781837636136

Length 336 pages

Edition 1st Edition

Tools

.NET

Concepts

Application Development

Author (1):

Liudmila Molkova

Ensuring consistency and structure

As defined earlier, spans are structured events that describe interesting operations.

A span’s start time, duration, status, kind, and context are strongly typed – they enable correlation and causation, allowing us to visualize traces and detect failures or latency issues.

The span’s name and attributes describe an operation but are not strongly typed or strictly defined. If we don’t populate them in a meaningful way, we can detect an issue but have no knowledge of what actually happened.

For example, for client HTTP calls, beyond generic properties, we want to capture at least the URL, method, and response code (or exception) – if any one of these is missing, we’re blind. Once we populate them, we can run powerful queries over such spans to answer the following common questions:

Which dependency calls were made in the scope of this request? Which of them failed? What was the latency of each of them?
Does my application make independent dependency calls in parallel or sequentially? Does it make unnecessary requests that could be made lazily instead?
Are dependency endpoints configured correctly?
What are the success or error rates and latency per dependency API?
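Populating these attributes on a client span can be sketched with the `System.Diagnostics` tracing primitives. This is a minimal illustration, not the book's own sample: the source name `Sketch.HttpClient`, the helper `RecordHttpCall`, and the URLs are hypothetical, and the attribute names are an assumption that should be checked against the version of the OpenTelemetry HTTP semantic conventions you target (older versions use different names).

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Collect finished activities (spans) in memory so the sketch needs no exporter.
var finished = new List<Activity>();
using var listener = new ActivityListener
{
    ShouldListenTo = s => s.Name == "Sketch.HttpClient",
    Sample = (ref ActivityCreationOptions<ActivityContext> options) =>
        ActivitySamplingResult.AllDataAndRecorded,
    ActivityStopped = finished.Add
};
ActivitySource.AddActivityListener(listener);

using var source = new ActivitySource("Sketch.HttpClient");

// Hypothetical helper: records one client HTTP call that has already completed.
// The attribute names below are assumed, not taken from the text.
void RecordHttpCall(string method, string url, int statusCode)
{
    using var activity = source.StartActivity(method, ActivityKind.Client);
    if (activity is null) return; // nobody is listening, nothing to record

    activity.SetTag("http.request.method", method);
    activity.SetTag("url.full", url);
    activity.SetTag("http.response.status_code", statusCode);
    if (statusCode >= 400)
    {
        activity.SetStatus(ActivityStatusCode.Error);
    }
}

RecordHttpCall("GET", "https://storage.example/memes/42", 200);
RecordHttpCall("GET", "https://storage.example/memes/43", 503);

foreach (var span in finished)
{
    Console.WriteLine(
        $"{span.DisplayName} {span.GetTagItem("url.full")} " +
        $"{span.GetTagItem("http.response.status_code")} {span.Status}");
}
```

In a real application, HTTP client instrumentation typically sets these attributes for you; the point of the sketch is only that the same three facts end up on every client span.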

Note

This analysis relies on an application using the same attributes for all HTTP dependencies. Otherwise, the operator that performs the queries will have a hard time writing and maintaining them.
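To see why consistent attributes matter, here is a rough sketch of one such query: error rate and average latency per dependency API. The span records are hand-written, hypothetical data standing in for whatever your observability backend stores, and the field names are illustrative. The query stays this short only because every span exposes the host, method, and status in the same way.

```csharp
using System;
using System.Linq;

// Hypothetical client spans, already reduced to the fields the query needs.
var spans = new[]
{
    (Host: "storage.example", Method: "GET",  Status: 200, DurationMs: 12.0),
    (Host: "storage.example", Method: "GET",  Status: 503, DurationMs: 950.0),
    (Host: "auth.example",    Method: "POST", Status: 200, DurationMs: 30.0),
    (Host: "storage.example", Method: "PUT",  Status: 201, DurationMs: 48.0),
};

// Group by dependency API (host + method), then compute basic health numbers.
var perDependency = spans
    .GroupBy(s => (s.Host, s.Method))
    .Select(g => (
        g.Key.Host,
        g.Key.Method,
        Calls: g.Count(),
        ErrorRate: g.Count(s => s.Status >= 400) / (double)g.Count(),
        AvgMs: g.Average(s => s.DurationMs)))
    .OrderBy(r => r.Host)
    .ThenBy(r => r.Method)
    .ToList();

foreach (var row in perDependency)
{
    Console.WriteLine(
        $"{row.Method} {row.Host}: calls={row.Calls}, " +
        $"errorRate={row.ErrorRate:P0}, avg={row.AvgMs:F1} ms");
}
```

If one service reported the host under a different attribute name, the grouping key would need a special case for it, and every new query would have to repeat that special case.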

With unified, community-driven telemetry collection taken off their plates, observability vendors can now focus fully on (semi-)automating analysis and giving us powerful performance and fault analysis tools.

OpenTelemetry defines a set of semantic conventions for spans, traces, and resources, which we’ll talk more about in Chapter 9, Best Practices.

Building application topology

Distributed tracing, combined with semantic conventions, allows us to build visualizations such as an application map (also known as a service map), as shown in Figure 1.11 – you can see your whole system along with key health metrics. It’s an entry point to any investigation.

Figure 1.11 – An Azure Monitor application map for a meme service is an up-to-date system diagram with all the basic health metrics

Observability vendors depend on trace and metrics semantics to build service maps. For example, the presence of HTTP attributes on the client span represents an outgoing HTTP call, and we need to show the outgoing arrow to a new dependency node. We should name this node based on the span’s host attribute.

If we see the corresponding server span, we can now merge the server node with the dependency node, based on span context and causation. There are other visualizations or automation tools that you might find useful – for example, critical path analysis, or finding common attributes that correspond to higher latency or error rates. Each of these relies on span properties and attributes following common semantics or at least being consistent across services.
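The node-merging logic described above can be sketched as follows. This is a simplified illustration under stated assumptions, not how any particular vendor builds its map: the span records, service names, and hosts are hypothetical, and a server span is treated as the other side of a client call when its parent ID equals the client span's ID.

```csharp
using System;
using System.Linq;

// Hypothetical spans from one trace. Host is only set on client spans.
var spans = new[]
{
    (SpanId: "a1", ParentId: "",   Service: "frontend", Kind: "server", Host: ""),
    (SpanId: "b2", ParentId: "a1", Service: "frontend", Kind: "client", Host: "storage:5050"),
    (SpanId: "c3", ParentId: "b2", Service: "storage",  Kind: "server", Host: ""),
    (SpanId: "d4", ParentId: "a1", Service: "frontend", Kind: "client", Host: "api.external.example"),
};

// Index server spans by parent ID: the parent of a server span is the
// client span that caused it, which is what lets us merge the two nodes.
var serverByParent = spans
    .Where(s => s.Kind == "server" && s.ParentId != "")
    .ToDictionary(s => s.ParentId, s => s.Service);

// Every client span is an outgoing arrow. Name the target after the
// instrumented service when we saw its server span, else after the host.
var edges = spans
    .Where(s => s.Kind == "client")
    .Select(c => (
        From: c.Service,
        To: serverByParent.TryGetValue(c.SpanId, out var service) ? service : c.Host))
    .Distinct()
    .ToList();

foreach (var edge in edges)
{
    Console.WriteLine($"{edge.From} -> {edge.To}");
}
```

The call to `storage:5050` collapses into the `storage` node because its server span was reported, while the external API has no server span and remains a dependency node named after its host.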

Resource attributes

Resource attributes describe the process, host, service, and environment, and are the same for all spans reported by the service instance – for example, the service name, version, unique service instance ID, cloud provider account ID, region, availability zone, and Kubernetes metadata.

These attributes allow us to detect anomalies specific to certain environments or instances – for example, an error rate increase only on instances that have a new version of code, an instance that goes into a restart loop, or a cloud service in a region and availability zone that experiences issues.

Based on standard attributes, observability vendors can write generic queries to perform this analysis or build common dashboards. It also enables the community to create vendor-agnostic tools and solutions for popular technologies.

Such attributes describe a service instance and don’t have to appear on every span – OTLP, for example, passes resource attributes once per batch of spans.
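As a rough model of this, the sketch below represents each exported batch as resource attributes stated once plus a list of spans, and then joins the two to compare error rates across code versions. The batch shape is a deliberate simplification rather than the actual OTLP schema, and all values are hypothetical; `service.name`, `service.version`, and `service.instance.id` are OpenTelemetry resource attribute names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical batches: resource attributes appear once, not on every span.
var batches = new[]
{
    (Resource: new Dictionary<string, string>
     {
         ["service.name"] = "frontend",
         ["service.version"] = "1.0.0",
         ["service.instance.id"] = "instance-1",
     },
     Spans: new[]
     {
         (Name: "GET /memes", IsError: false),
         (Name: "GET /memes", IsError: false),
     }),
    (Resource: new Dictionary<string, string>
     {
         ["service.name"] = "frontend",
         ["service.version"] = "1.1.0",
         ["service.instance.id"] = "instance-2",
     },
     Spans: new[]
     {
         (Name: "GET /memes", IsError: true),
         (Name: "GET /memes", IsError: false),
     }),
};

// Attach the batch-level version to each span, then group by version.
var errorRateByVersion = batches
    .SelectMany(b => b.Spans.Select(s => (
        Version: b.Resource["service.version"],
        s.IsError)))
    .GroupBy(x => x.Version)
    .ToDictionary(
        g => g.Key,
        g => g.Count(x => x.IsError) / (double)g.Count());

foreach (var (version, errorRate) in errorRateByVersion)
{
    Console.WriteLine($"{version}: {errorRate:P0} errors");
}
```

Because the version comes from the resource rather than from individual spans, the same query works for any service that reports standard resource attributes.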
