gRPC Go for Professionals

Implement, test, and deploy production-grade microservices

Product type: Paperback
Published in: Jul 2023
Publisher: Packt
ISBN-13: 9781837638840
Length: 260 pages
Edition: 1st Edition
Author: Clément Jean
Table of Contents (13 chapters)

Preface
1. Chapter 1: Networking Primer
2. Chapter 2: Protobuf Primer
3. Chapter 3: Introduction to gRPC
4. Chapter 4: Setting Up a Project
5. Chapter 5: Types of gRPC Endpoints
6. Chapter 6: Designing Effective APIs
7. Chapter 7: Out-of-the-Box Features
8. Chapter 8: More Essential Features
9. Chapter 9: Production-Grade APIs
10. Epilogue
11. Index
12. Other Books You May Enjoy

Protobuf versus JSON

If you have worked on a backend, or even a frontend, there is a 99.99 percent chance that you have worked with JSON. It is by far the most popular data format out there, and there are good reasons for that. In this section, we are going to discuss the pros and cons of both JSON and Protobuf and explain which one is more suitable for which situation. The goal here is to be objective because, as engineers, we need to be able to choose the right tool for the right job.

As we could write chapters about the pros and cons of each technology, we are going to reduce the scope of these advantages and disadvantages to three categories. These categories are the ones that developers care the most about when developing applications, as detailed here:

  • Size of serialized data: We want to reduce the bandwidth when sending data over the network
  • Readability of the data schema and the serialized data: We want to be able to have a descriptive schema so that newcomers or users can quickly understand it, and we want to be able to visualize the data serialized for debugging or editing purposes
  • Strictness of the schema: This quickly becomes a requirement when APIs grow, and we need to ensure the correct type of data is being sent and received between different applications

Serialized data size

In serialization, the Holy Grail is, in a lot of use cases, reducing the size of your data. This is because most often, we want to send that data to another application across the network, and the lighter the payload, the faster it should arrive on the other side. In this space, Protobuf is the clear winner against JSON. This is the case because JSON serializes to text whereas Protobuf serializes to binary and thus has more room to improve how compact the serialized data is. An example of that is numbers. If you set a number to the id field in JSON, you would get something like this:

{ "id": 123 }

First, we have some boilerplate with the braces, but most importantly we have a number that takes three characters, or three bytes. In Protobuf, if we set the same value to the same field, we would get the hexadecimal shown in the following callout.

Important note

In the chapter2 folder of the companion GitHub repository, you will find the files needed to reproduce all the results in this chapter. With protoc, we can display the hexadecimal representation of our serialized data. To do that, you can run the following command:

Linux/Mac: cat ${INPUT_FILE_NAME}.txt | protoc --encode=${MESSAGE_NAME} ${PROTO_FILE_NAME}.proto | hexdump -C

Windows (PowerShell): (Get-Content ${INPUT_FILE_NAME}.txt | protoc --encode=${MESSAGE_NAME} ${PROTO_FILE_NAME}.proto) -join "`n" | Format-Hex

For example:

$ cat account.txt | protoc --encode=Account account.proto | hexdump -C
00000000  08 7b                                             |.{|
00000002

Right now, this might look like magic numbers, but we are going to see in the next section how it is encoded into two bytes. Now, two bytes instead of three might look negligible but imagine this kind of difference at scale, and you would have wasted millions of bytes.
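To make the two bytes a little less magical before the next section explains them, here is a minimal Go sketch that builds them by hand with the standard library only. It assumes that id is field number 1 and is encoded as a varint, which is consistent with the 08 7b output above. The account struct and the encodeVarintField and hexString helpers are hypothetical names made up for this illustration; they are not part of the companion repository or of the Protobuf library.

```go
package main

import (
	"encoding/binary"
	"encoding/json"
	"fmt"
)

// account is a hypothetical stand-in for the Account message. It is only
// used here to produce the JSON form of the same data.
type account struct {
	ID uint64 `json:"id"`
}

// encodeVarintField hand-encodes a single varint field: first the tag
// (field number shifted left by 3, combined with the wire type), then the
// value, both written as base-128 varints.
func encodeVarintField(fieldNumber, value uint64) []byte {
	const wireTypeVarint = 0
	out := binary.AppendUvarint(nil, fieldNumber<<3|wireTypeVarint)
	return binary.AppendUvarint(out, value)
}

// hexString formats bytes like the middle column of hexdump -C.
func hexString(b []byte) string {
	return fmt.Sprintf("% x", b)
}

func main() {
	pb := encodeVarintField(1, 123)

	js, err := json.Marshal(account{ID: 123})
	if err != nil {
		panic(err)
	}

	fmt.Printf("protobuf: %s (%d bytes)\n", hexString(pb), len(pb))
	fmt.Printf("json:     %s (%d bytes)\n", js, len(js))
}
```

Running it should print 08 7b for the hand-built bytes, next to the compact JSON text and its length. In a real project we would let the generated code and the Protobuf library do this work; the manual version is only there to show where the bytes come from.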

Readability

The next important aspect of data serialization is readability. However, readability is a little bit too broad, especially in the context of Protobuf. As we saw, as opposed to JSON, Protobuf separates the schema from the serialized data. We write the schema in a .proto file and then the serialization gives us some binary. In JSON, the schema is the actual serialized data. So, to be clearer and more precise about readability, let us split it into two parts: the readability of the schema and the readability of the serialized data.

As for the readability of the schema, this is a matter of preference, but there are a few points that make Protobuf stand out. The first is that a Protobuf schema can contain comments, which is nice to have for extra documentation describing requirements. JSON does not allow comments, so we must find a different way to provide documentation. Generally, this is done with GitHub wikis or other external documentation platforms. This is a problem because this kind of documentation quickly becomes outdated as the project and the team working on it get bigger. A simple oversight and your documentation no longer describes the real state of your API. With Protobuf, it is still possible to have outdated documentation, but as the documentation is closer to the code, it provides more incentive and awareness to change the related comment.

The second feature that makes Protobuf more readable is the fact that it has explicit types. JSON has types but they are implicit. You know that a field contains a string if its value is surrounded by double quotes, a number when the value is only digits, and so on. In Protobuf, especially for numbers, we get more information out of types. If we have an int32 type, we obviously know that this is a number, but on top of that, we know that it can accept negative numbers and we know the range of numbers that can be stored in this field (-2,147,483,648 to 2,147,483,647). Explicit types are important not only for security (more on that later) but also for letting developers know the details of each field and letting them accurately describe their schemas to fulfill the business requirements.
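To see what implicit types mean in practice, here is a small Go sketch, assuming we decode JSON with the standard encoding/json package and without declaring a target type. The decodedType helper is a hypothetical name used only for this illustration. The decoder has to guess a type from the shape of the value, whereas an int32 field states its type and range up front.

```go
package main

import (
	"encoding/json"
	"fmt"
	"math"
)

// decodedType reports the Go type that encoding/json picks for the "id"
// key when the document is decoded without a declared target type.
func decodedType(doc string) (string, error) {
	var v map[string]any
	if err := json.Unmarshal([]byte(doc), &v); err != nil {
		return "", err
	}
	return fmt.Sprintf("%T", v["id"]), nil
}

func main() {
	docs := []string{`{"id": 123}`, `{"id": "123"}`, `{"id": -1.5}`}
	for _, doc := range docs {
		t, err := decodedType(doc)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%-14s -> %s\n", doc, t)
	}

	// An explicit int32 type tells us the sign and the range in advance.
	fmt.Printf("int32 range: %d to %d\n", math.MinInt32, math.MaxInt32)
}
```

Every JSON number ends up as a float64 here, whether it was meant to be an identifier, a counter, or a price. Nothing in the JSON document itself says which one it is.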

For readability of the schema, I think we can agree that Protobuf is the winner here because it can be written as self-documenting code and we get explicit types for every field in objects.

As for the readability of serialized data, JSON is the clear winner here. As mentioned, JSON is both the data schema and the serialized data. What you see is what you get. Protobuf, however, serializes the data to binary, which is way harder to read, even if you know how Protobuf serializes and deserializes data. In the end, this is a trade-off between readability and serialized data size. Protobuf will outperform JSON on serialized data size and is way more explicit on the readability of the data schema. However, if you need human-readable data that can be edited by hand, Protobuf is not the right fit for your use case.

Schema strictness

Finally, the last category is the strictness of the schema. This is usually a nice feature to have when your team and your project scale because it ensures that the schema is correctly populated, and for a certain target language, it shortens the feedback loop for the developers.

With Protobuf, the data always conforms to the schema because every field has an explicit type that can only contain certain values. We simply cannot pass a string to a field that was expecting a number or a negative number to a field that was expecting a positive number. This is enforced in the generated code, either by runtime checks for dynamically typed languages or at compile time for statically typed languages. In our case, since Go is a statically typed language, we will have compile-time checks.

And finally, in statically typed languages, a schema shortens the feedback loop because instead of having a runtime check that might or might not trigger an error, we simply have a compilation error. This makes our software more reliable, and developers can feel confident that if they were able to compile, the data set into the object would be valid.
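The compile-time check can be sketched in a few lines of Go. The Account struct here is a hand-written, hypothetical stand-in for what generated code might look like for a message with a uint32 id field, and newAccount is a helper made up for this example. The real generated code contains more than this.

```go
package main

import "fmt"

// Account is a hypothetical stand-in for a generated message type with a
// single uint32 field.
type Account struct {
	Id uint32
}

// newAccount only accepts a uint32, so the type check happens before the
// program ever runs.
func newAccount(id uint32) *Account {
	return &Account{Id: id}
}

func main() {
	acc := newAccount(123)
	fmt.Println(acc.Id)

	// Neither of the following lines compiles if uncommented:
	// newAccount("123") // a string is not a uint32
	// newAccount(-1)    // the constant -1 does not fit in a uint32
}
```

Note that the compiler catches mismatched types and out-of-range constants. An explicit conversion such as uint32(x) from a signed variable still compiles, so the guarantee is about the types we pass around, not about every possible conversion we choose to write.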

In pure JSON, we cannot ensure that our schema is correct at compile time. Most often, developers will add extra configurations such as JSON Schema to have this kind of assurance at runtime. This adds complexity to our project and requires every developer to be disciplined because they could simply go about their code without developing the schema. In Protobuf, we do schema-driven development. The schema comes first, and then our application revolves around the generated types. Furthermore, we have assurance at compile time that the values that we set are correct and we do not need to replicate the setup to all our microservices or subprojects. In the end, we spend less time on configuration and we spend more time thinking about our data schemas and the data encoding.
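For contrast, here is a minimal sketch of where a type mismatch surfaces with JSON. It does not use JSON Schema; it assumes we decode with Go's standard encoding/json package into a hypothetical account struct, which is enough to show that the mistake is only discovered while the program is running. The decode helper is also a name made up for this example.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// account is a hypothetical payload type with a single uint32 field.
type account struct {
	ID uint32 `json:"id"`
}

// decode parses a JSON document into an account. A type mismatch can only
// be reported here, at runtime, once the payload has arrived.
func decode(doc string) (account, error) {
	var a account
	err := json.Unmarshal([]byte(doc), &a)
	return a, err
}

func main() {
	docs := []string{`{"id": 123}`, `{"id": "abc"}`, `{"id": -1}`}
	for _, doc := range docs {
		a, err := decode(doc)

		var typeErr *json.UnmarshalTypeError
		switch {
		case errors.As(err, &typeErr):
			fmt.Printf("%s: rejected at runtime, field %q got a JSON %s\n",
				doc, typeErr.Field, typeErr.Value)
		case err != nil:
			fmt.Printf("%s: %v\n", doc, err)
		default:
			fmt.Printf("%s: ok, id=%d\n", doc, a.ID)
		}
	}
}
```

The string and the negative number are both rejected, but only when the payload is decoded. The sender compiled and shipped without any warning, which is the feedback loop that schema-driven development shortens.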
