Twitter adopts Apache Kafka as their Pub/Sub System

  • 30 Nov 2018

Yesterday, Twitter shared that they are migrating to Apache Kafka from their in-house Pub/Sub system, EventBus, which is built on top of Apache DistributedLog. The main reasons behind the move were Kafka’s lower latency, resource savings, and strong community support.

Apache Kafka is an open-source distributed stream-processing platform that provides unified, high-throughput, low-latency handling of real-time data feeds. It has been broadly adopted by large companies such as LinkedIn, Netflix, and Uber, making it the de facto real-time messaging system in the industry.

Why did Twitter decide to move to Apache Kafka?


For several months, the Twitter team evaluated Kafka under workloads similar to those run on EventBus, such as durable writes, tailing reads, catchup reads, and high-fanout reads. They gave the following reasons for moving to Kafka:

Lower latency


The evaluation showed that Kafka provides significantly lower latency, regardless of throughput. Latency was measured as the timestamp difference between when a message was created and when the consumer read it. The lower latency can be attributed to these factors:

  • In EventBus, the serving and storage layers are decoupled, which introduces an additional hop. Kafka avoids this because a single process handles both storage and request serving.
  • EventBus explicitly blocks writes on fsync() calls, whereas Kafka relies on the OS to fsync() in the background.
  • Kafka supports zero-copy transfers, which improve performance by avoiding extra copies of data through user space and reducing the number of context switches between kernel and user mode.
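
To make the zero-copy point more concrete, here is a minimal Python sketch that sends the same file over a socket in two ways: a conventional read-then-send loop, and `socket.sendfile()`, which uses the operating system's `sendfile` call where available. This is only an illustration of the general technique, not Kafka's or Twitter's code (Kafka runs on the JVM). The function names, the payload, and the use of a local socket pair are all assumptions made for the example.

```python
import os
import socket
import tempfile


def send_with_copy(path, sock, chunk_size=64 * 1024):
    """Conventional path: each chunk is copied from the kernel into a
    user-space buffer by read(), then copied back into the kernel by send()."""
    sent = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            sock.sendall(chunk)
            sent += len(chunk)
    return sent


def send_zero_copy(path, sock):
    """Zero-copy path: socket.sendfile() delegates to os.sendfile() where the
    platform supports it, so file data does not pass through a user-space
    buffer. On other platforms Python falls back to a plain send() loop."""
    with open(path, "rb") as f:
        return sock.sendfile(f)


def recv_all(sock):
    """Read from the socket until the peer closes its sending side."""
    parts = []
    while True:
        data = sock.recv(65536)
        if not data:
            break
        parts.append(data)
    return b"".join(parts)


def demo():
    """Send a small hypothetical 'log segment' both ways and return
    (payload, bytes received via copy, bytes received via zero-copy)."""
    # Kept small on purpose so it fits in the socket buffer without a reader thread.
    payload = b"kafka-log-segment\n" * 200
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
        path = tmp.name
    received = []
    try:
        for sender in (send_with_copy, send_zero_copy):
            a, b = socket.socketpair()
            with a, b:
                sender(path, a)
                a.shutdown(socket.SHUT_WR)  # signal end of stream to the reader
                received.append(recv_all(b))
    finally:
        os.remove(path)
    return payload, received[0], received[1]


if __name__ == "__main__":
    payload, copied, zero = demo()
    print(len(payload), copied == payload, zero == payload)
```

Both paths deliver identical bytes; the difference is how much work happens in user space. With a payload this small the effect is not measurable, so the sketch shows the mechanism rather than the performance gain reported in the evaluation.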

Better resource savings


Because EventBus separates the serving and storage layers, it requires additional hardware, while Kafka uses a single host to provide both. During the evaluation, the team saw resource savings of 68% for single-consumer use cases and 75% for multiple-consumer use cases. In addition, recent versions of Kafka support data replication, which provides durability guarantees.

Strong community support


Hundreds of developers contribute to the Kafka project, which ensures regular bug fixes, improvements, and new features, whereas only eight or so engineers work on EventBus/DistributedLog. Kafka also comes with features that Twitter needed, such as a streaming library, an at-least-once HDFS pipeline, and exactly-once processing, which are not yet implemented in EventBus. Additionally, the team will be able to find solutions to problems they encounter on either the client or the server side and will have access to better documentation.

Check out the complete announcement on Twitter’s website.
