Getting Started with Hazelcast, Second Edition

Get acquainted with the highly scalable data grid, Hazelcast, and learn how to bring its powerful in-memory features into your application

Product type: Paperback
Published: Jul 2015
Publisher: Packt
ISBN-13: 9781785285332
Length: 162 pages
Edition: 1st Edition
Author: Matthew Johns
Table of Contents (14 chapters)

Preface
1. What is Hazelcast?
2. Getting off the Ground
3. Going Concurrent
4. Divide and Conquer
5. Listening Out
6. Spreading the Load
7. Gathering Results
8. Typical Deployments
9. From the Outside Looking In
10. Going Global
11. Playing Well with Others
A. Configuration Summary
Index

Breaking the mould

Hazelcast is a radical, new approach towards data that was designed from the ground up around distribution. It embraces a new, scalable way of thinking in that data should be shared for resilience and performance while allowing us to configure the trade-offs surrounding consistency, as the data requirements dictate.

The first major feature of Hazelcast is its masterless nature. Each node is configured to be functionally the same and operates in a peer-to-peer manner. The oldest node in the cluster is the de facto leader. This node manages the membership, automatically deciding which node is responsible for which data. As new nodes join or drop out, this process is repeated and the cluster rebalances accordingly. This makes it incredibly simple to get Hazelcast up and running, as the system is self-discovering, self-clustering, and works straight out of the box.
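The membership behaviour described above can be pictured with a small toy model. This is only a sketch in plain Java, not Hazelcast code: the class `ClusterSketch`, its methods, and the round-robin assignment are assumptions made for illustration, and the real rebalancing logic is internal to Hazelcast.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of a masterless cluster; all names here are hypothetical. */
final class ClusterSketch {
    // Members are kept in join order, so index 0 is always the oldest node.
    private final List<String> members = new ArrayList<>();
    private final int partitionCount;

    ClusterSketch(int partitionCount) {
        this.partitionCount = partitionCount;
    }

    void join(String node) {
        members.add(node);
    }

    void leave(String node) {
        members.remove(node);
    }

    /** The oldest surviving member acts as the de facto leader. */
    String leader() {
        return members.get(0);
    }

    /**
     * Recomputes which node owns which partition.
     * Round-robin is an assumption chosen for simplicity.
     */
    Map<Integer, String> assignPartitions() {
        Map<Integer, String> owners = new LinkedHashMap<>();
        for (int p = 0; p < partitionCount; p++) {
            owners.put(p, members.get(p % members.size()));
        }
        return owners;
    }
}
```

Calling `assignPartitions()` again after every `join` or `leave` mimics the rebalancing step: every node runs the same code, and leadership simply falls to whichever member has been in the cluster longest.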

However, the second feature of Hazelcast that you should remember is that we are holding data entirely in-memory. This makes it incredibly fast, but this speed comes at a price. When a node is shut down, all the data that was held by it is lost. We combat this risk to resilience through replication, by holding a number of copies of each piece of data across multiple nodes. In the event of a single node failure, the overall cluster will not suffer any data loss. By default, the standard backup count is 1, so we can immediately enjoy basic resilience. However, don't pull the plug on more than one node at a time; wait until the cluster has reacted to the change in membership and reestablished the appropriate number of backup copies of data.
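To see why a backup count of 1 survives one failure but not two simultaneous ones, consider the following toy model. It is a sketch in plain Java under stated assumptions: `BackupSketch` and its methods are hypothetical, the backup is simply placed on the next node in the list, and the re-creation of backups after a failure (which Hazelcast performs itself) is deliberately left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of in-memory storage with one backup copy; names are hypothetical. */
final class BackupSketch {
    private final List<String> nodes = new ArrayList<>();
    // node name -> entries held in that node's memory
    private final Map<String, Map<String, String>> primary = new HashMap<>();
    private final Map<String, Map<String, String>> backup = new HashMap<>();

    void addNode(String node) {
        nodes.add(node);
        primary.put(node, new HashMap<>());
        backup.put(node, new HashMap<>());
    }

    /** Stores the value on an owner node and one copy on the next node (needs two or more nodes). */
    void put(String key, String value) {
        int owner = Math.floorMod(key.hashCode(), nodes.size());
        int backupNode = (owner + 1) % nodes.size();
        primary.get(nodes.get(owner)).put(key, value);
        backup.get(nodes.get(backupNode)).put(key, value);
    }

    /** Simulates a crash: everything held in the node's memory is gone. */
    void kill(String node) {
        nodes.remove(node);
        primary.remove(node);
        backup.remove(node);
    }

    /** Returns the value from any surviving copy, or null if every copy was lost. */
    String get(String key) {
        for (String node : nodes) {
            if (primary.get(node).containsKey(key)) return primary.get(node).get(key);
        }
        for (String node : nodes) {
            if (backup.get(node).containsKey(key)) return backup.get(node).get(key);
        }
        return null;
    }

    /** Lists the surviving nodes that hold a primary or backup copy of the key. */
    List<String> holdersOf(String key) {
        List<String> holders = new ArrayList<>();
        for (String node : nodes) {
            if (primary.get(node).containsKey(key) || backup.get(node).containsKey(key)) {
                holders.add(node);
            }
        }
        return holders;
    }
}
```

Killing one holder of a key leaves the other copy readable. Killing both holders before a new backup exists loses the entry, which is exactly the situation the warning above is about.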

So, when we introduce our new peer-to-peer distributed cluster, we get something that looks like the following figure:

[Figure: Breaking the mould]

Note

A distributed cache is by far the most powerful as it can scale up in response to changes to the application's needs.

We previously identified that multi-node caches tend to suffer from either saturation or consistency issues. In the case of Hazelcast, each node is the owner of, and hence responsible for, a number of partitions of the overall data, so the load is spread fairly across the cluster. Therefore, any saturation that exists will be at the cluster level rather than in any individual node, and we can address it simply by adding more nodes. In terms of consistency, the backup copies of the data are internal to Hazelcast by default and are not directly used. Thus, we enjoy strict consistency. This does mean that we have to interact with a specific node to retrieve or update a particular piece of data. However, exactly which node that is remains an internal operational detail and can vary over time. We, as developers, never actually need to know.

Hazelcast is not trying to entirely replace the role of a primary database. Its focus and feature set differ from those of a primary database (which has more transactionally stateful capabilities, long-term persistent storage, and so on). However, the more data and processes we master within Hazelcast, the less dependent we become on this constrained resource. Thus, we remove the potential need to change the underlying database systems.

If you imagine the scenario where the data is split into a number of partitions, and each partition slice is owned by a node and backed up on another, the interactions will look like the following figure:

[Figure: Breaking the mould]

This means that for data belonging to Partition 1, our application will have to communicate with Node 1; for data belonging to Partition 2, with Node 2; and so on. The slicing of the data into partitions is dynamic. In practice, there are typically more partitions than nodes, so each node will own a number of different partitions and hold backups for a number of others. As mentioned before, this is an internal operational detail and our application does not need to know it. However, it is important that we understand what is going on behind the scenes.
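The routing from a key to a partition, and from a partition to its owning and backup nodes, can be sketched as follows. This is an illustration in plain Java rather than Hazelcast's implementation: `RoutingSketch` and its methods are hypothetical, `hashCode()` stands in for Hazelcast's own hashing of the key, and the node placement rule is an assumption. The value 271 is Hazelcast's default partition count.

```java
import java.util.List;

/** Toy key-to-partition-to-node routing; names and placement rule are hypothetical. */
final class RoutingSketch {
    // Hazelcast's default partition count; typically far more partitions than nodes.
    static final int PARTITION_COUNT = 271;

    /** Maps a key to a partition; hashCode() is a stand-in for the real key hashing. */
    static int partitionFor(Object key) {
        return Math.floorMod(key.hashCode(), PARTITION_COUNT);
    }

    /** Picks the node that owns the partition (simple modulo placement, assumed). */
    static String ownerOf(int partitionId, List<String> nodes) {
        return nodes.get(partitionId % nodes.size());
    }

    /** Picks a different node to hold the backup (requires at least two nodes). */
    static String backupOf(int partitionId, List<String> nodes) {
        return nodes.get((partitionId + 1) % nodes.size());
    }
}
```

With three nodes, each one ends up owning roughly a third of the 271 partitions and holding backups for another third. The application only ever supplies the key; the lookup from key to node stays behind the scenes.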
