Moving to new ground
So far, we have talked mostly about simple persisted data and caches, but in reality, we should not think of Hazelcast as purely a cache. It is much more powerful than just that. It is an in-memory data grid that supports a number of distributed collections, processors, and features. We can load the data from various sources into differing structures, send messages across the cluster, perform analytical processing on the stored data, take out locks to guard against concurrent activity, and listen to the goings-on inside the workings of the cluster. Most of these implementations correspond to a standard Java collection or function in a manner that is comparable to other similar technologies. However, in Hazelcast, the distribution and resilience capabilities are already built in.
- Standard utility collections:
- Map: Key-value pairs
- List: A collection of objects
- Set: Non-duplicated collection
- Queue: Offer/poll FIFO collection
- Specialized collection:
- Multi-Map: Key–collection pairs
- Lock: Cluster wide mutex
- Topic: Publish and subscribe messaging
- Concurrency utilities:
- AtomicNumber: Cluster-wide atomic counter
- IdGenerator: Cluster-wide unique identifier generation
- Semaphore: Concurrency limitation
- CountdownLatch: Concurrent activity gatekeeping
- Listeners: This notifies the application as things happen
Playing around with our data
In addition to data storage collections, Hazelcast also features a distributed executor service that allows runnable tasks to be created. These tasks can be run anywhere on the cluster to obtain, manipulate, and store results. We can have a number of collections that contain source data, spin up tasks to process the disparate data (for example, averaging or aggregating), and outputs the results into another collection for consumption.
However, more recently, along with this general-purpose capability, Hazelcast has introduced a few extra ways that allow us to directly interact with data. The MapReduce functionality allows us to build data-centric tasks to search, filter, and process held data to find potential insights within it. You may have heard of this functionality before, but this extracting of value from raw data is at the heart of what big data is all about (forgive the excessive buzzword cliché). While MapReduce focuses more on generating additional information, the EntryProcessor interface enables us to quickly and safely manipulate data in-place throughout the cluster—on single entries and whole collections or even selectively based on a search criteria.
Again, just as we can scale up the data capacities by adding more nodes, we can also increase the processing capacity in exactly the same way. This essentially means that by building a data layer around Hazelcast, if our application's needs rapidly increase, we can continuously increase the number of nodes to satisfy the seemingly extensive demands, all without having to redesign or rearchitect the actual application.