Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Newsletter Hub
Free Learning
Arrow right icon
timer SALE ENDS IN
0 Days
:
00 Hours
:
00 Minutes
:
00 Seconds
Arrow up icon
GO TO TOP
Mastering vRealize Operations Manager

You're reading from   Mastering vRealize Operations Manager Analyze and optimize your IT environment by gaining a practical understanding of vRealize Operations 6.6

Arrow left icon
Product type Paperback
Published in Mar 2018
Publisher
ISBN-13 9781788474870
Length 426 pages
Edition 2nd Edition
Arrow right icon
Authors (3):
Arrow left icon
Chris Slater Chris Slater
Author Profile Icon Chris Slater
Chris Slater
Spas Kaloferov Spas Kaloferov
Author Profile Icon Spas Kaloferov
Spas Kaloferov
Scott Norris Scott Norris
Author Profile Icon Scott Norris
Scott Norris
Arrow right icon
View More author details
Toc

Table of Contents (17) Chapters Close

Preface 1. Going Ahead with vRealize Operations FREE CHAPTER 2. Which vRealize Operations Deployment Model Fits Your Needs 3. Initial Setup and Configuration 4. Extending vRealize Operations with Management Packs and Plugins 5. Badges 6. Getting a Handle on Alerting and Notifications 7. Capacity Management Made Easy 8. Aligning vRealize Operations with Business Outcomes 9. Super Metrics Made Super Easy 10. Creating Custom Views 11. Creating Custom Dashboards 12. Using vRealize Operations to Monitor Applications 13. Leveraging vRealize Operations for vSphere and vRealize Automation Workload Placement 14. Using vRealize Operations for Infrastructure Compliance 15. Troubleshooting vRealize Operations 16. Other Books You May Enjoy

Multi-node deployment, HA, and scalability

So far, we have focused on the new architecture and components of vRealize Operations 6.6, as well as starting to mention the major architectural changes that the GemFire-based Controller, Analytics, and Persistence layers have introduced. Now, before we close this chapter, we will dive down a little deeper into how data is handled in multi-node deployment, and, finally, how HA works in vRealize Operations 6.6, and what design decisions revolve around a successful deployment.

We are also going to mention what scalability considerations you should make to configure your initial deployment of vRealize Operations based on anticipated usage.

GemFire clustering

At the core of vRealize Operations, 6.6 architecture is the powerful GemFire in-memory clustering and distributed cache. GemFire provides the internal transport bus, as well as the ability to balance CPU and memory consumption across all nodes through compute pooling, memory sharing, and data partitioning. With this change, it is better to then think of the Controller, Analytics, and Persistence layers as components that span nodes, rather than individual components on individual nodes:

During deployment, ensure all your vRealize Operations 6.6 nodes are configured with the same amount of vCPUs and memory. This is because, from a load balancing point of view, vRealize Operations expects all nodes to have the same amount of resources as part of the controller's round-robin load balancing.

The migration to GemFire is probably the single largest underlying architectural change from vCenter Operations Manager 5.x, and the result of moving to a distributed in-memory database has made many of the new vRealize Operations 6.x features possible, including the following:

  • Elasticity and scale: Nodes can be added on demand, allowing vRealize Operations to scale as required. This allows a single Operations Manager instance to scale to 6 extra large nodes in a cluster, which can support up to 180,000 objects and 45,000,000 metrics.
  • Reliability: When GemFire HA is enabled, a backup copy of all data is stored in both the Analytics GemFire cache and the Persistence layer.
  • Availability: Even with the GemFire HA mode disabled, in the event of a failure, other nodes take over the failed services and the load of the failed node (assuming the failure was not the master node).
  • Data partitioning: vRealize Operations leverages GemFire data partitioning to distribute data across nodes in units called buckets. A partition region will contain multiple buckets that are configured during a startup, or migrated during a rebalance operation. Data partitioning allows the use of the GemFire MapReduce function. This function is a data-aware query, that supports parallel data querying on a subset of the nodes. The result of this is then returned to the coordinator node for final processing.

GemFire sharding

When describing the Persistence layer earlier, we listed the new components related to Persistence in vRealize Operations 6.6, Now it's time to discuss what sharding actually is.

GemFire sharding is the process of splitting data across multiple GemFire nodes for placement in various partitioned buckets. It is this concept in conjunction with the controller and locator services that balance the incoming resources and metrics across multiple nodes in the vRealize Operations Cluster. It is important to note that data is sharded per resource, and not per adapter instance. For example, this allows the load balancing of incoming and outgoing data, even if only one adapter instance is configured. From a design perspective, a single vRealize Operations cluster could then manage a maximum configuration vCenter by distributing the incoming metrics across multiple data nodes.

In vRealize Operations 6.6, the maximum number of VMware vCenter adapter instances certified is 60, and the maximum number of VMware vCenter adapter instances that were tested on a single collector is 40.

vRealize Operations data is sharded in both the Analytics and Persistence layers, which is referred to as GemFire cache sharding and GemFire Persistence sharding respectively.

Just because data is held in the GemFire cache on one node, this does not necessarily result in the data shard persisting on the same node. In fact, as both layers are balanced independently, the chance of both the cache shard and Persistence shard existing on the same node is 1/N, where N is the number of nodes.

In an HA environment, the databases that use GemFire sharding are Central, Alert/HIS, and FSDB. The Cassandra DB uses its own clustering mechanism.

Adding, removing, and balancing nodes

One of the biggest advantages of a GemFire-based cluster is the elasticity of adding nodes to the cluster as the number of resources and metrics grows in your environment. This allows administrators to add or remove nodes if the size of their environment changes unexpectedly; for example, a merger with another IT department, or catering for seasonal workloads that only exist for a small period of the year.

From a deployment perspective, we want to hide the complexities of scaling out from the user, so we deploy the whole stack at a time. When one instance/slice of the stack runs out of capacity (CPU/disk/memory), we can spin up another, and add more capacity. We can keep doing this as necessary to handle the scale.

Although adding nodes to an existing cluster is something that can be done at any time, there is a slight cost when doing so. As just mentioned, it is important when adding new nodes that they are sized the same as the existing cluster nodes; this will ensure during a rebalance operation that the load is distributed equally between the cluster nodes:

When adding new nodes to the cluster sometime after initial deployment, it is recommended that the Rebalance Disk option be selected under Cluster Management. As seen in the preceding figure, the warning advises that this is a very disruptive operation that may take hours and, as such, it is recommended that this be a planned maintenance activity. The amount of time this operation will take will vary depending on the size of the existing cluster and the amount of data in the FSDB. As you can probably imagine, if you are adding the eighth node to an existing seven-node cluster with tens of thousands of resources, there could potentially be several TBs of data that need to be re-sharded over the entire cluster. It is also strongly recommended that when adding new nodes the disk capacity and performance match that of existing nodes, as the Rebalance Disk operation assumes this is the case.

This activity is not required to start receiving the compute and network load balancing benefits of the new node. This can be achieved by selecting the Rebalance GemFire option, which is a far less disruptive process. As per the description, this process re-partitions the JVM buckets, balancing the memory across all active nodes in the GemFire federation. With the GemFire cache balanced across all nodes, the compute and network demand should be roughly equal across all the nodes in the cluster.

Although this allows early benefit from adding a new node into an existing cluster, unless a large number of new resources is discovered by the system shortly afterward, the majority of disk I/O for persisted, sharded data will occur on other nodes.

Apart from adding nodes, vRealize Operations also allows the removal of a node at any time, as long as it has been taken offline first. This can be valuable if a cluster was originally oversized for a requirement, and is considered a waste of physical compute resource; however, this task should not be taken lightly, as the removal of a data node without HA enabled will result in the loss of all metrics on that node. As such, it is recommended that removing nodes from the cluster is generally avoided.

If the permanent removal of a data node is necessary, ensure HA is first enabled to prevent data loss.
You have been reading a chapter from
Mastering vRealize Operations Manager - Second Edition
Published in: Mar 2018
Publisher:
ISBN-13: 9781788474870
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $19.99/month. Cancel anytime
Banner background image