vRealize Operations Manager (vROps) supports the Continuous Availability (CA) feature, which separates the vROps cluster into two Fault Domains.
Fault Domains are composed of one or more analytics nodes grouped according to their physical location. This configuration lets you stretch vROps nodes across vSphere clusters, protecting the analytics cluster against the loss of an entire Fault Domain.
vRealize Operations Manager 8.3 enable Continuous Availability - pt.1
vRealize Operations Manager 8.3 enable Continuous Availability - pt.2
vRealize Operations Manager 8.3 enable Continuous Availability - pt.3
To implement vRealize Operations Manager CA, three main components are required:
- Master Node - it collects and stores data and by default is assigned to Fault Domain 1.
- Replica Node - it is the replica of the Master Node and is assigned to Fault Domain 2. The Master and Replica form a pair. If the Master Node fails, the Replica Node can take over all functions that the Master Node provides.
- Witness Node - it doesn't store or collect any data, but it serves as a tiebreaker if communication between the two Fault Domains is lost, which avoids a split-brain situation. The Witness brings one of the Fault Domains offline to avoid data inconsistency issues. Once the connection has been restored, the offline Fault Domain can be resumed through the Bring Online button. The Witness must be located in a third location that would not be affected by the loss of either Fault Domain 1 or 2.
The number of Data Nodes must always be an even number with a maximum of 16 (8 pairs). If a Data Node is added to Fault Domain 1 it must have a pair in Fault Domain 2 to preserve and replicate data that is added to its peer.
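The pairing rules above can be checked before deployment. The following is a minimal sketch, assuming a hypothetical `DataNode` record and `validate_data_nodes` helper (neither is part of vROps); it only encodes the constraints stated in this article.

```python
from dataclasses import dataclass

MAX_DATA_NODES = 16  # from the text: at most 16 data nodes (8 pairs)


@dataclass(frozen=True)
class DataNode:
    """Hypothetical record describing a planned data node."""
    name: str
    fault_domain: int  # expected to be 1 or 2


def validate_data_nodes(nodes):
    """Return a list of problems; an empty list means the layout passes."""
    problems = []
    fd1 = [n for n in nodes if n.fault_domain == 1]
    fd2 = [n for n in nodes if n.fault_domain == 2]
    if len(fd1) + len(fd2) != len(nodes):
        problems.append("every data node must be in Fault Domain 1 or 2")
    if len(nodes) % 2 != 0:
        problems.append("the number of data nodes must be even")
    if len(nodes) > MAX_DATA_NODES:
        problems.append(f"at most {MAX_DATA_NODES} data nodes are allowed")
    if len(fd1) != len(fd2):
        problems.append("each node in Fault Domain 1 needs a pair in Fault Domain 2")
    return problems
```

For example, `validate_data_nodes([DataNode("data-01", 1), DataNode("data-02", 2)])` returns an empty list, while a plan with two nodes in Fault Domain 1 and one in Fault Domain 2 reports both the odd count and the missing pair.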
If Data Nodes cannot be split into different vSphere clusters, do not enable CA.
If Remote Collectors are deployed, they are added outside the Fault Domain.
Because several factors affect the size of a vRealize Operations Manager cluster, use the online sizing tool to size your vROps cluster properly.
How Continuous Availability works
If a Fault Domain fails, the remaining active Fault Domain and the Witness Node keep the cluster alive by promoting the Replica Node to be the new Master Node. The cluster then runs in degraded mode.
The failover procedure is automatic and takes two to three minutes to resume vROps operations and restart data collection.
Since the cluster runs in degraded mode, you need to perform the following actions to fix it:
- Correct the primary node failure manually
- Replace the Primary Node to return to CA mode. This doesn't repair the failed node; instead, the new node assumes the Primary Node role.
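The failover behaviour described above can be pictured with a small state model. This is only an illustrative sketch: the `Cluster` class and `fail_domain` function are hypothetical names, not part of any vROps API, and refusing to fail over without a witness is an assumption made for the model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cluster:
    """Toy model of a CA cluster (hypothetical, for illustration only)."""
    master_fd: int = 1                 # Fault Domain holding the Master Node
    replica_fd: Optional[int] = 2      # Fault Domain holding the Replica Node
    witness_up: bool = True
    degraded: bool = False


def fail_domain(cluster, failed_fd):
    """Apply the loss of one Fault Domain and return the updated cluster."""
    if not cluster.witness_up:
        # Assumption: without the witness nothing arbitrates the failover.
        raise RuntimeError("no witness available to arbitrate the failover")
    if failed_fd == cluster.master_fd:
        # The replica is promoted and takes over the master role.
        cluster.master_fd, cluster.replica_fd = cluster.replica_fd, None
    elif failed_fd == cluster.replica_fd:
        # The master keeps running, but the replica is gone.
        cluster.replica_fd = None
    else:
        raise ValueError(f"unknown Fault Domain: {failed_fd}")
    cluster.degraded = True  # the cluster keeps running, but in degraded mode
    return cluster
```

Calling `fail_domain(Cluster(), 1)` leaves the master role in Fault Domain 2 with no replica and `degraded` set, which mirrors the state that the manual actions listed above are meant to resolve.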
Analytics Nodes must be deployed in each Fault Domain, on separate hosts for redundancy and isolation. Using anti-affinity rules, you can keep the nodes on separate hosts.
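A quick way to review a planned placement is to look for hosts that would run more than one node. The sketch below assumes a plain dictionary mapping node names to ESXi host names; the `shared_hosts` helper and all names in the example are hypothetical.

```python
from collections import defaultdict


def shared_hosts(placement):
    """Map each host that runs more than one node to its sorted node names."""
    by_host = defaultdict(list)
    for node, host in placement.items():
        by_host[host].append(node)
    return {host: sorted(nodes) for host, nodes in by_host.items() if len(nodes) > 1}


# Hypothetical placement: two nodes land on the same host.
plan = {
    "vrops-master": "esxi-01",
    "vrops-data-01": "esxi-01",
    "vrops-replica": "esxi-03",
}
conflicts = shared_hosts(plan)
```

An empty result means every node sits on its own host; any entry points to nodes that an anti-affinity rule should separate.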
Before proceeding with the deployment, keep in mind that to improve performance and reliability in vROps, some best practices should be applied to the cluster nodes:
- Deploy nodes on the same vSphere cluster in a single datacenter and add only one node at a time to a cluster
- Deploy cluster nodes to the same type of storage tier
- Use ESXi hosts with the same processor frequencies to ensure balanced performance
- Consider configuring some resource reservations for optimal performance
- When sizing, it is better to over-allocate than under-allocate resources
Install the vROps appliance
Download vROps from VMware and deploy the appliance to your VMware cluster as you would any other .OVA file.
Once the deployment has been completed, power on the vROps appliance.
When the appliance has booted, you can configure the first node by accessing the HTML GUI at the IP address reported on the console.
Configure a Single Node cluster
Open your preferred browser and enter the address https://<IP-primary-node> to access the GUI. You are automatically redirected to the Administration page.
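Before opening the browser, you may want to confirm that the node answers on HTTPS. The following is a minimal sketch using only the Python standard library; the function names are illustrative, and certificate verification is disabled on the assumption that a freshly deployed node still presents a self-signed certificate.

```python
import ipaddress
import ssl
import urllib.error
import urllib.request


def node_url(ip):
    """Build the https:// address for a node, validating the IP first."""
    addr = ipaddress.ip_address(ip)  # raises ValueError for a malformed address
    host = f"[{addr}]" if addr.version == 6 else str(addr)
    return f"https://{host}"


def is_reachable(ip, timeout=5.0):
    """Return True if the node answers HTTPS requests at all."""
    # Assumption: the node uses a self-signed certificate at this stage,
    # so verification is skipped for this reachability probe only.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    try:
        with urllib.request.urlopen(node_url(ip), timeout=timeout, context=context):
            return True
    except urllib.error.HTTPError:
        return True  # the server answered, even if with an error status
    except (urllib.error.URLError, OSError):
        return False
```

Calling `is_reachable("<IP-primary-node>")` with the address shown on the appliance console returns `False` while the appliance is still booting or the network path is blocked.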
Click New Installation to start the configuration.
Enter the admin Password then click Next.
Select Use the default certificates unless you have a certificate issued by a Certificate Authority.
Enter the Cluster Master Node Name (for example vROps-Master) and Add the NTP Server Address for clock synchronization. Click Next.
Leave the Availability Mode switch inactive. The Continuous Availability feature will be enabled later.
Since we are configuring a single node cluster, click Next.
Click Finish to begin the cluster initialization.
The system is preparing the node for first use.
Click the Start vRealize Operations Manager button to start the cluster.
Click Yes to proceed.
The cluster starts the initialization process.
The cluster is going online.
After some minutes, the vROps cluster is online.
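If you script the deployment, the wait for the cluster to come online can be expressed as a simple polling loop. This is a generic sketch: `wait_until` is a hypothetical helper, and the `check` callable you pass in (for example, a probe that confirms the login page responds) is your own; no vROps API is assumed.

```python
import time


def wait_until(check, timeout=900.0, interval=15.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call check() until it returns True or the timeout (in seconds) expires.

    Returns True if check() succeeded, False if the deadline passed first.
    """
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)  # pause between attempts to avoid hammering the node
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting; in normal use the defaults are enough.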
From your preferred browser, enter the address https://<IP_appliance> to access the GUI. Enter the credentials and click Login.
Accept the EULA and click Next.
Enter the Product Key and click Validate License Key.
When the license has been validated successfully, click Next.
Click Finish to complete.
The vROps main page is displayed. The configuration of the Master Node has been completed.
The installation of the First Node of vROps has been completed. Part 2 will cover the deployment and configuration of the Witness Node and Data Node.