Nutanix: Best Practices

This document describes best practices for monitoring your Nutanix Cluster so that you can build an IT Weather User Service that reflects its current state.

Nutanix Architecture, Concepts and Components

The Nutanix virtual computing platform is a scalable, converged computing and storage system specifically designed to host and store virtual machines.

All nodes in a Nutanix cluster aggregate to provide a unified, tiered storage pool and present virtual machines with seamless resource access. A global system architecture integrates each new node into the cluster, allowing you to scale the solution to meet the data needs of your infrastructure.

The basic unit for the cluster is a Nutanix node. Each node in the cluster runs a standard hypervisor and contains CPUs, memory, and local storage (SSDs / hard disks).

A Nutanix Controller virtual machine runs on each node, allowing for pooling of local storage across all nodes in the cluster.


Nutanix Cluster Monitoring

Monitor the status and performance of your Nutanix Cluster by adding specific Nutanix service templates:

  • Nutanix-Cluster-Status
  • Nutanix-Cluster-IOPS
  • Nutanix-Cluster-Latency
  • Nutanix-Cluster-IOBandwidth

Monitoring Capacity and Performance of Nutanix Cluster Storage

Monitor the overall usage of Nutanix Cluster Storage with the following Nutanix service template:

  • Nutanix-Cluster-StorageUsage

Nutanix Cluster Storage consists of a Container grouping one or more Storage Pools. If the ‘Nutanix-Cluster-StorageUsage’ service template raises an alert, you can quickly narrow down the underlying cause by also monitoring the Containers and Storage Pools with the following service templates:

  • Nutanix-Container-Usage
  • Nutanix-Storage Pool-Usage
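The drill-down from a cluster-level usage alert to the individual Containers and Storage Pools can be sketched as follows. This is only an illustration of the reasoning: the thresholds, the function names (`status_for`, `drill_down`) and the sample readings are assumptions made for this example, and none of them are part of ServiceNav or of the Nutanix service templates.

```python
from typing import Dict, List, Tuple

# Hypothetical thresholds (percent used); tune them to your own alerting policy.
WARNING_PCT = 80.0
CRITICAL_PCT = 90.0


def status_for(usage_pct: float) -> str:
    """Map a usage percentage to a monitoring status."""
    if usage_pct >= CRITICAL_PCT:
        return "CRITICAL"
    if usage_pct >= WARNING_PCT:
        return "WARNING"
    return "OK"


def drill_down(containers: Dict[str, Dict[str, float]]) -> List[Tuple[str, str, str]]:
    """Return (container, storage pool, status) for every pool that is not OK."""
    findings = []
    for container, pools in containers.items():
        for pool, usage_pct in pools.items():
            status = status_for(usage_pct)
            if status != "OK":
                findings.append((container, pool, status))
    return findings


# Illustrative readings only, not real measurements.
sample = {
    "Container_1": {"Pool_1": 93.5, "Pool_2": 41.0},
    "Container_2": {"Pool_3": 82.0},
}
findings = drill_down(sample)
for container, pool, status in findings:
    print(f"{status}: {pool} in {container}")
```

With per-Container and per-Pool checks in place, a cluster-level alert can be traced to the specific pool that is filling up rather than investigated by hand.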

Monitoring the Nutanix Blocks in the Cluster

Ensure the health of your Nutanix Blocks by using the following service template:

  • Nutanix-Disk-Status

Monitoring Each Node's Hypervisor Host

Also make sure the hypervisor on each Nutanix Node is in good health.

If you are using VMware technology, you can use the following service templates to monitor the health of your ESX hosts:

  • VMware-ESX-CPU
  • VMware-ESX-Datastore
  • VMware-ESX-DiskIO-Read
  • VMware-ESX-DiskIO-Write
  • VMware-ESX-Hardware
  • VMware-ESX-NetUsage
  • VMware-ESX-RAM
  • VMware-ESX-Runtime_Issues
  • VMware-ESX-Runtime_status
  • VMware-ESX-Services
  • VMware-ESX-Services-WithExclusion
  • VMware-ESX-SWAP

Model Your Nutanix Cluster with the ServiceNav Monitoring Tool

Once the monitoring is in place, you can create an IT Weather User Service.

The screenshot below shows an example. It presents the different elements that make up a Nutanix Cluster, including:

  • The status and performance of the Cluster,
  • Storage usage,
  • The physical state of the different devices comprising the cluster.

The advantage of such a visual model is that, when an issue occurs, you can directly identify the “root cause” of a degradation of the Nutanix Cluster. The screenshot below shows a degradation caused by a Critical alert from ‘Pool-Usage_1’ of ‘Container_1’.
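The idea behind such a model can be sketched in a few lines of code. The sketch below assumes that a group simply takes the worst status of its children; that rule, the `Node` class and the tree layout are assumptions made for illustration, and ServiceNav may apply different aggregation rules. The leaf names are taken from the example described above.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Severity order used to compare statuses (higher is worse).
SEVERITY = {"OK": 0, "WARNING": 1, "CRITICAL": 2}


@dataclass
class Node:
    name: str
    status: Optional[str] = None  # set on leaves (monitored services) only
    children: List["Node"] = field(default_factory=list)

    def effective_status(self) -> str:
        """A leaf reports its own status; a group reports its worst child."""
        if not self.children:
            return self.status or "OK"
        return max((c.effective_status() for c in self.children),
                   key=SEVERITY.__getitem__)

    def root_cause_path(self) -> List[str]:
        """Follow the worst child at each level down to the offending leaf."""
        if not self.children:
            return [self.name]
        worst = max(self.children, key=lambda c: SEVERITY[c.effective_status()])
        return [self.name] + worst.root_cause_path()


# Hypothetical model mirroring the three groups of elements listed above.
cluster = Node("Nutanix Cluster", children=[
    Node("Status and performance", children=[
        Node("Nutanix-Cluster-Status", "OK"),
        Node("Nutanix-Cluster-Latency", "OK"),
    ]),
    Node("Storage usage", children=[
        Node("Container_1", children=[Node("Pool-Usage_1", "CRITICAL")]),
    ]),
    Node("Devices", children=[Node("Nutanix-Disk-Status", "OK")]),
])

print(cluster.effective_status())
print(" > ".join(cluster.root_cause_path()))
```

Walking from the top-level service to the worst child at each level mirrors what the visual model lets you do at a glance.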

UK ServiceNav Product Development Manager; my priority is to be mindful of the particular requirements of all ‘English-speaking’ markets where ServiceNav is sold. I have over 20 years' experience in the IT monitoring field, covering a wide variety of products and technologies.