Splunk Lantern

ESXi hosts with sustained high ballooning

 

When an ESXi host is running low on available physical memory, it attempts to reclaim memory from one or more virtual machines through a process called ballooning.

While some ballooning on your ESXi hosts is normal, frequent and sustained ballooning is a sign that the host is under memory pressure, which degrades the performance of the virtual machines assigned to that host. Monitor your environment for this condition so you can take corrective action as needed.

Data required

Procedure

  1. Ensure that you have installed the IT Essentials Work app to onboard VMware data and provide the various VMware entity type configurations and dashboards.
  2. Ensure that you are collecting VMware data through one or more Data Collection Nodes, which are essentially Splunk heavy forwarders with specific VMware collection configurations.
  3. Run the following search. You can optimize it by specifying an index and adjusting the time range.
| mstats max(vsphere.esxihost.mem.vmmemctl) AS max.vsphere.esxihost.mem.vmmemctl WHERE (index=vmware-perf-metrics) AND name=* AND cluster_name=* AND vcenter=* AND sourcetype=vmware_inframon:perf:mem BY name moid 
| eval is_high_ballooning = if('max.vsphere.esxihost.mem.vmmemctl' > 10, 1, 0)
| eventstats mean(max.vsphere.esxihost.mem.vmmemctl) AS mean_host_population stdev(max.vsphere.esxihost.mem.vmmemctl) AS stdev_host_population
| eval stdev_from_host_population = ('max.vsphere.esxihost.mem.vmmemctl'-'mean_host_population') / stdev_host_population
| sort - stdev_from_host_population

Search explanation

The following breakdown explains what each part of this search achieves. You can adjust this query based on the specifics of your environment.

Splunk Search: | mstats max(vsphere.esxihost.mem.vmmemctl) AS max.vsphere.esxihost.mem.vmmemctl WHERE (index=vmware-perf-metrics) AND name=* AND cluster_name=* AND vcenter=* AND sourcetype=vmware_inframon:perf:mem BY name moid
Explanation: Calculate the maximum of the average amount of memory reclaimed by the vmmemctl memory balloon driver, in kilobytes, for each host managed object ID (MOID).

Splunk Search: | eval is_high_ballooning = if('max.vsphere.esxihost.mem.vmmemctl' > 10, 1, 0)
Explanation: Create the is_high_ballooning field, set to 1 where the driver has reclaimed more than 10 kilobytes and to 0 otherwise.

Splunk Search: | eventstats mean(max.vsphere.esxihost.mem.vmmemctl) AS mean_host_population stdev(max.vsphere.esxihost.mem.vmmemctl) AS stdev_host_population
Explanation: Calculate the mean and the standard deviation of the memory reclaimed across all hosts.

Splunk Search: | eval stdev_from_host_population = ('max.vsphere.esxihost.mem.vmmemctl'-'mean_host_population') / stdev_host_population
Explanation: Calculate how many standard deviations away each MOID is from the average amount of memory reclaimed on all hosts and put the result in a field called stdev_from_host_population.

Splunk Search: | sort - stdev_from_host_population
Explanation: Sort results so that the hosts with the most standard deviations above the mean appear first.
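To make the arithmetic behind the eval, eventstats, and sort commands concrete, here is a minimal Python sketch that repeats the same steps outside of Splunk. It is an illustration only, not part of the search: the function name score_hosts and the dictionary keys are hypothetical, and the input values are the four hosts from the sample results in this article. It assumes a sample standard deviation, which is what reproduces the sample figures.

```python
from statistics import mean, stdev

# Threshold in KB, mirroring the "> 10" test in the search.
HIGH_BALLOONING_KB = 10


def score_hosts(max_memctl_kb):
    """Return one row per host, sorted by standard deviations from the mean.

    max_memctl_kb maps a host MOID to its maximum reclaimed memory in KB.
    The name and shape of this function are illustrative assumptions.
    """
    values = list(max_memctl_kb.values())
    population_mean = mean(values)
    # statistics.stdev is the sample standard deviation and needs 2+ values.
    population_stdev = stdev(values) if len(values) > 1 else 0.0

    rows = []
    for moid, value in max_memctl_kb.items():
        # Guard against dividing by zero when every host reports the same value.
        if population_stdev == 0:
            z_score = 0.0
        else:
            z_score = (value - population_mean) / population_stdev
        rows.append({
            "moid": moid,
            "max_memctl_kb": value,
            "is_high_ballooning": 1 if value > HIGH_BALLOONING_KB else 0,
            "stdev_from_host_population": z_score,
        })
    # Largest positive deviation first, like "sort - stdev_from_host_population".
    return sorted(rows, key=lambda r: r["stdev_from_host_population"], reverse=True)


# Values taken from the sample results table in this article.
sample = {"host-26": 22, "host-10": 19, "host-11": 10, "host-20": 10}
for row in score_hosts(sample):
    print(row["moid"], row["is_high_ballooning"],
          round(row["stdev_from_host_population"], 3))
```

With the sample values, the mean is 15.25 KB and host-26 lands about 1.091 standard deviations above it, while host-11 and host-20 sit about 0.849 below.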

Next steps

Sample results for this search are shown in the table below. Ballooning is one of the techniques used to reclaim memory: it prompts the guest OS to release memory so that the host can reclaim it. The High Ballooning value is Yes or No based on the threshold set in the search (more than 10 KB). The statistical values show how each host's memory pressure compares with the rest of the host population. These results help you determine which hosts have sustained high ballooning and which hosts do not. You can also select a candidate to balance load based on the mean and standard deviation of the hosts that are not ballooning. For example, based on the sample data, you might move load from host-26 to host-11 or host-20.

moid     Avg Memctl (KB)  High Ballooning  Mean of Host Pop.  Stdev of Host Pop.  Stdev Across Host Pop.
host-26  22               Yes              15.25              6.185               1.091
host-10  19               Yes              15.25              6.185               0.606
host-11  10               No               15.25              6.185               -0.849
host-20  10               No               15.25              6.185               -0.849
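The rebalancing suggestion above (move load from the most pressured host to hosts that are not ballooning) can be sketched in a few lines of Python. This is a hedged illustration rather than a Splunk feature: the function pick_rebalance and the keys high_ballooning and z are hypothetical names, and the rows are copied from the sample table. Whether a move is appropriate in practice depends on factors this sketch ignores, such as host capacity and cluster rules.

```python
def pick_rebalance(rows):
    """Suggest (source_host, target_hosts) from scored rows, or None.

    Each row is a dict with hypothetical keys:
      moid            - host managed object ID
      high_ballooning - True if the host exceeded the ballooning threshold
      z               - standard deviations from the host population mean
    """
    ballooning = [r for r in rows if r["high_ballooning"]]
    calm = [r for r in rows if not r["high_ballooning"]]
    if not ballooning or not calm:
        # Nothing to move, or nowhere to move it.
        return None

    # Source: the ballooning host furthest above the population mean.
    source = max(ballooning, key=lambda r: r["z"])
    # Targets: every non-ballooning host tied for the lowest deviation.
    lowest = min(r["z"] for r in calm)
    targets = [r["moid"] for r in calm if r["z"] == lowest]
    return source["moid"], targets


# Rows copied from the sample results table.
sample_rows = [
    {"moid": "host-26", "high_ballooning": True, "z": 1.091},
    {"moid": "host-10", "high_ballooning": True, "z": 0.606},
    {"moid": "host-11", "high_ballooning": False, "z": -0.849},
    {"moid": "host-20", "high_ballooning": False, "z": -0.849},
]
print(pick_rebalance(sample_rows))  # ('host-26', ['host-11', 'host-20'])
```

For the sample data, this returns host-26 as the source and host-11 and host-20 as the candidate targets, matching the example given in the text.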

Finally, you might be interested in other processes associated with the Monitoring VMware virtual machine performance use case.