Splunk Lantern

ESXi hosts with sustained high swapping

When an ESXi host can't reclaim enough memory through ballooning, it begins to swap memory to disk. Memory swapping on the host is a strong indication that the host is overprovisioned and under significant memory pressure. The latency introduced by swapping has a noticeable performance impact on the virtual machines running on the host. You want to monitor and investigate hosts with high memory swapping.

Data required 

Procedure

To optimize the search shown below, you should specify an index and a time range.

  1. Ensure that you have installed the IT Essentials Work app to onboard VMware data and provide the various VMware entity type configurations and dashboards.
  2. Ensure that you are collecting VMware data through one or more Data Collection Nodes, which are essentially Splunk heavy forwarders with specific VMware collection configurations.
  3. Run the following search. You can optimize it by specifying an index and adjusting the time range.
| mstats max(vsphere.esxihost.mem.llSwapUsed) AS vsphere.esxihost.mem.llSwapUsed WHERE (index=vmware-perf-metrics) BY name moid
| stats max(vsphere.esxihost.mem.llSwapUsed) AS max_p_average_mem_llSwapUsed_kiloBytes BY name moid
| eval is_high_swapping = if(max_p_average_mem_llSwapUsed_kiloBytes > 5000, 1, 0)
| eventstats mean(max_p_average_mem_llSwapUsed_kiloBytes) AS mean_host_population stdev(max_p_average_mem_llSwapUsed_kiloBytes) AS stdev_host_population
| eval stdev_from_host_population = (max_p_average_mem_llSwapUsed_kiloBytes - mean_host_population) / stdev_host_population
| sort - stdev_from_host_population
| table name moid max_p_average_mem_llSwapUsed_kiloBytes is_high_swapping stdev_from_host_population mean_host_population stdev_host_population

Search explanation

The table provides an explanation of what each part of this search achieves. You can adjust this query based on the specifics of your environment.

Splunk Search Explanation
| mstats max(vsphere.esxihost.mem.llSwapUsed) AS vsphere.esxihost.mem.llSwapUsed WHERE (index=vmware-perf-metrics) BY name moid
| stats max(vsphere.esxihost.mem.llSwapUsed) AS max_p_average_mem_llSwapUsed_kiloBytes BY name moid

Find the maximum value of llSwapUsed, which is the amount of space used for caching swapped pages in the host cache, in kilobytes, for each host, identified by name and managed object ID (MOID).

| eval is_high_swapping = if(max_p_average_mem_llSwapUsed_kiloBytes > 5000, 1, 0)

Create the is_high_swapping field and set it to 1 for hosts where more than 5,000 kilobytes of space are used and to 0 for all other hosts.

| eventstats mean(max_p_average_mem_llSwapUsed_kiloBytes) AS mean_host_population stdev(max_p_average_mem_llSwapUsed_kiloBytes) AS stdev_host_population

Calculate the mean and standard deviation of that maximum value across all hosts in the results.

| eval stdev_from_host_population = (max_p_average_mem_llSwapUsed_kiloBytes - mean_host_population) / stdev_host_population

Calculate how many standard deviations each host is from the mean amount of space used and put the result in a field called stdev_from_host_population.

| sort - stdev_from_host_population

Sort the results in descending order so that the hosts furthest above the mean appear first.

| table name moid max_p_average_mem_llSwapUsed_kiloBytes is_high_swapping stdev_from_host_population mean_host_population stdev_host_population

Display the results in a table with columns in the order shown.
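
The logic of the search can also be sketched outside Splunk. The following Python example is a minimal illustration, not part of the Splunk search itself: the rank_hosts function, the input tuple shape, and the output key max_kb are hypothetical names chosen for this sketch. It assumes the standard deviation is the sample standard deviation, and it returns no score when every host reports the same value, because the division would otherwise fail.

```python
from statistics import mean, stdev

THRESHOLD_KB = 5000  # same cutoff as the eval step in the search


def rank_hosts(samples):
    """Rank hosts by swap usage.

    samples: iterable of (name, moid, llSwapUsed_kb) tuples.
    This input shape is hypothetical and chosen only for the sketch.
    """
    # stats max(...) BY name moid: keep the largest value seen per host
    max_by_host = {}
    for name, moid, kb in samples:
        key = (name, moid)
        max_by_host[key] = max(kb, max_by_host.get(key, kb))

    values = list(max_by_host.values())
    # eventstats mean(...) stdev(...): one mean and one sample
    # standard deviation for the whole host population
    pop_mean = mean(values)
    pop_stdev = stdev(values) if len(values) > 1 else 0.0

    rows = []
    for (name, moid), kb in max_by_host.items():
        # Guard against division by zero when all hosts report the same value
        z = (kb - pop_mean) / pop_stdev if pop_stdev else None
        rows.append({
            "name": name,
            "moid": moid,
            "max_kb": kb,
            "is_high_swapping": 1 if kb > THRESHOLD_KB else 0,
            "stdev_from_host_population": z,
        })

    # sort - stdev_from_host_population: largest score first
    rows.sort(key=lambda r: r["stdev_from_host_population"] or 0.0, reverse=True)
    return rows


if __name__ == "__main__":
    # Hypothetical measurements, two of them for the same host
    demo = [
        ("esx-a", "host-1", 100),
        ("esx-a", "host-1", 6000),
        ("esx-b", "host-2", 0),
        ("esx-c", "host-3", 300),
    ]
    for row in rank_hosts(demo):
        print(row)
```

In this sketch, only the host whose maximum exceeds 5,000 KB is flagged, and it also sorts to the top because it sits furthest above the population mean.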

Next steps

Sample results for this search are shown in the table below. None of the hosts has crossed the 5,000 KB swapping threshold set in the search. The table shows how much memory each host swapped, in kilobytes, and how far each host sits from the population mean, so you can see which hosts are under more memory pressure than others. Use that information to decide which hosts to move load away from and which hosts can take it on.

moid     Avg Mem swapped (KB)  Swapping  Mean Host Pop.  Stdev of Host Pop.  Stdev Across Host Pop.
host-26  1000                  No        325             471.699             1.431
host-11  300                   No        325             471.699             -0.053
host-10  0                     No        325             471.699             -0.689
host-20  0                     No        325             471.699             -0.689
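
The figures in the sample results can be checked by hand. The short Python sketch below recomputes them from the four swap values in the table, assuming the standard deviation is the sample standard deviation (n - 1 denominator). The variable names are illustrative only.

```python
from statistics import mean, stdev

# Swap usage (KB) per host, copied from the sample results
max_kb = {"host-26": 1000, "host-11": 300, "host-10": 0, "host-20": 0}

values = list(max_kb.values())
pop_mean = mean(values)    # mean across the host population
pop_stdev = stdev(values)  # sample standard deviation (n - 1 denominator)

# Number of standard deviations each host is from the population mean
z_scores = {
    host: round((kb - pop_mean) / pop_stdev, 3)
    for host, kb in max_kb.items()
}

print(pop_mean, round(pop_stdev, 3))
for host, z in z_scores.items():
    print(host, z)
```

The recomputed mean, standard deviation, and per-host scores agree with the sample results to three decimal places. host-26 is the only host above the population mean, which makes it the first candidate to investigate even though it is well under the 5,000 KB threshold.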

Finally, you might be interested in other processes associated with the Monitoring VMware virtual machine performance use case.