ESXi hosts with sustained high swapping
When an ESXi host can't reclaim enough memory through ballooning, it begins to swap memory to disk. Memory swapping on the host is a strong indication that the host is overprovisioned and under significant memory pressure. The latency introduced by swapping has a noticeable performance impact on the virtual machines running on the host. You want to monitor and investigate hosts with high memory swapping.
Data required
- VMware. This procedure depends primarily on data from the Splunk Add-on for VMware Metrics; however, log and event data from the VMware environment can provide additional insight into its general health. Therefore, for best results, you should also download and install the Splunk Add-on for VMware ESXi Logs and the Splunk Add-on for vCenter Logs.
Procedure
To optimize the search shown below, you should specify an index and a time range.
- Ensure that you have installed the IT Essentials Work app to onboard VMware data and provide the various VMware entity type configurations and dashboards.
- Ensure that you are collecting VMware data through one or more Data Collection Nodes, which are essentially Splunk heavy forwarders with specific VMware collection configurations.
- Run the following search.
| mstats max(vsphere.esxihost.mem.llSwapUsed) AS vsphere.esxihost.mem.llSwapUsed WHERE (index=vmware-perf-metrics) BY name moid
| stats max(vsphere.esxihost.mem.llSwapUsed) AS max_p_average_mem_llSwapUsed_kiloBytes BY name moid
| eval is_high_swapping = if(max_p_average_mem_llSwapUsed_kiloBytes > 5000, 1, 0)
| eventstats mean(max_p_average_mem_llSwapUsed_kiloBytes) AS mean_host_population stdev(max_p_average_mem_llSwapUsed_kiloBytes) AS stdev_host_population
| eval stdev_from_host_population = (max_p_average_mem_llSwapUsed_kiloBytes - mean_host_population) / stdev_host_population
| sort - stdev_from_host_population
| table name moid max_p_average_mem_llSwapUsed_kiloBytes is_high_swapping stdev_from_host_population mean_host_population stdev_host_population
Search explanation
The table provides an explanation of what each part of this search achieves. You can adjust this query based on the specifics of your environment.
Splunk Search | Explanation |
---|---|
\| mstats max(vsphere.esxihost.mem.llSwapUsed) AS vsphere.esxihost.mem.llSwapUsed WHERE (index=vmware-perf-metrics) BY name moid | Find the maximum value of llSwapUsed, which is the amount of space used for caching swapped pages in the host cache, in kilobytes, for each host name and managed object ID (MOID). |
\| stats max(vsphere.esxihost.mem.llSwapUsed) AS max_p_average_mem_llSwapUsed_kiloBytes BY name moid | Take the maximum for each host name and MOID and rename the result to max_p_average_mem_llSwapUsed_kiloBytes. |
\| eval is_high_swapping = if(max_p_average_mem_llSwapUsed_kiloBytes > 5000, 1, 0) | Create the is_high_swapping field, set to 1 when the value is greater than 5000 KB and to 0 otherwise. |
\| eventstats mean(max_p_average_mem_llSwapUsed_kiloBytes) AS mean_host_population stdev(max_p_average_mem_llSwapUsed_kiloBytes) AS stdev_host_population | Calculate the mean and standard deviation across all hosts in the results. |
\| eval stdev_from_host_population = (max_p_average_mem_llSwapUsed_kiloBytes - mean_host_population) / stdev_host_population | Calculate how many standard deviations each MOID is from the mean amount of space used and put the result in a field called stdev_from_host_population. |
\| sort - stdev_from_host_population | Sort results so that the hosts the most standard deviations above the mean appear first. |
\| table name moid max_p_average_mem_llSwapUsed_kiloBytes is_high_swapping stdev_from_host_population mean_host_population stdev_host_population | Display the results in a table with columns in the order shown. |
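The arithmetic behind the threshold flag and the standard-deviation score can be easier to follow outside of SPL. The following Python sketch mirrors those steps on a handful of made-up values. The host IDs, the numbers, and the `score_hosts` helper are hypothetical and exist only for illustration; the 5000 KB threshold and the field names come from the search. It assumes that SPL's `stdev` is the sample standard deviation, which is what `statistics.stdev` computes.

```python
from statistics import mean, stdev

# Same threshold as the is_high_swapping test in the search, in KB.
THRESHOLD_KB = 5000

def score_hosts(max_swap_kb, threshold_kb=THRESHOLD_KB):
    """Flag and rank hosts by how far their swap usage sits from the mean.

    max_swap_kb maps a host MOID to its maximum llSwapUsed value in KB.
    """
    values = list(max_swap_kb.values())
    population_mean = mean(values)
    # Sample standard deviation needs at least two hosts.
    population_stdev = stdev(values) if len(values) > 1 else 0.0

    rows = []
    for moid, value in max_swap_kb.items():
        # With zero spread there is no meaningful score, so leave it empty.
        score = (value - population_mean) / population_stdev if population_stdev else None
        rows.append({
            "moid": moid,
            "max_swap_kb": value,
            "is_high_swapping": 1 if value > threshold_kb else 0,
            "stdev_from_host_population": score,
        })

    # Largest score first; hosts without a score go last.
    rows.sort(
        key=lambda r: (
            r["stdev_from_host_population"] is not None,
            r["stdev_from_host_population"] or 0.0,
        ),
        reverse=True,
    )
    return rows

# Hypothetical values, not real measurements.
sample_hosts = {
    "host-101": 1200.0,
    "host-102": 300.0,
    "host-103": 6400.0,
    "host-104": 100.0,
}

for row in score_hosts(sample_hosts):
    print(row)
```

In this made-up data set only host-103 exceeds the threshold, and it also ranks first because it sits furthest above the mean. The score is relative to the other hosts, so a host can rank first without crossing the threshold; read the two columns together.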
Next steps
Sample results for this search are shown in the table below. None of the hosts have crossed the swapping threshold set in the search. The table shows how much memory each host swapped, in KB, and how far each host sits from the mean of the host population, so you can see which hosts are under more memory pressure than others. From that information, you can decide whether to move load, and between which hosts, to balance it better.
moid | Avg Mem swapped (KB) | Swapping | Mean Host Pop. | Stdev of Host Pop | Stdev Across Host Pop |
---|---|---|---|---|---|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
Finally, you might be interested in other processes associated with the Monitoring VMware virtual machine performance use case.