Virtualization data is data produced by hypervisor software. A hypervisor allows a single physical computer to run multiple instances of an operating system, making it behave like multiple computers. These instances are called virtual machines (VMs). The main benefits are increased utilization of the underlying hardware and greater workload isolation. The hypervisor also simplifies and accelerates the provisioning of VMs and allows workloads to be moved from one physical machine to another without interrupting the work being done.
Functionally, a hypervisor is very similar to a traditional operating system: it presents a uniform interface to the hardware and coordinates the sharing of resources by the VMs. Example hypervisors are VMware ESXi, Microsoft Hyper-V, Xen, and VirtualBox.
Virtualization has been around for a long time and is not limited to computation. It is also found in storage, networking, and application execution environments like Java and Python. This article, however, limits the source of virtualization data to hypervisors.
Data visibility
Monitoring virtualization data is similar to monitoring OS-related data in that we are interested in metrics such as CPU, disk, memory, memory management, I/O, and scheduling. Scheduling activity is especially important because VMs share resources. Together, these metrics help identify how to keep loads balanced and can explain why certain VMs are not performing as expected.
Commonly monitored components in a hypervisor are:
- Inventory of hosts and guests (clustered)
- Location of VM on host
- Resource utilization
- Resource scheduling
- Virtual memory
- Virtual CPU
- Virtual I/O (networking and storage interfaces)
- Filesystem and snapshot counts and sizes
- Hypervisor logs for tasks, events, and troubleshooting
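As an illustration of how the inventory and VM location items above can be put to work, the following minimal Python sketch detects when a VM changes hosts between inventory observations. The record layout (`timestamp`, `vm`, `host`) and the sample values are hypothetical and chosen only for illustration; actual inventory events will use whatever field names your data source provides.

```python
from typing import Dict, Iterable, List, NamedTuple, Tuple


class InventoryRecord(NamedTuple):
    """One hypothetical inventory observation: the host a VM was on at a given time."""
    timestamp: int  # seconds since epoch (illustrative)
    vm: str
    host: str


def detect_migrations(
    records: Iterable[InventoryRecord],
) -> List[Tuple[int, str, str, str]]:
    """Return (timestamp, vm, old_host, new_host) for every observed host change."""
    last_host: Dict[str, str] = {}
    moves = []
    # Process observations in time order so "previous host" is well defined.
    for rec in sorted(records, key=lambda r: r.timestamp):
        previous = last_host.get(rec.vm)
        if previous is not None and previous != rec.host:
            moves.append((rec.timestamp, rec.vm, previous, rec.host))
        last_host[rec.vm] = rec.host
    return moves


# Hypothetical sample data: vm-a moves from host-1 to host-2, vm-b stays put.
sample = [
    InventoryRecord(100, "vm-a", "host-1"),
    InventoryRecord(100, "vm-b", "host-1"),
    InventoryRecord(200, "vm-a", "host-2"),
    InventoryRecord(200, "vm-b", "host-1"),
]
print(detect_migrations(sample))
```

Correlating such host changes with performance metrics is one way to check whether a slow VM was recently moved to a busier host.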
Data application
When your Splunk deployment is ingesting virtualization data, you can use the data to achieve objectives related to the following use cases:
High-value fields
This data type has many available fields, but users typically derive the most value out of the fields listed here.
memory
Average or peak usage, especially as it relates to ballooning and swapping, both of which decrease the performance of VMs.
alarms
Some alarms point to a broad set of conditions that can affect performance. Others can indicate the cause of failures or predict failures.
inventory
Hosts that VMs are running on. There are fields that can be used to identify VM motion across hosts.
CPU
Compute-bound VMs and hosts, and how to spread load for efficiency and performance of applications running on the VMs.
storage
Where space is being consumed and how much is available before reaching limits. This is useful for capacity planning and avoiding outages.
Note that the names of these fields vary, depending on the data source. The Splunk Common Information Model (CIM) can be added to your deployment to normalize and validate data at search time, accelerate key data in searches and dashboards, or create new reports and visualizations. In the Common Information Model, virtualization data is typically mapped to the Inventory and Performance data models.
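To make the idea of normalizing field names concrete, here is a minimal Python sketch that renames source-specific fields to one common set of names. It is only a conceptual illustration: in a Splunk deployment this normalization is done by the CIM and the relevant add-ons at search time, not by code like this. The source-side field names (`cpu_usage_pct`, `CpuPercent`, and so on) are hypothetical, and the target names are assumed to resemble CIM-style performance fields; check the CIM documentation for the actual field list.

```python
# Hypothetical mapping from source-specific field names to common names.
# None of the source-side names are taken from a real add-on.
FIELD_ALIASES = {
    "vmware": {"cpu_usage_pct": "cpu_load_percent", "mem_used_mb": "mem_used"},
    "hyperv": {"CpuPercent": "cpu_load_percent", "MemoryUsedMB": "mem_used"},
}


def normalize(source: str, event: dict) -> dict:
    """Rename known fields to their common names; leave unknown fields untouched."""
    aliases = FIELD_ALIASES.get(source, {})
    return {aliases.get(name, name): value for name, value in event.items()}


vmware_event = normalize("vmware", {"cpu_usage_pct": 71.5, "vm": "vm-a"})
hyperv_event = normalize("hyperv", {"CpuPercent": 64.0, "vm": "vm-b"})

# Both events can now be queried with the same field name.
print(vmware_event["cpu_load_percent"], hyperv_event["cpu_load_percent"])
```

Once every source exposes the same field names, a single search or dashboard panel can cover hosts and VMs from different hypervisors.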
Known data sources and source types
Guidance for onboarding data can be found in the Splunk documentation, Getting Data In. In addition, the following data sources have add-ons and apps available in Splunkbase to optimize data collection and help you with analysis and visualizations.
Data Source | Sourcetype | Recommended Add-Ons |
VMware | sourcetype="vmware:*:*" There are many available sourcetypes, depending on what data you need. | |
Microsoft Hyper-V | sourcetype="microsoft:hyperv:*" There are many available sourcetypes, depending on what data you need. | |
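The wildcard sourcetype patterns listed for VMware and Microsoft Hyper-V each match a family of sourcetypes. The following Python sketch shows how such patterns group concrete sourcetype values by data source. It uses Python's `fnmatch` module as a stand-in for wildcard matching and is not how Splunk itself evaluates searches. The concrete sourcetype values in `examples` are illustrative assumptions; consult the relevant add-on documentation for the names that actually exist.

```python
from fnmatch import fnmatchcase
from typing import Optional

# Wildcard patterns for the two data sources covered in this article.
PATTERNS = {
    "VMware": "vmware:*:*",
    "Microsoft Hyper-V": "microsoft:hyperv:*",
}


def classify(sourcetype: str) -> Optional[str]:
    """Return the data source whose pattern matches the sourcetype, or None."""
    for source, pattern in PATTERNS.items():
        if fnmatchcase(sourcetype, pattern):
            return source
    return None


# Illustrative values only; real sourcetype names depend on the add-on in use.
examples = ["vmware:perf:cpu", "microsoft:hyperv:example", "syslog"]
for st in examples:
    print(st, "->", classify(st))
```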