Following best practices for ingesting data from your Azure environment
Effectively collecting and utilizing information from your cloud environment is paramount for maintaining operational visibility, security, and performance. Microsoft Azure offers a vast array of services, generating critical data that, when properly ingested and monitored, can provide invaluable insights into your organization's health and efficiency. However, navigating the myriad Azure services and architectural patterns for data ingestion can be complex.
This article offers service-specific recommendations for ingesting data from your Azure environment into the Splunk platform. There are multiple methods for ingesting data from Azure. The Splunk Validated Architecture (SVA) for Azure GDI discusses each of these methods in detail and helps you understand which method is best for your use case. The guidance for each of the services below builds upon the guidance in the Azure SVA.
Azure data sources
Event Hubs
Any data source that can send data to an Event Hub in Azure should use Event Hubs to deliver that data to the Splunk platform. The recommended method is Data Manager, where it is available. If Data Manager is not available, use the Splunk Add-on for Microsoft Cloud Services.
Push-based methods are the second most performant way to collect Event Hub data and can be implemented with Azure Functions. Sample code is provided on GitHub.
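To make the push pattern concrete, here is a minimal sketch of the step an Azure Function would perform after receiving a batch of Event Hub records: wrapping each record in an HTTP Event Collector (HEC) envelope and preparing a POST to the Splunk platform. This is not the sample code from GitHub. The function names, the `mscs:azure:eventhub` sourcetype default, the `main` index, and the URL and token shown are all assumptions chosen for illustration.

```python
import json
import urllib.request


def build_hec_payload(records, sourcetype="mscs:azure:eventhub", index="main"):
    """Serialize records as newline-separated HEC event envelopes.

    The sourcetype and index defaults are illustrative assumptions.
    """
    lines = []
    for record in records:
        lines.append(json.dumps({
            "event": record,
            "sourcetype": sourcetype,
            "index": index,
        }))
    return "\n".join(lines).encode("utf-8")


def build_hec_request(hec_url, hec_token, records):
    """Create (but do not send) the POST request for one batch of records."""
    return urllib.request.Request(
        url=hec_url,
        data=build_hec_payload(records),
        headers={
            "Authorization": f"Splunk {hec_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder endpoint and token; replace with your own HEC settings.
    request = build_hec_request(
        "https://splunk.example.com:8088/services/collector/event",
        "00000000-0000-0000-0000-000000000000",
        [{"operationName": "example", "resultType": "Success"}],
    )
    # Passing the request to urllib.request.urlopen() would send the batch.
    print(request.get_method(), request.full_url)
```

Batching several records into one request reduces the number of HTTP calls per function invocation. A production function would also need retry and error handling, which this sketch omits.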
Pull-based ingestion can be done through the Splunk Add-on for Microsoft Cloud Services.
For best performance, use the correct number of partitions and throughput units for your workload. See the recommendations on GitHub.
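As a rough starting point for sizing, the sketch below estimates how many throughput units (TUs) a workload needs at peak ingress. It assumes the Standard-tier ingress limits of 1 MB per second or 1,000 events per second per TU, whichever is reached first; verify these against current Azure documentation. The function name and the 20% headroom default are hypothetical, and the sketch does not reproduce the recommendations on GitHub or address partition counts.

```python
import math

# Assumed Standard-tier ingress limits per throughput unit (TU).
# Check current Azure documentation before relying on these values.
INGRESS_MB_PER_TU = 1.0
INGRESS_EVENTS_PER_TU = 1000


def estimate_throughput_units(peak_mb_per_sec, peak_events_per_sec, headroom=0.2):
    """Return the smallest whole number of TUs covering peak ingress plus headroom."""
    by_bytes = peak_mb_per_sec / INGRESS_MB_PER_TU
    by_events = peak_events_per_sec / INGRESS_EVENTS_PER_TU
    # Whichever limit is hit first determines the requirement.
    required = max(by_bytes, by_events) * (1 + headroom)
    return max(1, math.ceil(required))


if __name__ == "__main__":
    # Hypothetical workload: 4 MB/s and 2,500 events/s at peak.
    print(estimate_throughput_units(4.0, 2500))
```

Taking the larger of the byte-based and event-based figures matters for workloads made up of many small events, where the event-count limit can be reached well before the bandwidth limit.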
Available resources:
- Splunk Lantern Article: Configuring Event Hubs in Data Manager
- Splunk GitHub: Azure Functions - Event Hub
- Splunk GitHub: Configure Azure Event Hub inputs for the Splunk Add-on for Microsoft Cloud Services
Azure virtual machine data
To collect OS or application logs, install a universal forwarder (UF) on Azure virtual machines. If installing a UF is not possible, logs can be sent to Azure storage and collected using the methods described in the Azure storage section.
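On the forwarder, file collection is configured with `monitor` stanzas in `inputs.conf`. The sketch below renders such a stanza with Python's standard `configparser` module, which can be handy when templating forwarder configuration across many virtual machines. The helper name, log path, index, and sourcetype are hypothetical examples, not values taken from this guidance.

```python
import configparser
import io


def render_monitor_stanza(path, index, sourcetype):
    """Render an inputs.conf monitor stanza for one file or directory."""
    # Interpolation is disabled so that '%' characters are written literally.
    config = configparser.ConfigParser(interpolation=None)
    config[f"monitor://{path}"] = {
        "index": index,
        "sourcetype": sourcetype,
        "disabled": "false",
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical values for a Linux virtual machine.
    print(render_monitor_stanza("/var/log/syslog", "os_linux", "syslog"))
```

The rendered text can be written to an `inputs.conf` file in an app that is deployed to the forwarder.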
Azure VM metadata should be pulled using the Azure Resource input in the Splunk Add-on for Microsoft Cloud Services. This metadata can also include disks, images, network interfaces, IP addresses, and security groups.
Available resources:
- Splunk GitHub: Configure Azure Resource modular inputs for the Splunk Add-on for Microsoft Cloud Services
Azure storage
There are two primary ways to get data from Azure storage, push and pull, and each has its pros and cons depending on the data type and volume. See the push versus pull discussion in the SVA to decide which fits your use case. The most common Azure storage logs are NSG flow logs, application logs, and diagnostic logs.
Available resources:
- Splunk GitHub: Azure Functions - Storage
- Splunk GitHub: Connect to your Azure Storage Account with the Splunk Add-on for Microsoft Cloud Services
Cost and billing
Cost and billing information can only be pulled through an API. The best method to pull this data is the Splunk Add-on for Microsoft Cloud Services.
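For context on what an API pull looks like, the sketch below builds a request against the Azure Consumption usage details endpoint for a single subscription. It is an illustration only and not necessarily the request the add-on issues. The `2021-10-01` API version, the function name, and the placeholder subscription ID and token are assumptions, and acquiring the bearer token from Microsoft Entra ID is out of scope here.

```python
import urllib.parse
import urllib.request

ARM_ENDPOINT = "https://management.azure.com"
API_VERSION = "2021-10-01"  # Assumed API version; confirm against Azure documentation.


def build_usage_details_request(subscription_id, bearer_token):
    """Create (but do not send) a GET request for subscription usage details."""
    query = urllib.parse.urlencode({"api-version": API_VERSION})
    url = (
        f"{ARM_ENDPOINT}/subscriptions/{subscription_id}"
        f"/providers/Microsoft.Consumption/usageDetails?{query}"
    )
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {bearer_token}"},
        method="GET",
    )


if __name__ == "__main__":
    # Placeholder subscription ID and token for illustration.
    request = build_usage_details_request(
        "00000000-0000-0000-0000-000000000000", "example-token"
    )
    print(request.get_method(), request.full_url)
```

Because this data is only available by polling, the collection interval determines how fresh the cost data is in the Splunk platform.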
Available resources:
- Splunk GitHub: Configure Azure Consumption (Billing) inputs for the Splunk Add-on for Microsoft Cloud Services
Entra ID and activity logs
Where it is available, Data Manager is the best method for collecting Entra ID audit logs and Azure activity logs. Data Manager supplies an Azure Resource Manager template that configures push-based ingestion using Azure Functions. The result is a scalable, monitored solution that is easy to configure.
Alternatively, Entra ID logs can be pulled directly using the Splunk Add-on for Microsoft Azure. Entra ID and Azure activity logs can also be sent to Event Hubs and retrieved using either the push or pull methods for Event Hubs.
Microsoft Defender
To retrieve all Microsoft Defender data, use a combination of methods. Depending on the specific Defender data you want to retrieve, use the Sankey chart linked in the resources below to determine the best path.
Available resources:
- External Resource: Getting Microsoft Cloud data into Splunk (Sankey chart)
- Splunk GitHub: Splunk Add-on for Microsoft Security
Microsoft 365
The supported method for retrieving Microsoft 365 logs is the Splunk Add-on for Microsoft Office 365.
Available resources:
- Splunk GitHub: Splunk Add-on for Microsoft Office 365
OS and application logs
The best method for collecting OS-level and application logs is to use the standard Splunk universal forwarder. If the application logs reside in a cloud service such as Azure Blob Storage, follow the recommendations for that service to collect the logs.
Available resources:
- Splunk Help: Monitor Windows data with the Splunk platform
- Splunk Help: Monitor files and directories
Additional resources
These additional resources might help you understand and implement this guidance:
- Splunk Lantern Article: Microsoft data descriptor
- Splunk Help: Get data with HTTP Event Collector
- Splunk Help: About the universal forwarder
- Splunk Help: Set up Data Manager
- Splunkbase: Splunk Add-on for Microsoft Office 365
- Splunkbase: Splunk Add-on for Microsoft Cloud Services
- Splunkbase: Splunk Add-on for Microsoft Azure

