Splunk Lantern

Getting started with Splunk ITSI

Splunk IT Service Intelligence (ITSI) is a premium analytics and IT management solution that correlates data and applies machine learning to it to provide service insights and event analytics. ITSI lets you act as a manager of managers: your teams can detect, respond to, and resolve incidents from one place, and predict and prevent incidents before they impact customers.

Here is a high-level view of Splunk ITSI and its key capabilities:

[Image: high-level view of Splunk ITSI and its key capabilities]

Step 1: Getting Data In

There are two ways to get data into ITSI: entities and content packs.

Entities & Entity Integration

Entities and entity integrations collect and aggregate data into Splunk ITSI. Data is collected into entities. You can define entities in any way that fits your needs, but they usually represent servers, DNS groups, firewalls, or other devices. The data can be metrics, logs, or traces: anything that helps you gain better visibility into the health of the services you are responsible for. Data is streamed and collected from native systems or from management and monitoring tools like Splunk Infrastructure Monitoring.

All entities exist in the Global team and can be created in a few ways: 

Manually create a single entity in ITSI

Create a single entity in ITSI to associate with events that your Splunk platform deployment receives.

You have to log in as a user with the itoa_admin or itoa_team_admin ITSI role. 

For more information, see Documentation - create a single entity in ITSI.

Manually import entities from a Splunk search in ITSI

Create entities from ITSI module searches, saved searches, or ad hoc searches using indexed data coming into your Splunk platform deployment. 

ITSI uses the itsiimportobjects command to import entities from searches.

You can import a maximum of 50,000 entities at a time in ITSI. If you attempt to import more than 50,000 entities, only the first 50,000 are imported.
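
Because anything beyond the first 50,000 entities is dropped, one way to stay under the limit is to split a large entity set into batches and import each batch separately. The following is a minimal sketch in plain Python, not an ITSI API. The `hosts` inventory and the `batch_entities` helper are hypothetical names used only for illustration.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")

# Limit stated in this guide: at most 50,000 entities per import.
MAX_ENTITIES_PER_IMPORT = 50_000


def batch_entities(
    entities: Sequence[T], batch_size: int = MAX_ENTITIES_PER_IMPORT
) -> Iterator[List[T]]:
    """Yield consecutive slices of `entities`, each no longer than `batch_size`."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(entities), batch_size):
        yield list(entities[start:start + batch_size])


# Hypothetical inventory of 120,000 host names.
hosts = [f"host-{i:06d}" for i in range(120_000)]
batches = list(batch_entities(hosts))
print([len(batch) for batch in batches])  # [50000, 50000, 20000]
```

Each batch can then be fed to its own import, so no entity is silently left out.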

Prerequisites:

  • ITSI role: You have to log in as a user with the itoa_admin or itoa_team_admin ITSI role and access to the Global team.
  • Indexed data: You must have already indexed data you want to associate with entities. 

For more information, see Documentation - import entities from a search in ITSI.

Manually import entities from a CSV file in ITSI 

Importing entities from CSV files is an efficient way to define multiple entities. You can dump data from a change management database (CMDB) or asset inventory database into a CSV file and automate the import for ongoing updates.

ITSI uses the itsiimportobjects command to import entities from a CSV file. All events that your Splunk platform deployment indexes from a manual entity import from a CSV file are stored in the itsi_import_objects index, and each event has the itsi_import_objects:csv source type.

You can import a maximum of 50,000 entities at a time in ITSI. If you attempt to import more than 50,000 entities, only the first 50,000 are imported.

Prerequisites: 

  • ITSI role: You have to log in as a user with the itoa_admin ITSI role.
  • CSV file: You must have a CSV file that contains entity definitions. Specify column names in the first row. In each subsequent row, specify an entity title and entity type, as well as one or more entity aliases, and one or more entity information fields. To associate an entity with a service, provide a column with the name of the service. Importing from a CSV file has a limit of one service and one entity per row. There is no limit on the number of dependent services, entity aliases, or entity rule values per row. A CSV file can contain multiple rows. Importing from a CSV file supports five different separators: comma (,), semicolon (;), pipe (|), tab (\t), and caret (^). 

In this example you want to create two entities called appserver-04 and appserver-05, and associate appserver-04 with the Web A service and associate appserver-05 with the Web B service. The Web A service already exists in ITSI but the Web B service does not. The following image shows the CSV file to import:
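
The image is not reproduced here, so the following Python sketch builds a CSV file along the lines described. The column names (`title`, `entity_type`, `alias_host`, `info_env`, `service`) and the field values are assumptions made for illustration, not names that ITSI requires. Only the entity names, the service names, and the list of supported separators come from the text above.

```python
import csv
import io
from typing import Dict, List

# Hypothetical column names for illustration only.
FIELDNAMES = ["title", "entity_type", "alias_host", "info_env", "service"]

# Separators listed in this guide as supported for CSV imports.
SUPPORTED_SEPARATORS = {",", ";", "|", "\t", "^"}

# One entity and one service per row, as the import requires.
rows: List[Dict[str, str]] = [
    {"title": "appserver-04", "entity_type": "Linux", "alias_host": "appserver-04",
     "info_env": "production", "service": "Web A"},
    {"title": "appserver-05", "entity_type": "Linux", "alias_host": "appserver-05",
     "info_env": "production", "service": "Web B"},
]


def build_entity_csv(entity_rows: List[Dict[str, str]], delimiter: str = ",") -> str:
    """Return CSV text with a header row followed by one row per entity."""
    if delimiter not in SUPPORTED_SEPARATORS:
        raise ValueError(f"unsupported separator: {delimiter!r}")
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=FIELDNAMES, delimiter=delimiter, lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(entity_rows)
    return buffer.getvalue()


csv_text = build_entity_csv(rows)
print(csv_text)
```

The first row carries the column names and each later row defines one entity with its service, matching the prerequisites listed above.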

For more information, see Documentation - import entities from a CSV file in ITSI.

After you import entities either by creating single entities or from a Splunk search, you can configure recurring imports to update existing entities and create new entities. However, you can't set up a recurring entity import from a CSV file. To configure recurring entity imports from data that's stored in a CSV file, you have to configure a universal forwarder to monitor the CSV file and send data to your Splunk platform deployment, run an entity import from a Splunk search, and configure a recurring import from the Splunk search. 

For more information, see Set up a recurring entity import from a CSV file.

You can also automatically create entities and collect data on a recurring basis with ITSI entity integrations. The integrations that are available are:

To learn more, see Overview of entity integrations in ITSI.

Content Packs & Splunk App for Content Packs 

Content packs are individual preconfigured packs that provide capabilities for a specific use case. They can be installed directly within ITSI. Many content packs include service templates, so you can easily link one of your existing services to predetermined key performance indicators (KPIs), allowing you to get up and running faster and more easily.

Splunk App for Content Packs is a free application for ITSI (version 4.9 and later) that acts as a one-stop shop for content packs, and for out-of-the-box searches and dashboards for common IT infrastructure monitoring sources. With this app, you no longer need to use the backup/restore functionality to install content packs. Instead, the app contains a library of readily updated content packs, and you use it to update all of them rather than updating each content pack individually.

The easiest way to onboard your data into Splunk ITSI is through content packs available on the Splunk App for Content Packs.  

Prerequisite: You must have command-line access and Splunk admin access to an ITSI v4.9 or later instance. 

  1. Download the Splunk App for Content Packs on Splunkbase.
  2. Install the app per the instructions on the Splunk Docs page.
  3. Go to Configuration > Data Integrations to see the available content packs.

Make sure to install the associated Add-On for the Content Pack you downloaded! For example, there is a corresponding Unix and Linux Add-On that works with the Monitoring Unix and Linux content pack.

For more information on how to install the Splunk App for Content Packs on Splunk Cloud Platform or in on-premises environments, how to install content packs for ITSI version 4.8.x and earlier, and which content packs are available, see the Splunk Content Packs Manual.

Step 2: Services and Service Insights

A service is a set of interconnected applications and hosts that are configured to offer a specific service to the organization. These services can be internal, such as an organization’s email system, or external, such as an organization’s website.

You can create business and technical services that model those within your environment. Some services might have dependencies on other services. Services contain Key Performance Indicators (KPIs), which make it possible to monitor service health via service health scores, perform root cause analysis, receive alerts, and ensure that your IT operations are in compliance with business service-level agreements (SLAs).   

ITSI’s Service Insights allows you to create glass tables to help you monitor real-time interrelationships and dependencies via KPIs and service health scores across your IT and business services in one view. Glass tables also feature a drawing canvas where you can add visualizations in the form of KPIs and service health scores, upload images and icons, and add charts.

Another glass table used to evaluate business, operational and SLA performances, along with infrastructure status. 

Service Insights also enables you to create and use four kinds of dashboards: infrastructure overview, service analyzer, deep dives, and predictive analytics. 

Infrastructure Overview Dashboards provide a consolidated view of all your data integrations and investigation tools for operating systems, virtual infrastructures, containers, and cloud services. 

Service Analyzer Dashboards help you map dependencies based on a connection between devices and applications in a tile or topology view. You can visually correlate services to underlying infrastructure with a tile or tree view. You are also able to drill down to the code level and identify root causes directly from service monitoring dashboards.

Deep Dives Dashboards are an investigative tool to help you identify and analyze issues in your IT environment. 

Deep dives display a side-by-side view of KPIs and service health scores over time to help you zoom in on metric and log data and visually correlate root cause. 

Use side-by-side displays of multiple KPIs and correlate metrics over time to identify root causes.

Predictive Analytics Dashboards predict future incidents 30 minutes in advance using machine learning algorithms and historical service health scores.

Top five contributing service metrics are displayed to guide troubleshooting.

To learn more about these dashboards, see the Splunk ITSI interactive demo.

The following are included: Glass Tables (step 1/8), Predictive Analytics Dashboard (step 2/8), Service Analyzer Dashboard (step 5/8), and Deep Dive Dashboard (step 6/8).

Service Modeling and Service Decomposition 

Before you are ready to set up your dashboards and services in Splunk ITSI, it’s important to identify what services will provide the most value.

Best Practices for selecting the right services to apply in Splunk ITSI

[Image: best practices for selecting the right services to apply in Splunk ITSI]

To learn more, see the Tech Talk: Service Decomposition.

Create KPIs for Your Services

Every service you map in ITSI will have at least one KPI. KPIs are recurring saved searches that return the value of an IT performance metric. They are created within a specific service and define everything needed to generate searches to understand the underlying data, including how to access, aggregate, and qualify with thresholds. There are two types of KPIs: business and technical.
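
To make the "aggregate and qualify with thresholds" idea concrete, here is a minimal Python sketch. It is not how ITSI implements KPIs, which are saved searches. The threshold bounds, the severity labels, and the CPU-utilization samples are all hypothetical values chosen for illustration.

```python
from statistics import mean
from typing import List, Sequence, Tuple

# Hypothetical thresholds for a CPU-utilization KPI, as (lower bound, severity),
# ordered from the highest bound to the lowest.
THRESHOLDS: List[Tuple[float, str]] = [
    (90.0, "critical"),
    (75.0, "high"),
    (50.0, "medium"),
    (0.0, "normal"),
]


def evaluate_kpi(
    samples: Sequence[float],
    thresholds: Sequence[Tuple[float, str]] = THRESHOLDS,
) -> Tuple[float, str]:
    """Aggregate raw samples with an average, then map the result to a severity."""
    if not samples:
        raise ValueError("at least one sample is required")
    value = mean(samples)
    for lower_bound, severity in thresholds:
        if value >= lower_bound:
            return value, severity
    return value, "unknown"  # value fell below every configured bound


value, severity = evaluate_kpi([72.0, 81.0, 78.0])
print(value, severity)  # 77.0 high
```

The same pattern applies to any metric: pick an aggregation that suits the data, then decide which ranges of the aggregated value count as healthy or unhealthy.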

Doing pre-work with service decomposition to correctly identify what services are most valuable to the organization is a good first step to identifying appropriate KPIs to map to these services. Please schedule time with your account team to go through the service decomposition workshop. 

Best Practices for Choosing KPIs

[Image: best practices for choosing KPIs]

Good KPIs have the following characteristics:

  • Provide data regularly
  • Self-normalizing data
  • Data with deltas, not counters 
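
The "deltas, not counters" point can be illustrated with a short Python sketch. A counter only ever grows, so its raw value says little about current behavior, while the change between readings does. The readings below are hypothetical, and treating a drop as a counter reset is an assumption of this sketch.

```python
from typing import List, Sequence


def counter_to_deltas(readings: Sequence[int]) -> List[int]:
    """Convert cumulative counter readings into per-interval deltas."""
    deltas = []
    for previous, current in zip(readings, readings[1:]):
        if current >= previous:
            deltas.append(current - previous)
        else:
            # Assumption: a drop means the counter was reset, so count from zero.
            deltas.append(current)
    return deltas


# Hypothetical cumulative request counts; the counter resets after the third reading.
readings = [1000, 1040, 1100, 20, 75]
print(counter_to_deltas(readings))  # [40, 60, 20, 55]
```

Thresholds are much easier to set on the delta series, because its values stay in a comparable range over time.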

You can also use the Splunk App for Content Packs and content packs for preconfigured services and KPIs. Here are some KPIs available in the Microsoft 365 Content Pack:

Availability KPIs

Performance KPIs

Group Administration Activities KPIs

Login Activity KPIs

Office 365 Security & Compliance Center

  • Extended recovery
  • False positive
  • Investigating
  • Restoring service
  • Normal service 
  • Added delegation entry 
  • Added service principal 
  • Set company information
  • Set password policy 
  • Added group
  • Added member to group
  • Deleted group
  • Updated group 
  • Authentication methods
  • Distinct user sign-ins
  • Logins by region
  • Logon errors 
  • User agents
  • User types 
  • Mail flow
  • Elevation of exchange admin privilege
  • Unusual external user file activity
  • Potentially malicious URL click was detected

After you have your services, entity rules, KPIs, and service dependencies planned out, you can finally create services in ITSI! There are three ways to do so:

For more information regarding creating services, see the Service Insights Manual.

Get Started with Service Insights

Service Insights within Splunk ITSI consists of various dashboard views, alerts and metrics so that you can effectively monitor and map services within your organization. Here are some ways to get better acquainted with the various available features and views.

Tasks to tackle

Source: BSI workshop

Step 3: Event Analytics

Event Analytics in Splunk ITSI is where you can streamline your incident management workflows, from alert management to incident response triggers. 

Get Started with Event Analytics

Tasks to tackle

Ingest events through correlation searches.

The data itself comes from Splunk indexes, but ITSI only focuses on a subset of all Splunk Enterprise data. This subset is generated by correlation searches. A correlation search is a specific type of saved search that generates notable events from the search results.

For more information, see Overview of correlation searches in ITSI.

Configure aggregation policies to group events into episodes. 

Once notable events start coming in, they need to be organized so you can start gaining value from them. Configure an aggregation policy to define which notable events are related to each other and group them into episodes. An episode contains a chronological sequence of events that tells the story of a problem or issue. In the backend, a component called the Rules Engine executes the aggregation policies you configure. 
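
Conceptually, an aggregation policy decides which events belong together and keeps each group in time order. The Python sketch below illustrates that idea only. It is not the Rules Engine, and the `NotableEvent` fields and the choice to split by `host` are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class NotableEvent:
    timestamp: int  # epoch seconds
    host: str
    message: str


def group_into_episodes(
    events: Iterable[NotableEvent], split_by: str = "host"
) -> Dict[str, List[NotableEvent]]:
    """Group events that share the `split_by` field, each group in chronological order."""
    episodes: Dict[str, List[NotableEvent]] = defaultdict(list)
    for event in sorted(events, key=lambda e: e.timestamp):
        episodes[getattr(event, split_by)].append(event)
    return dict(episodes)


# Hypothetical notable events arriving out of order.
events = [
    NotableEvent(300, "web-01", "HTTP 500 rate high"),
    NotableEvent(100, "web-01", "CPU above 90%"),
    NotableEvent(200, "db-01", "Replication lag"),
]
episodes = group_into_episodes(events)
for name, episode in episodes.items():
    print(name, [e.message for e in episode])
```

Reading the `web-01` group in order shows the CPU spike preceding the error rate, which is the kind of story an episode is meant to tell.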

For more information, see Overview of aggregation policies in ITSI.

Set up automated actions to take on episodes. 

You can run actions on episodes either automatically using aggregation policies or manually in Episode Review. Some actions, like sending an email or pinging a host, are shipped with ITSI. You can also create tickets in external ticketing systems like ServiceNow, Remedy, or VictorOps. Finally, actions can also be modular alerts that are shipped with Splunk add-ons or apps, or custom actions that you configure. 

For more information, see Configure episode action rules in ITSI.

To learn more about event analytics, see the documentation and Event Analytics section (step 7 / 8) on the Splunk ITSI interactive demo.

Event Analytics Best Practices for Third-Party Data Sources

To avoid duplicate events, use the same frequency and time range in correlation searches.

When configuring a correlation search, consider using the same value for the search frequency and time range to avoid duplicate events. For example, a search might run every five minutes and also look back five minutes.

If there's latency in your data and you need to look for events you might have missed, consider expanding the time range. For example, the search could run every minute but look back 5 minutes.
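
The trade-off between these two setups can be checked with a little arithmetic. The Python sketch below counts how many search runs have a window that contains a single event. It assumes each run at minute `t` covers the half-open window `(t - lookback, t]`, and the event time and horizon are hypothetical.

```python
from typing import List


def runs_seeing_event(
    event_time: int, frequency: int, lookback: int, horizon: int
) -> List[int]:
    """Return the run times (in minutes) whose window (run - lookback, run] contains the event."""
    return [
        run
        for run in range(frequency, horizon + 1, frequency)
        if run - lookback < event_time <= run
    ]


# An event at minute 7, observed over a 20-minute horizon.
matched_equal = runs_seeing_event(event_time=7, frequency=5, lookback=5, horizon=20)
matched_wide = runs_seeing_event(event_time=7, frequency=1, lookback=5, horizon=20)
print(matched_equal)  # [10]
print(matched_wide)   # [7, 8, 9, 10, 11]
```

With equal frequency and time range the windows do not overlap, so the event is matched once. With a one-minute frequency and a five-minute look-back it is matched by five runs, which tolerates late-arriving data but means duplicates need to be handled.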

To reduce load on your system, don't use a time range greater than 5 minutes.

Exceeding a calculation window of 5 minutes can put a lot of load on your system, especially if you have a lot of events coming in. If you want to avoid putting extra load on your system, consider reducing the time range to 5 minutes or less. 

One exception is if your data is coming in more sporadically. For example, if your data comes in every 15 minutes, consider using a 15-minute time range.

Normalize all the important fields in your third-party events.

When you're creating correlation searches, don't only normalize on obvious fields that exist in a lot of data sources, like host, severity, event type, message, and so on. It's also important to normalize fields that you know are important in your events. For example, when you're looking at Windows event logs, what do you look at to know if something is good or bad? Normalize those fields as well and use them to build out a common information model.

Perform this normalization process for every data source you have so you can easily identify important fields when creating aggregation policies.
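
As an illustration of field normalization, the Python sketch below renames source-specific fields to a shared set of names. In ITSI this work would be done in the correlation search itself. The raw field names shown for each source are hypothetical and do not reflect the actual SCOM or SolarWinds schemas.

```python
from typing import Any, Dict

# Hypothetical per-source mappings from raw field names to common field names.
FIELD_MAPS: Dict[str, Dict[str, str]] = {
    "scom": {"ComputerName": "host", "AlertSeverity": "severity", "AlertDescription": "message"},
    "solarwinds": {"NodeName": "host", "Level": "severity", "Details": "message"},
}


def normalize_event(source: str, raw: Dict[str, Any]) -> Dict[str, Any]:
    """Rename known raw fields to common names; keep unknown fields unchanged."""
    mapping = FIELD_MAPS[source]
    normalized: Dict[str, Any] = {"source": source}
    for raw_name, value in raw.items():
        normalized[mapping.get(raw_name, raw_name)] = value
    return normalized


scom_event = normalize_event(
    "scom", {"ComputerName": "web-01", "AlertSeverity": "Critical", "AlertDescription": "Disk full"}
)
solarwinds_event = normalize_event(
    "solarwinds", {"NodeName": "web-01", "Level": "Critical", "Details": "Disk full", "Vendor": "ACME"}
)
print(scom_event)
print(solarwinds_event)
```

Once both sources expose `host`, `severity`, and `message`, an aggregation policy can group on those fields without caring where the event came from.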

Create one correlation search per data source.

For every third-party data source you're bringing into ITSI, create a single correlation search to normalize those fields and generate notable events. For example, one for SCOM, one for SolarWinds, and so on.

Don't create too many aggregation policies.

Limit the number of aggregation policies you enable in your environment. Too many aggregation policies create too many groups, which produces an overly granular view of your IT environment. By limiting the number of policies, you create more end-to-end visibility and avoid creating silos of collaboration between groups in your organization. Make sure to group events according to how those events are related, not based on how people work to resolve those issues.

Only select 5-10 fields for Smart Mode analysis.

Additional Resources