Working with multivalue fields
When working with data in the Splunk platform, each event field typically has a single value. However, for events such as email logs, you can find multiple values in the “To” and “Cc” fields. Multivalue fields can also result from data augmentation using lookups.
If you ignore multivalue fields in your data, your results can be incomplete or inaccurate; for example, a report might show only the first value of a multivalue field. To evaluate and modify multivalue fields properly, the Splunk platform provides multivalue search commands and functions. Multivalue functions can be used with the eval, where, or fieldformat search commands. This article shows you how to use them.
In the examples used in this article, the makeresults command (available in Splunk Enterprise and Splunk Cloud Platform) generates hypothetical data for the searches, so anyone can recreate them without onboarding data. The makeresults command generates the default field _time, which has been deliberately excluded from the examples.
Scenario
Within one purchase transaction, Mary bought eggs, milk and bread. She paid for the eggs with cash and covered the remaining items using her credit card. This purchase transaction is equivalent to a log event. The values for each multivalue field are separated by the comma delimiter.
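A minimal sketch of how the scenario could be generated with makeresults. The field names groceries and payment come from this article; the exact payment strings ("cash" and "credit card") are assumptions chosen for illustration.

```spl
| makeresults
| eval groceries="eggs,milk,bread", payment="cash,credit card,credit card"
| fields - _time
```

At this stage each field holds one comma-separated string, not three separate values.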
Makemv command
The makemv command splits a field that looks like a single value into multiple values within the same event, based on a delimiter. A delimiter is the character or string that marks the boundary between values.
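As an illustrative sketch, assuming the scenario data is generated with makeresults and that the payment strings are "cash" and "credit card", the following search splits only the groceries field on the comma delimiter:

```spl
| makeresults
| eval groceries="eggs,milk,bread", payment="cash,credit card,credit card"
| makemv delim="," groceries
| fields - _time
```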
The values in the groceries field have been split within the same event based on the comma delimiter. The values in the payment field remain the same. The report shows the payment methods for all three grocery items, but it does not show which payment method was used for each item. To expand the event into three separate events, one for each item, and show the exact payment method for each grocery item, you need a combination of commands and functions.
Learn more about using the makemv command in the Splunk Enterprise or Splunk Cloud Platform documentation.
Mvzip function
The mvzip function ties together corresponding values from different fields of an event, which preserves the association among the field values. The function takes two multivalue fields, X and Y, and combines them by stitching the first value of X to the first value of Y, then the second value of X to the second value of Y, and so on.
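A possible way to zip the scenario fields, sketched under the same assumptions as before (sample data from makeresults, assumed payment strings). Both fields are made multivalue first, and mvzip joins each pair with its default delimiter, a comma:

```spl
| makeresults
| eval groceries="eggs,milk,bread", payment="cash,credit card,credit card"
| makemv delim="," groceries
| makemv delim="," payment
| eval zipped=mvzip(groceries, payment)
| fields - _time
```

With this sample data, the zipped field should hold three values: "eggs,cash", "milk,credit card", and "bread,credit card".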
The new field, “zipped”, is the result of the mvzip function. The values of the groceries and payment fields are now paired in the correct order, ready to be expanded into separate events. At this point, the results are still within one event.
Learn more about using the mvzip function in the Splunk Enterprise or Splunk Cloud Platform documentation.
Mvexpand command
The mvexpand command expands the values of a multivalue field into separate events, one event for each value in the multivalue field. All single-value fields and any multivalue fields that are not expanded keep the same values in each new event.
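Continuing the sketch with the same assumed sample data, expanding the zipped field could look like this:

```spl
| makeresults
| eval groceries="eggs,milk,bread", payment="cash,credit card,credit card"
| makemv delim="," groceries
| makemv delim="," payment
| eval zipped=mvzip(groceries, payment)
| mvexpand zipped
| fields - _time
```

This should return three events, each with a single zipped value, while the groceries and payment fields still carry all three of their values in every event.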
The mvexpand command works well at splitting the values of a multivalue field into multiple events while keeping the other field values in the event as they are, but it only works on one multivalue field at a time. For instance, in the example above, mvexpand cannot be used to split both the “groceries” and “payment” fields at the same time. Expanding the “zipped” field and then separating its parts with the mvindex function accomplishes this.
Learn more about using the mvexpand command in the Splunk Enterprise or Splunk Cloud Platform documentation.
Mvindex function
Having zipped the values into one field, “zipped”, you can now expand that field into multiple events. The mvindex function is a little more intricate. It returns the value at a given index position in a multivalue field, so you can separate each zipped pair back into two chosen fields and keep the associations accurate as the values are expanded into separate events. The following are possible index values, using values=a,e,i,o,u:
- Indexes start at 0 when counting from the first value. For example, a=0 e=1 i=2 o=3 u=4.
- Negative indexes count backward from the last value, which is -1. For example, a=-5 e=-4 i=-3 o=-2 u=-1.
- You can combine both index patterns. For example, a=0 e=1 i=2 o=-2 u=-1.
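The index patterns above can be tried with a short sketch. The field names vowels, first, last, and middle are hypothetical names chosen for illustration:

```spl
| makeresults
| eval vowels=split("a,e,i,o,u", ",")
| eval first=mvindex(vowels, 0), last=mvindex(vowels, -1), middle=mvindex(vowels, 1, -2)
| fields - _time
```

Here first should be a, last should be u, and middle should be the range e, i, o, because the optional third argument of mvindex is an end index and can be negative.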
The mvindex function selects index 0, the first value in each zipped pair, which represents the grocery item, and index 1, the second value, which represents the payment method, so that the values do not get mixed up when the fields are split. The split function separates the values on the comma delimiter. Using the mvindex and split functions, the values are now separated into one value per event, and the values correspond correctly.
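Putting the pieces together, one way the full search might look, again assuming sample data from makeresults and the assumed payment strings:

```spl
| makeresults
| eval groceries="eggs,milk,bread", payment="cash,credit card,credit card"
| makemv delim="," groceries
| makemv delim="," payment
| eval zipped=mvzip(groceries, payment)
| mvexpand zipped
| eval groceries=mvindex(split(zipped, ","), 0), payment=mvindex(split(zipped, ","), 1)
| fields - _time zipped
```

This sketch relies on the values themselves not containing commas. If they might, a different delimiter can be passed as the third argument to mvzip and used again in split.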
The stats command can also be used in place of mvexpand to split the field into separate rows, because grouping by a multivalue field returns one result row for each distinct value.
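A hedged sketch of the stats variant, using the same assumed sample data. The count field is only a by-product of the grouping and is dropped at the end:

```spl
| makeresults
| eval groceries="eggs,milk,bread", payment="cash,credit card,credit card"
| makemv delim="," groceries
| makemv delim="," payment
| eval zipped=mvzip(groceries, payment)
| stats count by zipped
| eval groceries=mvindex(split(zipped, ","), 0), payment=mvindex(split(zipped, ","), 1)
| fields groceries payment
```

Because stats groups by value, identical pairs collapse into a single row, so this approach suits data where each zipped pair is distinct.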
Learn more about using the mvindex function in the Splunk Enterprise or Splunk Cloud Platform documentation.
Mvcount function
The mvcount function quickly determines the number of values in a multivalue field. If the field contains a single value, the function returns 1, and if the field has no values, the function returns NULL.
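For illustration, counting the grocery items in the scenario might look like this. The field name item_count is a hypothetical name, not one from the original article:

```spl
| makeresults
| eval groceries="eggs,milk,bread"
| makemv delim="," groceries
| eval item_count=mvcount(groceries)
| fields - _time
```

With this sample data, item_count should be 3.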
As with single-value fields, keep in mind that you might need a combination of multivalue commands and functions to get your report into the format that meets your specific use case.
Learn more about using the mvcount function in the Splunk Enterprise or Splunk Cloud Platform documentation.
Next steps
If there are situations in your data where a field is sometimes multivalue and other times null, see mvexpand multiple multi-value fields that may be null.