Splunk Lantern

Spike in downloaded documents per user on Salesforce cloud

You might need to look for a sudden, high-volume increase in downloaded documents when doing the following:

Prerequisites 

In order to execute this procedure in your environment, the following data, services, or apps are required:

Example

A sudden, high-volume increase in downloaded documents can indicate unauthorized, non-compliant, or potentially malicious behavior. Because so many people in your organization have access to Salesforce, you should monitor for this activity regularly.

To optimize the search shown below, you should specify an index and a time range.

  1. Populate the lookup_sfdc_usernames lookup provided by the Salesforce Add-on with live values from your site.
  2. Run the following search:
EVENT_TYPE=DocumentAttachmentDownloads
|lookup lookup_sfdc_usernames USER_ID
|bucket _time span=1d 
|stats count BY Username _time
|eventstats max(_time) AS maxtime
|stats count AS num_data_samples max(eval(if(_time >= relative_time(maxtime, "-1d@d"), 'count',null))) AS count avg(eval(if(_time<relative_time(maxtime,"-1d@d"),'count',null))) AS avg stdev(eval(if(_time<relative_time(maxtime,"-1d@d"),'count',null))) AS stdev BY Username
|eval lowerBound=(avg-stdev*2), upperBound=(avg+stdev*2)
|where 'count' > upperBound AND num_data_samples >=7
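
As a hedged illustration of the index and time range advice, the first line of the search could be scoped as follows. The index name sfdc is an assumption, so substitute the index where your Salesforce event data is stored. The 30-day window is also only an example, chosen to leave enough full days of history to satisfy the seven-sample requirement at the end of the search.

```spl
index=sfdc earliest=-30d@d latest=@d EVENT_TYPE=DocumentAttachmentDownloads
```

The remaining commands stay unchanged. Snapping both boundaries to midnight with @d means the search only counts complete days, so a partially elapsed day is not compared against full-day history.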

Search explanation

The table provides an explanation of what each part of this search achieves. You can adjust this query based on the specifics of your environment.

Splunk Search Explanation

EVENT_TYPE=DocumentAttachmentDownloads

Filter for the DocumentAttachmentDownloads EVENT_TYPE. 

|lookup lookup_sfdc_usernames USER_ID

Convert the SFDC USER_ID into a friendly username via a lookup. 

|bucket _time span=1d 

Round each event's _time down to the start of its day, so that all events from the same day share one _time value and can be grouped together.

|stats count BY Username _time

Count and aggregate per user, per day. 

|stats count AS num_data_samples max(eval(if(_time >= relative_time(maxtime, "-1d@d"), 'count',null))) AS count avg(eval(if(_time<relative_time(maxtime,"-1d@d"),'count',null))) AS avg stdev(eval(if(_time<relative_time(maxtime,"-1d@d"),'count',null))) AS stdev BY Username

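Calculate, for each user, the number of daily samples, the most recent value, and the baseline mean and standard deviation. Rows at or after relative_time(maxtime, "-1d@d") supply the most recent value, and earlier rows supply the baseline. The maxtime field must hold the latest _time in the results, which you can set by placing |eventstats max(_time) AS maxtime immediately before this command.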

|eval lowerBound=(avg-stdev*2), upperBound=(avg+stdev*2)

Calculate the lower and upper bounds as the mean minus and plus two standard deviations. Only upperBound is used by the next command.

|where 'count' > upperBound AND num_data_samples >=7

Return only the users whose most recent daily count exceeds the calculated upperBound and who have seven or more daily data samples.

Result

While there are no traditional false positives in this search, there will be a lot of noise. Every time this search runs, it will accurately measure a spike in the number of documents downloaded, whether or not that spike turns out to be a concern.

How you handle these alerts depends on the multiplier you apply to the standard deviation when calculating upperBound (2 in the search above). If you set a low multiplier (2 or 3), you are likely to get a lot of events that are useful only for contextual information. If you set a high multiplier (6 or 10), the amount of noise can be reduced enough to send an alert directly to analysts.
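
One way to make that threshold easier to tune is to hold the multiplier in its own field, so that a single value changes both bounds. This is only a sketch: the field name multiplier is hypothetical and not part of the original search, and 6 is just one of the higher values mentioned above. These lines would replace the final eval and where commands.

```spl
|eval multiplier=6
|eval lowerBound=(avg-stdev*multiplier), upperBound=(avg+stdev*multiplier)
|where 'count' > upperBound AND num_data_samples >=7
```

With a multiplier of 6, a user whose baseline averages 10 downloads a day with a standard deviation of 3 is returned only when the most recent count exceeds 28, compared with 16 at a multiplier of 2.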

For most environments, these searches can be run once a day, often overnight, without a lag. If you want to run this search more frequently, or if this search is too slow for your environment, use a summary index that first aggregates the data. 
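
A minimal sketch of the summary index approach follows, under several assumptions: the index names sfdc and sfdc_summary are hypothetical, the summary index must already exist, and the first search is assumed to be scheduled once a day after midnight. The scheduled search writes one row per user for the previous day:

```spl
index=sfdc earliest=-1d@d latest=@d EVENT_TYPE=DocumentAttachmentDownloads
|lookup lookup_sfdc_usernames USER_ID
|bucket _time span=1d
|stats count BY Username _time
|collect index=sfdc_summary
```

The detection search can then read the much smaller pre-aggregated data instead of the raw events:

```spl
index=sfdc_summary earliest=-30d@d latest=@d
|stats sum(count) AS count BY Username _time
```

From there, the search continues with the same baseline, bounds, and where commands that follow the per-day stats command in the original search. Using sum(count) rather than count keeps the totals correct even if more than one summary row exists for a user and day.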

When this search returns values, initiate your incident response process and identify the user demonstrating this behavior. Capture the time of the event, the user's role, and the number of documents downloaded. If possible, determine the system used to download this data and its location. Contact the user and their manager to determine if the download was authorized, and then document that it was authorized and by whom. If you cannot find authorization, the user's credentials may have been used by another party and additional investigation is warranted.
