Spikes in printer activity in a Windows environment
A user who suddenly prints far more pages than normal from networked printers could be a sign of data exfiltration, with sensitive data leaving your corporation on paper. When a spike is detected, you want information about the print jobs, including the destination printers, the source of the print jobs, the names of the files printed, and even whether the output was black-and-white or color.
Data required
Microsoft: Windows event logs
Procedure
- Verify that you deployed the Splunk Add-on for Microsoft Windows to the search heads and Splunk Universal Forwarders on the monitored systems. For more information, see About installing Splunk add-ons.
- Enable the WinPrintMon://job input.
- Run the following search. You can optimize it by specifying an index and adjusting the time range.
sourcetype=WinPrintMon |bucket _time span=1d |stats sum(page_printed) AS Pages BY user _time |eventstats max(_time) AS maxtime |stats count AS num_data_samples max(eval(if(_time >= relative_time(maxtime, "-1d@d"), 'Pages',null))) AS "Pages" avg(eval(if(_time<relative_time(maxtime,"-1d@d"),'Pages',null))) AS avg stdev(eval(if(_time<relative_time(maxtime,"-1d@d"),'Pages',null))) AS stdev BY user |eval lowerBound=(avg-stdev*1), upperBound=(avg+stdev*1) |eval isOutlier=if(('Pages' < lowerBound OR 'Pages' > upperBound) AND num_data_samples >=7, 1, 0) |table user num_data_samples Pages avg lowerBound upperBound isOutlier
Search explanation
The following table explains what each part of this search does. You can adjust this query based on the specifics of your environment.
Splunk Search | Explanation |
---|---|
sourcetype=WinPrintMon | Search only printer data. |
\|bucket _time span=1d | Group data into 1-day bins. |
\|stats sum(page_printed) AS Pages BY user _time | Calculate the total number of pages printed by each user per day. |
\|eventstats max(_time) AS maxtime | Add the maximum time value to every event, to be used as the latest day from which to look back one day. |
\|stats count AS num_data_samples max(eval(if(_time >= relative_time(maxtime, "-1d@d"), 'Pages',null))) AS "Pages" avg(eval(if(_time<relative_time(maxtime,"-1d@d"),'Pages',null))) AS avg stdev(eval(if(_time<relative_time(maxtime,"-1d@d"),'Pages',null))) AS stdev BY user | Count the number of daily samples for each user and calculate the actual number of pages printed in the most recent day or current day. Then calculate the average number of pages printed in the days before the most recent day, and the standard deviation over those same days. |
\|eval lowerBound=(avg-stdev*1), upperBound=(avg+stdev*1) | Set the lower and upper bounds to the average minus and plus one standard deviation. |
\|eval isOutlier=if(('Pages' < lowerBound OR 'Pages' > upperBound) AND num_data_samples >=7, 1, 0) | Determine outliers (and therefore a spike or a drop in pages printed) by evaluating whether the most recent number of pages falls outside the range of the average plus or minus one standard deviation. Flag a value as an outlier only if at least 7 days were sampled. |
\|table user num_data_samples Pages avg lowerBound upperBound isOutlier | Display the results in a table with columns in the order shown. |
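The baseline-and-bounds logic described in the table can be easier to follow outside of SPL. The following Python sketch is illustrative only and is not part of the Splunk search. The function name `evaluate_user` and its parameters are hypothetical, and it simplifies the search by treating the last value in the list as the most recent day and every earlier value as the baseline. Like the SPL `stdev` function, Python's `statistics.stdev` computes a sample standard deviation.

```python
from statistics import mean, stdev


def evaluate_user(daily_pages, min_samples=7, num_stdevs=1):
    """Evaluate one user's daily page totals, ordered oldest first.

    Hypothetical helper: the last element stands in for the most recent
    day and all earlier elements form the baseline.
    """
    num_data_samples = len(daily_pages)
    if num_data_samples < 3:
        # A sample standard deviation needs at least two baseline days.
        return {"num_data_samples": num_data_samples, "isOutlier": 0}

    *baseline, pages = daily_pages
    avg = mean(baseline)
    sd = stdev(baseline)  # sample standard deviation of the baseline days

    lower_bound = avg - sd * num_stdevs
    upper_bound = avg + sd * num_stdevs

    # Outlier: outside the bounds AND enough days of history.
    outside = pages < lower_bound or pages > upper_bound
    is_outlier = 1 if outside and num_data_samples >= min_samples else 0

    return {
        "num_data_samples": num_data_samples,
        "Pages": pages,
        "avg": avg,
        "lowerBound": lower_bound,
        "upperBound": upper_bound,
        "isOutlier": is_outlier,
    }


# Made-up example values: seven quiet days followed by a spike.
result = evaluate_user([10, 12, 11, 9, 10, 12, 11, 500])
print(result["isOutlier"])
```

Raising `num_stdevs` in this sketch corresponds to changing the `*1` multiplier in the `eval lowerBound` step, which widens the bounds and flags fewer users.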
Next steps
The search shows some useful techniques for comparing current activity to a historical baseline and could be a framework for other basic predictive searches. It might be interesting to correlate this behavior to a watchlist that contains the user IDs of personnel who are considered higher risk: contractors, new employees, employees who never go on vacation, and employees with access to particularly sensitive data. In the sample data below, the large number of pages printed by Chuck could be indicative of risky user behavior.
user | num_data_samples | Pages | avg | lowerBound | upperBound | isOutlier |
---|---|---|---|---|---|---|
alice | 26 | 12 | 62.625 | -1.53228804 | 126.782288 | No |
chuck | 22 | 4983 | 16.7 | 0.393398173 | 33.00660183 | Yes |
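The watchlist correlation suggested above amounts to keeping only the outlier rows whose user also appears on a list of higher-risk IDs. In Splunk this would more likely be done with a lookup. The Python sketch below only illustrates the matching logic, and the function name and the watchlist contents are hypothetical.

```python
def flag_watchlisted_outliers(rows, watchlist):
    """Return the outlier rows whose user appears on the watchlist."""
    # Compare user IDs case-insensitively.
    watch = {user.lower() for user in watchlist}
    return [
        row for row in rows
        if row["isOutlier"] and row["user"].lower() in watch
    ]


# Rows taken from the sample data above (only the fields needed here).
rows = [
    {"user": "alice", "Pages": 12, "isOutlier": False},
    {"user": "chuck", "Pages": 4983, "isOutlier": True},
]

# Hypothetical watchlist, for illustration only.
watchlist = {"chuck", "dave"}

for row in flag_watchlisted_outliers(rows, watchlist):
    print(row["user"], row["Pages"])
```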
Finally, you might want to look at additional searches in Managing printers in a Windows environment.