Splunk Lantern

Investigating unusual file system queries


Your organization uses NetFlow data, which you are ingesting into Splunk. These data show a handful of machines reaching out via HDFS (Hadoop Distributed File System) to your Hadoop cluster, randomly querying for specific files that don't exist.

Data required

Apache: Hadoop

To optimize the search shown below, specify an index and a time range. In addition, this sample search uses the Hadoop Distributed File System source type, which you can replace with any other server log data used in your organization.


A Splunk customer ran the following search:

protocol=hdfs response=404
| rare filename
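
As a rough sketch of the optimization advice above, the same idea can be scoped to an index, a source type, and a time range. The names `your_hadoop_index` and `your_hdfs_sourcetype` are placeholders rather than values from the customer's environment, and the 24-hour window and limit of 20 are arbitrary assumptions; substitute whatever fits your deployment.

```
index=your_hadoop_index sourcetype=your_hdfs_sourcetype earliest=-24h latest=now protocol=hdfs response=404
| rare limit=20 filename
```

The `rare` command returns the least common values of `filename`, so requests for files that almost nobody asks for (and that do not exist) appear at the top of the results.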

Next steps

After running this search, the customer found that the machines in question had malware on them and were searching for sensitive documents with names like salary.xls and personal.doc. The customer's endpoint detection and response solution did not detect the malware. Running this search improved their mean time to respond.
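
To tie the rarely requested filenames back to the machines asking for them, one possible follow-up is to group the failed requests by source. This is only a sketch: it assumes the events carry a source address in a field named `src_ip`, which may be named differently in your data, and it reuses placeholder index and source type names.

```
index=your_hadoop_index sourcetype=your_hdfs_sourcetype earliest=-24h protocol=hdfs response=404
| stats count AS failed_requests dc(filename) AS distinct_missing_files values(filename) AS filenames BY src_ip
| sort - distinct_missing_files
```

Machines that request many distinct nonexistent files rise to the top of the results and are reasonable candidates for closer investigation.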

Splunk OnDemand Services: Use these credit-based services for direct access to Splunk technical consultants, who offer a variety of technical services from a pre-defined catalog. Most customers have OnDemand Services per their license support plan. Engage the ODS team if you require assistance.