Using Splunk as a data store for developers
A great use of the Splunk platform for software developers is to put application data into the platform and then search the indexed data. This avoids sequential searches through unstructured text. However, many NoSQL data stores are available. So what is the benefit of using the Splunk platform as a data store?
Solution
This use of the Splunk platform provides simplicity, performance, and scale.
- Simplicity. You can put any type of time-series text into the Splunk platform without having to worry about its format.
- Performance. The Splunk platform offers universal forwarders to send data from remote locations, whether the data comes from a file, a network port, or the output of a script or API call (known to Splunk users as a scripted input). The platform then applies universal indexing: every term separated by punctuation in the event stream is indexed, so a search for any term is an index lookup rather than a scan of the raw text. In addition, the Bloom filters used in the Splunk platform make searches faster than indexing alone, especially for very rare terms, because a search can rule out data that cannot contain the term.
- Scale. The Splunk platform uses MapReduce to spread work horizontally across the hosts that index the data. Users of the Splunk platform do not have to write or think about MapReduce because it happens implicitly.
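To make the performance point concrete, here is a minimal Python sketch of the two ideas named above: indexing every punctuation-separated term, and using a Bloom filter to skip data that cannot contain a search term. It is an illustration of the general technique only, not the Splunk platform's implementation. The `BloomFilter`, `Bucket`, `tokenize`, and `search` names, the filter size, and the splitting rule are all assumptions made for this example.

```python
import hashlib
import re


class BloomFilter:
    """Illustrative Bloom filter: may report false positives, never false negatives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, term):
        # Derive num_hashes bit positions by salting one hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{term}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, term):
        for pos in self._positions(term):
            self.bits |= 1 << pos

    def might_contain(self, term):
        return all((self.bits >> pos) & 1 for pos in self._positions(term))


def tokenize(event):
    """Split on anything that is not a letter, digit, or underscore (a simplification)."""
    return {t.lower() for t in re.split(r"[^A-Za-z0-9_]+", event) if t}


class Bucket:
    """Hypothetical unit of indexed data: an inverted index plus a Bloom filter."""

    def __init__(self, name, events):
        self.name = name
        self.events = events
        self.index = {}  # term -> positions of the events that contain it
        self.bloom = BloomFilter()
        for i, event in enumerate(events):
            for term in tokenize(event):
                self.index.setdefault(term, []).append(i)
                self.bloom.add(term)


def search(buckets, term):
    """Return matching events and the names of the buckets that were actually opened."""
    term = term.lower()
    hits, scanned = [], []
    for bucket in buckets:
        if not bucket.bloom.might_contain(term):
            continue  # the term is definitely absent, so skip this bucket
        scanned.append(bucket.name)
        hits.extend(bucket.events[i] for i in bucket.index.get(term, []))
    return hits, scanned
```

Because a Bloom filter never produces a false negative, skipping a bucket is always safe. A false positive only costs one extra index lookup that returns nothing, which is why the benefit is largest for rare terms that appear in few buckets.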
Getting data in is one thing, but getting it out is quite another. "Google-like" searches that combine terms with AND (implicit), OR, and NOT make retrieving events feel natural. However, the real power of the Splunk platform is the included Search Processing Language (SPL), which aids productivity and analysis. If you combine universal indexing, a scalable engine to do the work, and a comprehensive set of commands that makes you productive quickly, you'll see why it's a great idea to use the Splunk platform as a developer data store to manage all the types of input shown in the following image.
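As an illustration of how these search strings are put together, the following Python sketch composes one from its parts: terms that must all match (implicit AND), alternatives (OR), exclusions (NOT), and SPL commands chained with pipes. The `build_search` helper is hypothetical and written only for this example. It does no quoting, so it assumes each term is a single word.

```python
def build_search(all_of=(), any_of=(), none_of=(), pipeline=()):
    """Compose a search string from required, alternative, and excluded terms."""
    parts = list(all_of)  # terms separated by spaces are ANDed implicitly
    if any_of:
        parts.append("(" + " OR ".join(any_of) + ")")
    parts.extend(f"NOT {term}" for term in none_of)
    query = " ".join(parts)
    for command in pipeline:
        query += f" | {command}"  # each pipe hands results to the next SPL command
    return query


query = build_search(
    all_of=["error", "payment"],
    any_of=["timeout", "refused"],
    none_of=["debug"],
    pipeline=["stats count by host"],
)
print(query)
```

The resulting string retrieves events that contain both `error` and `payment`, contain either `timeout` or `refused`, and do not contain `debug`, and then counts the matching events per host.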
Steps to get started
- Download the Splunk platform and install it. You can start with the free version.
- (Optional) If you plan to send data from remote locations, download and install universal forwarders on those hosts.
- Use the web interface to get data in and to test out some SPL. If this is your first time using the Splunk platform, try the Search Tutorial.
- Use one of the open source software development kits (SDKs) to interact with the Splunk platform using Java, Python, or JavaScript. Each SDK follows this pattern to retrieve data:
  - Connect to the Splunk platform.
  - Authenticate. With some languages, authentication can be implicit through configuration files.
  - Request a search job to run a search. The search is the same kind of search string you ran from the web interface.
  - Iterate over the results to achieve your goal. Matching events can be returned as raw text, JSON, XML, or CSV.
  - Disconnect, if needed.
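The pattern above can be sketched in Python as follows, assuming the Splunk Enterprise SDK for Python (`splunklib`) is installed and a Splunk instance is listening on the default management port 8089. The host, port, and credentials are placeholders, and `as_search_command`, `collect_events`, and `run_search` are hypothetical helpers written for this example. The sketch uses a blocking "oneshot" search for brevity, which is only one of several ways to run a search job.

```python
def as_search_command(query):
    """Prefix a plain query with the `search` command if it does not already start with one."""
    query = query.strip()
    if query.startswith("|") or query.startswith("search "):
        return query
    return f"search {query}"


def collect_events(reader):
    """Keep result rows (dicts) and skip any diagnostic messages in the stream."""
    return [item for item in reader if isinstance(item, dict)]


def run_search(query, host="localhost", port=8089,
               username="admin", password="changeme"):
    """Connect, authenticate, search, iterate, and disconnect."""
    # Imported here so the helpers above work without the SDK installed.
    import splunklib.client as client
    import splunklib.results as results

    # Connect and authenticate (placeholder credentials).
    service = client.connect(host=host, port=port,
                             username=username, password=password)
    try:
        # Run the same search string you would type into the web interface.
        stream = service.jobs.oneshot(as_search_command(query), output_mode="json")
        # Iterate over the results, here requested as JSON.
        return collect_events(results.JSONResultsReader(stream))
    finally:
        # Disconnect.
        service.logout()
```

A call such as `run_search("error payment | head 5")` would return a list of dictionaries, one per matching event. For long-running searches, creating a regular search job and polling it is usually a better fit than a oneshot search.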
Next steps
Software developers should consider the Splunk platform as a data store for four reasons: time-series data is easy to store with full fidelity, it is universally indexed, the platform scales to large amounts of data, and a powerful set of search commands is included.
The steps above should get you started, but more documentation is available on the Splunk platform developer website. For certain SDK languages, there might be additional integrations that fit the conventions of the language. For instance, the Java SDK works inside NetBeans and Spring.