Exporting Data From Splunk for purposes of Visualization, Modelling & Forecasting

Knowledge Base & Community Wiki


What Is Splunk – Splunk is heavyweight commercial software that enables you to index, visualise and explore virtually any machine-generated data. It is often used to consume Apache, Varnish and Nginx web server logs, as well as website clicks and any other data that maintains a constant format. You can learn more about Splunk, where to download it and how to install it at the following location – http://community.visualize-it.co/knowledgebase/installing-splunk-on-ubuntu/

Pre-requisites for exporting data from Splunk – The pre-requisites for extracting data from Splunk are as follows –

  • Set up Splunk (Splunk Light, Splunk Enterprise, Splunk Cloud / Hunk, etc.) for your environment
  • Configure Splunk collectors to collect data from your applications, infrastructure, network devices, etc.
  • Collect data for key application and infrastructure workload drivers, e.g. CPU Utilization, Mem Utilization, Orders Placed Per Hour, Messages Transmitted Per Hour, etc., for an extended period of time
  • Confirm that you can see the data from your key applications and infrastructure being collected, indexed and stored by Splunk

To learn more about configuring and setting up Splunk to collect data for key workload drivers please visit – http://community.visualize-it.co/knowledgebase/installing-splunk-on-ubuntu/

Exporting data from Splunk – This section assumes that you have addressed all the pre-requisites as listed in the previous section. If required please go back and read through the pre-requisites section again.

Exporting data from Splunk is relatively easy. To get started, log in to your Splunk installation.

Figure: Splunk Light login screen

 

Once you have logged in to your Splunk installation, which by now holds a treasure trove of data, you need to execute a search using the Splunk Search option.

 

Figure: Splunk dashboard showing the Search tab right at the top

 

Before we execute the search query in Splunk, we visualize the data for the given workload dimension, which in this case is CPU Utilization. Make sure you have relevant data for the given workload driver over the period of time you would like to model.

 

Figure: CPU Utilization workload data presented by Splunk

The query we execute in this case is – host=aragorn AND source="/opt/perfstats/cpu_load.txt" | timechart max(load1) max(load5) max(load15) span=5m

The query asks Splunk for the data obtained from the source “/opt/perfstats/cpu_load.txt” on the host aragorn. The data is rolled up into 5 minute segments (span=5m), and for each segment Splunk returns the maximum value of each of the attributes load1, load5 and load15 found in the source file. Those of you who have worked on Unix systems will recognise these as CPU load values collected at the system level over a period of time.
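To make the roll-up concrete, here is a minimal Python sketch of what a max-per-bucket aggregation such as timechart max(...) span=5m does conceptually. It is not Splunk's implementation; the function name rollup_max and the sample load1 values are hypothetical and only serve to illustrate the idea.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def rollup_max(samples, span_minutes=5):
    """Group (timestamp, value) samples into fixed-width buckets and keep the max of each.

    Assumes span_minutes divides 60 evenly, so buckets align to the hour.
    """
    buckets = defaultdict(list)
    for ts, value in samples:
        # Snap the timestamp down to the start of its bucket.
        floored = ts.replace(
            minute=ts.minute - ts.minute % span_minutes, second=0, microsecond=0
        )
        buckets[floored].append(value)
    return {start: max(values) for start, values in sorted(buckets.items())}


# Hypothetical load1 samples, one per minute (made-up values).
start = datetime(2015, 12, 13, 17, 0)
samples = [
    (start + timedelta(minutes=i), v)
    for i, v in enumerate([0.2, 0.5, 0.3, 0.9, 0.4, 0.1, 0.6, 0.2])
]

rolled = rollup_max(samples)
for bucket_start, peak in rolled.items():
    print(bucket_start.isoformat(), peak)
```

Using the maximum rather than the average keeps short spikes visible after the roll-up, which is usually what you want when modelling peak load.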

The data in the source file “/opt/perfstats/cpu_load.txt” has been collected using the Nagios monitoring plugins. To see how to set up basic system monitors that could feed data into Splunk, please look at the articles in the Statistical Modelling section of the wiki.

 

Figure: Issuing a query in Splunk to obtain data for load1, load5, load15

 

The time range is currently set to “Last 24 Hours”, which you can see at the top right of the screen, adjacent to the search button. Change this to the period for which you would like to extract data from Splunk. In our case we chose to extract data for the above query for a period of 30 days. Splunk will limit the display of data on the screen for various reasons, but do not fret: the data will be available when you export it to your machine.
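As a rough sanity check on the size of an export, you can estimate how many rows to expect for a given period and span. The sketch below is plain arithmetic in Python; the helper name expected_buckets is hypothetical, and Splunk may add a partial bucket at either end of the range, so treat the result as an approximation.

```python
from datetime import timedelta


def expected_buckets(period, span):
    """Number of whole span-sized buckets that fit in the period."""
    return period // span


# 30 days of data at span=5m versus the default 24 hour window.
rows_30d_5m = expected_buckets(timedelta(days=30), timedelta(minutes=5))
rows_24h_5m = expected_buckets(timedelta(hours=24), timedelta(minutes=5))
print(rows_30d_5m, rows_24h_5m)
```

A 30 day export at a 5 minute span is therefore in the region of several thousand rows, which gives you a figure to compare the exported file against.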

Once you’ve run the query and changed the time range, give Splunk time to complete running the query. The Splunk interface is very intuitive and you will see a progress bar that shows when the query has completed execution. Once the execution is complete, you can hit the download button (adjacent to the Fast Mode text) in the right hand corner of the screen to export the data from Splunk. Avoid closing the Splunk window while the query is executing or the file is still downloading, or you’ll end up with missing data in your exports. Splunk will provide the results of the query in CSV (Comma Separated Values) format, which is a format supported by VisualizeIT.
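If you suspect that an export was cut short, one simple check is to look for jumps between consecutive timestamps that are larger than the span used in the query. The Python sketch below assumes missing data shows up as missing rows; the function name find_gaps and the sample timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, span):
    """Return (previous, current) pairs where consecutive timestamps are more than one span apart."""
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > span]


# Hypothetical 5 minute buckets with the 17:10 row missing.
base = datetime(2015, 12, 13, 17, 0)
stamps = [base + timedelta(minutes=m) for m in (0, 5, 15, 20)]

gaps = find_gaps(stamps, timedelta(minutes=5))
for before, after in gaps:
    print("gap between", before.isoformat(), "and", after.isoformat())
```

It is also worth comparing the first and last timestamps in the file with the time range you selected, since a download that stopped early would simply end before the expected date.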

The graph in the image below shows another query, which we ran to obtain Memory Utilization data for a different machine. The command we used to search for the data was – host=gimli AND source="/opt/perfstats/mem_free.txt" | timechart max(Aragorn_Mem_Utilization) span=1m

The query asks Splunk for the data obtained from the source “/opt/perfstats/mem_free.txt”, rolled up into 1 minute segments (span=1m). “Aragorn_Mem_Utilization” is a Splunk entity which has been configured to extract the Memory Used field from the source file “/opt/perfstats/mem_free.txt”. Those of you who have worked on Unix systems will recognise this as the Memory Utilization value collected at the system level over a period of time.

The data in the source file “/opt/perfstats/mem_free.txt” has been collected using the Nagios monitoring plugins. To see how to set up basic system monitors that could feed data into Splunk, please look at the articles in the Statistical Modelling section of the wiki – http://community.visualize-it.co/knowledgebase/collecting-performance-data-from-unix-systems/

 

Figure: Issuing a query in Splunk to download Memory Utilization data

The default time range for this data was initially set to “Last 24 Hours”, shown at the top right of the screen, adjacent to the search button. We changed this to the period for which we wanted to extract data from Splunk, i.e. “All Time”, which in our case was the last 30 days for which we had been collecting data. Splunk will limit the display of data on the screen for various reasons, but do not fret: the data will be available when you export it to your machine.

As before, once you’ve run the query and changed the time range, give Splunk time to complete running the query. A progress bar shows when the query has completed execution. Once the execution is complete, hit the download button (adjacent to the Fast Mode text) in the right hand corner of the screen to export the data from Splunk. Avoid closing the Splunk window while the query is executing or the file is still downloading, or you’ll end up with missing data in your exports. Splunk will provide the results of the query in CSV (Comma Separated Values) format, which is a format supported by VisualizeIT.

 

Figure: Highlighting the export button on Splunk

What does the exported data look like – Splunk’s CSV exports provide data in the following format –

"_time",count
"2015-12-13T17:00:00.000-0600",616
"2015-12-13T18:00:00.000-0600",708
"2015-12-13T19:00:00.000-0600",708
"2015-12-13T20:00:00.000-0600",708
"2015-12-13T21:00:00.000-0600",708
"2015-12-13T22:00:00.000-0600",708
"2015-12-13T23:00:00.000-0600",708
"2015-12-14T00:00:00.000-0600",710
"2015-12-14T01:00:00.000-0600",719
"2015-12-14T02:00:00.000-0600",710

The Splunk CSV format is supported by VisualizeIT and you are now ready to import this data directly into VisualizeIT using the Data Management capability within the Statistical Modelling section.
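If you want to inspect the export yourself before importing it, the file can be read with a few lines of Python. The sketch below is a minimal example that assumes the two-column layout shown above (a quoted "_time" column and a numeric "count" column); the function name parse_export is hypothetical, and the sample rows are taken from the export above.

```python
import csv
import io
from datetime import datetime

# A few rows in the layout of the export shown above.
SAMPLE = '''"_time",count
"2015-12-13T17:00:00.000-0600",616
"2015-12-13T18:00:00.000-0600",708
"2015-12-14T00:00:00.000-0600",710
'''


def parse_export(text):
    """Parse Splunk-style CSV text into a list of (timezone-aware datetime, int) pairs."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        # Timestamps carry milliseconds and a UTC offset such as -0600.
        ts = datetime.strptime(record["_time"], "%Y-%m-%dT%H:%M:%S.%f%z")
        rows.append((ts, int(record["count"])))
    return rows


rows = parse_export(SAMPLE)
for ts, count in rows:
    print(ts.isoformat(), count)
```

For a real export you would pass the contents of the downloaded file instead of SAMPLE, and adjust the column name if your query produces something other than count.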

Conclusion – In this article we’ve looked at a brief introduction to Splunk, the pre-requisites for exporting data from Splunk, the approach to querying and exporting data using Splunk and, finally, the supported CSV format from a Splunk standpoint. Exporting data from Splunk is pretty straightforward; it’s the configuration of Splunk to collect data from your various workload drivers that tends to be the arduous task. Happy hacking!!!

Modelling Solution: VisualizeIT offers access to a bunch of Analytical Models, Statistical Models and Simulation Models for purposes of Visualization, Modelling & Forecasting. Access to all the Analytical (Mathematical) models is free. We recommend you try out the Analytical models at VisualizeIT and drop us a note with your suggestions, input and comments. You can access both the VisualizeIT website and the VisualizeIT modelling solution at VisualizeIT.


Written by Admin, VisualizeIT Administrator & Community Moderator