Logstash, File Input, CSV Filter and Elasticsearch Output


Logstash, File Input Plugin, CSV Filter and Elasticsearch Output Plugin Example will read data from CSV file, Logstash will parse this data and store in Elasticsearch.

Pre-Requisite

  • Logstash 5.X
  • Elasticsearch 5.X

Below Logstash configuration file is considered based data in CSV file.You can modify  this configuration file as per you data in your CSV file.

Logstash File Input CSV Filter Elasticsearch Output

Sample Data 

transactions-sample-data.txt

TRANSACTION_COUNT|TRANSACTION_DATE|TRANSACTION_TYPE|SERVER
18|07/24/2017|New Customer|SVR-1
9|07/25/2017|Online Customer|SVR-2
9|07/26/2017|Agents|SVR-3
12|07/24/2017|In Store|SVR-1
13|07/25/2017|New Customer|SVR-2
18|07/26/2017|Online Customer|SVR-3
21|07/24/2017|Agents|SVR-2
13|07/25/2017|In Store|SVR-3
15|07/26/2017|New Customer|SVR-4

Logstash Configuration File

Create Logstastash configuration file logstash- installation-dir/bin/transaction-test.conf and paste below content.

input {
    file {
        path => "/opt/app/facinissuesonit/transactions-sample-data.txt"
        start_position => beginning
    }
}
filter {
    csv {
        #add mapping columns name correspondily values assigned
        columns => ["TRANSACTION_COUNT","TRANSACTION_DATE","TRANSACTION_TYPE","SERVER"]
        separator => "|"
        remove_field => ["message"]
        }
#Date filter is used to convert date to @Timestamp sho that chart in Kibana will show as per date
    date {
        match => ["TRANSACTION_DATE", "MM/dd/yyyy"]
    }
#Remove first header line to insert in elasticsearch
    if  [TRANSACTION_TYPE] =~ "TRANSACTION_TYPE"
{
drop {}
}
}
output {

	elasticsearch {
   # Create Index based on date
		index => "app-transactions-%{+YYYY.MM.dd}"
    		hosts => ["elasticsearver:9200"]
  		}
#Console Out put
stdout
         {
         codec => rubydebug
        # debug => true
         }
}

Information about configuration file :

File Input Plugin :  will read data from file and because we set as start-position as “Beginning” will always read file form start.

CSV Filter : This filter will read each line message , split based on “|” and map with corresponding column mentioned position and finally will remove this message field because data is parsed now.

Date Filter : This filter will map TRANSACTION_DATE  to @timestamp value for Index for each document and it says to TRANSACTION_DATE  is having pattern as “MM/dd/YYYY” so that when converting to timestamp will follow same.

drop: Drop is for removing header line if field name match with content.

Run Logstash Configuration with below command

 [logstash-installation-dir]/bin/logstash -f transaction-test.conf

For learning validation and start Logstash with other option follow link Logstash Installation, Configuration and Start

Logstash Console Output

If you noticed by using Date filter index @timestamp value is generating based on value of TRANSACTION_DATE and for elasticsearch output configuration for index name app-transactions-%{+YYYY.MM.dd} will create 3 indexes based on @timestamp value as   app-transactions-2017.07.24 , app-transactions-2017.07.25, app-transactions-2017.07.26 for sample data.

{
"path" => "/opt/app/facinissuesonit/transactions-sample-data.txt",
"TRANSACTION_DATE" => "07/24/2017",
"@timestamp" => 2017-07-24T04:00:00.000Z,
"SERVER" => "SVR-1",
"@version" => "1",
"host" => "facingissuesonit.saurabh.com",
"TRANSACTION_TYPE" => "New Customer",
"TRANSACTION_COUNT" => "18"
}
{
"path" => "/opt/app/facinissuesonit/transactions-sample-data.txt",
"TRANSACTION_DATE" => "07/25/2017",
"@timestamp" => 2017-07-25T04:00:00.000Z,
"SERVER" => "SVR-2",
"@version" => "1",
"host" => "facingissuesonit.saurabh.com",
"TRANSACTION_TYPE" => "Online Customer",
"TRANSACTION_COUNT" => "9"
}
{
"path" => "/opt/app/facinissuesonit/transactions-sample-data.txt",
"TRANSACTION_DATE" => "07/26/2017",
"@timestamp" => 2017-07-26T04:00:00.000Z,
"SERVER" => "SVR-3",
"@version" => "1",
"host" => "facingissuesonit.saurabh.com",
"TRANSACTION_TYPE" => "Agents",
"TRANSACTION_COUNT" => "9"
}
{
"path" => "/opt/app/facinissuesonit/transactions-sample-data.txt",
"TRANSACTION_DATE" => "07/24/2017",
"@timestamp" => 2017-07-24T04:00:00.000Z,
"SERVER" => "SVR-1",
"@version" => "1",
"host" => "facingissuesonit.saurabh.com",
"TRANSACTION_TYPE" => "In Store",
"TRANSACTION_COUNT" => "12"
}
{
"path" => "/opt/app/facinissuesonit/transactions-sample-data.txt",
"TRANSACTION_DATE" => "07/25/2017",
"@timestamp" => 2017-07-25T04:00:00.000Z,
"SERVER" => "SVR-2",
"@version" => "1",
"host" => "facingissuesonit.saurabh.com",
"TRANSACTION_TYPE" => "New Customer",
"TRANSACTION_COUNT" => "13"
}
{
"path" => "/opt/app/facinissuesonit/transactions-sample-data.txt",
"TRANSACTION_DATE" => "07/26/2017",
"@timestamp" => 2017-07-26T04:00:00.000Z,
"SERVER" => "SVR-3",
"@version" => "1",
"host" => "facingissuesonit.saurabh.com",
"TRANSACTION_TYPE" => "Online Customer",
"TRANSACTION_COUNT" => "18"
}
{
"path" => "/opt/app/facinissuesonit/transactions-sample-data.txt",
"TRANSACTION_DATE" => "07/24/2017",
"@timestamp" => 2017-07-24T04:00:00.000Z,
"SERVER" => "SVR-2",
"@version" => "1",
"host" => "facingissuesonit.saurabh.com",
"TRANSACTION_TYPE" => "Agents",
"TRANSACTION_COUNT" => "21"
}
{
"path" => "/opt/app/facinissuesonit/transactions-sample-data.txt",
"TRANSACTION_DATE" => "07/25/2017",
"@timestamp" => 2017-07-25T04:00:00.000Z,
"SERVER" => "SVR-3",
"@version" => "1",
"host" => "facingissuesonit.saurabh.com",
"TRANSACTION_TYPE" => "In Store",
"TRANSACTION_COUNT" => "13"
}

Summary

In above detail cover about below points:

  • Logstash File Input reading.
  • How to apply CSV filter for “|” and map with fields.
  • How to drop header line if exist in CSV file
  • Date Filter to get Index Timestamp value based on fields and pattern
  • Dynamic Index Name for each day by appending date format
  • Start Logstash on background for configuration file.

Read More

To read more on Logstash Configuration,Input Plugins, Filter Plugins, Output Plugins, Logstash Customization and related issues follow Logstash Tutorial and Logstash Issues.

Hope this blog was helpful for you.

Leave you feedback to enhance more on this topic so that make it more helpful for others.

Reference:

https://www.elastic.co/guide/en/logstash/current/plugins-filters-csv.html

About Saurabh Gupta

My Name is Saurabh Gupta, I have approx. 10 Year of experience in Information Technology World manly in Java/J2EE. During this time I have worked with multiple organization with different client, so many technology, frameworks etc.
This entry was posted in Centralize logging, Elasticsearch, ELK, Logstash, Tutorials and tagged , , , , , , , , , , . Bookmark the permalink.