Category Archives: YAML

YAML and JAVA Configuration


The following libraries provide YAML support for Java. This post focuses only on SnakeYAML, which is widely used in industry.

  • JvYaml
  • SnakeYAML
  • YamlBeans
  • JYaml
  • Camel

SnakeYAML Configuration

JAVA


<dependency>
  <groupId>org.yaml</groupId>
  <artifactId>snakeyaml</artifactId>
  <version>1.20-SNAPSHOT</version>
</dependency>

Android


<dependency>
  <groupId>org.yaml</groupId>
  <artifactId>snakeyaml</artifactId>
  <version>1.20-SNAPSHOT</version>
  <classifier>android</classifier>
</dependency>

Every operation with the SnakeYAML API follows the steps below.
Class:

org.yaml.snakeyaml.Yaml

Initialization:
Yaml yaml = new Yaml();

Loading YAML:

  • yaml.load(String) accepts a String.
  • yaml.load(InputStream) accepts an InputStream.

yaml.load(InputStream) detects the encoding by checking for a BOM (byte order mark) sequence at the beginning of the stream. If no BOM is present, UTF-8 is assumed.
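
The BOM check described above can be illustrated with plain Java. This is only a sketch of the idea, not SnakeYAML's own code: the class name BomSniffer and its methods are made up for this example, and it assumes only the UTF-8 and UTF-16 byte order marks need to be recognized.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper showing BOM-based encoding detection; not part of SnakeYAML. */
final class BomSniffer {
    private BomSniffer() {}

    /** Returns the charset named by a leading BOM, or UTF-8 when no BOM is present. */
    static Charset detect(byte[] head) {
        if (startsWith(head, 0xEF, 0xBB, 0xBF)) return StandardCharsets.UTF_8;
        if (startsWith(head, 0xFE, 0xFF)) return StandardCharsets.UTF_16BE;
        if (startsWith(head, 0xFF, 0xFE)) return StandardCharsets.UTF_16LE;
        return StandardCharsets.UTF_8; // no BOM: assume UTF-8
    }

    /** Compares the first bytes of data with the given unsigned byte values. */
    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] plain = "name: demo".getBytes(StandardCharsets.UTF_8);
        System.out.println(detect(plain));
    }
}
```

In real code you do not need this helper: passing the InputStream to yaml.load is enough, because the library performs the detection itself.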

More

To learn more about YAML syntax, configuration with Java and other supported languages, frameworks and tools, sample configuration files, and JSON/YAML conversion, see the YAML Tutorials. For YAML-related exceptions, see YAML Issues.

JAVA : How to convert YAML Documents to JSON List?


The code below loads every YAML document in a file into Java objects with the SnakeYAML API, which is the first step in converting them to JSON. Jackson also supports YAML through its jackson-dataformat-yaml module.

Pre-Requisite



    <dependencies>
        <!-- Jackson JSON Processor -->
        <dependency>
            <groupId>com.fasterxml.jackson.core</groupId>
            <artifactId>jackson-databind</artifactId>
            <version>2.4.1</version>
        </dependency>
        <!-- For YAML -->
        <dependency>
            <groupId>org.yaml</groupId>
            <artifactId>snakeyaml</artifactId>
            <version>1.21</version>
        </dependency>
        <dependency>
            <groupId>com.fasterxml.jackson.dataformat</groupId>
            <artifactId>jackson-dataformat-yaml</artifactId>
            <version>2.1.2</version>
        </dependency>
    </dependencies>

Sample YAML Documents File


---
# My personal record
name: Saurabh Kumar Gupta
Title: Sr. Project Lead
skill: JAVA/J2EE
employed: True
domains:
    - Telecom
    - Finance
    - Banking
    - Healthcare
languages:
    ELK: Medium
    JAVA: Expertize
    Scripting: Comfortable
education: |
    MCA
    B.Sc
    Diploma

---
# Gaurav personal record
name: Gaurav Gupta
Title: Project Lead
skill: ELK
employed: True
domains:
    - Telecom
    - Banking
    - Healthcare
languages:
    ELK: Medium
    JAVA: Expertize
    Scripting: Comfortable
    Bigdata: Expertize
education: |
    MCA
    B.Sc

Code to Convert YAML documents to JSON Objects

package com.fiot.examples.yaml;

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import org.yaml.snakeyaml.Yaml;
import org.yaml.snakeyaml.constructor.SafeConstructor;
import org.yaml.snakeyaml.error.YAMLException;

public class ConvertYAMLObjectsToJSON {
	public static void main(String[] args) {
		// try-with-resources closes the stream even when parsing fails
		try (InputStream input = new FileInputStream(new File(
				"F:\\Workspace-Blog\\TestExamples\\src\\main\\resources\\YAMLDocument2.yaml"))) {
			// SafeConstructor limits loading to standard Java types (Map, List, String, ...)
			Yaml yaml = new Yaml(new SafeConstructor());
			// loadAll is lazy: each document is parsed when the loop advances
			for (Object document : yaml.loadAll(input)) {
				System.out.println(document);
			}
		} catch (IOException | YAMLException e) {
			System.out.println("ERROR: " + e.getMessage());
		}
	}
}

Output


{name=Saurabh Kumar Gupta, Title=Sr. Project Lead, skill=JAVA/J2EE, employed=true, domains=[Telecom, Finance, Banking, Healthcare], languages={ELK=Medium, JAVA=Expertize, Scripting=Comfortable}, education=MCA
B.Sc
Diploma
}
{name=Gaurav Gupta, Title=Project Lead, skill=ELK, employed=true, domains=[Telecom, Banking, Healthcare], languages={ELK=Medium, JAVA=Expertize, Scripting=Comfortable, Bigdata=Expertize}, education=MCA
B.Sc}
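
Note that the output above is the Java Map.toString() form of each document, not JSON. With the Jackson dependency from the pre-requisites you would normally pass each loaded object to ObjectMapper.writeValueAsString. To show what that step does, here is a small dependency-free sketch; the class MiniJson is hypothetical, and it assumes the documents contain only maps, lists, strings, booleans and finite numbers.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical minimal JSON writer for the object trees that a YAML loader returns. */
final class MiniJson {
    private MiniJson() {}

    /** Serializes maps, collections, strings, booleans and numbers to compact JSON. */
    static String toJson(Object value) {
        if (value == null) return "null";
        if (value instanceof Boolean || value instanceof Number) return value.toString();
        if (value instanceof Map) {
            return ((Map<?, ?>) value).entrySet().stream()
                    .map(e -> quote(String.valueOf(e.getKey())) + ":" + toJson(e.getValue()))
                    .collect(Collectors.joining(",", "{", "}"));
        }
        if (value instanceof Collection) {
            return ((Collection<?>) value).stream()
                    .map(MiniJson::toJson)
                    .collect(Collectors.joining(",", "[", "]"));
        }
        return quote(value.toString());
    }

    /** Wraps a string in double quotes and escapes characters JSON does not allow raw. */
    private static String quote(String s) {
        StringBuilder out = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"': out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                default:
                    if (c < 0x20) out.append(String.format("\\u%04x", (int) c));
                    else out.append(c);
            }
        }
        return out.append('"').toString();
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("name", "Gaurav Gupta");
        doc.put("employed", true);
        doc.put("domains", Arrays.asList("Telecom", "Banking", "Healthcare"));
        System.out.println(toJson(doc));
    }
}
```

Calling MiniJson.toJson on each object returned by the loop gives one JSON object per YAML document.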


JAVA : How to convert YAML To JSON?


The code below converts a YAML document to JSON with the Jackson API. Jackson reads YAML through its jackson-dataformat-yaml module, which is built on SnakeYAML.

Pre-Requisite


    <dependencies>
        <!-- Jackson JSON Processor -->
        <dependency>
            <groupId>com.fasterxml.jackson.core</groupId>
            <artifactId>jackson-databind</artifactId>
            <version>2.4.1</version>
        </dependency>
        <!-- For YAML -->
        <dependency>
            <groupId>org.yaml</groupId>
            <artifactId>snakeyaml</artifactId>
            <version>1.21</version>
        </dependency>
        <dependency>
            <groupId>com.fasterxml.jackson.dataformat</groupId>
            <artifactId>jackson-dataformat-yaml</artifactId>
            <version>2.1.2</version>
        </dependency>
    </dependencies>

Sample YAML File


---
# My personal record
name: Saurabh Kumar Gupta
Title: Sr. Project Lead
skill: JAVA/J2EE
employed: True
domains:
    - Telecom
    - Finance
    - Banking
    - Healthcare
languages:
    ELK: Medium
    JAVA: Expertize
    Scripting: Comfortable
education: |
    MCA
    B.Sc
    Diploma
...

Code to convert YAML to JSON data

package com.fiot.examples.yaml;

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.dataformat.yaml.YAMLFactory;

public class ConvertYAMLToJSON {
	public static void main(String[] args) {
		try {
			// Read the file as UTF-8 instead of relying on the platform default charset
			String content = new String(Files.readAllBytes(Paths.get(
					"F:\\Workspace-Blog\\TestExamples\\src\\main\\resources\\YAMLDocument.yaml")),
					StandardCharsets.UTF_8);
			System.out.println("*********Content from YAML File ****************");
			System.out.println(content);
			String json = convertYamlToJson(content);
			System.out.println("*********Converted JSON from YAML File ****************");
			System.out.println(json);
		} catch (IOException e) {
			e.printStackTrace();
		}
	}

	private static String convertYamlToJson(String yaml) throws IOException {
		// Parse the YAML text into plain Java objects (Map, List, String, ...)
		ObjectMapper yamlReader = new ObjectMapper(new YAMLFactory());
		Object obj = yamlReader.readValue(yaml, Object.class);
		// Write the same object tree back out as pretty-printed JSON
		ObjectMapper jsonWriter = new ObjectMapper();
		return jsonWriter.writerWithDefaultPrettyPrinter().writeValueAsString(obj);
	}
}

Output


*********Content from YAML File ****************
---
# My personal record
name: Saurabh Kumar Gupta
Title: Sr. Project Lead
skill: JAVA/J2EE
employed: True
domains:
    - Telecom
    - Finance
    - Banking
    - Healthcare
languages:
    ELK: Medium
    JAVA: Expertize
    Scripting: Comfortable
education: |
    MCA
    B.Sc
    Diploma
...
*********Converted JSON from YAML File ****************
{
  "name" : "Saurabh Kumar Gupta",
  "Title" : "Sr. Project Lead",
  "skill" : "JAVA/J2EE",
  "employed" : true,
  "domains" : [ "Telecom", "Finance", "Banking", "Healthcare" ],
  "languages" : {
    "ELK" : "Medium",
    "JAVA" : "Expertize",
    "Scripting" : "Comfortable"
  },
  "education" : "MCA\nB.Sc\nDiploma\n"
}

Below are some online tools to convert YAML/YML to JSON.
https://codebeautify.org/yaml-to-json-xml-csv
http://convertjson.com/yaml-to-json.htm


Difference between YAML and JSON


 

“YAML is a superset of JSON”

The comparison below covers both the conceptual differences between YAML and JSON and the differences in how they are written.

YAML vs JSON

  • YAML is best suited for configuration, while JSON is better as a serialization format or for serving data from your APIs.
  • YAML is by no means a replacement for JSON. Use the data format that makes the most sense for what you are trying to accomplish.

YAML Advantage

  • YAML has several big advantages, including the ability to self-reference, support for complex data types, embedded block literals, comments, and more.
  • Write your configuration files in YAML where you have the opportunity. It is designed to be readable and easily editable by humans.

JSON Disadvantage

  • JSON is designed to be human readable, but it intentionally lacks features that support editing.
  • JSON doesn’t support comments. They were intentionally left out of the JSON specification because that is not what the format was designed for.

JSON vs YAML

  • JSON is well suited as a serialization format for data interchange between APIs over a network.
  • JSON ships with a far simpler specification than YAML.
  • JSON is faster to learn than YAML because its feature set is much smaller.
  • YAML is a superset of JSON, which means you can parse JSON with a YAML parser.

JSON Advantage

  • JSON is best for data interchange.

Disadvantage of YAML

  • YAML parsers are younger, and some have had security problems, for example when untrusted input is loaded into arbitrary objects.
  • YAML is mainly designed for configuration. When it is used for data interchange, many of YAML's features lose their appeal.

Syntax Difference between YAML and JSON

Below are some syntax differences between YAML and JSON when writing files:

JSON Syntax

  • JSON syntax is a subset of JavaScript object notation syntax.
  • JSON data is stored in name/value pairs.
  • JSON records are separated by commas.
  • JSON field names and strings are wrapped in double quotes.

YAML Syntax

  • YAML stands for "YAML Ain’t Markup Language" and is a superset of JSON, so YAML can be converted to JSON.
  • YAML files may begin with '---', marking the start of the document.
  • YAML documents may end with '...', but it is optional.
  • YAML key-value pairs are separated by a colon.
  • YAML list items begin with a hyphen.
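
To see these syntax rules side by side, the sketch below writes the same small record once in JSON and once in YAML. It is a simplified illustration and not a real serializer: the class SyntaxComparison is hypothetical, it handles only string values and lists of strings, and it assumes the values contain no characters that need escaping or quoting.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical helper that renders one flat record in both syntaxes. */
final class SyntaxComparison {
    private SyntaxComparison() {}

    /** JSON: names and strings in double quotes, pairs separated by commas. */
    static String asJson(Map<String, Object> record) {
        List<String> pairs = new ArrayList<>();
        for (Map.Entry<String, Object> e : record.entrySet()) {
            Object v = e.getValue();
            String rendered = (v instanceof List)
                    ? ((List<?>) v).stream().map(i -> "\"" + i + "\"")
                            .collect(Collectors.joining(", ", "[", "]"))
                    : "\"" + v + "\"";
            pairs.add("\"" + e.getKey() + "\": " + rendered);
        }
        return "{" + String.join(", ", pairs) + "}";
    }

    /** YAML: '---' document start, 'key: value' pairs, list items starting with a hyphen. */
    static String asYaml(Map<String, Object> record) {
        StringBuilder out = new StringBuilder("---\n");
        for (Map.Entry<String, Object> e : record.entrySet()) {
            Object v = e.getValue();
            if (v instanceof List) {
                out.append(e.getKey()).append(":\n");
                for (Object item : (List<?>) v) out.append("  - ").append(item).append('\n');
            } else {
                out.append(e.getKey()).append(": ").append(v).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, Object> record = new LinkedHashMap<>();
        record.put("name", "Saurabh");
        record.put("domains", Arrays.asList("Telecom", "Finance"));
        System.out.println(asJson(record));
        System.out.print(asYaml(record));
    }
}
```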


YAML Introduction


YAML (YAML Ain’t Markup Language) is a human-friendly, cross-language, Unicode-based data serialization format designed to work with all programming languages.

File Name Extension : .yml or .yaml

Latest Release : 1.2

It is widely used in programming for:

  • Configuration files
  • Internet messaging
  • Object persistence
  • Data auditing

YAML also supports Unicode characters.

YAML lets data present itself in a natural and meaningful way, and it stays clean by minimizing the number of structural characters.
For example, indentation is used for structure, colons separate “mapping key: value” pairs, and dashes create “bullet” lists for collections of data.

YAML is a superset of JSON: you can convert YAML to JSON with various APIs.

YAML Goals

The main design goals for YAML are:

  • YAML is easily readable and understandable by humans.
  • YAML is expressive and extensible.
  • YAML is easy to implement and use.
  • YAML matches the native data structures of agile languages.
  • YAML data is portable between programming languages.
  • YAML has a consistent model to support generic tools.
  • YAML supports one-pass processing.

YAML Documents Processing

When a stream of several YAML documents is loaded with SnakeYAML's yaml.loadAll, each document is parsed only when the iterator is invoked (lazy evaluation).
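
Lazy evaluation can be illustrated with a small iterator that splits a stream of lines into documents at the '---' marker and reads only as far as the caller asks. This is a sketch of the idea only, not a YAML parser and not SnakeYAML's code: the class LazyDocuments and its linesConsumed counter are invented for the example, and it ignores the optional '...' end marker.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical iterator that builds each document only when it is requested. */
final class LazyDocuments implements Iterator<String> {
    private final Iterator<String> lines;
    private String next;       // the next document, built on demand
    int linesConsumed = 0;     // shows how far the input has been read so far

    LazyDocuments(Iterator<String> lines) {
        this.lines = lines;
    }

    @Override
    public boolean hasNext() {
        if (next == null) next = readDocument();
        return next != null;
    }

    @Override
    public String next() {
        if (!hasNext()) throw new NoSuchElementException();
        String doc = next;
        next = null;
        return doc;
    }

    /** Reads lines up to the next '---' marker; returns null when the input is exhausted. */
    private String readDocument() {
        StringBuilder doc = new StringBuilder();
        while (lines.hasNext()) {
            String line = lines.next();
            linesConsumed++;
            if (line.equals("---")) {
                if (doc.length() > 0) return doc.toString(); // marker closes the current document
                continue;                                    // marker before the first content line
            }
            doc.append(line).append('\n');
        }
        return doc.length() > 0 ? doc.toString() : null;
    }

    public static void main(String[] args) {
        LazyDocuments docs = new LazyDocuments(
                Arrays.asList("---", "name: A", "---", "name: B").iterator());
        System.out.println("read before first call: " + docs.linesConsumed);
        System.out.print(docs.next());
        System.out.println("read after first document: " + docs.linesConsumed);
    }
}
```

The practical consequence is the same as with loadAll: keep the input stream open until you have finished iterating, because nothing is read before the loop runs.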


How to validate YAML/YML?


Create YAML files by following the YAML syntax and semantics explained in the previous post, YAML Syntax.

The online tools below validate a YAML/YML file. You just need to copy and paste its content into them.

http://www.yamllint.com/

https://codebeautify.org/yaml-validator

http://beautifytools.com/yaml-validator.php

https://jsonformatter.org/yaml-validator


YAML Syntax


The example YAML file below is used to explain YAML syntax so that it is easy to understand.

Example :

---
# My personal record
name: Saurabh Kumar Gupta
Title: Sr. Project Lead
skill: JAVA/J2EE
employed: True
domains:
    - Telecom
    - Finance
    - Banking
    - Healthcare
languages:
    ELK: Medium
    JAVA: Expertize
    Scripting: Comfortable
education: |
    MCA
    B.Sc
    Diploma
...

Comments

Comments in a YAML document start with #.

Example: # My personal record

Documents

A YAML document starts with --- and ends with ... (optional).

Example: the YAML content above contains only one document.

List

All members of a YAML list are lines at the same indentation level that begin with “- ” (a dash and a space).

Example: the YAML content above contains the domains list shown below.

domains:
    - Telecom
    - Finance
    - Banking
    - Healthcare

Dictionary

A YAML dictionary is written in a simple key: value form (the colon must be followed by a space).

Example: the YAML content above contains the languages dictionary shown below.

languages:
    ELK: Medium
    JAVA: Expertize
    Scripting: Comfortable

List and Dictionary Together

---
# Employee records
- Saurabh:
    name: Saurabh Kumar Gupta
    Title: Sr. Project Lead
    skill: JAVA/J2EE
    employed: True
    domains:
      - Telecom
      - Finance
      - Banking
      - Healthcare
- Gaurav:
    name: Gaurav Kumar Gupta
    Title: Project Lead
    skill: ELK
    employed: True
    domains:
      - Telecom
      - Finance


Boolean

A boolean value (true/false) can be written in several forms.

employed: yes
employed: no
employed: True
employed: false
employed: TRUE

To use such a value as a string literal, quote it as below:

employed: "yes"
employed: "no"
employed: "True"
employed: "false"
employed: "TRUE"
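
The sketch below shows how a YAML 1.1 style parser could decide whether an unquoted scalar is a boolean. It is an illustration under assumptions: the class YamlBooleans is hypothetical, and the accepted spellings are the yes/no, true/false and on/off forms of the YAML 1.1 boolean type. A parser that follows the YAML 1.2 core schema accepts only the true/false spellings, so yes and no would stay strings there.

```java
import java.util.regex.Pattern;

/** Hypothetical resolver for YAML 1.1 style boolean scalars. */
final class YamlBooleans {
    private static final Pattern TRUE_FORMS =
            Pattern.compile("yes|Yes|YES|true|True|TRUE|on|On|ON");
    private static final Pattern FALSE_FORMS =
            Pattern.compile("no|No|NO|false|False|FALSE|off|Off|OFF");

    private YamlBooleans() {}

    /**
     * Returns TRUE or FALSE for an unquoted boolean scalar,
     * or null when the scalar is not a boolean and stays a string.
     */
    static Boolean resolve(String unquotedScalar) {
        if (TRUE_FORMS.matcher(unquotedScalar).matches()) return Boolean.TRUE;
        if (FALSE_FORMS.matcher(unquotedScalar).matches()) return Boolean.FALSE;
        return null;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"yes", "no", "True", "false", "TRUE", "maybe"}) {
            System.out.println(s + " -> " + resolve(s));
        }
    }
}
```

A quoted value such as "yes" never reaches this check, which is why quoting keeps it a string.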

Multiline Value

A YAML value can span multiple lines in two ways, using | or >.

  • “Literal Block Scalar” | keeps the newlines and any trailing spaces.
  • “Folded Block Scalar” > folds newlines into spaces.

Both styles make what would otherwise be a very long line easier to read and edit. In either case the leading indentation is ignored.

education: |
    MCA
    B.Sc
    Diploma

and

education: >
     MCA
     B.Sc
     Diploma
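
The difference between the two styles can be sketched in Java as two ways of joining the same lines. This is a simplified illustration, not a parser: the class BlockScalars is hypothetical, it assumes the block indentation has already been stripped, and it ignores blank lines and more-indented lines, which real folding treats specially.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper showing what a parser returns for | and > block scalars. */
final class BlockScalars {
    private BlockScalars() {}

    /** Literal style (|): every line break is kept, plus one final newline. */
    static String literal(List<String> lines) {
        return String.join("\n", lines) + "\n";
    }

    /** Folded style (>): line breaks between lines become spaces, plus one final newline. */
    static String folded(List<String> lines) {
        return String.join(" ", lines) + "\n";
    }

    public static void main(String[] args) {
        List<String> education = Arrays.asList("MCA", "B.Sc", "Diploma");
        System.out.print(literal(education));
        System.out.print(folded(education));
    }
}
```

The literal result is the same string that a JSON conversion of this value would show as "MCA\nB.Sc\nDiploma\n".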

Special Characters

YAML allows most values to be written unquoted, but in some special cases a value must be quoted.

A colon (:) marks a mapping and a hash (#) starts a comment. If a text value contains a colon followed by a space, or a hash preceded by a space, wrap the value in single (' ') or double (" ") quotes.

Example :

description: 'you can write your code here: so that we can copy.'
or
description: "you can write your code here: so that we can copy."

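The difference between single quotes and double quotes is that double quotes allow escape sequences such as \n or \". Inside single quotes the only escape is a doubled single quote ('').

description: "it's time to go home.\nSee you tomorrow."
description: 'it''s time to go home.'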

Variables

YAML itself has no variable syntax. Templating tools that process YAML files, such as Ansible, use “{{ var }}” for variables. If a value after a colon starts with a “{” (curly bracket), YAML treats it as a dictionary, so you must quote it, like so:

Example :

log_file: "{{ LOG_DIR }}\\apps\\app_logs-*.log"

Note:

These reserved or special characters (' " [ ] { } > | * & ! % # ` @ ,) cannot be used as the first character of an unquoted scalar. The characters ? : - are allowed at the beginning of a string only if a non-space character follows. It is better to always quote scalar values of this kind.
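
The quoting rules in this section can be collected into a small check. The sketch below is a hypothetical helper (the class YamlQuoting is not part of any library) that covers only the cases described above: reserved first characters, a colon followed by a space, and a hash preceded by a space. It does not handle values that would be read as booleans, numbers or null.

```java
/** Hypothetical helper that decides whether a string value needs quotes in YAML. */
final class YamlQuoting {
    // Characters that must not start an unquoted scalar
    private static final String RESERVED_FIRST = "'\"[]{}>|*&!%#`@,";

    private YamlQuoting() {}

    static boolean needsQuotes(String value) {
        if (value.isEmpty()) return true; // an empty unquoted value would not stay a string
        char first = value.charAt(0);
        if (RESERVED_FIRST.indexOf(first) >= 0) return true;
        // '?', ':' and '-' may start a value only when a non-space character follows
        if ((first == '?' || first == ':' || first == '-')
                && (value.length() == 1 || value.charAt(1) == ' ')) {
            return true;
        }
        // ": " would be read as a mapping, " #" as the start of a comment
        return value.contains(": ") || value.contains(" #") || value.endsWith(":");
    }

    /** Wraps the value in single quotes when needed, doubling any single quote inside it. */
    static String quote(String value) {
        return needsQuotes(value) ? "'" + value.replace("'", "''") + "'" : value;
    }

    public static void main(String[] args) {
        System.out.println("description: "
                + quote("you can write your code here: so that we can copy."));
        System.out.println("skill: " + quote("JAVA/J2EE"));
    }
}
```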

YAML Supporting Language, Framework and Tools


YAML is supported in many languages and frameworks. The supporting libraries and tools are listed below by language:

C/C++ Libraries:

  • libyaml
  • libcyaml
  • Syck
  • yaml-cpp

Crystal:

  • YAML

Ruby:

  • psych
  • RbYaml
  • yaml4r

Python:

  • PyYAML
  • ruamel.yaml
  • PySyck

Java:

  • JvYaml
  • SnakeYAML
  • YamlBeans
  • JYaml
  • Camel

Perl Modules:

  • YAML
  • YAML::XS
  • YAML::Syck
  • YAML::Tiny
  • PlYaml

C#/.NET:

  • YamlDotNet
  • yaml-net
  • yatools.net

Golang:

  • Go-yaml
  • Go-gypsy

PHP:

  • php-yaml
  • syck
  • spyc

OCaml:

  • ocaml-syck

Javascript:

  • JS-YAML
  • JS-YAML Online

Actionscript:

  • as3yaml

Haskell:

  • YamlReference

Dart:

  • yaml

Rust:

  • yaml-rust
  • serde-yaml

Nim:

  • NimYAML

Others:

  • yamlvim (src)

Related Projects:

  • Rx
  • Kwalify
  • yaml_vim
  • yatools.net
  • JSON
  • Pygments

References

http://yaml.org/


How to Enable YAML Editor in Eclipse?


Follow the steps below to enable a YAML/YML editor in the Eclipse IDE.

  • Open the Eclipse IDE.
  • Go to the Help menu -> Eclipse Marketplace.
  • Search for the term “YAML”.
  • From the many results, select “YAML Editor” and click Install.
  • Select the check box for “Accept Terms and Conditions”.
  • Click the Apply button and restart Eclipse.


Sample filebeat.yml file for Prospectors, Kafka Output and Logging Configuration


Copy the sample below into filebeat.yml and run Filebeat after making the following changes for your environment's directory structure. For the run steps, see Filebeat Download, Installation and Start/Run.

  • In the Prospectors section, change the log file directory and file name.
  • Adjust the multiline pattern to your log format. The current pattern is generic and should work with most formats.
  • In the Kafka Output section, change the host, port, topic and other settings as required.
  • Change the logging directory to match your machine.

Sample filebeat.yml file

#============= Filebeat prospectors ===============

filebeat.prospectors:

# Multiple prospectors can be defined here, each with its own shipping method and rules.
# To read logs from several files in the same directory, use a glob pattern.

# Filebeat supports only two values for input_type: log and stdin.

############## input type log configuration #####################

- input_type: log

  # Paths of the files from which logs are read. Use a glob pattern to read from multiple files.
  paths:
    - /opt/app/app1/logs/app1-debug*.log*
  # Set fields_under_root to true to store custom fields at the top level of the
  # output document instead of under a "fields" key.
  fields_under_root: true

  ### Multiline configuration for handling stack traces, objects, XML and similar records.
  ### When enabled with the settings below, such records are shipped as one multiline event.

  # The regexp pattern that has to be matched. The example pattern matches all lines
  # starting with a bracketed log level such as [DEBUG, [ALERT, [TRACE or [WARNING.
  # Customize it according to your log line format.
  #multiline.pattern: '^\[([Aa]lert|ALERT|[Tt]race|TRACE|[Dd]ebug|DEBUG|[Nn]otice|NOTICE|[Ii]nfo|INFO|[Ww]arn?(?:ing)?|WARN?(?:ING)?|[Ee]rr?(?:or)?|ERR?(?:OR)?|[Cc]rit?(?:ical)?|CRIT?(?:ICAL)?|[Ff]atal|FATAL|[Ss]evere|SEVERE|EMERG(?:ENCY)?|[Ee]merg(?:ency)?)'

  # Defines whether the pattern match should be negated. The default is false.
  #multiline.negate: true

  # Defines whether the other lines are appended after or before the line that
  # starts an event. Possible values are "after" or "before".
  #multiline.match: after

  # Maximum number of lines combined into one event. Lines beyond this limit are ignored.
  #multiline.max_lines: 50

#========== Kafka output configuration ============================
output.kafka:
  # Enables or disables this output. Output modules are discussed further in the Filebeat module section.
  #enabled: true

  # Kafka broker hosts and ports from which the cluster metadata is fetched.
  # The metadata lists the brokers to which events are published.
  hosts: ["kafkahost:port"]

  # The Kafka topic to which events are published.
  topic: QC-LOGS

  # No key is set by default, but a formatted key can be configured.
  #key: ''

  # The default partition strategy is 'hash', based on the configured key.
  # If no key is set, events are distributed randomly.
  #partition.hash:

    # The default is false. If reachable_only is enabled, events are published
    # only to reachable Kafka brokers.
    #reachable_only: false

    # Alternative event field names used to compute the hash value.
    # If empty, the `output.kafka.key` setting is used.
    # The default is an empty list.
    #hash: []

  # Required if authentication is enabled on the Kafka brokers.
  #username: ''
  #password: ''

  # Kafka broker version, so that Filebeat can check compatibility with it.
  #version: 0.8.2

  # Metadata settings. Filebeat needs the broker metadata to decide where to
  # publish events based on the status of the brokers.
  #metadata:

    # Maximum number of retries when selecting an available broker. The default is 3.
    #retry.max: 3

    # Time to wait before the next retry. The default is 250ms.
    #retry.backoff: 250ms

    # Interval at which the metadata is refreshed. The default is 10 minutes.
    #refresh_frequency: 10m

  # Number of workers that run for each configured Kafka broker.
  #worker: 1

  # The default is 3. If set to a value less than 0, Filebeat retries until all events are published.
  #max_retries: 3

  # Maximum number of events published to Kafka in one request. The default is 2048.
  #bulk_max_size: 2048

  # Time to wait for a response from the Kafka broker before timing out. The default is 30 seconds.
  #timeout: 30s

  # Maximum time a broker waits for the required number of acknowledgements. The default is 10 seconds.
  #broker_timeout: 10s

  # Number of messages buffered for each Kafka broker. The default is 256.
  #channel_buffer_size: 256

  # Keep-alive period for an active network connection. The default is 0 seconds,
  # which disables keep-alive.
  #keep_alive: 0

  # Compression codec: none, snappy or gzip. The default is gzip.
  compression: gzip

  # Maximum size of a JSON-encoded message. Larger events are dropped. The default is 1000000 bytes.
  max_message_bytes: 1000000

  # Number of acknowledgements required from the broker, for reliability. The default is 1.
  #  0 = no response, messages can be lost if an error happens
  #  1 = wait for local commit
  # -1 = wait for all replicas to commit
  #required_acks: 1

  # Time to wait for new events between two publish calls.
  #flush_interval: 1s

  # Configurable client ID used for logging, debugging and auditing purposes.
  # The default is "beats".
  #client_id: beats

  # Enable SSL if it is required by the Kafka brokers.
  #ssl.enabled: true

  # SSL configuration is optional and off by default.
  # List of root certificates for HTTPS server verification.
  #ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]

  # Verification mode used when SSL is configured. The default is full.
  # 'none' can be used for testing, but in this mode any certificate is accepted.
  #ssl.verification_mode: full

  # List of supported/valid TLS versions. By default TLS versions 1.0 up to 1.2 are enabled.
  #ssl.supported_protocols: [TLSv1.0, TLSv1.1, TLSv1.2]

  # Path to the certificate for SSL
  #ssl.certificate: "/etc/pki/client/cert.pem"

  # Path to the client certificate key
  #ssl.key: "/etc/pki/client/cert.key"

  # Optional passphrase for decrypting the certificate key, needed only if the key is encrypted
  #ssl.key_passphrase: ''

  # Cipher suites to be used for SSL connections
  #ssl.cipher_suites: []

  # Curve types for ECDHE based cipher suites
  #ssl.curve_types: []
#==================== Logging ==============================

# The default log level is info. Each level also records everything from the more severe levels.
# Available log levels are: critical, error, warning, info, debug
logging.level: debug

# Selectors limit debug output to specific components, for example "beat", "publish" and "service".
# Use "*" to enable all of them. Selectors can also be given on the command line when Filebeat starts.
logging.selectors: ["*"]

# The default is false. If true, log output is sent to syslog.
logging.to_syslog: false

# The default is true. All non-zero metrics readings are written out on shutdown.
logging.metrics.enabled: true

# Reporting period for metrics such as the number of lines read from log files.
# A complete report is written when Filebeat shuts down.
logging.metrics.period: 30s

# Set to true to write logs to files. If not set, file logging is disabled.
logging.to_files: true
logging.files:
  # Directory where the log files are written. If not set, the default is the home directory.
  path: /tmp

  # Name of the file to which logs are written
  name: filebeat-app.log

  # The log file is rotated when it reaches this size and a new file is created. The default is 10MB.
  rotateeverybytes: 10485760 # = 10MB

  # Number of most recent rotated log files to keep in the directory. The oldest files are removed.
  keepfiles: 7

  # Enables logging for this level only. Available log levels are: critical, error, warning, info, debug
  level: debug
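
To get a feel for what the multiline settings do with negate set to true and match set to after, the sketch below groups log lines the same way: a line that matches the pattern starts a new event, and every other line is appended to the event before it. This only imitates the behaviour for illustration and is not Filebeat code. The class MultilineGrouper is hypothetical, and the pattern in main is a shortened stand-in for the long pattern in the sample file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical imitation of multiline grouping with negate: true and match: after. */
final class MultilineGrouper {
    private MultilineGrouper() {}

    /** A line matching startOfEvent opens a new event; other lines join the current event. */
    static List<String> group(List<String> lines, Pattern startOfEvent) {
        List<String> events = new ArrayList<>();
        StringBuilder current = null;
        for (String line : lines) {
            boolean startsEvent = startOfEvent.matcher(line).find();
            if (startsEvent || current == null) {
                if (current != null) events.add(current.toString());
                current = new StringBuilder(line);
            } else {
                current.append('\n').append(line);
            }
        }
        if (current != null) events.add(current.toString());
        return events;
    }

    public static void main(String[] args) {
        // Shortened pattern assumed for the example: a bracketed log level at line start
        Pattern level = Pattern.compile("^\\[(DEBUG|INFO|WARN|ERROR)");
        List<String> lines = Arrays.asList(
                "[ERROR] request failed",
                "java.lang.IllegalStateException: boom",
                "\tat com.example.Demo.run(Demo.java:10)",
                "[INFO] request finished");
        for (String event : group(lines, level)) {
            System.out.println("EVENT: " + event);
        }
    }
}
```

Running your own log lines through a check like this is a quick way to confirm that the pattern starts a new event where you expect before you enable it in filebeat.yml.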


Integration

Complete Integration Example Filebeat, Kafka, Logstash, Elasticsearch and Kibana

Read More

To read more on Filebeat topics, sample configuration files and integration with other systems, with examples, follow the links Filebeat Tutorial and Filebeat Issues. To know more about YAML, follow the link YAML Tutorials.

Leave your feedback so that this topic can be improved and made more helpful for others.

Sample filebeat.yml file for Prospectors, Logstash Output and Logging Configuration


Copy the sample below into filebeat.yml and run Filebeat after making the following changes for your environment's directory structure. For the run steps, see Filebeat Download, Installation and Start/Run.

  • In the Prospectors section, change the log file directory and file name.
  • Adjust the multiline pattern to your log format. The current pattern is generic and should work with most formats.
  • In the Logstash Output section, change the host, port and other settings as required.
  • Change the logging directory to match your machine.

Sample filebeat.yml file

#============= Filebeat prospectors ===============

filebeat.prospectors:

# Multiple prospectors can be defined here, each with its own shipping method and rules.
# To read logs from several files in the same directory, use a glob pattern.

# Filebeat supports only two values for input_type: log and stdin.

############## input type log configuration #####################

- input_type: log

  # Paths of the files from which logs are read. Use a glob pattern to read from multiple files.
  paths:
    - /opt/app/app1/logs/app1-debug*.log*
  # Set fields_under_root to true to store custom fields at the top level of the
  # output document instead of under a "fields" key.
  fields_under_root: true

  ### Multiline configuration for handling stack traces, objects, XML and similar records.
  ### When enabled with the settings below, such records are shipped as one multiline event.

  # The regexp pattern that has to be matched. The example pattern matches all lines
  # starting with a bracketed log level such as [DEBUG, [ALERT, [TRACE or [WARNING.
  # Customize it according to your log line format.
  #multiline.pattern: '^\[([Aa]lert|ALERT|[Tt]race|TRACE|[Dd]ebug|DEBUG|[Nn]otice|NOTICE|[Ii]nfo|INFO|[Ww]arn?(?:ing)?|WARN?(?:ING)?|[Ee]rr?(?:or)?|ERR?(?:OR)?|[Cc]rit?(?:ical)?|CRIT?(?:ICAL)?|[Ff]atal|FATAL|[Ss]evere|SEVERE|EMERG(?:ENCY)?|[Ee]merg(?:ency)?)'

  # Defines whether the pattern match should be negated. The default is false.
  #multiline.negate: true

  # Defines whether the other lines are appended after or before the line that
  # starts an event. Possible values are "after" or "before".
  #multiline.match: after

  # Maximum number of lines combined into one event. Lines beyond this limit are ignored.
  #multiline.max_lines: 50
#========= Logstash output configuration =======================
output.logstash:
  # Enables or disables this output. Output modules are discussed further in the Filebeat module section.
  #enabled: true

  # Logstash server hosts and ports to which events are published. The default Logstash
  # port is 5044. If the Logstash listener starts on a different port, use that port here.
  #hosts: ["logstashserver:5044"]

  # Number of workers that run for each configured Logstash host.
  #worker: 1

  # Gzip compression level from 1 to 9. A higher level reduces network usage but
  # increases processing cost. A value of 0 disables compression. The default is 3.
  #compression_level: 3

  # The default is false. If true and several hosts are configured, events are load
  # balanced across all Logstash hosts. If false, Filebeat sends events to one randomly
  # selected host and switches to another host if the selected one becomes unresponsive.
  #loadbalance: true

  # Number of batches sent to Logstash asynchronously while waiting for a response.
  # The output blocks only once this number of batches has been written.
  # The default is 0, which disables pipelining.
  #pipelining: 0

  # Filebeat can connect to the Logstash servers through a SOCKS5 proxy.
  # If such a proxy is configured on the server side, set the details below.

  # SOCKS5 proxy URL
  #proxy_url: socks5://userid:pwd@socks5-server:2233

  # The default is false, which means host names are resolved on the proxy server.
  # If true, Logstash host names are resolved locally.
  #proxy_use_local_resolver: false

  # Enable SSL if it is required by the Logstash server.
  #ssl.enabled: true

  # SSL configuration is optional and off by default.
  # List of root certificates for HTTPS server verification.
  #ssl.certificate_authorities: ["/app/pki/root/ca.pem"]

  # Verification mode used when SSL is configured. The default is full.
  # 'none' can be used for testing, but in this mode any certificate is accepted.
  #ssl.verification_mode: full

  # List of supported/valid TLS versions. By default TLS versions 1.0 up to 1.2 are enabled.
  #ssl.supported_protocols: [TLSv1.0, TLSv1.1, TLSv1.2]

  # Path to the certificate for SSL
  #ssl.certificate: "/etc/pki/client/cert.pem"

  # Path to the client certificate key
  #ssl.key: "/etc/pki/client/cert.key"

  # Optional passphrase for decrypting the certificate key, needed only if the key is encrypted
  #ssl.key_passphrase: ''

  # Cipher suites to be used for SSL connections
  #ssl.cipher_suites: []

  # Curve types for ECDHE based cipher suites
  #ssl.curve_types: []
#====================Logging ==============================

# Default log level is info if set above or below will record top this hierarchy #automatically. Available log levels are: critical, error, warning, info, debug

logging.level: debug
# Possible values for selectors are "beat", "publish" and  "service" if you want to enable #for all select value as "*". This selector decide on command line when  start filebeat.
logging.selectors: ["*"]

# The default value is false.If make it true will send out put to syslog.
logging.to_syslog: false
# The default is true. all non-zero metrics  reading are output on shutdown.
logging.metrics.enabled: true

# Period of matrics for log reading counts from log files and it will send complete report #when shutdown filebeat
logging.metrics.period: 30s
# Set this flag as true to enable logging in files if not set that will disable.
logging.to_files: true
logging.files:
# Path of directory where logs file will write if not set default directory will home #directory.
path: /tmp

# Name of files where logs will write
name: filebeat-app.log
# Log File will rotate if reach max size and will create new file. Default value is 10MB
rotateeverybytes: 10485760 # = 10MB

# This will keep recent maximum log files in directory for rotation and remove oldest #files.
keepfiles: 7
# Will enable logging for that level only. Available log levels are: critical, error, warning, #info, debug
level: debug
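
As a rough sketch, the SOCKS5 proxy options described in the sample would nest under the Logstash output as shown below. The Logstash host name and port here are assumptions made for illustration; the proxy URL is the placeholder used in the sample.

```yaml
# Sketch only: adjust every value to your environment.
output.logstash:
  # Hypothetical Logstash host; 5044 is assumed as the Beats input port.
  hosts: ["logstash-host:5044"]
  # Placeholder proxy URL taken from the sample above.
  proxy_url: socks5://userid:pwd@socks5-server:2233
  # false: Logstash host names are resolved on the proxy server.
  proxy_use_local_resolver: false
```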

Sample filebeat.yml File

Integration

Complete Integration Example Filebeat, Kafka, Logstash, Elasticsearch and Kibana

Read More

To read more on Filebeat topics, sample configuration files and integration with other systems with examples, follow the links Filebeat Tutorial and Filebeat Issues. To know more about YAML, follow the link YAML Tutorials.

Leave your feedback to help improve this topic and make it more helpful for others.

Sample filebeat.yml file for Prospectors, Elasticsearch Output and Logging Configuration


Filebeat.yml file with Prospectors, Multiline, Elasticsearch Output and Logging Configuration

You can copy this file to filebeat.yml and run it after making the changes below for your environment's directory structure. Follow the steps described in Filebeat Download, Installation and Start/Run.

  • In the Prospectors section, set your log file directory and file name.
  • Configure the multiline pattern for your log format. For now it is set to a generic pattern that should work with most formats.
  • In the Elasticsearch output section, change the host, port and other settings if required.
  • Change the logging directory to match your machine's directory structure.
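
In condensed form, the changes listed above touch the keys below. This is only a sketch: the values are the placeholders used in the sample file, and the indentation shows the nesting the keys are meant to have.

```yaml
filebeat.prospectors:
- input_type: log
  paths:
    - /opt/app/app1/logs/app1-debug*.log*   # your log directory and file name

output.elasticsearch:
  hosts: ["elasticsearver:9200"]            # your Elasticsearch host and port

logging.files:
  path: /tmp                                # your logging directory
```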

Sample filebeat.yml file

#=============Filebeat prospectors ===============

filebeat.prospectors:

# Multiple prospectors can be defined here, each with its own shipping method and
# rules as required. To read logs from multiple files in the same directory, a
# glob pattern can be used in the path.

# Filebeat supports only two values for input_type: log and stdin

##############input type logs configuration#####################

- input_type: log

  # Paths of the files that logs are read from. Use a glob pattern to read from
  # multiple files.
  paths:
    - /opt/app/app1/logs/app1-debug*.log*

  # Set fields_under_root to true if custom fields should be placed at the root
  # of the JSON output.
  fields_under_root: true

  ### Multiline configuration for handling stack traces, objects, XML, etc. When
  ### multiline is enabled with the settings below, such entries are shipped as a
  ### single multiline event.

  # The regexp pattern that has to be matched. The example pattern matches all
  # lines starting with a bracketed log level such as [DEBUG, [ALERT, [TRACE or
  # [WARNING. It can be customized according to your log line format.
  #multiline.pattern: '^\[([Aa]lert|ALERT|[Tt]race|TRACE|[Dd]ebug|DEBUG|[Nn]otice|NOTICE|[Ii]nfo|INFO|[Ww]arn?(?:ing)?|WARN?(?:ING)?|[Ee]rr?(?:or)?|ERR?(?:OR)?|[Cc]rit?(?:ical)?|CRIT?(?:ICAL)?|[Ff]atal|FATAL|[Ss]evere|SEVERE|EMERG(?:ENCY)?|[Ee]merg(?:ency)?)'

  # The default is false. Defines whether the pattern match should be negated.
  #multiline.negate: true

  # multiline.match defines where lines that do not match the pattern above are
  # appended. Possible values are "after" and "before".
  #multiline.match: after

  # Maximum number of lines combined into one multiline event. Lines beyond this
  # number are ignored.
  #multiline.max_lines: 50

#==========Elasticsearch Output Configuration=======================
output.elasticsearch:
  # Flag to enable or disable this output module.
  #enabled: true

  # Elasticsearch HTTP server host and port. The default port for Elasticsearch
  # is 9200.
  hosts: ["elasticsearver:9200"]

  # Gzip compression level, from 1 to 9. A higher compression level reduces
  # network usage but slows processing. The default is 0, which disables
  # compression.
  compression_level: 0

  # Optional protocol; the default is http. Set https and the basic auth
  # credentials here if required.
  #protocol: "https"
  #username: "userid"
  #password: "pwd"

  # Number of workers per configured host publishing events to Elasticsearch.
  # Multiple workers provide load balancing.
  #worker: 1

  # Optional index name. The default is "filebeat" plus the date, which generates
  # filebeat-YYYY.MM.DD indices.
  index: "app1-%{+yyyy.MM.dd}"

  # Optional ingest node pipeline. By default no pipeline is used.
  #pipeline: ""

  # Optional HTTP path
  #path: "/elasticsearch"

  # Proxy server URL
  #proxy_url: http://proxy:3128

  # The default is 3. When the retry limit is reached and events are still not
  # published, they are dropped. Set a value less than 0 to retry until all
  # events are published.
  #max_retries: 3

  # The default is 50. If Filebeat generates more events than the configured
  # maximum batch size, it splits them into batches of that size before sending
  # them to Elasticsearch. Larger batches can improve performance but require
  # more buffering, and can cause other issues such as connection errors and
  # request timeouts.
  #bulk_max_size: 50

  # The default is 90 seconds. The HTTP request times out if no response is
  # received within this time.
  #timeout: 90

  # Time to wait for new events between bulk requests. If bulk_max_size is
  # reached before this interval elapses, a new bulk index request is created.
  #flush_interval: 1s

  # Filebeat can load an Elasticsearch index template, which defines the settings
  # and mappings that determine how fields are analyzed.

  # Set to false to disable template loading.
  #template.enabled: true

  # Template name. By default the template name is filebeat.
  #template.name: "app1"

  # Path to the template file
  #template.path: "${path.config}/app1.template.json"

  # Set template.overwrite to true to overwrite an existing template. If the
  # template for version 2.x is needed, set the path of that template file with
  # the settings below.
  #template.overwrite: false
  #template.versions.2x.enabled: true
  #template.versions.2x.path: "${path.config}/filebeat.template-es2x.json"

  # Enable SSL if it is required by the Elasticsearch server
  #ssl.enabled: true

  # SSL configuration is optional and off by default.
  # List of root certificates used for HTTPS server verification
  #ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]

  # The default is full. Controls how the server certificate is verified when SSL
  # is configured. The value 'none' can be used for testing, but in this mode any
  # certificate is accepted.
  #ssl.verification_mode: full

  # List of supported TLS versions. By default, TLS versions 1.0 through 1.2 are
  # enabled. Specific versions can also be listed in the array below.
  #ssl.supported_protocols: [TLSv1.0, TLSv1.1, TLSv1.2]

  # Path of the certificate for SSL
  #ssl.certificate: "/etc/pki/client/cert.pem"

  # Path of the client certificate key
  #ssl.key: "/etc/pki/client/cert.key"

  # Optional passphrase for decrypting the certificate key. Needed only if the
  # key is encrypted.
  #ssl.key_passphrase: ''

  # Configure encryption cipher suites to be used for SSL connections
  #ssl.cipher_suites: []

  # Configure encryption curve types for ECDHE based cipher suites
  #ssl.curve_types: []

#====================Logging ==============================

# The default log level is info. Messages at the configured level and at all more
# severe levels are recorded.
# Available log levels are: critical, error, warning, info, debug
logging.level: debug

# Possible values for selectors are "beat", "publish" and "service". To enable all
# of them, use "*". Selectors can also be set on the command line when starting Filebeat.
logging.selectors: ["*"]

# The default is false. Set to true to send log output to syslog.
logging.to_syslog: false

# The default is true. Totals of all non-zero metrics are logged on shutdown.
logging.metrics.enabled: true

# Period after which the metrics are logged. A complete report is written when
# Filebeat shuts down.
logging.metrics.period: 30s

# Set to true to write log output to files; set to false to disable file logging.
logging.to_files: true
logging.files:
  # Directory where log files are written. If not set, Filebeat's default logs
  # directory is used.
  path: /tmp

  # Name of the log file
  name: filebeat-app.log

  # The log file is rotated when it reaches this size, and a new file is created.
  # The default is 10MB.
  rotateeverybytes: 10485760 # = 10MB

  # Number of most recent log files kept in the directory during rotation. The
  # oldest files are removed.
  keepfiles: 7

  # Log level for file logging. Available log levels are: critical, error,
  # warning, info, debug
  level: debug
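
The multiline options are commented out in the sample above. A minimal sketch of a prospector with them enabled follows, assuming every new log entry starts with "[" (for example "[DEBUG] ..."). The shortened pattern is an assumption for illustration, not the full pattern from the sample.

```yaml
filebeat.prospectors:
- input_type: log
  paths:
    - /opt/app/app1/logs/app1-debug*.log*
  # Assumed log format: a new entry begins with "[".
  multiline.pattern: '^\['
  # negate: true with match: after means that lines NOT matching the pattern
  # are appended to the preceding line that did match.
  multiline.negate: true
  multiline.match: after
  multiline.max_lines: 50
```

With these settings, the lines of a stack trace that follow a "[ERROR] ..." line do not start with "[", so they are shipped together with that line as one event.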

Read More on Filebeat

To know more about YAML, follow the link YAML Tutorials.

Sample filebeat.yml File

Integration

Integrate Filebeat, Kafka, Logstash, Elasticsearch and Kibana

Sample filebeat.yml file for Prospectors and Logging Configuration


Filebeat.yml file with Prospectors, Kafka Output and Logging Configuration

You can copy this file to filebeat.yml and run it after making the changes below for your environment's directory structure. Follow the steps described in Filebeat Download, Installation and Start/Run.

  • In the Prospectors section, set your log file directory and file name.
  • Configure the multiline pattern for your log format. For now it is set to a generic pattern that should work with most formats.
  • In the Kafka output section, change the host, port and topic name as required.
  • Change the logging directory to match your machine's directory structure.

Below is the sample file:

#=============Filebeat prospectors ===============

filebeat.prospectors:

# Multiple prospectors can be defined here, each with its own shipping method and
# rules as required. To read logs from multiple files in the same directory, a
# glob pattern can be used in the path.

# Filebeat supports only two values for input_type: log and stdin

- input_type: log

  # Paths of the files that logs are read from. Use a glob pattern to read from
  # multiple files.
  paths:
    - /opt/app/app1/logs/app1-debug*.log*

  # Set fields_under_root to true if custom fields should be placed at the root
  # of the JSON output.
  fields_under_root: true

  ### Multiline configuration for handling stack traces, objects, XML, etc. When
  ### multiline is enabled with the settings below, such entries are shipped as a
  ### single multiline event.

  # The regexp pattern that has to be matched. The example pattern matches all
  # lines starting with a bracketed log level such as [DEBUG, [ALERT, [TRACE or
  # [WARNING. It can be customized according to your log line format.
  #multiline.pattern: '^\[([Aa]lert|ALERT|[Tt]race|TRACE|[Dd]ebug|DEBUG|[Nn]otice|NOTICE|[Ii]nfo|INFO|[Ww]arn?(?:ing)?|WARN?(?:ING)?|[Ee]rr?(?:or)?|ERR?(?:OR)?|[Cc]rit?(?:ical)?|CRIT?(?:ICAL)?|[Ff]atal|FATAL|[Ss]evere|SEVERE|EMERG(?:ENCY)?|[Ee]merg(?:ency)?)'

  # The default is false. Defines whether the pattern match should be negated.
  #multiline.negate: true

  # multiline.match defines where lines that do not match the pattern above are
  # appended. Possible values are "after" and "before".
  #multiline.match: after

  # Maximum number of lines combined into one multiline event. Lines beyond this
  # number are ignored.
  #multiline.max_lines: 50

#==========Kafka output Configuration ============================
output.kafka:
  # Flag to enable or disable this output module. It is discussed further in the
  # Filebeat module section.
  #enabled: true

  # List all your Kafka broker hosts and ports here. They are used to fetch the
  # cluster metadata, which contains the Kafka brokers that events are published to.
  hosts: ["kafkahost:port"]

  # Kafka topic to which events are published
  topic: QC-LOGS

  # No key is set by default, but a formatted key can be configured.
  #key: ''

  # The default partition strategy is 'hash', using the configured key value. If
  # no key is set, published events are distributed randomly.
  #partition.hash:

    # The default is false. If reachable_only is enabled, events are published
    # only to reachable Kafka brokers.
    #reachable_only: false

    # Configure alternative event field names used to compute the hash value.
    # If empty `output.kafka.key` setting will be used.
    # Default value is empty list.
    #hash: []

  # If authentication is set up on the Kafka broker, the fields below are required.
  #username: ''
  #password: ''

  # Kafka broker version, configured so that Filebeat can check compatibility with it.
  #version: 0.8.2

  # Metadata is required for publishing events to the brokers, so that Filebeat
  # can make decisions based on broker status.
  #metadata:

    # Maximum number of retries when selecting available brokers. The default is 3.
    #retry.max: 3

    # The default is 250ms. Time to wait before the next retry.
    #retry.backoff: 250ms

    # Metadata is refreshed every 10 minutes.
    #refresh_frequency: 10m

  # Number of workers run for each configured Kafka broker
  #worker: 1

  # The default is 3. If set to a value less than 0, Filebeat retries continuously
  # for as long as events are not published.
  #max_retries: 3

  # The default is 2048. Maximum number of events published to Kafka in one request.
  #bulk_max_size: 2048

  # The default is 30 seconds. The request times out if no response is received
  # from the Kafka broker within this time.
  #timeout: 30s

  # The default is 10 seconds. Maximum time the broker waits for the required
  # number of acknowledgements.
  #broker_timeout: 10s

  # Number of messages buffered for each Kafka broker. The default is 256.
  #channel_buffer_size: 256

  # The default is 0 seconds, which disables keep-alive. If set, active network
  # connections are kept alive for that period.
  #keep_alive: 0

  # The default compression codec is gzip. Other possible values are snappy and none.
  compression: gzip

  # The default is 1000000 bytes. Events whose JSON encoding is larger than the
  # configured maximum are dropped.
  max_message_bytes: 1000000

  # ACK reliability level required from the broker. The default is 1.
  # Possible values:
  #    0 = no response; messages can be lost if an error occurs
  #    1 = wait for local commit
  #   -1 = wait for all replicas to commit
  #required_acks: 1

  # Time to wait for new events between two producer API calls
  #flush_interval: 1s

  # Configurable client ID used for logging, debugging and auditing purposes.
  # The default is "beats".
  #client_id: beats

  # Enable SSL if it is required by the Kafka broker
  #ssl.enabled: true

  # SSL configuration is optional and off by default.
  # List of root certificates used for HTTPS server verification
  #ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]

  # The default is full. Controls how the server certificate is verified when SSL
  # is configured. The value 'none' can be used for testing, but in this mode any
  # certificate is accepted.
  #ssl.verification_mode: full

  # List of supported TLS versions. By default, TLS versions 1.0 through 1.2 are
  # enabled. Specific versions can also be listed in the array below.
  #ssl.supported_protocols: [TLSv1.0, TLSv1.1, TLSv1.2]

  # Path of the certificate for SSL
  #ssl.certificate: "/etc/pki/client/cert.pem"

  # Path of the client certificate key
  #ssl.key: "/etc/pki/client/cert.key"

  # Optional passphrase for decrypting the certificate key. Needed only if the
  # key is encrypted.
  #ssl.key_passphrase: ''

  # Configure encryption cipher suites to be used for SSL connections
  #ssl.cipher_suites: []

  # Configure encryption curve types for ECDHE based cipher suites
  #ssl.curve_types: []

#====================Logging ==============================

# The default log level is info. Messages at the configured level and at all more
# severe levels are recorded.
# Available log levels are: critical, error, warning, info, debug
logging.level: debug

# Possible values for selectors are "beat", "publish" and "service". To enable all
# of them, use "*". Selectors can also be set on the command line when starting Filebeat.
logging.selectors: ["*"]

# The default is false. Set to true to send log output to syslog.
logging.to_syslog: false

# The default is true. Totals of all non-zero metrics are logged on shutdown.
logging.metrics.enabled: true

# Period after which the metrics are logged. A complete report is written when
# Filebeat shuts down.
logging.metrics.period: 30s

# Set to true to write log output to files; set to false to disable file logging.
logging.to_files: true
logging.files:
  # Directory where log files are written. If not set, Filebeat's default logs
  # directory is used.
  path: /tmp

  # Name of the log file
  name: filebeat-app.log

  # The log file is rotated when it reaches this size, and a new file is created.
  # The default is 10MB.
  rotateeverybytes: 10485760 # = 10MB

  # Number of most recent log files kept in the directory during rotation. The
  # oldest files are removed.
  keepfiles: 7

  # Log level for file logging. Available log levels are: critical, error,
  # warning, info, debug
  level: debug
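
If the Kafka brokers require SSL and stronger delivery guarantees, the commented options in the sample could be enabled roughly as sketched below. The broker port 9092 is an assumption; the certificate paths are the placeholders from the sample and must exist on your machine.

```yaml
output.kafka:
  # Hypothetical broker address; 9092 is assumed as the Kafka port.
  hosts: ["kafkahost:9092"]
  topic: QC-LOGS
  # -1 = wait for all replicas to commit before an event counts as published.
  required_acks: -1
  ssl.enabled: true
  ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]
  ssl.certificate: "/etc/pki/client/cert.pem"
  ssl.key: "/etc/pki/client/cert.key"
```

Waiting for all replicas trades some throughput for reliability; the sample's default of 1 waits only for the local commit.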

Integration

Complete Integration Example Filebeat, Kafka, Logstash, Elasticsearch and Kibana
