Filebeat, Kafka Output Configuration

To ship server log lines directly to Kafka, follow the steps below:

Prerequisite:

  • Start Kafka before starting Filebeat so that it is already listening for published events, and configure Filebeat with the same Kafka server host and port.

Kafka Output Required Configuration:

  • Comment out the output.elasticsearch section and uncomment the output.kafka section.
  • Set enabled to true to enable the Kafka output.
  • Set hosts to the server where Kafka is listening. The default Kafka port is 9092; if your broker uses a different port, use that value instead.

output.kafka:
  enabled: true
  # Kafka broker(s) to connect to, as host:port
  hosts: ["kafkaserver:9092"]
  # Configure the topic as per your application's needs
  topic: QC-TEST

Kafka Credentials Settings: Set the credentials below if the Kafka broker requires them.

  username: "userid"
  password: "password"

Other Optional Configurations:

Kafka Output Compression Configuration:

The default compression codec is gzip. Other supported codecs include snappy and none (no compression).

compression: gzip

Kafka Output Performance Configuration:

worker: The number of workers publishing events to Kafka. Events are load-balanced across the configured workers.
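As a rough sketch, the worker setting sits alongside the other output.kafka options. The value 2 is only an illustrative assumption, not a recommendation; the host and topic are the sample values used earlier in this post. Merge the key into your existing output.kafka section rather than adding a second one.

```yaml
output.kafka:
  hosts: ["kafkaserver:9092"]
  topic: QC-TEST
  # Assumed example value: number of load-balanced output workers
  worker: 2
```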

Kafka Broker Topic Partition Configuration:

key: No key is set by default. A formatted key string can be configured to derive the key from the event.

partition.hash: The default partition strategy is 'hash', which uses the configured key value. If no key is set, published events are distributed randomly.

reachable_only: Default value is false. If reachable_only is enabled, events are published only to reachable Kafka brokers.

hash: [] Default value is an empty list. Configures alternative event field names used to compute the hash value. If empty, the output.kafka.key setting is used.

version: The Kafka broker version, set so that Filebeat can check compatibility with it.
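The key and partition settings described above might be combined roughly as follows. This is a sketch under assumptions: fields.log_source is a hypothetical custom field used only to illustrate a formatted key, and the host and topic are the sample values from this post.

```yaml
output.kafka:
  hosts: ["kafkaserver:9092"]
  topic: QC-TEST
  # Hypothetical key: fields.log_source is an assumed custom field, not a built-in one
  key: '%{[fields.log_source]}'
  partition.hash:
    # Default is false; set to true to publish only to reachable brokers
    reachable_only: false
    # Empty list: the key setting above is used to compute the hash
    hash: []
```

With a key configured, events that share the same key value hash to the same partition, which helps when the order of events from one source matters.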

Meta Data Configuration: Metadata is required for publishing events to brokers, so that Filebeat can make decisions based on the status of the brokers.

metadata:

retry.max: Default value is 3. The maximum number of retries when refreshing metadata to select an available broker.

retry.backoff: Default value is 250ms. The time to wait before making the next retry.

refresh_frequency: The interval at which metadata is refreshed. Default is every 10 minutes.
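Written out as configuration, the metadata settings above could look like the sketch below. The values shown are simply the defaults listed in this section, so the block changes nothing until you edit them; the host and topic are the sample values from this post.

```yaml
output.kafka:
  hosts: ["kafkaserver:9092"]
  topic: QC-TEST
  metadata:
    # Maximum number of metadata retries (default)
    retry.max: 3
    # Wait time between retries (default)
    retry.backoff: 250ms
    # How often metadata is refreshed (default)
    refresh_frequency: 10m
```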

max_retries: Default value is 3. If set to a value less than 0, Filebeat retries continuously until all events are published.

bulk_max_size: Default value is 2048. The maximum number of events batched into a single Kafka request.
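A minimal sketch of the retry and batch settings, using the default values described above. The host and topic are the sample values from this post; whether a larger or smaller batch size suits your setup depends on your event volume, so treat the numbers as a starting point only.

```yaml
output.kafka:
  hosts: ["kafkaserver:9092"]
  topic: QC-TEST
  # Retries after a publishing failure; a value below 0 retries until published
  max_retries: 3
  # Maximum number of events sent in one Kafka request
  bulk_max_size: 2048
```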

Kafka Reliability Setting:

# ACK reliability level required from the broker. Default value is 1. Possible values:
# 0 = no response; messages can be lost if an error occurs
# 1 = wait for local commit
# -1 = wait for all replicas to commit
required_acks: 1

timeout: Default value is 30 seconds. Filebeat times out if it does not receive a response from the Kafka broker within the specified time.

broker_timeout: Default value is 10 seconds. The maximum duration the broker waits for the required number of acknowledgements.

channel_buffer_size: Default value is 256. The number of messages buffered per Kafka broker.

keep_alive: Default value is 0 seconds, which disables keep-alive. If set, active network connections are kept alive for that period.

max_message_bytes: Default value is 1000000 bytes. If a JSON-encoded event is larger than the configured maximum, the event is dropped.

flush_interval: The interval to wait for new events between two consecutive publish requests to Kafka.

client_id: Default value is beats. Set a custom value to make the client easier to identify for analysis and auditing purposes.
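The reliability settings above can be sketched together as follows. Most values are the defaults described in this section; the client_id value filebeat-qc is a hypothetical name chosen for illustration, and the host and topic are the sample values from this post.

```yaml
output.kafka:
  hosts: ["kafkaserver:9092"]
  topic: QC-TEST
  # 0 = no response, 1 = wait for local commit, -1 = wait for all replicas
  required_acks: 1
  # Time to wait for a broker response before timing out
  timeout: 30s
  # Maximum time the broker waits for the required acknowledgements
  broker_timeout: 10s
  # Messages buffered per Kafka broker
  channel_buffer_size: 256
  # 0 disables keep-alive on network connections
  keep_alive: 0
  # Events larger than this many bytes are dropped
  max_message_bytes: 1000000
  # Hypothetical client name, useful when auditing broker logs
  client_id: filebeat-qc
```

Setting required_acks to -1 trades some throughput for stronger delivery guarantees, while 0 is fastest but can lose messages silently.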

Sample configuration file

Sample filebeat.yml file for Kafka Output Configuration

Integration

Complete Integration Example Filebeat, Kafka, Logstash, Elasticsearch and Kibana

Read More

To read more on Filebeat topics, sample configuration files and integration with other systems, with examples, follow the links Filebeat Tutorial and Filebeat Issues. To know more about YAML, follow the link YAML Tutorials.

Leave your feedback to help improve this topic and make it more helpful for others.