Note: you still have to download lombok.jar and run it as a JAR file if you wish to program in Eclipse; the plugin makes that part easier. Follow this link: Eclipse Lombok Configuration.
@Component is a class level annotation. During the component scan, Spring Framework automatically detects classes annotated with @Component.
@Component
class CarUtility {
// ...
}
Note: By default, the bean instance of a @Component class has the same name as the class, with a lowercase initial. In addition, we can specify a different name using the optional value argument of this annotation.
Since @Repository, @Service, @Configuration, and @Controller are all meta-annotated with @Component, they share the same bean-naming behavior, and Spring automatically picks them up during the component scanning process.
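For example, the following (hypothetical) class would be registered under the bean name "carHelper" instead of the default "carUtility":
@Component("carHelper")
class CarUtility {
// ...
}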
@Repository
The @Repository annotation is used to mark DAO or repository classes that form the database access layer of an application.
@Repository
class VehicleRepository {
// ...
}
The main advantage of using the @Repository annotation is that it enables automatic persistence exception translation. When using a persistence framework such as Hibernate, native exceptions thrown within classes annotated with @Repository are automatically translated into subclasses of Spring’s DataAccessException.
To enable exception translation, we need to declare our own PersistenceExceptionTranslationPostProcessor bean:
@Bean
public PersistenceExceptionTranslationPostProcessor exceptionTranslation() {
return new PersistenceExceptionTranslationPostProcessor();
}
@Service
The @Service annotation is used for service-layer classes, since the business logic of an application usually resides within the service layer.
@Service
public class VehicleService {
// ...
}
@Controller
@Controller is a class level annotation which tells the Spring Framework that this class serves as a controller in Spring MVC.
@Controller
public class VehicleController {
// ...
}
@Configuration
Configuration classes can contain bean definition methods annotated with @Bean:
@Configuration
class VehicleFactoryConfig {
@Bean
Engine engine() {
return new Engine();
}
}
Stereotype Annotations with AOP
Spring stereotype annotations make it easy to create a pointcut that targets all classes carrying a particular stereotype.
For example, suppose we want to measure the execution time and performance of methods from the DAO layer. We’ll create the following aspect (using AspectJ annotations), taking advantage of the @Repository stereotype:
@Aspect
@Component
public class PerformanceAspect {
@Pointcut("within(@org.springframework.stereotype.Repository *)")
public void daoClassMethods() {};
@Around("daoClassMethods()")
public Object measureMethodExecutionTime(ProceedingJoinPoint joinPoint)
throws Throwable {
long start = System.nanoTime();
Object returnValue = joinPoint.proceed();
long end = System.nanoTime();
String methodName = joinPoint.getSignature().getName();
System.out.println(
"Execution of " + methodName + " took " +
TimeUnit.NANOSECONDS.toMillis(end - start) + " ms");
return returnValue;
}
}
In this example, we created a pointcut that matches all methods in classes annotated with @Repository. We then used the @Around advice to target that pointcut and determine the execution time of the intercepted method calls. The same approach can be applied to the other layers of the application, as sketched below.
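For instance, a pointcut for the service layer would only differ in the stereotype it references (a sketch, not part of the original aspect):
@Pointcut("within(@org.springframework.stereotype.Service *)")
public void serviceClassMethods() {}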
Logging refers to recording your application’s activity, which helps in analyzing the runtime behavior of the application, especially when it encounters unexpected scenarios or errors, or when tracking the steps executed by a request. The more thorough the logging, the easier it is to analyze issues and bugs in the code.
Nowadays more companies are moving to the cloud and focusing on log monitoring and log analysis. There are several tools for centralized log management, such as Logstash, Loggly, Graylog, etc.
There are many Java logging frameworks and tools, such as Log4j, Log4j2, SLF4J, tinylog, Logback, etc. Here we mainly focus on Apache Log4j2: its severity levels, configuration file options, and Java logging.
Java provides a standard Logging API that works as a wrapper over different logging frameworks. Compatible frameworks can be loaded into the JVM and accessed via the API. There is also a default logging framework implementation, provided by the Sun JVM, which is accessed by the API. Many developers confuse this implementation with the Java Logging API.
Logging is broken into three major parts:
Logger : The Logger is responsible for capturing the message to be logged along with certain metadata and passing it to the logging framework. These messages can be an object, debug text or exceptions with an optional severity level.
Formatter: After receiving the message, the formatter formats it for output.
Appender: The formatted message output goes to an appender for disposition. Appenders can write to the console, a database, a log file, email, etc.
Severity Level :
A logging framework always maintains the currently configured logging level for each logger. The configured severity level can be made more or less restrictive.
For example: each log message is logged at a certain level. Suppose the logging level is set to WARNING; then all messages of that level or higher (WARNING, ERROR and FATAL) are logged.
Below is the list of all severity levels from highest to lowest. If a lower severity level is configured, all severity levels above it are included by default.
FATAL: Severe errors that cause premature termination. Expect these to be immediately visible on a status console.
ERROR: Other runtime errors or unexpected conditions. Expect these to be immediately visible on a status console.
WARNING: Messages about conditions that may cause issues in the future.
INFO: Interesting runtime events (startup/shutdown). Expect these to be immediately visible on a console, so be conservative and keep to a minimum.
DEBUG: detailed information on the flow through the system. Expect these to be written to logs only.
TRACE: more detailed information. Expect these to be written to logs only.
Why Severity Level ?
Using the correct severity level when logging objects, messages or errors makes it easier to track and debug issues, and to analyze the behavior and failure cases of the application when doing centralized logging.
Formatters or renderers
A formatter is an object that takes a log line, object or exception from a logger and converts it into a formatted string representation. Below is the technique for defining your own customized log format.
TTCC (Time Thread Category Component) is the message format pattern representation used by Log4j2.
For example: %r [%t] %-5p %c %x - %m%n will print a log line as below.
567 [main] INFO org.apache.log4j.examples.FacingIssuesOnIT- Exiting main method.
Where
%r Used to output the number of milliseconds elapsed from the construction of the layout until the creation of the logging event.
%t Used to output the name of the thread that generated the logging event.
%p Used to output the priority of the logging event.
%c Used to output the category of the logging event.
%x Used to output the NDC (nested diagnostic context) associated with the thread that generated the logging event.
%X{key} Used to output the MDC (mapped diagnostic context) associated with the thread that generated the logging event for specified key.
%m Used to output the application supplied message associated with the logging event.
%n Used to output the platform-specific newline character or characters.
Appenders or handlers
An appender takes messages at or above a specified minimum severity level and posts them to the appropriate message disposition. Log4j2 supports the appenders below.
ConsoleAppender
FileAppender
JDBCAppender
AsyncAppender
CassandraAppender
FailoverAppender
FlumeAppender
JMS Appender
JPAAppender
HttpAppender
KafkaAppender
MemoryMappedFileAppender
NoSQLAppender
OutputStreamAppender
RandomAccessFileAppender
RewriteAppender
RollingFileAppender
RollingRandomAccessFileAppender
RoutingAppender
SMTPAppender
ScriptAppenderSelector
SocketAppender
SyslogAppender
ZeroMQ/JeroMQ Appender
Log4j2 Configuration Support:
Log4j2 configuration can be accomplished in any of the following four ways:
Through a configuration file written in XML, JSON, YAML, or properties format.
Programmatically, by creating a ConfigurationFactory and Configuration implementation.
Programmatically, by calling the APIs exposed in the Configuration interface to add components to the default configuration.
Programmatically, by calling methods on the internal Logger class.
Log4j2 Automatic Configuration:
Log4j2 has the ability to automatically configure itself during initialization. When Log4j starts, it locates all the ConfigurationFactory plugins and arranges them in weighted order from highest to lowest. As delivered, Log4j contains four ConfigurationFactory implementations: one for JSON, one for YAML, one for properties, and one for XML.
Log4j will inspect the “log4j.configurationFile” system property and, if set, will attempt to load the configuration using the ConfigurationFactory that matches the file extension.
If no system property is set the properties ConfigurationFactory will look for log4j2-test.properties in the classpath.
If no such file is found the YAML ConfigurationFactory will look for log4j2-test.yaml or log4j2-test.yml in the classpath.
If no such file is found the JSON ConfigurationFactory will look for log4j2-test.json or log4j2-test.jsn in the classpath.
If no such file is found the XML ConfigurationFactory will look for log4j2-test.xml in the class path.
If a test file cannot be located the properties ConfigurationFactory will look for log4j2.properties on the classpath.
If a properties file cannot be located the YAML ConfigurationFactory will look for log4j2.yaml or log4j2.yml on the classpath.
If a YAML file cannot be located the JSON ConfigurationFactory will look for log4j2.json or log4j2.jsn on the classpath.
If a JSON file cannot be located the XML ConfigurationFactory will try to locate log4j2.xml on the classpath.
If no configuration file could be located the DefaultConfiguration will be used. This will cause logging output to go to the console.
Here we mainly focus on Log4j2 XML configuration for the ConsoleAppender, FileAppender and RollingFileAppender, and we will see how to apply filters for loggers at the default, package and root levels in different scenarios. We will also see how the same Java program’s logging behaves under different configurations.
Steps to configure Log4j2 with any Java application:
The log4j2.xml configuration defines two properties for the log file name and location (target/FacingIssueOnIT.log and C:/logs/), a console appender (STDOUT) for console logging, and a file appender for file logging, both using the pattern %d %p %c{1.} [%t] %m%n. By default, all log messages of level "trace" or higher are logged; messages of severity level "error" or higher are sent to the "file" appender, while the "STDOUT" console appender is configured for level "debug".
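The original XML is not reproduced in full here; a minimal log4j2.xml consistent with that description (the names and patterns come from the text, everything else is an illustrative sketch) could look like this:
<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="WARN">
  <Properties>
    <!-- Log path property (used by the rolling file configuration later in this post) -->
    <Property name="log-path">C:/logs/</Property>
  </Properties>
  <Appenders>
    <!-- Console logging -->
    <Console name="STDOUT" target="SYSTEM_OUT">
      <PatternLayout pattern="%d %p %c{1.} [%t] %m%n"/>
    </Console>
    <!-- File logging -->
    <File name="file" fileName="target/FacingIssueOnIT.log">
      <PatternLayout pattern="%d %p %c{1.} [%t] %m%n"/>
    </File>
  </Appenders>
  <Loggers>
    <!-- By default all messages of level "trace" or higher are logged; the "file"
         appender receives "error" and higher, STDOUT receives "debug" and higher. -->
    <Root level="trace">
      <AppenderRef ref="file" level="error"/>
      <AppenderRef ref="STDOUT" level="debug"/>
    </Root>
  </Loggers>
</Configuration>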
The console and file output below differ because of the logging configuration for STDOUT and the file appender. Notice that STDOUT is configured for severity level debug, which is why the console prints all log lines of level debug and above but not trace. Likewise, the file output at target/FacingIssueOnIT.log contains only FATAL and ERROR logs, because the file appender is configured for severity level error.
The above was a basic configuration and design for implementing Log4j2 logging so that it is easy to understand. Now we will go into more configuration detail to understand how to roll and archive logs, and how to manage them easily by date and log file size, by implementing a RollingFileAppender. We will also see how to apply a logger filter at the package level so that you can easily maintain logs for a specific module or functionality.
Now we make some changes in the configuration file, as well as in the Java program, to test the RollingFileAppender.
log4j2.xml configuration
The updated log4j2.xml keeps the log file name and location properties (target/FacingIssueOnIT.log and C:/logs/) and the console and file appenders with the pattern %d %p %c{1.} [%t] %m%n, and adds a RollingFile appender with the pattern %d{yyyyMMdd HH:mm:ss.SSS} [%t] %-5level %logger{36} - %msg%n. A package-level logger is declared so that classes whose package name starts with com.logging log messages of level "debug" or higher. By default, all log messages of level "trace" or higher are logged; messages of level "error" or higher go to the "file" appender, the "STDOUT" console appender is configured for level "debug", and the rolling file appender is attached to the root logger for all levels.
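Again, the original file is not reproduced in full; the sketch below is consistent with the description above and with the attribute values discussed next (treat it as illustrative rather than the exact original configuration):
<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="WARN">
  <Properties>
    <Property name="log-path">C:/logs/</Property>
  </Properties>
  <Appenders>
    <Console name="STDOUT" target="SYSTEM_OUT">
      <PatternLayout pattern="%d %p %c{1.} [%t] %m%n"/>
    </Console>
    <File name="file" fileName="target/FacingIssueOnIT.log">
      <PatternLayout pattern="%d %p %c{1.} [%t] %m%n"/>
    </File>
    <!-- Rolling file -->
    <RollingFile name="RollingFile"
                 fileName="${log-path}/FacingIssueOnIT.log"
                 filePattern="${log-path}/$${date:yyyy-MM-dd}/myexample-%d{yyyy-MM-dd}-%i.log.gz">
      <PatternLayout pattern="%d{yyyyMMdd HH:mm:ss.SSS} [%t] %-5level %logger{36} - %msg%n"/>
      <Policies>
        <!-- roll when the date in filePattern changes (interval="1") or the file reaches 100 MB -->
        <TimeBasedTriggeringPolicy interval="1"/>
        <SizeBasedTriggeringPolicy size="100MB"/>
      </Policies>
      <DefaultRolloverStrategy>
        <!-- archive delete policy: keep only the last one hour of rolled files -->
        <Delete basePath="${log-path}" maxDepth="2">
          <IfFileName glob="*/myexample-*.log.gz"/>
          <IfLastModified age="1h"/>
        </Delete>
      </DefaultRolloverStrategy>
    </RollingFile>
  </Appenders>
  <Loggers>
    <!-- package-level logger: com.logging classes log "debug" and higher -->
    <Logger name="com.logging" level="debug" additivity="false">
      <AppenderRef ref="STDOUT"/>
      <AppenderRef ref="RollingFile"/>
    </Logger>
    <Root level="trace">
      <AppenderRef ref="file" level="error"/>
      <AppenderRef ref="STDOUT" level="debug"/>
      <AppenderRef ref="RollingFile"/>
    </Root>
  </Loggers>
</Configuration>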
The log4j2.xml configuration above contains additional changes for the RollingFile appender. Let me explain them in more detail:
%d{yyyyMMdd HH:mm:ss.SSS} [%t] %-5level %logger{36} - %msg%n : this pattern defines how log lines are formatted in the log file.
fileName="${log-path}/FacingIssueOnIT.log" : current logs are written to this file.
filePattern="${log-path}/$${date:yyyy-MM-dd}/myexample-%d{yyyy-MM-dd}-%i.log.gz" : as configured, the triggering policies roll the file based on the time interval (interval="1", relative to the date pattern in the filePattern) or when the current file size reaches 100 MB (size="100MB"); the rolled file is created in a folder for the current date, as in the screen below.
Archive delete policy: specifies how long old logs are kept as backup; as configured here, only the last one hour is retained. Change it to days as your application requires, and adjust the path of the archived log files to match your log directory.
Here I have added the RollingFile appender to the root logger without any specified level, so that all log lines are logged. If you want to filter logs and behave differently for different packages, you can declare loggers with different severity levels, as I have done for the package com.logging.
Java Code:
Here I have added an infinite loop for testing the RollingFileAppender, so that logs are continuously added to the log file. Additionally, with a large application in mind, I added a condition that checks which severity level is configured for the logger; if the level is not enabled, the call is skipped, which saves the processing time the logger would otherwise spend on logging, formatting and appender checks. In this way we can improve application performance for logging.
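The original test program is not included here; the sketch below (class name and messages are hypothetical) shows the approach described above: an infinite loop plus isDebugEnabled()/isInfoEnabled() guards.
package com.logging;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

public class LoggingTest {

    private static final Logger logger = LogManager.getLogger(LoggingTest.class);

    public static void main(String[] args) throws InterruptedException {
        int i = 0;
        // Infinite loop so that log lines keep getting appended and the
        // rolling/archiving behaviour of the RollingFile appender can be observed.
        while (true) {
            // Check the configured level first so that message construction,
            // formatting and appender work is skipped when the level is disabled.
            if (logger.isDebugEnabled()) {
                logger.debug("Debug message number " + i);
            }
            if (logger.isInfoEnabled()) {
                logger.info("Info message number " + i);
            }
            logger.error("Error message number " + i);
            i++;
            Thread.sleep(100); // small pause between iterations
        }
    }
}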
Archive log files: the rolled and archived files are created as below in the directory C:\logs\2017-12-20.
Summary
In this tutorial, I have covered the importance of logging, ways of centralizing logging, and Log4j2 configuration for console, file and rolling file appenders. I also explained rolling and archive management of logs, and gave an idea of how to improve your application’s logging performance with a minor change.
Filebeat, Kafka, Logstash, Elasticsearch and Kibana integration is used by big organizations where applications are deployed in production on hundreds or thousands of servers, scattered across different locations, and analysis needs to be done on data from these servers in real time.
This integration helps mostly with log-level analysis, tracking issues, detecting anomalies in data, and alerting on events of particular occurrences where accountability matters.
Using these technologies provides a scalable architecture that enhances systems while keeping them decoupled from each other. In particular, this stack:
Provides a window to view Elasticsearch data in the form of different charts and dashboards.
Provides ways to search and operate on data easily with respect to time intervals.
Can easily be embedded into any web application through embedded dashboards.
How Data flow works ?
In this integration, Filebeat is installed on all servers where your application is deployed; it reads the latest log changes from these servers and ships them to the Kafka topic configured for the application.
Logstash subscribes to the log lines from the Kafka topic and parses these lines, making relevant changes, formatting, and excluding and including fields, then sends this processed data to Elasticsearch indexes as a centralized location for data from the different servers.
Kibana is linked to the Elasticsearch indexes, which helps with analysis through searches, charts and dashboards.
Design Architecture
In the architecture configured below, my application is deployed on three servers, and each server has a current log file named App1.log. Our goal is to read real-time data from these servers and analyze it.
Steps to Installation, Configuration and Start
Here we first install Kafka and Elasticsearch, which run individually; the rest of the tools are installed and run in sequence to test the data flow. Initially, install everything on the same machine and test with sample data using the steps below; at the end of this post I describe what changes are needed to match your servers.
Kafka Installation, Configuration and Start
Elasticsearch Installation,Configuration and Start
Filebeat Installation,Configuration and Start
Logstash Installation,Configuration and Start
Kibana Installation,Start and display.
Pre-Requisite
The Filebeat, Logstash, Elasticsearch and Kibana versions should be compatible with each other; it is better to use the latest versions from https://www.elastic.co/downloads.
Java 8+
Linux Server
Filebeat 5.XX
Kafka 2.11.XX
Logstash 5.XX
Elasticsearch 5.XX
Kibana 5.XX
Note: Make sure JDK 8 is installed and the JAVA_HOME environment variable points to the JDK 8 home directory on every machine where you want to install Elasticsearch, Logstash, Kibana and Kafka.
Windows: My Computer -> right click -> Properties -> Advanced System Settings -> System Variables
Set JAVA_HOME
Linux: Go to your home directory (or the appropriate profile file, using sudo if needed) and add a line as below.
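For example (the JDK path is only illustrative; adjust it to your installation):
export JAVA_HOME=/opt/jdk1.8.0_131
export PATH=$JAVA_HOME/bin:$PATH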
For testing we will use the sample log lines below, which include debug lines as well as a stack trace; the grok parsing in this example is designed for them. For real-time testing with actual data, you can point to your server log files, but you will have to modify the grok pattern in the Logstash configuration accordingly.
2013-02-28 09:57:56,662 WARN CreateSomethingActivationKey - WhateverException for User 49-123-345678 {{rid,US8cFAp5eZgAABwUItEAAAAI_dev01_443}{realsid,60A9772A136B9912B6FF0C3627A47090.dev1-a}}
2013-02-28 09:57:56,663 INFO LMLogger - ERR1700 - u:null failures: 0 - Technical error {{rid,US8cFAp5eZgAABwUItEAAAAI_dev01_443}{realsid,60A9772A136B9912B6FF0C3627A47090.dev1-a}}
2013-02-28 09:57:56,668 ERROR SomeCallLogger - ESS10005 Cpc portalservices: Exception caught while writing log messege to MEA Call: {}
java.sql.SQLSyntaxErrorException: ORA-00942: table or view does not exist
at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:445)
at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:396)
2013-02-28 10:04:35,723 INFO EntryFilter - Fresh on request /portalservices/foobarwhatever {{rid,US8dogp5eZgAABwXPGEAAAAL_dev01_443}{realsid,56BA2AD41D9BB28AFCEEEFF927EE61C2.dev1-a}}
Create an App1.log file on the same machine where Filebeat needs to be installed, and copy the above log lines into the App1.log file.
Kafka Installation , Configuration and Start
Download the latest version of Kafka from the link below and use the command to untar and install it on a Linux server; on Windows, just unzip the downloaded file.
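For example (the archive version is only illustrative; use the one you downloaded):
tar -zxvf kafka_2.11-0.10.0.0.tgz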
To test that Kafka installed successfully, you can check the running Kafka process on Linux with “ps -ef | grep kafka”, or follow the consumer/producer steps to and from a topic in Setup Kafka Cluster for Single Server/Broker.
Elasticsearch Installation, Configuration and Start
Download the latest version of Elasticsearch from the link below and use the command to untar and install it on a Linux server; on Windows, just unzip the downloaded file.
Before starting Elasticsearch, we need to make some basic changes in the config/elasticsearch.yml file for the cluster and node names. You can configure these based on your application or organization name.
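For example (both names here are hypothetical placeholders):
cluster.name: facing-issue-in-it
node.name: test-node-1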
Download the latest version of Filebeat from the link below and use the command to untar and install it on a Linux server; on Windows, just unzip the downloaded file.
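A minimal filebeat.full.yml along the lines described in this post might look like the sketch below (the log path is just an example; the multiline pattern assumes log lines start with a date, as in the sample above):
filebeat.prospectors:
- input_type: log
  paths:
    - /opt/app/facingissuesonit/App1.log   # hypothetical path; point to your App1.log
  # treat lines that do not start with a timestamp as part of the previous event (stack traces)
  multiline.pattern: '^\d{4}-\d{2}-\d{2}'
  multiline.negate: true
  multiline.match: after

output.kafka:
  hosts: ["localhost:9092"]
  topic: 'APP-1-TOPIC'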
Now Filebeat is configured and ready to start with the command below; it will continuously read the configured prospector file App1.log and publish log-line events to Kafka. It will also create the topic APP-1-TOPIC in Kafka if it does not exist.
./filebeat -e -c filebeat.full.yml -d "publish"
On the console it will display output as below for the sample lines.
You can see from the Filebeat debug statements that publish event 3 contains multiline statements with a stack-trace exception, and each event carries fields like the following:
@timestamp: Timestamp of data shipped.
beat.hostname: the name of the machine from which Filebeat is shipping data.
beat.version: the version of Filebeat installed on the server, which helps with compatibility checks on the target end.
message: the log line from the log file, or multiple lines for a multiline event.
offset: the offset in the source file up to which data has been read.
source: the name of the file from which the logs were read.
Now it is time to check whether the data was published to the Kafka topic. Go to the Kafka data directory (log.dirs, /tmp/kafka-logs by default) and you will see two files per topic partition, such as xyz.index and xyz.log, which maintain the data offsets and the messages.
Before starting Logstash, we need to create a configuration file that takes input data from Kafka, parses it into the respective fields, and sends it to Elasticsearch. Create a file logstash-app1.conf in the Logstash bin directory with content along the lines below.
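The original configuration file is not reproduced here; the following is a minimal sketch consistent with the sample log lines and the index name shown later in this post (the grok pattern and field names are illustrative and should be adapted to your own logs):
input {
  kafka {
    bootstrap_servers => "localhost:9092"
    topics => ["APP-1-TOPIC"]
    codec => json     # Filebeat publishes events to Kafka as JSON
  }
}
filter {
  # Illustrative pattern for lines like "2013-02-28 09:57:56,662 WARN CreateSomethingActivationKey - ...";
  # multiline stack traces shipped as a single event keep the full text in "message".
  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:logtime}%{SPACE}%{LOGLEVEL:loglevel}%{SPACE}%{NOTSPACE:logger}%{SPACE}-%{SPACE}%{GREEDYDATA:logmessage}" }
  }
  # mutate {
  #   remove_field => ["input_type", "offset"]   # uncomment to drop unused fields before indexing
  # }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "app1-logs-%{+YYYY.MM.dd}"
  }
  stdout { codec => rubydebug }   # also print parsed events on the Logstash console
}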
To test your configuration file you can use below command.
./logstash -t -f logstash-app1.conf
If the above command returns OK, run the command below to start reading and parsing data from the Kafka topic.
./logstash -f logstash-app1.conf
To design your own grok pattern for your log-line format, you can follow the link below; it will help you build the pattern incrementally and also provides some sample grok patterns for logs.
The Logstash console will show the parsed data as below, and you can remove unused fields before storing them in Elasticsearch by uncommenting the mutate section in the configuration file.
To verify on the Elasticsearch end that your data was sent successfully, open the URL http://localhost:9200/_cat/indices in your browser; it will display the created index with the current date.
yellow open app1-logs-2017.05.28 Qjs6XWiFQw2zsiVs9Ks6sw 5 1 4 0 47.3kb 47.3kb
Kibana Installation, Configuration and Start
Download the latest version of Kibana from the link below and use the command to untar and install it on a Linux server; on Windows, just unzip the downloaded file.
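Before starting Kibana, point it at Elasticsearch in config/kibana.yml (localhost is an assumption for a single-machine setup; see the notes at the end of this post for multi-machine changes):
elasticsearch.url: "http://localhost:9200"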
Now we are ready with the Kibana configuration, and it is time to start Kibana. We can use the command below to run Kibana in the background.
screen -d -m bin/kibana
Kibana takes some time to start, and we can test it by using the URL below in a browser.
http://localhost:5601/
To check this data in Kibana, open the above URL in a browser, go to the Management tab in the left-side menu -> Index Patterns -> click Add New.
Enter the index name or pattern and the time field name as in the screen below, and click the Create button.
Index Pattern Settings
Now go to the Discover tab and select the index pattern app1-log*; it will display the data as below.
Now make the changes below according to your application specifics.
Filebeat :
Update the prospector path to point to the current log file in your log directory.
Move Kafka to a different machine, because Kafka will be the single location that receives shipped data from the different servers. Replace localhost with the IP of the Kafka server in the hosts property of the Kafka output section of the filebeat.full.yml file.
Copy the same Filebeat setup to all servers where your application is deployed and logs need to be read.
Start all Filebeat instances on each server.
Elasticsearch :
Uncomment the network.host property in the elasticsearch.yml file so Elasticsearch can be accessed by IP address.
Logstash:
Update localhost in the input section of the logstash-app1.conf file with the Kafka machine’s IP.
Update localhost in the Elasticsearch output section with its IP if Elasticsearch is moved to a different machine.
Kibana:
Update localhost in the kibana.yml file for the elasticsearch.url property with the Elasticsearch IP if Kibana is on a different machine.
Conclusion
This tutorial covered the following points:
Installation of Filebeat, Kafka, Logstash, Elasticsearch and Kibana.
Filebeat is configured to ship logs to the Kafka message broker.
Logstash is configured to read log lines from the Kafka topic, parse them, and ship them to Elasticsearch.
Kibana presents this Elasticsearch data to users in the form of charts and dashboards for analysis.
Read More
To read more on Filebeat, Kafka and Elasticsearch configuration, follow the links; for Logstash Configuration, Input Plugins, Filter Plugins, Output Plugins, Logstash Customization and related issues, follow the Logstash Tutorial and Logstash Issues links.
Unzip the file apache-maven-X.X.X-bin.zip and rename the unzipped folder to maven. It will have a directory structure like the one below.
Maven Directory Structure
Copy this maven folder to C:\Program Files (x86) directory.
Add the environment variables below by right-clicking My Computer -> Properties -> Environment Variables / Advanced System Settings.
User Environment Variable
M2_HOME : C:\Program Files (x86)\maven
M2: %M2_HOME%\bin
PATH : %M2%;
If PATH already has a value, append %M2%; to the end of the existing value.
It will show as in the screenshot below.
Close all console/Command Prompt instances if any are running, open a new Command Prompt, and run the command below; it will print the Maven and Java versions.
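The usual check (assuming Maven’s bin directory is on the PATH) is:
mvn -version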
Download the latest version of Maven and untar it with the command below.
tar -zxvf apache-maven-3.5.0-bin.tar.gz
The extracted directory will be named apache-maven-3.5.0; set the Maven environment variables as below for configuration.
$ export M2_HOME=/opt/app/facingissuesonit/apache-maven-3.5.0
$ export M2=$M2_HOME/bin
Set the following for memory configuration:
$ export MAVEN_OPTS="-Xms256m -Xmx512m"
with the M2_HOME path corresponding to the location of your extracted Maven files.
Now append the M2 environment variable to the system path as below:
$ export PATH=$M2:$PATH
To check that Maven installed successfully, run the command below.
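The same version check as on Windows works here (assuming the PATH was updated as above):
$ mvn -version
It should print the Maven version along with the Java version and home.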
Elasticsearch 5 provides a low-level Java client API to communicate with an Elasticsearch cluster over HTTP. Because it uses REST and JSON, it can communicate with any version of Elasticsearch and also works across firewalls.
The Elasticsearch REST client sends HTTP requests internally by using the Apache HTTP Async Client.
If the Elasticsearch cluster version changes, no change is needed on the client side and communication is not impacted.
Initial Release Version : 5.0.0-alpha4
Elasticsearch REST Features
Keep the connection persistent.
Logs requests and responses through the Elasticsearch REST API.
Load balancing across all available nodes.
Failover in case a node fails, and upon specific response codes.
Sniffing support for discovery of cluster nodes.
Minimal dependencies.
Pre-Requisite
The minimum Java version required by the Elasticsearch REST client is 1.7.
Elasticsearch REST Java API Configuration
Maven Configuration
Add the dependency below to your pom.xml file to import the library.
<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>rest</artifactId>
    <version>5.1.2</version>
</dependency>
Gradle Configuration
Add the dependency below to your build.gradle file to import the library.
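The equivalent Gradle dependency (same coordinates as the Maven entry above) would be:
dependencies {
    compile 'org.elasticsearch.client:rest:5.1.2'
}
A minimal usage sketch of the low-level REST client follows; the host and port are assumptions for a local single-node cluster.
import org.apache.http.HttpHost;
import org.apache.http.util.EntityUtils;
import org.elasticsearch.client.Response;
import org.elasticsearch.client.RestClient;

public class RestClientExample {
    public static void main(String[] args) throws Exception {
        // Build a client pointing at a local Elasticsearch node
        RestClient restClient = RestClient.builder(new HttpHost("localhost", 9200, "http")).build();
        // Simple GET / request to verify connectivity
        Response response = restClient.performRequest("GET", "/");
        System.out.println(EntityUtils.toString(response.getEntity()));
        restClient.close();
    }
}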
Kafka provides server-level properties for configuration of the broker, socket, Zookeeper, buffering, retention, etc.
broker.id: a unique integer ID for this broker within the Kafka cluster.
broker.id=0
Socket Server Settings :
listeners: the address the socket server listens on; the default value is PLAINTEXT://:9092. If the host is not configured, it is taken from java.net.InetAddress.getCanonicalHostName().
Format: security_protocol://host_name:port
listeners=PLAINTEXT://:9092
advertised.listeners: the listener value the broker advertises to producers and consumers; if this is not set, the value of listeners is used.
num.io.threads: the number of threads handling disk I/O.
num.io.threads=8
socket.send.buffer.bytes: the send buffer size used by the socket server.
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes: the receive buffer size used by the socket server.
socket.receive.buffer.bytes=102400
socket.request.max.bytes: the maximum size of a request that the socket server will accept.
socket.request.max.bytes=104857600
Log Basics
log.dirs: a comma separated list of directories under which to store log files.
log.dirs=/tmp/kafka-logs
num.partitions: The default number of logs per topic. More partitions allow greater parallelism for consumption, but this will also result in more files across the brokers.
num.partitions=1
num.recovery.threads.per.data.dir: The number of threads per data directory to be used for log recovery at startup and flushing at shutdown. This value is recommended to be increased for installations with data dirs located in a RAID array.
num.recovery.threads.per.data.dir=1
Log Flush Policy
log.flush.interval.messages: Messages are immediately written to the file system, but by default we only fsync() to sync the OS cache lazily. The following configurations control the flush of data to disk. There are a few important trade-offs here:
Durability: Unflushed data may be lost if you are not using replication.
Latency: Very large flush intervals may lead to latency spikes when the flush does occur as there will be a lot of data to flush.
Throughput: The flush is generally the most expensive operation, and a small flush interval may lead to excessive seeks.
The settings below allow one to configure the flush policy to flush data after a period of time or every N messages (or both). This can be done globally and overridden on a per-topic basis.
The number of messages to accept before forcing a flush of data to disk.
log.flush.interval.messages=10000
The maximum amount of time a message can sit in a log before we force a flush.
log.flush.interval.ms=1000
Log Retention Policy
The following configurations control the disposal of log segments. The policy can be set to delete segments after a period of time, or after a given size has accumulated. A segment will be deleted whenever either of these criteria are met. Deletion always happens from the end of the log.
log.retention.hours: The minimum age of a log file to be eligible for deletion.
log.retention.hours=168
log.retention.bytes: A size-based retention policy for logs. Segments are pruned from the log as long as the remaining segments don’t drop below log.retention.bytes.
log.retention.bytes=1073741824
log.segment.bytes: The maximum size of a log segment file. When this size is reached a new log segment will be created.
log.segment.bytes=1073741824
log.retention.check.interval.ms: The interval at which log segments are checked to see if they can be deleted according to the retention policies
log.retention.check.interval.ms=300000
Zookeeper
zookeeper.connect: The Zookeeper connection string is a comma-separated list of host:port pairs, each corresponding to a ZooKeeper server, e.g. “127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002”. We can also append an optional chroot string to the URLs to specify the root directory for all Kafka znodes.
zookeeper.connect=localhost:2181
zookeeper.connection.timeout.ms: Timeout in ms for connecting to zookeeper.
Logstash can take input from Kafka to parse data, and can send parsed output to Kafka for streaming to other applications.
Kafka Input Configuration in Logstash
Below is the basic configuration for Logstash to consume messages from Kafka. For more information about the Logstash Kafka input configuration, refer to the Elasticsearch site link.
bootstrap_servers: the default value is “localhost:9092”. It takes a list of server connections in the form host1:port1,host2:port2 to establish the initial connection to the cluster; if one server is down it will connect to another.
topics: the list of topics to subscribe to, from which messages will be consumed.
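A minimal input section using these two settings (values are illustrative) might look like:
input {
  kafka {
    bootstrap_servers => "localhost:9092"   # default shown; list as host1:port1,host2:port2
    topics => ["TOPIC-1", "TOPIC-2"]        # topics to consume messages from
  }
}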
Kafka Output Configuration in Logstash
Below is the basic configuration for Logstash to publish messages to Kafka. For more information about the Logstash Kafka output configuration, refer to the Elasticsearch site link.
bootstrap_servers: the default value is “localhost:9092”. It takes a list of server connections in the form host1:port1,host2:port2, and the producer only uses it to obtain metadata (topics, partitions and replicas). The socket connections for sending the actual data are established based on the broker information returned in the metadata.
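A minimal output section (topic name is a placeholder) might look like:
output {
  kafka {
    bootstrap_servers => "localhost:9092"   # used only to fetch cluster metadata
    topic_id => "TOPIC-NAME"                # topic to publish events to
  }
}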
Kafka can receive messages published by Filebeat based on the Kafka output configuration in the filebeat.yml file.
Filebeat Kafka Output Configuration
filebeat.yml requires the fields below to connect and publish messages to Kafka for the configured topic. Kafka will create topics dynamically as Filebeat requires.
output.kafka:
#The list of Kafka broker addresses from where to fetch the cluster metadata.
#The cluster metadata contain the actual Kafka brokers events are published to.
hosts: ["localhost:9092"]
# The Kafka topic used for produced events. The setting can be a format string
topic: Topic-Name
# Authentication details. Password is required if username is set.
#username: ''
#password: ''
For more information about Filebeat Kafka output configuration options, refer to the links below.
To set up a Kafka cluster with multiple brokers/servers on a single machine, follow the steps below:
In the example below we will create a Kafka cluster with three brokers on a single machine. All steps are the same as for a Kafka cluster with a single server on one machine; additionally, we create two more configuration files for the additional brokers and run them on the same cluster.
Download and Installation
Download the latest version of Kafka from the download link, copy it to the installation directory and run the command below to install it.
tar -zxvf kafka_2.11-0.10.0.0.tgz
Configuration Changes for Zookeeper and Server
Make below changes in zookeeper.properties configuration file in config directory.
config/zookeeper.properties
clientPort=2181
clientPort is the port where clients connect. By default the port is 2181; if the port is changed in zookeeper.properties, it must also be updated in server.properties below.
Make below changes in server.properties configuration file in config directory.
By default, the server.properties file has the above fields with default values.
broker.id: represents the unique broker ID by which Zookeeper recognizes brokers in the Kafka cluster. If the Kafka cluster has multiple servers, the broker IDs are assigned in incremental order across the servers.
listeners: each broker runs on a different port; the default port for a broker is 9092 and it can be changed.
log.dirs: the path where Kafka stores its stream records (logs). By default it points to /tmp/kafka-logs.
To create three brokers, make two more copies of the server.properties configuration file as server1.properties and server2.properties and make the changes below in those files, so that the configuration is ready with three brokers on the Kafka cluster.
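The changed lines would look roughly like this; the ports and log directories are typical example values, not taken from the original post:
# config/server1.properties
broker.id=1
listeners=PLAINTEXT://:9093
log.dirs=/tmp/kafka-logs-1

# config/server2.properties
broker.id=2
listeners=PLAINTEXT://:9094
log.dirs=/tmp/kafka-logs-2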
The commands below will return the process IDs of the Zookeeper and server processes:
ps aux | grep zookeeper.properties
ps aux | grep server.properties
ps aux | grep server1.properties
ps aux | grep server2.properties
Now Kafka is ready to create topics and to publish and subscribe messages.
Create a Topic and Check Status
Create a topic with a user-defined name, passing the replication factor and the number of partitions for the topic, as shown below. For more information about how partitions are stored in a Kafka cluster environment, follow the link for Kafka Introduction and Architecture.
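A typical create command for a topic named multi-test with replication factor 3 and one partition (matching the description below), followed by the describe command whose output is discussed next:
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 3 --partitions 1 --topic multi-test
bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic multi-test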
In the response of the describe command above, the first line gives a summary of all the partitions, and each additional line provides information about one partition. We have only one line for this topic because there is one partition.
“leader” is the broker responsible for all reads and writes for the given partition. Each broker will be the leader for a randomly selected portion of the partitions.
“replicas” is the list of brokers that replicate the log for this partition regardless of whether they are the leader or even if they are currently alive.
“isr” is the set of “in-sync” replicas. This is the subset of the replicas list that is currently alive and caught-up to the leader.
In the above example, broker 2 is the leader for the one partition of the topic, and replicas of this partition are stored on brokers 0 and 1. Any message published to the topic is stored on broker 2 first, and then on brokers 0 and 1 in sequence.
All requests for this topic are handled by broker 2, and if that broker is busy or fails for some reason, such as a shutdown, then broker 0 becomes the leader. See the example below: I stopped broker 2, ran the command again, and the leader is now shown as 0.
To test the topic, push your messages to it by running the command below.
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic multi-test
Input Messages:
Hi Dear
How r u doing?
Where are u these days?
After being published to the topic, these messages are retained according to the log retention configured for the server, whether or not they have been read by a consumer. For information about the retention policy configuration, follow the Kafka Server Properties Configuration link.
Subscribe Messages by Consumer from Topic
Run the command below to get all published messages from the multi-test topic. It will return all of these messages from the beginning.
bin/kafka-console-consumer.sh --zookeeper localhost:2181 --from-beginning --topic multi-test
Output Messages:
Hi Dear
How r u doing?
Where are u these days?
The examples below show a Kafka logs producer and consumer using the Kafka Java API, where the producer sends logs from a file to Topic1 on the Kafka server and the consumer subscribes to the same logs from Topic1. A Kafka consumer can also subscribe to logs from multiple topics.
Pre-Requisite:
The Kafka client works with Java 7+ versions.
Add the Kafka client libraries from the installation directory to your application classpath.
Kafka Logs Producer
The producer example below will create a new topic Topic1 on the Kafka server if it does not exist, and push all the messages from the Test.txt file below into the topic.
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;
public class KafkaLogsProducer {
public static void main(String[] args) throws Exception{
//Topic Name where logs message events need to publish
String topicName = "Topic1";
// create instance for properties to access producer configs
Properties props = new Properties();
//Kafka server host and port
props.put("bootstrap.servers", "kafkahost:9092");
//Will receive acknowledgement of requests
props.put("acks", "all");
//Batch size for events
props.put("batch.size", 16384);
//Total buffer memory available to the producer
props.put("buffer.memory", 33553333);
//Delay (in ms) to allow batching of requests
props.put("linger.ms", 1);
//Number of retries if a request fails (0 = no retry)
props.put("retries", 0);
props.put("key.serializer",
"org.apache.kafka.common.serialization.StringSerializer");
props.put("value.serializer",
"org.apache.kafka.common.serialization.StringSerializer");
//Thread.currentThread().setContextClassLoader(null);
Producer<String, String> producer = new KafkaProducer<String, String>(props);
File in = new File("C:\\Users\\Saurabh\\Desktop\\Test.txt");
try (BufferedReader br = new BufferedReader(new FileReader(in))) {
String line;
while ((line = br.readLine()) != null) {
producer.send(new ProducerRecord<String, String>(topicName,
"message", line));
}
}
System.out.println("All Messages sent successfully");
producer.close();
}
}
Input File from Directory
C:\Users\Saurabh\Desktop\Test.txt
Hi
This is kafka Producer Test.
Now will check for Response.
Kafka Logs Consumer
The Kafka consumer below will read from Topic1 and display the output to the console along with the offset value. A consumer can read messages from multiple topics at the same time.
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
public class KafkaLogsConsumer {
public static void main(String[] args) {
//Topics from where message need to consume
List<String> topicsList = new ArrayList<String>();
topicsList.add("Topic1");
//topicsList.add("Topic2");
Properties props = new Properties();
props.put("bootstrap.servers", "kafkahost:9092");
props.put("group.id", "test");
props.put("enable.auto.commit", "true");
props.put("auto.commit.interval.ms", "1000");
props.put("session.timeout.ms", "30000");
props.put("key.deserializer",
"org.apache.kafka.common.serialization.StringDeserializer");
props.put("value.deserializer",
"org.apache.kafka.common.serialization.StringDeserializer");
KafkaConsumer<String, String> consumer = new KafkaConsumer<String, String>(props);
//Kafka consumer subscribe to all these topics
consumer.subscribe(topicsList);
System.out.println("Subscribed to topic " + topicsList.get(0));
while (true) {
//Below poll setting will poll to kafka server in every 100 milliseconds
//and get logs mssage from there
ConsumerRecords<String, String> records = consumer.poll(100);
for (ConsumerRecord<String, String> record : records)
{
//Print offset value of Kafka partition where logs message store and value for it
System.out.println(record.offset()+"-"+record.value());
}
}
}
}
Kafka Consumer Output
1-Hi
2-This is kafka Producer Test.
3-Now will check for Response.