In the previous parts of this series, we've discovered what Elasticsearch is and how it can help us analyze our logs.
In the previous part, we downloaded and installed the necessary software. Now the time has come to configure the components, starting with: Filebeat.
Filebeat is a component that checks your log files for new lines. When it finds new lines, Filebeat will send these lines to the next component, which is Logstash, in our case.
Filebeat can be configured with a single file: filebeat.yml. You can find this file in my example Git repo, in the filebeat folder.
An example:
filebeat:
  registry_file: /usr/share/filebeat/data/.registry
  inputs:
    - type: log
      enabled: true
      paths:
        - /app/logs/*/*.log
        - /opt/jboss/wildfly/standalone/log/*.log
      multiline.pattern: "^([0-9]{4}-[0-9]{2}-[0-9]{2})"
      multiline.negate: true
      multiline.match: after
      fields:
        application: UnionVMS
        environment: production

output:
  logstash:
    hosts: ["logstash:5044"]
The registry file keeps a record of which log lines have already been sent to Logstash, so that Filebeat can determine which log lines are new. We'll get back to this later in this blog post.
You can define the location and file name of the registry file with the registry_file property.
An input is a definition of which files to keep track of, and what to do with them.
In the “paths” section, you need to define a list of your log files. You can use * as a wildcard.
Example:
...
inputs:
  ...
  paths:
    - /app/logs/*/*.log
    - /opt/jboss/wildfly/standalone/log/*.log
  ...
Filebeat will send every line separately to Logstash.
2018-04-13 19:14:19 INFO e.e.e.f.u.e.s.bean.PluginServiceBean - ConfigModule updated parameter table with settings of plugins
2018-04-13 19:14:19 INFO e.e.e.f.u.e.s.bean.ExchangeServiceBean - Upsert settings in service layer: eu.europa.ec.fisheries.uvms.plugins.inmarsat SettingListType[setting=[SettingType[key=eu.europa.ec.fisheries.uvms.plugins.inmarsat.USERNAME,value=USERNAME]]] UVMS
2018-04-13 19:14:19 INFO e.e.e.f.u.e.b.ServiceRegistryModelBean - Update plugin settings for eu.europa.ec.fisheries.uvms.plugins.inmarsat
2018-04-13 19:14:19 INFO e.e.e.f.u.e.m.c.bean.MessageConsumerBean - ExchangeBaseRequest received in Exchange Message MDB:null
But... what happens in the case of an exception?
2018-04-13 19:11:01 ERROR eu.europa.ec.mare.usm.jwt.jndiUtil - Could not bind value to key USM/secretKey
java.lang.RuntimeException: javax.naming.NamingException: Failed to bind [secret] at location [service jboss.naming.context.java.USM.secretKey] [Root exception is org.jboss.msc.service.DuplicateServiceException: Service jboss.naming.context.java.USM.secretKey is already registered]
at eu.europa.ec.mare.usm.jwt.jndiUtil.bindParam(jndiUtil.java:108)
at eu.europa.ec.mare.usm.jwt.jndiUtil.createJNDI(jndiUtil.java:23)
at eu.europa.ec.mare.usm.jwt.DefaultJwtTokenHandler.initKey(DefaultJwtTokenHandler.java:264)
at eu.europa.ec.mare.usm.jwt.DefaultJwtTokenHandler.init(DefaultJwtTokenHandler.java:82)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
So, in the front-end, a single line like "at eu.europa.ec.mare.usm.jwt.DefaultJwtTokenHandler.initKey(DefaultJwtTokenHandler.java:264)" will pop up, without any other context. What? What does that mean?
Long story short: we want to see the complete stack trace in the front-end, not just a part of it. In other words, this whole block should end up as one log entry:
2018-04-13 19:11:01 ERROR eu.europa.ec.mare.usm.jwt.jndiUtil - Could not bind value to key USM/secretKey
java.lang.RuntimeException: javax.naming.NamingException: Failed to bind [secret] at location [service jboss.naming.context.java.USM.secretKey] [Root exception is org.jboss.msc.service.DuplicateServiceException: Service jboss.naming.context.java.USM.secretKey is already registered]
at eu.europa.ec.mare.usm.jwt.jndiUtil.bindParam(jndiUtil.java:108)
at eu.europa.ec.mare.usm.jwt.jndiUtil.createJNDI(jndiUtil.java:23)
at eu.europa.ec.mare.usm.jwt.DefaultJwtTokenHandler.initKey(DefaultJwtTokenHandler.java:264)
at eu.europa.ec.mare.usm.jwt.DefaultJwtTokenHandler.init(DefaultJwtTokenHandler.java:82)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
How can we explain to Filebeat that it should treat a stack trace as a single log line? What can Filebeat use to find the end of a log entry?
We can do that with a regular expression.
Every log line starts with a date. If a line does not start with a date, Filebeat should consider it part of the previous log line:
inputs:
  ...
  multiline.pattern: "^([0-9]{4}-[0-9]{2}-[0-9]{2})"
  multiline.negate: true
  multiline.match: after
  ...
Wait, what do “multiline.negate” and “multiline.match” do? A very good explanation, with examples, can be found in this guide.
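To make it a bit more concrete, here is a rough sketch of how these settings work together in our case (the guide above remains the authoritative reference): negate: true means the multiline rule applies to lines that do not match the date pattern, and match: after means those lines are appended to the matching line before them.
2018-04-13 19:11:01 ERROR eu.europa.ec.mare.usm.jwt.jndiUtil - ...     <- matches the date pattern: starts a new event
java.lang.RuntimeException: javax.naming.NamingException: ...          <- no match: appended to the event above
    at eu.europa.ec.mare.usm.jwt.jndiUtil.bindParam(jndiUtil.java:108) <- no match: appended as well
2018-04-13 19:14:19 INFO e.e.e.f.u.e.s.bean.PluginServiceBean - ...    <- matches again: starts the next event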
Let’s say you've pumped the logs of multiple environments (test, acceptance and production) into Elasticsearch.
A customer sends you an email: “I’ve had a problem yesterday”. Ok, let’s see what the logs say. You open Kibana (the front-end of Elasticsearch; we’ll get to that part later) and filter on the logs of yesterday. You get the logs of all environments, all mixed up. Oops.
We need to be able to filter on the environment name. Each environment has its own log directory. Based on the log directory, we can say: these are the logs for application x in environment y.
In order to make this distinction, we can add an extra field for each input:
inputs:
  - type: log
    paths:
      - /logs/unionvms/production/*
    fields:
      application: UnionVMS
      environment: production
  - type: log
    paths:
      - /logs/unionvms/acceptance/*
    fields:
      application: UnionVMS
      environment: acceptance
So there we have it. In the front-end, we can now filter on “fields.environment”!
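For example, a filter in Kibana's search bar could look roughly like this (assuming KQL syntax; the exact field names depend on how Logstash forwards them):
fields.environment : "production" and fields.application : "UnionVMS"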
In the output section, you define where Filebeat should send the log lines.
Since I use Docker, the app runs in a Docker network and the host name is equal to the container name.
Not using Docker? Fill in the name of the actual host that Logstash runs on. Are Filebeat and Logstash on the same machine? Use localhost.
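As a sketch, the output section could then look like this (logstash.example.com is just a placeholder for your own Logstash host; 5044 is the port used in the config above):
output:
  logstash:
    # Filebeat and Logstash on the same machine:
    hosts: ["localhost:5044"]
    # Logstash on a separate server (placeholder host name):
    # hosts: ["logstash.example.com:5044"]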
Filebeat remembers which lines were sent to Logstash. This information is saved in the registry file.
If a file changes, Filebeat will detect this and rescan all log lines in that file. Only new log lines will be sent to Logstash.
Where this file is stored depends on the environment Filebeat runs in. The good news: you can define the location and name of the registry file yourself:
filebeat:
  registry_file: /usr/share/filebeat/data/.registry
In the Docker example, I’ve configured Filebeat to save the registry file in a folder that is mapped by a Docker volume, so that it doesn’t get lost when the container restarts.
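A minimal sketch of what that could look like in a docker-compose file (the service name, image tag and volume name are illustrative, not the exact setup from the repo):
version: "3"
services:
  filebeat:
    image: docker.elastic.co/beats/filebeat:6.2.4
    volumes:
      # The named volume keeps /usr/share/filebeat/data (and thus the .registry file)
      # around when the container is recreated.
      - filebeat-registry:/usr/share/filebeat/data

volumes:
  filebeat-registry: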
When a log file reaches a certain size, or when 24 hours have passed, most logging frameworks start a new log file. The old log file gets renamed, often by appending a timestamp to the name. This is called log rotation.
For example:
logs.txt (current log file)
logs_2019-02-27_01.txt (first 10 MB of the logs on 27 February 2019)
logs_2019-02-27_02.txt (next bunch of logs on 27 February 2019)
If your file gets rotated, the file gets a new name but its contents don’t change. Filebeat will detect that the file was already harvested, so logs don’t get sent twice.
You can find more info here.
We've configured Filebeat to send all the log lines, including stack traces and extra fields, to Logstash. The next step is to configure Logstash to interpret the log lines, in part 4 of this blog series.