log data parsing

In log analysis, parsing is the key factor that determines how easy the log data is to analyze. The more accurately the fields in your log messages are parsed, the easier it is to query them in Kibana and create visualizations.

As part of our service, we provide automatic parsing for the most common log types, making the process of integrating with Logz.io much easier. Sometimes, however, your logs do not comply with the standard structure. They may include additional fields or contain differently formatted fields. Application logs will most definitely be unique in format and structure.

That’s where our new Data Parsing wizard comes in: it lets you define your own custom parsing for your logs using an easy-to-use, dedicated wizard.

The wizard is accessed from the Log Shipping tab in Logz.io (Log Shipping → Data Parsing; you need to create a free account to access the page). It includes four steps: choosing a log data source, defining the parsing (selecting sample logs and configuring a parsing method), customizing specific field settings, and validating the results.

choose log data source

The first step lets you select a source of data to use for configuring the new parsing. Currently, you can select any of the log types that you are shipping to Logz.io as the data source. In the future, you will be able to upload additional log samples as well.

A small caveat here is that only logs with no defined parsing can be selected. For example, since Apache access logs already have a Logz.io-defined parsing method, this log type cannot be selected.

In the next step, you configure precisely how you want your log messages to be broken up and parsed.

define log parsing

After entering a pattern name, click Select. This opens a dialog in which you can select up to five sample logs, out of the last five hundred that have been shipped to Logz.io, to use for building the parsing. You can use the filter option to locate specific logs and add them as samples.

sample log data

You will now be required to select and configure your parsing method.

Please note that, as of now, only Grok is supported. Delimiter, JSON, and Key Value parsing will be introduced soon.

With Grok selected as the parsing method, all you need to do is enter your grok pattern. Granted, grokking is not straightforward. We recommend using both the Grok Debugger and this list of grok patterns as a reference.

As soon as you begin grokking, your log lines will begin to be parsed into separate fields in the Parse Results table below.

log parsing results table

The colors help you to match the fields in the log lines with your parsing results. Make sure you see the name you chose for the fields in the log.
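To make the idea of breaking a log line into named fields more concrete, here is a minimal Python sketch of what a grok pattern does. It is not how the wizard is implemented: the `GROK_PATTERNS` table holds simplified stand-ins for a handful of common grok patterns, and the sample log line and field names (`client_ip`, `method`, and so on) are made up for illustration.

```python
import re

# Simplified stand-ins for a few common grok patterns. The definitions in
# the real grok pattern library are more thorough than these.
GROK_PATTERNS = {
    "IP": r"\d{1,3}(?:\.\d{1,3}){3}",
    "WORD": r"\w+",
    "NUMBER": r"-?\d+(?:\.\d+)?",
    "NOTSPACE": r"\S+",
    "GREEDYDATA": r".*",
}


def grok_to_regex(pattern):
    """Translate %{SYNTAX:name} tokens into named regex groups."""
    def expand(match):
        syntax, name = match.group(1), match.group(2)
        body = GROK_PATTERNS[syntax]
        # Named tokens become fields; unnamed tokens are matched but not kept.
        return f"(?P<{name}>{body})" if name else f"(?:{body})"

    return re.compile(re.sub(r"%\{(\w+)(?::(\w+))?\}", expand, pattern))


# Hypothetical application log line and the pattern written for it.
line = "203.0.113.7 GET /api/orders 200 0.231"
pattern = ("%{IP:client_ip} %{WORD:method} %{NOTSPACE:path} "
           "%{NUMBER:status} %{NUMBER:duration}")

fields = grok_to_regex(pattern).match(line).groupdict()
for name, value in fields.items():
    print(f"{name}: {value}")
```

Each `%{SYNTAX:name}` token contributes one row to the result, which is roughly what the Parse Results table shows as you type.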

Here are some additional pointers for grokking your logs using Data Parsing:

  • To omit data, simply do not name the fields.
  • Using “message” as the field name overrides the existing message field in the log. You can use it as many times as you want: the values for this field will be concatenated into a single comma-separated message field. If you do not use this field name in the grok pattern, the default message field will be used.
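The two pointers above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not the wizard's actual code: the pattern table is a simplified stand-in for real grok patterns, the sample line is invented, and the exact separator (a bare comma) and the use of the raw line as the default message are assumptions.

```python
import re

# Simplified stand-ins for a few grok patterns (assumption, not the real library).
GROK_PATTERNS = {
    "WORD": r"\w+",
    "NUMBER": r"-?\d+(?:\.\d+)?",
    "GREEDYDATA": r".*",
}
TOKEN = re.compile(r"%\{(\w+)(?::(\w+))?\}")


def parse(pattern, line):
    names = []  # field name of each capturing group, in order

    def expand(match):
        syntax, name = match.group(1), match.group(2)
        body = GROK_PATTERNS[syntax]
        if name is None:
            return f"(?:{body})"  # unnamed token: matched, then omitted
        names.append(name)
        return f"({body})"  # positional group, so a name may repeat

    match = re.compile(TOKEN.sub(expand, pattern)).match(line)
    if match is None:
        return None

    collected = {}
    for name, value in zip(names, match.groups()):
        collected.setdefault(name, []).append(value)
    # Repeated names (such as "message") are joined with commas.
    result = {name: ",".join(values) for name, values in collected.items()}
    # Assumption: without a "message" token, the raw line is the message.
    result.setdefault("message", line)
    return result


line = "WARN 1287 payment retry scheduled"
overridden = parse(
    "%{WORD:level} %{NUMBER} %{WORD:message} %{WORD} %{WORD:message}", line)
default = parse("%{WORD:level} %{NUMBER:pid} %{GREEDYDATA}", line)
print(overridden)
print(default)
```

In the first call the unnamed `%{NUMBER}` and `%{WORD}` tokens are dropped and the two `message` captures are joined. In the second call no token is named `message`, so the default message is kept.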

After entering your grok pattern, you can define a field type for each field that you parse.

We recommend leaving the default setting (“Automatic”), but you have the option to define other types such as boolean, date, IP, and byte. For geo-enrichment, for example, you will need to select the “Geo-Enrichment” field type.
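As a rough illustration of what choosing a field type means, the following Python sketch casts parsed string values into typed values. It is a hedged example only: the `CASTERS` table, the accepted boolean spellings, the ISO 8601 date format, and the field names are all assumptions made for the sketch, and fields without an explicit type are simply left as strings rather than being detected automatically.

```python
import ipaddress
from datetime import datetime


def to_boolean(value):
    lowered = value.strip().lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    raise ValueError(f"not a boolean: {value!r}")


# Hypothetical mapping from a chosen field type to a conversion function.
CASTERS = {
    "boolean": to_boolean,
    "date": datetime.fromisoformat,  # assumes ISO 8601 timestamps
    "ip": ipaddress.ip_address,
}


def apply_types(fields, types):
    """Cast each field that has an explicit type; leave the rest as strings."""
    typed = {}
    for name, value in fields.items():
        caster = CASTERS.get(types.get(name))
        typed[name] = caster(value) if caster else value
    return typed


fields = {
    "client_ip": "203.0.113.7",
    "cached": "true",
    "time": "2016-05-12T09:30:00",
    "path": "/api/orders",
}
typed = apply_types(
    fields, {"client_ip": "ip", "cached": "boolean", "time": "date"})
print(typed)
```

A value that does not fit its declared type raises an error here, which is one reason to be careful when overriding the default.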

Moving on, in the Enrich step, you can apply some advanced parsing customizations.

advanced log parsing customizations

If you chose to parse one of the IP fields as a Geo IP field in the previous step, you will now be able to decide which geo fields to enrich it with.

Under “Set Timestamp,” you will be able to configure all the timestamp fields that appear within your logs. For example, you can set which timestamp is the leading timestamp (a leading timestamp determines the sequential order in which the logs are displayed).
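The effect of the leading timestamp can be shown with a small Python sketch. The log records and the field names `event_time` and `received_at` are hypothetical. The point is only that the same logs are displayed in a different order depending on which timestamp leads.

```python
from datetime import datetime

# Hypothetical logs carrying two timestamps each: when the event happened
# and when the log line was received.
logs = [
    {"event_time": "2016-05-12T09:29:58", "received_at": "2016-05-12T09:30:01", "msg": "job finished"},
    {"event_time": "2016-05-12T09:29:45", "received_at": "2016-05-12T09:30:03", "msg": "job queued"},
    {"event_time": "2016-05-12T09:29:50", "received_at": "2016-05-12T09:30:05", "msg": "job started"},
]


def order_by(logs, leading_field):
    """Sort logs by whichever timestamp field is chosen as the leading one."""
    return sorted(logs, key=lambda log: datetime.fromisoformat(log[leading_field]))


by_event = [log["msg"] for log in order_by(logs, "event_time")]
by_receipt = [log["msg"] for log in order_by(logs, "received_at")]
print(by_event)    # ['job queued', 'job started', 'job finished']
print(by_receipt)  # ['job finished', 'job queued', 'job started']
```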

It’s now time to make sure that you are happy with the parsing results. In the fourth and final step, you will be able to see the parsing applied to both the sample logs you selected to work with at the beginning of the process and to the last five hundred lines of logs that belong to that specific log type.

validate log type
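The same kind of check can be approximated offline before you commit to a pattern. The Python sketch below runs one pattern over a batch of lines and separates those that parse from those that do not. The regular expression is a simplified, hand-expanded equivalent of a hypothetical grok pattern, and the log lines are invented.

```python
import re

# Simplified regex equivalent of a hypothetical grok pattern such as
# "%{WORD:level} %{NUMBER:pid} %{GREEDYDATA:message}".
PATTERN = re.compile(r"(?P<level>\w+) (?P<pid>\d+) (?P<message>.*)")


def validate(lines):
    """Split lines into parsed field dictionaries and lines that failed."""
    parsed, failed = [], []
    for line in lines:
        match = PATTERN.fullmatch(line)
        if match:
            parsed.append(match.groupdict())
        else:
            failed.append(line)
    return parsed, failed


recent_lines = [
    "INFO 1287 payment accepted",
    "WARN 1287 payment retry scheduled",
    "stack trace continues here",
]
parsed, failed = validate(recent_lines)
print(f"{len(parsed)} parsed, {len(failed)} failed")
for line in failed:
    print("unparsed:", line)
```

Lines that fail, such as continuation lines from a stack trace, are a hint that the pattern or the sample selection needs another pass.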

Data Parsing is currently in Beta mode, and we are counting on our community to give us feedback so that we can improve it before announcing its general availability. If you do have any feedback, let us know at: info (at) logz.io.

Some points that are worth highlighting:

  • Data Parsing is currently in Beta mode, and is only available upon request for Pro and Enterprise customers.
  • As mentioned above, the Data Parsing feature can only be used on log types that do not have pre-defined parsing. If you would like to apply changes to those logs anyway, contact support (at) logz.io.
  • Again, Grok is the only supported parsing method. Delimiter, JSON, and Key Value parsing will be introduced soon.

For an overview of this new feature, my colleague Daniel Berman has created a video as well:

Enjoy!
