apache log analyzer

It’s no secret that Apache is the most popular web server in use today. Netcraft put Apache usage at 47.8% as of February 2015, and according to a W3Techs report, Apache is used by 52% of all the websites it monitors (with NGINX trailing behind at 30%).

Why is Apache so popular? It’s free and open source, and open source software continues to gain ground on proprietary software. It’s maintained by a community of dedicated developers, it provides security, and it is well suited for small and large websites alike. It can also be set up easily on all major operating systems, and it is extremely powerful and flexible. Does that sound about right?

The big “but” here is that this popularity does not necessarily reflect the challenges facing organizations that run business-critical apps on Apache, one of these being log analytics. Gaining insight into Apache access and error logs is crucial for analyzing crashes, load times, and other data on app performance. But in production environments in which a huge number of requests hit the web server every second, extracting actionable data from thousands of log files by hand is virtually impossible.

This tutorial will show you one easy way to do just that by describing how to use the Logz.io ELK Stack as an Apache log analyzer. Of course, this tutorial can also be used with any on-premises installation of the ELK Stack. The guide will take you through the steps of using our service in a vanilla Linux environment (Ubuntu 14.04): setting up your environment, shipping logs, and then creating visualizations in Kibana.

How to analyze Apache logs using Logstash

One of the first, and most common, steps is to enhance the Apache logs. In technical terms, this means configuring Logstash filters to parse the logs so that they are easier to understand and analyze in Kibana.

Here is an example of an Apache log line and the Logstash configuration that we at Logz.io use to parse such logs in our own environment.

A sample Apache access log entry:

The Logstash configuration to parse that Apache access log entry:

A sample Apache error log:

The Logstash configuration to parse that Apache error log:

These are two of the configurations that we are currently using ourselves — of course, there are more fields that can be added to the Apache log files and then can be parsed and analyzed accordingly.
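To make the parsing step more concrete, here is a minimal Python sketch (not the Logstash configuration itself) that splits one access log line into named fields, which is roughly what a grok filter does. It assumes Apache's standard "combined" log format. The sample line is made up for illustration, and the field names are our own choice, loosely modeled on the names commonly produced by Logstash's COMBINEDAPACHELOG grok pattern.

```python
import re
from datetime import datetime

# Assumed input: Apache "combined" format, i.e.
# host ident user [time] "request" status bytes "referer" "user-agent"
COMBINED = re.compile(
    r'(?P<clientip>\S+) (?P<ident>\S+) (?P<auth>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<verb>\S+) (?P<request>\S+)(?: HTTP/(?P<httpversion>[\d.]+))?" '
    r'(?P<response>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_access_line(line):
    """Return a dict of typed fields, or None if the line does not match."""
    match = COMBINED.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the fields we want to aggregate on into proper types.
    fields["response"] = int(fields["response"])
    fields["bytes"] = 0 if fields["bytes"] == "-" else int(fields["bytes"])
    fields["timestamp"] = datetime.strptime(
        fields["timestamp"], "%d/%b/%Y:%H:%M:%S %z"
    )
    return fields

# Illustrative log line (hypothetical data, not taken from a real server).
sample = (
    '203.0.113.7 - - [12/Feb/2015:10:15:32 +0000] '
    '"GET /index.html HTTP/1.1" 200 5316 '
    '"http://example.com/" "Mozilla/5.0 (X11; Linux x86_64)"'
)

record = parse_access_line(sample)
print(record["clientip"], record["response"], record["request"])
```

Once each line is a structured record like this, fields such as the response code or client IP can be filtered and aggregated on, which is what makes the Kibana visualizations later in this guide possible.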

Apache Log Analysis Use Cases

The following use cases exemplify the benefits of using ELK with Apache logs.

Use Case #1: Operational Analysis

One of the most common use cases for analyzing Apache logs with ELK is operational. Using ELK, DevOps teams can get notifications on events such as traffic or error rates that are significantly higher than usual. Needless to say, these issues can have a significant impact on business. For example, page response times can slow to undesirable levels and create a poor user experience.

Using ELK to analyze Apache error logs, users can detect whether there is a significant decrease in the number of users accessing the servers or whether an unprecedented peak in traffic overloaded the server and caused it to crash. Monitoring Apache logs in a single Kibana dashboard can also help identify DDoS attacks and allows DevOps teams to quickly find the source IP address and subsequently block it.
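As a rough illustration of the kind of aggregation behind such alerts, the Python sketch below computes a server error rate and the busiest client IPs. It assumes the log lines have already been parsed into dicts. The field names `clientip` and `response`, the `summarize` helper, and the sample records are all hypothetical.

```python
from collections import Counter

def summarize(records, top_n=3):
    """Return (error_rate, top_clients) for parsed access-log records.

    error_rate is the share of requests answered with a 5xx status;
    top_clients lists the (ip, request_count) pairs with the most requests.
    """
    records = list(records)
    if not records:
        return 0.0, []
    errors = sum(1 for r in records if r["response"] >= 500)
    clients = Counter(r["clientip"] for r in records)
    return errors / len(records), clients.most_common(top_n)

# Hypothetical, already-parsed records.
records = [
    {"clientip": "198.51.100.9", "response": 200},
    {"clientip": "198.51.100.9", "response": 503},
    {"clientip": "198.51.100.9", "response": 200},
    {"clientip": "203.0.113.7", "response": 200},
]

error_rate, top_clients = summarize(records)
print(f"5xx error rate: {error_rate:.0%}")
print("Busiest clients:", top_clients)
```

In practice you would compare these numbers against a baseline for the same time window. A single IP address responsible for an unusually large share of requests is a candidate for closer inspection or blocking.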

One of the most popular Kibana dashboards we have is the Apache Access dashboard, which gives a detailed view of Apache traffic, including the most visited URLs on the server, HTTP response codes, access by browser and country, and more.

This visualization and more can be found in our ELK Apps library by searching for Apache.

Use Case #2: Technical SEO

Creating quality content is now extremely important for SEO, but it is of little use if Google has not crawled, parsed, and indexed that content. Using ELK to analyze and monitor your Apache logs can tell you when Google last crawled your site, allowing you to verify that Googlebot is crawling it regularly.

In addition, tracking Apache access logs with ELK allows you to check whether you have reached your limit of Google crawls, see how Google crawlers are prioritizing your web pages, and identify which specific URLs get the most and least attention (read more about using server log analysis for technical SEO here).
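The crawl questions above boil down to filtering on the user agent field. Here is a minimal Python sketch, assuming records that have already been parsed into dicts with `agent`, `request`, and `timestamp` fields. Those names, the `googlebot_report` helper, and the sample data are hypothetical. Note that user agent strings can be spoofed, so this only identifies requests that claim to be Googlebot.

```python
from collections import Counter
from datetime import datetime, timezone

def googlebot_report(records):
    """Return (hits_per_url, last_crawl) for requests whose agent mentions Googlebot."""
    hits = [r for r in records if "Googlebot" in r["agent"]]
    per_url = Counter(r["request"] for r in hits)
    # default=None covers the case where Googlebot never showed up.
    last_crawl = max((r["timestamp"] for r in hits), default=None)
    return per_url, last_crawl

GOOGLEBOT = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

# Hypothetical, already-parsed records.
records = [
    {"agent": GOOGLEBOT, "request": "/blog/",
     "timestamp": datetime(2015, 2, 10, 8, 0, tzinfo=timezone.utc)},
    {"agent": GOOGLEBOT, "request": "/blog/",
     "timestamp": datetime(2015, 2, 12, 9, 30, tzinfo=timezone.utc)},
    {"agent": GOOGLEBOT, "request": "/pricing/",
     "timestamp": datetime(2015, 2, 11, 7, 15, tzinfo=timezone.utc)},
    {"agent": "Mozilla/5.0 (X11; Linux x86_64)", "request": "/blog/",
     "timestamp": datetime(2015, 2, 13, 12, 0, tzinfo=timezone.utc)},
]

per_url, last_crawl = googlebot_report(records)
print("Crawls per URL:", per_url.most_common())
print("Last crawl:", last_crawl)
```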

Use Case #3: Business Intelligence

Apache access logs contain all the information you need to run a deep analysis of your application users, from their geographic location to the pages they visit to the experience they are receiving. The benefit of using ELK to monitor Apache logs is that you can also correlate them with infrastructure-level logs. This gives you a better understanding of your audience’s experience as it is affected by the underlying infrastructure.

For example, you can analyze response times and correlate them with the CPU and memory load on the machines themselves. This, in turn, allows you to explore whether stronger machines can provide a better UX.
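To sketch what such a correlation might look like outside of Kibana, the Python example below computes a Pearson correlation coefficient between CPU load and average response time. It assumes you already have one pair of values per time bucket. The numbers are invented for illustration, and the `pearson` helper is our own. Keep in mind that the default Apache log formats do not record response time, so this assumes a directive such as `%D` (time taken to serve the request, in microseconds) has been added to the log format.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (spread_x * spread_y)

# Hypothetical per-minute averages: CPU load (%) and response time (ms).
cpu_percent = [22, 35, 61, 88]
response_ms = [120, 140, 210, 330]

r = pearson(cpu_percent, response_ms)
print(f"Correlation between CPU load and response time: {r:.2f}")
```

A coefficient close to 1 suggests that response times rise with CPU load, which would support testing stronger machines. It does not prove causation on its own.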

Visualizing Apache Logs

One of the biggest advantages of using the ELK Stack as an Apache web log analyzer is the ability to visualize analyses and downloads as well as identify correlations.

Kibana allows you to create detailed visualizations and dashboards that can help you keep tabs on your web server and identify anomalies within the data. Configuring and building these visualizations is not always easy, but the end result is extremely valuable.

Examples abound. They range from the simplest visualizations, showing a breakdown of requests per country, through heat maps showing users and response codes, to complex line charts displaying response time per response code, broken down into sub-groups per agent and client.

To help you hit the ground running, we provide a free library of pre-made searches, visualizations and dashboards for Apache — ELK Apps. The library includes 11 visualizations for Apache server log analysis, including a complete monitoring dashboard.

Summary

Operational analysis, business intelligence, and technical SEO are just three examples of why Apache users need to monitor logs. There are many more use cases, such as log-driven development and application monitoring. In fact, not a week goes by without us learning about a new way the ELK Stack is being used by one of our customers.

We’d love to hear in the comments below how you are using ELK or other tools to analyze Apache log files!

Logz.io is a predictive, cloud-based log management platform that is built on top of the open-source ELK Stack and can be used for web log analysis, application monitoring, business intelligence, and more.

Daniel Berman is Product Evangelist at Logz.io. He is passionate about log analytics, big data, cloud, and family and loves running, Liverpool FC, and writing about disruptive tech stuff.