Fluentd is an open source data collector developed by Treasure Data that acts as a unifying logging layer between input sources and output services. Fluentd is easy to install and has a light footprint along with a fully pluggable architecture.
In the world of the ELK Stack, Fluentd acts as a log collector—aggregating logs, parsing them, and forwarding them on to Elasticsearch. As such, Fluentd is often compared to Logstash, which has similar traits and functions (see a detailed comparison between the two here).
Both Logstash and Fluentd are supported by us at Logz.io, and we see quite a large number of customers using the latter to ship logs to us. This Fluentd tutorial describes how to establish the log shipping pipeline—from the source (Apache in this case), via Fluentd, to Logz.io.
Prerequisites
To complete the steps below, you’ll need the following:
- Ruby and ruby-dev 2.5 or higher pre-installed (support for Ruby 2.1, 2.2, and 2.3 was deprecated with Fluentd v1.9.0 on 2020-Jan-22). The Fluentd startup log should show the current version of Ruby.
- HTTPS traffic allowed to port 8071
- An installed cURL and Apache web server
- An active Logz.io account. If you don’t have one yet, create a free account here.
- 5 minutes of free time!
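If you want to verify the Ruby and cURL prerequisites from the command line, a small script along these lines can help. This is only a sketch: the function names (version_ge, have_tool, check_prereqs) are our own, and the version comparison assumes a `sort` that supports the `-V` (version sort) option, as GNU coreutils does. It does not check for Apache, whose binary name varies by distribution.

```shell
#!/bin/sh
# Sketch of a prerequisite check; helper names are illustrative only.

# version_ge A B: succeeds when dotted version A is >= version B.
# Assumes `sort -V` (version sort) is available.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# have_tool NAME: succeeds when NAME is found on the PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# check_prereqs: report missing tools and a Ruby older than 2.5.
check_prereqs() {
    for tool in ruby curl; do
        have_tool "$tool" || echo "missing: $tool"
    done
    if have_tool ruby; then
        ruby_version=$(ruby -e 'print RUBY_VERSION')
        version_ge "$ruby_version" "2.5" || echo "Ruby $ruby_version is older than 2.5"
    fi
}

check_prereqs
```

The script prints nothing when both tools are present and Ruby is recent enough.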
Step 1: Installing Fluentd
To install the stable release, called ‘td-agent’, use this cURL command (this command is for Ubuntu 12.04; if you’re using a different Linux distribution, click here):
$ curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-precise-td-agent2.sh | sh
The command will automatically install Fluentd and start the daemon. To make sure all is running as expected, run:
$ sudo /etc/init.d/td-agent status
If you’re on Mac OS X, use these instructions.
Note: If you want insight into creating a daemonset for yourself, look at our Fluentd tutorial for Kubernetes.
Step 2: Installing the Logz.io plugin
Our next step is to install the Logz.io plugin for Fluentd. To do this, we need to use the gem supplied with the td-agent:
$ /opt/td-agent/usr/sbin/td-agent-gem
To install the Logz.io plugin, run:
$ sudo /opt/td-agent/usr/sbin/td-agent-gem install fluent-plugin-logzio
Alternatively:
$ gem install fluentd fluent-plugin-logzio
or
$ sudo gem install fluentd fluent-plugin-logzio
Step 3: Configuring Fluentd
We now have to configure the input and output sources for Fluentd logs. In this tutorial, we’ll be using Apache as the input and Logz.io as the output.
Open the Fluentd configuration file:
$ sudo vi /etc/td-agent/td-agent.conf
Define Apache as the input source for Fluentd:
<source>
  @type tail
  format none
  path /var/log/apache2/access.log
  pos_file /tmp/access_log.pos
  tag apache
</source>
Note: Make sure you have full permissions to access Apache files. If you do not, Fluentd will fail to pull the logs and send them on to Logz.io.
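To check the permissions mentioned in the note, you could use a couple of helper functions like the ones sketched below. The function names are hypothetical, and the checks apply to whichever user runs them, so they are only meaningful when run as the same user as the Fluentd daemon (we assume, but have not confirmed for your installation, that this is a dedicated td-agent user).

```shell
#!/bin/sh
# Sketch of permission checks for the tail input; names are illustrative.

# check_log_access FILE: succeeds only if FILE exists and is readable
# by the user running this function.
check_log_access() {
    if [ ! -e "$1" ]; then
        echo "not found: $1"
        return 2
    fi
    if [ ! -r "$1" ]; then
        echo "no read permission: $1"
        return 1
    fi
    echo "readable: $1"
}

# check_pos_dir FILE: succeeds only if the directory that would hold
# the pos_file is writable by the user running this function.
check_pos_dir() {
    dir=$(dirname "$1")
    if [ -d "$dir" ] && [ -w "$dir" ]; then
        echo "writable: $dir"
    else
        echo "cannot write to: $dir"
        return 1
    fi
}
```

For the configuration above, you would call check_log_access /var/log/apache2/access.log and check_pos_dir /tmp/access_log.pos.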
Next, we’re going to define Logz.io as a “match” (the Fluentd term for an output destination):
<match **>
  @type logzio_buffered
  endpoint_url https://<<LISTENER-HOST>>:8071?token=<<SHIPPING-TOKEN>>&type=my_type
  output_include_time true
  output_include_tags true
  http_idle_timeout 10
  <buffer>
    @type memory
    flush_thread_count 4
    flush_interval 3s
    chunk_limit_size 16m
    queue_limit_length 4096
  </buffer>
</match>
Fine-tune this configuration as follows:
- Replace <<SHIPPING-TOKEN>> with the token of the account you want to ship to (it can be found in the Logz.io Settings section).
- Replace <<LISTENER-HOST>> with your region’s listener host (for example, listener.logz.io).
- Replace my_type in the type parameter with your log type (e.g. ‘apache-access’). This helps Logz.io to parse and grok your data. A complete list of known types is available here. If your type is not listed here, please let us know.
- The buffer aggregates logs together and ships them in bulk. The configuration above buffers in memory, so no buffer path is needed. If you use a file buffer instead, enter a path to a folder in your file system for which you have full permissions (e.g. /tmp/buffer).
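Before restarting, it can be worth checking that no placeholder was left behind in the configuration. The sketch below shows one way to assemble the endpoint_url value and to scan a file for leftover <<...>> markers. Both function names are our own, and YOUR-TOKEN is a stand-in, not a real token.

```shell
#!/bin/sh
# Sketch: build the endpoint_url value and find unreplaced placeholders.
# Function names and the sample values are illustrative only.

# build_endpoint_url HOST TOKEN TYPE: print the endpoint_url value.
build_endpoint_url() {
    printf 'https://%s:8071?token=%s&type=%s\n' "$1" "$2" "$3"
}

# find_placeholders FILE: list lines that still contain <<...>> markers.
# Exit status is 0 when at least one such line is found.
find_placeholders() {
    grep -n '<<[A-Z-]*>>' "$1"
}

# Example with a stand-in token:
build_endpoint_url listener.logz.io YOUR-TOKEN apache-access
```

Running find_placeholders /etc/td-agent/td-agent.conf should print nothing once every placeholder has been replaced.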
Last but not least, restart Fluentd:
$ sudo /etc/init.d/td-agent restart
That’s it. After a minute or two, your Apache logs will show up in the Logz.io user interface. To generate some log entries, run this ab (ApacheBench) command to simulate traffic (you’ll first need to place a file on your web server to request):
$ sudo ab -k -c 350 -n 1000 localhost/<file.html>
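In the ab command, -k enables HTTP keep-alive, -c sets the number of concurrent requests, and -n sets the total number of requests. To confirm that the simulated traffic actually reached the access log that Fluentd is tailing, you could compare line counts before and after the run. The helpers below are a minimal sketch with names of our own choosing.

```shell
#!/bin/sh
# Sketch: measure how many entries a test run added to a log file.
# Helper names are illustrative only.

# count_entries FILE: print the number of lines in FILE.
count_entries() {
    wc -l < "$1" | tr -d '[:space:]'
}

# new_entries BEFORE AFTER: print how many lines were added.
new_entries() {
    echo $(( $2 - $1 ))
}

# Typical use around the ab run:
#   before=$(count_entries /var/log/apache2/access.log)
#   ... run the ab command ...
#   after=$(count_entries /var/log/apache2/access.log)
#   new_entries "$before" "$after"
```

If the difference is zero, the requests never reached Apache, so there is nothing for Fluentd to ship.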
Additional Fluentd configurations
Some configurations are optional but might be worth your time depending on your needs.
You can rotate the Fluentd daemon logs, ensure there is a constant flow of Fluentd filter optimization logs, and turn off the default setting that suppresses Fluentd startup/shutdown log events.
For Apache, consider whether you want the parser to handle the escape sequences in Apache access logs.
Check out further Fluentd tutorials by Logz.io.