In a previous post, we described how to use Packetbeat to analyze networks by monitoring metrics on web, database, and other network protocols. Another member of Elastic’s “Beats” family is Topbeat — a shipper that monitors system data and processes.
Topbeat collects data on CPU usage, memory, process statistics, and other system-related metrics that, when shipped into the ELK Stack for indexing and analysis, can be used for real-time monitoring of your infrastructure.
In this post, we will describe how to monitor a basic infrastructure setup that consists of a single server (in this case, deployed on AWS) using Topbeat and the Logz.io ELK Stack. We will begin by configuring the pipeline from Topbeat into the ELK Stack and then show how to analyze and visualize the data.
Setting Up Topbeat
Our first step is to install and configure Topbeat (the full installation instructions are here):
$ curl -L -O
$ sudo dpkg -i topbeat_1.2.3_amd64.deb
Open the configuration file at /etc/topbeat/topbeat.yml:
$ sudo vim /etc/topbeat/topbeat.yml
The first section of the configuration file allows you to define how often statistics are read from your system and which specific processes to monitor. In our case, the default settings will do just fine.
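For orientation, the input section looks roughly like this. This is a sketch based on the stock Topbeat 1.x defaults; verify the exact option names against the configuration file shipped with your package:

```yaml
input:
  # In seconds, how often the system statistics are read
  period: 10

  # Regular expressions matching the processes to monitor; ".*" matches all
  procs: [".*"]

  stats:
    # System-wide statistics (CPU, memory, swap, load)
    system: true
    # Per-process statistics
    proc: true
    # File-system statistics
    filesystem: true
```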
Moving on, you need to define where the data will be output. By default, Topbeat is configured to send the data to Elasticsearch. If you're using a locally installed Elasticsearch instance, this default configuration will suit you just fine:
### Elasticsearch as output
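The default Elasticsearch output looks roughly like this (assuming Elasticsearch is listening on its default port on the same machine):

```yaml
output:
  elasticsearch:
    # Array of Elasticsearch hosts to connect to
    hosts: ["localhost:9200"]
```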
Or, you could ship to Logstash using the default configuration in the ‘Logstash as output’ section. You will need to uncomment the relevant lines.
In our case, though, we're going to comment out the Elasticsearch output configuration and define a file output configuration instead. In the 'File as output' section, uncomment the default settings as follows:
### File as output
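Uncommented, the file output section should look along these lines. The path and filename below follow the defaults suggested in the Topbeat 1.x configuration; adjust them to your environment:

```yaml
output:
  file:
    # Directory to which the Topbeat JSON lines are written
    path: "/tmp/topbeat"
    # Base name of the output files
    filename: topbeat
    # Rotate the file once it reaches this size (in KB)
    rotate_every_kb: 10000
    # Maximum number of rotated files to keep
    number_of_files: 7
```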
Next, in the Logging section, define a log file size limit that, once reached, will trigger an automatic rotation:
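A minimal sketch of such a logging section (the 10 MB limit here is an assumption; pick whatever threshold suits your disk budget):

```yaml
logging:
  files:
    # Rotate the Topbeat log file once it reaches 10 MB
    rotateeverybytes: 10485760
```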
Once done, start Topbeat:
$ sudo /etc/init.d/topbeat start
Setting Up Filebeat
As shown above, Topbeat data can be sent directly to Elasticsearch or forwarded via Logstash. Since we do not yet have a native log shipper for Topbeat, we're going to use Filebeat to ship the file exported by Topbeat into the Logz.io ELK setup (if you're using the open source ELK Stack, you can skip this step).
First, download and install the Public Signing Key:
$ curl https://packages.elasticsearch.org/GPG-KEY-elasticsearch | sudo apt-key add -
Then, save the repository definition to /etc/apt/sources.list.d/beats.list:
$ echo "deb https://packages.elastic.co/beats/apt stable main" | sudo tee -a /etc/apt/sources.list.d/beats.list
Now, update the system and install Filebeat:
$ sudo apt-get update && sudo apt-get install filebeat
The next step is to download a certificate and move it to the correct location, so first run:
$ wget https://raw.githubusercontent.com/cloudflare/cfssl_trust/master/intermediate_ca/COMODORSADomainValidationSecureServerCA.crt
$ sudo mkdir -p /etc/pki/tls/certs
$ sudo cp COMODORSADomainValidationSecureServerCA.crt /etc/pki/tls/certs/
We now need to configure Filebeat to ship our Topbeat file into Logz.io.
Open the Filebeat configuration file:
$ sudo vim /etc/filebeat/filebeat.yml
Defining the Filebeat Prospector
Prospectors are where we define the files that we want to tail. You can tail JSON files and simple text files. In our case, we’re going to define the path to our Topbeat JSON file.
Please note that when harvesting JSON files, you need to add 'logzio_codec: json' to the fields object, and the fields_under_root property must be set to 'true'. Be sure to enter your Logz.io token in the fields object as well:
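Putting that together, a prospector definition along these lines should work. The path assumes the file output location configured for Topbeat above, and `YOUR-LOGZIO-TOKEN` is a placeholder for your account token:

```yaml
filebeat:
  prospectors:
    - paths:
        # The JSON file written by Topbeat's file output
        - /tmp/topbeat/topbeat
      fields:
        # Tells Logz.io to parse the incoming lines as JSON
        logzio_codec: json
        # Placeholder -- replace with your Logz.io account token
        token: YOUR-LOGZIO-TOKEN
      # Promote the fields above to top-level fields in each event
      fields_under_root: true
```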
Defining the Filebeat Output
Outputs are responsible for sending the data in JSON format to Logstash. In the configuration below, the Logz.io Logstash host is already defined along with the location of the certificate that you downloaded earlier and the log rotation setting:
# The Logstash hosts
# List of root certificates for HTTPS server verifications
# To enable logging to files, to_files option has to be set to true
# Configure log file size limit.
rotateeverybytes: 10485760 # = 10MB
Be sure to put your Logz.io token in the required fields.
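For reference, the relevant output section looks roughly like this. The host and port below are placeholders, not the actual Logz.io listener address (use the one provided in your account); the certificate path matches the location used earlier:

```yaml
output:
  logstash:
    # The Logstash hosts -- placeholder, use the host provided by Logz.io
    hosts: ["listener.example.com:5015"]
    tls:
      # List of root certificates for HTTPS server verifications
      certificate_authorities:
        - /etc/pki/tls/certs/COMODORSADomainValidationSecureServerCA.crt
```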
Once done, start Filebeat:
$ sudo service filebeat start
Analyzing the Data
Important note! If you’re using the open source ELK Stack, another step is necessary — loading the Topbeat index template in Elasticsearch. Since Logz.io uses dynamic mapping, this step is not necessary in our case. Please refer to Elastic’s documentation for more information.
To verify that the pipeline is up and running, access the Logz.io user interface and open the Kibana tab. After a minute or two, you should see a stream of events coming into the system.
You may be shipping other types of logs into Logz.io, so the easiest way to isolate the Topbeat data is to open one of the messages coming in from Topbeat and filter via the 'source' field.
The messages list is then filtered to show only the data output by Topbeat:
Start by adding some fields to the messages list. Useful fields are the ‘type’ and ‘host’ fields, especially when monitoring a multi-node environment. This will give you a slightly clearer picture of the messages coming in from Topbeat.
Next, query Elasticsearch. For example, if you’d like to focus on system data, use a field-level search to pinpoint these specific messages:
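For example, a field-level query along these lines (assuming the default Topbeat type names) narrows the view to system-level metrics:

```
type:system
```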
Our next step is to visualize the data. To do this, we’re going to save the search and then select the Visualize tab in Kibana.
For starters, let’s begin with a simple pie chart that gives us a breakdown of the different source types coming into Elasticsearch from Topbeat. The configuration of this visualization looks like this:
Hit the Play button to preview the visualization:
Memory Usage Over Time
Now, let’s try to create a more advanced visualization — a new line chart that shows memory usage over time. To do this, we’re going to use the saved search for system-type messages (shown above) as the basis for the visualization.
The Y-axis, in this case, will aggregate the average value for the 'mem.actual_used' field, and the X-axis will aggregate by the '@timestamp' field. We can also add a sub-aggregation to show data for other hosts (in this case, only one host will be displayed).
The configuration of this visualization looks like this:
And the end-result:
Per-Process Memory Consumption
Another example of a visualization that we can create is an area chart comparing the memory consumption for specific processes on our server.
The configuration of this visualization will cross-reference the average values for the ‘proc.mem.rss_p’ field (the Y-axis) with a date histogram and the ‘proc.name’ field (X-axis).
The configuration looks like this:
And the end-result:
After saving the visualizations, it’s time to create your own personalized dashboard. To do this, select the Dashboard tab, and use the + icon in the top-right corner to add your two visualizations.
Now, if you're using Logz.io, you can use a ready-made dashboard that will save you the time spent creating your own set of visualizations.
Select the ELK Apps tab:
ELK Apps are free, pre-made visualizations, searches, and dashboards customized for specific log types. (You can browse the library directly or learn more about them.) Enter 'Topbeat' in the search field:
Install the Topbeat dashboard, and then open it in Kibana:
So, in just a few minutes, you can set up a monitoring system for your infrastructure with metrics on CPU, memory, and disk usage as well as per-process stats. Pretty nifty, right?