Configuring YAML Files after Installing the ELK Stack
What is YAML? YAML is a human-readable data serialization language used frequently in configuration files for software. Oddly enough, its name is a recursive acronym: "YAML Ain't Markup Language." This article will show you samples of YAML files (written .yml or .yaml) for the ELK Stack and other programs DevOps teams commonly use. And while some people love YAML and some hate it, it's not going away.
This article is for anyone looking to quickly configure their entire ELK Stack immediately after installing it. It'll provide basic YAML configurations, note when to uncomment lines, and cover advanced configurations (including Logstash's .conf files, which use a different syntax). Please note, some files are .yml and others are .yaml. This is not a misprint. Pay attention to these details.
Introduction to YAML
YAML is actually a superset of JSON, but was created to be more legible than markup languages, specifically XML. It also supports features JSON lacks, such as comments, anchors and references, and multi-line scalar blocks. Put another way, it works well for data serialization, something more advanced than markup (and yes, even XML does data serialization). To quote yaml.org, it is "designed around the common native data structures of agile programming languages."
JSON won out as the lingua franca of APIs and technical docs even though JSON and YAML were both designed with data serialization in mind. YAML is better for configuration because it allows for inline directions: use a # to write a non-code comment in the file that tells people configuring it exactly what to do. Many YAML templates are built around these comments, shipping with settings commented out. All you have to do, sometimes, is "uncomment lines" to make configurations work at their most basic level.
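As an illustrative sketch (the setting name here is just an example, not tied to any particular tool), this is what uncommenting a line means in practice:

```yaml
# A commented-out setting is inactive; the tool falls back to its default:
#server.port: 5601

# Removing the leading # activates the setting:
server.port: 5601
```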
Because YAML is a superset, a YAML parser can parse JSON directly. Again, YAML supports more advanced features, such as embedded block literals and anchors that let a document reference its own nodes. You can also convert JSON to YAML and YAML to JSON with free online tools. YAML is also central to configuring log shippers, hence its importance for companies like Logz.io and open source tools like ELK.
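To see the superset relationship concretely, the JSON form in the comment below is accepted as-is by a YAML parser, and the lines beneath it express the same data in idiomatic YAML:

```yaml
# JSON form (a YAML parser accepts this unchanged):
# {"name": "elasticsearch", "port": 9200, "tags": ["search", "analytics"]}

# Equivalent idiomatic YAML:
name: elasticsearch
port: 9200
tags:
  - search
  - analytics
```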
YAML and Kubernetes
Why is YAML so important to Kubernetes?
Kubernetes is incredibly complex, and YAML's declarative traits afford a lot of advantages for such a system. All Kubernetes resources are created by declaring them within YAML files, which saves a lot of time when scaling Kubernetes. Within a Kubernetes YAML configuration file, you declare the number of replica pods you want, and Kubernetes creates them automatically once you apply the file. You can also define deployment strategies for new containers and pods, pod resource limits, labels, and the filters that target specific pods, called "selectors."
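As a minimal sketch (the name, labels, and image below are placeholders), a Deployment that declares three replica pods, targets them with a selector, and sets a memory limit looks like this:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-deployment        # placeholder name
  labels:
    app: web
spec:
  replicas: 3                 # Kubernetes creates and maintains 3 pods
  selector:
    matchLabels:
      app: web                # targets pods carrying this label
  strategy:
    type: RollingUpdate       # deployment strategy for rolling out new pods
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: nginx:1.25     # placeholder image
        resources:
          limits:
            memory: "128Mi"   # per-container resource limit
```

Applying the file (e.g. with kubectl apply -f) is what triggers Kubernetes to reconcile the declared state.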
YAML Files in the ELK Stack
Of course, you might also be here because you are trying to keep your YAML configurations straight specifically for the ELK Stack (or another monitoring tool, whether or not related to Docker and/or Kubernetes).
The elasticsearch.yml file, like similar files in the ELK Stack and Beats, will by default be located in different places depending on the way you install ELK. In general, this is where you will find them:
Linux:
/etc/elasticsearch/elasticsearch.yml
/etc/kibana/kibana.yml
/etc/filebeat/filebeat.yml
/etc/metricbeat/metricbeat.yml
Homebrew (Mac):
/usr/local/etc/elasticsearch/elasticsearch.yml
/usr/local/etc/kibana/kibana.yml
/usr/local/etc/filebeat/filebeat.yml
/usr/local/etc/metricbeat/metricbeat.yml
Docker:
/usr/share/elasticsearch/config/elasticsearch.yml
/usr/share/kibana/config/kibana.yml
/usr/share/filebeat/filebeat.yml
/usr/share/metricbeat/metricbeat.yml
Configure the elasticsearch.yml File
To configure elasticsearch.yml, open the file:
sudo vim /etc/elasticsearch/elasticsearch.yml
Hit i to enter insert mode and edit the file. You will want to uncomment lines for the following fields:
- cluster.name
- node.name
- path.data
- path.logs
- network.host #Depending on your situation, this should usually be 127.0.0.1 or 0.0.0.0
- http.port (When you uncomment this, make sure it is already set to 9200. Otherwise, set it yourself and type it in.)
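After uncommenting, the relevant lines in elasticsearch.yml might look like this sketch (the cluster and node names are placeholders; the paths shown are common package-install defaults):

```yaml
cluster.name: my-cluster
node.name: node-1
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: 127.0.0.1
http.port: 9200
```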
Press esc and then type :wq to save and exit the file in one step.
Advanced YAML: Elasticsearch Cluster Configuration
You will need more advanced settings for Elasticsearch clusters, including disabling memory swapping, which can severely degrade performance. In elasticsearch.yml, set:

bootstrap.memory_lock: true

(On Elasticsearch versions before 5.0, the equivalent setting was bootstrap.mlockall: true.) You will also need to allow Elasticsearch to lock memory, for example by setting the following in the service's environment file (/etc/default/elasticsearch on Debian-based systems, /etc/sysconfig/elasticsearch on RPM-based ones):

MAX_LOCKED_MEMORY=unlimited
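Beyond memory locking, a multi-node cluster also needs discovery settings so nodes can find each other. Here is a hedged sketch for Elasticsearch 7.x (the hostnames and node names are placeholders; earlier versions used discovery.zen.* settings instead):

```yaml
cluster.name: my-cluster
node.name: node-1
network.host: 0.0.0.0
# Nodes to contact when discovering the cluster:
discovery.seed_hosts: ["es-node-1", "es-node-2", "es-node-3"]
# Only needed when bootstrapping a brand-new cluster:
cluster.initial_master_nodes: ["node-1", "node-2", "node-3"]
bootstrap.memory_lock: true
```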
Configure the kibana.yml File
Besides the default file locations mentioned above, if you installed Kibana from a tar.gz or .zip distribution, look for the file in KIBANA_HOME/config.
Uncomment the following lines and/or make sure the settings match:
server.port: 5601
elasticsearch.url: "http://localhost:9200"

(Note that in Kibana 7.x and later, elasticsearch.url was replaced by elasticsearch.hosts.)
An alternative kibana.yml example might look like this:

server.host: "127.0.0.1"
server.port: 5601
elasticsearch.url: "http://elasticsearch:9200"

(server.port accepts only a port number; the bind address belongs in server.host.)
In general, Elasticsearch should be located at localhost:9200 in all ELK Stack configuration files for system-hosted ELK, unless of course you have a different location.
Intermediate/Advanced Kibana configurations
You can point to an X.509 server certificate and its respective private key using several different options. These configurations are possible for both the Elasticsearch connection and Kibana itself. Both sets of configurations, however, live in the kibana.yml configuration file.
Kibana uses all of these options to validate certificates and create a chain of trust with SSL/TLS connections from end users coming into Kibana.
For Kibana:
server.ssl.keystore.path:
For Elasticsearch:
elasticsearch.ssl.keystore.path:
Alternatively, you can use the server.ssl.certificate and server.ssl.key settings. These function together as an alternative because they cannot be used in conjunction with server.ssl.keystore.path.
For Kibana:
server.ssl.certificate:
server.ssl.key:
For Elasticsearch:
elasticsearch.ssl.certificate:
elasticsearch.ssl.key:
Elasticsearch or Kibana will use these chains, respectively, when PKI authentication is active.
Additionally, you can enable the following configuration to encrypt the respective ssl.key configs:
For Kibana:
server.ssl.keyPassphrase:
For Elasticsearch:
elasticsearch.ssl.keyPassphrase:
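Putting these options together, a kibana.yml TLS section might look like the following sketch (all paths and the passphrase are placeholders; server.ssl.enabled is the switch that turns on TLS for incoming browser connections):

```yaml
# TLS for connections from end users into Kibana:
server.ssl.enabled: true
server.ssl.certificate: /etc/kibana/certs/kibana.crt   # placeholder path
server.ssl.key: /etc/kibana/certs/kibana.key           # placeholder path
server.ssl.keyPassphrase: "changeme"                   # placeholder passphrase

# Client certificate Kibana presents to Elasticsearch (e.g. for PKI auth):
elasticsearch.ssl.certificate: /etc/kibana/certs/client.crt
elasticsearch.ssl.key: /etc/kibana/certs/client.key
```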
Configure the logstash.conf and logstash.yml Files
You will mainly configure Logstash in its .conf file, which uses Logstash's own configuration syntax rather than YAML or JSON. However, the logstash.yml file is still relevant. While this post obviously focuses on YAML configurations, it would be a disservice not to include the basics for the .conf file.
Here is a basic Logstash configuration example for the file’s three main sections: input, filter, and output:
logstash.conf Configuration
input {
  file {
    path => "/var/log/apache2/access.log"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  date {
    match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
  geoip {
    source => "clientip"
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
  }
}
logstash.yml Configuration
Specify Logstash modules:
modules:
  - name: MODULE_NAME1
    var.PLUGIN_TYPE1.PLUGIN_NAME1.KEY1: VALUE
    var.PLUGIN_TYPE1.PLUGIN_NAME1.KEY2: VALUE
    var.PLUGIN_TYPE2.PLUGIN_NAME2.KEY1: VALUE
    var.PLUGIN_TYPE3.PLUGIN_NAME3.KEY1: VALUE
  - name: MODULE_NAME2
    var.PLUGIN_TYPE1.PLUGIN_NAME1.KEY1: VALUE
    var.PLUGIN_TYPE1.PLUGIN_NAME1.KEY2: VALUE
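As a concrete sketch of the template above, enabling Logstash's netflow module with a custom listening port might look like this (module availability varies by Logstash version, and the netflow module has since been deprecated in favor of Filebeat's):

```yaml
modules:
  - name: netflow
    var.input.udp.port: 2055   # port your NetFlow exporters send to
```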
Configure the filebeat.yml File
Configure filebeat.inputs for type: log. Identify separate paths for each kind of log (Apache2, nginx, MySQL, etc.)
filebeat.inputs:
- type: log
  #Change value to true to activate the input configuration
  enabled: false
  paths:
    - "/var/log/apache2/*"
    - "/var/log/nginx/*"
    - "/var/log/mysql/*"
Then define processors within filebeat.inputs. This example defines the drop_fields processor:
filebeat.inputs:
- type: log
  paths:
    - "/var/log/apache2/access.log"
  fields:
    apache: true
  processors:
    - drop_fields:
        fields: ["verb","id"]
Then define the Filebeat output. Uncomment or set the outputs for Elasticsearch or Logstash:
output.elasticsearch: hosts: ["localhost:9200"]
output.logstash: hosts: ["localhost:5044"]
Configuring Filebeat on Docker
The most common method to configure Filebeat when running it as a Docker container is by bind-mounting a configuration file when running said container. To do this, create a new filebeat.yml file on your host. This example is for a locally hosted version of Docker:
filebeat.inputs:
- type: log
  paths:
    - '/var/lib/docker/containers/*/*.log'
  json.message_key: log
  json.keys_under_root: true
  processors:
    - add_docker_metadata: ~
output.elasticsearch:
  hosts: ["localhost:9200"]
To see further examples of advanced Filebeat configurations, check out our other Filebeat tutorials:
What is Filebeat Autodiscover?
Using the Filebeat Wizard in Logz.io
Musings in YAML—Tips for Configuring Your Beats
Configure the metricbeat.yml File
The metricbeat.yml file will list a number of modules (Apache, system, nginx, etc.). Make sure to identify the module, the metricsets, the period, processes, hosts, and enabled: true.
Here is an example configuration of two modules in Metricbeat, one for your system and another for Apache metrics:
metricbeat.modules:
- module: system
  metricsets: ["cpu","memory","network"]
  enabled: true
  period: 15s
  processes: ['.*']
- module: apache
  metricsets: ["status"]
  enabled: true
  period: 5s
  hosts: ["http://172.20.11.7"]
If you are setting up a Metricbeat Docker module, it’s advisable to mark the following metricsets:
metricsets: ["container", "cpu", "diskio", "healthcheck", "info", "memory", "network"]
Metricbeat Output
Just as with Filebeat, uncomment or set the outputs for Elasticsearch or Logstash:
output.elasticsearch: hosts: ["localhost:9200"]
output.logstash: hosts: ["localhost:5044"]
Be sure to go over our full Metricbeat tutorial.