What is Autodiscover for Filebeat? And why do we need it?

How to use Autodiscover with Filebeat

Microservices change constantly in containerized environments, which makes it harder to identify pods and nodes and to collect their logs. Autodiscover simplifies monitoring in environments such as Kubernetes and Docker, where containers start and stop all the time.

Autodiscover lets you track pods and adapt settings as your environment changes. You define configuration for the relevant nodes or pods, and the autodiscover subsystem then monitors services as they start running within the cluster.

In other words, you can identify microservices running in your containerized environment and apply specific configs per microservice, be it a node or a pod.



For example, you might have two pods that look identical from the outside. The microservice in one might need multiline handling for its logs, while the microservice in the other might need its logs sent to a different endpoint. Autodiscover distinguishes between the two and applies the right configuration to each.

Providers

Providers are the configuration components that watch system events and translate them into internal autodiscover events. At least one provider must be defined for Autodiscover to work.

There are two specific Autodiscover providers you’ll want to take special note of: 1) Docker and 2) Kubernetes.

The Docker provider watches for container starts and stops; the Kubernetes provider watches for starts, stops, and updates. Here are the fields each provider exposes:

Docker provider            Kubernetes provider
host                       host
port                       port
docker.container.id        kubernetes.container.id
docker.container.image     kubernetes.container.image
docker.container.name      kubernetes.container.name
docker.container.labels    kubernetes.labels
-                          kubernetes.namespace
-                          kubernetes.node.name
-                          kubernetes.pod.name
-                          kubernetes.pod.uid
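
To show how these fields are used, here is a minimal sketch of a Docker provider template that matches on one of them. The image name nginx is only an assumption for illustration; substitute whatever your containers actually run.

```yaml
# Sketch only: the "nginx" image name is a placeholder, not part of the setup in this article.
filebeat.autodiscover:
  providers:
    - type: docker
      templates:
        # Apply this config to any container whose image name contains "nginx".
        - condition:
            contains:
              docker.container.image: nginx
          config:
            - type: docker
              containers.ids:
                # Filled in by Autodiscover from the event that triggered the match.
                - "${data.docker.container.id}"
```

The same pattern applies to the Kubernetes provider: swap in a kubernetes.* field in the condition and use ${data.kubernetes.container.id} in the config.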

Configuring Filebeat Autodiscover

In the following example, I used Minikube v1.6.1 to run a local cluster on my machine. I also used Filebeat version 7.3.1 with RBAC. 

In that cluster, I am running a WordPress website along with a MySQL DB for the website. 

In the following part of the article, I will explain how to apply Autodiscover through a DaemonSet defined in YAML in Kubernetes.

The setup starts with the YAML below. RBAC must be configured in the cluster for it to work properly.
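
As a rough sketch of what that RBAC setup can look like, the manifest below grants Filebeat read access to pods and namespaces, which the Kubernetes provider and the add_kubernetes_metadata processor rely on. The filebeat name and the kube-system namespace are assumptions; adjust them to match your own deployment.

```yaml
# Sketch only: resource names and namespace are assumptions.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: filebeat
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: filebeat
rules:
  # Read-only access to the objects Filebeat watches for metadata and events.
  - apiGroups: [""]
    resources: ["namespaces", "pods"]
    verbs: ["get", "watch", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: filebeat
subjects:
  - kind: ServiceAccount
    name: filebeat
    namespace: kube-system
roleRef:
  kind: ClusterRole
  name: filebeat
  apiGroup: rbac.authorization.k8s.io
```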

Below is the relevant snippet from the actual YAML. Note that the type values under providers and under config are extremely important: if you select the wrong type, Autodiscover will not work. The type under fields, by contrast, is not essential:

############################# Filebeat  #####################################
    filebeat.autodiscover:
      providers:
        - type: kubernetes
          templates:
            - condition:
                equals: 
                  kubernetes.labels.app: wordpress   
              config: 
                - type: docker
                  containers.ids:
                    - "${data.kubernetes.container.id}"
                  fields:
                    logzio_codec: plain
                    token: ${LOGZIO_TOKEN}
                    type: wordpress
                    env: prod
                    test: wordpress-true
                  fields_under_root: true
            - condition:
                equals: 
                  kubernetes.container.name: mysql
              config: 
                - type: docker
                  containers.ids:
                    - "${data.kubernetes.container.id}"
                  fields:
                    logzio_codec: plain
                    token: ${LOGZIO_TOKEN}
                    type: mysql
                    env: prod
                    test: mysql-true
                  fields_under_root: true
    # Add Kubernetes Metadata
    processors:
      - add_kubernetes_metadata:
          in_cluster: true
      - rename:
          fields:
            - from: "agent"
              to: "beat_agent"
          ignore_missing: true
      - rename:
          fields:
            - from: "log.file.path"
              to: "source"
          ignore_missing: true

    filebeat.registry.path: /usr/share/filebeat/data/registry
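
For context, the configuration above is typically stored in a ConfigMap and mounted into a Filebeat DaemonSet. The sketch below shows one way to wire that up. It assumes a ConfigMap named filebeat-config with a filebeat.yml key, a Secret named logzio-secret holding the token, a filebeat service account, and the kube-system namespace; all of these names are hypothetical.

```yaml
# Sketch only: ConfigMap, Secret, and namespace names are assumptions.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: filebeat
  namespace: kube-system
  labels:
    k8s-app: filebeat
spec:
  selector:
    matchLabels:
      k8s-app: filebeat
  template:
    metadata:
      labels:
        k8s-app: filebeat
    spec:
      serviceAccountName: filebeat
      terminationGracePeriodSeconds: 30
      containers:
        - name: filebeat
          image: docker.elastic.co/beats/filebeat:7.3.1
          args: ["-c", "/etc/filebeat.yml", "-e"]
          env:
            # Supplies the ${LOGZIO_TOKEN} value referenced in the config.
            - name: LOGZIO_TOKEN
              valueFrom:
                secretKeyRef:
                  name: logzio-secret
                  key: token
          securityContext:
            runAsUser: 0
          volumeMounts:
            - name: config
              mountPath: /etc/filebeat.yml
              subPath: filebeat.yml
              readOnly: true
            # Matches filebeat.registry.path so the registry survives restarts.
            - name: data
              mountPath: /usr/share/filebeat/data
            # Container log files read by the docker input.
            - name: varlibdockercontainers
              mountPath: /var/lib/docker/containers
              readOnly: true
      volumes:
        - name: config
          configMap:
            name: filebeat-config
            defaultMode: 0600
        - name: data
          hostPath:
            path: /var/lib/filebeat-data
            type: DirectoryOrCreate
        - name: varlibdockercontainers
          hostPath:
            path: /var/lib/docker/containers
```

Running Filebeat as a DaemonSet places one instance on every node, so each instance only needs to read the container logs on its own host.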

When would you want to use Filebeat Autodiscover?

Let’s assume you have a microservice environment running in Kubernetes or Docker and you would like to apply different log settings to different types of microservices. For example, for one microservice you want to apply multiline settings; for another, you want to mask sensitive PII.

Without Autodiscover, you would have to manually create a separate YAML config for each microservice. You would also need to remember to copy those configs whenever you create a new cluster. While this is feasible, it is time-consuming, complicated, and does not scale.

If you use Autodiscover, you need only one YAML file containing multiple configs, and Filebeat applies the right one as the cluster changes.
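
A minimal sketch of that scenario might look like the following. The app labels orders and payments, the date-based multiline pattern, and the email-masking regex are all assumptions made for illustration. The masking step uses Filebeat's script processor, so check that your Filebeat version includes it before relying on this.

```yaml
# Sketch only: labels, patterns, and the masking rule are hypothetical.
filebeat.autodiscover:
  providers:
    - type: kubernetes
      templates:
        # Microservice 1: join multiline entries such as stack traces.
        - condition:
            equals:
              kubernetes.labels.app: orders
          config:
            - type: docker
              containers.ids:
                - "${data.kubernetes.container.id}"
              # Lines that do not start with a date are appended to the previous line.
              multiline.pattern: '^\d{4}-\d{2}-\d{2}'
              multiline.negate: true
              multiline.match: after
        # Microservice 2: mask anything that looks like an email address.
        - condition:
            equals:
              kubernetes.labels.app: payments
          config:
            - type: docker
              containers.ids:
                - "${data.kubernetes.container.id}"
              processors:
                - script:
                    lang: javascript
                    id: mask_email
                    source: |
                      function process(event) {
                        var msg = event.Get("message");
                        if (msg) {
                          event.Put("message", msg.replace(/[\w.+-]+@[\w.-]+/g, "<redacted>"));
                        }
                      }
```

Both rules live in one file, and each is applied only to the containers that match its condition.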

This is just one example. The use cases may vary according to your need and the architecture of your environment. Autodiscover is best for large-scale environments, for instance with multiple clusters.

Conclusion: 

If you have a very dynamic environment with different types of logs coming from a variety of microservices, and you want to make sure your settings and configs are applied, I would suggest using Filebeat Autodiscover, as it makes life much easier while saving a lot of time and effort. 

Curious to learn more? Click here to sign up for a Logz.io webinar on how to configure Filebeat Autodiscover!

 
