Using Amazon EC2 Tags to Deploy Servers with Puppet

Those of us at our log analytics platform love both AWS and configuration management, so we started thinking about how to combine the two.

Our goal was to provision servers quickly, dynamically, and with as little configuration as possible. Then we realized it would be even cooler if we could determine a server’s configuration simply by tagging it in Amazon Elastic Compute Cloud (Amazon EC2).

In this post, I will describe how we use tags to provision EC2 instances so that you can do the same in your environment.

A Brief Introduction to Puppet

You will need a basic understanding of Puppet. If you are already familiar with the open-source configuration management tool, feel free to skip this part.

We chose to use Puppet because it is simple, but still offers a feature set so extensive that it’s hard to pick a favorite. With 419 contributors and 22,000 commits in its GitHub repository, the Puppet community is big and active — and the Puppet forge (a repository for pre-made Puppet modules) is extensive.

Puppet, just like every other configuration management software, follows this simple principle: Code your desired state, and the software will make sure that the server complies. There’s no need to mess with the how, only with the what.
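To make that principle concrete, here is a minimal sketch (the package and service names are illustrative and vary by OS; they are not from the original post) of declaring a desired state in Puppet:

```puppet
# Declare *what* you want; Puppet works out *how* to get there.
# 'ntp' and 'ntpd' are illustrative names and differ across distributions.
package { 'ntp':
  ensure => installed,
}

service { 'ntpd':
  ensure  => running,
  enable  => true,
  require => Package['ntp'],
}
```

On every run, Puppet compares this declared state to the actual state of the server and changes only what does not match.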


Facter

As its name implies, Facter provides facts (for example, OS version, kernel version, and the amount of free memory) about servers. The really cool thing about Facter is that it is extensible. You can write your own code to supply facts about a server.
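As a sketch of that extensibility (the fact name and the logic here are hypothetical, not from the original post), a custom fact is just a small Ruby file placed in a module’s lib/facter directory:

```ruby
# lib/facter/deployment_env.rb -- a hypothetical custom fact.
# Facter evaluates this file and exposes the result, which Puppet
# manifests can then read as $::deployment_env.
Facter.add(:deployment_env) do
  setcode do
    # Any Ruby can run here; this example just checks a marker file.
    File.exist?('/etc/staging') ? 'staging' : 'production'
  end
end
```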

In Puppet, the manifest is the basic logical unit: all of your Puppet code resides inside manifests. Puppet can use Facter values inside manifests, so you can change things dynamically from one server to another. (Just guess where I’m going with this!)
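For example (a minimal sketch, not taken from the original post), a manifest can interpolate Facter values directly, so the same code produces different results on different servers:

```puppet
# /etc/motd will differ per server because the facts differ.
# $::hostname and $::operatingsystem are standard Facter facts.
file { '/etc/motd':
  ensure  => file,
  content => "This is ${::hostname}, running ${::operatingsystem}\n",
}
```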


Hiera

Hiera (a shortened form of “hierarchy”) is a tool that allows you to define an answering mechanism for the questions your manifests ask.

For those new to Puppet, let’s say I have a Puppet manifest that configures NTP. I also have instances in several AWS regions, so I want to pick the NTP server closest to my region.

I have a couple of options:

  • Write a different manifest for each region. Not a good idea! It’s error-prone and requires too much maintenance. We like to keep things DRY.
  • Hard-code the NTP servers of different regions in the manifest and use Facter to determine which one to use. This is a better option, but what would happen if I were to deploy servers in a new region? I would need to redeploy the manifest simply to add the NTP servers.
  • Use Hiera!

In Hiera, you can define a hierarchy called region and then specify the NTP servers there.

When the manifest runs, you can ask Hiera, “Which NTP servers should I use?” Based on the Facter value for your region, the correct NTP servers will be provided. For more information about Hiera, see the excellent Puppet documentation.
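To sketch what that region level might hold (the file path and server addresses here are assumptions for illustration, not from the original post), each region gets its own YAML file under the Hiera data directory:

```yaml
# /etc/puppet/hieradata/region/us-east-1.yaml (hypothetical)
ntp::servers:
  - '0.amazon.pool.ntp.org'
  - '1.amazon.pool.ntp.org'
```

When a manifest looks up ntp::servers, Hiera returns the values for whichever region Facter reports, with no code changes needed to support a new region.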

How Puppet and Hiera Work Together

This is the basic process to keep in mind for now: By default, a node managed by Puppet “wakes up” once every 30 minutes and sends the Puppetmaster (the Puppet server) its facts (from Facter). The Puppetmaster decides which manifests this node should receive (I’ll leave out the how for now), compiles them, and uses Hiera to answer all of the questions, as described earlier.

All of the manifests compiled together are referred to as a catalog. The catalog is sent to the managed agent, which, in turn, applies it. The outcome of this process is sent back to the master in the form of a Puppet report.

How Does It Work?

We want to expose EC2 tags as facts so we can use them later with Puppet to place our instance in the right spot inside Hiera.

First, we had to be able to fetch tags from EC2. Although it seems obvious to query the EC2 instance metadata from inside the host, tags are not returned there. We had to go another way: letting each instance query its tags directly from the AWS API. Luckily, we found a module on the Puppet Forge that lets us do exactly that.

After you deploy the module into your environment, you should see the instance’s tags exposed as ec2_tag_* facts.

For the module to work, you should either supply an access key and secret access key or set a role for the instance. You’ll find setup instructions for the module in the Forge documentation.
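One quick way to confirm that the facts are present (listing facts is standard Facter usage; the tag names on your instances will of course differ) is to run Facter on the instance:

```shell
# List all facts and keep only the EC2 tag facts
facter | grep ec2_tag
```

Each tag should appear as a fact named ec2_tag_<tagname>, such as ec2_tag_role.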

Now that we have tags as facts, let’s combine the facts with Hiera.

Let’s say we have three types of instances (app, bl, and db) and we want to be able to provision any one of them based only on its AWS role tag. We configure Hiera to look something like this:

:backends:
  - yaml
:yaml:
  :datadir: /etc/puppet/hieradata
:hierarchy:
  - "%{::fqdn}"
  - "%{ec2_tag_role}"
  - common

In this configuration, each manifest that asks Hiera a question will first search for an answer in /etc/puppet/hieradata/ in a file named after the node’s fqdn. Then, it will search under /etc/puppet/hieradata/myrole.yaml (where myrole is the value of the instance’s role tag), and finally in /etc/puppet/hieradata/common.yaml.

Unless we ask Hiera to merge results, only the first answer found is accepted.

We use fqdn for instance-specific configuration and common for global configuration.

Let’s create a YAML file for our app under /etc/puppet/hieradata/app.yaml:

classes:
  - "my-awesome-app-module"

my-awesome-app-module::first_param: 1
my-awesome-app-module::second_param: 2
my-awesome-app-module::param_based_on_fact: "%{ec2_tag_name}"
my-awesome-app-module::param_based_on_hiera: "%{hiera('my-awesome-app-module::first_param')}"

The classes section defines which classes the instance should get, but the classes entry does not work by itself. You must configure it under /etc/puppet/manifests/site.pp:

node default {
  hiera_include('classes')
}

This configuration practice is based on Puppet’s default node definition.

All of the parameters starting with my-awesome-app-module can be consumed either as class parameters (via automatic parameter lookup) or through explicit Hiera lookup functions. For more information about automatic parameter lookup, see the Puppet documentation.
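To sketch how automatic parameter lookup consumes those keys (note that valid Puppet class names use underscores rather than hyphens, so the class name here is a hypothetical, adjusted stand-in for the module above), a class declared with matching parameter names is filled straight from Hiera:

```puppet
# Hiera keys of the form my_awesome_app_module::first_param are
# looked up automatically when the class is included.
class my_awesome_app_module (
  $first_param  = 'default',
  $second_param = 'default',
) {
  notify { "first_param resolved to ${first_param}": }
}
```

If Hiera supplies no value, the defaults declared in the class signature are used instead.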

The End Result

Here are some ways we have used tags to configure our servers:

  1. Setting the role tag with the value Elasticsearch will install Elasticsearch on the machine.
  2. Setting the es-role tag with the value master will make sure that the server is an Elasticsearch master.
  3. On staging servers, setting the branch tag to feature/abc will deploy code from the feature/abc branch.
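One way to support tag-driven behaviors like these (a sketch, assuming each tag is exposed as an ec2_tag_* fact as described above) is to add matching levels to the Hiera hierarchy:

```yaml
:hierarchy:
  - "%{::fqdn}"
  - "%{ec2_tag_branch}"
  - "%{ec2_tag_es_role}"
  - "%{ec2_tag_role}"
  - common
```

More specific tags sit higher in the hierarchy, so an es-role answer wins over a role answer when both exist.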

This is just the tip of the iceberg for using EC2 tags to provision your instances. If you give this a try in your environment, contact me on GitHub. I’d love to hear about the amazing things you are doing!
