Terraform vs. Ansible vs. Puppet
Infrastructure as Code (IAC): ten years ago it was just coming to light, and now most organizations are either adopting it or are built upon it. So what does this mean in simple terms? IAC takes manual configurations, procedures, build guides, and run books and treats them like code. It means building tests for your configurations to ensure you’re getting what you expect, and then enforcing that configuration. It takes the “person” out of manual configuration. If you are familiar with the OSI model, it treats Layers 1-6 as revisioned software, not just Layer 7.
IAC solves many basic yet crippling problems: configuration drift, human error, inconsistencies, and loss of context. Let’s step back seven years for an example at an enterprise. Your server admin racked a 4U server (1 hour), then loaded the OS and all patches (1–2 hours depending on scripting). The application admin received the server, loaded a web service, configured a proxy and a certificate, and uploaded their content (3 hours). A network engineer provisioned firewall ports, the subnet, and VLANs for the server(s) (1 hour). Perhaps the SAN admin configured storage for content (1 hour). Then the server ran for 7 years, getting updated along the way, with little more than a change control ticket tracker to trace the evolution of its life (that is, if all changes made it into change control).
With IAC tools, this process would take minutes, and it is repeatable and scalable: loading five servers or hundreds should take about the same amount of time to provision. The best part is you gain predictable architecture and can confidently maintain that desired state.
This article will take you through a comparison of Terraform, Ansible, and Puppet, three IAC tools that have unique strengths and weaknesses.
One concept to understand: Desired State (Configuration) Manager vs. Orchestrator. The Desired State Manager is declarative and brings your controlled assets (typically servers) into an expected state. It is a server/client relationship. An Orchestrator controls the deployment of services. It is a client-only model and is managed through configuration files.
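To make the desired state idea concrete, here is a minimal Python sketch. It is not how any of these tools is implemented; the `converge` function and the server settings are hypothetical, chosen only to show that a declarative run changes what differs, does nothing when the state already matches, and repairs drift on a later run.

```python
def converge(actual: dict, desired: dict) -> list[str]:
    """Bring `actual` in line with `desired` and return the changes made."""
    changes = []
    for key, want in desired.items():
        have = actual.get(key)
        if have != want:
            changes.append(f"{key}: {have!r} -> {want!r}")
            actual[key] = want
    return changes


# Hypothetical server settings, for illustration only.
desired = {"nginx": "installed", "port": 443}
server = {"nginx": "absent", "port": 443}

first_run = converge(server, desired)   # only nginx differs, so only nginx changes
second_run = converge(server, desired)  # already in the desired state: no changes

server["port"] = 8080                   # someone edits the server by hand (drift)
drift_run = converge(server, desired)   # the next run puts the port back

print(first_run, second_run, drift_run)
```

A desired state manager effectively runs something like `converge` on a schedule through its agents, which is why drift gets corrected without anyone asking.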
Terraform
Terraform is the service provisioner and infrastructure orchestrator in the suite of offerings by HashiCorp. Terraform is cloud-agnostic and supports a multitude of providers, giving you efficiency in managing your multi-cloud, multi-offering environment using the same configuration construct and language. Its configurations are written in the HashiCorp Configuration Language (HCL), and it is very easy to get up and running quickly.
Where It Is Awesome
Using Terraform to manage your framework, or your infrastructure scaffolding, is its real strength. Basically, you can use Terraform to build everything in your cloud up to the point of configuring the servers themselves. It can build your cloud networks, your security objects, network objects, scaling objects, pretty much every cloud feature.
Terraform is fast. It allows you to plan your changes, giving you feedback on what will change before you apply anything. Its configuration should be revision controlled in your favorite code repository. It is relatively simple to build from scratch in comparison to the other tools in this article.
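The idea behind planning can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Terraform's actual algorithm: the `plan` function and the resource names (`network.main`, `firewall.web`, `firewall.old`) are hypothetical, and real providers also decide whether a change can be made in place or needs a replacement.

```python
def plan(state: dict, config: dict) -> dict:
    """Compare recorded state with the configuration without changing anything."""
    shared = config.keys() & state.keys()
    return {
        "create": sorted(config.keys() - state.keys()),
        "update": sorted(name for name in shared if config[name] != state[name]),
        "destroy": sorted(state.keys() - config.keys()),
    }


# Hypothetical resources: what was built last time vs. what the code says now.
state = {"network.main": {"cidr": "10.0.0.0/16"}, "firewall.old": {"port": 8080}}
config = {"network.main": {"cidr": "10.1.0.0/16"}, "firewall.web": {"port": 443}}

preview = plan(state, config)
print(preview)
```

Because the preview is computed before anything is touched, it can be reviewed like any other code change.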
Terraform provides the means to leverage modules for more efficient code usage. An excellent way to further reduce duplication is to incorporate Terragrunt to wrap your Terraform code. Terragrunt allows you to remove most duplication no matter how many environments you manage.
Where It Struggles
Terraform is an amazing tool, but a major challenge is managing the state file. Whenever you apply changes to your infrastructure, the entire managed body of code and created objects is tracked in the Terraform state file (.tfstate), which can reach hundreds of thousands of lines and must be managed carefully lest you incur large merge conflicts or unwanted resource changes.
One drawback, if you are used to AWS CloudFormation’s rollback on failure, is that Terraform does not roll back a failed apply. Some see this as a better option, while others would rather employ an all-or-nothing deployment.
Terraform can use methods like user-data to configure servers themselves but it is a far less efficient way to provision servers than the other tools in this article. If you’re relying on containers or prebaked images, then this shouldn’t be a concern. If not, you may need one of the other two featured tools in this article to perform the actual configuration.
Ansible
Ansible is a powerful imperative tool that offers a suite of modules and configuration methods to bring servers and services into the desired state. It can also connect to different providers through wrapper modules to configure resources. It is lighter weight, coding-wise, than its competitors, and its speed to deploy and ease of use make it an attractive technology choice.
Where It Is Strong
Ansible uses a configuration model similar to Puppet but the key noticeable difference is that its workflow feel of building plays and playbooks is very “scriptlike”. This model is going to appeal more to your traditional ops folks with a strong scripting background.
It’s procedural, so those who enjoy classic scripting, where one step leads to the next, will love how easily Ansible can put together quick and tidy configurations. It allows for multiple levels of variables, to the point where you need an extensive order-of-operations (precedence) list to keep your coding straight. It allows for templating configurations through Jinja2, which reduces the need for code duplication.
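A rough Python sketch shows why the variable levels need an order of operations. The five levels below are a simplified subset of Ansible's much longer precedence list, the variable names are hypothetical, and Python's `string.Template` is only a stand-in for Jinja2 (the syntax differs).

```python
from string import Template

# Lowest precedence first. A simplified subset of Ansible's real list.
PRECEDENCE = ["role_defaults", "group_vars", "host_vars", "play_vars", "extra_vars"]


def resolve(layers: dict) -> dict:
    """Merge variable layers so that later (higher-precedence) layers win."""
    merged = {}
    for name in PRECEDENCE:
        merged.update(layers.get(name, {}))
    return merged


# Hypothetical variables defined at three different levels.
layers = {
    "role_defaults": {"http_port": 80, "workers": 2},
    "group_vars": {"http_port": 8080},
    "extra_vars": {"workers": 8},
}

resolved = resolve(layers)

# One template, many hosts: only the resolved variables differ per host.
config_line = Template("listen $http_port;").substitute(resolved)
print(resolved, config_line)
```

The same template can serve every environment, since only the resolved values change from host to host.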
Ansible also shines in configuring container hosts (think Kubernetes). Since most Docker containers will be built from images, they don’t require the same level of configuration management as traditional cloud hosts. Ansible can bring up your Kubernetes masters and nodes (formerly called minions) fast and let Kubernetes and Docker handle the rest.
Where It Lacks
Ansible is one and done, unless you run it again. Unlike many of its peers, it doesn’t have an agent checking in every 30 minutes to not only look for changes, but fix any configuration drift. This can be very problematic in large mutable environments.
It’s the outlier insomuch as Puppet (as well as Chef and Salt) is written using the same tropes (classes, modules, etc.) as software. A developer is going to favor Ansible less than their ops brethren will.
It can configure cloud infrastructure much as Terraform does, but the range of resources it can configure is more limited, and deploying and updating that infrastructure is not as easy.
Puppet
Puppet is a declarative desired state tool. It is the oldest of its peers and has a lot of maturity in the market. It is a server/client-based tool that refreshes state on clients by way of a catalog. It leverages a robust metadata configuration method called hieradata. Its use and construct mimic software development.
How Puppet Is the Master
Puppet may appeal to developers more than Ansible because of its familiar software development paradigm. It is robust and manages the state of servers constantly thanks to agents. It has a massive community, so a lot of work is done for you.
Puppet module testing is fairly easy and robust. Why? Because it is just like testing your Rails or Node code: its built-in language is tested in the same way as its platform-building counterparts.
Hieradata is strong and can be written in YAML for ease of reading. It is a deep metadata framework that allows you to manage unique configuration between environments, deployments, variations of systems, while keeping actual modules as light as possible.
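The hierarchy idea can be sketched in Python as a lookup that walks from the most specific level to the most general and returns the first match. This is an illustration only, not Puppet's implementation; the level names, keys, and values are hypothetical.

```python
def lookup(key, hierarchy, data, default=None):
    """Return the value from the first (most specific) level that defines `key`."""
    for level in hierarchy:
        values = data.get(level, {})
        if key in values:
            return values[key]
    return default


# Hypothetical data: one dictionary per level (think one YAML file per level).
data = {
    "nodes/web01": {"ntp_server": "10.0.0.5"},
    "environments/production": {"ntp_server": "ntp.prod.example.com", "log_level": "warn"},
    "common": {"ntp_server": "pool.ntp.org", "log_level": "info", "motd": "managed by puppet"},
}

# Each node gets its own hierarchy, most specific level first.
web01 = ["nodes/web01", "environments/production", "common"]
web02 = ["nodes/web02", "environments/production", "common"]

print(lookup("ntp_server", web01, data))  # node-level override
print(lookup("ntp_server", web02, data))  # falls back to the environment level
print(lookup("motd", web02, data))        # falls back to common
```

The module code asks for `ntp_server` the same way everywhere, and the differences between systems live in the data rather than in the modules.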
Server/Client management allows you to know your drift is under control. For those in a compliance-based world (think HIPAA, SOC), it matters. There is also community support for managing Docker containers and container orchestrators (Swarm).
Puppet’s Challenges
Puppet is hard to get off the ground. Its complexity requires thought in the big picture of your entire deployment, meaning Ansible is much faster to get the code out to servers.
It can provision infrastructure through help in the community, but it is incomplete and more difficult than Terraform. It heavily relies on a tool like Terraform to build the underpinnings until it can take the servers the last few miles.
Though Puppet has Docker integrations, its strength is not in managing container architecture and scheduling the way Kubernetes does.
You need to explicitly identify dependencies. This is one of the toughest differences to manage compared to Ansible. In Puppet, modules can be executed in any order, so if you don’t explicitly define dependencies, it may take several runs to successfully complete a configuration. Ansible, by contrast, is procedural and runs its tasks in the order they are written, which is much easier.
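Declared dependencies are what let a tool work out a safe order on its own. The Python sketch below uses the standard library's `graphlib` (Python 3.9+) to order three hypothetical resources; it illustrates dependency ordering in general and is not Puppet's own code.

```python
from graphlib import TopologicalSorter

# Each resource maps to the resources it requires (hypothetical names).
requires = {
    "service:nginx": {"package:nginx", "file:/etc/nginx/nginx.conf"},
    "file:/etc/nginx/nginx.conf": {"package:nginx"},
    "package:nginx": set(),
}

# static_order() yields every resource after all of its requirements.
order = list(TopologicalSorter(requires).static_order())
print(order)
```

Without the declared requirements, nothing stops the service from being started before its package exists, which is how a configuration ends up needing several runs to settle.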
Community
All three tools have excellent community support. Being open source strengthens the offerings and allows for faster development of community content.
Terraform has a community page that takes you to their IRC and other sites. There is also an active Google Group, as well as a robust bug tracker on their GitHub page. The tool reached 1.0.0 last year and is really cranking out new features. Also, Terraform is known for releasing new and updated features for major cloud providers (think AWS, GCP, Azure) very quickly.
Ansible Galaxy provides its community with reusable roles to deploy configurations. Whether it is a MySQL server or a Windows IIS web service, it cuts down on the development to deployment time. It also has a Google Group full of information. Ansible was purchased by Red Hat in 2015 and received a lot of “enterprising” in their Tower offering.
Puppet Forge is a vast repo of modules to do everything from configuring logrotate.d to installing MongoDB, to hardening an Ubuntu server with CIS benchmarks. Puppet’s User Groups, Slack, and IRC are deep and developed, and Puppet’s community page is a good landing page for their community content.
Conclusion
One of the best ways to be successful in IAC is knowing which tools to use for which jobs. Terraform is an excellent tool to manage cloud services below the server. Puppet and Ansible each excel at their own way of configuring systems, depending on your team and needs. You can design your IAC environment to achieve automation, or use the tools to redefine your delivery of services. There is a massive community of content, so take advantage of it, and don’t forget to contribute back to it.