The Puppet ecosystem
Puppet is a configuration management and automation tool. We use it to install, configure, and manage the components of our servers.
Written in Ruby and released under an open source license (Apache 2), it can run on any Linux distribution, many other UNIX variants (Solaris, *BSD, AIX, and Mac OS X), and Windows.
Luke Kanies started its development in 2005 as an alternative approach to the existing configuration management tools (most notably, CFEngine and BladeLogic).
The project has grown year after year. Kanies' own company, Reductive Labs, which was renamed Puppet Labs in 2010, has received a total of $45.5 million in various funding rounds (investors include VMware, Google, and Cisco).
Now, it is one of the top 100 fastest growing companies in the US. It employs more than 250 people and has a solid business based on open source software, consulting services, training, and certifications. It also offers Puppet Enterprise, the commercial version, which is based on the same open source Puppet code base but adds a web GUI that makes Puppet easier to use and administer.
The Puppet ecosystem features a large, vibrant, and active community that meets on the Puppet Users and Puppet Developers Google groups, on the crowded #puppet IRC channel on Freenode, at the various Puppet Camps that are held multiple times a year all over the world, and at the annual PuppetConf, which gets bigger and better year after year.
Various software products complement Puppet; the following are developed by Puppet Labs:
- Hiera is a key-value lookup tool that is currently the tool of choice for storing data related to your Puppet infrastructure
- MCollective is an orchestration framework that allows parallel execution of tasks on multiple servers. It is a separate project by Puppet Labs, which works well with Puppet
- Facter is a required companion tool: it is executed on each managed node and gathers local information as key-value pairs (facts) that are used by Puppet
- Geppetto is an Eclipse-based IDE that allows easier, assisted development of Puppet code
- Puppet Dashboard is an open source web console for Puppet
- PuppetDB is a powerful backend that can store all the data gathered and generated by Puppet
- Puppet Enterprise is the commercial solution to manage Puppet, MCollective, and PuppetDB via a web frontend
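To make the roles of Facter and Hiera in the list above a little more concrete, the following Python sketch mimics, in a very simplified way, how facts gathered on a node can drive a key-value lookup. It is not Puppet, Facter, or Hiera code: the function names (`gather_facts`, `lookup`), the layout of the data store, and the keys and values are all assumptions made for illustration only.

```python
import platform
import socket


def gather_facts():
    """Collect a few local facts as key/value pairs (a toy stand-in for Facter)."""
    return {
        "hostname": socket.gethostname(),
        "kernel": platform.system(),        # e.g. "Linux", "Darwin", "Windows"
        "architecture": platform.machine(),
    }


# Hypothetical data store: a level specific to the node's kernel,
# plus a "common" level holding shared defaults.
DATA = {
    "kernel/Linux": {"ntp_server": "ntp.linux.example.com"},
    "common": {"ntp_server": "ntp.example.com", "admin_email": "ops@example.com"},
}


def lookup(key, facts, data=DATA):
    """Return the first value found for key, searching from specific to generic."""
    levels = ["kernel/{}".format(facts.get("kernel")), "common"]
    for level in levels:
        values = data.get(level, {})
        if key in values:
            return values[key]
    raise KeyError(key)
```

Calling `lookup("ntp_server", gather_facts())` returns the kernel-specific value when one is defined for the node and falls back to the shared default otherwise, which is the general idea behind keeping data separate from code.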
The community has produced other tools and resources; the most notable ones are the following:
- The Foreman is a systems lifecycle management tool that integrates perfectly with Puppet
- Puppetboard is a web frontend for PuppetDB
- Kermit is a web frontend for Puppet and MCollective
- A lot of community code is released as modules, which are reusable components that allow the management of any kind of application and software via Puppet
Why configuration management matters
IT operations have changed drastically in the last few years; virtualization, cloud, business needs, and emerging technologies have accelerated the pace at which systems are provisioned, configured, and managed.
The manual setup of a growing number of operating systems is no longer a sustainable option. At the same time, in-house custom solutions to automate the installation and the management of systems cannot scale in terms of required maintenance and development efforts.
For these reasons, configuration management tools such as Puppet, Chef, CFEngine, Rudder, Salt, and Ansible (to mention only the best-known open source ones) are becoming increasingly popular in many infrastructures.
They allow a centralized and controlled approach to systems management, based on code and data structures, which can be managed via a source code management (SCM) tool (Git is the reference choice in the Puppet world).
Once we can express the status of our infrastructure with versioned code, we gain powerful benefits:
- We can reproduce our setups in a consistent way; what is executed once can be executed any time; the procedure to configure a server from scratch can be repeated without the risk of missing parts.
- The log of our code commits reflects the history of changes on our infrastructure: who did what, when, and, if the commit messages are pertinent, why.
- We can scale quickly; the configuration we wrote for one server can be applied to all the servers of the same kind.
- We have aligned and coherent environments. Our development, test, QA, staging, and production servers can share the same setup procedures and configurations.
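The first benefit, that what is executed once can be executed any time, rests on describing a desired state rather than a sequence of commands. The Python sketch below illustrates the idea with a single file; it is not how Puppet is implemented, and the `ensure_file` helper and the sample file name and content are hypothetical.

```python
import os
import tempfile


def ensure_file(path, content):
    """Bring a file to the desired content; return True only if a change was made."""
    try:
        with open(path, "r", encoding="utf-8") as handle:
            if handle.read() == content:
                return False  # already in the desired state, nothing to do
    except FileNotFoundError:
        pass  # the file is missing, so it has to be created
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(content)
    return True


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "motd")
    desired = "Managed by configuration management\n"
    print(ensure_file(target, desired))  # first run creates the file
    print(ensure_file(target, desired))  # second run finds nothing to change
```

Because the function checks the current state before acting, running it repeatedly is safe: only the first run (or a run after someone has altered the file) changes anything.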
With these kinds of tools, we can have a system provisioned from zero to production in a few minutes, or we can quickly propagate a configuration change over our whole infrastructure automatically.
Their power is huge and has to be handled with care: just as we can automate massive, parallelized setups and configurations of systems, we might also automate distributed destruction. With great power comes great responsibility.