Puppet as a platform
So far, this chapter has focused on the Puppet language, but now we will look at the Puppet platform and how it applies the desired state to client servers. Puppet can be run with just an installed agent and all the files locally, which is common for testing, but this overview will focus on the client-server setup. In Chapters 10, 13, and 14, we will go into much more detail about resilience, scalability, and more advanced running options. However, for now, we will focus on how a Puppet client talks to a server to request and apply its desired state.
Every client under Puppet control has a Puppet agent installed. Figure 1.4 shows the steps of a Puppet agent run, which this section will outline:
Figure 1.4 – The Puppet agent run life cycle
The first step is for the agent to identify itself to the primary server with SSL keys or to create new SSL keys for the primary server to sign. This will secure communication between the server and client.
The next action is for the client to use a Ruby library called Facter, a system profiler that gathers what are known as facts about the system, such as the OS version or RAM size. These facts can be used in code or by Hiera to make choices about what state a host should be in, such as Windows Server 2022 having a particular registry setting.
Then, the primary server identifies which classes should be applied to the node. Typically, this is done by what is called an external node classifier (ENC) script, which bases its decision on the facts and user definitions. Normally, this will apply a role class to the node, which, as we discussed in the previous section, builds up a definition of profiles and module classes.
Then, the primary server compiles a catalog, a JSON document describing the resources to be applied to the node (ensuring the CPU-intensive work happens on the server and not the client).
This catalog is then sent to the client, which uses it as a blueprint of what the state should look like and makes any changes necessary to enforce that state.
Finally, a report is sent back to the primary server confirming what resources were applied and whether these resources had to be changed due to a change in Puppet code or whether they were changed outside of Puppet control (which might indicate an audit or security breach).
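The classification step above can be sketched as a small ENC script. This is a minimal illustration, not a production classifier: the rule table, the role class names, and the hostnames are all hypothetical, and the sketch classifies on the certname alone rather than on facts. What it relies on is that Puppet calls an ENC with the node's certname as its argument and reads a YAML document containing keys such as classes and environment from standard output.

```ruby
require 'yaml'

# Hypothetical rule table: certname pattern => role class and environment.
NODE_RULES = [
  { match: /\Aweb\d+\./, role: 'role::webserver', environment: 'production' },
  { match: /\Adev-/,     role: 'role::webserver', environment: 'development' }
].freeze

# Build the classification hash for a node.
# Nodes that match no rule fall back to a hypothetical base role.
def classify(certname)
  rule = NODE_RULES.find { |r| certname.match?(r[:match]) }
  return { 'classes' => ['role::base'], 'environment' => 'production' } unless rule

  { 'classes' => [rule[:role]], 'environment' => rule[:environment] }
end

# When run as a script, print the YAML for the certname given as the
# first argument (a placeholder certname is used if none is given).
if __FILE__ == $PROGRAM_NAME
  puts classify(ARGV[0] || 'web01.example.com').to_yaml
end
```

Keeping the decision in a plain function such as classify makes the rules easy to test without a primary server.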
In Figure 1.5, we see an example extract from a Puppet report showing the name of the resource, the type of change made, and the value it needed to change. Additionally, the report includes a record of unchanged resources highlighting what is part of Puppet's enforcement:
Figure 1.5 – The Puppet console server report
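The idea of enforcing a catalog and reporting on it can be illustrated with a toy model. This is only a sketch under simplifying assumptions: each resource is reduced to a single value, and the resource titles and the enforce function are hypothetical. A real agent applies resources through type-specific providers, not through a hash comparison.

```ruby
# Compare desired state with actual state, correct any differences,
# and return one report entry per resource.
def enforce(desired, actual)
  desired.map do |resource, want|
    have = actual[resource]
    if have == want
      { resource: resource, status: 'unchanged' }
    else
      actual[resource] = want # corrective change
      { resource: resource, status: 'changed', from: have, to: want }
    end
  end
end

# Hypothetical desired state (from the catalog) and current state (on the node).
desired = { 'Service[w3svc]' => 'running', 'File[/etc/motd]' => 'Managed by Puppet' }
actual  = { 'Service[w3svc]' => 'stopped', 'File[/etc/motd]' => 'Managed by Puppet' }

first_run  = enforce(desired, actual) # corrects the stopped service
second_run = enforce(desired, actual) # nothing left to change

puts first_run.inspect
puts second_run.inspect
```

The second run reports every resource as unchanged, which is the behavior the report relies on: a change is recorded only when the node has drifted from the desired state.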
By default, this cycle takes place every 30 minutes. In the previous sections, the focus was on how the language can automate the building of servers. Here, we can see that, via the platform, we can ensure that all our deployed servers are kept in the state we set out to achieve, whether that is a security standard profile or an updated setting in a particular implementation, such as adding extra features to IIS. This avoids server drift, where servers on the estate are difficult to keep up to date or are vulnerable to manual changes, made either in error or maliciously in breach of standards. Figure 1.6 shows the dashboard view of Puppet Enterprise, giving a clear view of an estate of servers and the status of each server's last run. This highlights whether the servers were in compliance with our state or had to make changes in their previous run:
Figure 1.6 – The Puppet console status dashboard
What we have reviewed so far would presume a common code base, and when any code changes are made, all clients would have a new state enforced within the next 30 minutes as agents contact the primary server. This would clearly be problematic, as bugs will affect all servers within a brief period. This is why Puppet has environments. An environment is a collection of versioned modules. This is achieved by storing the modules in revision control, such as git, where the version can be declared as a commit, a tag, or a branch, which we can list in a file called a Puppetfile.
An example module declaration would look like this:
mod 'apache', :git => 'https://github.com/exampleorg/exampleapp', :tag => '1.2'
By maintaining this Puppetfile in git, in what is known as a control repo, it is possible to represent multiple environments by having different branches with different versions of the Puppetfile.
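The three ways of pinning a module version (tag, branch, or commit) can be sketched as Puppetfile entries. A Puppetfile is a Ruby DSL, so the fragment below defines a small stand-in for mod purely so that it runs on its own; in real use, the deployment tool supplies mod. The profile and role module names, their URLs, and the commit hash are placeholders.

```ruby
# Stand-in for the Puppetfile parser (hypothetical stub): record each declaration.
DECLARED = []
def mod(name, opts = {})
  DECLARED << opts.merge(name: name)
end

# Pinned to a release tag: the version only moves when this line is edited.
mod 'apache', :git => 'https://github.com/exampleorg/exampleapp', :tag => '1.2'

# Tracking a branch: picks up whatever is newest on that branch at deploy time.
mod 'profile', :git => 'https://github.com/exampleorg/profile', :branch => 'development'

# Pinned to an exact commit (placeholder hash).
mod 'role', :git => 'https://github.com/exampleorg/role',
            :commit => '0123456789abcdef0123456789abcdef01234567'

DECLARED.each { |m| puts m.inspect }
```

A production branch of the control repo would typically favor tags or commits, while a development branch might track module branches to pick up work in progress.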
A common practice is to match environments against how your organization classifies server usage. Normally, this means a minimum of a development environment and a production environment, so that changes can be tested against servers in development and the successfully tested ones can then be deployed to production. This can be taken further by using canary environments to test changes on a small subset of servers. This approach can be customized to the change and risk setup of different organizations.
All the facts and reports we mentioned, as part of the agent cycle, are stored in PuppetDB, a frontend application using PostgreSQL as a backend database, which is designed to manage Puppet data such as reports and facts. This is used with the Puppet Query Language (PQL), which allows us to search the information we have gathered. This can allow for searching of facts, giving configuration management database (CMDB)-style data, and for combined queries where we can check whether a certain resource for a role has changed, which could indicate a change breach has taken place.
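To give a feel for PQL, the sketch below builds two queries and the URL that would carry one of them to PuppetDB's query endpoint. It makes several assumptions: the hostname is hypothetical, the cleartext port 8080 is assumed to be reachable, and the entity and field names (inventory, nodes, latest_report_status) should be checked against the PuppetDB version in use. The code only constructs the request URL and does not contact a server.

```ruby
require 'uri'

# Hypothetical PuppetDB address; adjust the host, port, and scheme for a real estate.
PUPPETDB = 'http://puppetdb.example.com:8080'

# CMDB-style question: which nodes report a Windows OS family?
WINDOWS_NODES = 'inventory[certname] { facts.os.family = "windows" }'

# Audit-style question: which nodes made changes on their latest run?
CHANGED_NODES = 'nodes[certname] { latest_report_status = "changed" }'

# Build the GET URL for the PQL query endpoint, URL-encoding the query text.
def pql_url(pql)
  "#{PUPPETDB}/pdb/query/v4?#{URI.encode_www_form(query: pql)}"
end

puts pql_url(WINDOWS_NODES)
puts pql_url(CHANGED_NODES)
```

The response to such a request is a JSON array of matching rows, which makes it straightforward to feed into reporting or inventory tooling.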
So, in this section, we have seen that the Puppet platform gives a way to progressively deploy new code based on environments. It stores facts about the clients along with the reports generated on each run, giving a powerful CMDB-style view of the estate along with audit and compliance information in the reports that confirm what state the servers are in. This can all be searched using PQL. This can lead to huge savings in operational toil in terms of audit and compliance report generation and helps avoid building technical debt as standards and configurations change.