Chapter 6. Managing Resources and Files
"The art of simplicity is a puzzle of complexity."
--Douglas Horton
In this chapter, we will cover the following recipes:
- Distributing cron jobs efficiently
- Scheduling when resources are applied
- Using host resources
- Using exported host resources
- Using multiple file sources
- Distributing and merging directory trees
- Cleaning up old files
- Auditing resources
- Temporarily disabling resources
Introduction
In the previous chapter, we introduced virtual and exported resources. Virtual and exported resources are ways to manage the way in which resources are applied to a node. In this chapter, we will deal with when and how to apply resources. In some cases, you may only wish to apply a resource during off hours, while in others, you may wish to only audit the resource but change nothing. In other cases, you may wish to apply completely different resources based on which node is using the code. As we will see, Puppet has the flexibility to deal with all these scenarios.
Distributing cron jobs efficiently
When you have many servers executing the same cron job, it's usually a good idea not to run them all at the same time. If all the jobs access a common server (for example, when running backups), it may put too much load on that server, and even if they don't, all the servers will be busy at the same time, which may affect their capacity to provide other services.
As usual, Puppet can help; this time, using the inline_template function to calculate a unique time for each job.
How to do it...
Here's how to have Puppet schedule the same job at a different time for each machine:
- Modify your site.pp file as follows:
  node 'cookbook' {
    cron { 'run-backup':
      ensure  => present,
      command => '/usr/local/bin/backup',
      hour    => inline_template('<%= @hostname.sum % 24 %>'),
      minute  => '00',
    }
  }
- Run Puppet:
  [root@cookbook ~]# puppet agent -t
  Info: Caching catalog for cookbook.example.com
  Info: Applying configuration version '1413730771'
  Notice: /Stage[main]/Main/Node[cookbook]/Cron[run-backup]/ensure: created
  Notice: Finished catalog run in 0.11 seconds
- Run crontab to see how the job has been configured:
  [root@cookbook ~]# crontab -l
  # HEADER: This file was autogenerated at Sun Oct 19 10:59:32 -0400 2014 by puppet.
  # HEADER: While it can still be managed manually, it is definitely not recommended.
  # HEADER: Note particularly that the comments starting with 'Puppet Name' should
  # HEADER: not be deleted, as doing so could cause duplicate cron jobs.
  # Puppet Name: run-backup
  0 15 * * * /usr/local/bin/backup
How it works...
We want to distribute the hour of the cron job runs across all our nodes. We choose something that is unique across all the machines and convert it to a number. This way, the values are spread across the nodes, while each node keeps the same value from one run to the next.
We can do the conversion using Ruby's sum method, which computes a numerical value from a string that is unique to the machine (in this case, the machine's hostname). The sum method will generate a large integer (in the case of the string cookbook, the sum is 855), and we want values for hour between 0 and 23, so we use Ruby's % (modulo) operator to restrict the result to this range. We should get a reasonably good (though not statistically uniform) distribution of values, depending on your hostnames. Another option here is to use the fqdn_rand() function, which works in much the same way as our example.
If all your machines have the same name (it does happen), don't expect this trick to work! In this case, you can use some other string that is unique to the machine, such as ipaddress or fqdn.
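The calculation can be tried outside Puppet in plain Ruby. The following is a minimal sketch; the cron_hour helper and every hostname other than cookbook are made up for illustration and are not part of the recipe.

```ruby
# String#sum returns a simple checksum: the sum of the bytes in the string.
# Taking it modulo 24 maps every hostname to an hour between 0 and 23.
# cron_hour is a hypothetical helper name used only in this sketch.
def cron_hour(hostname)
  hostname.sum % 24
end

# 'cookbook' comes from the recipe; the other hostnames are invented.
%w[cookbook web1 web2 db1].each do |name|
  puts format('%-10s sum=%-5d hour=%d', name, name.sum, cron_hour(name))
end
```

Running this shows why the distribution is only reasonably good: web1 and db1 land on the same hour, and sequentially numbered hosts such as web1 and web2 get adjacent hours.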
There's more...
If you have several cron jobs per machine and you want to run them a certain number of hours apart, add this number to the hostname.sum value before taking the modulus. Let's say we want to run the dump_database job at some arbitrary time and the run_backup job an hour later. This can be done using the following code snippet:
cron { 'dump-database':
  ensure  => present,
  command => '/usr/local/bin/dump_database',
  hour    => inline_template('<%= @hostname.sum % 24 %>'),
  minute  => '00',
}
cron { 'run-backup':
  ensure  => present,
  command => '/usr/local/bin/backup',
  hour    => inline_template('<%= ( @hostname.sum + 1) % 24 %>'),
  minute  => '00',
}
The two jobs will end up with different hour values for each machine Puppet runs on, but run_backup will always be one hour after dump_database.
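The parentheses around the addition matter, because Ruby's % operator binds more tightly than +. The sketch below illustrates this in plain Ruby; job_hour is a hypothetical helper name, and the offsets are chosen only to show the wrap-around at midnight.

```ruby
# Hour for a job that runs `offset` hours after the host's base hour.
# The addition must happen before the modulus, hence the parentheses.
def job_hour(hostname, offset = 0)
  (hostname.sum + offset) % 24
end

puts job_hour('cookbook')      # base hour for dump-database
puts job_hour('cookbook', 1)   # run-backup, one hour later
puts job_hour('cookbook', 9)   # wraps past midnight back to hour 0

# Without parentheses, 1 % 24 is evaluated first, so the result is
# simply sum + 1, which is far outside the 0..23 range cron expects.
puts 'cookbook'.sum + 1 % 24
```

When the base hour is 23, the second job wraps to hour 0 of the following day, which still keeps the two jobs one hour apart.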
Most cron implementations have directories for hourly, daily, weekly, and monthly tasks. The directories /etc/cron.hourly, /etc/cron.daily, /etc/cron.weekly, and /etc/cron.monthly exist on both our Debian and Enterprise Linux machines. These directories hold executables, which will be run on the referenced schedule (hourly, daily, weekly, or monthly). I find it better to describe all the jobs in these folders and push the jobs as file resources. An admin on the box searching for your script will be able to find it with grep in these directories. To use the same trick here, we would push a cron task into /etc/cron.hourly and have the task itself check that the current hour is the correct hour for it to run. To create the cron jobs using the cron directories, follow these steps:
- First, create a cron class in modules/cron/manifests/init.pp:
  class cron {
    file { '/etc/cron.hourly/run-backup':
      content => template('cron/run-backup'),
      mode    => '0755',
    }
  }
- Include the cron class in your cookbook node in site.pp:
  node cookbook {
    include cron
  }
- Create a template to hold the cron task:
  #!/bin/bash
  runhour=<%= @hostname.sum%24 %>
  hour=$(date +%H)
  if [ "$runhour" -ne "$hour" ]; then
    exit 0
  fi
  echo run-backup
- Then, run Puppet:
  [root@cookbook ~]# puppet agent -t
  Info: Caching catalog for cookbook.example.com
  Info: Applying configuration version '1413732254'
  Notice: /Stage[main]/Cron/File[/etc/cron.hourly/run-backup]/ensure: defined content as '{md5}5e50a7b586ce774df23301ee72904dda'
  Notice: Finished catalog run in 0.11 seconds
- Verify that the script has the same value we calculated before, 15:
  #!/bin/bash
  runhour=15
  hour=$(date +%H)
  if [ "$runhour" -ne "$hour" ]; then
    exit 0
  fi
  echo run-backup
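The rendered script from the last step can be reproduced without a Puppet run by feeding the same template text to Ruby's standard ERB library. This is only a sketch: the FakeScope class is an invented stand-in for the scope Puppet provides to templates, where facts such as hostname appear as instance variables.

```ruby
require 'erb'

# Same body as the cron/run-backup template from the steps above.
TEMPLATE = <<~'EOT'
  #!/bin/bash
  runhour=<%= @hostname.sum%24 %>
  hour=$(date +%H)
  if [ "$runhour" -ne "$hour" ]; then
    exit 0
  fi
  echo run-backup
EOT

# Hypothetical stand-in for Puppet's template scope: it only sets the
# @hostname instance variable that the template reads.
class FakeScope
  def initialize(hostname)
    @hostname = hostname
  end

  # Render the template with this object's instance variables in scope.
  def render(template)
    ERB.new(template).result(binding)
  end
end

puts FakeScope.new('cookbook').render(TEMPLATE)
```

For the hostname cookbook, the second line of the output reads runhour=15, matching the file Puppet wrote.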
Now, this job will run every hour, but the rest of the script will run only when the hour returned by $(date +%H) is equal to 15. Creating your cron jobs as file resources in a large organization makes it easier for your fellow administrators to find them. When you have a very large number of machines, it can be advantageous to add another random wait at the beginning of your job. To do so, insert the following lines before echo run-backup:
MAXWAIT=600
sleep $((RANDOM%MAXWAIT))
This will sleep for a random time of just under 600 seconds at most (0 to 599 seconds), and the amount will differ each time it runs (assuming your random number generator is working). This sort of random wait is useful when you have thousands of machines, all running the same task, and you need to stagger the runs as much as possible.
See also
- The Running Puppet from cron recipe in Chapter 2, Puppet Infrastructure
Scheduling when resources are applied
So far, we looked at what Puppet can do, and the order that it does things in, but not when it does them. One way to control this is to use the schedule metaparameter. When you need to limit the number of times a resource is applied within a specified period, schedule can help. For example:
exec { "/usr/bin/apt-get update":
schedule => daily,
}
The most important thing to understand about schedule is that it can only stop a resource being applied. It doesn't guarantee that the resource will be applied with a certain frequency. For example, the exec resource shown in the preceding code snippet has schedule => daily, but this just represents an upper limit on the number of times the exec resource can run per day. It won't be applied more than once a day. If you don't run Puppet at all, the resource won't be applied at all. Using the hourly schedule, for instance, is meaningless on a machine configured to run the agent every 4 hours (via the runinterval configuration setting).
That being said, schedule is best used to restrict resources from running when they shouldn't, or don't need to; for example, you might want to make sure that apt-get update isn't run more than once an hour. There are some built-in schedules available for you to use:
- hourly
- daily
- weekly
- monthly
- never
However, you can modify these and create your own custom schedules, using the schedule resource. We'll see how to do this in the following example. Let's say we want to make sure that an exec resource representing a maintenance job won't run during office hours, when it might interfere with production.
How to do it...
In this example, we'll create a custom schedule resource and assign this to the resource:
- Modify your site.pp file as follows:
  schedule { 'outside-office-hours':
    period => daily,
    range  => ['17:00-23:59','00:00-09:00'],
    repeat => 1,
  }
  node 'cookbook' {
    notify { 'Doing some maintenance':
      schedule => 'outside-office-hours',
    }
  }
- Run Puppet. What you'll see will depend on the time of the day. If it's currently outside the office hours period you defined, Puppet will apply the resource as follows:
  [root@cookbook ~]# date
  Fri Jan 2 23:59:01 PST 2015
  [root@cookbook ~]# puppet agent -t
  Info: Caching catalog for cookbook.example.com
  Info: Applying configuration version '1413734477'
  Notice: Doing some maintenance
  Notice: /Stage[main]/Main/Node[cookbook]/Notify[Doing some maintenance]/message: defined 'message' as 'Doing some maintenance'
  Notice: Finished catalog run in 0.07 seconds
- If the time is within the office hours period, Puppet will do nothing:
  [root@cookbook ~]# date
  Fri Jan 2 09:59:01 PST 2015
  [root@cookbook ~]# puppet agent -t
  Info: Caching catalog for cookbook.example.com
  Info: Applying configuration version '1413734289'
  Notice: Finished catalog run in 0.09 seconds
How it works...
A schedule consists of three bits of information:
- The period (hourly, daily, weekly, or monthly)
- The range (defaults to the whole period, but can be a smaller part of it)
- The repeat count (how often the resource is allowed to be applied within the range; the default is 1 or once per period)
Our custom schedule named outside-office-hours supplies these three parameters:
schedule { 'outside-office-hours':
period => daily,
range => ['17:00-23:59','00:00-09:00'],
repeat => 1,
}
The period is daily, and range is defined as an array of two time intervals:
17:00-23:59
00:00-09:00
The schedule named outside-office-hours is now available for us to use with any resource, just as though it were built into Puppet, like the daily or hourly schedules. In our example, we assign this schedule to the notify resource using the schedule metaparameter:
notify { 'Doing some maintenance':
schedule => 'outside-office-hours',
}
Without this schedule parameter, the resource would be applied every time Puppet runs. With it, Puppet will check the following conditions to decide whether or not to apply the resource:
- Whether the time is in the permitted range
- Whether the resource has already been run the maximum permitted number of times in this period
For example, let's consider what happens if Puppet runs at 4 p.m., 5 p.m., and 6 p.m. on a given day:
- 4 p.m.: It's outside the permitted time range, so Puppet will do nothing
- 5 p.m.: It's inside the permitted time range, and the resource hasn't been run yet in this period, so Puppet will apply the resource
- 6 p.m.: It's inside the permitted time range, but the resource has already been run the maximum number of times in this period, so Puppet will do nothing
And so on until the next day.
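The two checks and the 4 p.m., 5 p.m., 6 p.m. walkthrough can be modelled in a few lines of Ruby. This is a deliberately simplified sketch of the decision described above, not Puppet's actual implementation; the RANGES and REPEAT constants mirror the outside-office-hours schedule, and the helper names are invented.

```ruby
# Simplified model only: mirrors the outside-office-hours schedule.
RANGES = [['17:00', '23:59'], ['00:00', '09:00']].freeze
REPEAT = 1

# Zero-padded HH:MM strings sort the same way as the times they represent,
# so plain string comparison is enough here.
def in_range?(hhmm)
  RANGES.any? { |from, to| hhmm >= from && hhmm <= to }
end

# For a list of Puppet run times within one period, report what happens
# to the scheduled resource on each run.
def simulate(run_times)
  applied = 0
  run_times.map do |hhmm|
    if !in_range?(hhmm)
      [hhmm, 'skipped: outside range']
    elsif applied >= REPEAT
      [hhmm, 'skipped: already applied this period']
    else
      applied += 1
      [hhmm, 'applied']
    end
  end
end

simulate(%w[16:00 17:00 18:00]).each { |time, result| puts "#{time} #{result}" }
```

The three runs come out as skipped, applied, and skipped, in that order, matching the walkthrough.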
There's more...
The repeat parameter governs how many times the resource will be applied given the other constraints of the schedule. For example, to apply a resource no more than six times an hour, use a schedule as follows:
period => hourly,
repeat => 6,
Remember that this won't guarantee that the job is run six times an hour. It just sets an upper limit; no matter how often Puppet runs or anything else happens, the job won't be run if it has already run six times this hour. If Puppet only runs once a day, the job will just be run once. So schedule is best used to make sure things don't happen at certain times (or don't exceed a given frequency).
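The upper-limit behaviour comes down to taking the smaller of two numbers. A tiny Ruby sketch, with max_applications as an invented helper name and the run counts chosen purely as examples:

```ruby
# A resource can never be applied more often than Puppet runs, and never
# more often than the schedule's repeat count allows.
def max_applications(puppet_runs_in_period, repeat)
  [puppet_runs_in_period, repeat].min
end

puts max_applications(2, 6)   # Puppet ran twice this hour, repeat => 6
puts max_applications(12, 6)  # Puppet ran twelve times, repeat => 6
```

The first call prints 2 and the second prints 6: the schedule caps the count but cannot raise it.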
Using host resources
It's not always practical or convenient to use DNS to map your machine names to IP addresses, especially in cloud infrastructures, where those addresses may change all the time. However, if you use entries in the /etc/hosts file instead, you then have the problem of how to distribute these entries to all machines and keep them up to date.
Here's a better way to do it: Puppet's host resource type controls a single /etc/hosts entry, and you can use this to map a hostname to an IP address easily across your whole network. For example, if all your machines need to know the address of the main database server, you can manage it with a host resource.
How to do it...
Follow these steps to create an example host resource:
- Modify your site.pp file as follows:
    node 'cookbook' {
      host { 'packtpub.com':
        ensure => present,
        ip     => '83.166.169.231',
      }
    }
- Run Puppet:
    [root@cookbook ~]# puppet agent -t
    Info: Caching catalog for cookbook.example.com
    Info: Applying configuration version '1413781153'
    Notice: /Stage[main]/Main/Node[cookbook]/Host[packtpub.com]/ensure: created
    Info: Computing checksum on file /etc/hosts
    Notice: Finished catalog run in 0.12 seconds
How it works...
Puppet will check the target file (usually /etc/hosts) to see whether the host entry already exists, and if not, add it. If an entry for that hostname already exists with a different address, Puppet will change the address to match the manifest.
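The target file and any aliases can also be stated explicitly. The following is a minimal sketch, assuming a hypothetical host db1.example.com and an address from the documentation range; target and host_aliases are standard attributes of the host type.

```puppet
# Hypothetical entry; the hostname and IP address are placeholders.
host { 'db1.example.com':
  ensure       => present,
  ip           => '192.0.2.10',
  host_aliases => ['db1'],       # written after the hostname in the entry
  target       => '/etc/hosts',  # the default on most platforms, shown for clarity
}
```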
There's more...
Organizing your host resources into classes can be helpful. For example, you could put the host resources for all your DB servers into one class called admin::dbhosts, which is included by all web servers.
Where machines may need to be defined in multiple classes (for example, a database server might also be a repository server), virtual resources can solve this problem. For example, you could define all your hosts as virtual in a single class:
  class admin::allhosts {
    @host { 'db1.packtpub.com':
      tag => 'database'
      ...
    }
  }
You could then realize the hosts you need in the various classes:
  class admin::dbhosts {
    Host <| tag == 'database' |>
  }
  class admin::webhosts {
    Host <| tag == 'web' |>
  }
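Put together, the pattern might look like the following sketch. The web1.packtpub.com host and both IP addresses are invented for illustration, and the include lines are an assumption about how you would make sure the virtual resources are in the catalog before collecting them.

```puppet
class admin::allhosts {
  # Virtual resources: declared here, but not applied until realized.
  @host { 'db1.packtpub.com':
    ip  => '192.0.2.11',  # placeholder address
    tag => 'database',
  }
  @host { 'web1.packtpub.com':
    ip  => '192.0.2.21',  # placeholder address
    tag => 'web',
  }
}

class admin::dbhosts {
  include admin::allhosts
  # Realize only the hosts tagged 'database'.
  Host <| tag == 'database' |>
}

class admin::webhosts {
  include admin::allhosts
  # Realize only the hosts tagged 'web'.
  Host <| tag == 'web' |>
}
```

A node that includes both admin::dbhosts and admin::webhosts gets each host entry once, because realizing a virtual resource more than once is harmless.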
Using exported host resources
In the previous example, we used the spaceship syntax to collect virtual host resources for hosts of type database or type web. You can use the same trick with exported resources. The advantage to using exported resources is that as you add more database servers, the collector syntax will automatically pull in the newly created exported host entries for those servers. This makes your /etc/hosts entries more dynamic.
Getting ready
We will be using exported resources. If you haven't already done so, set up PuppetDB and enable storeconfigs to use PuppetDB, as outlined in Chapter 2, Puppet Infrastructure.
How to do it...
In this example, we will configure database servers and clients to communicate with each other. We'll make use of exported resources to do the configuration.
- Create a new database module, db:
    t@mylaptop ~/puppet/modules $ mkdir -p db/manifests
- Create a new class for your database servers, db::server:
    class db::server {
      @@host {"$::fqdn":
        host_aliases => $::hostname,
        ip           => $::ipaddress,
        tag          => 'db::server',
      }
      # rest of db class
    }
- Create a new class for your database clients:
    class db::client {
      Host <<| tag == 'db::server' |>>
    }
- Apply the database server module to some nodes, in site.pp, for example:
    node 'dbserver1.example.com' {
      class {'db::server': }
    }
    node 'dbserver2.example.com' {
      class {'db::server': }
    }
- Run Puppet on the nodes with the database server module to create the exported resources.
- Apply the database client module to cookbook:
    node 'cookbook' {
      class {'db::client': }
    }
- Run Puppet:
    [root@cookbook ~]# puppet agent -t
    Info: Caching catalog for cookbook.example.com
    Info: Applying configuration version '1413782501'
    Notice: /Stage[main]/Db::Client/Host[dbserver2.example.com]/ensure: created
    Info: Computing checksum on file /etc/hosts
    Notice: /Stage[main]/Db::Client/Host[dbserver1.example.com]/ensure: created
    Notice: Finished catalog run in 0.10 seconds
- Verify the host entries in /etc/hosts:
    [root@cookbook ~]# cat /etc/hosts
    # HEADER: This file was autogenerated at Mon Oct 20 01:21:42 -0400 2014
    # HEADER: by puppet. While it can still be managed manually, it
    # HEADER: is definitely not recommended.
    127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
    ::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
    83.166.169.231 packtpub.com
    192.168.122.150 dbserver2.example.com dbserver2
    192.168.122.151 dbserver1.example.com dbserver1
How it works...
In the db::server class, we create an exported host resource:
  @@host {"$::fqdn":
    host_aliases => $::hostname,
    ip           => $::ipaddress,
    tag          => 'db::server',
  }
This resource uses the fully qualified domain name ($::fqdn) of the node on which it is applied. We also use the short hostname ($::hostname) as an alias of the node. Aliases are printed after fqdn in /etc/hosts. We use the node's $::ipaddress fact as the IP address for the host entry. Finally, we add a tag to the resource so that we can collect based on that tag later.
The important thing to remember here is that if the IP address should change for the host, the exported resource will be updated, and nodes that collect the exported resource will update their host records accordingly.
We created a collector in db::client, which only collects exported host resources that have been tagged with 'db::server':
  Host <<| tag == 'db::server' |>>
We applied the db::server class for a couple of nodes, dbserver1 and dbserver2, which we then collected on cookbook by applying the db::client class. The host entries were placed in /etc/hosts (the default file). We can see that the host entry contains both the fqdn and the short hostname for dbserver1 and dbserver2.
There's more...
Using exported resources in this manner is very useful. Another similar system would be to create an NFS server class, which creates exported resources for the mount points that it exports (via NFS). You can then use tags to have clients collect the appropriate mount points from the server. In the previous example, we made use of a tag to aid in our collection of exported resources. It is worth noting that there are several tags automatically added to resources when they are created, one of which is the scope where the resource was created.
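The NFS idea could be sketched roughly as follows. Everything named here is hypothetical: the nfs module, the /srv/data export path, and the /mnt/<hostname> mount point are assumptions made for illustration, not part of the recipe above. The sketch uses ensure => present, which only records the entry in the fstab file, so the mount point does not have to exist at the time Puppet runs.

```puppet
class nfs::server {
  # Exported mount resource; the title includes the hostname so that
  # several servers can export without their titles colliding.
  @@mount { "/mnt/${::hostname}":
    ensure  => present,                 # manage the fstab entry only
    device  => "${::fqdn}:/srv/data",   # hypothetical export path
    fstype  => 'nfs',
    options => 'defaults',
    tag     => 'nfs::server',
  }
}

class nfs::client {
  # Collect every mount exported by nodes that apply nfs::server.
  Mount <<| tag == 'nfs::server' |>>
}
```

Because the class name is one of the tags Puppet adds automatically, the explicit tag here is strictly redundant; it is kept to make the collector's search expression easy to follow.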
Using multiple file sources
A neat feature of Puppet's file resource is that you can specify multiple values for the source parameter. Puppet will search them in order. If the first source isn't found, it moves on to the next, and so on. You can use this to specify a default substitute if the particular file isn't present, or even a series of increasingly generic substitutes.
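A series of increasingly generic substitutes might look like the following sketch. The motd module and its file names are hypothetical; the $::fqdn and $::osfamily facts are used only as examples of values that go from specific to general.

```puppet
file { '/etc/motd':
  source => [
    # Most specific first: a file for this exact node, if one exists.
    "puppet:///modules/motd/motd.${::fqdn}",
    # Next: a file shared by every node in the same OS family.
    "puppet:///modules/motd/motd.${::osfamily}",
    # Last resort: the default that should always be present.
    'puppet:///modules/motd/motd.default',
  ],
}
```

If none of the listed sources exists, the resource fails, so it is worth making sure the last entry is always there.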
How to do it...
This example demonstrates using multiple file sources:
- Create a new greeting module as follows:
    class greeting {
      file { '/tmp/greeting':
        source => [
          'puppet:///modules/greeting/hello.txt',
          'puppet:///modules/greeting/universal.txt',
        ],
      }
    }
- Create the file modules/greeting/files/hello.txt with the following contents:
    Hello, world.
- Create the file modules/greeting/files/universal.txt with the following contents:
    Bah-weep-Graaaaagnah wheep ni ni bong
- Add the class to a node:
    node 'cookbook' {
      class {'greeting': }
    }
- Run Puppet:
    [root@cookbook ~]# puppet agent -t
    Info: Caching catalog for cookbook.example.com
    Info: Applying configuration version '1413784347'
    Notice: /Stage[main]/Greeting/File[/tmp/greeting]/ensure: defined content as '{md5}54098b367d2e87b078671fad4afb9dbb'
    Notice: Finished catalog run in 0.43 seconds
- Check the contents of the /tmp/greeting file:
    [root@cookbook ~]# cat /tmp/greeting
    Hello, world.
- Now remove the hello.txt file from your Puppet repository and rerun the agent:
    [root@cookbook ~]# puppet agent -t
    Info: Caching catalog for cookbook.example.com
    Info: Applying configuration version '1413784939'
    Notice: /Stage[main]/Greeting/File[/tmp/greeting]/content:
    --- /tmp/greeting 2014-10-20 01:52:28.117999991 -0400
    +++ /tmp/puppet-file20141020-4960-1o9g344-0 2014-10-20 02:02:20.695999979 -0400
    @@ -1 +1 @@
    -Hello, world.
    +Bah-weep-Graaaaagnah wheep ni ni bong
    Info: Computing checksum on file /tmp/greeting
    Info: /Stage[main]/Greeting/File[/tmp/greeting]: Filebucketed /tmp/greeting to puppet with sum 54098b367d2e87b078671fad4afb9dbb
    Notice: /Stage[main]/Greeting/File[/tmp/greeting]/content: content changed '{md5}54098b367d2e87b078671fad4afb9dbb' to '{md5}933c7f04d501b45456e830de299b5521'
    Notice: Finished catalog run in 0.77 seconds
How it works...
On the first Puppet run, Puppet searches for the available file sources in the order given:
  source => [
    'puppet:///modules/greeting/hello.txt',
    'puppet:///modules/greeting/universal.txt'
  ],
The file hello.txt is first in the list, and is present, so Puppet uses that as the source for /tmp/greeting:
  Hello, world.
On the second Puppet run, hello.txt is missing, so Puppet goes on to look for the next file, universal.txt. This is present, so it becomes the source for /tmp/greeting:
  Bah-weep-Graaaaagnah wheep ni ni bong
There's more...
You can use this trick anywhere you have a file resource. A common example is a service that is deployed on all nodes, such as rsyslog. The rsyslog configuration is the same on every host except for the rsyslog server. Create an rsyslog class with a file resource for the rsyslog configuration file:
  class rsyslog {
    file { '/etc/rsyslog.conf':
      source => [
        "puppet:///modules/rsyslog/rsyslog.conf.${::hostname}",
        'puppet:///modules/rsyslog/rsyslog.conf',
      ],
    }
  }
Then, put the default configuration in rsyslog.conf. For your rsyslog server, logger, create an rsyslog.conf.logger file. On the machine logger, rsyslog.conf.logger will be used before rsyslog.conf because it is listed first in the array of sources.
See also
- The Passing parameters to classes recipe in Chapter 3, Writing Better Manifests
Distributing and merging directory trees
As we saw in the previous chapter, the file resource has a recurse parameter, which allows Puppet to transfer entire directory trees. We used this parameter to copy an admin user's dotfiles into their home directory. In this section, we'll show how to use recurse and another parameter, sourceselect, to extend our previous example.
How to do it...
Modify our admin user example as follows:
- Remove the $dotfiles parameter, remove the condition based on $dotfiles. Add a second source to the home directory file resource:
    define admin_user ($key, $keytype) {
      $username = $name
      user { $username:
        ensure => present,
      }
      file { "/home/${username}/.ssh":
        ensure  => directory,
        mode    => '0700',
        owner   => $username,
        group   => $username,
        require => File["/home/${username}"],
      }
      ssh_authorized_key { "${username}_key":
        key     => $key,
        type    => "$keytype",
        user    => $username,
        require => File["/home/${username}/.ssh"],
      }
      # copy in all the files in the subdirectory
      file { "/home/${username}":
        recurse      => true,
        mode         => '0700',
        owner        => $username,
        group        => $username,
        source       => [
          "puppet:///modules/admin_user/${username}",
          'puppet:///modules/admin_user/base',
        ],
        sourceselect => 'all',
        require      => User["$username"],
      }
    }
- Create a base directory and copy all the system default files from /etc/skel:
    t@mylaptop ~/puppet/modules/admin_user/files $ cp -a /etc/skel base
- Create a new admin_user resource, one that will not have a directory defined:
    node 'cookbook' {
      admin_user {'steven':
        key     => 'AAAAB3N...',
        keytype => 'dsa',
      }
    }
- Run Puppet:
    [root@cookbook ~]# puppet agent -t
    Info: Caching catalog for cookbook.example.com
    Info: Applying configuration version '1413787159'
    Notice: /Stage[main]/Main/Node[cookbook]/Admin_user[steven]/User[steven]/ensure: created
    Notice: /Stage[main]/Main/Node[cookbook]/Admin_user[steven]/File[/home/steven]/ensure: created
    Notice: /Stage[main]/Main/Node[cookbook]/Admin_user[steven]/File[/home/steven/.bash_logout]/ensure: defined content as '{md5}6a5bc1cc5f80a48b540bc09d082b5855'
    Notice: /Stage[main]/Main/Node[cookbook]/Admin_user[steven]/File[/home/steven/.emacs]/ensure: defined content as '{md5}de7ee35f4058681a834a99b5d1b048b3'
    Notice: /Stage[main]/Main/Node[cookbook]/Admin_user[steven]/File[/home/steven/.bashrc]/ensure: defined content as '{md5}2f8222b4f275c4f18e69c34f66d2631b'
    Notice: /Stage[main]/Main/Node[cookbook]/Admin_user[steven]/File[/home/steven/.bash_profile]/ensure: defined content as '{md5}f939eb71a81a9da364410b799e817202'
    Notice: /Stage[main]/Main/Node[cookbook]/Admin_user[steven]/File[/home/steven/.ssh]/ensure: created
    Notice: /Stage[main]/Main/Node[cookbook]/Admin_user[steven]/Ssh_authorized_key[steven_key]/ensure: created
    Notice: Finished catalog run in 1.11 seconds
How it works...
If a file resource has the recurse parameter set on it, and it is a directory, Puppet will deploy not only the directory itself, but all its contents (including subdirectories and their contents). As we saw in the previous example, when a file has more than one source, the first source file found is used to satisfy the request. This applies to directories as well.
There's more...
By specifying the parameter sourceselect as 'all', the contents of all the source directories will be combined. For example, add thomas admin_user back into your node definition in site.pp for cookbook:
  admin_user {'thomas':
    key     => 'ABBA...',
    keytype => 'rsa',
  }
Now run Puppet again on cookbook:
[root@cookbook thomas]# puppet agent -t
Info: Caching catalog for cookbook.example.com
Info: Applying configuration version '1413787770'
Notice: /Stage[main]/Main/Node[cookbook]/Admin_user[thomas]/File[/home/thomas/.bash_profile]/content: content changed '{md5}3e8337f44f84b298a8a99869ae8ca76a' to '{md5}f939eb71a81a9da364410b799e817202'
Notice: /Stage[main]/Main/Node[cookbook]/Admin_user[thomas]/File[/home/thomas/.bash_profile]/group: group changed 'root' to 'thomas'
Notice: /Stage[main]/Main/Node[cookbook]/Admin_user[thomas]/File[/home/thomas/.bash_profile]/mode: mode changed '0644' to '0700'
Notice: /File[/home/thomas/.bash_profile]/seluser: seluser changed 'system_u' to 'unconfined_u'
Notice: /Stage[main]/Main/Node[cookbook]/Admin_user[thomas]/File[/home/thomas/.bash_logout]/ensure: defined content as '{md5}6a5bc1cc5f80a48b540bc09d082b5855'
Notice: /Stage[main]/Main/Node[cookbook]/Admin_user[thomas]/File[/home/thomas/.bashrc]/content: content changed '{md5}db2a20b2b9cdf36cca1ca4672622ddd2' to '{md5}033c3484e4b276e0641becc3aa268a3a'
Notice: /Stage[main]/Main/Node[cookbook]/Admin_user[thomas]/File[/home/thomas/.bashrc]/group: group changed 'root' to 'thomas'
Notice: /Stage[main]/Main/Node[cookbook]/Admin_user[thomas]/File[/home/thomas/.bashrc]/mode: mode changed '0644' to '0700'
Notice: /File[/home/thomas/.bashrc]/seluser: seluser changed 'system_u' to 'unconfined_u'
Notice: /Stage[main]/Main/Node[cookbook]/Admin_user[thomas]/File[/home/thomas/.emacs]/ensure: defined content as '{md5}de7ee35f4058681a834a99b5d1b048b3'
Notice: Finished catalog run in 0.86 seconds
Because we previously applied the thomas admin_user to cookbook, the user existed. The two files defined in the thomas directory on the Puppet server were already in the home directory, so only the additional files, .bash_logout, .bash_profile, and .emacs were created. Using these two parameters together, you can have default files that can be overridden easily.
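The same default-plus-override idea can be sketched outside the admin_user define. The myapp module and both source directories below are hypothetical. As far as the file type documentation describes it, when sourceselect is all and a file exists in more than one source, the version from the earliest source in the list is used.

```puppet
file { '/etc/myapp':
  ensure       => directory,
  recurse      => true,
  source       => [
    # Per-host overrides; listed first so they win on a name clash.
    "puppet:///modules/myapp/conf.${::hostname}",
    # Defaults shipped to every node.
    'puppet:///modules/myapp/conf.default',
  ],
  # Merge the contents of every source that exists, not just the first.
  sourceselect => 'all',
}
```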
Sometimes you want to deploy files to an existing directory but remove any files which aren't managed by Puppet. A good example would be if you are using mcollective in your environment. The directory holding client credentials should only have certificates that come from Puppet.
The purge parameter will do this for you. Define the directory as a resource in Puppet:
  file { '/etc/mcollective/ssl/clients':
    purge   => true,
    recurse => true,
  }
The combination of recurse and purge will remove all files and subdirectories in /etc/mcollective/ssl/clients that are not deployed by Puppet. You can then deploy your own files to that location by placing them in the appropriate directory on the Puppet server.
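To deploy files from the Puppet server into the purged directory, the resource also needs a source. A minimal sketch, assuming a hypothetical mcollective module that carries the certificates in its files/clients directory:

```puppet
file { '/etc/mcollective/ssl/clients':
  ensure  => directory,
  # Hypothetical location of the managed certificates on the Puppet server.
  source  => 'puppet:///modules/mcollective/clients',
  recurse => true,  # copy the whole directory tree
  purge   => true,  # remove anything in the target that is not in the source
}
```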
If there are subdirectories that contain files you don't want to purge, just define the subdirectory as a Puppet resource, and it will be left alone:
  file { '/etc/mcollective/ssl/clients':
    purge   => true,
    recurse => true,
  }
  file { '/etc/mcollective/ssl/clients/local':
    ensure => directory,
  }
Note
Be aware that, at least in current implementations of Puppet, recursive file copies can be quite slow and place a heavy memory load on the server. If the data doesn't change very often, it might be better to deploy and unpack a tar file instead. This can be done with a file resource for the tar file and an exec that requires the file resource and unpacks the archive. Recursive directories are less of a problem when filled with small files. Puppet is not a very efficient file server, so creating large tar files and distributing them with Puppet is not a good idea either. If you need to copy large files around, using the operating system's package manager is a better solution.
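The tar approach described in the note could look something like the following. This is a minimal sketch for a small archive, not a definitive implementation: the module name (myapp), the archive name, and the target paths are all hypothetical, and the archive is assumed to unpack into a single top-level myapp directory.

```puppet
# Sketch only: the myapp module, archive name, and paths are assumptions.
file { '/opt/myapp.tar.gz':
  ensure => file,
  owner  => 'root',
  group  => 'root',
  mode   => '0644',
  source => 'puppet:///modules/myapp/myapp.tar.gz',
}

# Unpack the archive once the file resource has been applied.
# The creates parameter stops the exec from running again
# when /opt/myapp already exists.
exec { 'unpack-myapp':
  command => '/bin/tar -xzf /opt/myapp.tar.gz -C /opt',
  creates => '/opt/myapp',
  require => File['/opt/myapp.tar.gz'],
}
```

Because of the creates guard, a changed archive will not be unpacked again on later runs. If you need that behavior, one option is to replace creates with refreshonly => true and have the exec subscribe to the file resource.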
Cleaning up old files
Puppet's tidy resource will help you clean up old or out-of-date files, reducing disk usage. For example, if you have Puppet reporting enabled as described in the section on generating reports, you might want to regularly delete old report files.
How to do it...
Let's get started.
- Modify your site.pp file as follows:
node 'cookbook' {
  tidy { '/var/lib/puppet/reports':
    age     => '1w',
    recurse => true,
  }
}
- Run Puppet:
[root@cookbook clients]# puppet agent -t
Info: Caching catalog for cookbook.example.com
Notice: /Stage[main]/Main/Node[cookbook]/File[/var/lib/puppet/reports/cookbook.example.com/201409090637.yaml]/ensure: removed
Notice: /Stage[main]/Main/Node[cookbook]/File[/var/lib/puppet/reports/cookbook.example.com/201409100556.yaml]/ensure: removed
Notice: /Stage[main]/Main/Node[cookbook]/File[/var/lib/puppet/reports/cookbook.example.com/201409090631.yaml]/ensure: removed
Notice: /Stage[main]/Main/Node[cookbook]/File[/var/lib/puppet/reports/cookbook.example.com/201408210557.yaml]/ensure: removed
Notice: /Stage[main]/Main/Node[cookbook]/File[/var/lib/puppet/reports/cookbook.example.com/201409080557.yaml]/ensure: removed
Notice: /Stage[main]/Main/Node[cookbook]/File[/var/lib/puppet/reports/cookbook.example.com/201409100558.yaml]/ensure: removed
Notice: /Stage[main]/Main/Node[cookbook]/File[/var/lib/puppet/reports/cookbook.example.com/201408210546.yaml]/ensure: removed
Notice: /Stage[main]/Main/Node[cookbook]/File[/var/lib/puppet/reports/cookbook.example.com/201408210539.yaml]/ensure: removed
Notice: Finished catalog run in 0.80 seconds
How it works...
Puppet searches the specified path for any files matching the age parameter; in this case, 1w (one week). It also searches subdirectories (recurse => true).
Any files matching your criteria will be deleted.
There's more...
You can specify file ages in seconds, minutes, hours, days, or weeks by using a single character to specify the time unit, as follows:
60s
180m
24h
30d
4w
You can specify that files greater than a given size should be removed, as follows:
size => '100m',
This removes files of 100 megabytes and over. For kilobytes, use k, and for bytes, use b.
Note
Note that if you specify both age and size parameters, they are treated as independent criteria. For example, if you specify the following, Puppet will remove all files that are either at least one day old, or at least 512 KB in size:
age => "1d",
size => "512k",
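Putting these parameters together, a complete tidy resource might look like the following sketch. The /var/log/myapp path is hypothetical, and the matches parameter (a list of glob patterns checked against each file's basename, relevant only when recurse is enabled) is not used elsewhere in this recipe, so treat it as an optional extra rather than part of the recipe itself.

```puppet
# Sketch only: /var/log/myapp is a hypothetical directory.
tidy { '/var/log/myapp':
  age     => '1d',       # remove files at least one day old...
  size    => '512k',     # ...or at least 512 KB in size
  matches => ['*.log'],  # only consider files whose names end in .log
  recurse => true,       # descend into subdirectories
}
```

Since age and size are independent criteria, a small but old log file and a large but recent one would both be removed here, provided their names match the pattern.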
Auditing resources
Dry run mode, using the --noop switch, is a simple way to audit any changes to a machine under Puppet's control. However, Puppet also has a dedicated audit feature, which can report changes to resources or specific attributes.
How to do it...
Here's an example showing Puppet's auditing capabilities:
- Modify your site.pp file as follows:
node 'cookbook' {
  file { '/etc/passwd':
    audit => [ owner, mode ],
  }
}
- Run Puppet:
[root@cookbook clients]# puppet agent -t
Info: Caching catalog for cookbook.example.com
Info: Applying configuration version '1413789080'
Notice: /Stage[main]/Main/Node[cookbook]/File[/etc/passwd]/owner: audit change: newly-recorded value 0
Notice: /Stage[main]/Main/Node[cookbook]/File[/etc/passwd]/mode: audit change: newly-recorded value 644
Notice: Finished catalog run in 0.55 seconds
How it works...
The audit metaparameter tells Puppet that you want to record and monitor certain things about the resource. The value can be a list of the parameters that you want to audit.
In this case, when Puppet runs, it will now record the owner and mode of the /etc/passwd file. In future runs, Puppet will spot whether either of these has changed. For example, if you run:
[root@cookbook ~]# chmod 666 /etc/passwd
Puppet will pick up this change and log it on the next run:
Notice: /Stage[main]/Main/Node[cookbook]/File[/etc/passwd]/mode: audit change: previously recorded value 0644 has been changed to 0666
There's more...
This feature is very useful to audit large networks for any changes to machines, either malicious or accidental. It's also very handy to keep an eye on things that aren't managed by Puppet, for example, application code on production servers. You can read more about Puppet's auditing capability here:
http://puppetlabs.com/blog/all-about-auditing-with-puppet/
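As an illustration of watching something that Puppet does not manage, the following sketch audits a single application file. The path is hypothetical, and the choice of attributes (content, owner, group, and mode) is an assumption about what you might want to track; because no desired values are set, Puppet only records and reports changes and does not alter the file.

```puppet
# Sketch only: /var/www/myapp/index.php is a hypothetical path.
# No content, owner, group, or mode values are declared, so Puppet
# records the current values and logs any later changes to them.
file { '/var/www/myapp/index.php':
  audit => [ content, owner, group, mode ],
}
```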
If you just want to audit everything about a resource, use all:
file { '/etc/passwd':
  audit => all,
}
See also
- The Noop - the don't change anything option recipe in Chapter 10, Monitoring, Reporting, and Troubleshooting
Temporarily disabling resources
Sometimes you want to disable a resource for the time being so that it doesn't interfere with other work. For example, you might want to tweak a configuration file on the server until you have the exact settings you want, before checking it into Puppet. You don't want Puppet to overwrite it with an old version in the meantime, so you can set the noop metaparameter on the resource:
noop => true,
How to do it...
This example shows you how to use the noop metaparameter:
- Modify your site.pp file as follows:
node 'cookbook' {
  file { '/etc/resolv.conf':
    content => "nameserver 127.0.0.1\n",
    noop    => true,
  }
}
- Run Puppet:
[root@cookbook ~]# puppet agent -t
Info: Caching catalog for cookbook.example.com
Info: Applying configuration version '1413789438'
Notice: /Stage[main]/Main/Node[cookbook]/File[/etc/resolv.conf]/content:
--- /etc/resolv.conf 2014-10-20 00:27:43.095999975 -0400
+++ /tmp/puppet-file20141020-8439-1lhuy1y-0 2014-10-20 03:17:18.969999979 -0400
@@ -1,3 +1 @@
-; generated by /sbin/dhclient-script
-search example.com
-nameserver 192.168.122.1
+nameserver 127.0.0.1
Notice: /Stage[main]/Main/Node[cookbook]/File[/etc/resolv.conf]/content: current_value {md5}4c0d192511df253826d302bc830a371b, should be {md5}949343428bded6a653a85910f6bdb48e (noop)
Notice: Node[cookbook]: Would have triggered 'refresh' from 1 events
Notice: Class[Main]: Would have triggered 'refresh' from 1 events
Notice: Stage[main]: Would have triggered 'refresh' from 1 events
Notice: Finished catalog run in 0.50 seconds
How it works...
The noop metaparameter is set to true, so for this particular resource, it's as if you had run Puppet with the --noop flag. Puppet noted that the resource would have been applied, but otherwise did nothing.
The nice thing about running the agent in test mode (-t) is that Puppet outputs a diff of what it would have done if noop were not set (you can tell Puppet to show the diffs without using -t by passing --show_diff; -t implies many different settings):
--- /etc/resolv.conf 2014-10-20 00:27:43.095999975 -0400
+++ /tmp/puppet-file20141020-8439-1lhuy1y-0 2014-10-20 03:17:18.969999979 -0400
@@ -1,3 +1 @@
-; generated by /sbin/dhclient-script
-search example.com
-nameserver 192.168.122.1
+nameserver 127.0.0.1
This can be very useful when debugging a template; you can work on your changes and then see what they would look like on the node without actually applying them. Using the diff, you can see whether your updated template produces the correct output.
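For the template case, the resource might look like the following sketch. The ntp module and its ntp.conf.erb template are hypothetical names chosen for illustration; the only part taken from this recipe is the noop metaparameter.

```puppet
# Sketch only: the ntp module and template path are assumptions.
# With noop set, Puppet renders the template and reports the diff
# against the existing file, but leaves /etc/ntp.conf untouched.
file { '/etc/ntp.conf':
  content => template('ntp/ntp.conf.erb'),
  noop    => true,
}
```

Once the diff shows the output you expect, remove the noop line so that Puppet applies the file on the next run.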