Connecting programs using pipes
In this recipe, we'll learn how to use pipes to connect programs. When we write our C programs, we should strive to make them easy to pipe together with other programs; that makes them much more useful. Programs that are connected with pipes are sometimes called filters, because we often connect programs with pipes in order to filter or transform data.
Getting ready
Just as in the previous recipe, it's recommended that we use the Bash shell.
How to do it…
Follow these steps to explore pipes in Linux:
- We are already familiar with wc and ls from the previous recipe. Here, we will use them together with a pipe to count the number of files and directories in the root directory of the system. The pipe is the vertical line symbol:
  $> ls / | wc -l
  29
- Let's make things a bit more interesting. This time, we want to list only symbolic links in the root directory (by using two programs with a pipe). The result will differ from system to system:
  $> ls -l / | grep lrwx
  lrwxrwxrwx   1 root root    31 okt 21 06:53 initrd.img -> boot/initrd.img-4.19.0-12-amd64
  lrwxrwxrwx   1 root root    31 okt 21 06:53 initrd.img.old -> boot/initrd.img-4.19.0-11-amd64
  lrwxrwxrwx   1 root root    28 okt 21 06:53 vmlinuz -> boot/vmlinuz-4.19.0-12-amd64
  lrwxrwxrwx   1 root root    28 okt 21 06:53 vmlinuz.old -> boot/vmlinuz-4.19.0-11-amd64
- Now, we only want the actual filenames, not the information about them. So, this time, we will add another program at the end called awk. In this example, we are telling awk to print the ninth field. Fields are separated by one or more whitespace characters:
  $> ls -l / | grep lrwx | awk '{ print $9 }'
  initrd.img
  initrd.img.old
  vmlinuz
  vmlinuz.old
- We can add another "filter", one that adds some text in front of every link. This can be accomplished using sed. The s in the sed expression means substitute; here, we tell sed to substitute the start of the line (^) with the text This is a link: :
  $> ls -l / | grep lrwx | awk '{ print $9 }' \
  > | sed 's/^/This is a link: /'
  This is a link: initrd.img
  This is a link: initrd.img.old
  This is a link: vmlinuz
  This is a link: vmlinuz.old
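Because the output of ls -l / differs from system to system, the same pipeline can also be tried against a scratch directory with known contents. The following is a minimal sketch under stated assumptions, not part of the original steps: the directory comes from mktemp -d, the names alpha, beta, and plain.txt are made up for illustration, and LC_ALL=C is set so that the date takes up three fields and the filename lands in field nine.

```shell
# Create a scratch directory with one regular file and two symbolic links.
# All names here are made up for illustration.
dir=$(mktemp -d)
touch "$dir/plain.txt"
ln -s plain.txt "$dir/alpha"
ln -s plain.txt "$dir/beta"

# Same pipeline as in the steps above, pointed at the scratch directory.
# LC_ALL=C keeps the ls -l date format at three fields (month, day, time).
links=$(LC_ALL=C ls -l "$dir" | grep lrwx | awk '{ print $9 }' \
    | sed 's/^/This is a link: /')
echo "$links"

# Clean up the scratch directory again.
rm -rf "$dir"
```

Only the two symbolic links should survive the grep stage; the regular file and the "total" line printed by ls -l do not contain lrwx.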
How it works…
A lot of things are going on here, but don't feel discouraged if you don't get it all. The point of this recipe is to demonstrate how to use a pipe (the vertical line symbol, |).
In the very first step, we counted the number of files and directories in the root of the filesystem using wc. When we run ls interactively, we get a nice-looking list that spans the width of our terminal. The output is also most likely color-coded. But when we send the output of ls through a pipe, ls doesn't have a real terminal to output to, so it falls back to printing one file or directory per line, without any colors. You can try this yourself if you like by running the following:
$> ls / | cat
Since ls is outputting one file or directory per line, we can count the number of lines with wc (the -l option).
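A shell script can make a similar decision about its own output. The sketch below uses the standard test -t 1, which succeeds only when file descriptor 1 (standard output) is a terminal. The function name report_stdout is hypothetical and exists only for this illustration; it is not how ls itself is implemented, just the same idea expressed in shell.

```shell
# report_stdout: say whether standard output is a terminal or something else.
# (Hypothetical helper, for illustration only.)
report_stdout() {
    if [ -t 1 ]; then
        echo "terminal"
    else
        echo "not a terminal"
    fi
}

# Called directly, the answer depends on where the script's output goes.
report_stdout

# Called on the left side of a pipe, standard output is the pipe.
piped=$(report_stdout | cat)
echo "through a pipe: $piped"
```

Run from an interactive shell, the direct call should report a terminal, while the piped call reports that it is not one. That is the same distinction that makes ls switch to one entry per line.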
In the next step (Step 2), we used grep to list only the links from the output of ls -l. In that output, a link is marked by the letter l at the start of the line. After that come the access rights, which for links are rwx for everyone. This is why we search for lrwx with grep.
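One thing to keep in mind is that grep lrwx matches the text anywhere on the line, not only at the start. The sketch below uses a made-up three-line listing (the entry named lrwx-notes is invented to provoke a false match) and compares the unanchored pattern with one that is anchored to the start of the line using ^.

```shell
# A made-up listing in ls -l style; lrwx-notes is a directory, not a link.
listing='lrwxrwxrwx 1 root root 7 jan  1 00:00 bin -> usr/bin
drwxr-xr-x 2 root root 4096 jan  1 00:00 lrwx-notes
-rw-r--r-- 1 root root 0 jan  1 00:00 file.txt'

# grep -c prints the number of matching lines.
loose=$(printf '%s\n' "$listing" | grep -c 'lrwx')    # matches anywhere
anchored=$(printf '%s\n' "$listing" | grep -c '^l')   # only a leading l
echo "unanchored: $loose, anchored: $anchored"
```

With this sample, the unanchored pattern also counts the directory whose name contains lrwx, while the anchored pattern counts only the real link. For the root directory in the recipe, both patterns give the same result.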
Then, we only wanted the actual filenames, so we added a program called awk. The awk tool lets us single out a particular column or field in the output. We singled out the ninth column ($9), which is the filename.
By running the output from ls through two other tools, we created a list of only the links in the root directory.
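To see how awk splits a line into fields, it can help to feed it a single line. The sketch below reuses one of the output lines from Step 2 and prints field 9, field 11, and the built-in NF variable, which holds the number of fields on the line. The variable names (line, name, target, fields) are chosen only for this example.

```shell
# One line of ls -l output, taken from Step 2 of this recipe.
line='lrwxrwxrwx   1 root root    28 okt 21 06:53 vmlinuz -> boot/vmlinuz-4.19.0-12-amd64'

# Runs of spaces count as a single separator, so the padding does not matter.
name=$(printf '%s\n' "$line" | awk '{ print $9 }')     # the link name
target=$(printf '%s\n' "$line" | awk '{ print $11 }')  # what it points to
fields=$(printf '%s\n' "$line" | awk '{ print NF }')   # number of fields
echo "$name points to $target ($fields fields)"
```

Note that this field counting assumes filenames without spaces; a name containing a space would be split over several fields.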
In Step 4, we added another tool, or filter, as it is sometimes called. This tool is sed, a stream editor. With this program, we can make changes to the text. In this case, we added the text This is a link: in front of every link. The following is a short explanation of the line:
sed 's/^/This is a link: /'
Here, s means "substitute"; that is, we wish to modify some text. Between the first two slashes (/) is the text or expression that matches what we want to modify. Here, that is the beginning of the line, ^. Then, between the second slash and the final slash, is the text that we want to replace the matched text with. Here, that is the text This is a link: (including the trailing space).
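The same s/pattern/replacement/ form works for other positions as well. As a small sketch, the commands below run the substitution from the recipe on two fixed input lines and then use $ (the end of the line) instead of ^ to append text. The suffix text (link) is made up for this example.

```shell
# Substitute the start of each line (^) with a prefix, as in the recipe.
prefixed=$(printf 'vmlinuz\nvmlinuz.old\n' | sed 's/^/This is a link: /')
echo "$prefixed"

# Substitute the end of each line ($) with a suffix instead.
suffixed=$(printf 'vmlinuz\nvmlinuz.old\n' | sed 's/$/ (link)/')
echo "$suffixed"
```

Neither ^ nor $ matches any actual characters, so nothing is removed from the line; the replacement text is simply inserted at that position.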
There's more…
Beware of unnecessary piping; it's easy to get caught up in endless piping. One silly but instructive example is this:
$> ls / | cat | grep tmp
tmp
We could leave out cat and still get the same result:
$> ls / | grep tmp
tmp
The same goes for this one (which I am guilty of myself from time to time):
$> cat /etc/passwd | grep root
root:x:0:0:root:/root:/bin/bash
There is no reason to use a pipe in the previous example at all. The grep utility can take a filename argument, like so:
$> grep root /etc/passwd
root:x:0:0:root:/root:/bin/bash
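The equivalence can be checked on a scratch file. The sketch below assumes a temporary file from mktemp filled with two made-up passwd-style entries (the alice line is invented), and compares the piped form, the filename argument, and plain input redirection with <, which also avoids the extra cat process.

```shell
# A small passwd-style file with made-up entries.
file=$(mktemp)
printf 'root:x:0:0:root:/root:/bin/bash\nalice:x:1000:1000::/home/alice:/bin/sh\n' > "$file"

with_cat=$(cat "$file" | grep root)   # extra process, same result
without_cat=$(grep root "$file")      # grep opens the file itself
redirected=$(grep root < "$file")     # the shell opens the file for grep
echo "$without_cat"

# Remove the scratch file again.
rm -f "$file"
```

All three variables should hold the same single line.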
See also
For anyone interested in the history of Unix and how far back pipes go, there is an exciting video from 1982 on YouTube, uploaded by AT&T: https://www.youtube.com/watch?v=tc4ROCJYbm0.