A process is an instance of a program in execution. A file is an object on the filesystem; besides a regular file with plain-text or binary content, it could also be a directory, a symbolic link, a device-special file, a named pipe, or a (Unix-domain) socket.
The Unix design philosophy abstracts peripheral devices (such as the keyboard, monitor, mouse, touchscreen, or a sensor) as files – what it calls device files. By doing this, Unix allows the application programmer to conveniently ignore the hardware details and just treat (peripheral) devices as though they are ordinary disk files.
The kernel provides a layer to handle this very abstraction – it's called the Virtual Filesystem Switch (VFS). So, with this in place, the application developer can open a device file and perform I/O (reads and writes) upon it, all using the usual file APIs (relax, these APIs will be covered in a subsequent chapter).
In fact, every process inherits three files on creation:
- Standard input (stdin: fd 0): The keyboard device, by default
- Standard output (stdout: fd 1): The monitor (or terminal) device, by default
- Standard error (stderr: fd 2): The monitor (or terminal) device, by default
fd is the common abbreviation, especially in code, for file descriptor; it's an integer value that refers to the open file in question.
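To make the three descriptors concrete, here is a minimal shell sketch. The function name report is purely illustrative (it is not a standard tool); it writes one line to fd 1 and one line to fd 2, so that each stream can be captured on its own.

```shell
#!/bin/sh
# report: hypothetical helper used only for this illustration.
# It writes one message to stdout (fd 1) and one to stderr (fd 2).
report() {
    echo "normal output"        # goes to fd 1 (stdout)
    echo "error output" >&2     # >&2 sends this line to fd 2 (stderr)
}

# Capture what arrives on stdout; stderr is discarded.
only_stdout=$(report 2>/dev/null)

# Capture what arrives on stderr: first point fd 2 at the capture channel,
# then point fd 1 at /dev/null (redirections are applied left to right).
only_stderr=$(report 2>&1 >/dev/null)

echo "fd 1 carried: $only_stdout"
echo "fd 2 carried: $only_stderr"
```

Although both messages would normally appear on the same terminal, they travel over different descriptors, which is why they can be separated like this.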
Also, note that we mention it's a certain device by default – this implies the defaults can be changed. Indeed, this is a key part of the design: changing standard input, output, or error channels is called redirection, and by using the familiar <, > and 2> shell operators, these file channels are redirected to other files or devices.
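As a quick sketch of the three operators working together, the following script redirects all three channels of a single sort invocation. The file names in.txt, out.txt, and err.txt are assumptions made up for this example, and everything happens in a scratch directory created with mktemp.

```shell
#!/bin/sh
# Work in a scratch directory so nothing outside it is touched.
workdir=$(mktemp -d)

# > redirects the stdout of printf into a file.
printf 'pear\napple\n' > "$workdir/in.txt"

# <  makes in.txt the standard input (fd 0) of sort
# >  sends sort's standard output (fd 1) to out.txt
# 2> sends anything sort writes to standard error (fd 2) to err.txt
sort < "$workdir/in.txt" > "$workdir/out.txt" 2> "$workdir/err.txt"

sorted=$(cat "$workdir/out.txt")
err_bytes=$(wc -c < "$workdir/err.txt" | tr -d ' ')

echo "$sorted"
echo "bytes written to stderr: $err_bytes"

rm -rf "$workdir"
```

Since sort has nothing to complain about here, err.txt is created but stays empty.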
On Unix, there exists a class of programs called filters.
A filter is a program that reads from its standard input, possibly modifies the input, and writes the filtered result to its standard output.
Filters on Unix are very common utilities, such as cat, wc, sort, grep, perl, head, and tail.
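A filter does not have to be a large program. The sketch below defines a tiny one as a shell function; the name number_lines is hypothetical, chosen only for this illustration. It reads lines from its standard input, prefixes each with a running count, and writes the result to its standard output. A pipe (|) is used here simply to feed it some input.

```shell
#!/bin/sh
# number_lines: hypothetical filter for illustration.
# The read builtin takes one line at a time from stdin (fd 0);
# printf writes the modified line to stdout (fd 1).
number_lines() {
    n=0
    while IFS= read -r line; do
        n=$((n + 1))
        printf '%d: %s\n' "$n" "$line"
    done
}

# The stdout of printf becomes the stdin of the filter.
result=$(printf 'pear\napple\n' | number_lines)
echo "$result"
```

Note that the function never names a file or a device; it only knows about its standard input and standard output, which is what lets the caller decide where the data comes from and where it goes.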
Filters allow Unix to easily sidestep design and code complexity. How?
Let's take the sort filter as a quick example. Okay, we'll need some data to sort. Let's say we run the following commands:
$ cat fruit.txt
orange
banana
apple
pear
grape
pineapple
lemon
cherry
papaya
mango
$
Now we consider four scenarios of using sort; based on the parameter(s) we pass, we are actually performing explicit or implicit input-, output-, and/or error-redirection!
Scenario 1: Sort a file alphabetically (one parameter, input implicitly redirected to file):
$ sort fruit.txt
apple
banana
cherry
grape
lemon
mango
orange
papaya
pear
pineapple
$
All right!
Hang on a second, though. If sort is a filter (and it is), it should read from its stdin (the keyboard) and write to its stdout (the terminal). It is indeed writing to the terminal device, but it's reading from a file, fruit.txt.
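This is deliberate; if a filename parameter is provided, the sort program opens that file and reads its input from it instead of from standard input, as clearly seen.
Also, note that sort fruit.txt produces the same output as sort < fruit.txt; in the second form, it is the shell that opens the file and redirects it to sort's standard input.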
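The two invocations can be compared directly. This is a small sketch that uses a three-line stand-in for fruit.txt in a scratch directory; the output file names by_arg.txt and by_stdin.txt are made up for the example.

```shell
#!/bin/sh
workdir=$(mktemp -d)
printf 'orange\nbanana\napple\n' > "$workdir/fruit.txt"

# Form 1: sort receives the file name as a parameter and opens the file itself.
sort "$workdir/fruit.txt" > "$workdir/by_arg.txt"

# Form 2: the shell opens the file and hands it to sort as fd 0 (stdin).
sort < "$workdir/fruit.txt" > "$workdir/by_stdin.txt"

# cmp -s prints nothing and exits with status 0 only when
# the two files are byte-for-byte equal.
if cmp -s "$workdir/by_arg.txt" "$workdir/by_stdin.txt"; then
    same=yes
else
    same=no
fi
echo "outputs identical: $same"

rm -rf "$workdir"
```

The sorted bytes are the same either way; what differs is who opens the file, sort in the first form and the shell in the second.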
Scenario 2: Sort any given input alphabetically (no parameters, input and output from and to stdin/stdout):
$ sort
mango
apple
pear
^D
apple
mango
pear
$
Once you type sort and press the Enter key, the sort process comes alive and just waits. Why? It's waiting for you, the user, to type something. Why? Recall, every process by default reads its input from standard input or stdin – the keyboard device! So, we type in some fruit names. When we're done, we press Ctrl + D. This is the default key sequence that signifies end-of-file (EOF), or in cases such as this, end-of-input. Voila! The input is sorted and written. To where? To the sort process's stdout – the terminal device, hence we see it.
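Inside a script there is no keyboard, so end-of-input has to arrive some other way. One option, sketched below, is a here-document (the << operator, which this section does not otherwise cover): the shell feeds the lines up to the closing marker to sort as its stdin and then signals end-of-input, playing the role that Ctrl + D plays at the terminal. The marker word EOF is an arbitrary choice.

```shell
#!/bin/sh
# Everything between <<'EOF' and the closing EOF line becomes sort's stdin.
# When the here-document is exhausted, sort sees end-of-input and only then
# writes the sorted lines to its stdout.
typed=$(sort <<'EOF'
mango
apple
pear
EOF
)

echo "$typed"
```

As in the interactive session, nothing is printed until the input ends, because sort cannot know the correct order before it has seen every line.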
Scenario 3: Sort any given input alphabetically and save the output to a file (explicit output redirection):
$ sort > sorted.fruit.txt
mango
apple
pear
^D
$
Similar to Scenario 2, we type in some fruit names and then Ctrl + D to tell sort we're done. This time, though, note that the output is redirected (via the > meta-character) to the sorted.fruit.txt file!
So, as expected, we get the following output:
$ cat sorted.fruit.txt
apple
mango
pear
$
Scenario 4: Sort a file alphabetically and save the output and errors to a file (explicit input-, output-, and error-redirection):
$ sort < fruit.txt > sorted.fruit.txt 2> /dev/null
$
As in the preceding scenario, the sorted output ends up in the sorted.fruit.txt file (this time, the input comes from fruit.txt rather than the keyboard), with the added advantage that any error output is redirected away from the terminal. Here, we redirect the error output (recall that file descriptor 2 always refers to stderr) to the /dev/null special device file; /dev/null is a device file whose job is to act as a sink (a black hole). Anything written to the null device just disappears forever! (Who said there isn't magic on Unix?) Also, its complement is /dev/zero; the zero device is a source – an infinite source of zeros. Reading from it returns zero-valued bytes (the ASCII NUL character, not the digit character '0'); it has no end-of-file!
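Both devices are easy to poke at from the shell. The following is a minimal sketch, assuming the standard dd, od, wc, and tr utilities are available; the variable names are made up for the example.

```shell
#!/bin/sh
# Writing to /dev/null succeeds, but the data is simply discarded.
echo "this text vanishes" > /dev/null

# Reading /dev/null yields end-of-file immediately, so wc counts 0 bytes.
null_size=$(wc -c < /dev/null | tr -d ' ')

# /dev/zero never reports end-of-file, so the reader must decide when to stop.
# dd copies exactly 4 one-byte blocks; od -An -tx1 prints each byte in hex
# without an address column; tr strips the spaces and the newline.
zero_bytes=$(dd if=/dev/zero bs=1 count=4 2>/dev/null | od -An -tx1 | tr -d ' \n')

echo "bytes read from /dev/null: $null_size"
echo "first four bytes of /dev/zero (hex): $zero_bytes"
```

Each byte shows up as hex 00, the NUL character; the digit character '0' would show up as hex 30 instead.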