Using for loops and Arrays
Certain languages, such as Spanish and French, have the concept of masculine and feminine nouns. For this demo, we’ll work with a list of English nouns, their Spanish equivalents, and the gender designations for the Spanish nouns.
Why is someone with a French last name creating a list of Spanish words? Well, it’s just that despite my French ancestry, I chose to learn Spanish instead of French in high school. So, I do know some Spanish, but I don’t know French. (I know, I’m weird.) Also, I realize that the Spanish word camión has an accent over the last syllable. Alas, inserting accents with an English-language keyboard isn’t easily done in a plain-text file, at least not without messing up how the awk script works.
To begin, create the spanish_words.txt file, and make it look like this:
ENGLISH:SPANISH:GENDER
cat:gato:M
table:mesa:F
bed:cama:F
bus:camion:M
house:casa:F
As you see, we’re using colons as field separators, and using either M or F to designate whether a word is masculine or feminine. The first line is a header, so we’ll need to take that into account when we process the file.
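Before building the full script, it can help to see how awk splits these lines. Here’s a minimal sketch, assuming that you run it in a scratch directory. The here-document just recreates the data file so that the example is self-contained, and the wording of the printed message is my own, purely for illustration.

```shell
# Recreate the data file from above so this sketch is self-contained.
cat > spanish_words.txt <<'EOF'
ENGLISH:SPANISH:GENDER
cat:gato:M
table:mesa:F
bed:cama:F
bus:camion:M
house:casa:F
EOF

# -F: sets the field separator to a colon.
# NR > 1 skips the header line, so only the five data lines are printed.
awk -F: 'NR > 1 {print $2 " is field 2, " $3 " is field 3"}' spanish_words.txt
```

The first line of output should be gato is field 2, M is field 3, followed by one line for each of the other four nouns.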
Next, create the masc-fem.awk script, like this:
#!/usr/bin/awk -f
BEGIN {FS=":"}
NR==1 {next}
$3 == "M" {masc[$2]=$1}
$3 == "F" {fem[$2]=$1}
END {
    print "\nMasculine Nouns\n----";
    for (m in masc)
        {print m "--" masc[m]; count++}
    print "\nFeminine Nouns\n----";
    for (f in fem)
        {print f "--" fem[f]; count2++}
    print "\nThere are " count " masculine nouns and " count2 " feminine nouns."
}
In the BEGIN section, we’re setting the : as the field separator. The NR==1 {next} line means to ignore line 1 and move on to the next line. The next two lines build the masc and fem arrays. Any line that has an M in field 3 goes into the masc array, and any line that has an F in field 3 goes into the fem array. In both arrays, the Spanish word from field 2 is the index, and the English word from field 1 is the value. The END section contains code that will run after the code in the main body has finished building the arrays. The two for loops work much the same as you saw with the normal shell scripting for loops, except that we’re now using awk’s C-like syntax, in the form of for (index in array). Each loop visits every index in its array, but awk doesn’t guarantee the order in which it does so, which is why the output might not follow the order of the input file. The first loop prints out the list of masculine nouns and uses the count variable to add up the total of masculine nouns. The second loop does the same for the feminine nouns, except that it uses the count2 variable to total the number of feminine nouns. Running the script looks like this:
donnie@fedora:~$ ./masc-fem.awk spanish_words.txt

Masculine Nouns
----
gato--cat
camion--bus

Feminine Nouns
----
mesa--table
casa--house
cama--bed

There are 2 masculine nouns and 3 feminine nouns.
donnie@fedora:~$
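One thing worth noticing in this output is that the feminine nouns don’t appear in the same order as in the input file. That’s because awk’s for (index in array) loop doesn’t guarantee any particular order. The following is a minimal sketch of a variation on the same idea, not a replacement for the script above. It keeps the totals in a single array that’s indexed by the gender field, and pipes the result through sort so that the output order is stable. The total and g names are my own choices for illustration.

```shell
# Recreate the data file so this sketch is self-contained.
cat > spanish_words.txt <<'EOF'
ENGLISH:SPANISH:GENDER
cat:gato:M
table:mesa:F
bed:cama:F
bus:camion:M
house:casa:F
EOF

# Tally the nouns per gender in one array, using field 3 (M or F) as the index.
# The for-in loop visits the indexes in an unspecified order, so sort
# is used here to make the output order predictable.
awk -F: 'NR > 1 {total[$3]++}
         END {for (g in total) print g, total[g]}' spanish_words.txt | sort
```

With the sample file, this should print F 3 on the first line and M 2 on the second. Using the field value itself as the array index means that there’s no need for a separate counter variable for each gender.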
And that’s all there is to it. Easy, right?
For our next trick, let’s do some floating-point math.