Converting rows to columns
In most datasets, each row describes a different element, such as a different sale or a different customer. However, there are datasets where a single row doesn't completely describe one element. Take, for example, the file from Chapter 8, Manipulating Data by Coding, containing information about houses. Every house was described across several rows, and each row gave only part of the information about the house. The ideal situation would be one in which all the attributes of a house were in a single row. With PDI, you can convert the data to this alternative format.
Converting row data to column data using the Row denormaliser step
The Row denormaliser step converts the incoming dataset to a new dataset by moving information from rows to columns according to the values of a key field.
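To make the idea concrete outside of PDI, here is a minimal Python sketch of the same kind of transformation. It is not how the step is implemented; it only illustrates the principle. The function name `denormalise`, the field names (`house_id`, `attribute`, `value`), and the sample rows are all hypothetical and are not taken from the Chapter 8 file.

```python
def denormalise(rows, group_field, key_field, value_field):
    """Turn key/value rows into one record per group.

    Each distinct value found in key_field becomes a column name,
    and the matching value_field content becomes that column's value.
    """
    grouped = {}
    for row in rows:
        group = row[group_field]
        # Create the output record the first time a group is seen.
        record = grouped.setdefault(group, {group_field: group})
        record[row[key_field]] = row[value_field]
    # Dicts keep insertion order, so groups come out in first-seen order.
    return list(grouped.values())


# Hypothetical input: each house is spread over several rows.
rows = [
    {"house_id": 1, "attribute": "bedrooms", "value": "3"},
    {"house_id": 1, "attribute": "garage", "value": "yes"},
    {"house_id": 2, "attribute": "bedrooms", "value": "2"},
    {"house_id": 2, "attribute": "garage", "value": "no"},
]

houses = denormalise(rows, "house_id", "attribute", "value")
for house in houses:
    print(house)
```

In this sketch, `attribute` plays the role of the key field: its values (`bedrooms`, `garage`) become the new column names, while `house_id` identifies which input rows belong to the same output row.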
To understand how the Row denormaliser step works, let's introduce an example. We will work with a file containing a list of French movies of all time. This is how it looks:
... Caché Year...