Generating all possible pairs formed from two datasets
This is a quick recipe that teaches you how to do a Cartesian product between datasets. A Cartesian product is created by taking all rows from one dataset, all rows from another dataset, and generating a new dataset with all the possible combinations of rows. If the first dataset has m rows and the second has n rows, the result has m x n rows.
This particular recipe is, in fact, the implementation of the Community Acronym Generator (CAG) as proposed by Nicholas Goodman (@nagoodman) on Twitter:
@webdetails @pmalves @josvandongen How about CAG? Community Acronym Generator? A project to generate new acronyms for community projects?!
There are already several community projects around Pentaho, such as CDF (Community Dashboard Framework), CDE (Community Dashboard Editor), and CDA (Community Data Access). Let's follow Nicholas's suggestion and develop the CAG as follows:
Given two lists of words, the Kettle transformation will generate all combinations of words that lead to potential community projects.
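Before building the transformation, it may help to see the same idea outside Kettle. The following Python sketch is only an illustration and is not part of the recipe. The two word lists and the generate_projects function are hypothetical, and the "Community <first word> <second word>" naming pattern is an assumption based on the CDF, CDE, and CDA examples above.

```python
from itertools import product

# Hypothetical word lists; the recipe's real input data is not shown here.
first_words = ["Dashboard", "Data", "Chart"]
second_words = ["Framework", "Editor", "Access"]


def generate_projects(first_words, second_words):
    """Return one (acronym, name) pair for every combination of the two lists."""
    projects = []
    # product() yields the Cartesian product: every first word with every second word.
    for first, second in product(first_words, second_words):
        name = f"Community {first} {second}"
        # Assumed convention: "C" for Community plus the initial of each word.
        acronym = "C" + first[0] + second[0]
        projects.append((acronym, name))
    return projects


projects = generate_projects(first_words, second_words)
for acronym, name in projects:
    print(acronym, "-", name)
```

With three words in each list, the sketch produces nine candidate projects, one for each pairing of a word from the first list with a word from the second.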
How to do it...
Perform...