Joining two or more streams based on given conditions
There are occasions where you will need to join two datasets. If you are working with databases, you could use SQL statements to perform this task, but for other kinds of input (XML, text, Excel), you will need another solution.
Kettle provides the Merge Join step to join data coming from any kind of source.
Let's assume that you are building a house and want to track and manage the costs of building it. Before starting, you prepared an Excel file with the estimated costs for the different parts of your house. Now, each week you are given a file with the progress and the real costs, and you want to compare the two files to see how the work is going against the budget.
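To make the idea of the comparison concrete before building it in Kettle, here is a minimal sketch in plain Python of a left outer join between a budget and the real costs. It is not how Kettle implements the Merge Join step. The field names (task, estimated_cost, real_cost) and the sample rows are hypothetical, and the sketch assumes that each task appears at most once in each dataset. Both inputs are sorted by the join key first, because the Merge Join step expects its incoming streams to be sorted by the fields used in the join.

```python
def merge_join_left(left, right, key):
    """Left outer join of two lists of dicts that are both sorted by `key`.

    Assumes `key` values are unique within each list (a simplification).
    """
    # Fields that only exist on the right side; filled with None when unmatched.
    right_fields = [f for f in (right[0] if right else {}) if f != key]
    result = []
    j = 0
    for left_row in left:
        # Advance the right cursor past keys smaller than the current left key.
        while j < len(right) and right[j][key] < left_row[key]:
            j += 1
        if j < len(right) and right[j][key] == left_row[key]:
            # Match found: combine the fields of both rows.
            result.append({**left_row, **right[j]})
        else:
            # No match: keep the left row and leave the right fields empty.
            result.append({**left_row, **{f: None for f in right_fields}})
    return result


# Hypothetical sample data standing in for the two Excel files.
budget = [
    {"task": "walls", "estimated_cost": 15000},
    {"task": "foundation", "estimated_cost": 10000},
    {"task": "roof", "estimated_cost": 8000},
]
costs = [
    {"task": "walls", "real_cost": 4000},
    {"task": "foundation", "real_cost": 11200},
]

# Sort both inputs by the join key before merging.
budget_sorted = sorted(budget, key=lambda row: row["task"])
costs_sorted = sorted(costs, key=lambda row: row["task"])

joined = merge_join_left(budget_sorted, costs_sorted, "task")
for row in joined:
    print(row["task"], row["estimated_cost"], row["real_cost"])
```

Tasks that have not started yet (roof, in this made-up data) still appear in the output with an empty real cost, which is the behavior you usually want when comparing a complete budget against partial progress.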
Getting ready
To run this recipe, you will need two Excel files: one for the budget and another with the real costs. The budget.xls file has the estimated starting date, estimated end date, and cost for the planned tasks. The costs.xls file has the real starting date, end date, and cost for the tasks that have already started.
You...