In the previous chapters, you learned to transform your data in many ways. Now suppose you have to collect the results of a survey. You receive several files with the data, and those files have different formats. You have to merge those files and generate a unified view of the information. You also want to remove the rows whose content is irrelevant. Finally, based on the rows that interest you, you want to create another file with some statistics. This kind of requirement is very common, but meeting it requires more background in Pentaho Data Integration (PDI).
This chapter will give you the tools for implementing data flows similar to the one described above. In particular, we will cover the following topics:
- Filtering data
- Copying, distributing, and partitioning data
- Splitting the stream based on conditions
- Merging...