Using the name of a file (or part of it) as a field
Sometimes you need to include the name of a file as a column in your dataset for further processing. With Kettle, you can do this in a very simple way.
In this example, you have several text files about camping products. Each file belongs to a different category, and you know the category from the filename. For example, tents.txt contains tent products. You want to obtain a single dataset with all the products from these files, including a field indicating the category of every product.
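To make the goal concrete, here is a minimal sketch of the same idea outside Kettle, written in Python. It is not part of the recipe: the function names category_from_filename and read_products are hypothetical, and the sketch assumes UTF-8 text files with a .txt extension whose fields are separated by a | character.

```python
from pathlib import Path


def category_from_filename(path):
    """Return the file name without its extension, e.g. 'tents.txt' -> 'tents'."""
    return Path(path).stem


def read_products(directory, delimiter="|"):
    """Collect the rows of every .txt file in `directory`.

    Each row is the list of fields on one line, with the category
    (taken from the file name) appended as the last field.
    Assumption: files are UTF-8 text with one product per line.
    """
    rows = []
    for path in sorted(Path(directory).glob("*.txt")):
        category = category_from_filename(path)
        with path.open(encoding="utf-8") as handle:
            for line in handle:
                line = line.rstrip("\n")
                if not line:
                    continue  # skip blank lines
                rows.append(line.split(delimiter) + [category])
    return rows
```

Calling read_products("campingProducts") would return one list per product, with the category derived from the filename as the final element. The recipe achieves the equivalent result with Kettle steps rather than code.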
Getting ready
In order to run this exercise, you need a directory (campingProducts) with text files named kitchen.txt, lights.txt, sleeping_bags.txt, tents.txt, and tools.txt. Each file contains descriptions of the products and their price separated with a |. For example:
Swedish Firesteel - Army Model|$19.97
Mountain House #10 Can Freeze-Dried Food|$53.50
Coleman 70-Quart Xtreme Cooler (Blue)|$59.99
Kelsyus Floating Cooler|$26.99
Lodge LCC3...