Learning basic operations with Pentaho Data Integration
This recipe shows you the basic building blocks that the rest of the recipes in this chapter rely on. We recommend that you work through it before you tackle any of the others. PDI also ships with a large selection of sample transformations that you can open, edit, and test. These can be found in the samples directory of PDI.
Getting ready
Before you begin this recipe, make sure that the JAVA_HOME environment variable is set properly. By default, PDI tries to guess the value of JAVA_HOME. Note that for this book, we are using Java 1.7. Once this is done, you're ready to launch Spoon, the graphical development environment for PDI, using the scripts located in the PDI home folder. On Windows, execute the spoon.bat script. On Linux or Mac, execute the spoon.sh bash script instead.
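The checks described above can be sketched in a few lines of Python. This is only an illustration, not part of PDI: the function names `spoon_launcher` and `java_home_problem` and the install path `/opt/data-integration` are hypothetical, and the script only reports which launcher to use; it does not start Spoon.

```python
import os
import platform
from pathlib import Path


def spoon_launcher(pdi_home, system=None):
    """Return the path of the Spoon start script for the given operating system."""
    system = system or platform.system()
    # spoon.bat is used on Windows; spoon.sh on Linux and Mac.
    script = "spoon.bat" if system == "Windows" else "spoon.sh"
    return Path(pdi_home) / script


def java_home_problem(env=None):
    """Return a short message if JAVA_HOME looks wrong, or None if it looks fine."""
    env = os.environ if env is None else env
    java_home = env.get("JAVA_HOME")
    if not java_home:
        return "JAVA_HOME is not set"
    if not Path(java_home).is_dir():
        return f"JAVA_HOME points to a missing directory: {java_home}"
    return None


if __name__ == "__main__":
    # "/opt/data-integration" is a hypothetical install location.
    print(spoon_launcher("/opt/data-integration"))
    print(java_home_problem() or "JAVA_HOME looks fine")
```

Passing `system` and `env` explicitly keeps the two helpers easy to try out without changing your real environment.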
How to do it…
First, we need to configure Spoon to be able to create transformations and/or jobs. To become familiar with the tool, perform the following steps:
- Create a new empty transformation:
- Click on the New file button from the toolbar menu and select the Transformation item entry. You can also navigate to File | New | Transformation from the main menu. Ctrl + N also creates a new transformation.
- Set a name for the transformation:
- Open the Transformation settings dialog by pressing Ctrl + T. Alternatively, you can right-click on the working area on the right-hand side and select Transformation settings, or select the Settings... item entry from the Edit menu on the menu bar.
- Select the Transformation tab.
- Set Transformation Name to First Test Transformation.
- Click on the OK button.
- Save the transformation:
- Click on the Save current file button from the toolbar. Alternatively, go to File | Save from the menu bar, or simply press Ctrl + S.
- Choose the location of your transformation and give it the name chapter1-first-transformation.
- Click on the OK button.
- Run the transformation using Spoon:
- You can run the transformation in any of these ways: click on the green play icon on the transformation toolbar, navigate to Action | Run on the main menu, or simply press F9.
- You will get an Execute a transformation dialog. Here, you can set parameters, variables, or arguments if they are required for running the transformation.
- Run the transformation by clicking on the Launch button.
- Run the transformation in preview mode using Spoon:
- Select the step whose output data you want to preview.
- After selecting the desired output step, you can preview the transformation by either clicking on the magnify icon on the transformation toolbar, going to Action | Preview on the main menu, or simply pressing F10.
- You will get a Transformation debug dialog that you can use to define the number of rows you want to see, breakpoints, and the step that you want to analyze.
- You can click on the Configure button to define parameters, variables, or arguments. Click on the Quick Launch button to preview the transformation.
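Once a transformation has been saved, it can also be run without Spoon by using Pan, the command-line runner that ships with PDI. The following Python sketch only builds the command line for Linux or Mac; it is an illustration under stated assumptions. The helper name `pan_command`, the install path, the file location, and the parameter `INPUT_DIR` are all hypothetical, and the sketch assumes Spoon saved the file with the .ktr extension.

```python
from pathlib import PurePosixPath


def pan_command(pdi_home, ktr_path, params=None, level="Basic"):
    """Build the argument list for running a .ktr file with pan.sh."""
    command = [
        str(PurePosixPath(pdi_home) / "pan.sh"),
        f"-file={ktr_path}",    # transformation file to run
        f"-level={level}",      # logging level
    ]
    # Named parameters are passed as -param:NAME=value.
    for name, value in (params or {}).items():
        command.append(f"-param:{name}={value}")
    return command


if __name__ == "__main__":
    # All paths and the INPUT_DIR parameter are hypothetical examples.
    command = pan_command(
        "/opt/data-integration",
        "/home/user/chapter1-first-transformation.ktr",
        params={"INPUT_DIR": "/tmp/input"},
    )
    print(" ".join(command))
```

The resulting list could be handed to `subprocess.run(command, check=True)` to launch the run. The named parameters play the same role as the values you would enter in the Execute a transformation dialog.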
How it works…
In this recipe, we just introduced the Spoon tool, touching on the main basic points for you to manage ETL transformations. We started by creating a transformation. We gave a name to the transformation, First Test Transformation in this case. Then, we saved the transformation in the filesystem with the name chapter1-first-transformation.
Finally, we ran the transformation normally and in debug mode. Understanding how to run a transformation in debug mode is useful for future ETL developments, as it helps you understand what is happening inside the transformation.
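The transformation name and the file name are two separate things: the name set in the Transformation settings dialog is stored inside the saved file, which is an XML document. The sketch below reads that name back with Python's standard library. The `SAMPLE_KTR` fragment is a heavily trimmed, hand-written stand-in for a real saved file, and `transformation_name` is a hypothetical helper; real files contain many more elements.

```python
import xml.etree.ElementTree as ET

# Trimmed, hand-written stand-in for a saved transformation file.
SAMPLE_KTR = """<transformation>
  <info>
    <name>First Test Transformation</name>
  </info>
</transformation>"""


def transformation_name(ktr_xml):
    """Return the transformation name stored in the XML text, or None if absent."""
    root = ET.fromstring(ktr_xml)
    return root.findtext("info/name")


if __name__ == "__main__":
    print(transformation_name(SAMPLE_KTR))  # prints "First Test Transformation"
```

To inspect a file on disk instead of a string, `ET.parse(path).getroot()` can replace the `ET.fromstring` call.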
There's more…
In the samples folder under the PDI home folder, you will find a large selection of sample transformations and jobs that you can open, edit, and run to better understand the functionality of the various steps available in PDI.