Pentaho Data Integration Beginner's Guide - Second Edition

Get up and running with the Pentaho Data Integration tool using this hands-on, easy-to-read guide.

Product type: Paperback
Published in: Oct 2013
Publisher: Packt
ISBN-13: 9781782165040
Length: 502 pages
Edition: 2nd Edition
Author: María Carina Roldán
Table of Contents

Preface
1. Getting Started with Pentaho Data Integration
2. Getting Started with Transformations
3. Manipulating Real-world Data
4. Filtering, Searching, and Performing Other Useful Operations with Data
5. Controlling the Flow of Data
6. Transforming Your Data by Coding
7. Transforming the Rowset
8. Working with Databases
9. Performing Advanced Operations with Databases
10. Creating Basic Task Flows
11. Creating Advanced Transformations and Jobs
12. Developing and Implementing a Simple Datamart
A. Working with Repositories
B. Pan and Kitchen – Launching Transformations and Jobs from the Command Line
C. Quick Reference – Steps and Job Entries
D. Spoon Shortcuts
E. Introducing PDI 5 Features
F. Best Practices
G. Pop Quiz Answers
Index

Time for action – starting and customizing Spoon

In this section, you are going to launch the PDI graphical designer and get familiar with its main features.

  1. Start Spoon.
    • If your system is Windows, run Spoon.bat

      Tip

      You can just double-click on the Spoon.bat icon, or on Spoon if your Windows system doesn’t show extensions for known file types. Alternatively, open a command window (select Run in the Windows Start menu and execute cmd), and then run Spoon.bat in that window.

    • On other platforms, such as Unix or Linux, open a terminal window and type spoon.sh
    • If you didn’t make spoon.sh executable, you may type sh spoon.sh
    • Alternatively, if you work on Mac OS, you can execute the JavaApplicationStub file, or click on the Data Integration 32-bit.app or Data Integration 64-bit.app icon
  2. As soon as Spoon starts, a dialog window appears asking for the repository connection data. Click on the Cancel button.

    Note

    Repositories are explained in Appendix A, Working with Repositories. If you want to know what a repository connection is about, you will find the information in that appendix.

  3. A small window labeled Spoon tips... appears. You may want to navigate through various tips before starting. Eventually, close the window and proceed.
  4. Finally, the main window shows up. A Welcome! window appears with some useful links for you to see. Close the window. You can open it later from the main menu.
  5. From the Tools menu, select Options.... A window appears where you can change various general and visual characteristics. Uncheck the highlighted checkboxes, as shown in the following screenshot:
    [Screenshot: the Options window with the checkboxes to uncheck highlighted]
  6. Select the Look & Feel tab.
  7. Change the Grid size and Preferred Language settings as shown in the following screenshot:
    [Screenshot: the Look & Feel tab with the Grid size and Preferred Language settings]
  8. Click on the OK button.
  9. Restart Spoon in order to apply the changes. This time, neither the repository dialog nor the Welcome! window should appear. Instead, you should see the main window with its text in French, as in the following screenshot:
    [Screenshot: the Spoon main window shown in French]
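
The platform-specific launch commands from step 1 can be sketched as a small Python helper. This is only an illustration, not part of PDI: the function names and the pdi_dir argument (the folder where you unpacked PDI) are assumptions, and the Mac OS application icons mentioned in step 1 are not covered.

```python
import platform
import subprocess
from pathlib import Path


def spoon_command(pdi_dir, system=None):
    """Return the command list that starts Spoon from pdi_dir (hypothetical helper)."""
    system = system or platform.system()
    pdi_dir = Path(pdi_dir)
    if system == "Windows":
        # cmd /c runs the batch file, like typing Spoon.bat in a command window
        return ["cmd", "/c", str(pdi_dir / "Spoon.bat")]
    # Calling sh explicitly also works when spoon.sh was not made executable
    return ["sh", str(pdi_dir / "spoon.sh")]


def launch_spoon(pdi_dir):
    """Start Spoon in the background, using pdi_dir as the working directory."""
    return subprocess.Popen(spoon_command(pdi_dir), cwd=str(pdi_dir))
```

Calling launch_spoon with the path of your own PDI folder would start the tool in the same way as the manual steps above.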

What just happened?

You ran Spoon, the graphical designer of PDI, for the first time. Then you applied some custom configuration.

In the Options window, you chose not to show the repository dialog or the Welcome! window at startup. On the Look & Feel tab, you changed the size of the dotted grid that appears in the canvas area while you are working. You also changed the preferred language. These changes took effect only when you restarted the tool, not before.

The second time you launched the tool, the repository dialog didn’t show up. When the main window appeared, all of the visible text was shown in French, which was the selected language, and instead of the Welcome! window, there was a blank screen.

You didn’t see the effect of the change in the Grid option. You will see it only after creating or opening a transformation or job, which will occur very soon!

Spoon

Spoon, the tool you’re exploring in this section, is PDI’s desktop design tool. With Spoon, you design, preview, and test all your work, that is, transformations and jobs. When you see PDI screenshots, what you are really seeing are Spoon screenshots. The other PDI components, which you will learn about in the following chapters, are executed from terminal windows.

Setting preferences in the Options window

In the previous section, you changed some preferences in the Options window. There are several other look and feel characteristics you can modify beyond those you changed. Feel free to experiment with these settings.

Note

Remember to restart Spoon in order to see the changes applied.

In particular, please take note of the following suggestion about the configuration of the preferred language.

Tip

If you choose a preferred language other than English, you should select a different language as an alternative. If you do so, every name or description that has not been translated to your preferred language will be shown in the alternative language.
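
The fallback behavior described in this tip can be pictured with a short Python sketch. It is not how PDI is implemented; the function name, the dictionary layout, and the sample labels are all made up for illustration.

```python
def localized_label(key, translations, preferred, alternative):
    """Look up key in the preferred language, then in the alternative one."""
    for language in (preferred, alternative):
        text = translations.get(language, {}).get(key)
        if text is not None:
            return text
    # Nothing translated in either language: fall back to the key itself
    return key


# Hypothetical sample data: "tools" has no French translation
translations = {
    "fr": {"file": "Fichier"},
    "en": {"file": "File", "tools": "Tools"},
}
print(localized_label("file", translations, "fr", "en"))
print(localized_label("tools", translations, "fr", "en"))
```

With French as the preferred language and English as the alternative, the first lookup is answered in French and the second falls back to English.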

One of the settings that you changed was the appearance of the Welcome! window at startup. The Welcome! window has many useful links, all related to the tool: wiki pages, news, forum access, and more. It’s worth exploring them.

Tip

You don’t have to change the settings again to see the Welcome! window. You can open it by navigating to Help | Welcome Screen.

Storing transformations and jobs in a repository

The first time you launched Spoon, you chose not to work with repositories. After that, you configured Spoon to stop showing the repository dialog at startup. You must be curious about what a repository is and why we decided not to use one. Let’s explain it.

As we said, the results of working with PDI are transformations and jobs. In order to save the transformations and jobs, PDI offers two main methods:

  • Database repository: When you use the database repository method, you save jobs and transformations in a relational database specially designed for this purpose.
  • Files: The files method consists of saving jobs and transformations as regular XML files in the filesystem, with the extensions .kjb and .ktr respectively.

You cannot mix the two methods in the same project. That is, it makes no sense to combine jobs and transformations stored in a database repository with jobs and transformations stored in files. Therefore, you must choose the method when you start the tool.
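
Because the files method produces plain XML files, you can inspect them with any XML tool. The following Python sketch tells jobs from transformations by extension and checks that the file is well-formed XML. The function name is hypothetical, and the file in the usage example is a stand-in created on the spot, not a real PDI file.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

# Extensions used by the files method, as described above
KINDS = {".ktr": "transformation", ".kjb": "job"}


def classify_pdi_file(path):
    """Return (kind, root tag) for a .ktr or .kjb file (hypothetical helper)."""
    path = Path(path)
    kind = KINDS.get(path.suffix.lower())
    if kind is None:
        raise ValueError(f"not a .ktr or .kjb file: {path.name}")
    # ET.parse raises ParseError if the content is not well-formed XML
    root = ET.parse(str(path)).getroot()
    return kind, root.tag


# Usage with a stand-in file; a real file would come from saving in Spoon
with tempfile.TemporaryDirectory() as folder:
    demo = Path(folder) / "demo.ktr"
    demo.write_text("<placeholder/>", encoding="utf-8")
    print(classify_pdi_file(demo))
```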

Note

By clicking on Cancel in the repository window, you are implicitly saying that you will work with the files method.

Why did we choose not to work with repositories? Or, in other words, to work with the files method? Mainly for two reasons:

  • Working with files is more natural and practical for most users.
  • Working with a database repository requires at least minimal database knowledge, as well as access to a database engine from your computer. Although it would be an advantage for you to meet both preconditions, you may not meet both of them.

There is a third method called File repository, which is a mix of the two above: it’s a repository of jobs and transformations stored in the filesystem. Between the File repository and the files method, the latter is the more widely used. Therefore, throughout this book we will use the files method. For details of working with repositories, please refer to Appendix A, Working with Repositories.

Creating your first transformation

Until now, you’ve seen the very basic elements of Spoon. You must be waiting to do some interesting task beyond looking around. It’s time to create your first transformation.
