Search icon CANCEL
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Conferences
Free Learning
Arrow right icon
Arrow up icon
GO TO TOP
Pentaho Data Integration 4 Cookbook

You're reading from   Pentaho Data Integration 4 Cookbook Over 70 recipes to solve ETL problems using Pentaho Kettle

Arrow left icon
Product type Paperback
Published in Jun 2011
Publisher Packt
ISBN-13 9781849515245
Length 352 pages
Edition 1st Edition
Tools
Arrow right icon
Toc

Table of Contents (17) Chapters Close

Pentaho Data Integration 4 Cookbook
Credits
About the Authors
About the Reviewers
www.PacktPub.com
Preface
1. Working with Databases FREE CHAPTER 2. Reading and Writing Files 3. Manipulating XML Structures 4. File Management 5. Looking for Data 6. Understanding Data Flows 7. Executing and Reusing Jobs and Transformations 8. Integrating Kettle and the Pentaho Suite 9. Getting the Most Out of Kettle Data Structures Index

Getting data from a database


If you're used to working with databases, one of your main objectives while working with PDI must be getting data from your databases for transforming, loading in other databases, generating reports, and so on. Whatever operation you intend to achieve, the first thing you have to do after connecting to the database, is to get that data and create a PDI dataset. In this recipe you will learn the simplest way to do that.

Getting ready

To follow these instructions you need to have access to any DBMS.

How to do it...

  1. Create a transformation and drop into the canvas a Table Input step. You will find it in the Input category of steps.

  2. From the Connection drop-down list select the connection to the database where your data resides, or create it if it doesn't exist.

  3. In the SQL text area, type the SQL statement that returns the data you need. So far you should have something like this:

  4. Click on Preview. This will bring a sample list of rows so you can confirm that the data is as expected.

  5. Press OK to close the Table Input configuration window, and you'll be ready to use the data for further manipulation.

How it works...

The Table Input step you used in the recipe is the main PDI step to get data from a database. When you run or preview the transformation, Kettle executes the SQL and pushes the rows of data coming from the database into the output stream of the step. Each column of the SQL statement leads to a PDI field and each row generated by the execution of the statement becomes a row in the PDI dataset.

Once you get the data from the database, it will be available for any kind of manipulation inside the transformation.

There's more...

In order to save time, or in case you are not sure of the name of the tables or columns in the database, instead of typing the SQL statement press the Get SQL select statement... button. This will bring the Database Explorer window. This window allows you to explore the selected database. By expanding the database tree and selecting the table that interests you, you will be able to explore that table through the different options available under the Actions menu as shown below:

Double-clicking the name of the table will generate a SELECT statement to query that table. You will have the chance to include all the field names in the statement, or simply generate a SELECT * statement. After bringing the SQL to the Table Input configuration window, you will be able to modify it according to your needs.

Note

By generating this statement you will loose any statement already in the SQL text area.

See also

  • Connecting to a database. In order to get data from a database, you need to have a connection to it. This recipe explains how to do this.

  • Getting data from a database by providing parameters. This recipe explains a more flexible way to run database queries.

  • Getting data from a database by running a query built at runtime. This recipe explains an even more flexible method.

You have been reading a chapter from
Pentaho Data Integration 4 Cookbook
Published in: Jun 2011
Publisher: Packt
ISBN-13: 9781849515245
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $19.99/month. Cancel anytime