Introducing Pentaho Data Integration
Pentaho Data Integration is the open source data integration tool in Pentaho's suite of products. With an intuitive, graphical, drag-and-drop design environment and a proven, scalable, standards-based architecture, Pentaho Data Integration is increasingly the choice for organizations over traditional, proprietary ETL or data integration tools. In its initial versions, Pentaho Data Integration was called Kettle, so during your investigations you may find it referred to as PDI, DI, or Kettle.
You probably know that ETL refers to three processes. Data extraction is where the data is extracted from homogeneous or heterogeneous data sources. Data transformation is where the data is converted into the proper format or structure for querying and analysis. Data loading is where the data is loaded into the final target repository. This book does not teach the basics of ETL, but you can easily find a lot of literature dedicated to this...
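To make the three ETL phases concrete, here is a minimal sketch in plain Python. It is not PDI code and does not use any Pentaho API; it only illustrates the idea with the standard library. The sample data, the `sales` table, and the function names (`extract`, `transform`, `load`, `run_etl`) are all hypothetical and chosen for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical in-memory source standing in for a real file or database.
RAW_CSV = """id,name,amount
1, alice ,10.50
2,BOB,3
3,carol,not_a_number
"""


def extract(text):
    """Extract: read raw rows from the source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: normalize names, parse amounts, skip unparseable rows."""
    cleaned = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # a real job would usually log or redirect bad rows
        cleaned.append((int(row["id"]), row["name"].strip().title(), amount))
    return cleaned


def load(records, conn):
    """Load: write the cleaned records into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(id INTEGER PRIMARY KEY, name TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", records)
    conn.commit()


def run_etl(text, conn):
    """Chain the three phases and return what ended up in the target."""
    load(transform(extract(text)), conn)
    return conn.execute(
        "SELECT id, name, amount FROM sales ORDER BY id"
    ).fetchall()


if __name__ == "__main__":
    target = sqlite3.connect(":memory:")  # assumed target repository
    print(run_etl(RAW_CSV, target))
```

Each function maps to one phase, so the flow of data from source to target is easy to follow. A graphical tool such as PDI expresses the same flow as connected steps in a design environment rather than as handwritten functions.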