PDI, also known as Kettle, is a tool for building typical Extract, Transform, and Load (ETL) processes. It reads data from different sources, manipulates it in many ways (deduplicating, filtering, cleaning, and formatting, among others), and saves the result in different formats and destinations. The following diagram illustrates a very simple example of an ETL process designed with PDI:
[Figure: ETL process]
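To make the three ETL stages concrete, here is a minimal sketch in plain Python. It is not PDI code and does not use any PDI API; the sample data and the function names `extract`, `transform`, and `load` are hypothetical, chosen only to mirror the operations mentioned above (deduplicating, filtering, cleaning, and formatting).

```python
import csv
import io
import json

# Hypothetical input: a small CSV "source" with a duplicate row, a row with
# a missing amount, and inconsistent whitespace and letter case.
RAW_CSV = """name,country,amount
 alice ,us,10.5
Bob,DE,20
 alice ,us,10.5
carol,fr,
"""


def extract(text):
    """Extract: read rows from the CSV source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: clean, filter, deduplicate, and format the rows."""
    seen = set()
    result = []
    for row in rows:
        # Cleaning: strip stray whitespace from every field.
        cleaned = {key: value.strip() for key, value in row.items()}
        # Filtering: drop rows that have no amount.
        if not cleaned["amount"]:
            continue
        # Deduplicating: skip rows that were already seen.
        fingerprint = tuple(cleaned.items())
        if fingerprint in seen:
            continue
        seen.add(fingerprint)
        # Formatting: normalize case and convert the amount to a number.
        result.append({
            "name": cleaned["name"].title(),
            "country": cleaned["country"].upper(),
            "amount": float(cleaned["amount"]),
        })
    return result


def load(rows):
    """Load: serialize the rows to JSON (a stand-in for a real destination)."""
    return json.dumps(rows, indent=2)


if __name__ == "__main__":
    print(load(transform(extract(RAW_CSV))))
```

Of the four input rows, only two survive the transform stage: the duplicate and the row without an amount are dropped. In PDI, each of these stages would typically be a separate, configurable building block rather than a hand-written function.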
Beyond typical ETL processes, PDI can also be used to migrate data between applications, access and manipulate real-time data, access data in the cloud, orchestrate administrative tasks, and more.