Transforming XML to CSV
Let's start with a simple file format transformation. Many modern applications use Extensible Markup Language (XML) formats to import and export data. Other, often simpler, systems use the Comma-Separated Values (CSV) format. Common desktop applications, such as Excel and Access, have wizards for importing data in the CSV format. We'll work through the process of taking a simple XML file and extracting its data into a comma-separated format.
Before we dive in and actually start to configure the Studio job, let's look at the data that we want to transform. Our input file is an XML product catalogue named catalogue.xml, which is included in the data files for this chapter. Open it in the XML viewer of your choice. You can see that the data is pretty self-explanatory. The file contains data about Stock Keeping Units (SKUs). There are a number of repeating SKU elements, each containing an skuid, skuname, size, colour, and price.
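To make the shape of this data concrete before building anything in the Studio, here is a minimal Python sketch of the same flattening, written outside the Studio and not part of the job itself. The sample document, the root element name catalogue, the repeating element name sku, and the function name skus_to_csv are all assumptions made for illustration; only the five field names come from the description above, and the real catalogue.xml may be laid out differently.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample that mirrors the described structure. The element
# names "catalogue" and "sku" and all values are assumptions.
SAMPLE_XML = """<catalogue>
  <sku>
    <skuid>1001</skuid>
    <skuname>T-Shirt</skuname>
    <size>M</size>
    <colour>Blue</colour>
    <price>9.99</price>
  </sku>
  <sku>
    <skuid>1002</skuid>
    <skuname>Jeans</skuname>
    <size>32</size>
    <colour>Black</colour>
    <price>29.99</price>
  </sku>
</catalogue>"""

# The five child elements named in the text, in output column order.
FIELDS = ["skuid", "skuname", "size", "colour", "price"]


def skus_to_csv(xml_text: str) -> str:
    """Flatten each repeating sku element into one CSV row."""
    root = ET.fromstring(xml_text)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(FIELDS)  # header row
    for sku in root.iter("sku"):
        # A missing child element becomes an empty cell.
        writer.writerow([sku.findtext(name, default="") for name in FIELDS])
    return out.getvalue()


print(skus_to_csv(SAMPLE_XML))
```

Each repeating element becomes one row and each child element becomes one column, which is the mapping the rest of this section sets up in the Studio.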
We want to extract this data into a spreadsheet-style...