Defining source and target datasets
A dataset identifies data held in a data store, in whatever form it takes there: tables, files, folders, documents, and so on. Pipeline activities refer to datasets as their inputs and outputs, and a single dataset can be used by multiple activities or pipelines.
Before we start applying transformations to the data, the required datasets must be in place. Follow these steps to create a dataset for the source:
- Go to the Data tab in Synapse Studio and click on + on the Data canvas, as highlighted in the following screenshot:
- Select Integration dataset from the dropdown, then choose the required data store from the list shown in the Integration dataset window. In this example, we select Azure Data Lake Storage Gen2 as the data store and then click Continue.
- Select the DelimitedText format for your data from the list of all available options...
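The choices made in these steps end up in a JSON definition of the dataset, which can help when reviewing or versioning it. The following is a minimal Python sketch of what such a definition might look like for a DelimitedText dataset on Azure Data Lake Storage Gen2. The dataset name, linked service name, file system, folder, and file name are all hypothetical placeholders, and the delimiter and header settings are assumptions rather than values taken from this walkthrough.

```python
import json


def build_delimited_text_dataset(name, linked_service, file_system,
                                 folder_path, file_name):
    """Return a dict shaped like a DelimitedText dataset definition.

    All argument values are supplied by the caller; nothing here talks
    to Azure. The function only assembles the JSON structure.
    """
    return {
        "name": name,
        "properties": {
            # The linked service holds the connection to the data store.
            "linkedServiceName": {
                "referenceName": linked_service,
                "type": "LinkedServiceReference",
            },
            # Format selected in the final step above.
            "type": "DelimitedText",
            "typeProperties": {
                # AzureBlobFSLocation is the location type used for
                # Azure Data Lake Storage Gen2.
                "location": {
                    "type": "AzureBlobFSLocation",
                    "fileSystem": file_system,
                    "folderPath": folder_path,
                    "fileName": file_name,
                },
                # Assumed settings for a comma-separated file with a header row.
                "columnDelimiter": ",",
                "firstRowAsHeader": True,
            },
            # Left empty here; the schema can be imported later.
            "schema": [],
        },
    }


# Hypothetical names, for illustration only.
source_dataset = build_delimited_text_dataset(
    name="SourceDelimitedText",
    linked_service="MyDataLakeLinkedService",
    file_system="raw",
    folder_path="input",
    file_name="sample.csv",
)

print(json.dumps(source_dataset, indent=2))
```

Because the definition is plain data, the same helper could be called again with different arguments to describe the target dataset.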