First, we need to get our data into Azure so that HDInsight can see it. We can upload data directly to Azure Storage, or we can use SQL Server Integration Services (SSIS), which can connect to both Azure Blob Storage and Azure HDInsight. With SSIS, you can create Integration Services packages that transfer data between Azure Blob Storage and an on-premises data source. HDInsight can then process the data.
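As an illustration of the direct-upload route, here is a minimal Python sketch that uses the `azure-storage-blob` package. It is not part of the SSIS approach described in this section. The file name `sales.csv`, the container name `hdinsight-data`, the `input` prefix, and the helper names `blob_name_for` and `upload_file` are all hypothetical; the connection string is assumed to be in the `AZURE_STORAGE_CONNECTION_STRING` environment variable, and the container is assumed to belong to a storage account attached to the HDInsight cluster.

```python
import os
from pathlib import Path


def blob_name_for(local_path: str, prefix: str = "input") -> str:
    """Build the blob name under which a local file will be stored."""
    # Hypothetical naming scheme: <prefix>/<file name>.
    return f"{prefix.strip('/')}/{Path(local_path).name}"


def upload_file(local_path: str, container: str, connection_string: str) -> str:
    """Upload one local file to Azure Blob Storage and return its blob name."""
    # Imported here so blob_name_for can be used without the SDK installed.
    from azure.storage.blob import BlobServiceClient

    service = BlobServiceClient.from_connection_string(connection_string)
    blob_name = blob_name_for(local_path)
    blob = service.get_blob_client(container=container, blob=blob_name)
    with open(local_path, "rb") as data:
        # overwrite=True replaces any existing blob with the same name.
        blob.upload_blob(data, overwrite=True)
    return blob_name


if __name__ == "__main__":
    # Only attempt the upload when a connection string is configured.
    conn = os.environ.get("AZURE_STORAGE_CONNECTION_STRING")
    if conn:
        print(upload_file("sales.csv", "hdinsight-data", conn))
```

Once the blob is in a container that the cluster can read, HDInsight jobs can refer to it by its blob path.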
To get the data into HDInsight using SSIS, you need to install the Azure Feature Pack. The Microsoft SSIS Feature Pack for Azure lets SQL Server Integration Services connect to many Azure services, such as Azure Blob Storage, Azure Data Lake Store, Azure SQL Data Warehouse, and Azure HDInsight. It is a separate install, and you...