Introducing data engineering in Azure
In recent years, Microsoft Azure has added several powerful services to its arsenal that seamlessly collect, store, process, and publish data for both batch and streaming workloads. Gone are the days where choices for storage and compute were severely limited among cloud vendors. As a user, you simply needed to conform with the supplied tools and services: now, your options are more extensive.
Today, the cloud ecosystem looks very different from what it did previously. The growth of cloud services allows users to choose from a variety of storage, compute, and deployment options. As an example, if I want to run a Spark program, I can choose from at least four different options in Microsoft Azure. The real question is, if all four options are running Apache Spark, then why are these options even required?
Important Note
The array of options available on the cloud are not limited to compute only: the same variety exists for data collection...