You're reading from Limitless Analytics with Azure Synapse: An end-to-end analytics service for data processing, management, and ingestion for BI and ML

Product type Paperback

Published in Jun 2021

Publisher Packt

ISBN-13 9781800205659

Length 392 pages

Edition 1st Edition

Languages

Python

Tools

Azure

Concepts

Data Processing

Authors (2):

Saranya Ravichander

Prashant Kumar Mishra

Table of Contents (20) Chapters

Preface

1. Section 1: The Basics and Key Concepts

2. Chapter 1: Introduction to Azure Synapse

3. Chapter 2: Considerations for Your Compute Environment

4. Section 2: Data Ingestion and Orchestration

5. Chapter 3: Bringing Your Data to Azure Synapse

6. Chapter 4: Using Synapse Pipelines to Orchestrate Your Data

7. Chapter 5: Using Synapse Link with Azure Cosmos DB

8. Section 3: Azure Synapse for Data Scientists and Business Analysts

9. Chapter 6: Working with T-SQL in Azure Synapse

10. Chapter 7: Working with R, Python, Scala, .NET, and Spark SQL in Azure Synapse

11. Chapter 8: Integrating a Power BI Workspace with Azure Synapse

12. Chapter 9: Perform Real-Time Analytics on Streaming Data

13. Chapter 10: Generate Powerful Insights on Azure Synapse Using Azure ML

14. Section 4: Best Practices

15. Chapter 11: Performing Backup and Restore in Azure Synapse Analytics

16. Chapter 12: Securing Data on Azure Synapse

17. Chapter 13: Managing and Monitoring Synapse Workloads

18. Chapter 14: Coding Best Practices

19. Other Books You May Enjoy

Understanding Spark pool

Apache Spark is a fast, unified analytics engine for big data and machine learning.

Synapse Spark pool is one of Microsoft's implementations of Apache Spark in Azure. An Azure Synapse Analytics workspace has a Spark engine built in, along with notebook support. Because Synapse Spark supports C#, you can write .NET for Apache Spark code directly within notebooks. You can also write your code in Python, Scala, and Spark SQL.

A single Spark pool can be shared by multiple users, but a separate Spark instance is created for each user. How a job is handled also depends on capacity: if the user's existing instance has enough spare capacity, that instance processes the job; otherwise, a new instance is created to process it, provided the pool itself still has capacity available.
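The reuse-or-create behavior described above can be sketched as a small toy model in plain Python. This is only an illustration, not how Azure Synapse is implemented: the `SparkPool` and `SparkInstance` classes, the vCore numbers, and the "rejected" outcome are all assumptions made for this sketch, and no Azure API is called.

```python
from dataclasses import dataclass, field


@dataclass
class SparkInstance:
    """Hypothetical stand-in for one running Spark instance."""
    owner: str
    total_vcores: int
    used_vcores: int = 0

    def free_vcores(self) -> int:
        return self.total_vcores - self.used_vcores


@dataclass
class SparkPool:
    """Toy model of a Spark pool with a fixed vCore budget (illustrative only)."""
    total_vcores: int
    instance_vcores: int  # size of every new instance; an assumption of this sketch
    instances: list = field(default_factory=list)

    def allocated_vcores(self) -> int:
        # vCores already reserved by instances created from this pool.
        return sum(inst.total_vcores for inst in self.instances)

    def submit(self, user: str, job_vcores: int) -> str:
        # 1. Reuse one of the user's instances if it has room for the job.
        for inst in self.instances:
            if inst.owner == user and inst.free_vcores() >= job_vcores:
                inst.used_vcores += job_vcores
                return "reused existing instance"
        # 2. Otherwise create a new instance, if the pool has capacity left.
        fits_instance = job_vcores <= self.instance_vcores
        fits_pool = self.allocated_vcores() + self.instance_vcores <= self.total_vcores
        if fits_instance and fits_pool:
            self.instances.append(
                SparkInstance(owner=user, total_vcores=self.instance_vcores,
                              used_vcores=job_vcores)
            )
            return "created new instance"
        # 3. What the real service does here is not modeled in this sketch.
        return "rejected: pool capacity exhausted"


pool = SparkPool(total_vcores=24, instance_vcores=8)
print(pool.submit("alice", 4))  # created new instance
print(pool.submit("alice", 4))  # reused existing instance
print(pool.submit("alice", 4))  # created new instance
print(pool.submit("bob", 8))    # created new instance
print(pool.submit("bob", 1))    # rejected: pool capacity exhausted
```

In this model, each user only ever lands on their own instances, which mirrors the point that users sharing a pool do not share a Spark instance. The model also never releases capacity when a job finishes; that is left out to keep the sketch short.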

The following diagram displays different components of Apache Spark on Azure Synapse: