Query execution essentials
Query execution is driven by the Relational Engine in the SQL Database Engine. This means executing the plan that resulted from the optimization process. In this section, we will focus on the highlighted parts of the following diagram that handle query execution:
Figure 1.6: States of query processing related to query execution
Before execution starts, the Relational Engine needs to reserve the estimated amount of memory needed to run the query, known as a memory grant. Along with the actual execution, the Relational Engine schedules the worker threads (also known as threads or workers) that the query runs on and provides inter-thread communication. The number of worker threads spawned depends on two key aspects:
- Whether the plan is eligible for parallelism as determined by the Query Optimizer.
- What the actual available degree of parallelism (DOP) is in the system based on the current load. This may differ from the estimated DOP, which is based on the server configuration max degree of parallelism (MaxDOP). For example, the MaxDOP may be 8 but the available DOP at runtime can be only 2, which impacts query performance.
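As a rough illustration of the difference between the configured MaxDOP and the DOP a request actually gets, the following T-SQL sketch reads the server-level setting and then lists the DOP of currently executing user requests. It assumes SQL Server 2016 or later (where `sys.dm_exec_requests` exposes a `dop` column) and a login with permission to view server state. The server-level setting is only one input, since a database scoped configuration or a query hint can also limit parallelism.

```sql
-- Configured server-wide MaxDOP (the value currently in effect).
SELECT name, value_in_use
FROM sys.configurations
WHERE name = N'max degree of parallelism';

-- DOP of currently executing user requests.
-- Assumption: SQL Server 2016 or later, where the dop column is available.
SELECT r.session_id,
       r.status,
       r.command,
       r.dop
FROM sys.dm_exec_requests AS r
INNER JOIN sys.dm_exec_sessions AS s
    ON s.session_id = r.session_id
WHERE s.is_user_process = 1;   -- skip system sessions
```

Comparing `value_in_use` with the `dop` values returned for busy periods gives a first indication of whether requests are running with less parallelism than the configuration allows.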
During execution, as parts of the plan that require data from the base tables are processed, the Relational Engine requests that the Storage Engine provide data from the relevant rowsets. The Relational Engine then processes the data returned from the Storage Engine into the format defined by the T-SQL statement and returns the result set to the client.
This process doesn’t change even on highly concurrent systems. However, because the SQL Database Engine needs to handle many requests with limited resources, it relies on waiting and queuing to do so.
To understand waits and queues in the SQL Database Engine, it is important to introduce other query execution-related concepts. From an execution standpoint, this is what happens when a client application needs to execute a query:
Figure 1.7: Timeline of events when a client application executes a query
Tasks and workers can naturally accumulate waits until a request completes – we will see how to monitor these in Building diagnostic queries using DMVs and DMFs. These waits are surfaced in each request, which can be in one of three different statuses during its execution:
Figure 1.8: States of task execution in the Database Engine
- Running: When a task is actively running within a scheduler.
- Suspended: When a task that is running in a scheduler finds out that a required resource is not available at the moment, such as a data page, it voluntarily yields its allotted processor time so that another request can proceed instead of allowing for idle processor time. But a task can be in this state before it even gets on a scheduler. For example, if there isn’t enough memory to grant to a new incoming query, that query must wait for memory to become available before starting actual execution.
- Runnable: When a task is waiting on a first-in first-out queue for scheduler time, but otherwise has access to the required resources such as data pages.
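The states described above can be observed on a live system. The following is a minimal T-SQL sketch, not a complete diagnostic query: it lists the current status of each user request together with the wait it is in, and then shows queries that are still waiting for a memory grant. It assumes a login with permission to view server state; the column selection is only one reasonable choice.

```sql
-- Current status of each user request and the wait it is in, if any.
SELECT r.session_id,
       r.status,          -- for example: running, runnable, or suspended
       r.wait_type,       -- populated while the request is waiting
       r.wait_time,       -- milliseconds spent in the current wait
       r.last_wait_type
FROM sys.dm_exec_requests AS r
INNER JOIN sys.dm_exec_sessions AS s
    ON s.session_id = r.session_id
WHERE s.is_user_process = 1
  AND r.session_id <> @@SPID;   -- exclude this diagnostic query itself

-- Queries still waiting for a memory grant:
-- grant_time stays NULL until the memory has been granted.
SELECT session_id,
       requested_memory_kb,
       wait_time_ms
FROM sys.dm_exec_query_memory_grants
WHERE grant_time IS NULL;
```

A request shown as suspended with a non-NULL `wait_type` is waiting on a resource, whereas a runnable request has its resources and is only waiting for scheduler time.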
All these concepts and terms play a fundamental role in understanding query execution and are also important to keep in mind when troubleshooting query performance. We will further explore how to detect some of these execution conditions in Chapter 3, Exploring Query Execution Plans.