Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Conferences
Free Learning
Arrow right icon
Arrow up icon
GO TO TOP
Microsoft Power BI Cookbook

You're reading from   Microsoft Power BI Cookbook Gain expertise in Power BI with over 90 hands-on recipes, tips, and use cases

Arrow left icon
Product type Paperback
Published in Sep 2021
Publisher Packt
ISBN-13 9781801813044
Length 656 pages
Edition 2nd Edition
Languages
Arrow right icon
Authors (2):
Arrow left icon
Greg Deckler Greg Deckler
Author Profile Icon Greg Deckler
Greg Deckler
Brett Powell Brett Powell
Author Profile Icon Brett Powell
Brett Powell
Arrow right icon
View More author details
Toc

Table of Contents (16) Chapters Close

Preface 1. Configuring Power BI Tools 2. Accessing and Retrieving Data FREE CHAPTER 3. Building a Power BI Data Model 4. Authoring Power BI Reports 5. Working in the Service 6. Getting Serious with Date Intelligence 7. Parameterizing Power BI Solutions 8. Implementing Dynamic User-Based Visibility in Power BI 9. Applying Advanced Analytics and Custom Visuals 10. Administering and Monitoring Power BI 11. Enhancing and Optimizing Existing Power BI Solutions 12. Deploying and Distributing Power BI Content 13. Integrating Power BI with Other Applications 14. Other Book You May Enjoy
15. Index

Using DirectQuery

One of the most valuable features of Power BI is its deep support for real-time and streaming datasets, with the ability to provide immediate visibility to business processes and events as this data is created or updated. As Power BI Desktop's data modeling engine reflects the latest Analysis Services features, it becomes feasible to design DirectQuery models or composite models (DirectQuery and import) in Power BI Desktop, and thus avoid the scalability limitations and scheduled refresh requirements of models based on importing data.

The three most common candidates for DirectQuery or composite model projects are as follows:

  • The data model would consume an exorbitant amount of memory if all tables were fully loaded into memory. Even if the memory size is technically supported by large Power BI Premium capacity nodes, this would be a very inefficient and expensive use of company resources as most BI queries only access aggregated data representing a fraction of the size. Composite models which mix DirectQuery and Dual storage mode tables with in-memory aggregation tables is the recommended architecture for large models going forward.
  • Access to near-real-time data is of actionable or required value to users or other applications, such as is the case with notifications. For example, an updateable Nonclustered Columnstore index could be created on OLTP disk-based tables or memory-optimized tables in SQL Server to provide near-real-time access to database transactions.
  • A high-performance and/or read-optimized system is available to service report queries, such as a SQL Server or Azure SQL Database, with the Clustered Columnstore index applied to fact tables.

This recipe walks through the primary steps in designing the data access layer that supports a DirectQuery model in Power BI Desktop. As these models are not cached into memory and dynamically convert the DAX queries from report visualizations to SQL statements, guidance is provided to maintain performance. Additional details, resources, and documentation on DirectQuery's current limitations and comparisons with the default import mode are also included to aid your design decision.

Getting ready

  1. Choose a database to serve as the source for the DirectQuery data model.
  2. Create a logical and physical design of the fact and dimension tables of the model including the relationship keys and granularity of the facts. The AdventureWorksDW database is a good example of data designed in this manner.
  3. Determine or confirm that each fact-to-dimension relationship has referential integrity. Providing this information to the DirectQuery model allows for more performant inner join queries.
  4. Create view objects in the source database to provide efficient access to the dimensions and facts defined in the physical design.

Be aware that DirectQuery models are limited to a single source database and not all databases are supported for DirectQuery. If multiple data sources are needed, such as SQL Server and Oracle, or Teradata and Excel, then the default Import mode model, with a scheduled refresh to the Power BI Service, will be the only option.

How to use DirectQuery

For this recipe, we will use the AdventureWorksDW2019 database that has been used thus far in this chapter. To implement this recipe, follow these steps:

  1. Create a new Power BI Desktop file.
  2. From the Home tab, click on Get Data in the ribbon and then SQL Server.
  3. In the Data Connectivity mode section, choose the DirectQuery radio option.

    Figure 2.14: Creating a DirectQuery data source

  4. Select a table or view to be used by the model via the Navigator dialog, such as the FactResellerSales table, and then click the Transform Data button.
  5. Duplicate the initial query and revise the Navigation step to reference an additional view supporting the model, such as the DimReseller. This can be done by editing the Item in the formula bar or by clicking on the gear icon next to the Navigation step under APPLIED STEPS in the Query Settings pane. Also, rename this query to reflect the data being referenced.

    Figure 2.15: Editing the Navigation step in the formula bar

  6. Repeat step 5 for all required facts and dimensions. For example:
    • DimEmployee
    • DimPromotion
    • DimCurrency
    • DimSalesTerritory
  7. Click the Close and Apply button.

The Report Canvas view will confirm that the model is in DirectQuery mode via the status bar at the bottom right (see Figure 2.16). In addition, the Data view in the left-hand pane, which is visible for import models, will not be visible.

Figure 2.16: DirectQuery Status in Power BI Desktop

How it works

The M transformation functions supported in DirectQuery are limited by compatibility with the source system. The Power Query Editor will advise when a transformation is not supported in DirectQuery mode, per Figure 2.17.

Figure 2.17: A warning in Query Editor that the IsEven function is not supported in DirectQuery mode

Given this limitation and the additional complexity the M-based transforms would add to the solution, it is recommended that you embed all the necessary logic and transforms in the source relational layer. Ideally, the base tables in the source database themselves would reflect these needs. As a secondary option, a layer of views can be created to support the DirectQuery model.

If the database objects themselves cannot be revised, the Value.Native M function can be used to directly pass the SQL statement from Power BI Desktop to the source database, as per Figure 2.18.

Figure 2.18: The Value.Native function used to pass a SQL statement to a source system

As report visualizations are refreshed or interacted with in Power BI, the DAX queries from each visualization are translated into SQL statements, utilizing the source SQL statements to return the results. Be aware that Power BI does cache query results with DirectQuery models. Therefore, when accessing a recently utilized visual, a local cache may be used rather than a new query sent to the source.

The SQL statements passed from Power BI to the DirectQuery data source include all columns from the tables referenced by the visual.

For example, a Power BI visual with SalesAmount from the FactResellerSales table grouped by ResellerName from DimReseller would result in a SQL statement that selects the columns from both tables and implements the join defined in the model. However, as the SQL statement passed embeds these source views as derived tables, the relational engine is able to generate a query plan that only scans the required columns to support the join and aggregation.

There's more...

The performance and scalability of DirectQuery models are primarily driven by the relational data source. A denormalized star schema with referential integrity and a system that is isolated from OLTP workloads is recommended if near real-time visibility is not required. Additionally, in-memory and columnar features available to supported DirectQuery sources are recommended for reporting and analytical queries.

By default, DirectQuery models generate outer join SQL queries to ensure that measures return the correct value even if there's not a related dimension. However, you can configure DirectQuery models to send inner join queries. This is done by editing the relationship between tables in the modeling view by checking the Assume referential integrity setting (see Figure 2.19). Along with source system resources, this is one of the top factors contributing to the DirectQuery model's performance.

Figure 2.19: Activating referential integrity assumption in relationships

Of course, you should ensure that there is referential integrity in the source before enabling this setting; otherwise, incorrect results could be returned.

The design of the source relational schema and the hardware resources of this system can, of course, greatly impact the performance of DirectQuery models.

A classic star-schema design with denormalized tables is recommended to reduce the required join operations at query time. Optimizing relational fact tables with column store technologies such as the Clustered Columnstore Index for SQL Server and table partitions will also significantly benefit DirectQuery models.

See also

  • Power BI Desktop DirectQuery documentation: http://bit.ly/2nUoLOG
  • The Power BI data sources documentation provides a detailed list of data sources broken down by the connectivity options supported: http://bit.ly/30N5ofG
You have been reading a chapter from
Microsoft Power BI Cookbook - Second Edition
Published in: Sep 2021
Publisher: Packt
ISBN-13: 9781801813044
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $19.99/month. Cancel anytime
Banner background image