Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Conferences
Free Learning
Arrow right icon
Arrow up icon
GO TO TOP
Microsoft Power BI Cookbook

You're reading from   Microsoft Power BI Cookbook Gain expertise in Power BI with over 90 hands-on recipes, tips, and use cases

Arrow left icon
Product type Paperback
Published in Sep 2021
Publisher Packt
ISBN-13 9781801813044
Length 656 pages
Edition 2nd Edition
Languages
Arrow right icon
Authors (2):
Arrow left icon
Greg Deckler Greg Deckler
Author Profile Icon Greg Deckler
Greg Deckler
Brett Powell Brett Powell
Author Profile Icon Brett Powell
Brett Powell
Arrow right icon
View More author details
Toc

Table of Contents (16) Chapters Close

Preface 1. Configuring Power BI Tools 2. Accessing and Retrieving Data FREE CHAPTER 3. Building a Power BI Data Model 4. Authoring Power BI Reports 5. Working in the Service 6. Getting Serious with Date Intelligence 7. Parameterizing Power BI Solutions 8. Implementing Dynamic User-Based Visibility in Power BI 9. Applying Advanced Analytics and Custom Visuals 10. Administering and Monitoring Power BI 11. Enhancing and Optimizing Existing Power BI Solutions 12. Deploying and Distributing Power BI Content 13. Integrating Power BI with Other Applications 14. Other Book You May Enjoy
15. Index

Diagnosing Queries

The Power Query M engine is an extremely powerful and fast data transformation and data preparation engine used across an array of products, including:

  • Excel for Windows
  • Excel for Mac
  • Power BI
  • Power Apps
  • Power Automate
  • Azure Data Factory
  • SQL Server Integration Services
  • SQL Server Analysis Services
  • Dynamics 365 Customer Insights

While both fast and powerful, there are times when you may find that a particular query is not as performant as desired. In these instances, Query Diagnostics can help you pinpoint problematic expressions and better understand what Power Query is doing in order to identify areas for query optimization. This recipe demonstrates how the user can use Query Diagnostics to troubleshoot a query and identify how the query might be optimized to be more performant.

Getting ready

To get ready, import the FactCallCenter table from the AdventureWorksDW2019 database by doing the following:

  1. Open Power BI Desktop and choose Transform data from the ribbon of the Home tab to open the Power Query Editor.
  2. Create an Import mode data source query called AdWorksDW. This query should be similar to the following:
    let
        Source = Sql.Database("localhost\MSSQLSERVERDEV", "AdventureWorksDW2019")
    in
        Source
    
  3. Isolate this query in a query group called Data Sources.
  4. Right-click AdWorksDW and choose Reference; select the FactCurrencyRate table in the data preview area, and rename the AdWorksDW query to FactCurrencyRate.
  5. Move the FactCurrencyRate query to the Other Queries group.
  6. In the Power Query Editor, select the Tools tab and then Diagnostic Options.
  7. Ensure that the radio button for Enable in Query Editor (does not require running as admin) is selected and check all the boxes under the Diagnostics Level and Additional Diagnostics sub-headings.

Figure 2.46: Query Diagnostic options

For additional details on performing these steps, see the Managing Queries and Data Sources recipe in this chapter.

How to Diagnose Queries

To implement this recipe, perform the following steps:

  1. Select the FactCurrencyRate query, open Advanced Editor, and modify the query to calculate the daily change in currency rates.
    let
        Source = AdWorksDW,
        dbo_FactCurrencyRate = Source{[Schema="dbo",Item="FactCurrencyRate"]}[Data],
        Sort = 
            Table.Sort(
                dbo_FactCurrencyRate,
                {{"CurrencyKey", Order.Ascending}, {"DateKey", Order.Ascending}}
            ),
        Index = 
            Table.TransformColumnTypes(
                Table.AddIndexColumn(Sort, "Row Index", 1, 1),
                {{ "Row Index", Int64.Type }}
            ),
        PrevIndex = 
            Table.TransformColumnTypes(
                Table.AddIndexColumn(Index, "Prev Index", 0, 1),
                {{ "Prev Index", Int64.Type }}
            ),
        SelfJoin = 
            Table.NestedJoin(
                PrevIndex, {"Prev Index"}, PrevIndex, {"Row Index"}, 
                "NewColumn", JoinKind.LeftOuter
            ),
        PrevColumns = 
            Table.ExpandTableColumn(
                SelfJoin, "NewColumn", 
                {"EndOfDayRate", "CurrencyKey"}, {"PrevRate", "PrevKey"}
            ),
        AddChange = 
            Table.AddColumn(
                PrevColumns, "Daily Change", each
                if [CurrencyKey] = [PrevKey] then [EndOfDayRate] - [PrevRate]
                else null, type decimal
            ),
        SelectColumns = 
            Table.SelectColumns(
                AddChange,
                {"CurrencyKey", "EndOfDayRate", "Date", "Daily Change"}
            )
    in
        SelectColumns
    
  2. You may have to wait several minutes for the preview data to display. To investigate what is going on, click on the Tools tab and then Start Diagnostics.
  3. Click back on the Home tab and click Refresh Preview.
  4. Wait for the preview data to refresh and then click back on the Tools tab and Stop Diagnostics.

    Figure 2.47: Diagnostics processing display

  5. When the Diagnostics processing completes, you will have a new query group called Diagnostics containing four queries for Diagnostics_Counters, Diagnostic_Detailed, Diagnostic_Aggregated, and Diagnostic_Partitions. Each of these queries is suffixed with a date and time stamp of when the diagnostics were run, and each has loading disabled.

    Figure 2.48: Diagnostics queries

  6. Click on the Diagnostics_Aggregated query. Select the Step, Category, Start Time, End Time, Exclusive Duration (%), and Exclusive Duration columns and remove the other columns.
  7. Sort the Exclusive Duration column in descending order.

    Figure 49: Diagnostic results

  8. Look for large jumps in the Exclusive Duration column; we see two such jumps between lines 3 and 4, and 4 and 5, with the larger jump being between lines 3 and 4. Also, note the high Exclusive Duration (%) value for row 3.
  9. Look at the Start Time and End Time columns for row 3 and note that this operation took 4 minutes and 26 seconds to complete. The Step column shows that the query step for row 3 is the PrevColumns step.
  10. Having zeroed in on the PrevColumns step of the query, click on the Diagnostics_Detailed query and filter the Step column to just show PrevColumns. If the PrevColumns value does not appear in the filtering dialog, use the Load More link or edit the query in Advanced Editor to add the Table.SelectRows expression manually: Table.SelectRows(#"Changed Type", each [Step] = "PrevColumns").
  11. Sort the query by the Exclusive Duration (%) column in Descending order.
  12. Note that the Operation column for the top row is DbDataReader.

At this point, it should be evident that this query would likely be better done in the source database system as a view or stored procedure.

How it works

The FactCurrencyRate query joins the base table with itself in order to compare subsequent rows of data. The goal is to compare the values of one row with the previous row in order to compute a value for a change in currency rate between days. This is done by sorting the table by first the currency and then the date using the Table.Sort function. Next, the Table.AddIndexColumn function is used twice, once to add an index column starting from 1 (Row Index column) and a second time to add an index column starting from 0 (Prev Index column). These two index columns are then used in a Table.NestedJoin function to join the table to itself. The Table.ExpandTableColumn function is used to expose the previous row's CurrencyKey and EndOfDayRate columns of the previous row. This information can then be used to create the Daily Change column using the Table.AddColumns function, and finally, only the essential columns are selected using the Table.SelectColumns function.

When you start Query Diagnostics, query diagnostic information is logged to JSON and CSV files stored in the application's directory. These files can be located by looking at the Source step for aggregated, detailed, and partition diagnostic queries and the CsvFiles step of the Counters diagnostic query. For the Power BI Desktop Store App, the path should be similar to the following:

C:\Users\[user]\Microsoft\Power BI Desktop Store App\Traces\Diagnostics

These files record diagnostic information performed by the query engine during processes such as refreshing the preview data. Stopping Query Diagnostics ends logging of the diagnostic data and generates queries for each enabled Query Diagnostics feature: Aggregated, Detailed, Performance counters, and Data privacy partitions. As diagnostic logging is costly in terms of performance and system resources, it is recommended to only use Query Diagnostics when troubleshooting a query's performance. In addition, only enable the minimal amount of diagnostic logging required to identify the problem—for example, often just starting with the Aggregated diagnostic data is enough to identify the problematic step(s).

There's more...

There is also a Diagnose Step feature available for Query Diagnostics. To see how Diagnose Step can be used, follow these steps:

  1. Open the Diagnostic Options from the ribbon of the Tools tab and uncheck the Performance counters and Data privacy partitions.
  2. In the Query Settings pane, select the SelfJoin step. You can now either right-click the SelfJoin step and select Diagnose or select Diagnose Step in the ribbon of the Tools tab.
  3. Once complete, two additional queries are added to the Diagnostics query group, FactCurrencyRate_SelfJoin_Detailed and FactCurrencyRate_SelfJoin_Aggregated, each suffixed with a date and time stamp.
  4. Click on the FactCurrencyRate_SelfJoin_Aggregated query.
  5. Add an Index column.
  6. Sort the Exclusive Duration column in descending order.
  7. Here we can see that the most expensive operation occurs early on in the process, at index 15 out of 3,000+ rows, and appears to be the initial selection of columns with the Data Source Query being the following:
    select [_].[CurrencyKey],
        [_].[DateKey],
        [_].[AverageRate],
        [_].[EndOfDayRate],
        [_].[Date]
    from [dbo].[FactCurrencyRate] as [_]
    order by [_].[CurrencyKey],
            [_].[DateKey]
    

It should be evident that diagnosing a single step of a query is faster and consumes fewer system resources than analyzing the entire query. Thus, it is recommended that you run only Aggregated diagnostics over an entire query to identify problematic steps, and then run Detailed diagnostics on those steps individually. If you look at the FactCurrencyRate_SelfJoin_Detailed, the most expensive operation is on line 60, and it is the DbDataReader operation. The SQL statement identified is actually on line 59 with the operation Execute Query. Thus, we can conclude that the most expensive operation performed was not in executing the query, but rather reading the data generated by the query.

See Also

You have been reading a chapter from
Microsoft Power BI Cookbook - Second Edition
Published in: Sep 2021
Publisher: Packt
ISBN-13: 9781801813044
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $19.99/month. Cancel anytime
Banner background image