What this book covers
Chapter 1, Setting Targets and Identifying Problem Areas, describes a Power BI solution as a stream of data from multiple sources reaching consumers in a consolidated fashion. We look at how data can be stored in Power BI and the different paths it can take before reaching a user. Many of the initial architectural design choices made in the early stages of the solution are very difficult and costly to switch later. That’s why it’s important to have a solid grasp of the implications of those choices and how to decide what’s best at the start.
Chapter 2, Exploring Power BI Architecture and Configuration, looks at data storage modes in Power BI and how the data reaches the data model while giving some general guidance to improve throughput and latency. The storage mode chosen can limit size and data freshness. This is important because users often demand up-to-date data, historical data, and aggregated data. The chapter also covers how to best deploy Power BI gateways, which are commonly used to connect to external data sources.
Chapter 3, Learning the Tools for Performance Tuning, shows that the easiest way to see where time is being spent in reports is to use Performance Analyzer in Power BI Desktop, which gives detailed breakdowns for every user action on a per-visual basis. Queries captured by this tool can be run in DAX Studio for a server timings breakdown and deeper analysis. In addition, Tabular Editor can be used to examine the properties and syntax of measures for performance tuning.
Chapter 4, Analyzing Logs and Metrics, describes how performance can only be improved if it can be measured objectively. Therefore, this chapter covers all the sources of performance data and how to make sense of the information provided to identify the parts of the solution that are bottlenecks. This includes useful native and third-party utilities. We also provide guidelines to help monitor and manage performance continuously.
Chapter 5, Optimization for Storage Models, describes how, with the proliferation of data lakes, more options are available for performance improvements with DirectQuery or Direct Lake. Synapse has brought Massively Parallel Processing (MPP) from big data to analytical databases. DirectQuery can use columnstore tables in Synapse and other cloud MPP systems. Using aggregations with external DirectQuery data sources has become a common choice for large fact tables. Optimizations can be made in both Power BI and the external sources to avoid hitting limits too quickly.
Chapter 6, Third-Party Utilities, covers a few popular third-party utilities that are effective in performance investigation and tuning and walks through typical use cases: connecting them to Power BI, collecting metrics, and knowing what to look for when diagnosing performance problems.
Chapter 7, Performance Governance and Framework, talks about how the metrics and tools covered in earlier chapters are essential building blocks for performance management. However, success is more likely with a structured and repeatable approach to build performance-related thinking into the entire Power BI solution lifecycle. This chapter provides guidelines to set up data-driven processes to avoid sudden scale issues for new content and prevent degradations for existing content.
Chapter 8, Loading, Transforming, and Refreshing Data, explains how loading data periodically is a critical part of any analytical system, and in Power BI, this applies to Import mode semantic models. Data refresh operations in Import mode are CPU- and memory-intensive, which can lead to long delays or failures, especially with large semantic models. This can leave users with stale data or slow down development significantly, which is why data refresh should be designed with performance in mind.
Chapter 9, Report and Dashboard Design, covers reports and dashboards, which are the “tip of the iceberg” in a Power BI solution since they are what consumers interact with regularly. This chapter covers important considerations and practices to apply regarding visual layout, configuration, and slicing/filtering. It also looks at paginated reports, which behave differently from interactive reports and have special performance considerations.
Chapter 10, Dimensional Modeling and Row-level Security, describes how the Power BI semantic model is where data lands after being shaped and from where it is retrieved for analysis. Hence, it is arguably the most critical piece at the core of a Power BI solution. Power BI’s feature richness and modeling flexibility provide alternatives when modeling data. Some choices can make development easier at the expense of query performance and/or semantic model size. This chapter provides guidance on model design, size reduction, and faster relationships.
Chapter 11, Improving DAX, covers DAX formulas, which allow BI developers to add a diverse range of additional functionality into the model. Different DAX formulas can produce the same correct result, and developers may not realize that one version is significantly slower in certain query or visual configurations. This chapter highlights common DAX issues and recommended practices to get calculations performing at their best. It also includes definitions and examples for calculated columns and measures, with a dive into the filter context.
Chapter 12, High-Scale Patterns, explains how the amount of data organizations collect and process is increasing all the time. Even with Power BI’s data compression technology, it isn’t always possible to load and store massive amounts of data in an Import mode model in a reasonable amount of time. This problem is worse when you must support hundreds or thousands of users in parallel. This chapter covers the options available to deal with such issues by leveraging Azure technologies and Power BI aggregations and composite models. In addition, it uses Fabric and Synapse to speed up data sources, including the lakehouse.
Chapter 13, Working with Capacities, covers working with and monitoring capacity. Power BI offers dedicated capacity, higher limits, and many additional capabilities such as paginated reports and AI. This does, however, require diligent capacity management to prevent resource exhaustion. This chapter covers each of the available workload settings in detail. We then look at usage and load scenarios ranging from ideal to extreme and how the capacity manages its memory in each case. We also look at the Microsoft-provided template apps to monitor capacities.
Chapter 14, Performance Needs for Fabric Artifacts, talks about how Fabric brings new artifacts into the capacity and updates some existing ones. Capacity performance is affected by pipelines, lakehouse and warehouse structures, and a destination added for Dataflow Gen2. These all have resource requirements, and many people will be guided toward a different capacity to use Fabric features.
Chapter 15, Embedding in Web Apps, teaches how embedding Power BI content in a custom web app is a great way to expose data analytics within a completely customized UI experience, along with other non-Power BI content. This pattern does introduce additional considerations since the Power BI application is hosted externally and reached via API calls. This chapter looks at how to do this efficiently and then measure performance.