Power BI licensing considerations
At this point, you may be wondering how Power BI licensing affects data modeling. It does, as each licensing tier comes with a set of features that can potentially affect the data modeling. Nevertheless, regardless of the licensing tier, Power BI Desktop is free. This section explains some licensing considerations related to data modeling.
The following table is a simplified version of the Power BI feature comparisons published on the Microsoft website separately based on different licenses:
Figure 1.21: A simplified version of Power BI feature comparisons
The following few sections briefly explain each feature.
Maximum size of an individual dataset
As the preceding table shows, we are limited to 1 GB for each dataset published to the Power BI Service under Free or Professional licensing. Therefore, managing the file size is quite important. There are several ways to keep the file size just below the limit, as follows:
- Import the necessary columns only.
- Import just a portion of data when possible. Explain the technology limitation to the business and ask whether you can filter out some data. For instance, the business may not need to analyze 10 years of data, so filter older data in Power Query.
- Use aggregations. In many cases, you may have the data stored in the source at a very low granularity. However, the business requires data analysis on a higher grain. Therefore, you can aggregate the data to a higher granularity, then import it into the data model. For instance, you may have data stored at a minute level. At the same time, the business only needs to analyze that data at the day level.
- Consider disabling the auto date/time settings in Power BI Desktop.
- Consider optimizing the data types.
We cover all the preceding points in the upcoming chapters.
Incremental data load
One of the most remarkable features available in Power BI is the ability to set up incremental data loads. Incremental data loading in Power BI is inherited from SSAS to work with large models. Power BI does not truncate the dataset and re-import all the data from scratch when the incremental data load is set up correctly. Instead, it only imports the data that has been changed since the last data refresh. Therefore, incremental data load can significantly improve the data refresh performance and decrease the processing load on the Power BI tenant. Incremental data load is available in both Professional and Premium licenses.
Hybrid tables
Microsoft announced a new feature in December 2021 called Hybrid tables, which takes the incremental data refresh capability to the next level. A hybrid table is a regular table with incremental data refresh where the table has one or more partitions in Import mode and another (the last partition) in DirectQuery mode. Therefore, we get all the performance benefits of the two worlds; the historical data is imported into the dataset and is available in memory, while we also have the real-time data in place as the last partition is in DirectQuery mode to the source system. The hybrid table capability is only available in Premium licenses.
Calculation groups
Calculation groups are similar to calculated members in Multi-Dimensional eXpressions (MDX) in SQL Server Analysis Services Multi-Dimensional (SSAS MD). Calculation groups were initially introduced in SSAS Tabular 2019. They are also available in Azure Analysis Services (AAS) and all Power BI licensing tiers.
It is a common scenario that we create (or already have) some base measures in the Power BI model. We then create multiple time intelligence measures on top of those base measures. Suppose we have three measures, as follows:
Product cost = SUM('Internet Sales'[TotalProductCost])
Order quantity = SUM('Internet Sales'[OrderQuantity])
Internet sales = SUM('Internet Sales'[SalesAmount])
Imagine a scenario when the business requires the following time intelligence calculations on top of the preceding measures:
- Year to date
- Quarter to date
- Month to date
- Last year to date
- Last quarter to date
- Last month to date
- Year over year
- Quarter over quarter
- Month over month
We have nine calculations to be built on top of the three measures in our model. Hence, we have 9 x 3 = 27 measures to build in our model. You can imagine how quickly the number of measures can increase in the model, so do not get surprised if someone says they have hundreds of measures in their Power BI model.
Another common scenario is when we have multiple currencies. Without calculation groups, we need to convert the values into strings and use the FORMAT()
function in DAX to represent the numbers in the currency format. Now, if you think about the latter point, combined with time intelligence functions, you see how serious the issue can get very quickly.
Calculation groups solve those sorts of problems. We cover calculation groups in Chapter 10, Advanced Data Modeling Techniques.
Shared datasets
As the name implies, a shared dataset is used across various reports in a workspace within the Power BI Service. Therefore, it is only available in the Power BI Professional and Power BI Premium licenses. This feature is quite crucial to data modelers. It provides more flexibility in creating a generic dataset, covering the business requirements in a single dataset instead of having several datasets that may share many commonalities.
Power BI Dataflows
Dataflows, also called Power Query Online, provide a centralized data preparation mechanism in the Power BI Service that other people across the organization can take advantage of. Like using Power Query in Power BI Desktop for data preparation, we can prepare, cleanse, and transform the data in dataflows. Unlike Power Query queries in Power BI Desktop that are isolated within a dataset after being published to the Power BI Service, with dataflows, we can share all data preparations, cleansing, and transformation processes across the organization.
We can create Power BI dataflows inside a workspace, which is available to Professional and Premium users. We cover Power BI dataflows in Chapter 13, Introduction to Dataflows.
Power BI Datamarts
The Datamart capability is a new feature announced in May 2022. The primary purpose of datamarts is to enable organizations to build self-service, no-code/low-code analytical solutions connecting to multiple data sources, creating ETL (Extraction, Transformation, and Load) pipelines with Power Query, and then loading the data into an Azure SQL Database. The datamart capability is currently available for Premium users only. We cover the datamarts capability in Chapter 15, New Options, Features, and DAX Functions.
We discussed the Power BI licensing considerations for data modeling in Power BI; the following chapter describes the iterative approach to data modeling.