In this article, we will look at the basic concept of data modeling, its various types, and learn which technique is best suited for Qlik Sense dashboards. We will also learn about the methods for linking data with each other using joins and concatenation.
For this article, we will use the app created earlier in the book, as a starting point with a loaded data model. You can find it in the book's GitHub repository. You can also download the initial and final version of the application from the repository.
After downloading the initial version of the application, perform the following steps:
Data modeling helps business in many ways. Let's look at some of the advantages of data modeling:
There are various techniques in which data models can be built, each technique has its own advantages and disadvantages. The following are two widely-used data modeling techniques.
The entity-relationship modeling (ER modeling) technique uses the entity and relationships to create a logical data model. This technique is best suited for the Online Transaction Processing (OLTP) systems. An entity in this model refers to anything or object in the real world that has distinguishable characteristics. While a relationship in this model is the relationship between the two or more entities.
There are three basic types of relationship that can exist:
The dimensional modeling technique uses facts and dimensions to build the data model. This modeling technique was developed by Ralf Kimball. Unlike ER modeling, which uses normalization to build the model, this technique uses the denormalization of data to build the model.
Facts, in this context, are tables that store the most granular transactional details. They mainly store the performance measurement metrics, which are the outcome of the business process. Fact tables are huge in size, because they store the transactional records. For example, let's say that sales data is captured at a retail store. The fact table for such data would look like the following:
A fact table has the following characteristics:
The dimension table stores the descriptive data, describing the who, what, which, when, how, where, and why associated with the transaction. It has the maximum number of columns, but the records are generally fewer than fact tables. Dimension tables are also referred to as companions of the fact table. They store textual, and sometimes numerical, values. For example, a PIN code is numeric in nature, but they are not the measures and thus they get stored in the dimension table.
In the previous sales example that we discussed, the customer, product, time, and salesperson are the dimension tables. The following diagram shows a sample dimension table:
The following are the characteristics of the dimension table:
There are two types of dimensional modeling techniques that are widely used:
The advantages of the star schema model include the following:
The following diagram shows an example of the star schema model:
The following diagram shows an example of the snowflake schema model:
When it comes to data modeling in Qlik Sense, the best option is to use the star schema model for better performance. Qlik Sense works very well when the data is loaded in a denormalized form, thus the star schema is suitable for Qlik Sense development. The following diagram shows the performance impact of different data models on Qlik Sense:
Now that we know what data modeling is and which technique is most appropriate for Qlik Sense data modeling, let's look at some other fundamentals of handling data.
While working on data model building, we often encounter a situation where we want to have some fields added from one table into another to do some sort of calculations. In such situations, we use the option of joining those tables based on the common fields between them.
Let's understand how we can use joins between tables with a simple example. Assume you want to calculate the selling price of a product. The information you have is SalesQty in Sales Table and UnitPrice of product in Product Table. The calculation for getting the sales price is UnitPrice * SalesQty. Now, let's see what output we get when we apply a join on these tables:
There are various kinds of joins available but let's take a look at the various types of joins supported by Qlik Sense. Let's consider the following tables to understand each type better:
OrderNumber | Product | CustomerID | OrderValue |
100 | Fruits | 1 | 100 |
101 | Fruits | 2 | 80 |
102 | Fruits | 3 | 120 |
103 | Vegetables | 6 | 200 |
CustomerID | Name |
1 | Alex |
2 | Linda |
3 | Sam |
4 | Michael |
5 | Sara |
When you want to get the data from both the tables you use the Join keyword. When you just use only Join between two tables, it is always a full outer join. The Outer keyword is optional. The following diagram shows the Venn diagram for the outer join:
Now, let's see how we script this joining condition in Qlik Sense:
As the output of Outer Join, we got five fields, as shown in the preceding screenshot. You can also observe that the last two rows have null values for the fields, which come from the Order table, where the customers 4 and 5 are not present.
When you want to extract all the records from the left table and matching records from the right table, then you use the Left Join keyword to join those two tables. The following diagram shows the Venn diagram for left join:
Let's see the script for left join:
When you want to extract all the records from the right table and the matching records from the left table, then you use the right join keyword to join those two tables. The following diagram shows the Venn diagram for right join:
Let's see the script for right join:
When you want to extract matching records from both the tables, you use the Inner Join keyword to join those two tables. The following diagram shows the Venn diagram for inner join:
Let's see the script for inner join:
Sometimes you come across a situation while building the data model where you may have to append one table below another. In such situations, you can use the concatenate function. Concatenating, as the name suggests, helps to add the records of one table below another. Concatenate is different from joins. Unlike joins, concatenate does not merge the matching records of both the tables in a single row.
When the number of columns and their naming is same in two tables, Qlik Sense, by default, concatenates those tables without any explicit command. This is called the automatic concatenation. For example, you may get the customer information from two different sources, but with the same columns names. In such a case, automatic concatenation will be done by Qlik, as is shown in the following screenshot:
You can see in the preceding screenshot that both the Source1 and Source2 tables have two columns with same names (note that names in Qlik Sense are case-sensitive). Thus, they are auto concatenated. One more thing to note here is that, in such a situation, Qlik Sense ignores the name given to the second table and stores all the data under the name given to the first table.
The output table after concatenation is shown in the following screenshot:
There will be some cases in which you would like to concatenate two tables irrespective of the number of columns and name. In such a case, you should use the keyword Concatenate between two Load statements to concatenate those two tables. This is called the forced concatenation.
For example, if you have sales and budget data at similar granularity, then you should use the Concatenate keyword to forcefully concatenate both tables, as shown in the following screenshot:
The output table after loading this script will have data for common columns, one below the other. For the columns that are not same, there will be null values in those columns for the table in which they didn't exist. This is shown in the following output:
You can see in the preceding screenshot that the SalesAmount is null for the budget data, and Budget is null for the sales data.
In some situations when even though the columns and their name from the two tables are the same, you may want to treat them differently and don’t want to concatenate them. So Qlik Sense provides the NoConcatenate keyword, which helps to prevent automatic concatenation.
You should handle the tables properly; otherwise, the output of NoConcatenate may create a synthetic table.
In this section, we will learn how to filter the data while loading in Qlik Sense. As you know, there are two ways in which we can load the data in Qlik Sense: either by using the Data manager or the script editor. Let's see how to filter data with each of these options.
When you load data using the Data manager, you get an option named Filters at the top-right corner of the window, as shown in the following screenshot:
This filter option enables us to set the filtering condition, which loads only the data that satisfies the condition given. The filter option allows the following conditions:
Using the preceding conditions, you can filter the text or numeric values of a field. For example, you can set a condition such as Date >= '01/01/2012' or ProductID = 80. The following screenshot shows such conditions applied in the Data load editor:
If you are familiar with the Load statement or the SQL Select statement, it will be easy for you to filter the data while loading it. In the script editor, the best way to restrict the data is to include the Where clause at the end of the Load or Select statement; for example, Where Date >= '01/01/2012'.
When you use the Where clause with the Load statement, you can use the following conditions:
When you write the Where clause with the SQL Select statement, you can use the following conditions:
The following screenshot shows an example of both the statements:
This article walked you through various data modeling techniques. We also saw different types of joins and how we can implement them in Qlik Sense. Then, we learned about concatenation and the scenarios in which we should use the concatenation option. We also looked at automatic concatenation, forced concatenation, and NoConcatenation. Further, we learned about the ways in which data can be filtered while loading in Qlik Sense.
If you found this post useful, do check out the book, Hands-On Business Intelligence with Qlik Sense. This book teaches you how to create dynamic dashboards to bring interactive data visualization to your enterprise using Qlik Sense.
5 ways to create a connection to the Qlik Engine [Tip]
What we learned from Qlik Qonnections 2018
Why AWS is the preferred cloud platform for developers working with big data