Working with advanced operations
In this section, we will focus on advanced dataset operations. We will learn to add calculated fields to our dataset and apply dataset operations such as filtering and joining. Finally, we will add security controls, including row- and column-level filters. For this section, we will use our SPICE dataset that we configured earlier in this chapter.
Adding calculated fields
QuickSight allows BI developers to add calculated fields to their datasets at the data preparation stage. This dataset, including its calculated fields, can then be used to develop multiple analyses.
There are a number of prebuilt functions with QuickSight that we can use when building a calculated field. Those functions include the following:
- Aggregate functions are used to calculate metrics over a number of values. For example, we can use the
min()
function to calculate the minimum value of a set of values. - Conditional functions allow us to use conditional logic in a calculated field, for example,
isNotNull()
returnstrue
if the argument passed is not null. - Date functions help us work with date column types. For example, we can use the
dateDiff()
function to calculate the difference between two date fields. - Numeric functions and string functions can help us work with numerical values and string values respectively. For example, we can calculate the absolute value of a given expression using the
abs()
function or we can concatenate two or more strings using theconcat()
function. - Finally, table calculations, can be used to discover how dimensions influence specific measures. For example, the
sumOver()
function calculates the sum of a measure partitioned by a list of dimensions.
To add a calculated field, we do the following:
- First, we need to open our dataset in the dataset editor.
- Then, we can simply select the Add calculated field option to get to the calculated field editor, as shown here:
- Next, let's add our first calculated field. In our dataset, while we have the pick-up time and drop-off time, it might be interesting to get the difference between these two timestamps in minutes. To calculate this, we can type the following:
dateDiff({Pick up Date},{tpep_dropoff_datetime},"MI")
- Finally, let's give a descriptive name, for example
Trip duration(mins)
, and click Save.
Now we can observe our new calculated field in the dataset editor. Next, we will see other dataset operations including filtering and joining.
Filtering and joining datasets
In this section, we will discuss how to filter and join data sources together.
Filtering datasets
There are scenarios where you might need a subset of the table from a data source. QuickSight supports data source filtering so you can import the required subset of a dataset into SPICE. That allows you to build efficient BI solutions by only using the SPICE capacity that's really required. Adding a filter in QuickSight is simple. In the following example, we will see an example of how to filter out columns and rows from the dataset we created earlier:
- First, you will need to access the dataset editor. On the left-hand side, you can find the Excluded fields and Filters menus.
- For this example, let's assume that we aren't interested in the information in the
store_and_fwd_flag
column. To exclude this column from our dataset, expand the Excluded fields option and then select store_and_fwd_flag. - Now that we have removed a column, let's apply a filter to remove rows. Let's assume that for this analysis, we don't want to include trips if the distance is shown as 0.0, and we only want data where the value is greater than 0.
- Expand the Filter menu and select the
trip_distance
column. Select Greater than from the drop-down menu and then 0. After adding those values, the filter should look as in the following screenshot: - Next, click Apply. QuickSight will trigger a refresh job for the dataset.
Now that we have learned how to apply filters on datasets, in the next section, we will enrich our dataset by joining it with reference data.
Joining datasets
Joining datasets can be used to combine datasets together and provide a single view of our data. Having relevant and good-quality data is key to the success of our BI application. For this example, we will enrich the dataset we used earlier with a reference dataset that can be downloaded from the website provider. Alternatively, we can create a lookup CSV file and upload it to QuickSight. Let's begin:
- First, let's download our small lookup table from https://s3.amazonaws.com/nyc-tlc/misc/taxi+_zone_lookup.csv.
- Next, on the dataset editor, click Add data and then upload the file.
- Upload the lookup file.
- Next, select the location ID for the join column and select Left as the join configuration. See the following screenshot for an example.
- Once applied, you can confirm the new columns in the dataset editor.
Now that we have learned how to join datasets, we will learn how to add security controls to our datasets.