Linking fields between tables
There may be a requirement to create fields in a table that contain data from a separate table. In Excel, this would usually be achieved with a VLOOKUP function.
The sales model that has been developed in this chapter contains three tables, which define Products, Subcategory, and Category. When the user browses the model in a pivot table, each of these appears as a separate table in the PowerPivot Field List pane. However, in this model, the category and subcategory relate directly to the product, and our intent is to show these fields in the Products table.
Getting ready
This recipe assumes that the sales model created in the Adding fields to tables recipe is available and that the appropriate relationships exist among the Products, Subcategory, and Category tables.
How to do it…
Start by opening the PowerPivot window and then perform the following steps:
- Switch to the data view and create two new columns in the Products table titled Category and Subcategory. In the Category column, enter the following formula:
  =RELATED(Category[Category])
- In the Subcategory column, enter the following formula:
  =LOOKUPVALUE(Subcategory[Subcategory], Subcategory[product_id], Products[Product ID])
Tip
Formulas can be multiline (just like in Excel). To move to the next line when typing, simply press Alt + Enter.
- Hide the Subcategory and Category tables in the model by right-clicking on each table's tab and selecting Hide from Client Tools from the pop-up menu. Note that the hidden tables are still visible in the data view and diagram view, although they now appear more transparent.
How it works…
These two formulas achieve the same result but in different ways.
The RELATED function returns the specified column, based on the relationships within the data model. This can span more than one table (for example, a table related to the Category table could be referenced from the Products table). However, a relationship must be defined between all the linking tables that are spanned by the formula. Furthermore, because the formula relies on the relationships defined within the model, it will not result in an error, since the model enforces the integrity defined by those relationships.
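As a minimal sketch, and assuming the relationship between the Products and Subcategory tables described in the Getting ready section is in place, the Subcategory column could also be populated through that relationship rather than through an explicit lookup:

```dax
=RELATED(Subcategory[Subcategory])  // assumes a relationship exists between Products and Subcategory
```

The trailing comment is only there for illustration and can be dropped when entering the formula.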
The LOOKUPVALUE function is quite different from the RELATED function because it does not utilize or rely on a relationship within the model. That is, LOOKUPVALUE would still return the same results had the relationship not been defined between the Products and Subcategory tables. Furthermore, the LOOKUPVALUE function can use multiple columns as its lookup reference, which may be beneficial when a desired value in another table cannot be related to the source data through a single field. Note that relationships can only be defined on single columns. However, unlike the RELATED function, the LOOKUPVALUE function returns an error when the search conditions match more than one distinct result value in the lookup table.
Both formulas return results by creating a row context filter for each row in the source table.
It is considered best to utilize the relationship wherever possible. Therefore, the use of the RELATED function is preferred over the LOOKUPVALUE function. Furthermore, the RELATED function makes the model simpler for others to understand. However, the LOOKUPVALUE function does have some benefits. It allows the value to be determined based on multiple search conditions. The syntax for LOOKUPVALUE is defined as:
LOOKUPVALUE( <result_columnName> , <search_columnName>, <search_value> [, <search_columnName>, <search_value>] …)
Here, the result_columnName column is returned from a target table for the row where the search conditions are satisfied. Each condition is defined by a search_columnName parameter and a search_value parameter. This means that we specify the column to search (in the lookup table) and the value to search for, which is typically a field in the current table.
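To illustrate a lookup on more than one search condition, the following is a sketch only. It assumes a hypothetical PriceList table (with product_id, size, and list_price columns) and a hypothetical Size column in the Products table; none of these exist in the sales model built in this chapter.

```dax
=LOOKUPVALUE(
    PriceList[list_price],                        // result column (hypothetical table)
    PriceList[product_id], Products[Product ID],  // first search column and the value to match
    PriceList[size],       Products[Size]         // second search column and value (hypothetical columns)
)
```

Each search column is paired with the value from the current Products row. When no row in the lookup table satisfies all of the conditions, LOOKUPVALUE returns a blank.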