Basic sorting and ranking
Sorting and ranking are very common requirements in business analysis, and MDX provides several functions for this purpose. They are:
- TopCount and BottomCount
- TopPercent and BottomPercent
- TopSum and BottomSum
- ORDER
- Hierarchize
- RANK
All of these functions operate on sets of tuples, not just on one-dimensional sets of members. Most of them take a numeric expression, which is used to evaluate the sorting or the ranking; Hierarchize is the exception, because it orders a set by position in the hierarchy.
Getting ready
We will start with the classic top five (or top-n) example using the TopCount() function. We will then examine how the result is already pre-sorted, followed by using the ORDER() function to sort the result explicitly. Finally, we will see how we can add a ranking number by using the RANK() function.
Here is the classic top five example using the TopCount() function:
  TopCount (
    [Product].[Subcategory].children,
    5,
    [Measures].[Internet Sales Amount]
  )
It evaluates the numeric expression [Measures].[Internet Sales Amount] for each member of the set [Product].[Subcategory].children.
The result is the five [Subcategory] members that have the highest [Internet Sales Amount].
The five subcategory members will be returned in order from the largest [Internet Sales Amount] to the smallest.
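To make the behavior concrete, here is a minimal Python sketch that models what TopCount() does to a flat set: sort by the numeric expression in descending order, then keep the first n members. It is only a model, not how Analysis Services implements the function. The helper name top_count and the sales figures are hypothetical and made up for illustration.

```python
# Illustrative figures only; these are NOT real Adventure Works results.
sales = {
    "Road Bikes": 900.0,
    "Helmets": 120.0,
    "Mountain Bikes": 750.0,
    "Tires and Tubes": 80.0,
    "Touring Bikes": 400.0,
    "Jerseys": 60.0,
    "Gloves": 30.0,
}


def top_count(members, n, value_of):
    """Model of TopCount: sort descending by value, keep the first n."""
    # sorted() is stable, so members with equal values keep their input order.
    ranked = sorted(members, key=value_of, reverse=True)
    return ranked[:n]


top5 = top_count(list(sales), 5, sales.get)
for name in top5:
    print(f"{name}: {sales[name]}")
```

Note that the five members come back largest first, which is why no separate sorting step is needed when the desired order is by the same measure.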
How to do it...
In SSMS, let us write the following query in a new Query Editor, against the Adventure Works DW 2016 database. Follow these steps to first get the top-n members:
- We simply place the earlier TopCount() expression on the rows axis.
- On the columns axis, we show the actual Internet Sales Amount for each product subcategory.
- In the slicer, we use a tuple to slice the result for the first calendar quarter of 2013 and the Southwest sales territory region only.
- The final query should look like the following query:
  SELECT
    [Measures].[Internet Sales Amount] ON 0,
    TopCount (
      [Product].[Subcategory].children,
      5,
      [Measures].[Internet Sales Amount]
    ) ON 1
  FROM [Adventure Works]
  WHERE (
    [Date].[Calendar].[Calendar Quarter].&[2013]&[1],
    [Sales Territory].[Sales Territory Region].[Southwest]
  )
- Run the query. The following screenshot shows the top-n result:
- Notice that the returned members are in order from the largest numeric measure to the smallest.
Next, in SSMS, follow these steps to explicitly sort the result:
- This time, we will put the TopCount() expression in the WITH clause, creating it as a Named Set. We will name it [Top 5 Subcategory].
- On the rows axis, we will use the ORDER() function, which takes two parameters: which members we want to return and what value we want to evaluate on for sorting. The named set [Top 5 Subcategory] is what we want to return, so we will pass it to the ORDER() function as the first parameter. The .MemberValue function gives us the product subcategory name, so we will pass it to the ORDER() function as the second parameter. Here is the ORDER() function expression we would use:
  ORDER (
    [Top 5 Subcategory],
    [Product].[Subcategory].MEMBERVALUE
  )
  Here is the final query for sorting the result:
  -- Order members with MemberValue
  WITH
  SET [Top 5 Subcategory] AS
    TopCount (
      [Product].[Subcategory].CHILDREN,
      5,
      [Measures].[Internet Sales Amount]
    )
  SELECT
    [Measures].[Internet Sales Amount] ON 0,
    ORDER (
      [Top 5 Subcategory],
      [Product].[Subcategory].MEMBERVALUE
    ) ON 1
  FROM [Adventure Works]
  WHERE (
    [Date].[Calendar].[Calendar Quarter].&[2013]&[1],
    [Sales Territory].[Sales Territory Region].[Southwest]
  )
- Executing the preceding query, we get the sorted result as the screenshot shows:
Finally, in SSMS, follow these steps to add ranking numbers to the top-n result:
- We will create a new calculated measure, [Subcategory Rank], using the RANK() function, which simply returns the one-based ordinal position of each tuple in the set [Top 5 Subcategory]. Since the set is already ordered, the ordinal position of the tuple will give us the correct ranking. Here is the expression for the RANK() function:
  RANK (
    [Product].[Subcategory].CurrentMember,
    [Top 5 Subcategory]
  )
- The following query is the final query. It is built on top of the first query in this recipe. We have added the earlier RANK() function and created a calculated measure, [Measures].[Subcategory Rank], which is placed on the columns axis along with the Internet Sales Amount:
  WITH
  SET [Top 5 Subcategory] AS
    TopCount (
      [Product].[Subcategory].children,
      5,
      [Measures].[Internet Sales Amount]
    )
  MEMBER [Measures].[Subcategory Rank] AS
    RANK (
      [Product].[Subcategory].CurrentMember,
      [Top 5 Subcategory]
    )
  SELECT
    { [Measures].[Internet Sales Amount],
      [Measures].[Subcategory Rank] } ON 0,
    [Top 5 Subcategory] ON 1
  FROM [Adventure Works]
  WHERE (
    [Date].[Calendar].[Calendar Quarter].&[2013]&[1],
    [Sales Territory].[Sales Territory Region].[Southwest]
  )
- Run the preceding query. The ranking result is shown in the following screenshot:
How it works...
Sorting functions, such as TopCount(), TopPercent(), and TopSum(), operate on sets of tuples. These tuples are evaluated on a numeric expression and returned pre-sorted in descending order of that expression.
Using the ORDER() function, we can sort members from dimensions explicitly using the .MemberValue function.
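As a rough illustration of this re-sorting step, the following Python sketch models ORDER() over a top-n set that arrives sorted by sales and leaves sorted by name. The helper name order and the figures are hypothetical, and the model ignores hierarchies, which the real ORDER() function respects unless a hierarchy-breaking flag is used.

```python
# Hypothetical top-n set, already sorted by a made-up sales figure (largest first).
top5 = [
    ("Road Bikes", 900.0),
    ("Mountain Bikes", 750.0),
    ("Touring Bikes", 400.0),
    ("Helmets", 120.0),
    ("Tires and Tubes", 80.0),
]


def order(tuples, key, descending=False):
    """Model of ORDER on a flat set: return a new list sorted by key.

    Ascending is the default here, mirroring ORDER's default direction.
    """
    return sorted(tuples, key=key, reverse=descending)


# Sort by the member name, the role played by .MemberValue in the recipe.
by_name = order(top5, key=lambda t: t[0])
for name, amount in by_name:
    print(name, amount)
```

The input list is left untouched, which mirrors the recipe: the named set keeps its sales order while the rows axis shows the members by name.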
When a numeric expression is not specified, the RANK() function can simply be used to display the one-based ordinal position of tuples in a set.
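A small Python sketch of this positional behavior may help. It assumes the set is already ordered, as in the recipe, and the helper name rank is hypothetical. Returning 0 for a member that is not in the set is our understanding of how MDX behaves; treat it as an assumption of the sketch.

```python
# Hypothetical named set, already ordered from largest to smallest sales.
top5_subcategory = [
    "Road Bikes",
    "Mountain Bikes",
    "Touring Bikes",
    "Helmets",
    "Tires and Tubes",
]


def rank(member, ordered_set):
    """One-based position of member in ordered_set, or 0 if it is absent."""
    try:
        return ordered_set.index(member) + 1
    except ValueError:
        # Assumption: a member outside the set gets rank 0.
        return 0


ranks = {m: rank(m, top5_subcategory) for m in top5_subcategory}
print(ranks)
print(rank("Gloves", top5_subcategory))
```

Because the rank is only a position, it is meaningful as a ranking only when the set was sorted beforehand.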
There's more...
Like the other MDX sorting functions, the RANK() function can also operate on a numeric expression. If a numeric expression is specified, the RANK() function assigns the same rank to tuples with duplicate values in the set.
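The tie handling can be sketched in Python as follows. This is a model under stated assumptions: ranking is by descending value, tied members share a rank, and the ranks that follow a tie are skipped. The helper name rank_by_value and the figures are hypothetical.

```python
# Made-up values with a deliberate tie between B and C.
amounts = {"A": 500.0, "B": 300.0, "C": 300.0, "D": 100.0}


def rank_by_value(member, members, value_of):
    """Rank with ties: 1 + the number of members with a strictly larger value.

    Assumption: larger values rank first. Returns 0 if member is not in members.
    """
    if member not in members:
        return 0
    own = value_of(member)
    return 1 + sum(1 for m in members if value_of(m) > own)


members = list(amounts)
ranks = {m: rank_by_value(m, members, amounts.get) for m in members}
print(ranks)
```

In this model, B and C share rank 2 and D gets rank 4, so no member receives rank 3.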
It is also important to understand that the RANK() function does not order the set. Because of this, it is tempting to do the ordering and the ranking at the same time. However, in the last query of this recipe, we first built the ordered named set [Top 5 Subcategory] (TopCount() returns its result already sorted) and then ranked against that set. This way, the sorting is done only once, followed by a linear scan, and the result is presented in sorted order.
As a good practice, we recommend first ordering the set, for example with the ORDER() function, and then ranking the tuples that are already sorted.