Thursday, July 14, 2016

Set operation within MDX and months of cooperation case

When it comes to MDX optimization most likely you have to avoid Set functions, really. Sometimes it's impossible, but avoiding them as often as they show up is the key to reach good performance. The idea behind SSAS is every measure has a coordinate (tuple) and you can't put there more then one member. For example, if you have the following calculation:
([Date].[Day].&[20141001],[Measures].[Invoice Amount])
It returns amount for the hard-coded day, but what are you going to do when you want, say, the same for two days? Probably first your guess is appending missing member:
([Date].[Day].&[20141001],[Date].[Day].&[20141002],[Measures].[Invoice Amount])
The result is NULL. Why? You put here two members in one tuple. It's like you set the coordinate (X1,X2,Y1,Z1) which is invalid, because there can be the only one X. In order to calculate it you should tell what to do with this members. You may aggregate it by native aggregation function:
Aggregate(
    {[Date].[Day].&[20141001],[Date].[Day].&[20141002]},
    [Measures].[Invoice Amount]
)
Now it results properly as you wished. Note that, you cannot use Aggregate function with calculated measures, you have to set certain one instead (Sum, Avg, Min, etc).

Let's get to performance testing to feel the difference. What we get when recalculate the same measure by days within one months or using only on month member.

The difference is huge and when you have to recalculate the measure for every tuple. It's non so noticeable for small number of tuples, but when your cube is huge, your performance will be very low. So as I said at the beginning — avoid using Set functions within MDX. Let's get to the real example.

Months of cooperation case

Let's say, we want to analyze how many months the client work with us. In order to do so we have to add two extra measures min and max date by invoices.
Since the measures are done we get the result in YYYYMMDD format and can't get count of months
arithmetically. Actually we can get it by using Set functions without extra measures: 

As you see it's memory high-cost operation. Let's use new measures without Set operations. We have to find diffs between years and months:

The code:
 (
    Round(
        ([Measures].[Invoice Last Date] - [Measures].[Invoice First Date]) / 10000 -- (20151009 - 20140906) / 10000 = 1 (years diff)
    ) * 12 -- 1 * 12 = 12 (convert years count to months count)
)
+
(
    Round([Measures].[Invoice Last Date] / 100)
    -
    Round([Measures].[Invoice Last Date] / 10000) * 100
) -- 201510 - 201500 = 10 (last months count)
-
(
    Round([Measures].[Invoice First Date] / 100)
    -
    Round([Measures].[Invoice First Date] / 10000) * 100
) -- 201409 - 20140900 = 9 (first months count)
+
1 -- extra missing month

Also you can do the same with cropping value using left and right functions and a little bit slower than previous code.


The code:
(
    Left([Measures].[Invoice Last Date],4)
    -
    Left([Measures].[Invoice First Date],4)
) * 12
+
Right(Left([Measures].[Invoice Last Date],6),2)
-
Right(Left([Measures].[Invoice First Date],6),2)
+
1
To sum it up once again: try to avoid Set MDX functions for performance reasons. Yo can find list of Set functions here.

No comments:

Post a Comment