A Guide to Currency Conversions in SSAS

In this post I will describe different ways we can do currency conversions in SSAS, which should cover most, if not all, requirements. Hopefully, it will also provide some best-practice advice along the way.

As background and reference, I will use some other blog posts, the most notable of which are Christian Wade’s:

http://consultingblogs.emc.com/christianwade/archive/2006/08/24/currency-conversion-in-analysis-services-2005.aspx

http://consultingblogs.emc.com/christianwade/archive/2006/07/25/Measure-expressions_3A00_-how-performant-are-they_3F00_.aspx

The first one deals with currency conversions and compares the Business Intelligence Wizard approach with Measure Expressions, clearly favouring the Measure Expressions (MEs). The second post explores MEs and their strengths as opposed to MDX calculations. Both posts are very interesting and useful. There is a slight inaccuracy in the summary section of the first post, which describes MEs as stored on disk; in fact, MEs are not stored on disk but are calculated at runtime. Teo Lachev explains their behaviour here:

http://prologika.com/CS/forums/p/835/3064.aspx

And another reference to MEs can be found in the SQL Server 2005 Performance Guide:

http://download.microsoft.com/download/8/5/e/85eea4fa-b3bb-4426-97d0-7f7151b2011c/SSAS2005PerfGuide.doc

The last, and possibly most helpful, piece of evidence for me was the confirmation of this behaviour I got from Gerhard Brueckl, Philip Stephenson and Darren Gosbell in this MSDN Forum thread:

http://social.msdn.microsoft.com/Forums/en/sqlanalysisservices/thread/61cc5840-f8f1-45b6-9a9b-f9af4b21513e

Darren Gosbell also emailed me with another small clarification which could have a big impact on your solutions – the fact that no aggregations are used for a measure group in SSAS when at least one of its measures has a ME defined on it. This could be very important in some cases. Teo Lachev has blogged about this limitation here:

http://prologika.com/CS/blogs/blog/archive/2010/05/22/in-search-of-aggregations.aspx

Now that we have some background knowledge of how currency conversions can be done, I will continue with a brief discussion of how currency amounts can be stored in a data mart.

In the vast majority of cases, a business works with a “base currency” – the default currency to which all currency amounts throughout all transactions are conformed. When a currency conversion needs to be made, we typically multiply or divide the base currency amount by a currency rate, which gives us the amount in a non-base currency. To implement this approach, we could simply follow Christian Wade’s ideas and use Measure Expressions, which give us the best performance (keeping in mind, of course, the limitations of Measure Expressions).
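
To make this more concrete, here is a minimal sketch. Note that a Measure Expression is not part of the MDX script – it is a property set on the physical measure in the cube designer – and all names here ([Sales Amount], [Average Rate], the ‘Sales’ measure group, the [Date].[Calendar] hierarchy) are hypothetical:

/* The Measure Expression property on a [Sales Amount] measure would simply be:
       [Sales Amount] / [Average Rate]
   and is evaluated at the leaf level before aggregation. A rough cube-script
   equivalent, shown only to illustrate the logic (and much slower, as it
   forces the division down to the daily grain): */
CREATE MEMBER CURRENTCUBE.[Measures].[Converted Sales Amount]
AS
    SUM(Descendants([Date].[Calendar].CurrentMember,
                    [Date].[Calendar].[Date]),
        [Measures].[Sales Amount] / [Measures].[Average Rate]),
ASSOCIATED_MEASURE_GROUP = 'Sales',
VISIBLE = 1;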

Another approach is to store the base currency amount, as well as amounts for the most commonly used currencies throughout the organisation, as physical measures. As a result we end up with a few measure columns corresponding to the different currencies (e.g. USDAmount, EURAmount, AUDAmount). We then add these to our measure group and build a SCOPE statement, which gives us the correct measure when our Currency dimension is used (a sketch follows below). If we want to convert to a currency other than the ones we have already converted to, we have to fall back on the previous approach, treating one of the stored currencies as the base currency. Because we work with physical measures in the majority of cases, this implementation solves some of our performance problems. However, it suffers from increased storage space requirements, which for a very large implementation could be severe. Also, if we have multiple measures to convert, we need to store [Number of Measures] x [Number of Frequently Used Currencies – 1] additional measure columns in our database, and subsequently in our cube. And when I say “solves some of our performance problems”, our problems are really solved only when users request the currencies we have pre-converted amounts for; in all other cases we are back at the worst possible case – a complete conversion calculation over our measures.
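
For illustration, such a SCOPE statement could look like the following sketch, assuming hypothetical physical measures [Amount] (the base amounts), [USD Amount] and [EUR Amount]:

/* Redirect the base measure to the pre-converted physical measures when the
   user slices by one of the frequently used currencies. Anything else falls
   through to the regular (Measure Expression based) conversion. */
SCOPE ([Measures].[Amount]);
    SCOPE ([Currency].[Currency Code].&[USD]);
        THIS = [Measures].[USD Amount];
    END SCOPE;
    SCOPE ([Currency].[Currency Code].&[EUR]);
        THIS = [Measures].[EUR Amount];
    END SCOPE;
END SCOPE;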

There is a third tactic, which I have recently been working on, and combined with the previous two it could potentially yield the best possible results. The idea is to store the base currency amount in one column and a local currency amount in another, and to add a Local Currency dimension to the cube. The local currency measure contains the base amount converted to the currency which is “local” for that row of data. For example, if a company has offices in Sydney and London, the local amounts stored against the Australian offices (based on business unit or geography) will be in AUD, while the amounts for the London ones will be in GBP. The base currency could still be AUD, in which case the amounts in our BaseAmount column will always be in AUD. Once we have set this up, we can do the following:

  1. For reports in the base currency: Use the base currency amount only
  2. For reports in non-base currency:
    1. Take the LocalAmount from the fact table, where the local currency is the selected currency
    2. Convert the BaseAmount from the fact table, where the local currency is not the selected currency
    3. Sum the amounts we get from the previous two steps

In my current solution I have the following cube (showing only the Dimension Usage tab, which I believe illustrates the structure best):


Note that I have a Currency Conversion measure group, which stores conversion rates from the base currency for each date. It has a many-to-many relationship with the other two measure groups in my cube, and those two measure groups are related to the Local Currency dimension as described above.

Then the MDX in the cube script for my Deal measure (the physical measure is called [Amount - Trade]) is:

/* This calculation picks the Local Currency Amount from the fact table, with
   no currency conversions applied */
CREATE MEMBER CURRENTCUBE.[Measures].[Local Trade Currency Amount]
AS
    (LinkMember([Currency].[Currency Code].CurrentMember,
                [Local Currency].[Currency Code]),
     [Currency].[Currency Code].[AUD],
     [Measures].[Local Trade Amount]),
ASSOCIATED_MEASURE_GROUP = 'Deal',
VISIBLE = 0;

/* Here we pick all the amounts in Base Currency whose local currency is not
   the one selected by the user, and pull the converted amounts for them (the
   conversion itself is done in a Measure Expression) */
CREATE MEMBER CURRENTCUBE.[Measures].[Converted Local Trade Currency Amount]
AS
    SUM([Local Currency].[Currency Code].[Currency Code] -
            LinkMember([Currency].[Currency Code].CurrentMember,
                       [Local Currency].[Currency Code]),
        [Measures].[Trade Amount]),
ASSOCIATED_MEASURE_GROUP = 'Deal',
VISIBLE = 0;

/* This measure combines the previous two to retrieve the overall amount
   in the selected currency */
CREATE MEMBER CURRENTCUBE.[Measures].[Amount - Trade]
AS
    SUM({[Measures].[Local Trade Currency Amount],
         [Measures].[Converted Local Trade Currency Amount]}),
FORMAT_STRING = "Currency",
ASSOCIATED_MEASURE_GROUP = 'Deal',
VISIBLE = 1;

When we generate a report in the base currency, performance is the best possible, as no conversions are made at all (provided we utilize the Direct Slice property as described in Christian Wade’s blog posts above). The same holds for reports in a non-base currency where 100% of the data can be retrieved from the LocalAmount measure. The worst case comes when we request a currency which is not the local currency for any part of the slice requested by our query; in that case we are back to the first approach, converting every amount and summing the results. If some of the data can be retrieved as a LocalAmount and some cannot, we are somewhere between the best and worst cases, and performance depends on the number of conversions which have to be performed.

I consider this a useful solution, as in many cases reports are either specific to an office/country and generated in the local currency only, or are “global” and use only the organisation’s base currency. Therefore, using this approach, we get the best possible performance with no severe trade-off in storage space.

In the end it depends on the specific organisation and requirements, but we can always combine the second and third solutions – store a few converted amounts as physical measures, including a LocalAmount, and convert to a currency different from these with Measure Expressions only when absolutely necessary. This way we essentially take a “best of both worlds” approach and can obtain the best possible results from our solution.

Project Phoenix – Spreading the Word

If you have not heard about it: Project Phoenix is a global program supported by Microsoft MVPs, who are contributing their MSDN subscriptions to noteworthy causes. Recently Rod Colledge joined the effort and brought Project Phoenix Down Under. A brief description of the project:

“…The idea is to provide the recipient access to all of the tools needed to improve his/her skills, an opportunity to gain practical experience, the potential to earn a recommendation and/or referral – and to positively contribute to society as a form of ‘give-back’. No free lunch, just sweat equity – the kind that makes us all feel good for the effort…”

I have always been in favour of such initiatives and I would like to see more MVPs doing what Rod is doing, as well as companies and IT professionals joining in and contributing more towards the cause.

For more information, please refer to:

http://www.rodcolledge.com/rod_colledge/project-phoenix.html

And I will also ask you to help by contacting either Rod or me:

  1. If you know of a suitable unemployed or under-employed software developer who would appreciate and benefit from this initiative
  2. If you are aware of a non-profit organisation in need of a custom software solution

At the very least, you can help by promoting the project in any way you can.

Tech Ed Online Video Released – Budgeting, Forecasting and Performance Management with the Microsoft BI Stack

Just a quick one: I noticed that most of the sessions from Microsoft Tech Ed Australia 2010 have been released, including my session. If you don’t mind thick accents (French and Bulgarian), feel free to have a look:

http://www.msteched.com/2010/Australia/DAT307

Ad-Hoc Ranges in SSAS

We can easily build ranges in MDX with the range operator “:”. We can also easily create a Range dimension in SSAS and use it to slice our data. This post is about neither of those. Instead, I would like to discuss the scenario where we need to restrict an ad-hoc query (e.g. a PivotTable in Excel) by a range of values. Usually we would tell our users to simply select the values they need. This works. However, if our users do not want to select and deselect many values, we can provide an easier way.

Let’s assume we have an Age dimension for a Kindergarten cube. The Age dimension contains the ages of the children, which can be from 1 to 10. Our requirement is to be able to select low and high limits of ages for a PivotTable in Excel, so that the data in the PivotTable is sliced to the range of ages between those limits.

To implement this in practice, we can build two extra dimensions – Age – Low Limit and Age – High Limit – which contain the same members as the Age dimension, and then use them to slice our cube. Because the data is the same for all three dimensions, we can use a couple of database views on top of the Age dimension table, thus ensuring that all three stay in sync:
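
Something along these lines (the DimAge table and column names are hypothetical):

-- Hypothetical schema: DimAge(AgeKey int, Age int).
-- Both limit dimensions are simple views over DimAge, so they cannot
-- drift out of sync with the Age dimension itself.
CREATE VIEW dbo.DimAgeLowLimit AS
SELECT AgeKey, Age FROM dbo.DimAge;
GO
CREATE VIEW dbo.DimAgeHighLimit AS
SELECT AgeKey, Age FROM dbo.DimAge;
GO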

After that, we build two bridging tables – BridgeLowLimit and BridgeHighLimit – between Age and Age – Low Limit, and between Age and Age – High Limit:

The data in these bridging tables maps each Low and High limit member to all Age members which are lower (for the High limit) or higher (for the Low limit) than that limit member:

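For illustration, the bridging tables could be populated along these lines (table and column names are hypothetical, and I am assuming the limits are inclusive, so that a 3-6 range includes both 3 and 6):

-- Each Low Limit member maps to every age greater than or equal to it,
-- and each High Limit member to every age less than or equal to it.
INSERT INTO dbo.BridgeLowLimit (LowLimitAgeKey, AgeKey)
SELECT lo.AgeKey, a.AgeKey
FROM dbo.DimAge AS lo
JOIN dbo.DimAge AS a ON a.Age >= lo.Age;

INSERT INTO dbo.BridgeHighLimit (HighLimitAgeKey, AgeKey)
SELECT hi.AgeKey, a.AgeKey
FROM dbo.DimAge AS hi
JOIN dbo.DimAge AS a ON a.Age <= hi.Age;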

Now we can define many-to-many relationships between our limit dimensions and FactAmount (our fact table), going through the bridging tables and the Age dimension, as follows:

After this, we can hide the two measure groups for the Bridge tables from the users:

Now, we are ready to process our SSAS database. After that, we get the following in Excel:

If we place the Low and High limits in the Report Filter, Age on rows and our Amount measure on columns, we can limit the Age members displayed by changing the filter members. Note that only the lowest member selected in the Age – Low Limit dimension and the highest member selected in the Age – High Limit dimension matter – with multi-selects, everything in between is effectively ignored.

There are certain problems with this solution. If we place the limit dimensions on rows and we select multiple members from each dimension, we get the following:

This can be confusing for users who want distinct ranges like 1-3, 3-6, 6-10. Unfortunately, it is not possible to build a cube calculation which hides the irrelevant members, because we do not know what the users have selected in Excel. Hence, we cannot determine which members are in the query scope, and so we cannot pick only the ones we need (e.g. the ones with the smallest distance between the two limit members).

If we place the two dimensions on Rows and Columns, we can get a nice matrix and this makes a bit more sense:


Also, for large dimensions (e.g. Date) this can be quite slow, as the number of rows in the bridge tables grows rapidly – for a dimension with n members, each bridge table holds roughly n(n+1)/2 rows. For example, if we have 10 years in our Date dimension and we map the dates the way I just showed, we end up with approximately 6-7 million rows in each bridge table, which can be quite prohibitive from a performance point of view. For smaller dimensions, however, the approach is workable – in my opinion everything under 1000 members is fine, as it generates up to approximately 500,000 rows in each bridge table. Therefore, if our users insist on this functionality – especially when they have a flat list of 100-1000 members and frequently select ranges out of this list – we have a way of answering their need.

How to use ITEM and when ITEM(0).ITEM(0) is redundant

In MDX we have the Item function which can be used in a number of ways. It is important to understand how it works and how it can be used to our advantage.

As a start, we can call Item over a set or over a tuple:

{set}.Item(0) or (tuple).Item(0)

It may be important to note that when we call Item over a set, we get a tuple out of it (sets are collections of tuples), while if we call it over a tuple we get a member.

If we use Item with a tuple, we must specify as an argument the integer position of the member we want within the tuple. However, when we work with sets, we can either do the same, or specify a number of strings which identify specific tuples. Some examples:

Item with a tuple:

(a,b).Item(0) = a

SELECT
{
 [Measures].[Internet Sales Amount]
} ON 0,
{
 ([Product].[Category].&[4], [Date].[Calendar].[Calendar Year].&[2008]).Item(0)
} ON 1
FROM [Adventure Works]

(a,b).Item(1) = b

SELECT
{
 [Measures].[Internet Sales Amount]
} ON 0,
{
 ([Product].[Category].&[4], [Date].[Calendar].[Calendar Year].&[2008]).Item(1)
} ON 1
FROM [Adventure Works]

Item with a set:

{a,b,c}.Item(0) = a

WITH
SET SET1 AS
 { ([Product].[Category].&[4],[Date].[Calendar].[Calendar Year].&[2008]),
  ([Product].[Category].&[1],[Date].[Calendar].[Calendar Year].&[2008]),
  ([Product].[Category].&[3],[Date].[Calendar].[Calendar Year].&[2008])
 }
SELECT
{
 [Measures].[Internet Sales Amount]
} ON 0,
{
 SET1.Item(0)
} ON 1
FROM [Adventure Works]

{a,b,c}.Item("a") = a

WITH
SET SET1 AS
 { ([Product].[Category].&[4],[Date].[Calendar].[Calendar Year].&[2008]),
  ([Product].[Category].&[1],[Date].[Calendar].[Calendar Year].&[2008]),
  ([Product].[Category].&[3],[Date].[Calendar].[Calendar Year].&[2008])
 }
SELECT
{
 [Measures].[Internet Sales Amount]
} ON 0,
{
 SET1.Item("([Product].[Category].&[4],
             [Date].[Calendar].[Calendar Year].&[2008])")
} ON 1
FROM [Adventure Works]

{(a1,b1),(a2,b2),(a3,b3)}.Item("a1","b1") = (a1,b1)

WITH
SET SET1 AS
 { ([Product].[Category].&[4],[Date].[Calendar].[Calendar Year].&[2008]),
  ([Product].[Category].&[1],[Date].[Calendar].[Calendar Year].&[2008]),
  ([Product].[Category].&[3],[Date].[Calendar].[Calendar Year].&[2008])
 }
SELECT
{
 [Measures].[Internet Sales Amount]
} ON 0,
{
 SET1.Item("[Product].[Category].&[4]",
           "[Date].[Calendar].[Calendar Year].&[2008]")
} ON 1
FROM [Adventure Works]

When we specify a number of strings as arguments, we get the tuple which is defined by these strings/coordinates.

Now, let’s see what happens when we have a set of tuples and we use Item on it with a single argument:

{(a1,b1),(a2,b2),(a3,b3)}.Item(0) = (a1,b1)

WITH
SET SET1 AS
 { ([Product].[Category].&[4],[Date].[Calendar].[Calendar Year].&[2008]),
  ([Product].[Category].&[1],[Date].[Calendar].[Calendar Year].&[2008]),
  ([Product].[Category].&[3],[Date].[Calendar].[Calendar Year].&[2008])
 }
SELECT
{
 [Measures].[Internet Sales Amount]
} ON 0,
{
 SET1.Item(0)
} ON 1
FROM [Adventure Works]

We get a tuple back. Therefore, if we use a second Item function over the first one, we get the member at that position within the tuple:

{(a1,b1),(a2,b2),(a3,b3)}.Item(0).Item(0) = (a1,b1).Item(0) = a1

To illustrate the concept:

WITH
SET SET1 AS
 { ([Product].[Category].&[4],[Date].[Calendar].[Calendar Year].&[2008]),
  ([Product].[Category].&[1],[Date].[Calendar].[Calendar Year].&[2008]),
  ([Product].[Category].&[3],[Date].[Calendar].[Calendar Year].&[2008])
 }
SELECT
{
 [Measures].[Internet Sales Amount]
} ON 0,
{
 SET1.Item(0).Item(0)
} ON 1
FROM [Adventure Works]

This gives us the whole amount for Accessories, while:

WITH
SET SET1 AS
 { ([Product].[Category].&[4],[Date].[Calendar].[Calendar Year].&[2008]),
  ([Product].[Category].&[1],[Date].[Calendar].[Calendar Year].&[2008]),
  ([Product].[Category].&[3],[Date].[Calendar].[Calendar Year].&[2008])
 }
SELECT
{
 [Measures].[Internet Sales Amount]
} ON 0,
{
 SET1.Item(0).Item(1)
} ON 1
FROM [Adventure Works]

gives us the total amount for 2008.

Even if we do:

WITH
SET SET1 AS
 { ([Product].[Category].&[4],[Date].[Calendar].[Calendar Year].&[2008]),
  ([Product].[Category].&[1],[Date].[Calendar].[Calendar Year].&[2008]),
  ([Product].[Category].&[3],[Date].[Calendar].[Calendar Year].&[2008])
 }
SELECT
{
 [Measures].[Internet Sales Amount]
} ON 0,
{
 SET1.Item(0).Item(0).Item(0).Item(0).Item(0).Item(0).Item(0).Item(0)
} ON 1
FROM [Adventure Works]

we still get the amount for Accessories.

What happens here is:

  • With the first call of Item(0) over SET1, we get the first tuple from the set (in our case ([Product].[Category].&[4],[Date].[Calendar].[Calendar Year].&[2008])).
  • With the second call, we get the first member of this tuple – [Product].[Category].&[4].
  • With the third call of Item(0), the member from the previous step is implicitly converted to a tuple, and we pull out its first member – which is [Product].[Category].&[4] again.
  • From here onwards, the result flips between a tuple and a member every time we call Item(0).

But if we do:

WITH
SET SET1 AS
 { ([Product].[Category].&[4],[Date].[Calendar].[Calendar Year].&[2008]),
  ([Product].[Category].&[1],[Date].[Calendar].[Calendar Year].&[2008]),
  ([Product].[Category].&[3],[Date].[Calendar].[Calendar Year].&[2008])
 }
SELECT
{
 [Measures].[Internet Sales Amount]
} ON 0,
{
 SET1.Item(0).Item(1).Item(1).Item(0).Item(0).Item(0).Item(0).Item(0)
} ON 1
FROM [Adventure Works]

we get nothing back. This is because there is no element at the second position/coordinate of the tuple ([Date].[Calendar].[Calendar Year].&[2008]).

Therefore, calling Item(0) after another Item(0) is rarely necessary and should be done only when we actually need it; otherwise we can get wrong results or needlessly hurt our query performance.

Note: Please read the comments below, where you can find an in-depth discussion about the concepts in this article.

Sneak Peek at my session at Microsoft Tech Ed 2010 Australia

Without revealing too much: if you turn up for “Budgeting, Planning and Performance Management with the Microsoft BI Stack” at this year’s Tech Ed on the Gold Coast, you can expect to see the following:

  1. SharePoint 2010 + PerformancePoint Services + Excel Services in action
  2. Excel 2010 (What-If Analysis in Pivot Tables sourced from write-enabled SSAS cubes)
  3. An enterprise model showing some concepts and ideas
  4. New and enhanced features of the 2010 releases of SharePoint and Excel

Some background knowledge of the Microsoft BI stack is very much preferable, as the session can get a bit complex at times. However, if you are wondering how Microsoft Business Intelligence can deliver on the keywords in the presentation title, you will definitely pick up some ideas.

  • Seasoned BI professionals will see some new features
  • Inexperienced BI professionals will see a lot of new features and an overall solution architecture which may help them to get further in the area
  • Other IT professionals will see how Microsoft BI is on its way to making parts of their daily jobs obsolete
  • Business people (especially those involved in planning and performance management in their organisation) will see how Microsoft can enhance, simplify and even add some excitement to their everyday work
  • An especially strong warning for senior executives: You may feel like you are currently missing out and experience a sudden urge to allocate more funds to your IT department

Number of Weekdays Between Two Dates

There was an old post here describing some T-SQL code I wrote for finding the number of weekdays between two dates. It worked fine, so if you have implemented it you have not done anything wrong. However, Jeff Moden from SQL Server Central wrote an article a while ago about the same problem, and his implementation is a bit cleaner, so I would consider it better than mine. Here is the link:

http://www.sqlservercentral.com/articles/Advanced+Querying/calculatingworkdays/1660/
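
For completeness, the idea behind the set-based approach is roughly the following (a sketch in the spirit of Jeff’s article rather than his exact code; it assumes English weekday names from DATENAME):

-- Count all days, subtract two days per crossed week boundary, then correct
-- for a range starting on Sunday or ending on Saturday.
DECLARE @StartDate datetime = '20100101',  -- Friday
        @EndDate   datetime = '20100131';  -- Sunday

SELECT (DATEDIFF(dd, @StartDate, @EndDate) + 1)
     - (DATEDIFF(wk, @StartDate, @EndDate) * 2)
     - CASE WHEN DATENAME(dw, @StartDate) = 'Sunday'   THEN 1 ELSE 0 END
     - CASE WHEN DATENAME(dw, @EndDate)   = 'Saturday' THEN 1 ELSE 0 END
       AS WeekdayCount;  -- returns 21 for January 2010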