When Not To Write MDX and When Not To Use Analysis Services

MDX is a great way to achieve some objectives easily and efficiently. However, some things are better done in other ways. I will first discuss three distinct mistakes that designers and developers tend to make when working on a typical BI solution.

1. Leaf-Level Calculations

This is by far the most common one. Both on the MSDN SQL Server forums and in practice, developers try to build MDX calculations at the leaf level of some dimension and usually hit severe performance problems. While it is usually possible to write an MDX expression that achieves the goal, it is often much simpler and far better for performance to do the calculation either in the ETL or in the DSV (either as a Named Calculation or as part of a Named Query). This avoids forcing the query engine to perform a large number of calculations every time we request an aggregate.
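As a rough illustration (the staging and fact table names and the GrossMargin column below are made up), a leaf-level calculation like a margin can be computed once during the load, or the same expression can be defined as a DSV Named Calculation, so the engine only ever aggregates a stored column:

```sql
-- Hypothetical staging and fact tables, used only to illustrate the idea:
-- compute the leaf-level value once during the ETL load (or expose the same
-- expression as a DSV Named Calculation), so SSAS simply aggregates a stored
-- column instead of evaluating an MDX calculation for every leaf cell.
INSERT INTO dbo.FactSales (DateKey, ProductKey, SalesAmount, TotalCost, GrossMargin)
SELECT
    s.DateKey,
    s.ProductKey,
    s.SalesAmount,
    s.TotalCost,
    s.SalesAmount - s.TotalCost AS GrossMargin  -- the leaf-level calculation
FROM staging.Sales AS s;
```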

2. Mocking Joins

I have discussed this in a previous post, where I explain how we can access a measure group through a dimension that is not directly related to it (but is instead related to it through another measure group and another dimension). Instead of doing this in MDX, we can simply build a many-to-many relationship between the two and avoid the MDX altogether.

3. Business Logic over a Large Dimension

MDX is brilliant for implementing business logic – unless it needs to operate over millions of dimension members every time a calculation is requested. For example, I recently tried to build a piece of business logic that needed to order a dimension by a measure and retrieve the member with the largest value for each member of another dimension with hundreds of thousands of members. On top of that, there were other calculations doing similar logic, and the end result did not perform as expected. Even though the MDX was really neat and achieved the purpose in 3-4 lines, I moved the logic back to the ETL (which was quite a bit more complex) because of performance. So, in general, I would not advise using MDX when the query engine needs to go through a lot of cells to retrieve the result (in my case quite a few million), especially when ordering is required.
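As a minimal sketch of what moving such logic to the ETL can look like (FactSales, CustomerKey, ProductKey and SalesAmount are hypothetical names, not the actual solution), a ROW_NUMBER over a pre-aggregated set performs the ordering once, instead of the query engine doing it over millions of cells on every request:

```sql
-- A minimal sketch with hypothetical table and column names: for each
-- customer, pick the product with the largest total sales amount. Doing
-- this once in the ETL replaces an MDX ordering that would otherwise run
-- over millions of cells at query time.
WITH Ranked AS (
    SELECT
        f.CustomerKey,
        f.ProductKey,
        SUM(f.SalesAmount) AS SalesAmount,
        ROW_NUMBER() OVER (
            PARTITION BY f.CustomerKey
            ORDER BY SUM(f.SalesAmount) DESC
        ) AS rn
    FROM dbo.FactSales AS f
    GROUP BY f.CustomerKey, f.ProductKey
)
SELECT CustomerKey, ProductKey AS TopProductKey, SalesAmount
FROM Ranked
WHERE rn = 1;
```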

A more fundamental mistake is using Analysis Services in a solution that does not really need it. Two severe and, in my opinion, common mistakes are:

1. Data Dumps

Why build a cube when the sole purpose of the project is to allow users to access the underlying data? Unfortunately, Analysis Services is sometimes seen as a silver bullet for everything. If the end report contains a massive amount of data and a key requirement is for it to export nicely to CSV, why bother – just export the data to CSV, zip it up and let the users download it. As far as I know, this can be achieved very easily in a number of other ways, which is especially worth remembering given the amount of effort and skill needed to build and maintain an SSAS solution.

2. No Aggregations

Another way SSAS gets misused is when a lot of textual data gets stored in a large number of big dimensions, which are then linked through a “fact table”. I have previously worked on a solution where there were no measure columns in the fact table at all, and the cube was used to retrieve information about members of its largest dimension, called “Member”, which contained 4-5 million customers. The rest were dimensions like “Sign Up Date”, “Date Suspended”, “Country of Birth”, “Age Band”, etc. In the end, the main report consisted of information about the members; no data was aggregated apart from a simple count. The entire OLAP solution could have been replaced by a SQL query with a WHERE clause and an index.
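Purely for illustration (table, column and index names are made up), the relational equivalent might look something like this:

```sql
-- A rough sketch with made-up table, column and index names: a simple index
-- plus two plain queries deliver the member listing and the count that the
-- cube was being used for.
CREATE INDEX IX_Member_Country_AgeBand
    ON dbo.Member (CountryOfBirth, AgeBand)
    INCLUDE (SignUpDate, DateSuspended);

-- The “report”: member details filtered by a couple of attributes.
SELECT MemberKey, SignUpDate, DateSuspended, CountryOfBirth, AgeBand
FROM dbo.Member
WHERE CountryOfBirth = 'Australia'
  AND AgeBand = '25-34';

-- The only aggregation that was ever needed.
SELECT COUNT(*) AS MemberCount
FROM dbo.Member
WHERE CountryOfBirth = 'Australia'
  AND AgeBand = '25-34';
```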

I am sure that there are many other cases where SSAS gets misused. A solution that uses SSAS and MDX properly can be very powerful and elegant; however, because of poor design choices it sometimes gets discarded as inadequate. Don’t blame the tools and the technology if your cube is slow – it is most likely a problem with either your design or the way you have developed your solution.

2 thoughts on “When Not To Write MDX and When Not To Use Analysis Services”

  1. Most of these would not be an issue for MDX if SSAS had one feature… the ability to persist the results of calculations at cube processing time, as opposed to caching the results only when those intersections are part of a query. SSAS 2005 had this just prior to SP1, but they yanked the feature out, probably due to time and budget pressure. Too bad – that one missing feature gives MDX, a perfectly logical and sensible multi-dimensional expression syntax, the reputation of a far less utilitarian query language. Too bad, because with OLAP/MOLAP it’s all about ad-hoc query speed. Maybe DAX in PowerPivot does not or will not continue this shortcoming… if the PP PMs are out there, keep this in mind for PP 2.0…

  2. I’m very interested in how the theory works out in practice:

    MS pushed Analysis Services to be the single-version-of-truth data source for all of your BI.
    In practice I have noticed various approaches, such as:
    – reporting with SQL
    – analysis solely with cubes

    Or both reporting and analysis on the cube.

    I think using two approaches in enterprise BI might be difficult, as you need to rebuild the business logic and maintain it in two places. When you choose not to commit strongly to one approach, you may face a lot of scope creep around your insights (where to put what), and besides that you:
    – end up with a solution that is difficult to maintain
    – will not be able to reproduce analytically (with an OLAP browser) what is reported through static reports

    And it is precisely the latter that should distinguish modern BI from the old days.
