Excel and MDX – Report Authoring Tips

Excel is a powerful BI tool that is often dismissed as an inferior alternative to Reporting Services for distributing reports, mainly because uncontrolled spreadsheets can create a chaotic mess in an organisation. As a number of speakers at the Australian Tech.Ed 2008 pointed out, replacing Excel is often a goal of organisations implementing BI solutions, so proposing Excel as the main reporting tool is easily misunderstood by prospective clients. SharePoint Excel Services goes a long way towards helping organisations control the distribution of Excel files. Combined with a centralised data warehouse and Analysis Services cubes on top of it, this makes for a competitive solution framework, especially given how familiar business users are with the Microsoft Office suite.

During a recent implementation of a BI solution I was asked to provide certain reports to a small set of business users. As the project budget did not allow for a full Reporting Services solution to be built for them, Excel was chosen as the end-user interface to the OLAP cube. The users were impressed by the opportunities that direct OLAP access presents, but much less impressed by the performance they got when trying to create complex reports.

After investigating the problem I noticed a pattern: the more dimensions they put on a single axis, the slower the reports were generated. Profiling the SSAS server showed that Excel generates rather inefficient MDX queries. In the case where we put one measure on rows and multiple dimensions on columns, the Excel-generated MDX query looks like this:

SELECT {[Measures].[Amount]} ON ROWS,
{NON_EMPTY(
[Time].[Month].ALLMEMBERS*
[Business].[Business Name].ALLMEMBERS*
[Product].[Product Name].ALLMEMBERS)} ON COLUMNS
FROM OLAP_Cube

What this means is that Excel performs a full cross-join of all dimension members and then applies the NON_EMPTY function on top of this set. The cross-join grows with the product of the member counts, so if our dimensions are large this can seriously hurt the performance of the reports.
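To get a feel for the size of that intermediate set, the arithmetic can be sketched in a few lines of Python. The member counts below are made-up assumptions for illustration, not figures from the project described here.

```python
from math import prod

# Hypothetical member counts for the three attributes in the query above.
# The real numbers depend on the cube; these are assumptions.
member_counts = {
    "Time.Month": 36,
    "Business.Business Name": 500,
    "Product.Product Name": 2000,
}

# A full cross-join yields one tuple per combination of members, so its
# size is the product of the individual member counts.
crossjoin_size = prod(member_counts.values())

print(f"Tuples to check for emptiness: {crossjoin_size:,}")
```

Even with these modest assumed dimensions, the set that has to be filtered runs into tens of millions of tuples, most of which are likely to be empty.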

Unfortunately it is not possible to replace the query in Excel as it is not exposed to us. Even if we could replace it, doing so would be pointless, as any change to the user's selection of dimensions to display would cause it to fail. There are some Excel add-ons available for changing the query string, but issues such as distributing these add-ons and the inability of business users to edit MDX queries diminish the benefits of using them.

While waiting for an optimised query generator in Excel, we can advise business users of ways to optimise their reports themselves. In my case these were:

  1. Consider using report parameters instead of placing all dimensions on rows or columns in a pivot table. This places them in the WHERE clause of the query instead of the SELECT clause, so they do not add to the cross-join.
  2. Spread the dimensions as evenly as possible between rows and columns. Having 6 dimensions on one axis is worse than having 3 on rows and 3 on columns, as the split cross-joins generate smaller sets for NON_EMPTY to filter, which ultimately improves performance.
  3. Consider using fewer dimensions and, if possible, split the reports across multiple sheets. This is not always possible, but it is better for the users to keep their report-making simple.
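The effect of these tips on the size of the axis sets can be illustrated with a rough Python sketch. The helper `axis_sizes` and the figure of 20 members per dimension are hypothetical, chosen only to make the numbers easy to follow. The sketch counts the tuples in each axis cross-join and is not a model of how the server actually evaluates the query.

```python
from math import prod

def axis_sizes(layout):
    """Return the cross-join size for each axis in a pivot layout.

    `layout` maps an axis name to the list of member counts of the
    dimensions placed on that axis. Dimensions used as filters take no
    part in an axis cross-join, so they are left out of the layout.
    """
    return {axis: prod(counts) for axis, counts in layout.items()}

# Six hypothetical dimensions with 20 members each (an assumption).
n = 20

# All six dimensions on one axis.
all_on_rows = axis_sizes({"rows": [n] * 6})

# Tip 2: three dimensions on rows, three on columns.
split_evenly = axis_sizes({"rows": [n] * 3, "columns": [n] * 3})

# Tip 1: two of the six moved to filters, the rest split evenly.
two_filtered = axis_sizes({"rows": [n] * 2, "columns": [n] * 2})

for name, sizes in [("all on rows", all_on_rows),
                    ("split 3/3", split_evenly),
                    ("2 filtered, split 2/2", two_filtered)]:
    print(f"{name}: {sizes}")
```

Under these assumptions the single-axis layout builds a set of 64,000,000 tuples, the even split builds two sets of 8,000 tuples each, and moving two dimensions to filters shrinks each axis set to 400 tuples.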

In Reporting Services we can write our own query, which can be optimised by “cleaning” the empty tuples out of the cross-join set step by step, in the following manner:

NONEMPTY(NONEMPTY(DimA * DimB) * DimC)
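Why the nested form helps can be shown with a toy model in Python. This is only a sketch under stated assumptions: the dimensions, the fact data and both helper functions are hypothetical, and the code mimics the idea of filtering early rather than the way Analysis Services works internally.

```python
from itertools import product

def flat_non_empty(dim_a, dim_b, dim_c, facts):
    """Cross-join all three dimensions, then drop the empty tuples."""
    candidates = list(product(dim_a, dim_b, dim_c))
    kept = [t for t in candidates if t in facts]
    return kept, len(candidates)

def nested_non_empty(dim_a, dim_b, dim_c, facts):
    """Drop empty (A, B) pairs first, then cross-join with C and filter."""
    pairs_with_data = {(a, b) for a, b, _ in facts}
    ab_candidates = list(product(dim_a, dim_b))
    ab_kept = [p for p in ab_candidates if p in pairs_with_data]
    abc_candidates = [(a, b, c) for (a, b) in ab_kept for c in dim_c]
    kept = [t for t in abc_candidates if t in facts]
    return kept, len(ab_candidates) + len(abc_candidates)

# Hypothetical dimensions with 10 members each and a sparse fact set:
# only (i, i, 0) and (i, i, 1) hold data.
dim_a = dim_b = dim_c = list(range(10))
facts = {(i, i, c) for i in range(10) for c in (0, 1)}

flat_result, flat_examined = flat_non_empty(dim_a, dim_b, dim_c, facts)
nested_result, nested_examined = nested_non_empty(dim_a, dim_b, dim_c, facts)

print("same result:", flat_result == nested_result)
print("tuples examined, flat:", flat_examined)
print("tuples examined, nested:", nested_examined)
```

Both approaches return the same 20 non-empty tuples, but in this toy data the flat version examines 1,000 candidates while the nested version examines 200. The sparser the data, the bigger the gap.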

As we cannot do this in Excel, the best way to improve performance is to advise the users against overcomplicating their reports.
