This post is an attempt to dispel a few myths which seems to get repeated over and over among SSAS developers. While the truths are nothing new and have been documented in multiple sources like BOL, SQL CAT whitepapers, books and blog posts, they seem to consistently escape the attention of the wider public.
1 SSAS pre-aggregates data by default
While it is true that SSAS can pre-aggregate data this does not happen by default. SSAS compresses data, indexes data and caches data but it does not pre-aggregate data unless we define aggregations. When I am saying pre-aggregate I mean that SSAS does not automatically know what the Internet Sales Amount for Australia in 2007 is. It needs to get the leaf-level data and sum it up; unless we have built an aggregation on Country and Year for the partition containing the Internet Sales Amount measure. In that case the data is pre-aggregated and ready for retrieval.
2 We can emulate in MDX at no cost Enterprise Edition functionality in Standard Edition
Well, we can’t. We can emulate certain features like LastNonEmpty aggregation functions for example, but it comes at a cost. The cost usually relates to Storage Engine (multi-threaded) vs Formula Engine (single-threaded) execution.
3 SSAS is always faster than SQL Server RDBMS
While it is true that SSAS is faster than SQL Server RDBMS in many cases, this does not always hold true. A particular area in which the relational engine beats SSAS is the retrieval and processing of low-level granular data. SSAS usually beats the RDBMS when it comes to ad-hoc access to aggregated data.
4 MOLAP is always faster than ROLAP
If you read SQL CAT’s “Analysis Services ROLAP for SQL Server Data Warehouses” whitepaper you can see that after careful tuning ROLAP can be faster than MOLAP. Not always, but sometimes – enough to claim that it is not true that MOLAP is always faster than ROLAP. This ties a bit to the previous myth and proves that a well tuned RDBMS can perform very well with aggregates.
From the paper:
“At last, SQLCAT’s redesign and optimization efforts paid off. The ROLAP cube was finally ready for performance testing, and thanks to the amazingly fast performance of the relational SQL Server engine on top of a super-fast storage subsystem, the results looked better than expected. To everybody’s surprise, the ROLAP cube outpaced the MOLAP cube in 45 percent of all queries right from the start (see Figure 14). Only 39 percent of the queries showed substantially slower response times in ROLAP mode (more than twice the amount of MOLAP time) and 16 percent showed moderate performance degradation (less than twice the amount of MOLAP time).”
5 Usage Based Optimisations do not work well
In SQL Server Analysis Services 2008 the Usage Based Optimisation (UBO) algorithm has been redesigned. Now it works, and it works well. It does not create redundant aggregations and in general performs much better. Building UBO aggregations has always been recommended by Microsoft and even more so now.
6 Rigid attribute relationships boost performance
Whether an attribute relationship is Rigid or Flexible does not actually improve performance at all. Not query performance. A wrong choice here only affects processing of partition indexes. If an attribute relationship is static, setting it to Rigid means that you do not have to process partition indexes when you update the dimension. This is all the benefit you get from Rigid relationships. Going too far and marking changing relationships to Rigid may have a very negative impact as a change will prompt a complete process of the partition data and indexes, which will take much longer than updating just the indexes. Either way, there is no difference during query execution.
7 MDX and DAX are hard
I believe that this particular myth stems from the fact that we get to compare MDX and DAX to sweet and fluffy languages like SQL and C#. It all depends on the vantage point. Take the following “Hello world!” program in Malbolge for comparison purposes:
(‘&%:9]!~}|z2Vxwv-,POqponl$Hjig%eB@@>}=<M:9wv6WsU2T|nm-,jcL(I&%$#”
`CB]V?Tx<uVtT`Rpo3NlF.Jh++FdbCBA@?]!~|4XzyTT43Qsqq(Lnmkj”Fhg${z@>
MDX is not all that bad from a Malbolge developer’s point of view, is it?
Nice post
LikeLike
Excellent post. One other thing I would add is that I no longer need a date dimension in my relational data warehouse since SSAS can create a server based time dimension. Can’t tell you how much I hate that feature.
LikeLike
Martin,
It’s not all that bad if you have no SQL script and don’t want to bother with building the dim in Excel. However, it takes a fair bit of customisation/acrobatics after materialising it to get it going properly. Of course not materialising it is a bad practice and I wholeheartedly agree with you there.
LikeLike
Great Post! Very Helpful! Thanks, JT
LikeLike
Good post Boyan.I would only like to add that the Windows Function in TSQL OVER() PARTION BY() also have good speed but it burns down to query patterns.
LikeLike
Boyan,
Thank you for your assistance with the dynamic weather feed in SSRS. I know I thanked you in our email conversations but I just wanted to thank you publicly. The last part of the dynamic weather feed is being implemented, I will inform you about its success or failure.
Thanks again.
LikeLike
Hi Boyan,
Nice post about Myths . Boyan i request you please clarify my question in SSAS .
i have a cube bulid using SSAS . But in each dimension of it i am getting a attribute called “unknown” .I dont have null values in those dimensions for SSAS to convert them to “unknown” ..
When i had seen the dimension properties of that dimension there is a property called UNKNOWN MEMEBER.
that property was set to true .If we change that to none UNKNOWN MEMBER issue would be solved , But what’s confusing me really is , How that UNKNOWN members is getting generated when the dimension is not having null values for any of the attributes..?..
Hope i am clear in explanation .
LikeLike
Hi Rajesh,
Are your columns nullable? If so you’d get the unknowns automatically regardless of whether there are any NULLs in the data.
LikeLike
HI Boyan,
Yes my columns are nullable . But now i changed to not null ,even though it is throwing the same error . The below is the script of my product dimension population .
CREATE TABLE [dbo].[product](
[prodid] [int] NOT NULL,
[proddesc] [nvarchar](max) NOT NULL,
[prodcat] [nvarchar](max) NOT NULL,
[prodsubcat] [nvarchar](max) NOT NULL
) ON [PRIMARY]
Even though it was coming . can u help me ..
LikeLike