December 2010 – Boyan Penev on Microsoft BI

2011: Merry Christmas and Happy New Year

I am on my way out of the office, coming back in early January. Until then I will be camping around Victoria, Australia. It is time for me to recap 2010 and start forecasting and planning for 2011.

Many interesting things happened this year in the world of Microsoft BI and in my personal bubble. We saw the beginning of a new trend around in-memory/self-service BI, the foundations of a new visualisation product – Crescent, SQL Server 2008 R2, the Denali CTP1, Office 2010, etc, etc, etc. 2011 looks promising and exciting in the end of 2010.

In my personal world I moved to Melbourne, presented at Tech Ed Australia, managed to continue my engagement in the Microsoft MSDN forums, learned a lot, became an MCTS and MCTIP in BI, etc, etc, etc. This blog reached over 250 subscribers, 2500 unique visitors per month, more than 20000 raw requests a week and the comments start exceeding the posts with a factor of 2. It is a pleasure to see that you, as my readers, enjoy the content I am sharing and I hope that the upwards trend will continue in 2011.

A very special “thank you” this year go to Darren Gosbell, Paul Hales, Lionel Gomes Da Rosa, Sirish Korada, Nick Barclay, and Maxim Yefremov. All of these guys in some way contributed to making my world a better place during 2010.

To all my readers – Merry Christmas and a Really Happy New Year!!!

Line Breaks in SSRS

There are a number of ways we can add a line break to a Reporting Services string and these have been described in multiple articles online. Doing this we can get multiple lines per cell. The following content is intended to be a quick reference rather than a description of a new or better way to do it.

SSRS Expressions

The easiest is to add a vbCrLf to an expression like this:

=”Line 1″ + vbCrLf + “Line 2”

An easy way to remember the syntax is vb for VB, Cr for Carriage Return and Lf for Line Feed.

Alternatively, you can replace any character (in example a pipe), which exists in your data with vbCrLf and are not bound to CHAR(10) only – in case your query gets simple because you already have a suitable line break string in it. If you have a string like: “A|B|C”, and you want to replace the “|” character with a line break, you can do the following:

=Replace(“A|B|C”,”|”,vbCrLf)

The result is A, B and C on a separate line.

SQL

A more flexible and many times desirable way to do it is to add the line break to the data query. In SQL this would be by adding a CHAR(10) for the same purpose since CHAR(10) is Line Feed in SQL. Have a look at this query:

SELECT ‘Line 1’ + CHAR(10) + ‘Line 2’

If you use this in SSRS you will get the same output as with vbCrLf. This way we can construct our reports with line breaks in the database, giving us some flexibility.

MDX

In MDX we can also do something like this:

WITH
MEMBER [Measures].[Split Members] AS
“Line 1” + Chr(10) + “Line 2”
SELECT
{
[Measures].[Split Members]
} ON 0
FROM [Adventure Works]

This is much like in SQL – we just use the Chr() function to do the same.

Custom Code

We can also do the same with custom code. By using custom code we can split strings and create additional custom operations. For more info about using custom code for this purpose you can have a look at the following article:

http://www.kodyaz.com/articles/reporting-services-add-line-break-between-words-custom-code.aspx

Note that the operations that the author is performing can be done in SSRS natively with its Replace() function. Nevertheless, the article shows the fundamentals in a simple way.

Another interesting thing is that SSRS recognises CHAR(10) as both Carriage Return and Line Feed. In my testing (on SQL Server 2008 R2), CHAR(10) or Chr(10) does exactly the same as CHAR(10)+CHAR(13). Therefore we do not need to worry about adding a Carriage Return (CHAR(13)) in front of CHAR(10) in SSRS.

If for some reason you have troubles with CHAR(10) from SQL or other sources and you know it is in your character string, you can try replacing it with vbCrLf in SSRS.

3×3 Worst Development Practices 2010 with SSIS, SSAS and SSRS

There are some practices which are outright bad and some which are sometimes bad. I will try to make a list of top three outright worst practices per the main components of SSIS, SSAS and SSRS which I have annoyed me most throughout 2010 (and previous years). I am sure there are many more, but these are ones which I have witnessed often enough to deserve some attention.

SSIS

Using default names for all components in a control flow and a data flow

I really dislike opening projects and seeing only generic names. I have built some in the past because of insufficient time to document everything, but I hated creating them and later leaving them like that. I have also seen developers with time on their hands carelessly omitting them. The worst practice is to actually rename them to an equally unintelligible, if better, language. Many ETL-out-of-the-box tools actually do that and make subsequent maintenance much harder.

Performing joins with Merge Join

Chaining 5 Merge Joins and placing Sort components between them is something SQL Server does faster and with much less resources. Therefore doing joins in the database is a no-brainer even if your preference lies with a fully SSIS-idised process.

Not using configurations

Lucky, in Denali we are getting a completely revamped project configurations approach. Even if imperfect, Package Configurations are a must for every SSIS project. Migrating projects, changing databases, setting variables are just too good to miss.

SSAS

Using MDX for everything

MDX has an aura of a language which allows you to do everything much faster than SQL. Well, this is not entirely true and in the majority of cases a change in the design will allow avoidance of MDX, which in turn removes a whole layer of complexity allowing tremendous improvements over any programmatical approach.

Importing too much dimensional data

Often developers underestimate the impact of unnecessarily large dimensions with many useless attributes. Here the best approach is to add more as you need them rather than import everything because we can. Cubes are smaller and less complex with less attributes, processing is faster, and the most important thing is that the users actually like having less clutter.

Leaving NULLs in the relational database

SSAS does not like NULLs. I have not seen a case when NULLs are necessary in dimension tables. While I would leave a NULL for a measure when there is no data for it, too many of these can be quite harmful and it may be a case of a bloated measure group, as well.

SSRS

Storing business logic in report queries

While we can, we should never store SQL and MDX logic which can be implemented in stored procedures or calculated measures in the reports. Maintaining SSRS queries is much harder and less transparent. There are performance implications around caching, too.

Merging data sets in SSRS

This is hard to be done, and when it is, it is always bad. Data sets should be merged in the queries – not in the reports.

Building massive reports

In many organizations reports are seen as a transport mechanism to Excel. Users generate large, hundreds of thousands of rows report with a large amount of columns and sometimes even basic calculations. All this can be done simply and easily on the database level without going through SSRS.

Why Choose PowerPivot?

If you are on the market for self-service, or in-memory BI tools you have some options. You have to consider functionality, cost and the future. If you are a SQL Server and/or SharePoint and/or Microsoft Office user, PowerPivot should be a top-of-the-list contender. I will discuss a few points to show why.

Cost

First and foremost, it is cheap. To utilise PowerPivot you need Excel and possibly SharePoint. If you need to empower your users with the capability to expand Excel and do heavy analytics on a workstation PC, PowerPivot is essentially free for Excel. If you want to let them collaborate and share their work, then SharePoint comes to the mix. If you do have SharePoint Enterprise in your organisation, then PowerPivot is, again, free. A free self-service BI platform – not a bad option, is it? Surely the cheapest.

Functionality

Functionality-wise, the outstanding “feature” is the integration with Excel. How many other self-service BI tools out there allow you a seamless integration with Excel? When Excel users become PowerPivot users, they have all the capabilities of Excel, plus PowerPivot. They can pick Top/Bottom 10, flick to percentage representation of values, use the charting functionality of Excel the way they are used to; they can also utilise the rest of the Excel functionality they love. With a little bit of DAX knowledge they can build new calculations on top of massive data sets. Writing [Quantity]*[Price] gives us [Sales Amount]. Simple, isn’t it? Furthermore, if you prepare a nice, well-referenced datamart for them you do not need to worry about the lack of knowledge of SQL – all modelling gets done in Excel in a very familiar for users environment – spreadsheets with rows and columns. Data can be previewed, filtered and ordered; new columns can be added with Excel-like syntax – a paradise for moderately Excel-savvy users.

Once ready with the model, if users want to share their work they can simply publish to SharePoint. From there other users can either browse the workbooks (if they have a browser – right…), or if they are interested in more on-the-fly analysis they can connect to the workbooks through Excel and slice/dice the data just the way they do with SSAS cubes. No need for client installations and no need for powerful workstations. In fact, to connect to published models they only need Excel 2003 and Windows XP. The minimum hardware requirements for those are, well…minimum.

As for IT Services departments – they still can manage the situation. They can monitor, advise and service – precisely their purpose. While a standalone, isolated and incompatible server could be a problem, the sort of manageability and visibility BISM and PowerPivot offer will, no doubt, appeal to ITS.

The Future

Let’s zap to the future. Microsoft has made a strong commitment for a multitude of future enhancements. The models will be available in SQL Server Analysis Services and DAX will get massively enhanced. There will be numerous enhancements on the modelling side, querying side, engine side, etc, etc – all in all – BISM and PowerPivot have a great future. In fact, from what I can sense, BISM and DAX will become more and more powerful and if you commit to spending your money on another product I can guarantee that you will be thinking back and regret this step, especially if you like Excel and SQL Server. Just think back of where SSAS was in 2000 and where it is now. Well, by what it looks like at this moment, we’ll have a similar situation after Denali and ahead of it. Better than ever will be the integration between the components in the Microsoft BI stack, too. With Crescent and SSRS reading BISM models easier, the pegs will fit together even more seamlessly.

Am I advertising Microsoft BI? Yes. Am I objective – maybe no, but allow me to be excited about it. From the poll on the top right of this blog, where I asked how you feel about the recent BISM announcements, I noticed that many people are either “Angry”, or “Excited” about the new developments around SSAS. I wonder what will the reactions be when Denali ships and more people get hand-on experience with BISM. Will there be as many “Angry” people out there? I doubt it. Will everyone get ecstatic – well, maybe not, but I believe that a lot of users will get more Excited/Happy about it.

Examining SSAS Write-Back – Updates and Transactions

SSAS write-back is a very under-used feature of SSAS in my view. It can be very powerful when we have a good understanding of its capabilities. A good first step is to explore how it works through some MDX.

For the purpose of this article, I am using a very simple cube, which has two dimensions and one measure. I have enabled MOLAP write-back for its only partition and I have processed it, thus creating its Write Table. I will explore what happens when we issue different transaction management and UPDATE CUBE statements in varying order.

The first thing we should understand is that SSAS utilises transaction management for the write-back operations. There are three statements which we can use for this:

BEGIN TRAN[SACTION]

COMMIT TRAN[SACTION]

ROLLBACK TRAN[SACTION]

The [SACTION] part is optional and we can freely omit it.

To perform the actual updates, we use an UPDATE statement. You can find its definition here:

http://technet.microsoft.com/en-us/library/ms145488.aspx

It is worth to note that the BEGIN TRAN statement is not necessary since UPDATE issues a BEGIN TRAN statement implicitly. However, to complete a transaction we must either execute a COMMIT or a ROLLBACK statement. Also, important note here is that the transaction scope is within a session. An uncommitted transaction is visible only in its session. Once a COMMIT is called, the transaction gets written to the Write Table, and becomes visible for all cube statements (i.e. to all sessions).

So, let’s work with an MDX sample statement:

BEGIN TRANSACTION

UPDATE CUBE [Service Request] SET
([Service Start Date].[FiscalYear].[Fiscal Period].&[2011]&[2011-Q4]&[2011-13],
[Service End Date].[FiscalYear].[Fiscal Period].&[2011]&[2011-Q4]&[2011-13],
[Measures].[Measure 1 Sum])=10
USE_EQUAL_ALLOCATION

SELECT
{
([Service Start Date].[FiscalYear].[Fiscal Period].&[2011]&[2011-Q4]&[2011-13],
[Service End Date].[FiscalYear].[Fiscal Period].&[2011]&[2011-Q4]&[2011-13],
[Measures].[Measure 1 Sum])
} ON 0
FROM [Service Request]

ROLLBACK TRAN

COMMIT TRAN

Note that I am running this in an SSMS query window and I cannot execute multiple statements. Instead, we can select each statement and then execute it. Since we are in the same session, we can do this without losing the effect of the previously executed one.

First, I run the BEGIN TRANSACTION statement. It executes successfully:

Then, I run the UPDATE CUBE statement. It also executes successfully. Now, when I verify its success with the SELECT, I get 10 as the value for the measure. After this, we can check if anything got to the Write Table:

It is empty, because the transaction has not yet been committed. Also, if we browse the cube in BIDS we will not see the value:

This shows us that the value is only there in the current session where out transaction is active.

If we then call the COMMIT TRAN statement, we get a slight delay and we are notified that the statement has been executed successfully. At the same time, I run SQL Server Profiler and I captured the trace of what happened. As we can see, (because the write-back partition is MOLAP), the statement got written to our table and the write-back partition got processed. If we have another look at the Write Table we will notice that we have one row there. Also, we can see the value for the measure in BIDS:

ROLLBACK is less interesting – it just discards the UPDATEd values and the cube goes back to the same state it was in before we started manipulating it.

Another interesting thing which I already mentioned is that if we run UPDATE before we have issues a BEGIN TRANSACTION (note I am using TRAN and TRANSACTION interchangeably in my script and text), a transaction is implicitly started but not committed. Therefore, we cannot just call UDPATE and hope to see the changes in the Write Table. We must call COMMIT first.

This behaviour explains how Excel 2010 does write-back. When we change values it issues UDPATE statements and once we “Publish” the changes it issues a COMMIT TRAN. This is where we have the largest wait overhead in MOLAP mode. The comparison between MOLAP and ROLAP write-back is really for another article, whcih I hope to have the time to write soon.