Running PowerShell Scripts in SQL Agent Using CmdExec

If you have tried to execute PowerShell scripts through SQL Server Agent you may have noticed that the “version” (technically it’s not a version – see the links below for more info) of PowerShell run by the agent is not the same as the command-line PowerShell you may have tested your scripts with. The differences are explained in quite some depth by Chad Miller in his “The Truth about SQLPS and PowerShell V2” post and in his shorter answer to a question on StackOverflow. In brief, in all versions of SQL Server (including 2012), SQL Server Agent runs PowerShell scripts through a sqlps.exe process instead of the powershell.exe we get on the command prompt. Most annoyingly there are core syntax differences which do cause issues when running scripts with SQL Agent over sqpls. For example, if you have a construct like $(), or a Get-Date call, SQL Server Agent will not recognise the syntax and you’ll get a syntax error. It looks like SQL Agent’s Powershell task doesn’t like statements which can be interpreted as sqlcmd ones. Luckily, there is a very simple workaround if we want to circumvent this behavior. All we have to do is save the script somewhere on the filesystem (which SQL Server Agent’s service account can access, of course) and execute it through an Operating System (CmdExec), instead of a PowerShell step:

There we can simply call our script by:

powershell <script>

For example:

powershell “C:test.ps1”

This will execute the test.ps1 script. All we have to do is make sure that the SQL Server Agent account can see the script (permissions) and also that it can read/write to the locations touched by the PowerShell script. This way if you test scripts through the powershell.exe PowerShell environment you will get the same results in SQL Server Agent. And if you want to have the SQL Server cmdlets available, you can load them as a module as described here.

Load Testing BI Solutions – When?

This year I came across two very different BI projects which had the common non-functional requirement to prove that they would handle an expected spike in the report generation load. Funny enough, in both cases the project teams got very concerned and came up with wildly inaccurate predictions of how many concurrent users we should be testing for. In the first case the problem was with the perception of “thousands of users”, while in the second, the team interpreted “monthly users” as “concurrent users”. The annoying part was that in the first case the team planned on building an ultra-massively overcomplicated queuing system to handle those spikes, and in the second case they were thinking of completely scrapping the ad-hoc functionality in the solution and resorting to report extracts distributed by email. The unreasonable expectations of the load lead to bad design choices – this is why it is important to remain calm and first check whether there is a problem at all.

Firstly, let’s agree that we are measuring report requests. To begin, we should know how many requests we get per a period of time (e.g. a month), and then how long it takes to generate a report. A typical scenario would be:

  • 1,000,000 report requests per month
  • 2 seconds to generate a report on average

What we need to do now if apply a bit of math:

1,000,000 / 20 = 50,000 requests per day (on average)

50,000 / 8 = 6,250 requests per hour (8 hours in a working day)

Since a report takes 2 seconds to generate, we can generate 1,800 reports in one hour. Therefore, with 6,250 requests, we would have 3.47 average concurrent users. Of course, this would be the case if we have a very uniformly split load. In reality this would not happen – instead, we will have peaks and dips in usage. A moderate peak is typically around 3x the average, while a heavy one would be at around 6x the average. To ensure that we can handle such peak periods, we should multiply our average concurrent users by 3 or by 6 depending on our load analysis. Let’s assume we have a very high peak load of 3.47 * 6 = 20.82, or approximately 21 concurrent users. This is the number we need to test in our case. Note that we had 1,000,000 report requests per month, but in our highest peak we expect to have only 21 concurrent users. I have not actually had a project where we have expected to have such a load (in both cases which prompted me to write this post we had between 2000-10000 users per month).

The moral of the story – don’t panic. In most reporting projects the user load is not high enough to warrant a full-scale load testing exercise; next time you hear talking about something like that, instead of rushing to cover unreasonable scenarios, try to calculate and confirm the need first.

DataMarket Updates: Speed, Portal and DateStream

It has been an eventful week for the Azure DataMarket. We had three new and exciting (for geeks like me) things happening in that corner of the Microsoft universe:

1. Speed!

There was an update to the Azure DataMarket a few days ago. It was, in my opinion, the best thing Microsoft could have done to their offering – tremendously increase its performance. While the DataMarket was previously plagued by unacceptably slow download speed, now it’s for feed standards blazingly fast. For comparison sake, I used to wait for more than 40 minutes when downloading an approximately 70k rows feed from the DataMarket prior to the update. Now, it is on my machine in around 5 – 8-fold increase in performance! Rumours have it that on faster-than-my-home-ADSL2+-networks we will be experiencing up to 20x better performance. It would be good to hear if this is actually correct for developers on such networks (please comment).

Next, range queries, hopefully…

2. Portal

While before the last couple of days anyone who wanted to publish data on the DataMarket had to contact the Microsoft team via email and ask how to get it done, we have just moved into the self-service space with a new portal allowing publishers to create and manage their feeds. The link to this new portal is:

https://publish.marketplace.windowsazure.com/

And, you can find some very helpful documentation about it here:

http://msdn.microsoft.com/en-us/library/windowsazure/hh563871.aspx

3. DateStream

Finally, I am proud to announce that the great DateStream feed got translated in four more languages:

Hebrew and Danish – thanks to Rafi Asraf

German

Bulgarian

The Italian translation (thanks to Marco Russo) is coming soon too, but missed this release unfortunately.

Feel free to explore them and let me know if anything needs to be changed to make them more correct/useful.

Melbourne SQL Server Social Event: Short Notice!

A short notice for SQL Server enthusiasts/professionals in Melbourne. Stephen Few is in town and will attend the SQL Server Social Event tomorrow (17 Nov) at the Sherlock Holmes Inn in the CBD (415 Collins Street). Feel free to come and meet the community, talk about SQL Server and information visualisation..and possibly get your Stephen Few books signed.

Link for the event: http://sqlserversocial.eventbrite.com/

See you there!

Two New Books Definitely Worth Buying

I will try to provide a concise and useful insight into two books from the world of SQL Server. The first one is available now, while the second one is coming out very soon (in the next couple of weeks).

MDX with Microsoft SQL Server 2008 R2 Analysis Services Cookbook (link)
by Tomislav Piasevoli

 

I received an invite to review Tomislav’s book a few weeks ago and despite the fact that I am quite busy these days, I decided to have a look mainly because I expected it to be very good as I had heard about it months ago by the author and the reviewers (Darren Gosbell, Chris Webb, Greg Galloway, Marco Russo and Deepak Puri); and because I was promised a free copy, so I wouldn’t have to buy the book myself (which I would have done anyway)J. Tomislav has been one of the most prominent SSAS MVPs, quite active on the MSDN Forums and writing interesting posts on his blog. I was not surprised that he has been able to deliver excellent practical advice for his readers in this cookbook and I expect to be using it often during SSAS implementations. Let’s see what you could expect from this book if you buy it.

Firstly, it is a cookbook – Tomislav has written recipes for a large amount of real-world scenarios. I must admit that I did not read absolutely all of them. I went through the ones I think I know well and compared Tomislav’s versions to the ones I would typically use. The topics are very well covered and follow a pattern of: Getting Ready -> How to do it -> How it works -> There’s more -> See also. First, we get an introduction, and then we prepare to get the work done. After that we get a step-by-step walkthrough for the actual solution. I liked the next section “How it works”. Here we get an explanation of why we get the result letting us get some more insight rather than blindly typing code. I find it a very nice touch and I applaud the author for spending the time to include this invaluable bit of information. The “There’s more” section after that expands a bit the topic, trying different options and showing what happens when we apply them. In the end of each chapter we have a section showing which other topics are related to the one we are currently reading. All in all, I think that Tomislav’s cookbook is very, very, well written.

In general, I would say that if you want to learn MDX and how it works, you can start with a textbook-style book, which shows the basics, explains them and then builds up on them. However, if you need to work with MDX now, or if you do have some knowledge but you lack practical skills, then this book is a real gem. I would definitely recommend buying Tomislav’s cookbook, and if you are interested in getting more theoretical detail on why things work the way they do, either do a lot of blog reading, or buying another MDX book to serve as a companion to this one.

In brief, this is the best MDX cookbook out there and offers excellent practical advice over a variety of topics.

MVP Deep Dives vol.2 (link)

 

The second book I would like to introduce is another “must buy”. Imagine the effort required to assemble a book from 53 different authors who live in different parts of the world and are not all native English speakers (like me). Well, Kalen Delaney has made the effort…twice…and this is one of the reasons for having this book published in the next couple of weeks. Another reason is the motivation which the authors found in donating their work to Operation Smile. We are not profiting from it and we hope that you will also buy the book not only because of the outstanding content, but also because the money you spend will be used to help children with facial deformities all around the world.

The list of authors speaks for itself and I am very flattered to be able to participate with a chapter entitled “Managing Context in MDX”. The book will be out by the beginning of PASS Summit 2011 and there will be a book signing session, where you can get your copies signed by some of the authors. Come, drop by if you are there to say hello and have a chat!

A Closer Look at PALO and GPGPU

Last week I had the pleasure to meet a friend of mine, who formed a company I wrote about a year or two ago. His business has grown nicely since then and they have become the number one PALO partner in Australia. For those who are not aware of Jedox and PALO, I would recommend visiting their website at www.jedox.com – it is an open source BI suite very similar to SQL Server, minus the relational part. Since I was given a private show (no, nothing immoral here) in their corporate setup, I thought it may be interesting to discuss what I saw here in this post.

There are a few interesting and vastly different aspects of PALO when compared to the SQL Server BI stack:

GPGPU

For me the best feature they have is the General-Purpose GPU support in the OLAP server. While the OLAP components can be queried through MDX much like SSAS, they solve query bottlenecks with raw power. As far as I am aware, PALO supports CUDA, or the NVIDIA implementation (ATI have their own) of the GPGPU vision. If this all sounds a bit foreign, have a look at Tom’s Hardware article “The Advent of GPGPU“, where the concept of using the GPU for computational purposes is explained in a fair bit of detail. In short, by harnessing the power of NVIDIA GPUs, the processing power of a PC jumps from a few GFLOPs (50-60 GFLOPs on my i7 2600K OC-ed to 4.5Ghz) to 1500-1800 GFLOPs on my NVIDIA GTX 570 GPU. This means that for GPU optimised calculations, a PC gets a boost of a factor of 30. Both NVIDIA and ATI can see the potential and have been working hard in the last few years to get better drivers and better support for such applications. PALO in particular prefers the NVIDIA Tesla GPU. Note that a Tesla does not even have video output – it is used only for calculations, supports ECC memory (thus making itself ready for enterprise environments), and has been designed from the ground up for CUDA.

In terms of PALO, I got told that when they have an optimised query performing badly, adding a new Tesla unit in the server solves the problem. Their experience shows that the servers scale up linearly with every new GPU, and since NVIDIA’s SLI allows multiple GPUs running in parallel, adding 2-4 such units is all it takes to create a very, very fast computational workhorse.

Tablet Apps

Another area where I was impressed was the way PALO does mobile. They have free apps for the iPad (which I saw in action), as well as the iPhone and Android. Their vision is that information dashboards are best seen, and mostly required on the go when BI users have limited ability to browse around and get a deeper insight. I tend to agree to some extent. In my experience, the information dashboard is a slightly overrated concept. Having it on your phone or tablet where you can easily connect to you corporate environment and check some numbers quickly is a nice idea and I hope we see it becoming a part of the Microsoft stack sooner rather than later. The application which PALO have is quite nice minus the pies, allows any form of touch experience (multi-touch included) and allows easy slicing and dicing of data – just how it should be.

Open Source Software Compatibility

The last bit I would offer as an impressive and different to other not-open source vendors is the openness and compatibility of PALO with other open-source tools. Their stack components are easily replaceable. The ETL component can be changed to Pentaho’s Kettle, or JasperSoft’s ETL software which can load data directly in PALO’s cubes. A bit like loading a SQL Server data mart with Informatica, but seemingly better and tighter as the interfaces between the components are, apparently, completely open.

Apart from these areas, I think that the Microsoft stack has a nicer UI, allows easier development, and is richer (with MDS, QDS coming up, Data Mining, etc.). PALO has its own ETL tool, which is not graphical and relies on drop-downs and various windows to get the work done, the OLAP server seems to support many features out of the box, allows querying through MDX and supports write-back, but in general seems quite barren from SSAS point of view. The front-end is either Excel through a plug-in allowing the creation of reports through formulas, Open Office, Libre Office, and PALO’s own web-based spreadsheet environment. Once a report is created in either of those it can be published to a web portal for sharing with other users.

All in all, PALO is a neat, free BI suite, which comes for very cheap initially. There is an enterprise version, which is not free and, of course, any new customers will have to pay for someone to install it, configure it, and implement their requirements which will add to the total cost but these expenses are there for any other set of tools (although, a decent argument can be lead on which suite allows faster and cheaper development). The features listed in this article definitely appeal to some and I am very impressed by the innovative GPGPU capability, which has a lot of potential and I can easily think of a few areas where a 30-fold improvement in computational power will benefit SQL Server BI.

7 Tools You Want on Your BI Dev Workstation

Without promising gimmicks like 80% reduction in development effort and time to deliver (as if…), the following tools are either free or relatively cheap and offer a productivity boost while allowing (and actually in some cases promoting) good development practises. Of course there are many others available but the following are my personal preferences, which I have found to be particularly valuable while developing (in no particular order).


BIDS Helper

When it comes to free tools for Microsoft BI development BIDS Helper usually tops the list. It is a very powerful tool making many advanced development tasks easy.

Who makes it: Darren Gosbell, Greg Galloway, John Welch, Darren Green, Scott Currie
License: Free
Link:
http://bidshelper.codeplex.com/


SQL Compare and SQL Data Compare

Red Gate is renowned for their great products and the top two in my opinion are SQL Compare and SQL Data Compare. These two allow robust and simplified migration between environments and help with creating transactional scripts for deploying to servers guarded by zealous DBAs.

Who makes it: Red Gate
License (1 Developer): SQL Compare (Standard) – $395; SQL Data Compare (Standard) – $395
Trial: Yes
Links:
http://www.red-gate.com/products/sql-development/sql-compare/
http://www.red-gate.com/products/sql-development/sql-data-compare/


BI Documenter

For those who do not like writing lengthy documents describing their work Pragmatic Works offers a great tool which extracts metadata from the various BI components in a solution and organises it very neatly in a cross-referenced (think HTML links between pages) fashion. It does it so well that I have had clients completely satisfied in terms of documentation by its output.

Who makes it: Pragmatic Works
License (1 Developer): $395
Trial: Yes
Link:
http://pragmaticworks.com/Products/Business-Intelligence/BIDocumenter/Default.aspx


AS Performance Workbench

For load testing of SSAS instances and cubes the free AS Performance Workbench provides a simple, easy and visual interface and eliminates the need to create a complex testing framework.

Who makes it: Rob Kerr
License: Free
Link:
http://asperfwb.codeplex.com/


ASCMD

ASCMD is for SSAS what SQLCMD is for SQL Server – a command-line utility allowing execution of XMLA and MDX scripts. It provides a rich set of functionality for performing maintenance and testing tasks.

Who makes it: Microsoft
License: Free
Link:
http://msftasprodsamples.codeplex.com/


MDX Studio

Started by Mosha Pasumansky, MDX Studio is the most powerful MDX analysis tool available. It provides best practise advice and includes many MDX tuning features not available in the toolset included in SQL Server and Windows.

Who makes it: Microsoft
License: Free
Link:
http://cid-74f04d1ea28ece4e.skydrive.live.com/browse.aspx/MDXStudio/v0.4.14


Team Foundation Server

TFS is a must if development is done by more than one person. While merging changes in BI projects is not as useful (or even possible) as in .NET coding projects, maintaining versions, branching, etc is still a must. TFS also includes bug tracking, requirements management, and a bunch of other features well worth exploring.

Who makes it: Microsoft
License: 5 Developers – $499
Trial: Yes
Link:
http://www.microsoft.com/visualstudio/en-us/products/2010-editions/team-foundation-server/overview