Ordering Dimensions: Recursion vs Loops in T-SQL

Recently I had to build a function, which returned dimension members from a parent-child relational table. I needed to be able to order the members hierarchically and I decided to use a recursive CTE call.
Usually, we have a parent-child dimension table with structure similar to this:
Dim_Business
Business_Skey int
Parent_Business_Skey int
 
with a root node with Skey of -3 and Parent_Business_Skey of -3.
Let’s assume we have a hierarchy like this:
All (Id = -3, P_ID = -3)
-GLOBAL (Id = 1, P_ID = -3)
–Europe (Id = 2, P_ID = 1)
—UK (Id = 3, P_ID = 2)
—France (Id = 4, P_ID = 2)
—Spain (Id = 5, P_ID = 2)
–North America (Id = 6, P_ID = 1)
—USA (Id = 7, P_ID = 6)
—Canada (Id = 8, P_ID = 6)
—Mexico (Id = 9, P_ID = 6)
 
If we do something like:
SELECT Business_Skey
FROM Dim_Business
ORDER BY Parent_Business_Skey ASC, Business_Skey ASC
 
We will get:
-3 (All)
1 (GLOBAL)
2 (Europe)
6 (North America)
3 (UK)
4 (France)
5 (Spain)
7 (USA)
8 (Canada)
9 (Mexico)
Obviously, this hierarchy is incorrect, because we want to see the leaf nodes under their respective parents.
We can recursively create order, which concatenates the parent ids:
WITH b_ord(bid, bord)
AS
(
SELECT  Business_Skey AS bid
 , CONVERT(nvarchar(1000), Business_Skey) AS bord
FROM Dim_Business
WHERE Business_Skey = -3

UNION ALL

SELECT  Business_Skey AS bid
 , CONVERT(nvarchar(1000), bord + ‘|’ + CONVERT(nvarchar, Business_Skey))
FROM Dim_Business db
 INNER JOIN b_ord bo
 ON db.Parent_Business_Skey = bo.bid
WHERE db.Business_Skey <> bo.bid
)
SELECT *
FROM b_ord
ORDER BY bord ASC
 
The result of the CTE query is:
-3 -3 (All)
1 -3|1 (GLOBAL)
2 -3|1|2 (Europe)
3 -3|1|2|3 (UK)
4 -3|1|2|4 (France)
5 -3|1|2|5 (Spain)
6 -3|1|6 (North America)
7 -3|1|6|7 (USA)
8 -3|1|6|8 (Canada)
9 -3|1|6|9 (Mexico)
 
and the order is correct.
 
Because the code needed to go in a function, invoked by a number of stored procedures, .NET application and various reports, I needed the code to be quick and 
light. As some dimensions had a large number of members (50000+), which could grow with time, the code needed to implemented in a careful way. So, I decided 
to compare the recursive CTE function to a WHILE loop and a temporary table implementation:
DECLARE @c int
DECLARE @num_of_nodes int

SET @num_of_nodes = (SELECT  COUNT(*) FROM Dim_Business)

CREATE TABLE #order(
  skey int
 , ord nvarchar(1000)
 , lvl int
)

INSERT INTO #order
SELECT  Business_Skey
 , CONVERT(nvarchar(1000), Business_Skey)
 , 1
FROM Dim_Business
WHERE Business_Skey = -3

SET @c = 2

WHILE @c > 0
BEGIN
 INSERT INTO #order
 SELECT  Business_Skey
 , CONVERT(nvarchar(1000), ord + ‘|’ + CONVERT(nvarchar, Business_Skey))
 , @c
 FROM Dim_Business db
 INNER JOIN #order o
 ON db.Parent_Business_Skey = o.skey
 AND o.lvl = @c – 1
 WHERE db.Business_Skey <> o.skey

 SET @c = @c + 1
 
 IF (SELECT COUNT(*) FROM #order) = @num_of_nodes
 SET @c = 0
END

SELECT  skey AS bid
 , ord AS bord
FROM #order
ORDER BY ord ASC

DROP TABLE #order
 
After comparing the results in Client Statistics and the IO reads, the 1st recursive query performs more than 20% worse than the WHILE loop query. It also trails the non-recursive query in the count of logical reads,read-ahead reads and scan counts.
It seems like in SQL Server 2005 calling WITH recursively does not work as good as coding set-based operations through WHILE loops.

Fun with T-SQL

As a tribute to some fairly restrictive .NET design I once fell victim to, I wrote this piece of code:

Use [Database]
Select [Select]
From [From]
Where [Where] = ‘Where’
And [nvarchar(50)] = ‘nvarchar(50)’
And [int] = 1
The CREATE and INSERT statements for the above command:
 
CREATE TABLE [dbo].[From](
 [Id] [int] NOT NULL,
 [Where] [nvarchar](50) NOT NULL,
 [On] [nvarchar](50) NOT NULL,
 [Select] [nvarchar](50) NOT NULL,
 [From] [nvarchar](50) NOT NULL,
 [int] [int] NOT NULL,
 [nvarchar(50)] [nvarchar](50) NOT NULL
)

INSERT INTO [From] (  [Id]
, [Where]
, [On]
, [Select]
, [From]
, [int]
, [nvarchar(50)])
SELECT 1, ‘Where’, ‘On’, ‘:-)’, ‘From’, 1, ‘nvarchar(50)’

Enjoy!

  
 

EXECUTE AS someone else – impersonation in stored procedures

Recently, I had to implement a stored procedure, which is executed under a service account context. It had to be invoked by an Informatica workflow and its purpose was to clean a cache database table and to update all statistics on a database through invoking the sp_updatestats. As the Informatica user did not have enough permissions to perform these tasks, I had to use the EXECUTE AS in the stored procedure. The EXECUTE AS statement can also effectively be used to limit the execution scope through allowing limited permissions to the user account executing the stored procedure:
CREATE PROCEDURE [dbo].[usp_Update_Statistics] 
WITH EXECUTE AS user
AS
BEGIN
 clear cache
 EXEC sp_UpdateStats
END
Immediately I encountered various problems with the execution of the stored procedure. After some research it turned out that a few preconditions need to be satisfied before we can execute it:
1. TRUSTWORTHY setting on the database needs to be turned on:
     ALTER DATABASE database SET TRUSTWORTHY ON;
2. The owner of both the context database and the master database must be the same. To set this property we need to:
   2.1 Find who is the owner of master:
           exec sp_helpdb
   2.2 Set the context database owner to the same owner:
           ALTER AUTHORIZATION ON DATABASE::database TO user
3. The account under which we EXECUTE AS needs to be a database/server principal. The database and server principals can be found in the sys.database_principals and the sys.server_principals tables.
4. The account needs to be granted impersonate permissions:
     GRANT IMPERSONATE ON USER::user in EXECUTE AS TO user executing the sproc
5. If a server login is specified (instead of a database user), it needs to be mapped to a database user.
Also, if SQL Server is running under a local account/service it is not possible to use the EXECUTE AS statement.
The same statement can be used to execute SQL Server Agent jobs under different context – something particularly useful when trying to run them through Integration Services.