One of my pet hates is SQL Server cursors. Yes, I know that they are part of the product for a reason, and I know that they are very handy. So why do I hate them, and what can be done to avoid them? Are they a necessary evil? I developed a technique to avoid them a few years ago, and some others people have since designed some techniques as well, such as on this post. These other techniques are great, but my style of doing things has the one advantage that it can be repeated and nested in complex ways because of the use of variable tables... as you'll see below.
Well, lets first create some test data. I used a server with both SQL2000 and SQL2005 server on to do some tests just to show this technique works on both. I created an empty database with an empty table, and then added 1,000 random numbers between the value of 1-100 into the table. The script I used to create the table is below:
CREATE TABLE [dbo].[test] (
[testid] [int] IDENTITY (1, 1) NOT NULL ,
[randnum] [int] NOT NULL
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[test] WITH NOCHECK ADD
CONSTRAINT [PK_test] PRIMARY KEY CLUSTERED
(
[testid]
) ON [PRIMARY]
GO
I then constructed two very simple queries which took a number as a parameter and returned the number of columns matching that number. Yes, I know that I could have done a single SELECT to get that, but that's not the point of the exercise :) Where this technique IS essential is where you find yourself doing complex nested group-bys and start to think about pivots and the like (and especially where there's no reliable numeric ID column) Basically, where row-by-row operations are needed.
Firstly, the cursor script I used looked like this:
DECLARE @iCheckNumber int,
@iCurrentNumber int,
@iNumberCount int
SELECT @iCheckNumber = 35,
@iNumberCount = 0
DECLARE testnum CURSOR FOR
SELECT randnum FROM test
OPEN testnum
FETCH testnum INTO @iCurrentNumber
WHILE @@FETCH_STATUS = 0
BEGIN
FETCH testnum INTO @iCurrentNumber
IF @iCurrentNumber = @iCheckNumber
BEGIN
SELECT @iNumberCount = @iNumberCount + 1
END
END
CLOSE testnum
DEALLOCATE testnum
SELECT @iCurrentNumber
As you can see, it's a pretty standard cursor script. I ran it from Query Analyzer (2000) or Management Studio (2005) and it ran fine and returned the results I wanted. So how do I do the same kind of thing without cursors and without creating umpteen temporary tables? Well, I can use the T-SQL support for loops, and variable tables, as below:
DECLARE @iCheckNumber int,
@iCurrentNumber int,
@iNumberCount int,
@iCurrentRow int
SELECT @iCheckNumber = 35,
@iNumberCount = 0
DECLARE @temptest AS TABLE
(
iRowID int identity(1,1),
randnum int
)
INSERT INTO @temptest (randnum)
SELECT randnum FROM test
SELECT @iCurrentRow = 1
WHILE @iCurrentRow < (SELECT COUNT(*) FROM test)
BEGIN
IF (SELECT randnum FROM @temptest WHERE iRowID = @iCurrentRow) = @iCheckNumber
BEGIN
SELECT @iNumberCount = @iNumberCount + 1
END
SELECT @iCurrentRow = @iCurrentRow + 1
print @iCurrentRow
END
Note that I'm creating an identity column on my temporary table. This makes it ideal for writing queries where your data has no SEQUENTIAL numeric identity column already. One common use for this is with paging. Imagine you have a table with an ID and Name column. You want to pull back your names 500 at a time. You can get the first 500 no problem, just do a SELECT TOP 500. When you come to get 500-1000 you have problems - and that's where a cursor is normally used.
With my technique, you can put all your unique IDs in the temporary table, and then just iterate through the row numbers you want.
Note that the example I used here probably performs better as a cursor because it is so simple, but that's not a true reflection of cursors. They perform badly, even in 2005, and the more complex they get then, in my experience, the more useful it is to avoid