Website Hosting and Development

Sunday, September 03, 2006

How to eliminate cursor in SQL Server

We all know how cursors are slow and how badly they tend to issue a lock on part of the table, or on the whole table, during its execution. To explain exactly what is going on, and what we can do to optimize the performance, here is an example:

Table t_Customers has 3000 records.

Table t_Orders has 1 to 100 records for each record in t_Customers table, making it a table with 65000 records.

We want to print all of the orders for customers that had orders in the last month. There are several solutions that we can use and first one is usage of the cursor:

Declare
@nCustID Int
, @cCustName varchar(30)
, @nOrdNo Int

Declare curCusOrd For
Select
C.CustID
, C.CustName
, O.OrderNo
From
t_Customers C
Left Join t_Orders O
On C.CustID = O.CustID
Where
O.OrderDate >= DateAdd( month, -1, GetDate() )

Fetch Next From curCusOrd
Into
@nCustID
, @cCustName
, @nOrdNo

While @@Fetch_Status = 0
Begin

/*
Print Order
*/

Fetch Next From curCusOrd
Into
@nCustID
, @cCustName
, @nOrdNo

End

Close curCusOrd
DeAllocate curCusOrd

In example above, cursor will lock part of each table, or both tables as a whole, while orders are printed. On top of this, a lot of disk activities will be done in order to scan both t_Customers and t_Orderd tables, resulting a very slow performance.

The first alternative is to create a temporary table, that will contain all key columns, and then to use it as a pointer instead of cursor. For a small dataset we can easily eliminate a slow performing and memory intense cursor with a temporary table. Here is a same procedure done with temporary table:

Declare
@nCustID Int
, @cCustName varchar(30)
, @nOrdNo Int
, @nRowCnt Int
, @nRows Int

Create Table #tempCustOrd
(
CusOrdID IDENTITY (1, 1) Primary key Not Null
, CustID Int Not Null
, CustName varchar(30) Not Null
, OrderNo Int
)

Insert #tempCustOrd
Select
C.CustID
, C.CustName
, O.OrderNo
From
t_Customers C
Left Join t_Orders O
On C.CustID = O.CustID
Where
O.OrderDate >= DateAdd( month, -1, GetDate() )

Select
@ nRowCnt = 1
, @nRows = Count(*)
From #tempCustOrd

While @nRowCnt <= @nRows
Begin
Select
C.CustID
, C.CustName
, O.OrderNo
From #tempCustOrd
Where CusOrdID = @nRowCnt

/*
Print Order
*/

Select @nRowCnt = @nRowCnt + 1

End

Drop Table #tempCustOrd

Locks that were issued on t_Customers and t_Orders are now eliminated, but the price is a lot of disk activities, along with the potential locking of whole temporary table. All of this adds up to a lousy performance, especially on large or long-running operations.

The third alternative is to use a SQL Server Table variable. This is a variable that can be used in stored procedures, functions and batches. They have no permanent life outside of the batch that contains them. They're cleaned up automatically at the end of the batch and you don't need to worry about name conflicts with anything outside of this scope. Within its scope, a table variable behaves like a permanent table.

Now lets take a look at our example using a Table variable. It is almost identical as the temporary table example. Only difference is the table declaration and the fact that you do not need to drop it after you're done using it. So here is our code with Table variable:

Declare
@nCustID Int
, @cCustName varchar(30)
, @nOrdNo Int
, @nRowCnt Int
, @nRows Int

Declare @t_CustOrd Table
(
CusOrdID IDENTITY (1, 1) Primary key Not Null
, CustID Int Not Null
, CustName varchar(30) Not Null
, OrderNo Int
)

Insert @t_CustOrd
Select
C.CustID
, C.CustName
, O.OrderNo
From
t_Customers C
Left Join t_Orders O
On C.CustID = O.CustID
Where
O.OrderDate >= DateAdd( month, -1, GetDate() )

Select
@ nRowCnt = 1
, @nRows = Count(*)
From @t_CustOrd

While @nRowCnt <= @nRows
Begin
Select
C.CustID
, C.CustName
, O.OrderNo
From @t_CustOrd
Where CusOrdID = @nRowCnt

/*
Print Order
*/

Select @nRowCnt = @nRowCnt + 1

End

Table variable resides in memory, there is no disk I/O, it does not use tempdb, therefore eliminating all of the locks. Execution is fast; it takes 1/5 of time to complete the same job then it would using cursors.

As you can expect there are limitations:
  • Server physical memory. Always keep your server memory in mind when you work with Table variable
  • You can not create index on the Table variable. If you need to work with a large amount of data with many indexes, use a temporary table instead of a table variable
  • The table scheme on Table variable is static. If you need to modify it in the course of a batch, use a temporary table instead.
  • You can not use Select Into statement on a Table variable
As a general rule of thumb, you should use table variable any time that your temporary data is of reasonable size and is only used a few times. As your temporary data grows in size, complexity, and reuse, a temporary table will be more appropriate. Using execution plans, SQL Profiler, and performance counters available to monitor what's going on inside of your stored procedures, you can code the alternatives and test them when there's any doubt.

Happy coding, Marin



Cheers, Marin
Check for more articles on this topic at Website Hosting and Development website.

You can also visit DIMM Info Systems Inc. website to see how else I can help you achive your goals.

0 Comments:

Post a Comment

<< Home