Saturday, December 10, 2011

Optimizing query for linked servers usage

A couple of days ago I was finishing my 1500+ lines procedure that was fetching some portions of data from a linked server.

I was happy to see that my procedure runs smoothly in some 5-7 seconds, depending on the branches it takes in its run. However, this was only true for the procedure being run from the SQL Server Management Studio (SSMS). When I ran the procedure from the application (in this case a standalone C# application) the procedure seemed to run indefinitely and the program stalled. So why is this happening?

After running several searches on the Google I found this interesting article that suggested a couple of hints that brought me eventually to the solution.

The problem with long stored procedures that have heterogeneous data access over several databases and servers is that the query plan for such procedure probably sucks (this is a technological term!). Having a bad (or sucky) plan is of course not good, as the usage of table indices on that plan is minimized or discarded. It seems that SQL Server does not take those (or finds it very hard to take them) into account when performing linked queries.

So a colleague of mine had this basic idea to do as much work possible on the (remote) linked server with its own database objects, that is to write stored procedures and functions on that (remote) server, which will utilize its own tables, table indices, etc. On the (local) server, we would rewrite the procedure, so that it only calls those (remote) procedures and fills the returned data set into a local temporary table. Onward in the procedure we would work only with those temporary tables. Or to put it in the batch words:

CREATE #DATA_TABLE (
 Column definition
)


DECLARE @SQL_STMT NVARCHAR(500)
SET @SQL_STMT = N'EXEC RemoteDatabaseName.Schema.ProcedureName @ParameterName = ' + @ParameterValue


INSERT INTO #DATA_TABLE
EXEC RemoteServer.master.sp_executesql(@SQL_STMT)


SELECT ...
FROM #DATA_TABLE
  JOIN SomeLocalTable ON ...

The result was great! Except from procedure being able to run from the C# application, we also gained a performance improvement of 2-4 seconds (which is 40-57%).

So in case you are experiencing the same problems, try to divide your database object in a batch query so that they run locally for that server (if possible). If not... Well you can always write a reasonable amount of join queries across the linked server's landscape.