select into temp table is very slow
-
Friday, February 15, 2013 12:04 AM
Hi there,
I have an issue using temporary table in my SQL. The sql is quite complicated (having full outer join, left join, join) but when I run the sql directly in query analyzer, it's fairly fast - returns 727 rows in 1 second. Now I do the following:
select distinct A into #TmpTable
from
(the sql that returns 727 rows) myresult
It's pretty slow at the first time (90 seconds), then faster (60 seconds) and faster (13 seconds).
I am wondering it's because of the tempdb not being configured properly. The space of tempdb seems OK (6GB free space), but the files are on C drive.
After a little while, when I run the script again it's slow again. Seems to me that something wakes up then goes to sleep again, then wakes up again when next request comes in. I am not sure it's because the tempdb database, or at the sql server, or at the OS level.
Could someone please provide some advices?
Thanks very much.
Mike
All Replies
-
Friday, February 15, 2013 2:39 AMsql is likely doing what you told it to do. post your sql?
-
Friday, February 15, 2013 2:47 AM
The DISTINCT forces the optimizer to use a different execution plan. DISTINCT requires the data to be sorted, so either SQL Server could add a SORT operator with is expensive and also the sorting speed is dependent on the available memory. Another option is that SQL Server might use a different index to avoid that sort. That might require an index scan wich in turn requires a lot more memory. On a memory constraint machine both scenarios could lead to the behavior you are seeing, but without the actual query, the actual execution plan, server memory size, table and index sizes and some information about the other queries running at the same time, all this is only speculation.
The first thing to look at is, if your server is memory constraint. The Page Live Expectancy PerfMon counter is a starting point for that (http://blog.sqlauthority.com/2010/12/13/sql-server-what-is-page-life-expectancy-ple-counter/).
-
Friday, February 15, 2013 2:55 AM
Hi Mike,
This doesn't initially sound like TempDB to me. At least not as being the biggest bottleneck. It sounds like the execution plan for running this by itself differs from when it's called as a virtual table. You just need to compare the two execution plans to see what the difference is. It almost sounds like the distinct operation is happening before the inner query finishes. You may be able to force it using option (force order). But you may want to research the negatives of applying such a hint first.
So, it sounds like when you run it by itself it's touching much less data than when it's embedded. This also partially explains why when it's run consecutively it gets faster. This is because the data is loaded into the buffer cache. Since so much data is being touched the biggest bottleneck is first reading from disk. However once loaded from disk the data resides in the buffer cache. So the next run is faster. Why the 3rd run is quicker though, I can't explain without seeing your system.
If I were you I would look at the execution plan to see where the thick arrows are. This is where a lot of rows are being touched. I would work on touching less rows and eliminating as many records early in the execution as possible. Maybe break the query into multiple steps.
Derek SQLPlanet.com
-
Friday, February 15, 2013 3:25 AM
There are many possibilities -
- Are you using DISTINCT only in the #TmpTable query or in the other iteration without a temp table too? Running without DISTINCT would be faster.
- Does the (the sql that returns 727 rows) myresult have many columns? Maybe it is a large data set so takes time to DISTINCT.
- When you run it for the first time, SQL Server buffers the data in its cache in an anticipation that it might be needed again. When the same query is run again, sever can get the data from the cache without doing an IO on the disk. That is the reason that subsequent runs of the query are faster. The data is kept in cache till the time cache is needed for something else and in case something else comes up, it will flush out the older data to make room for the new data. If your server is quite busy then this flush can happen quickly.
- When I try doing something like what you are doing and look at the execution plans, the query cost relative to the batch is 20% vs 80% ... with almost 75% of the cost in the second query associated with the Table Insert operator.
select * from parent_a select distinct * into #child_a from parent_a
Please mark this reply as the answer or vote as helpful, as appropriate, to make it useful for other readers.
Thanks!
Aalam | Blog (http://aalamrangi.wordpress.com)- Marked As Answer by Iric WenModerator Monday, February 25, 2013 9:25 AM

