Help Database design
-
9. dubna 2012 13:41
Hi, all,
I am new to SQL Server, and help is appreciated!
SQL Server 2008 Developer version, windows.
Table TA is with over millions records, and it is increasing every day. we need to generate reports from the table daily, the faster, the better.
What is in my mind to speed the performance: to create multiple file groups, table partition, create index. any suggestions? Thanks very much!
Oh, one more thing, I was testing a existing procedure, the first execution time was about 10 minutes, but the second execution time is about 3 minutes, I used DBCC DROPCLEANBUFFERS
DBCC FREEPROCCACHEwhy it did much faster even with cleaning buffer?
- Upravený SQLServerLearning 9. dubna 2012 13:45
Všechny reakce
-
9. dubna 2012 13:46PřispěvatelYes, start with having useful indexes on the table.... Second step could be to create partition ....
Best Regards, Uri Dimant SQL Server MVP http://dimantdatabasesolutions.blogspot.com/ http://sqlblog.com/blogs/uri_dimant/
- Upravený Uri DimantMVP, Editor 9. dubna 2012 13:46
-
9. dubna 2012 13:54
Hey, thanks very much for your help!!
About the procedure execution time, the third time is less than 3 minutes, could anyone explain a little bit to me?
-
9. dubna 2012 14:01Přispěvatel
Yes , because the query is compiled and execution plan inserted into a cache. The second time you run the query , the optimizer is able to determine that there is already 'good' enough execution plan in the cache and use it.
Also read Plamen's article about parameter sniffing http://pratchev.blogspot.com/2007/08/parameter-sniffing.html
Best Regards, Uri Dimant SQL Server MVP http://dimantdatabasesolutions.blogspot.com/ http://sqlblog.com/blogs/uri_dimant/
-
9. dubna 2012 14:44
Did you run DBCC DROPCLEANBUFFERS before each executions including the one that ran for less than 10 seconds? Basically it clears the cache and forces SQL Server to read all data from the disk. This could contribute to the long execution time especially if queries are not optimized and you have a lot of data/scans.
As the first step in the optimization (assuming you don't have specific configuration issues on the server), I'd look at the execution plan and IO statistics for the stored procedure. Optimize the queries if needed (add indexes, redesign queries, etc). Partitioning, filegroups, etc could help although I would not expect much until queries are properly optimized.
Thank you!
My blog: http://aboutsqlserver.com
-
9. dubna 2012 17:34
The first one did not run DBCC DROPCLEANBUFFERS DBCC FREEPROCCACHE, but the rest ones did. This is my first time using it, my understanding is whenever I run DBCC, the query will be running as it runs for the first time, is that right? If not, how can we know a query performance is improved or not?
The question I don't understand is why I did not change anything, and the procdure exectution is less and less, even with the DBCCs.
Thanks.
-
9. dubna 2012 20:21Moderátor
It is hard to explain, but time is one of the worst indicators of performance when you are tuning. The "It Works On My Machine" syndrome is basically about that. One run runs in a minute, the next slower, the next in two hours. Each of these could be the same design, the same plan and it could be blocks from other users causing you issues...
You need to look at the plans when you are running tests like this, and especially the statistics io information for logical and physical io (logical should, for the same data and plan be constant. physical ideally only occurs when the data is caching. Physical io that happens everytime a query executes often means memory pressure)
t may be that cache isn't helping you and that you are CPU bound... Query performance tuning is a really complex problem, and you may not even have enough memory to cache your data, so as it reads through the table it may end up cycling through all of the memory...
A million row table with a 10 (and even a 3) minute query leads me to believe that multiple tables are involved or lots of large rowsizes and possibly overflow pages...
As I see it, for this thread, we either need to toss it over to the Transact-SQL group, or let's look at the table design and see if it is truly a design issue.
Louis
-
10. dubna 2012 18:19
Hi, Louis,
Thanks very much for your education!
About exeution plan, would you tell me what usually should be reduced? for example, I have a simple query: select distinct A.colA, A.colB, A.colC from ta A inner join ta B
where A.colA=B.colA and where A.colD='1-1-2012' and A.colD <'2-2-2012'
the plan has parallelism(gather streams) cost 8%, distinct sort 61%, parallelism(Repartition Streams) cost:12%,Hash Match( inner join) cost 2%,clustered index scan(pk) cost 16%. And logical reads 14, physical reads 4.
For the million records table, it did not have any join there actually, and less than 40 columns. and pk is a autonumber with a datetime column. sometimes it takes a lot of time to run a query, sometimes much less, and it is hard to duplicate the query time. any idea about this?
Thanks.
-
11. dubna 2012 5:33Přispěvatel
Is that possible to re-write?
select distinct A.colA, A.colB, A.colC from ta A inner join ta B
on A.colA=B.colA
where A.colD>='1-1-2012' and A.colD <'2-2-2012'
Having CI on A.colD will speed up the query
Best Regards, Uri Dimant SQL Server MVP http://dimantdatabasesolutions.blogspot.com/ http://sqlblog.com/blogs/uri_dimant/
- Označen jako odpověď Maggie LuoMicrosoft Contingent Staff, Moderator 26. dubna 2012 6:54