Odpovědět Help Database design

  • 9. dubna 2012 13:41
     
     

    Hi, all,

    I am new to SQL Server, and help is appreciated!

    SQL Server 2008 Developer version, windows.

    Table TA is with over millions records, and it is increasing every day. we need to generate reports from the table daily, the faster, the better.

    What is in my mind to speed the performance: to create multiple file groups, table partition, create index.  any suggestions? Thanks very much!

    Oh, one more thing, I was testing a existing procedure, the first execution time was about 10 minutes, but the second execution  time is about 3 minutes, I used DBCC DROPCLEANBUFFERS
    DBCC FREEPROCCACHE

    why it did much faster even with cleaning buffer?

Všechny reakce

  • 9. dubna 2012 13:46
    Přispěvatel
     
     
    Yes, start with having useful indexes on the table.... Second step could be to create partition ....

    Best Regards, Uri Dimant SQL Server MVP http://dimantdatabasesolutions.blogspot.com/ http://sqlblog.com/blogs/uri_dimant/


  • 9. dubna 2012 13:54
     
     

    Hey, thanks very much for your help!!

    About the procedure execution time, the third time is less than 3 minutes, could anyone explain a little bit to me?

  • 9. dubna 2012 14:01
    Přispěvatel
     
     

    Yes , because the query is compiled and  execution plan inserted into a cache. The second time you run the query , the optimizer is able to determine that there is already 'good' enough execution plan in the cache and use it.

    Also read Plamen's article about parameter sniffing http://pratchev.blogspot.com/2007/08/parameter-sniffing.html


    Best Regards, Uri Dimant SQL Server MVP http://dimantdatabasesolutions.blogspot.com/ http://sqlblog.com/blogs/uri_dimant/

  • 9. dubna 2012 14:44
     
     

    Did you run DBCC DROPCLEANBUFFERS before each executions including the one that ran for less than 10 seconds? Basically it clears the cache and forces SQL Server to read all data from the disk. This could contribute to the long execution time especially if queries are not optimized and you have a lot of data/scans. 

    As the first step in the optimization (assuming you don't have specific configuration issues on the server), I'd look at the execution plan and IO statistics for the stored procedure. Optimize the queries if needed (add indexes, redesign queries, etc). Partitioning, filegroups, etc could help although I would not expect much until queries are properly optimized. 


    Thank you!

    My blog: http://aboutsqlserver.com

  • 9. dubna 2012 17:34
     
     

    The first one did not run DBCC DROPCLEANBUFFERS DBCC FREEPROCCACHE, but the rest ones did. This is my first time using it, my understanding is whenever I run DBCC, the query will be running as it runs for the first time, is that right? If not, how can we know a query performance is improved or not?

    The question I don't understand is why I did not change anything, and the procdure exectution is less and less, even with the DBCCs.

    Thanks.

  • 9. dubna 2012 20:21
    Moderátor
     
     

    It is hard to explain, but time is one of the worst indicators of performance when you are tuning.  The "It Works On My Machine" syndrome is basically about that.  One run runs in a minute, the next slower, the next in two hours. Each of these could be the same design, the same plan and it could be blocks from other users causing you issues...

    You need to look at the plans when you are running tests like this, and especially the statistics io information for logical and physical io (logical should, for the same data and plan be constant. physical ideally only occurs when the data is caching. Physical io that happens everytime a query executes often means memory pressure)

    t may be that cache isn't helping you and that you are CPU bound...  Query performance tuning is a really complex problem, and you may not even have enough memory to cache your data, so as it reads through the table it may end up cycling through all of the memory...

    A million row table with a 10 (and even a 3) minute query leads me to believe that multiple tables are involved or lots of large rowsizes and possibly overflow pages...

    As I see it, for this thread, we either need to toss it over to the Transact-SQL group, or let's look at the table design and see if it is truly a design issue. 


    Louis

  • 10. dubna 2012 18:19
     
     

    Hi, Louis,

    Thanks very much for your education!

    About exeution plan, would you tell me what usually should be reduced? for example, I have a simple query: select distinct A.colA, A.colB, A.colC from ta A inner join ta B

    where A.colA=B.colA and where A.colD='1-1-2012' and A.colD <'2-2-2012'

    the plan has parallelism(gather streams) cost 8%, distinct sort 61%, parallelism(Repartition Streams) cost:12%,Hash Match( inner join) cost 2%,clustered index scan(pk) cost 16%. And logical reads 14, physical reads 4.

    For the million records table, it did not have any join there actually, and less than 40 columns. and pk is a autonumber with a datetime column. sometimes it takes a lot of time to run a query, sometimes much less, and it is hard to duplicate the query time. any idea about this?

    Thanks.

  • 11. dubna 2012 5:33
    Přispěvatel
     
     Odpovědět

    Is that possible to re-write?

    select distinct A.colA, A.colB, A.colC from ta A inner join ta B

    on A.colA=B.colA

    where A.colD>='1-1-2012' and A.colD <'2-2-2012'  

    Having CI on A.colD will speed up the query


    Best Regards, Uri Dimant SQL Server MVP http://dimantdatabasesolutions.blogspot.com/ http://sqlblog.com/blogs/uri_dimant/