none
T-SQL: Performance Issues with View; runs slow

    Question

  • SQL Version:  2008 R2

    Problem:  I am trying to improve the performance of this view (see code below).  It is running very slowly (7:43).  All the objects that this view depends on run very fast (2-15 seconds).  I don't understand what is slowing this up??

    Any ideas would be appreciated.

    Regards,

    ...bob sutor

    CODE FOR VIEW:

    SELECT DISTINCT
      PRTH.JCCo
     , PRTH.Job
     , FDTvJCJMPM.JobNum_JobName
     , FDTvJCJMPM.JobStatus
     , RolesSummary.ProjMgrNum
     , RolesSummary.ProjMgrName

     , (SELECT Max(PRTH2.PREndDate) FROM PRTH AS PRTH2
      Where  PRTH2.Employee = PRTH.Employee
        AND PRTH2.JCCo = PRTH.JCCo AND PRTH2.Job = PRTH.Job)
       AS EmployeeLastPRPosted
      
     , YEAR(PRTH.PREndDate) AS RptYear

     , CAST(DATEPART(yyyy,PRTH.PREndDate)AS varchar(4)) + '-' +
      CASE LEN(CAST(DATEPART(mm, PRTH.PREndDate) AS varchar(4)))
       WHEN  1 THEN '0' + CAST(DATEPART(mm,PRTH.PREndDate)AS varchar(4))
       WHEN  2 THEN CAST(DATEPART(mm,PRTH.PREndDate)AS varchar(4))
       END
      AS RptYearMonth
     , PRTH.PRCo
     , PRTH.Employee
     , HRRMName.FullName AS EmployeeName
     , CAST(Min(PRTH.PREndDate) AS Date) AS PRStartDate
     , CAST(Max(PRTH.PREndDate) AS Date) AS PREndDate

     , EmployeeReports.HireDate
     , EmployeeReports.HiredDateCutoffINFO
     , EmployeeReports.HiredDateCutoffStatus
     , EmployeeReports.InHRET_Mandatory
     , EmployeeReports.MandatoryElective
     
     
     , EmployeeReports.ExpirationStatus
     , EmployeeReports.ExpirationYear
     , EmployeeReports.ExpirationMonthYear
     , EmployeeReports.EmployeeActive

     , EmployeeReports.HRTCVendorName
     , CAST(EmployeeReports.LatestDateCompleted AS DATE) AS LatestDateCompleted
     , CAST(EmployeeReports.DueDate AS Date) AS DueDate
     , CAST(EmployeeReports.ExpirationDate AS Date) AS ExpirationDate
     , EmployeeReports.CertPeriod
     , EmployeeReports.CertPeriodUOM
     , EmployeeReports.HasExpirationDate
     , EmployeeReports.ExpDateToDueDate
     , EmployeeReports.ClassSeq
     , EmployeeReports.Instructor
     , EmployeeReports.Institution
     , CAST(EmployeeReports.CompleteDate AS Date) AS CompleteDate
     , EmployeeReports.ClassDesc
     , EmployeeReports.Hours
     
      --- CERT MASTER INFORMATION
     , EmployeeReports.CertGroup
     , EmployeeReports.CertCode
     , EmployeeReports.CertREF
     , EmployeeReports.CertREF_CertDesc
     , EmployeeReports.CertActive
     , EmployeeReports.DaysToExpiration
     , EmployeeReports.ManualSectionDescription
     , EmployeeReports.ManualSubsectionDesc
     , EmployeeReports.CertSubGroupDesc
     , EmployeeReports.ManualSection
     , EmployeeReports.ManualSubSection
     , EmployeeReports.CertFirmName
     , EmployeeReports.AuthorityDescription
     , EmployeeReports.ManualSubsecID
      
     FROM PRTH
     
      LEFT JOIN PREH
       ON PRTH.PRCo = PREH.PRCo
       AND PRTH.Employee = PREH.Employee
       AND PRTH.JCCo = PREH.JCCo
       AND PRTH.Job = PREH.Job
     
      LEFT JOIN FDTvJCJMPM
       ON PREH.JCCo = FDTvJCJMPM.JCCo
       AND PREH.Job = FDTvJCJMPM.JobNumber
     
      LEFT JOIN udvJobRoles_Summary AS RolesSummary
       ON PREH.JCCo = RolesSummary.JCCo
       AND PREH.Job = RolesSummary.Job
       AND RolesSummary.Role = 'PM'
       
      LEFT JOIN HRRMName
       ON PREH.PRCo = HRRMName.HRCo
       AND PREH.Employee = HRRMName.HRRef
       
      LEFT JOIN udvTrainEmployeeReports AS EmployeeReports
       ON PRTH.PRCo = EmployeeReports.Company
       AND PRTH.Employee = EmployeeReports.EmployeeNum
      
     WHERE
       PRTH.JCCo = 1
      AND PRTH.Job IS NOT NULL
      AND PRTH.PREndDate > = (SELECT Max(PRTH2.PREndDate) FROM PRTH AS PRTH2
        Where  PRTH2.Employee = PRTH.Employee
          AND PRTH2.JCCo = PRTH.JCCo AND PRTH2.Job = PRTH.Job)
          
      AND FDTvJCJMPM.JobStatus = 1
      AND EmployeeReports.LatestDateCompleted = EmployeeReports.CompleteDate
      
     GROUP BY
      PRTH.JCCo
     , PRTH.Job
     , FDTvJCJMPM.JobNum_JobName
     , FDTvJCJMPM.JobStatus
     , RolesSummary.ProjMgrNum
     , RolesSummary.ProjMgrName
     , PRTH.PRCo
     , PRTH.Employee

     , EmployeeReports.HireDate
     , EmployeeReports.HiredDateCutoffINFO
     , EmployeeReports.HiredDateCutoffStatus
     , EmployeeReports.InHRET_Mandatory
     , EmployeeReports.MandatoryElective

     , PRTH.PREndDate
     , HRRMName.FullName
     
     , EmployeeReports.ExpirationYear
     , EmployeeReports.ExpirationMonthYear
     , EmployeeReports.EmployeeActive

     , EmployeeReports.CertGroup
     , EmployeeReports.CertCode
     , EmployeeReports.CertREF
     , EmployeeReports.CertREF_CertDesc
     , EmployeeReports.CertActive
     , EmployeeReports.DaysToExpiration
     , EmployeeReports.ExpirationStatus
     , EmployeeReports.CertSubGroup
     , EmployeeReports.CertSubGroupDesc
     , EmployeeReports.ManualSection
     , EmployeeReports.ManualSubSection
     , EmployeeReports.ManualSectionDescription
     , EmployeeReports.ManualSubsectionDesc
     , EmployeeReports.AuthorityDescription
     , EmployeeReports.ManualSubsecID
     , EmployeeReports.CertFirmName
     , EmployeeReports.CertPeriod
     , EmployeeReports.CertPeriodUOM
     , EmployeeReports.HasExpirationDate
     , EmployeeReports.ExpDateToDueDate

     , EmployeeReports.LatestDateCompleted
     , EmployeeReports.DueDate
     , EmployeeReports.ExpirationDate

     , EmployeeReports.ClassSeq
     , EmployeeReports.Instructor
     , EmployeeReports.Institution
     , EmployeeReports.CompleteDate
     , EmployeeReports.ClassDesc
     , EmployeeReports.Hours
     , EmployeeReports.HRTCVendorName
       
    GO


    Bob Sutor

    Sunday, December 29, 2013 5:42 PM

Answers

  • The views are nested about 4 deep and were created by me.

    OK, in that case the risk is that the optimization exercise goes beyond what is possible to over a forum.

    While nesting views is appealing from the point of view of reusing code, it is not always a good thing for good query performance. I don't obviously know about this case, but often when you dig into a number of nested views you find that the some tables are referred to more than once without there being any need for it, and there many be tables that are involved that are not needed for the query at hand.

    You should be aware of that views are just a logical concept and you can think of them as macros. When SQL Server optimizes a query, it does not work with the views; it works with the expanded query.

    So that is once advice I can give: redefine the views to work from the base tables instead. In this way you can eliminate multiple references to tables which you don't need.

    Also, you said in another post: I need the distinct to eliminate duplicate records. This is often a result of an incomplete data model, or an incomplete analysis of what the query should produce. You query is on the form

    FROM A
    LEFT JOIN B ON A.col1 = B.col1
    LEFT JOIN C ON A.col2 = C.col2
    LEFT JOIN D ON A.col3 = D.col3
    ...

    When I see this, I always get suspicious. First of all, you have conditions in the WHERE clause against the left-joined tables, which in practice means that you have an inner join. This is in itself not much of a problem, but it indicates that the query writer has not analysed the query in that much detail, but only put in left join just in case.

    What can be a big problem with the above is the cardinality. If all these joins are lookups on nullable columns, there is no issue. And if one of them can add multiple rows to the result set, no problem. That is if the join to B can match multiple rows while the joins to C and D are lookups, you are still safe. But if there can be multiple matches in more than one table, your query is going down the drain. Say that for a certain row in A, you get five matches in B, three matches in C and eight matches in D. That makes a total 120 rows, that is all possible combinations of the matches in B, C and D. There are two implications of this: 1) You get en enormous number of rows and no wonder performance is bad. 2) The result is unlikely to make sense.

    Here I did not even come into the topic of performance tuning and indexing as such. My first question was whether the output is correct. And that is the starting point. There is little use in tuning queries that return incorrect results.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
    • Proposed as answer by Naomi NModerator Monday, December 30, 2013 6:34 PM
    • Marked as answer by ConstPM Tuesday, December 31, 2013 6:33 AM
    Monday, December 30, 2013 8:27 AM

All replies

  • Two tips:

    1. Lose the DISTINCT keyword. For a query like this - depending on the amount of rows returned - it is a performance killer
    2. Check if you have useful indexes for the predicates in the WHERE clause, and for each JOIN

    The fact that a selection of one table takes up to 15 seconds, and that of another table also takes up to 15 seconds, is no guarantee that if those results are joined together that the performance is anywhere close to 30 seconds. On the other hand, eliminating some joins from the query might point to where your indexes are not very useful.


    Gert-Jan

    Sunday, December 29, 2013 6:01 PM
  • First step is to generate an estimated execution plan for this query.  Which operation is taking the most time?  Try to identify the part of this select statement that corresponds to that long running operation.

    Russel Loski, MCT, MCSE Data Platform/Business Intelligence. Twitter: @sqlmovers; blog: www.sqlmovers.com

    Sunday, December 29, 2013 6:04 PM
  • I would drop subquery in the select list and min/max and GROUP BY in the query and rather re-write it with ROW_NUMBER(), cte and appropriate partition. You may also remove some of the extra joins and start from the basic adding extra tables one at a time.

    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog


    My TechNet articles

    Sunday, December 29, 2013 6:37 PM
  • Does the query produce the expected result?

    Query optimisation can be done out of thin air. Do be able to assist we need the CREATE TABLE and CREATE INDEX statements for all tables in the queries. (And if they are views, we need their definitions etc). We also need to see the actual query plan, and we need the XML, not only a screen shot of the graphics.

    With the query plan, this will be more than a mouthfull to post here, but if you can upload it do Dropbox, Skydrive a web site or whatever that would be great.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
    Sunday, December 29, 2013 6:58 PM
  • Hello,

    There are a number of common reasons for slow-running queries, for example,slow network communication, inadequate memory,lack of useful indexes and statistics, and so on.

    To analyze the performance of a slow-running query, just as other post above, you should review the query plan firstly.
    As for index, you can use the missing indexes feature which provide information about missing indexes that could enhance SQL Server query performance.You can refer to the missing index query satatement in the this blog and you can try to combine with SQL Server Database Engine Tuning Advisor for further evaluation.

    Regards,
    Fanny Liu

    If you have any feedback on our support, please click here.


    Fanny Liu
    TechNet Community Support

    Monday, December 30, 2013 2:59 AM
  • Naomi,

    I need the distinct to eliminate duplicate records.  I removed the Distinct and the query only ran 5 seconds faster.  

    I don't have any indexes on the objects used by this view, nor on any columns used in Where Clause predicates.  My knowledge of how to use indexes is almost non-existent.  I know how to create indexes, but I don't have a grasp on when and how to use them.  Could you suggest a reference?

    Thanks


    Bob Sutor

    Monday, December 30, 2013 4:48 AM
  • Naomi,

    I am not aware of code options in SQL 2008 R2 that would allow me to replace the subquery in my select statement with an Over Clause.  I know SQL 2012 has added capabilities to the Over Clause which may allow this, but is there another option in SQL 2008 R2?

    ...bob

    HERE IS THE SUBQUERY I'm REFERRING TO:

     , (SELECT Max(PRTH2.PREndDate) FROM PRTH AS PRTH2
      Where  PRTH2.Employee = PRTH.Employee
        AND PRTH2.JCCo = PRTH.JCCo AND PRTH2.Job = PRTH.Job)
       AS EmployeeLastPRPosted


    Bob Sutor

    Monday, December 30, 2013 4:56 AM
  • Erland,

    I'm afraid I'm a raw beginner when it comes to Query Optimisation.  Most of the tables are objects in a proprietary enterprise database and I was not involved in creating them.  Don't really understand what you mean by CREATE TABLE and CREATE INDEX statements for all tables and queries.  The views are nested about 4 deep and were created by me.

    Do you have any suggested reading for Query Optimisation?

    And yes the query does produce the expected result, it just runs slow.

    Thanks,

    ...bob


    Bob Sutor


    • Edited by ConstPM Monday, December 30, 2013 5:03 AM
    Monday, December 30, 2013 5:01 AM
  • Bob

    Just a guess, can you re-write this sub query with CROSS APPLY operator?


    Best Regards,Uri Dimant SQL Server MVP, http://sqlblog.com/blogs/uri_dimant/

    MS SQL optimization: MS SQL Development and Optimization
    MS SQL Consulting: Large scale of database and data cleansing
    Remote DBA Services: Improves MS SQL Database Performance
    SQL Server Integration Services: Business Intelligence

    Monday, December 30, 2013 6:41 AM
  • The views are nested about 4 deep and were created by me.

    OK, in that case the risk is that the optimization exercise goes beyond what is possible to over a forum.

    While nesting views is appealing from the point of view of reusing code, it is not always a good thing for good query performance. I don't obviously know about this case, but often when you dig into a number of nested views you find that the some tables are referred to more than once without there being any need for it, and there many be tables that are involved that are not needed for the query at hand.

    You should be aware of that views are just a logical concept and you can think of them as macros. When SQL Server optimizes a query, it does not work with the views; it works with the expanded query.

    So that is once advice I can give: redefine the views to work from the base tables instead. In this way you can eliminate multiple references to tables which you don't need.

    Also, you said in another post: I need the distinct to eliminate duplicate records. This is often a result of an incomplete data model, or an incomplete analysis of what the query should produce. You query is on the form

    FROM A
    LEFT JOIN B ON A.col1 = B.col1
    LEFT JOIN C ON A.col2 = C.col2
    LEFT JOIN D ON A.col3 = D.col3
    ...

    When I see this, I always get suspicious. First of all, you have conditions in the WHERE clause against the left-joined tables, which in practice means that you have an inner join. This is in itself not much of a problem, but it indicates that the query writer has not analysed the query in that much detail, but only put in left join just in case.

    What can be a big problem with the above is the cardinality. If all these joins are lookups on nullable columns, there is no issue. And if one of them can add multiple rows to the result set, no problem. That is if the join to B can match multiple rows while the joins to C and D are lookups, you are still safe. But if there can be multiple matches in more than one table, your query is going down the drain. Say that for a certain row in A, you get five matches in B, three matches in C and eight matches in D. That makes a total 120 rows, that is all possible combinations of the matches in B, C and D. There are two implications of this: 1) You get en enormous number of rows and no wonder performance is bad. 2) The result is unlikely to make sense.

    Here I did not even come into the topic of performance tuning and indexing as such. My first question was whether the output is correct. And that is the starting point. There is little use in tuning queries that return incorrect results.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
    • Proposed as answer by Naomi NModerator Monday, December 30, 2013 6:34 PM
    • Marked as answer by ConstPM Tuesday, December 31, 2013 6:33 AM
    Monday, December 30, 2013 8:27 AM
  • Thanks so much for the info Erland.  I'll go back and rethink my design in light of your comments.  The one thing that is correct with my existing query (view) is that it delivers the correct result set.

    Thanks again.

    ...bob


    Bob Sutor

    Monday, December 30, 2013 6:04 PM
  • My suggestion is to drop this part completely for now. We're trying to find the slow part and I think it may be one of the things slowing down. However, you said you don't have indexes at all? Also, are you selecting from tables or from other views? If the latter, then no wonder the performance is dreadful.

    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog


    My TechNet articles

    Monday, December 30, 2013 6:32 PM
  • Just to iterate Erland's point, take a look at this blog post

    T-SQL Wednesday #21 – T-SQL That Should Have Been Flushed Down the Toilet


    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog


    My TechNet articles

    Monday, December 30, 2013 6:36 PM
  • Hi Bob,

    I went to LTD website and found this interesting book review

    Book Review: Inside the SQL Server Query Optimizer

    Sounds like a good book which may be helpful.


    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog


    My TechNet articles

    Monday, December 30, 2013 6:39 PM