locked
my query is running slow RRS feed

  • Question

  • I have this query that is taking about 20 minutes and I just thought I would find a suggestion to optimize it from you. 

    Thank you. 

    DECLARE @BeginDate	DATETIME = '2011-01-16 00:00:00.000',
    	@EndDate	DATETIME = '2011-01-20 10:34:09.999';
    ;WITH A AS (
            SELECT	ABR.ABId, AR.ARID, AR.X, AR.D, AR.S, AR.CON, AR.SON, AR.RRON
            FROM	dbo.AR AS AR
    	JOIN	dbo.ABR AS ABR ON ABR.ARID = AR.ARID
            WHERE	AR.CON >= @BeginDate AND AR.CON <= @EndDate),
        B AS (
            SELECT	R.ARID, COUNT(DISTINCT J.ADID) NCS
            FROM	A R
    	JOIN	dbo.RUD	AS RUD ON RUD.X = R.X
    	JOIN	dbo.J ON J.ADID = RUD.ADID
    	GROUP BY R.ARID)
    --getting the final result set
    SELECT	A.ABID, A.ARID, A.X, B.NCS, A.D, A.S, A.CON, A.SON, A.RRON
    FROM	A AS A
    JOIN	B AS B ON B.ARID = A.ARID

    --table definition

    CREATE TABLE [dbo].[AR]( [ARID] [int] IDENTITY(1,1) NOT NULL, [X] [varchar](32) NOT NULL, [R] [xml] NOT NULL, [S] [varchar](16) NOT NULL, [COn] [datetime] NULL, [SOn] [datetime] NULL, [RROn] [datetime] NULL, [D] [date] NULL, CONSTRAINT [PK_AR] PRIMARY KEY CLUSTERED ([ARID] ASC) GO

    CREATE TABLE [dbo].[RUD]([X] [varchar](32) NOT NULL,[ADId] [int] NOT NULL,[D] [date] NOT NULL,
     CONSTRAINT [PK_RUD] PRIMARY KEY CLUSTERED ([ADId] ASC,[X] ASC
     )
    GO

    CREATE TABLE [dbo].[J]([JID] [int] IDENTITY(1,1) NOT NULL,[ADID] [int] NOT NULL CONSTRAINT [PK_J] PRIMARY KEY CLUSTERED ([JID] ASC))
    GO




    • Edited by In-efficient Saturday, February 13, 2016 8:24 PM
    Saturday, February 13, 2016 8:23 PM

Answers

  • What about filtering B?

    Something like

    WITH    A AS ( SELECT   ABR.ABId ,
                            AR.ARID ,
                            AR.X ,
                            AR.D ,
                            AR.S ,
                            AR.CON ,
                            AR.SON ,
                            AR.RRON
                   FROM     dbo.AR AS AR
                            INNER JOIN dbo.ABR AS ABR ON ABR.ARID = AR.ARID
                   WHERE    AR.CON >= @BeginDate
                            AND AR.CON <= @EndDate
                 )
        SELECT  R.ARID ,
                COUNT(DISTINCT J.ADID) NCS ,
                A.ABID ,
                A.X ,
                B.NCS ,
                A.D ,
                A.S ,
                A.CON ,
                A.SON ,
                A.RRON
        FROM    A R
                JOIN dbo.RUD AS RUD ON RUD.X = R.X
                JOIN dbo.J ON J.ADID = RUD.ADID
                INNER JOIN A ON R.ARID = A.ARID
        GROUP BY R.ARID ,
                A.ABID ,
                A.X ,
                B.NCS ,
                A.D ,
                A.S ,
                A.CON ,
                A.SON ,
                A.RRON

    • Proposed as answer by Naomi N Sunday, February 14, 2016 7:18 PM
    • Marked as answer by In-efficient Sunday, February 14, 2016 9:50 PM
    Saturday, February 13, 2016 9:11 PM
  • Looking closer, I think we can make the query even simpler:

    SELECT  ABR.ABId, AR.ARID, AR.X, B.NCS, AR.D, AR.S, AR.CON, AR.SON, AR.RRON
    FROM     dbo.AR AS AR
    JOIN     dbo.ABR AS ABR ON ABR.ARID = AR.ARID
    CROSS   APPLY (SELECT COUNT(DISTINCT J.DID) AS NCS
                   FROM   dbo.RUD
                   JOIN    dbo.J ON J.ADID = RUD.ADID
                   WHERE  RUD.X = AR.X) AS B
    WHERE     AR.CON >= @BeginDate   AND   AR.CON <= @EndDate

    But again, with the cryptic table and column names it is difficult to understand the original query. You need to verify that my rewrite gives the correct result.

    • Marked as answer by In-efficient Monday, February 15, 2016 7:08 PM
    Saturday, February 13, 2016 11:04 PM

All replies

  • Without knowing anything about the tables and not knowing the cardinality relations, it's a little difficult. But I believe this query is equivalent:

    SELECT  ABR.ABId, AR.ARID, AR.X, B.NCS, AR.D, AR.S, AR.CON, AR.SON, AR.RRON
    FROM     dbo.AR AS AR
    JOIN     dbo.ABR AS ABR ON ABR.ARID = AR.ARID
    CROSS   APPLY (SELECT AR.ARID; COUNT(DISTINCT J.DID) AS NCS
                   FROM   dbo.RUD
                   JOIN    dbo.J ON J.ADID = RUD.ADID
                   WHERE  RUD.X = AR.X
                   GROUP  BY AR.ARID) AS B
    WHERE     AR.CON >= @BeginDate   AND    AR.CON <= @EndDate

    Whether it is significantly faster I don't know, but it should have fewer accesses to the tables.

    • Proposed as answer by Naomi N Sunday, February 14, 2016 2:25 AM
    • Unproposed as answer by Naomi N Sunday, February 14, 2016 2:25 AM
    Saturday, February 13, 2016 8:49 PM
  • What about filtering B?

    Something like

    WITH    A AS ( SELECT   ABR.ABId ,
                            AR.ARID ,
                            AR.X ,
                            AR.D ,
                            AR.S ,
                            AR.CON ,
                            AR.SON ,
                            AR.RRON
                   FROM     dbo.AR AS AR
                            INNER JOIN dbo.ABR AS ABR ON ABR.ARID = AR.ARID
                   WHERE    AR.CON >= @BeginDate
                            AND AR.CON <= @EndDate
                 )
        SELECT  R.ARID ,
                COUNT(DISTINCT J.ADID) NCS ,
                A.ABID ,
                A.X ,
                B.NCS ,
                A.D ,
                A.S ,
                A.CON ,
                A.SON ,
                A.RRON
        FROM    A R
                JOIN dbo.RUD AS RUD ON RUD.X = R.X
                JOIN dbo.J ON J.ADID = RUD.ADID
                INNER JOIN A ON R.ARID = A.ARID
        GROUP BY R.ARID ,
                A.ABID ,
                A.X ,
                B.NCS ,
                A.D ,
                A.S ,
                A.CON ,
                A.SON ,
                A.RRON

    • Proposed as answer by Naomi N Sunday, February 14, 2016 7:18 PM
    • Marked as answer by In-efficient Sunday, February 14, 2016 9:50 PM
    Saturday, February 13, 2016 9:11 PM
  • Hi Erland

    I'm getting the following error:

    Msg 164, Level 15, State 1, Line 12
    Each GROUP BY expression must contain at least one column that is not an outer reference.
    Saturday, February 13, 2016 10:11 PM
  • Looking closer, I think we can make the query even simpler:

    SELECT  ABR.ABId, AR.ARID, AR.X, B.NCS, AR.D, AR.S, AR.CON, AR.SON, AR.RRON
    FROM     dbo.AR AS AR
    JOIN     dbo.ABR AS ABR ON ABR.ARID = AR.ARID
    CROSS   APPLY (SELECT COUNT(DISTINCT J.DID) AS NCS
                   FROM   dbo.RUD
                   JOIN    dbo.J ON J.ADID = RUD.ADID
                   WHERE  RUD.X = AR.X) AS B
    WHERE     AR.CON >= @BeginDate   AND   AR.CON <= @EndDate

    But again, with the cryptic table and column names it is difficult to understand the original query. You need to verify that my rewrite gives the correct result.

    • Marked as answer by In-efficient Monday, February 15, 2016 7:08 PM
    Saturday, February 13, 2016 11:04 PM
  • Hi Stefan,

    Thanks and it takes about 30 seconds. 

     

    Saturday, February 13, 2016 11:17 PM
  • This script does not return the correct number of rows. 

    Sorry about that Erland. I do not know if this helps but here is the table names:

    ABR = AnnualBudgetReport
    AR = AnnaulReport
    J = Journals
    RUD = ReportUnidentifiedData



    Sunday, February 14, 2016 12:13 AM
  • This script does not return the correct number of rows. 

    Sorry about that Erland. I do not know if this helps but here is the
    table names:

    ABR = AnnualBudgetReport
    AR = AnnaulReport
    J = Journals
    RUD = ReportUnidentifiedData

    I would need to more than so. Ideally, I would have CREATE TABLE scripts for all tables and INSERT statements with sample data, enough to illustrate the problem. Then I would also have the desired output given the sample. And finally I would need a short description of the business rules.
    But it seems that Stefan's rewrite works for you, why I guess we can consider the case closed.

    Sunday, February 14, 2016 11:09 AM
  • Thanks Erland. Yes, I will mark it as answered. 

    Thanks again. 

    Sunday, February 14, 2016 9:49 PM
  • This query returned the correct number of rows. And there was no need for the DISTINCT (since the J.DID is distinct) on the CROSS APPLY and the performance went down to 2 seconds. 

    Thanks for all the support. 

    In-

    • Marked as answer by In-efficient Wednesday, March 23, 2016 4:05 PM
    • Unmarked as answer by In-efficient Wednesday, March 23, 2016 4:05 PM
    Wednesday, March 23, 2016 4:05 PM