How to join and optimize the query

    Question

  • Hello All,

    I need help with the scenario below. I am working on a customer aging query. Since the data is large and it is a daily process, I have taken two approaches to the bucketing.

    One: calculate outstanding and buckets for all dates for all customers, i.e. starting from the first date of each customer's transactions; this runs only periodically. Two: an incremental approach, where for every customer I check whether any records were modified between the last SSIS package run and now, and if so, I take the minimum transaction date of the modified records and recalculate outstanding and buckets from that date onward.

    When I calculate using the first method, i.e. over all transactions, I have no issues. But with the second method, i.e. starting from an intermediate date, I face an issue: to calculate the buckets I also need the records of all previous dates. E.g. for bucket 0-3, SUM(SUM(ISNULL(([A].AMOUNTMST),0))) over (partition by ACCOUNTNUM order by TRANSDATE rows between 3 preceding and 0 preceding) needs the previous 3 days of data, whereas my data starts only at the modified transaction date.
    If I use a subquery to get the sum of all previous records, everything works well; it just takes significant time, e.g. 2 minutes 20 seconds with the subquery compared to 16 seconds without it, for some sample customer data. Since the database is huge, we need to optimize the query.
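
    To illustrate, here is a minimal sketch of the kind of correlated subquery I mean, simplified to a single bucket and illustrative only (the table name and exact filters may differ from the attached script, and it assumes one row per account per day):

    DECLARE @TransStartDate datetime = '2016-10-14';  --min modified transaction date (example value)

    SELECT
        [A].[ACCOUNTNUM],
        [A].[TRANSDATE],
        --reach back into the full table for the previous days that the incremental slice does not contain
        (SELECT SUM(ISNULL([B].AMOUNTMST,0))
         FROM dbo.Fact_Customer_Outstanding_Daily [B]
         WHERE [B].ACCOUNTNUM = [A].ACCOUNTNUM
           AND [B].TRANSTYPE = 'Debit'
           AND [B].TRANSDATE BETWEEN DATEADD(DAY,-3,[A].[TRANSDATE]) AND [A].[TRANSDATE]) AS Bkt0_3
    FROM dbo.Fact_Customer_Outstanding_Daily [A]
    WHERE [A].TRANSDATE >= @TransStartDate
      AND [A].TRANSTYPE = 'Debit'
    GROUP BY [A].[ACCOUNTNUM], [A].[TRANSDATE]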

    Looking for your valuable suggestions on how to improve performance in this case. Basically, I am struggling with how to join two tables,
    i.e. the query that has only the data from the modified transaction date with the query that has all the data. I want the all-data query only for calculating the sums over previous records. Please note, one of the buckets is 12 days or more. I have attached a partial SQL script for the bucketing.

    SQL Script

    Output
    Thanks in Advance.

    Meenxi


    • Edited by meenxi Wednesday, June 13, 2018 1:07 PM
    Wednesday, June 13, 2018 11:23 AM

All replies

  • Can you post your expected output in table format?
    Wednesday, June 13, 2018 11:38 AM
  • Post updated with the required output. You may not be able to reproduce that output exactly; I just need suggestions on how to do the join, or other approaches.
    Wednesday, June 13, 2018 1:08 PM
  • Like this?

    select *
    ,ISNULL(SUM(AMOUNT) over (partition by CustomerID order by TRANSDATE desc ROWS BETWEEN CURRENT ROW AND 3 FOLLOWING),0) b0_3
    ,ISNULL(SUM(AMOUNT) over (partition by CustomerID order by TRANSDATE desc ROWS BETWEEN 4 FOLLOWING AND 6 FOLLOWING),0) b4_6
    ,ISNULL(SUM(AMOUNT) over (partition by CustomerID order by TRANSDATE desc ROWS BETWEEN 7 FOLLOWING AND 9 FOLLOWING),0) b7_9
    ,ISNULL(SUM(AMOUNT) over (partition by CustomerID order by TRANSDATE desc ROWS BETWEEN 10 FOLLOWING AND 12 FOLLOWING),0) b10_12
    ,ISNULL(SUM(AMOUNT) over (partition by CustomerID order by TRANSDATE desc ROWS BETWEEN 13 FOLLOWING AND UNBOUNDED FOLLOWING),0) above_12
    FROM dbo.Customer_Outstanding_Daily
    Where [TRANSDATE] >= @AgeingStartDate AND CustomerID = @AccNum
    AND TRANSTYPE = 'Debit'
    order by [TRANSDATE]
    

    Wednesday, June 13, 2018 4:13 PM
    Moderator
  • This wouldn't work: if I take records only from an intermediate date, I don't have the earlier records, so DESC would not help.

    I have changed this to keep all records in the dataset and added a CASE statement to check whether my TRANSDATE is >= the minimum modified transaction date. This has improved performance.

    (CASE WHEN A.TRANSDATE >= @TransStartDate THEN SUM(SUM(CASE WHEN [A].TRANSTYPE = 'Debit' THEN ISNULL(([A].AMOUNTMST),0) ELSE 0 END)) over (partition by ACCOUNTNUM order by TRANSDATE rows between 3 preceding and 0 preceding) ELSE 0 END) as Bkt0_3,

    I just wanted to check whether there is a better way, because the CASE statement is still evaluated for all records starting from day 1 of the customer's transaction history, even though the actual bucket calculation starts only from the modified transaction date.
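
    For reference, a rough, simplified sketch of how that expression might be wrapped, reduced to a single bucket (illustrative only, not the actual script):

    DECLARE @AccNum Nvarchar(20) = 'A',               --sample account
            @TransStartDate datetime = '2016-10-14';  --min modified transaction date (example values)

    SELECT
        [A].[ACCOUNTNUM],
        [A].[TRANSDATE],
        --the window still runs over the full history; the CASE only zeroes out rows before the incremental start
        (CASE WHEN [A].TRANSDATE >= @TransStartDate
              THEN SUM(SUM(CASE WHEN [A].TRANSTYPE = 'Debit' THEN ISNULL(([A].AMOUNTMST),0) ELSE 0 END))
                   over (partition by ACCOUNTNUM order by TRANSDATE rows between 3 preceding and 0 preceding)
              ELSE 0 END) AS Bkt0_3
    FROM dbo.Fact_Customer_Outstanding_Daily [A]
    WHERE [A].ACCOUNTNUM = @AccNum      --no TRANSDATE filter, so all records from day 1 are still read
    GROUP BY [A].[ACCOUNTNUM], [A].[TRANSDATE]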

    Thursday, June 14, 2018 7:19 AM
  • Hi meenxi,

    For such a case, there is no better way to optimize the SQL statement itself: because window aggregate functions are used, almost all the records in the source table have to be accessed. But we can improve performance by reducing the amount of data that is written to the output table on each run. The detailed approach is as follows.

    --Output table that holds the daily outstanding and bucket amounts per account
    create table temp4
    	(
    		[TRANSDATE] Datetime,
    		[ACCOUNTNUM] Nvarchar(20),
    		[OdAmount] Numeric(32,16),
    		[Bkt0_3] Numeric(32,16),
    		[Bkt4_6] Numeric(32,16),
    		[Bkt7_9] Numeric(32,16),
    		[Bkt10_12] Numeric(32,16),
    		[Above12] Numeric(32,16)
    	)
    --Parameters for this run (sample values)
    DECLARE
    	@AccNum Nvarchar(20) = 'A',
    	@TransStartDate datetime = '2016-10-14', --min modified transaction date
    	@v_count int
    
    	select @v_count=count(1) from temp4
    
    	if (@v_count=0) --First, you need to load all the data to output table
    	BEGIN
    	INSERT INTO temp4
    	Select
    		[A].[TRANSDATE],
    		[A].[ACCOUNTNUM],
    		ISNULL([A].[Outstanding],0) as [OdAmount],
    		SUM(SUM(ISNULL(([A].AMOUNTMST),0))) over (partition by ACCOUNTNUM order by TRANSDATE rows between 3 preceding and 0 preceding) as Bkt0_3,
    		SUM(SUM(ISNULL(([A].AMOUNTMST),0))) over (partition by ACCOUNTNUM order by TRANSDATE rows between 6 preceding and 4 preceding) as Bkt4_6,
    		SUM(SUM(ISNULL(([A].AMOUNTMST),0))) over (partition by ACCOUNTNUM order by TRANSDATE rows between 9 preceding and 7 preceding) as [Bkt7_9],
    		SUM(SUM(ISNULL(([A].AMOUNTMST),0))) over (partition by ACCOUNTNUM order by TRANSDATE rows between 12 preceding and 10 preceding) as [Bkt10_12],
    		SUM(SUM(ISNULL(([A].AMOUNTMST),0))) over (partition by ACCOUNTNUM order by TRANSDATE rows between unbounded preceding and 13 preceding) as [Above12] --everything older than the 10-12 bucket
    	FROM
    		dbo.Fact_Customer_Outstanding_Daily [A] 
    	--Where [A].[TRANSDATE] >= @TransStartDate AND ACCOUNTNUM = @AccNum AND Transtype = 'Debit'
    	Where  ACCOUNTNUM = @AccNum AND Transtype = 'Debit'
    	GROUP BY [A].[ACCOUNTNUM],[A].[TRANSDATE],[A].[Outstanding]
    	END
    	ELSE --After the first time, you only need to reload part of the data (from the minimum modified transaction date)
    	BEGIN
    	
    	DELETE FROM temp4 WHERE ACCOUNTNUM = @AccNum AND [TRANSDATE] >= @TransStartDate --temp4 has no TRANSTYPE column; it already holds Debit-only rows
    	INSERT INTO temp4
    	SELECT [TRANSDATE],[ACCOUNTNUM],[OdAmount],Bkt0_3,Bkt4_6,[Bkt7_9],[Bkt10_12],[Above12]
    	FROM 
    	(Select
    		[A].[TRANSDATE],
    		[A].[ACCOUNTNUM],
    		ISNULL([A].[Outstanding],0) as [OdAmount],
    		SUM(SUM(ISNULL(([A].AMOUNTMST),0))) over (partition by ACCOUNTNUM order by TRANSDATE rows between 3 preceding and 0 preceding) as Bkt0_3,
    		SUM(SUM(ISNULL(([A].AMOUNTMST),0))) over (partition by ACCOUNTNUM order by TRANSDATE rows between 6 preceding and 4 preceding) as Bkt4_6,
    		SUM(SUM(ISNULL(([A].AMOUNTMST),0))) over (partition by ACCOUNTNUM order by TRANSDATE rows between 9 preceding and 7 preceding) as [Bkt7_9],
    		SUM(SUM(ISNULL(([A].AMOUNTMST),0))) over (partition by ACCOUNTNUM order by TRANSDATE rows between 12 preceding and 10 preceding) as [Bkt10_12],
    		SUM(SUM(ISNULL(([A].AMOUNTMST),0))) over (partition by ACCOUNTNUM order by TRANSDATE rows between unbounded preceding and 13 preceding) as [Above12]
    	FROM
    		dbo.Fact_Customer_Outstanding_Daily [A] 
    	--Where [A].[TRANSDATE] >= @TransStartDate AND ACCOUNTNUM = @AccNum AND Transtype = 'Debit'
    	Where  ACCOUNTNUM = @AccNum AND Transtype = 'Debit'
    	GROUP BY [A].[ACCOUNTNUM],[A].[TRANSDATE],[A].[Outstanding]
    	) t 
    	WHERE T.[TRANSDATE]>=@TransStartDate
    
    	END

    In addition, you could create indexes on condition columns to improve performance as well.
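
    For example, an index along these lines might help (the column choice is an assumption based on the filters and aggregations above; please check it against the actual execution plan):

    CREATE NONCLUSTERED INDEX IX_Fact_Customer_Outstanding_Daily
    ON dbo.Fact_Customer_Outstanding_Daily (ACCOUNTNUM, TRANSTYPE, TRANSDATE)
    INCLUDE (AMOUNTMST, [Outstanding]);  --covers the WHERE filters, the window ordering, and the summed columns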

    Best Regards,

    Will


    MSDN Community Support

    Thursday, June 14, 2018 9:23 AM
    Moderator
  • One question here: the table which stores the outstanding and bucket details will keep deleting bulk records and adding new records on a daily basis. Is adding an index on such a table advisable? Will it improve performance?
    Thursday, June 14, 2018 9:53 AM
  • One question here: the table which stores the outstanding and bucket details will keep deleting bulk records and adding new records on a daily basis. Is adding an index on such a table advisable? Will it improve performance?

    "create indexes on condition columns" is for the source table "dbo.Fact_Customer_Outstanding_Daily", not for the output table.

    As you know, indexes can be helpful for a variety of queries that contain SELECT, UPDATE, DELETE, or MERGE statements, but they add overhead to tables with heavy inserts. Whether it is worth creating indexes on that table depends on how many records are affected by each operation.

    Best Regards,

    Will


    MSDN Community Support

    Friday, June 15, 2018 6:42 AM
    Moderator