Sql Tuning
-
Tuesday, April 10, 2012 8:30 PM
I am running the below query on sql server 2008 and it's taking lot of time . I think the problem is dateDiff as it's negating index on date_view and doing scan . Can some one suggest other way of writing this code
SELECT Distinct p.Person_ID, p.Email,MAX(ul.date_view) AS dtLastLogin, DateDiff(d,MAX(ul.date_view),getdate()) AS nDays,l.Lender_ID,l.parent_lender,l.Name
FROM Person p WITH (NOLOCK)
INNER JOIN Lender_vendor_role_person_XREF lvrpx WITH (NOLOCK) ON lvrpx.Person_ID = p.Person_ID
INNER JOIN Lender l WITH (NOLOCK) ON (l.Lender_ID = lvrpx.Lender_ID AND l.isActive = 1)
INNER JOIN Role r WITH (NOLOCK) on (r.Role_ID = lvrpx.Role_ID AND r.login_type_id=14)
left JOIN user_log ul WITH (NOLOCK) ON ul.Person_ID = p.Person_ID
WHERE lvrpx.isPersonActive =1
GROUP by p.Person_ID, p.Email,l.Lender_ID,l.parent_lender,l.Name
HAVING DateDiff(d,MAX(ul.date_view),getdate())>=(CASE WHEN l.parent_lender = 1018 THEN 30 ELSE 120 END)
ORDER by l.parent_lender- Moved by Tom PhillipsModerator Wednesday, April 11, 2012 8:08 PM TSQL question (From:SQL Server Database Engine)
All Replies
-
Tuesday, April 10, 2012 8:32 PM
Hi,
Can you post the execution plan please? Also DDL with some sample data + the expected output may help us to provide quality help.
Thanks,
Janos
There are 10 type of people. Those who understand binary and those who do not.
My Blog -
Tuesday, April 10, 2012 9:37 PMWe only want some alternative for DateDiff that is causing problems
-
Tuesday, April 10, 2012 9:43 PMModeratorhaving max(ul.date_view) >=dateadd(day, -1* (case when l.Parent_lender = 1018 then 30 else 120 end), CURRENT_TIMESTAMP)
For every expert, there is an equal and opposite expert. - Becker's Law
My blog -
Tuesday, April 10, 2012 9:52 PM
Naomi ,
with your code I am getting different results
-
Tuesday, April 10, 2012 9:53 PM
you can try replacing the filter in the having clause by using an in-line queue so that this particular filter moves to the outer query and your inner query can use the index you mention.
Another option is to use a CTE defined as the query you got without the HAVING clause. Then use the filter in the outer query.
However, if you performance issue is arising out of the fact that SQL Server is doing an index scan instead of a seek, leading to too many reads - you can't avoid it. This is because when you use aggregate functions like MAX, MIN , AVG , etc - the optimizer will choose to do a scan ( full or range ). you can try to tune this by making your WHERE clause more restrictive and selecting only the bare minimum required data.
Hope this helps !
Sanil Mhatre | Database Developer | MCTS | If you find my reply useful in any way, please vote it as helpful. If it has helped answer your question, please mark it as Answer.
-
Tuesday, April 10, 2012 10:01 PMModerator
Try:
;with cte as (SELECT Distinct p.Person_ID, p.Email,MAX(ul.date_view) AS dtLastLogin, DateDiff(d,MAX(ul.date_view),getdate()) AS nDays,l.Lender_ID,l.parent_lender,l.Name FROM Person p WITH (NOLOCK) INNER JOIN Lender_vendor_role_person_XREF lvrpx WITH (NOLOCK) ON lvrpx.Person_ID = p.Person_ID INNER JOIN Lender l WITH (NOLOCK) ON (l.Lender_ID = lvrpx.Lender_ID AND l.isActive = 1) INNER JOIN Role r WITH (NOLOCK) on (r.Role_ID = lvrpx.Role_ID AND r.login_type_id=14) left JOIN user_log ul WITH (NOLOCK) ON ul.Person_ID = p.Person_ID WHERE lvrpx.isPersonActive =1 GROUP by p.Person_ID, p.Email,l.Lender_ID,l.parent_lender,l.Name) select * from cte where dtLastLogin <= dateadd(day, - 1* (CASE WHEN parent_lender = 1018 THEN 30 ELSE 120 END), CURRENT_TIMESTAMP) ORDER by parent_lender
For every expert, there is an equal and opposite expert. - Becker's Law
My blog -
Tuesday, April 10, 2012 10:05 PM
Naomi ,
I am getting 1 record less with your query
-
Tuesday, April 10, 2012 10:14 PMModerator
Actually, why you're not using nDays which you already calculated in the cte?
;with cte as (SELECT p.Person_ID, p.Email,MAX(ul.date_view) AS dtLastLogin, DateDiff(d,MAX(ul.date_view),getdate()) AS nDays,l.Lender_ID,l.parent_lender,l.Name FROM Person p WITH (NOLOCK) INNER JOIN Lender_vendor_role_person_XREF lvrpx WITH (NOLOCK) ON lvrpx.Person_ID = p.Person_ID INNER JOIN Lender l WITH (NOLOCK) ON (l.Lender_ID = lvrpx.Lender_ID AND l.isActive = 1) INNER JOIN Role r WITH (NOLOCK) on (r.Role_ID = lvrpx.Role_ID AND r.login_type_id=14) left JOIN user_log ul WITH (NOLOCK) ON ul.Person_ID = p.Person_ID WHERE lvrpx.isPersonActive =1 GROUP by p.Person_ID, p.Email,l.Lender_ID,l.parent_lender,l.Name) select * from cte where nDays >= CASE WHEN parent_lender = 1018 THEN 30 ELSE 120 END ORDER by parent_lender
For every expert, there is an equal and opposite expert. - Becker's Law
My blog -
Tuesday, April 10, 2012 10:17 PMYeah , That's correct
-
Wednesday, April 11, 2012 4:35 PMThis is is still taking more time . Any suggestions?
-
Wednesday, April 11, 2012 5:23 PMModeratorWhy do you need an INNER JOIN with Role table? What will happen if you remove it?
For every expert, there is an equal and opposite expert. - Becker's Law
My blog -
Wednesday, April 11, 2012 5:39 PMWe need that to filter the resultsets. The problem lies in the aggregate functions on the user_log table it's a very huge table . Can we do something about this ?
-
Wednesday, April 11, 2012 6:05 PMModeratorHow about you select everything except the user_log first into a temp table (with indexes), then get aggregates for user_log with the exist check to get only needed users and finally join that with that temp table to get the required result back?
For every expert, there is an equal and opposite expert. - Becker's Law
My blog -
Wednesday, April 11, 2012 8:23 PM
I tried the below code on your recommendation but its taking more time than the previous one
create Table #temp (Person_ID INT,Email Varchar(50),Lender_ID int,Parent_Lender int,Name Varchar(50))
INSERT INTO #temp
SELECT Distinct p.Person_ID, p.Email,l.Lender_ID,l.parent_lender,l.Name
FROM Person p WITH (NOLOCK)
INNER JOIN Lender_vendor_role_person_XREF lvrpx WITH (NOLOCK) ON lvrpx.Person_ID = p.Person_ID
INNER JOIN Lender l WITH (NOLOCK) ON (l.Lender_ID = lvrpx.Lender_ID AND l.isActive = 1)
INNER JOIN Role r WITH (NOLOCK) on (r.Role_ID = lvrpx.Role_ID AND r.login_type_id=14)
WHERE lvrpx.isPersonActive =1
GROUP by p.Person_ID, p.Email,l.Lender_ID,l.parent_lender,l.Name
SELECT MAX(ul.date_view) AS dtLastLogin, DateDiff(d,MAX(ul.date_view),getdate()) AS nDays,t.Person_ID ,t.Email ,t.Lender_ID ,t.Parent_Lender ,t.Name
From #temp t
inner JOIN user_log ul ON t.Person_ID =ul.Person_ID
GROUP by t.Person_ID, t.Email,t.Lender_ID,t.parent_lender,t.Name
having DateDiff(d,MAX(ul.date_view),getdate())>=(CASE WHEN t.parent_lender = 1018 THEN 30 ELSE 120 END)
-
Wednesday, April 11, 2012 8:42 PMModerator
Try:
SELECT p.Person_ID, p.Email,l.Lender_ID,l.parent_lender,l.Name into #TempInfo FROM Person p WITH (NOLOCK) INNER JOIN Lender_vendor_role_person_XREF lvrpx WITH (NOLOCK) ON lvrpx.Person_ID = p.Person_ID INNER JOIN Lender l WITH (NOLOCK) ON (l.Lender_ID = lvrpx.Lender_ID AND l.isActive = 1) INNER JOIN Role r WITH (NOLOCK) on (r.Role_ID = lvrpx.Role_ID AND r.login_type_id=14) WHERE lvrpx.isPersonActive =1 GROUP by p.Person_ID, p.Email,l.Lender_ID,l.parent_lender,l.Name create index idxTempInfo_PersonID ON #TempInfo (Person_ID) ;with cte as (select TI.*, ul.date_view, datediff(day, ul.date_view, CURRENT_TIMESTAMP) as nDays, ROW_NUMBER() over (partition by ul.Person_ID order by date_view DESC) as Rn from #TempInfo TI inner join dbo.user_log ul on TI.Person_ID = ul.Person_ID) select * from cte where Rn = 1 and nDays >=case when Parent_Lender = 1018 then 30 else 120 end)
For every expert, there is an equal and opposite expert. - Becker's Law
My blog -
Wednesday, April 11, 2012 8:55 PMI am getting different resultsets and its taking more time .
-
Wednesday, April 11, 2012 8:56 PM
Another option might be the following, assuming the person table is relatively small, and having an index on the user_log table on Person_id and date_view. Not tested so might contain typo's.
SELECT Distinct PTable.Person_ID, p.Email, dtLastLogin AS dtLastLogin, DateDiff(d,dtLastLogin,getdate()) AS nDays,l.Lender_ID,l.parent_lender,l.Name FROM (SELECT p.Person_ID, (SELECT TOP (1) date_view FROM user_log ul WHERE ul.Person_ID = p.Person_ID ORDER BY date_view DESC) as dtLastLogin
FROM Person p INNER JOIN Lender_vendor_role_person_XREF lvrpx ON lvprx.Person_ID = p.Person_ID WHERE lvrpx.isPersonActive = 1 ) AS PTable INNER JOIN Person p ON PTable.PeronID = p.PersonID INNER JOIN Lender_vendor_role_person_XREF lvrpx WITH (NOLOCK) ON lvrpx.Person_ID = p.Person_ID INNER JOIN Lender l WITH (NOLOCK) ON (l.Lender_ID = lvrpx.Lender_ID AND l.isActive = 1) INNER JOIN Role r WITH (NOLOCK) on (r.Role_ID = lvrpx.Role_ID AND r.login_type_id=14) WHERE DateDiff(d, dtLAstLogin, getdate()) >=(CASE WHEN l.parent_lender = 1018 THEN 30 ELSE 120 END) GROUP by p.Person_ID, p.Email,l.Lender_ID,l.parent_lender,l.Name ORDER by l.parent_lender
-
Wednesday, April 11, 2012 9:03 PMIts taking long
-
Wednesday, April 11, 2012 9:09 PMIt seems that the user_log is the bottle neck, does this table have any indexes? I would try first to write a query to get the results from user_log, i.e. Person_ID and date_view, make this as fast as possible and then tie in the rest.
-
Wednesday, April 11, 2012 9:16 PMYes it has indexes on the columns that are used in the query
-
Wednesday, April 11, 2012 9:29 PMbut datediff is negating the index
-
Wednesday, April 11, 2012 9:30 PM
What if you run a query like the following, does this take long as well?
SELECT Person_ID, MAX(date_view) FROM user_log GROUP BY Person_ID
I tested on a table with 1,595,788 records and a query like this should take less than a second.
- Marked As Answer by Deputy12 Wednesday, April 11, 2012 9:30 PM
-
Wednesday, April 11, 2012 11:30 PM
SELECT Person_ID, MAX(date_view) FROM user_log GROUP BY Person_ID
This took 10 minutes to complete and user_log has 700 million records
-
Thursday, April 12, 2012 12:46 AM
That sounds about right if you consider 1.5 million to take about a second. I don't know how it compares to your other executions times, but that seems like a long time to wait for results. The last thing I can think of is replacing the select in the previous query with the grouped query and hoping that the join and select on lvrpx will leave a lot of records out of the equation, maybe add a date or ID selection to keep older records out of the select.
Other than that I would think about making a log for your user_log, and put aside records that you are not going to need in daily processing to make the user_log file more manageable.
SELECT Distinct PTable.Person_ID, p.Email, dtLastLogin AS dtLastLogin, DateDiff(d,dtLastLogin,getdate()) AS nDays,l.Lender_ID,l.parent_lender,l.Name FROM (SELECT ul.Person_ID, Max(date_view) as dtLastLogin FROM user_log ul LEFT OUTER JOIN Lender_vendor_role_person_XREF lvrpx ON lvprx.Person_ID = ul.Person_ID WHERE lvrpx.isPersonActive = 1 GROUP BY ul.Person_ID ) AS PTable INNER JOIN Person p ON PTable.PeronID = p.PersonID INNER JOIN Lender_vendor_role_person_XREF lvrpx WITH (NOLOCK) ON lvrpx.Person_ID = p.Person_ID INNER JOIN Lender l WITH (NOLOCK) ON (l.Lender_ID = lvrpx.Lender_ID AND l.isActive = 1) INNER JOIN Role r WITH (NOLOCK) on (r.Role_ID = lvrpx.Role_ID AND r.login_type_id=14) WHERE DateDiff(d, dtLAstLogin, getdate()) >=(CASE WHEN l.parent_lender = 1018 THEN 30 ELSE 120 END) GROUP by p.Person_ID, p.Email,l.Lender_ID,l.parent_lender,l.Name ORDER by l.parent_lender

