none
Slow query when uniqueidentifier column used in a NOT IN subquery

    Question

  • I use the following code to return data for a report in SQL Server 2008 Express

    *************************************************************************************************

    select customerguid, invoicenumber, sum(total - account - billeramount - loyaltypoints - giftvoucher) as total, count(Distinct invoicenumber) as qty from sales
    where customerguid is NOT NULL and myDatetime >= 'Jun 1, 2013' AND myDatetime < 'Oct 15, 2013' and terminal = 'Stable Health Services'
    and customerguid NOT IN (select customerguid from sales where myDatetime < 'Jun 1, 2013' and customerguid is NOT NULL and terminal = 'Stable Health Services')
    group by customerguid, invoicenumber

    *************************************************************************************************

    customerguid is a uniqueidentifier column and is NULLABLE and I have used Sql Server Database tuning advisor to generate indexes for this query, which I have applied. The result is still the same though. Slow performance

    It is used to return a list of those customers who had sales during the Date range, BUT no sales prior

    i.e they are new clients within the date span

    The line .......

    and customerguid NOT IN (select customerguid from sales where myDatetime < 'Jun 1, 2013' and customerguid is NOT NULL and terminal = 'Stable Health Services') 

    is the cause of the slow performance

    Is there a better way to write this and still get the result I need

    regards

    Steve Francis

    Monday, October 14, 2013 12:18 AM

Answers

All replies

  • Hi Steve,

    What is slow for Express (a free product)? Are there indexes on the WHERE clause columns? Is statistics up-to-date?

    Use ANSI YYYY-MM-DD string date literal format (unrelated to performance, good programming practice).

    When you look at the ACTUAL EXECUTION PLAN, what do you see?

    Optimization article:

    http://www.sqlusa.com/articles/query-optimization/


    Kalman Toth Database & OLAP Architect IPAD SELECT Query Video Tutorial 3.5 Hours
    New Book / Kindle: Exam 70-461 Bootcamp: Querying Microsoft SQL Server 2012



    Monday, October 14, 2013 12:57 AM
    Moderator
  • Try 'NOT Exists' instead of NOT IN, it may give some performance improvement.

    Also, rewrite the  same using LEFT JOIn as well,

    select customerguid, invoicenumber,
     sum(total - account - billeramount - loyaltypoints - giftvoucher) as total, count(Distinct invoicenumber) as qty
      from sales s_out
    where s_out.customerguid is NOT NULL and s_out.myDatetime >= 'Jun 1, 2013' AND s_out.myDatetime < 'Oct 15, 2013' 
    and s_out.terminal = 'Stable Health Services' 
    and  NOT Exists 
     (select customerguid from sales s_IN where s_IN.myDatetime < 'Jun 1, 2013' and s_IN.customerguid is NOT NULL 
     and s_IN.terminal = s_out.terminal and s_IN.customerguid=s_out.customerguid) 
    group by customerguid, invoicenumber


    Please use Marked as Answer if my post solved your problem and use Vote As Helpful if a post was useful.

    Monday, October 14, 2013 3:58 AM
  • It is hard to suggest something without knowing about the indexes  and how much data does it return?

    elect customerguid, invoicenumber, sum(total - account - billeramount - loyaltypoints - giftvoucher) as total, count(Distinct invoicenumber) as qty from sales 
    where customerguid is NOT NULL and myDatetime >= 'Jun 1, 2013' AND myDatetime < 'Oct 15, 2013' and terminal = 'Stable Health Services' 
    group by customerguid, invoicenumber

    Do you really need and customerguid NOT IN (select customerguid from sales where myDatetime < 'Jun 1, 2013' and customerguid is NOT NULL and terminal = 'Stable Health Services') ???


    Best Regards,Uri Dimant SQL Server MVP, http://sqlblog.com/blogs/uri_dimant/

    MS SQL optimization: MS SQL Development and Optimization
    MS SQL Consulting: Large scale of database and data cleansing
    Remote DBA Services: Improves MS SQL Database Performance
    SQL Server Integration Services: Business Intelligence

    Monday, October 14, 2013 5:48 AM
    Answerer
  • Hi Uri

    The data returns varying number of rows of data based on the date range selected and the clients table size

    In this example it returns 1100 rows out of 43000 rows in the table

    The reason I put

    and customerguid NOT IN (select customerguid from sales where myDatetime < 'Jun 1, 2013' and customerguid is NOT NULL and terminal = 'Stable Health Services') is to return data from clients who haven't visited before the selected start date (in this case 'Jun 1, 2013')

    I also do another one which returns data from customers where they HAVE visited before 'Jun 1, 2013'

    and customerguid IN (select customerguid from sales where myDatetime < 'Jun 1, 2013' and customerguid is NOT NULL and terminal = 'Stable Health Services') is to return data from clients who haven't visited before the selected start date (in this case 'Jun 1, 2013')

    This one is fine

    So it appears that using a NOT IN with a uniqueidentifier where nulls exist is inefficient

    As far as indexes I have applied the recommendations from SQL Server Database Tuning Advisor, but they make absolutely no difference

    running this in SQl Management studio on an I7 computer 12GB RAM takes 6 seconds

    running with 'NOT IN' replaced with IN is instant

    Removing 'and customerguid is NOT NULL ' fixes the speed issue but does not return any rows

    Guess that will teach me for allowing nulls in the column

    regards

    Steve 

    Monday, October 14, 2013 6:26 AM
  • What if you add to the subquery this additional filter?

    and customerguid NOT IN (select customerguid from sales S where myDatetime < 'Jun 1, 2013' and customerguid is NOT NULL and terminal = 'Stable Health Services' AND S.customerguid =sales.customerguid 


    Best Regards,Uri Dimant SQL Server MVP, http://sqlblog.com/blogs/uri_dimant/

    MS SQL optimization: MS SQL Development and Optimization
    MS SQL Consulting: Large scale of database and data cleansing
    Remote DBA Services: Improves MS SQL Database Performance
    SQL Server Integration Services: Business Intelligence

    • Marked as answer by ga630sf Monday, October 14, 2013 7:25 AM
    Monday, October 14, 2013 6:31 AM
    Answerer
  • Steve, Is your CustomerGuid unique?


    Please use Marked as Answer if my post solved your problem and use Vote As Helpful if a post was useful.

    Monday, October 14, 2013 6:33 AM
  • I don't know what indexes do you have, But for this query try this index:

    CREATE NONCLUSTERED INDEX ix_sales_new
    ON sales (myDatetime, terminal )
    INCLUDE (customerguid,invoicenumber,total , account , billeramount , loyaltypoints , giftvoucher, invoicenumber)


    Saeid Hasani, sqldevelop.wordpress.com

    Download Books Online for SQL Server 2012

    Monday, October 14, 2013 6:36 AM
  • Steve try this:

    SELECT customerguid, invoicenumber, sum(total - account - billeramount - loyaltypoints - giftvoucher) as total, count(Distinct invoicenumber) as qty 
    FROM Sales LeftSales LEFT JOIN (select customerguid from sales where myDatetime < 'Jun 1, 2013' and customerguid is NOT NULL and terminal = 'Stable Health Services') RightSales
    ON LeftSales.customerguid=RightSales.customerguid
    WHERE 	LeftSales.customerguid is NOT NULL 
    		and LeftSales.myDatetime >= 'Jun 1, 2013' 
    		AND LeftSales.myDatetime < 'Oct 15, 2013' 
    		and LeftSales.terminal = 'Stable Health Services'
    		AND RightSales.customerguid IS NULL


    Please use Marked as Answer if my post solved your problem and use Vote As Helpful if a post was useful.

    Monday, October 14, 2013 6:39 AM
  • Hi Uri

    Worked a treat

    Thank you

    regards

    Steve

    Monday, October 14, 2013 7:26 AM
  • Hi Himanshu

    Yours worked great, but I 'marked as answer' Uri's reply as that worked perfectly also

    Thanks for the help

    regards

    Steve

    Monday, October 14, 2013 7:28 AM