none
SQL 2008 Full Text search using CONTAINS very slow, but FREETEXT runs fast.

    Question

  • We just upgraded from SQL2005 to SQL2008 R2 x86 (in-place upgrade) and queries using full text indexes with a CONTAINS where condition now take 2 to 5 minutes to run where they used to run in less than a second!

    Oddly, if I change the query to use FREETEXT instead of CONTAINS, the search runs fast (<1 second).

    When I look at the query execution plan for both CONTAINS and FREETEXT selects, the execution plan looks the same and it estimates that both queries would take the same time to run.  It does not make sense that a select using CONTAINS would take much longer than the same select using FREETEXT.

    I have updated stats, rebuilt the full text catalog, and rebuilt indexes on the tables but nothing helps.
    We also tried applying CU 4 for SQL2008 R2 to see if this will solve our issue because, but we experienced some issues during this patch.  At this time I'm not certain if the patch was sucessfully applied yet.

    Also If I change the where CONTAINS condition and instead use a join to CONTAINSTABLE the query takes twice as long 5-10 minutes.


    --CONTAINS, used to run in a second on SQL2005, now takes 2+ minutes to run and causes application to timeout
    SELECT DISTINCT
            a.O_ref_num,
            a.[Sector],
            a.project_name,
            b.organizationName,
            a.info_type,
            CASE
                WHEN (LTRIM(RTRIM(ISNULL(a.location,'')))='') THEN a.state
                ELSE a.location
            END AS [location],
            a.submittal_date,
            a.publish_date,
            b.organizationId,
            b.pid,
            a.state
            FROM OData a
            LEFT OUTER join dbo.OXref b on a.owner_id = b.owner_id
            INNER JOIN [OEvents] OE ON a.O_ref_num = OE.RefNumber
            WHERE
             OE.EventProcessed = 0 AND
             0=0 AND
             (a.[submittal_date] >= '12/13/2010' OR a.[submittal_date] IS NULL)
     AND CONTAINS( a.*, '"race*" ~ "to*" ~ "the*" ~ "top*"')
      ORDER BY   a.publish_date DESC

    --FREETEXT runs within a second
    SELECT DISTINCT
            a.O_ref_num,
            a.[Sector],
            a.project_name,
            b.organizationName,
            a.info_type,
            CASE
                WHEN (LTRIM(RTRIM(ISNULL(a.location,'')))='') THEN a.state
                ELSE a.location
            END AS [location],
            a.submittal_date,
            a.publish_date,
            b.organizationId,
            b.pid,
            a.state
            FROM OData a
            LEFT OUTER join dbo.OXref b on a.owner_id = b.owner_id
            INNER JOIN [OEvents] OE ON a.O_ref_num = OE.RefNumber
            WHERE
             OE.EventProcessed = 0 AND
             0=0 AND
             (a.[submittal_date] >= '12/13/2010' OR a.[submittal_date] IS NULL)
     AND freetext( a.*, '"race*" ~ "to*" ~ "the*" ~ "top*"')
      ORDER BY   a.publish_date DESC

    Here are the tables

    CREATE TABLE [dbo].[OEvents](
        [EventId] [int] IDENTITY(1,1) NOT NULL,
        [RefNumber] [int] NOT NULL,
        [EventCode] [int] NOT NULL,
        [EventDate] [datetime] NOT NULL,
        [EventProcessed] [bit] NOT NULL,
     CONSTRAINT [PK_OEvents] PRIMARY KEY CLUSTERED
    (
        [EventId] ASC
    )WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
    ) ON [PRIMARY]

    GO

    ALTER TABLE [dbo].[OEvents] ADD  CONSTRAINT [DF_OEvents_EventProcessed]  DEFAULT ((0)) FOR [EventProcessed]
    GO

    CREATE NONCLUSTERED INDEX [IX_OEvents_EventProcessed] ON [dbo].[OEvents]
    (
        [EventProcessed] ASC,
        [RefNumber] ASC
    )WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
    GO

    CREATE TABLE [dbo].[OXref](
        [owner_id] [int] NOT NULL,
        [pin] [int] NULL,
        [pid] [int] NULL,
        [organizationId] [int] NULL,
        [organizationName] [nchar](60) NULL,
        [organizationClass] [int] NULL,
        [organizationType] [int] NULL,
        [organizationTypeDescription] [nchar](50) NULL,
     CONSTRAINT [PK_OXref] PRIMARY KEY CLUSTERED
    (
        [owner_id] ASC
    )WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
    ) ON [PRIMARY]

    GO

    CREATE NONCLUSTERED INDEX [IX_OXref_OrgName] ON [dbo].[OXref]
    (
        [organizationName] ASC
    )WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
    GO

    CREATE NONCLUSTERED INDEX [IX_OXref_ownerId] ON [dbo].[OXref]
    (
        [owner_id] ASC
    )
    INCLUDE ( [pid],
    [organizationId],
    [organizationName]) WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
    GO

    CREATE TABLE [dbo].[OData](
        [publish_date] [datetime] NULL,
        [sector] [nvarchar](20) NULL,
        [info_type] [nvarchar](50) NULL,
        [O_ref_num] [int] NOT NULL,
        [project_num] [nvarchar](255) NULL,
        [project_name] [varchar](max) NULL,
        [owner] [varchar](max) NULL,
        [location] [nvarchar](max) NULL,
        [state] [nvarchar](255) NULL,
        [zipcode] [varchar](max) NULL,
        [county] [nvarchar](255) NULL,
        [submittal_date] [datetime] NULL,
        [contact_name] [nvarchar](255) NULL,
        [phone] [nvarchar](255) NULL,
        [minimum_value] [nvarchar](255) NULL,
        [maximum_value] [nvarchar](255) NULL,
        [plan_price] [nvarchar](255) NULL,
        [pre_bid] [datetime] NULL,
        [url] [varchar](max) NULL,
        [categories] [varchar](max) NULL,
        [scope] [varchar](max) NULL,
        [owner_id] [int] NULL,
        [sectorType] [int] NULL,
        [stateSearch] [varchar](max) NULL,
        [OrganizationName] [varchar](max) NULL,
        [createdDate] [datetime] NOT NULL,
     CONSTRAINT [PK_OData] PRIMARY KEY CLUSTERED
    (
        [O_ref_num] ASC
    )WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
    ) ON [PRIMARY]

    GO

    SET ANSI_PADDING OFF
    GO

    ALTER TABLE [dbo].[OData] ADD  CONSTRAINT [DF_OData_ProcessingDate]  DEFAULT (getdate()) FOR [createdDate]
    GO

    CREATE NONCLUSTERED INDEX [IX_InfoType] ON [dbo].[OData]
    (
        [info_type] ASC
    )WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
    GO

    CREATE NONCLUSTERED INDEX [IX_OData_OwnerId] ON [dbo].[OData]
    (
        [owner_id] ASC
    )WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
    GO

    CREATE NONCLUSTERED INDEX [IX_OData_submittal] ON [dbo].[OData]
    (
        [submittal_date] ASC
    )WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
    GO

    CREATE NONCLUSTERED INDEX [IX_OData_SubmittalDate] ON [dbo].[OData]
    (
        [submittal_date] ASC,
        [owner_id] ASC,
        [O_ref_num] ASC,
        [publish_date] ASC
    )
    INCLUDE ( [sector],
    [info_type],
    [project_name],
    [location],
    [state]) WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
    GO

    CREATE FULLTEXT CATALOG [OFullTextIndex]WITH ACCENT_SENSITIVITY = ON
    AUTHORIZATION [dbo]

    CREATE FULLTEXT CATALOG OFullTextIndex
    WITH ACCENT_SENSITIVITY = ON
    GO
    CREATE FULLTEXT INDEX ON [dbo].[OData] (
    project_name Language 1033
    ,owner Language 1033
    ,location Language 1033
    ,zipcode Language 1033
    ,categories Language 1033
    ,scope Language 1033
    ,stateSearch Language 1033
    ,OrganizationName Language 1033
    )
    KEY INDEX PK_OData
    ON OFullTextIndex
    WITH CHANGE_TRACKING AUTO
    GO



    Current rows in each table
    OData    424977
    OXref    9728
    OEvents    17820
    • Moved by Jamie ThomsonMVP Wednesday, December 15, 2010 5:24 PM SSIS forum not relevent for full text search related questions (From:SQL Server Integration Services)
    Wednesday, December 15, 2010 5:18 PM

All replies

  • but I doubt they return the same results. Basically FreeText chucks the nearness and wildcard operators and merley searches on the stemmed words race and top (to, and the are noise words and are also chucked/not searched on).

    looking for a book on SQL Server 2008 Administration? http://www.amazon.com/Microsoft-Server-2008-Management-Administration/dp/067233044X looking for a book on SQL Server 2008 Full-Text Search? http://www.amazon.com/Pro-Full-Text-Search-Server-2008/dp/1430215941
    • Marked as answer by Ai-hua Qiu Thursday, December 23, 2010 8:51 AM
    • Unmarked as answer by svollhardt_dnb.com Wednesday, December 29, 2010 8:53 PM
    Thursday, December 16, 2010 2:26 AM
  • I do not have the solution, and I just found a similar problem right after I upgraded from SQL2005 to SQL2008R2(Enterprise Edition).

    I am dealing with a vendor app that is not only slow, but also takes over the cpu while it runs.  This was not an issue in 2005 so I'm still dealing with how to fix it.

    I have confirmed that the query returns the same result on 2005 as it does on 2008. The execution plans look more or less the same, but the 2008 verbage is a little different.

    I did narrow down the problem to the * wildcard in my contains clause. It doesn't matter much if it is preceding or trailing. If I remove the wildcard, the query runs instantly, otherwise with the wildcard preceding or trailing, it takes 12 minutes. The same query with wildcards on 2005, takes 20 seconds.

    I would disagree a little with the other posts that sql 2008 simplies chucks the wildcard. I think the wildcard causes some serious looping to occur in the execution...but that is not really indicated in the execution plan which just has one task for the full text search... "Table Valued FUnction [FullTextMatch] Cost: 100%"        2005 Execution plan describes this slightly differently as

    Remote Scan Cost: 3% -> Table Spool (Lazy Spool) Cost: 97% ->

    Since this is a vendor app, I do not have much I can do yet. I'm just tryign to keep it from taking over the cpu which it seems to be doing now. I don't have a solution to your issue....just confirmation you aren't alone.

     

    I didn't write this terribly inefficient query with so many wildcards, but I can confirm that

    1)it ran in 20 seconds on similar hardware on 2005 and takes 12 minutes on 2008

    2)If I remove the wildcards * from the contains portion of the where clause, it runs in a few seconds.

     SELECT M.mrID, mrPRIORITY, mrASSIGNEES, {fn RTRIM(mrSTATUS)}, mrTITLE, Name__b__PLast__M__bFirst__p, mrSUBMITDATE, mrSUBMITDATE AS ALIAS_8, mrURGENT, mrREF_TO_MR, {fn RTRIM(mrSUBMITTER)}, mrATTACHMENTS, mrSOLUTION_TYPE
    FROM MASTER97 M
    INNER JOIN MASTER97_ABDATA ON (M.mrID = MASTER97_ABDATA.mrID)
    WHERE ((mrSUBMITTER = 'myuser' or
    (mrSTATUS not in ('_PENDING_SOLUTION_')))
    and ( ( (mrTITLE LIKE '%A&E Bldg%' or (CONTAINS(M.*, '"*A&E Bldg*"'))
    or Type__bof__bIncident LIKE '%A__7E__bBldg%'
    or Main__bCategory LIKE '%A__7E__bBldg%'
    or Sub__ucategory LIKE '%A__7E__bBldg%'
    or Trouble__bBuilding LIKE '%A&E Bldg%'
    or Trouble__bRoom LIKE '%A&E Bldg%'
    or IP__bAddress LIKE '%A&E Bldg%'
    or Cable__bID LIKE '%A&E Bldg%'
    or Date__ftime__bproblem__bfirst__boccurred__bLink LIKE '%A&E Bldg%'
    or New__bInstallation__Q LIKE '%A__7E__bBldg%'
    or Work__border__b__3 LIKE '%A&E Bldg%'
    or Browser LIKE '%A&E Bldg%'
    or Device__bType LIKE '%A&E Bldg%'
    or Affected__bEquipment LIKE '%A&E Bldg%'
    or Line__bType LIKE '%A__7E__bBldg%'
    or Equipment__bType LIKE '%A__7E__bBldg%'
    or Vendor LIKE '%A&E Bldg%' or Vendor__bPhone__b__3 LIKE '%A&E Bldg%'
    or Vendor__bTicket__b__3 LIKE '%A&E Bldg%'
    or Vendor__bContact LIKE '%A&E Bldg%'
    or Trouble__bNumber LIKE '%A&E Bldg%'
    or Reported__bVia LIKE '%A__7E__bBldg%'
    or Billable__bRepair__Q LIKE '%A__7E__bBldg%'
    or Wireless__bDevice LIKE '%A__7E__bBldg%'
    or Library__bStack__bLocation LIKE '%A&E Bldg%'
    or Scope LIKE '%A__7E__bBldg%'
    or Software__bPackage LIKE '%A__7E__bBldg%'
    or Email__bAddress LIKE '%A&E Bldg%'
    or Phone LIKE '%A&E Bldg%' or Name__b__PLast__M__bFirst__p LIKE '%A&E Bldg%'
    or CalNetID LIKE '%A&E Bldg%'
    or Address LIKE '%A&E Bldg%'
    or Job__bTitle LIKE '%A&E Bldg%'
    or Department LIKE '%A&E Bldg%')  ) ) )
    ORDER BY M.mrID DESC

     

     changing to

    (

     

    CONTAINS(M.*, '"A&E Bldg"')   runs in a second or two...(THe  Application is Numara footprints)


    UC Berkeley
    Wednesday, October 05, 2011 6:46 PM
  • Well whenever you have a wild card in freetext it is "chucked" ie ignored. Whether you like it or not this is the case.

    I would suspect that the like's are very expensive here.

    Is the indexing the same?


    looking for a book on SQL Server 2008 Administration? http://www.amazon.com/Microsoft-Server-2008-Management-Administration/dp/067233044X looking for a book on SQL Server 2008 Full-Text Search? http://www.amazon.com/Pro-Full-Text-Search-Server-2008/dp/1430215941
    Wednesday, October 05, 2011 6:57 PM
  • 1)I agree the "likes" with the preceding % especially are expensive, but even with them, the query takes an acceptable 1 second at most...asumming you remove the CONTAINS statement that reads the FT catalog.

    2)The indexing is the same, and were rebuilt. Statistics have been updated, full text catalog population is essentially the same...a few items have been added over the days it has been running on 2008.

    3)Putting the DB in 2005 compatibility mode does not help

    4)Changing from CONTAINS to FREETEXT does reduce the query to 1 second as the first poster hinted to, but it also is not the same command, so returns a different result set as you hinted to.

    5)It seems the trailing wildcard * in the contains statement is the culprit in my query...the preceding one does not seem to have much if any effect.

    6)I have confirmed on another 2008 system the same behavior

    7)The execution plan shows the same regardless of the wildcards in the contain or freetext statements

    8)Since the execution plan is the same, changing compatability mode does not help, and I can confirm on two systems, I suspect that the wildcard is not simply ignored. It may be ignored in terms of your results, but it seems to definitely cause the deep internals of the full text search engine a "loop" that is very hard to churn on.


    UC Berkeley
    Wednesday, October 05, 2011 8:51 PM
  • Keep in mind you have a lazy spool in here. SQL Server will lazily return query results. It is all bets off on what performance will be like. You need to somehow remove this lazy spool from your execution plan.

    Re wildcards - they are not acted upon in a freetext search. It is a contains only parameter.  It is not a your results may vary thing. Contains and Freetext are very different operators as well.

    Freetext does implicit stemming which may also account for what you are seeing.


    looking for a book on SQL Server 2008 Administration? http://www.amazon.com/Microsoft-Server-2008-Management-Administration/dp/067233044X looking for a book on SQL Server 2008 Full-Text Search? http://www.amazon.com/Pro-Full-Text-Search-Server-2008/dp/1430215941
    Thursday, October 06, 2011 10:50 AM
  • Numara Tech support was able to help me sort out the issue.   They knew of a known MS bug that MS included with Cumulative Update 4 for 2008 R2 , but it is not enabled.  I was on CU9, but still had the issue since you have to take further action even with the CU applied by running the dbcc commands.

    The issue is described in this hotfix article.

    http://support.microsoft.com/kb/2264562/

     

    but I believe you also have to run

    DBCC TRACEON(4199, -1);
    GO;

    DBCC FREEPROCCACHE;

    to enable the fix.   I did not need to restart sql and the fix seem to take affect on all new queries. A query that took 12 minutes for CONTAINS with OR and trailing * went to less than one second.

     


    UC Berkeley
    Wednesday, October 12, 2011 5:43 PM
  • I have the same issue here but the solution given by Forrestsjs does not work.

    I have Sql Server 2008 R2 and I think the cumulative update 4 is applied because my sql version is 10.50.2500 and this update build version is lower (10.50.1746.0), how can I be sure ?

    I also executed:

    DBCC TRACEON(4199, -1)
    GO
    DBCC FREEPROCCACHE

    If someone can help it would be very appreciated.

    Thanks

    --------

    oh well I spend days tuning my request to try to get more speed from the full-text search and just replacing the contains with a like '%' + @search + '%' reduced the request time from 15sec to 1sec.

    A like search shouldn't be much slower than a full-text search ? i'm gonna go crazy...

    • Edited by Jonathan 81 Friday, November 30, 2012 1:08 PM
    Thursday, November 29, 2012 3:49 PM
  • Sorry, I never saw an update to this, but for me, version 10.50.2811 is currently working in my case after using the 4199 traceflag. I set the flag at startup in sql configuration manager which seems to be MS general recommendation if you are going to apply CUs between SPs at least.

    UC Berkeley

    Saturday, January 26, 2013 12:08 AM
  • I've got the same problem  (timeout) when using both CONTAINS and ORDER BY in the same querry.

    Problem were solved rebuilding the Full text catalog.

    Saturday, September 14, 2013 9:24 PM