Proximity terms in full-text search

  • Question

  • Hi. I'm doing some full-text searching that requires the use of proximity terms. It looks like SQL Server doesn't allow me to specify the distance between the words I'm searching for (e.g., 3 words apart, 2 paragraphs apart, and so on). From what I've read, SQL Server considers two words to be in proximity of each other if they are within 50 words of each other. Does anyone know if the number 50 can be changed?

    I know that, given a proximity term, SQL Server can rank documents by how close the words are to each other, but I assume the ranks do not reflect physical distances because SQL Server uses logical distances. Is there any way to infer real physical distances from the ranks?

    Finally, I read that one way to work around this restriction is to use CHARINDEX, but CHARINDEX has a limit of 8000 characters. Does anyone know a way around that limit?

    Thanks a lot.

    Calvin
    Friday, July 17, 2009 1:09 AM

Answers

  • You can't define the level of nearness, nor can you specify the separation distance by word, sentence, paragraph, page, chapter, or book the way you can in other Microsoft Search products. This feature was supposed to ship in SQL Server 2008 but was cut.

    Beyond 50 words of separation, the proximity term contributes 0 to the rank, but the matching rows will still show up in search results.

    IIRC, Lucene allows you to define specific separation distances.
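
Since the distance can't be set in the query itself, one possible workaround is to let the full-text query return candidate rows and then measure the physical word distance in application code. The Python sketch below is only an illustration under assumptions: the function names `min_word_distance` and `within_distance` are hypothetical, the tokenizer is a simple regular expression rather than SQL Server's word breaker, and "distance" is taken to mean the difference in word positions (adjacent words are 1 apart). Because it works on the whole document text in memory, it does not depend on CHARINDEX.

```python
import re


def min_word_distance(text, term_a, term_b):
    """Smallest gap in word positions between term_a and term_b.

    Returns None if either term does not occur in the text.
    Tokenization is a plain \\w+ split (an assumption; it will not
    match SQL Server's word breaker exactly).
    """
    words = re.findall(r"\w+", text.lower())
    a, b = term_a.lower(), term_b.lower()
    last_a = last_b = None  # most recent positions seen for each term
    best = None
    for i, word in enumerate(words):
        if word == a:
            if last_b is not None and (best is None or i - last_b < best):
                best = i - last_b
            last_a = i
        if word == b:
            # Skip the case where the same position matched both terms.
            if last_a is not None and last_a != i:
                if best is None or i - last_a < best:
                    best = i - last_a
            last_b = i
    return best


def within_distance(text, term_a, term_b, max_words):
    """True if the two terms occur at most max_words positions apart."""
    distance = min_word_distance(text, term_a, term_b)
    return distance is not None and distance <= max_words


if __name__ == "__main__":
    sample = "full text search in SQL Server supports proximity terms"
    print(min_word_distance(sample, "search", "proximity"))
    print(within_distance(sample, "search", "proximity", 3))
```

In practice this would run only over the rows the full-text query already returned, so the index still does the heavy filtering and the script just enforces the exact distance.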
    • Marked as answer by KJian_ Friday, August 21, 2009 9:34 AM
    Friday, July 17, 2009 2:00 AM