Advanced phrase recognition during spellchecking

  • Question

  • Hello,

    I am trying to implement phrase-level spellchecking with FAST Search for Internet Sites 2010. I've changed the index profile to include attributes [phrases="yes" positional="yes"] at the field list level and wrote a script that uses 'createspelldata' and 'dictman' to generate word frequency lists from the index and align spellchecking dictionaries with these lists.

    This works fine at word level, but while the phrases are successfully extracted, createspelldata issues an error message that says "This backend doesn't support phrases. Skipping phrases for [language code, e.g. en]". A fragment of the exact output is given below:

    This backend doesn't support phrases. Skipping phrases for en.
    Extracted 25178 words in 4 different language files.
    Extracted 46715 phrases in 4 different language files.
        25.8 sec overall processtime
        0.629 sec per Elem_*.xml.gz
        41 Elem_*.xml.gz

    Could anyone suggest how to solve this issue, please?

    Thanks,

    Andrey

    
    Thursday, June 28, 2012 1:06 PM

All replies

  • Hi Andrey,


    What is the exact createspelldata command you are issuing?  Also, are you running createspelldata on a composite field?   If so, you actually can’t run createspelldata on the composite field, and you need to manually specify each field.  Below is a sample command that works here:

    createspelldata -l en -c parts -a --word=bcontitle --word=bcondescription --word=bcondisplayname

    Are you running the createspelldata command with the --cachemanager parameter? If so, try running it without that parameter; I have seen that help avoid this error.
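    The per-field approach can be sketched as a small shell helper. This is only a minimal sketch and not part of the FAST tooling: `build_word_args` is a hypothetical function name, the field names are the ones from the sample command above, and the command is printed as a dry run rather than executed.

```shell
# Hypothetical helper (not part of FAST): turns a list of field names
# into one --word=<field> option per field.
build_word_args() {
    args=""
    for field in "$@"; do
        args="$args --word=$field"
    done
    # Drop the leading space before printing.
    printf '%s\n' "${args# }"
}

# Dry run: print the command line instead of running createspelldata.
echo "createspelldata -l en -c parts -a $(build_word_args bcontitle bcondescription bcondisplayname)"
```

    Listing the member fields in one place keeps the command in sync when fields are added to or removed from the composite field.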

    Thanks!

    Rob Vazzana | Sr Support Escalation Engineer | US Customer Service & Support

    Customer Service & Support | Microsoft Services

    Monday, July 16, 2012 7:44 PM
    Moderator
  • Hi, Rob!

    Thanks for the suggestions! Here is a fragment of the command we use (some fields omitted for readability):

    createspelldata -l ru -l en -l lv -l de --prune -a --word=title --word=description --phrase=bsumpublishers:sField:;

    'bsumpublishers' is actually a composite string field. If that is not supported, how do we create phrase spellchecking dictionaries from such fields? Should we concatenate all composite field values into a single string field?

    Thanks,

    Andrey

    Wednesday, July 18, 2012 12:40 PM
  • Hi Andrey,

    Thanks for the reply and for the command you are using. I can confirm that you can't use the -a/--cachemanager flag with phrases. Also, you cannot use a composite field in this way; use a multi-value field when working with phrases. One more point of general importance: you must use the --phrase parameter when gathering phrases, and I see from your command that you are doing that. If more clarification is needed, let me know.

    Thanks!

    Rob Vazzana | Sr Support Escalation Engineer | US Customer Service & Support

    Monday, August 6, 2012 3:12 PM
    Moderator
  • Thanks Rob! We will try removing that cachemanager flag and see if it helps.

    Regarding the 'bsumpublishers' field, I see that there was a terminology mismatch :) By 'composite', I meant 'multi-value', not the ESP <composite-field> with <field-ref> elements. This means that it is probably the cachemanager flag that is causing the trouble.
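    For reference, the adjustment could be scripted roughly as follows. This is a minimal sketch under stated assumptions: `strip_cachemanager_for_phrases` is a hypothetical helper name, the options are copied verbatim from the command fragment posted earlier in the thread (including its trailing `;`), and the result is only printed as a dry run.

```shell
# Hypothetical helper (not part of FAST): if any --phrase=... option is
# present, drop -a/--cachemanager, since Rob confirmed the two cannot be
# combined. Otherwise the options are passed through unchanged.
strip_cachemanager_for_phrases() {
    has_phrase=0
    for arg in "$@"; do
        case "$arg" in
            --phrase=*) has_phrase=1 ;;
        esac
    done
    out=""
    for arg in "$@"; do
        if [ "$has_phrase" -eq 1 ]; then
            case "$arg" in
                -a|--cachemanager) continue ;;
            esac
        fi
        out="$out $arg"
    done
    # Drop the leading space before printing.
    printf '%s\n' "${out# }"
}

# Dry run: print the adjusted command line instead of executing it.
echo "createspelldata $(strip_cachemanager_for_phrases -l ru -l en -l lv -l de --prune -a --word=title --word=description '--phrase=bsumpublishers:sField:;')"
```

    Because the options are joined into a single string for display, shell quoting is lost; a real wrapper would need to pass the arguments through individually.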

    Thanks,

    Andrey

     
    Tuesday, August 7, 2012 10:32 AM