FAST Spelltuner cannot update dictionary

  • Question

  • Hello,

    when I run spelltuner --wordcount-threshold 1 -m -1 -f -l debug to update the spelltuner dictionaries, every line is ignored because of the following error message:

    ... systemmsg Skipping malformed line 'fehlleis 4'

    What can I do? Copying the files from our development FAST server did not work.

    Thanks for an answer

     Lutz

    Thursday, January 26, 2012 4:50 PM

All replies

  • Lutz,

    Have you checked the actual frequency list for corruption? There may be strange characters on some lines, or just before or after them. The article below includes the location of the frequency lists. They are in .gz format, and I believe you should be able to unzip one and open it in a text editor. Does anything look unusual?

     

    http://support.microsoft.com/kb/2592062
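
As a rough way to automate that check, here is a minimal sketch for a current Python 3 interpreter (not the Cobra shell). It assumes that every line of the frequency list has three whitespace-separated fields, language, term and frequency, and that the file is UTF-8 text; neither assumption is confirmed in this thread. The function name find_malformed_lines is made up for illustration.

```python
import gzip


def find_malformed_lines(path, expected_fields=3):
    """Return (line_number, text) pairs for lines that do not look like
    '<language> <term> <frequency>'.

    Assumption: three whitespace-separated fields with a numeric last field.
    Blank lines are reported as malformed too.
    """
    bad = []
    # errors="replace" keeps the scan going if the file contains stray bytes.
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            text = line.rstrip("\r\n")
            fields = text.split()
            if len(fields) != expected_fields or not fields[-1].isdigit():
                bad.append((number, text))
    return bad
```

Calling find_malformed_lines(r"C:\FASTSearch\data\spelltuner\en_frequencies_txt.gz") would list the suspicious lines with their line numbers. Under these assumptions, a two-field line such as 'fehlleis 4' would be reported.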


    Igor Veytskin
    Friday, January 27, 2012 8:33 PM
    Moderator
  • Hi Lutz,

    Igor has pointed to a very good article to follow: http://support.microsoft.com/kb/2592062

    Step 1: Make a copy of a .gz frequency file with the following command in the Microsoft FAST Search Server 2010 for SharePoint PowerShell window.

    PS C:\FASTSearch\bin> Copy-Item "C:\FASTSearch\data\spelltuner\en_frequencies_txt.gz" "C:\FASTSearch\data\spelltuner\domaindictionary.gz"

    Step 2: Add the desired term to the new frequency list using the Python interpreter, cobra, from the %FASTSEARCH%\bin\ directory.

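    Cobra fs14.281 (2010-01-08 11:09:33)
    >>> import gzip
    >>> # Raw string so the backslashes in the Windows path are taken literally.
    >>> newDicFile = r"C:\FASTSearch\data\spelltuner\domaindictionary.gz"
    >>> # Note: mode 'wb' replaces any existing contents of the file.
    >>> openfile = gzip.open(newDicFile, 'wb')
    >>> # One entry per line: "<language> <term> <frequency>"
    >>> openfile.write("en mynewdictionaryterm 999\n")
    >>> openfile.close()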

    Close the PowerShell window.

    Note: The new term should be a single word without spaces. If it consists of multiple words, remove the white space between them. You can check this by extracting any default .gz file with a third-party tool: it contains no multi-word entries. (Correct me if I am wrong.)

    #NOTE: the syntax here is: ("<language> <term to add/boost> <new frequency>")

    For example, if we instead use openfile.write("en my new dictionary term 999\n") or openfile.write("en 'my new dictionary term' 999\n"), spelltuner reports the error "systemmsg Skipping malformed line en my new dictionary term 999" when it runs.
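
To avoid typing such a line by hand, a small helper can collapse the white space in the term before building the entry. This is only a sketch of the rule described above; format_entry is a hypothetical name, not part of FAST or Cobra.

```python
def format_entry(language, term, frequency):
    """Build one frequency-list line: '<language> <term> <frequency>\\n'.

    All white space inside the term is removed, following the note above
    that a term must be a single word.
    """
    collapsed = "".join(term.split())
    if not collapsed:
        raise ValueError("term is empty after removing white space")
    return "{0} {1} {2}\n".format(language, collapsed, int(frequency))
```

For example, format_entry("en", "my new dictionary term", 999) gives "en mynewdictionaryterm 999\n", which matches the line written in Step 2.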

    Step 3: Open the PowerShell window again and copy the updated frequency list to the resourcestore with the following command.

    PS C:\FASTSearch\bin> Copy-Item "C:\FASTSearch\data\spelltuner\domaindictionary.gz" "C:\FASTSearch\components\resourcestore\dictionaries\spelltuner\domaindictionary.gz"

    Step 4: Run the spelltuner
    PS C:\FASTSearch\bin> spelltuner --wordcount-threshold 1 -m -1 -f -l debug

    The new term is added and will appear in the "Did you mean" suggestions when a misspelled word is entered in a search query.

    We can also add terms from a text file to the "Did you mean" feature using the method above.
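
A minimal sketch of the text-file variant, written for a current Python 3 interpreter rather than the Cobra shell. It assumes a UTF-8 text file with one term per line; the function name write_terms, the default language "en" and the default frequency 999 are illustrative choices, not values required by spelltuner.

```python
import gzip


def write_terms(terms_path, gz_path, language="en", frequency=999, mode="wb"):
    """Write one '<language> <term> <frequency>' line per term into gz_path.

    mode="wb" replaces the file, as in the Step 2 session. mode="ab" would
    append a new gzip member instead; whether spelltuner reads such a file
    is an untested assumption.
    Returns the number of entries written.
    """
    count = 0
    with open(terms_path, encoding="utf-8") as source, gzip.open(gz_path, mode) as target:
        for raw in source:
            # Remove all white space so each term is a single word.
            term = "".join(raw.split())
            if not term:
                continue  # skip blank lines
            line = "{0} {1} {2}\n".format(language, term, frequency)
            target.write(line.encode("utf-8"))
            count += 1
    return count
```

After writing the file, Steps 3 and 4 stay the same: copy the .gz to the resourcestore and run spelltuner.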

    With Regards,

    Umashankar Singh.

    Tuesday, March 20, 2012 1:07 PM