Document_errors.txt is filling up to nearly 13 GB

  • Question

  • Document_Errors.txt is filling up with nearly 13 GB of error messages and exceptions similar to those shown below:
    crawlerCollection 2 processing TermExtractor: Skipping field 'fieldname' since its not tokenized. DocID: h t t p:// website.aspx 0
    and
    Product 240 processing System exception subsystem (session 13): OBJECT_NOT_EXIST: Minor: OBJECT_NOT_EXIST_NoMatch, COMPLETED_NO. 15

    The file system is filling up, which prevents items from being searchable; the count of public documents was lower than the count of indexed documents.

    Can anyone help me resolve this or understand the cause of these error messages?

    Thanks

    Thursday, April 7, 2011 8:24 AM

Answers

  • Hello Sagar,

    The stage "termextractor" is not a built-in stage. This must be a custom processor stage in your environment, most likely some custom matcher stage. If you need further assistance here, I would recommend opening a support case so an engineer can work with you directly.

    Friday, June 10, 2011 3:34 PM
  • We have raised a support ticket with Microsoft. This issue may be resolved with a new upgrade patch for FAST ESP.
    Thanks!
    • Marked as answer by Sagar Gururaj Thursday, July 28, 2011 7:22 PM
    Thursday, June 30, 2011 4:01 AM
  • The issue is resolved. FAST has provided an upgrade patch. The Document_warnings.txt and Document_errors.txt files are now limited to 100 MB of growth. Once a file exceeds that limit, its contents are moved to Document_warnings.txt.old or Document_errors.txt.old and the file is cleared. This happens whenever the indexeradmin purgeerrors command is executed.
    • Marked as answer by Sagar Gururaj Tuesday, September 13, 2011 2:39 PM
    Tuesday, September 13, 2011 2:38 PM
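
The size-limited behaviour described in the last answer can be illustrated with a short Python sketch. This is not how FAST ESP implements it, only a model of the described effect: when a log file is larger than a limit, its contents move to a ".old" file and the original is left empty. The function name `rotate_if_oversized`, the exact byte value used for "100 MB", and the choice to overwrite any earlier ".old" file are all assumptions made for illustration.

```python
import os

# "100 MB" taken from the answer above; the exact byte count is an assumption.
MAX_BYTES = 100 * 1024 * 1024


def rotate_if_oversized(path, max_bytes=MAX_BYTES):
    """Move an oversized log to '<path>.old' and leave an empty file behind.

    Hypothetical helper: returns True if a rotation happened, False otherwise.
    """
    if not os.path.exists(path) or os.path.getsize(path) <= max_bytes:
        return False
    # os.replace overwrites an existing '.old' file (assumed behaviour).
    os.replace(path, path + ".old")
    # Recreate the original file as an empty file.
    open(path, "w").close()
    return True
```

Calling `rotate_if_oversized("Document_errors.txt")` on a file under the limit does nothing; on a larger file it leaves an empty Document_errors.txt next to Document_errors.txt.old.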

All replies

  • This is typically caused by communication issues in the document processing subsystem (content distributor, indexing dispatcher, document processor). It is important to know which version of ESP you are running, as there have been issues addressed in feeding chain processing.

    If this is having a big impact on your environment, I recommend opening a case with support and providing an infotool report, as this will give a baseline from which to begin diagnosis.

    Thursday, April 14, 2011 4:09 PM
  • ESP 5.3.SP3 is the version in use.
    It seems to be an issue with FAST. However, we were advised to add the names of the attributes that are clogging the log files to "field_ignores", and also to update "composite-field_ignores", in termextractor.xml. This has reduced the size of the log files for the time being.

    My next questions are:
    Should these attributes also be added manually to PipelineConfig.xml, under the processor with name = "termextractor"?
    What exactly is the purpose of termextractor.xml, and what happens to the attribute names listed there?

    Monday, May 30, 2011 5:10 AM
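
To decide which attribute names are worth adding to "field_ignores", it can help to count how often each field appears in the "Skipping field" messages. The Python sketch below does that. The regular expression is based only on the single sample line quoted in the question, so the real log format may need a different pattern, and the function names are hypothetical.

```python
import re
from collections import Counter

# Pattern derived from the sample line in the question (an assumption
# about the general log format, including its "its not tokenized" wording).
SKIPPED_FIELD = re.compile(r"Skipping field '([^']+)' since its not tokenized")


def count_skipped_fields(lines):
    """Count 'Skipping field' messages per field name in an iterable of lines."""
    counts = Counter()
    for line in lines:
        match = SKIPPED_FIELD.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def count_skipped_fields_in_file(path):
    """Stream the log line by line so a multi-GB file is never fully loaded."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return count_skipped_fields(handle)
```

For example, `count_skipped_fields_in_file("Document_errors.txt").most_common(10)` would list the ten fields that generate the most messages, assuming the file is in the current directory.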