Not indexing new documents - remdoclist taking too long?

  • Question

  • Hello,

    I'm running a FAST ESP 5.3 installation on Linux, and almost every day my system stops indexing documents for at least 2 hours. We reindex a lot of documents all day long, and I wonder if the blacklist cleaning has something to do with that.

    One thing that caught my attention was this:

    [2013-08-01 09:55:25.863] DEBUG      indexer remdoclist (2_364): Build (for index 3_351) input: 35194-2989053 target: 16690-35210 file: /opt/fast/esp/data/data_index/index_2_364/removefile_3_351.dat
    [2013-08-01 09:55:25.863] DEBUG      indexer remdoclist (2_364): We need to delete high urls from 35211 to 2989053
    [2013-08-01 09:55:26.996] DEBUG      indexer remdoclist (2_364): We need to remove 2953843 files for the new index
    [2013-08-01 12:04:45.314] DEBUG      indexer remdoclist (2_364): Built remdoclist '/opt/fast/esp/data/data_index/index_2_364/removefile_3_351.dat' with 3468321 entries.
    [2013-08-01 12:04:47.366] DEBUG      indexer remdoclist (2_364): Flushed 101455 entries to remdoclist /opt/fast/esp/data/data_index/index_2_364/removefile_3_351.dat, total size of removefile is 3569776 101455

    Does it stop indexing new documents while building the remdoclist? I ask because once the build finishes, we can see new documents being indexed again.

    Is there any way to avoid this behavior?

    Thursday, August 1, 2013 6:58 PM

All replies

  • blacklist = things not searchable in current gen
    removelist = things to be removed in next gen

    For example, partition 2_363 has 3 million documents *blacklisted*, and it has a removelist for the next generation (2_364) containing the same documents, which should be removed when the next indexing job happens.

    However, this does not stop indexing. If it's not indexing during this time, something else is going on, and the indexer logs may reveal more.
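
    One way to see where the time goes is to scan the indexer log for large gaps between consecutive timestamped lines. The following is only a minimal sketch, assuming the `[YYYY-MM-DD HH:MM:SS.mmm]` timestamp prefix seen in the excerpt above; the function name `find_gaps` and the 10-minute threshold are illustrative choices, not part of ESP.

```python
import re
from datetime import datetime, timedelta

# Timestamp prefix as it appears in the log excerpt, e.g. [2013-08-01 09:55:25.863]
TIMESTAMP = re.compile(r"^\s*\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\]")


def find_gaps(lines, threshold=timedelta(minutes=10)):
    """Return (gap, previous_line, next_line) for each pair of consecutive
    timestamped lines that are further apart than `threshold`.

    `find_gaps` and the default threshold are hypothetical helpers for
    illustration; lines without a recognizable timestamp are skipped.
    """
    gaps = []
    prev_time = None
    prev_line = None
    for line in lines:
        match = TIMESTAMP.match(line)
        if not match:
            continue
        current = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
        if prev_time is not None and current - prev_time > threshold:
            gaps.append((current - prev_time, prev_line.strip(), line.strip()))
        prev_time, prev_line = current, line
    return gaps
```

    Passing the lines of the indexer log (for example, an open file handle) to `find_gaps` lists each long silence together with the lines on either side of it. Keep in mind that a gap in the remdoclist messages alone does not prove indexing stopped; other indexer messages logged in the same window would show what else was running.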

    Friday, August 2, 2013 3:19 PM
  • Hi Nelson,

    Were you able to resolve this issue? If not, what is your ESP Service Pack and searchengine patch level? I would always recommend being at ESP 5.3 SP5, along with the latest searchengine patch02, as there have been many updates and fixes to the searchengine component that impact indexing behavior. If you are still seeing this issue, let us know your service pack level and patch number.


    Rob Vazzana | Sr Support Escalation Engineer | US Customer Service & Support


    Monday, September 23, 2013 9:27 PM