Some documents not getting indexed in FAST ESP RRS feed

  • Question

  • In my fetch log I can see this URL:

    2011-06-02-02:51:19 200 UNCHANGED http://nhsnes.nescmsuat.aws.rroom.net/about-us/test/testevent2.aspx

    In the pp log I can see the same URL:

    2011-06-02-09:19:15 ADD http://nhsnes.nescmsuat.aws.rroom.net/about-us/test/testevent2.aspx
        Store        : NesCorpDemo/data/3
        Idx          : 65689
        Size         : 4266
        Mime         : text/html
        Levels       : 2/2/2
        Referrers    :
        Encoding     : deflate


    In my ds feed log I can see:


    2011-06-02-09:20:33 ADD        PROCESSED   http://nhsnes.nescmsuat.aws.rroom.net/about-us/test/testevent2.aspx


    But this document is not getting indexed, and I can't find it through the search view. Any ideas?
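
    For repeated checks like this, the cross-log lookup can be scripted. This is a minimal Python sketch, assuming only that each log line ends with the URL as a whitespace-separated field, as in the three excerpts above. The function name trace_url and the log labels are made up for illustration and are not part of FAST ESP.

```python
from typing import Dict, Iterable, List


def trace_url(url: str, logs: Dict[str, Iterable[str]]) -> Dict[str, List[str]]:
    """For each named log, collect the lines that contain the URL as a whole field."""
    hits: Dict[str, List[str]] = {}
    for name, lines in logs.items():
        # split() with no arguments tolerates the padded columns seen in the ds feed log
        hits[name] = [line.strip() for line in lines if url in line.split()]
    return hits


if __name__ == "__main__":
    url = "http://nhsnes.nescmsuat.aws.rroom.net/about-us/test/testevent2.aspx"
    # Sample lines copied from the excerpts above; in practice, pass open file objects.
    logs = {
        "fetch": ["2011-06-02-02:51:19 200 UNCHANGED " + url],
        "pp": ["2011-06-02-09:19:15 ADD " + url],
        "dsfeed": ["2011-06-02-09:20:33 ADD        PROCESSED   " + url],
    }
    for name, lines in trace_url(url, logs).items():
        print(name, lines[-1] if lines else "(no entry)")
```

    A log that reports "(no entry)" marks the stage where the document was last seen.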

    Thursday, June 2, 2011 12:34 PM


  • Hi Mehtab,


    You can take a quick look at the data from the crawler to verify that content was read using these steps:

    1.  Stop the crawler.

    2.  After setting your environment variables, run "crawleradmin --offline --getdata <collection>:<URI>".  You may want to pipe this to a file.

    3.  Start the crawler.
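
    Step 2 above can be wrapped in a small script so the output always lands in a file. This is only a sketch: it assumes crawleradmin is on the PATH (that is, the environment variables are already set) and that the crawler has been stopped by hand beforehand. The helper names getdata_command and dump_crawler_data are made up for illustration; the only FAST ESP command used is the one quoted in step 2.

```python
import subprocess
import sys
from typing import List


def getdata_command(collection: str, uri: str) -> List[str]:
    """Build the argument list for the offline getdata call quoted in step 2."""
    return ["crawleradmin", "--offline", "--getdata", f"{collection}:{uri}"]


def dump_crawler_data(collection: str, uri: str, out_path: str) -> int:
    """Run the command, write its standard output to out_path, return the exit code."""
    with open(out_path, "wb") as out:
        return subprocess.run(getdata_command(collection, uri), stdout=out).returncode


if __name__ == "__main__":
    # Usage: python getdata.py <collection> <URI> <output file>
    # Stop the crawler before running this, and start it again afterwards.
    if len(sys.argv) != 4:
        sys.exit("usage: getdata.py <collection> <URI> <output file>")
    sys.exit(dump_crawler_data(sys.argv[1], sys.argv[2], sys.argv[3]))
```

    Passing the arguments as a list avoids shell quoting problems with URIs that contain characters such as "?" or "&".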


    If the data is what you expect to see, check the Matching Engines screen of the Admin GUI to verify that "Not indexed / Invalid documents" is 0.  If it is not, you may want to try running a resetindex ("indexeradmin resetindex").




    Patrick Schneider | Microsoft | Enterprise Search Group | Support Escalation Engineer | http://www.microsoft.com/enterprisesearch
    Friday, June 3, 2011 12:51 PM