Some documents not getting indexed in FAST ESP RRS feed

  • Question

  • In my fetch log I can see this URL:

    2011-06-02-02:51:19 200 UNCHANGED http://nhsnes.nescmsuat.aws.rroom.net/about-us/test/testevent2.aspx

    In the pp log I can see the same URL:

    2011-06-02-09:19:15 ADD http://nhsnes.nescmsuat.aws.rroom.net/about-us/test/testevent2.aspx
        Store        : NesCorpDemo/data/3
        Idx          : 65689
        Size         : 4266
        Mime         : text/html
        Levels       : 2/2/2
        Referrers    :
        Encoding     : deflate

     

    In my ds feed log I can see:

     

    2011-06-02-09:20:33 ADD        PROCESSED   http://nhsnes.nescmsuat.aws.rroom.net/about-us/test/testevent2.aspx

     

    But this document is not getting indexed, and I can't find it through the search view. Any ideas?

    Thursday, June 2, 2011 12:34 PM

Answers

  • Hi Mehtab,

     

    You can take a quick look at the data from the crawler to verify that content was read using these steps:

    1.  Stop the crawler.

    2.  After setting your environment variables, run "crawleradmin --offline --getdata <collection>:<URI>".  You may want to pipe this to a file.

    3.  Start the crawler.
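
The three steps above could be scripted roughly as follows. This is only a sketch under some assumptions: it assumes the crawler is stopped and started through the node controller ("nctrl stop crawler" / "nctrl start crawler"), that the FAST ESP environment variables are already set in the shell that launches it, and that the collection is named NesCorpDemo (guessed from the Store line in the pp log). The function names and the output file name are made up for illustration. It defaults to a dry run that only prints the commands.

```python
import subprocess
from typing import List


def crawler_check_commands(collection: str, uri: str) -> List[List[str]]:
    """Return the commands for the three steps, in order."""
    return [
        # Step 1: stop the crawler (assumption: it is managed by nctrl).
        ["nctrl", "stop", "crawler"],
        # Step 2: dump what the crawler stored for this collection and URI.
        ["crawleradmin", "--offline", "--getdata", f"{collection}:{uri}"],
        # Step 3: start the crawler again (same nctrl assumption).
        ["nctrl", "start", "crawler"],
    ]


def run_crawler_check(collection: str, uri: str, outfile: str,
                      dry_run: bool = True) -> None:
    """Print the commands (dry run) or execute them, saving the dump to outfile."""
    stop, getdata, start = crawler_check_commands(collection, uri)
    if dry_run:
        for cmd in (stop, getdata, start):
            print(" ".join(cmd))
        return
    subprocess.run(stop, check=True)
    try:
        # Send the crawler data to a file rather than the terminal.
        with open(outfile, "wb") as fh:
            subprocess.run(getdata, stdout=fh, check=True)
    finally:
        # Restart the crawler even if the dump step failed.
        subprocess.run(start, check=True)


if __name__ == "__main__":
    run_crawler_check(
        "NesCorpDemo",  # assumed collection name
        "http://nhsnes.nescmsuat.aws.rroom.net/about-us/test/testevent2.aspx",
        "testevent2_dump.txt",  # hypothetical output file
    )
```

Pass dry_run=False only after checking that the printed commands match your installation; the try/finally is there so the crawler is not left stopped if the dump fails.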

     

    If the data is what you expect to see, check the Matching Engines screen of the Admin GUI to verify that "Not indexed / Invalid documents" is 0.  If it is not, you may want to try running a resetindex ("indexeradmin resetindex").

     

    Thanks,

    Patrick


    Patrick Schneider | Microsoft | Enterprise Search Group | Support Escalation Engineer | http://www.microsoft.com/enterprisesearch
    Friday, June 3, 2011 12:51 PM