File contents are indexed but cannot be searched

  • Question

  • Hi

    I'm using the JDBC Connector to index a database that stores, among other things, documents in binary format (.txt, .docx, .xml, .pptx, .xlsx and .pdf).
    The query used by the connector to index the database returns the binaries in a field named data (thanks to Mikael Svenson for this tip) so that FAST knows it is data in binary format.

    The problem I'm having is:

    • Everything is crawled and indexed without errors
    • When I search for a word (using free text search) that appears in the title/filename of the document I get results. If that word also appears in the document content, FAST returns part of the content in the hithighlightedsummary property. This is the expected behavior.
    • When I search for a word (using free text search) that does not appear in the title/filename of the document but does appear in the document content, I get no results.
    • The exception is .txt and .xml files: for these only, searching the content works as expected.

    Summarizing: I know the contents are being crawled and indexed, since they are returned in the hithighlightedsummary, but apparently FAST is not searching inside those contents.

    I'm using FAST Search for SharePoint SP1 with all the cumulative updates up to April 2012 CU.

    Has this happened to anyone?

    Thanks,
    André


    Wednesday, May 16, 2012 9:29 AM

Answers

  • I found out what I did wrong.

    During the configuration process, I mapped some of my crawled properties to the out-of-the-box body managed property. Apparently, that interferes with the way FAST indexes file contents, which led to the problem I described above.

    Removing the mappings and re-crawling the database fixed it.

    Regards,
    André.

    Thursday, May 17, 2012 8:18 AM

All replies

  • Hi Andre.

    I am facing the same problem. In my case, all the items of the SQL database table were crawled except the binary item, and I don't know why that item was excluded. Also, which crawled properties should be mapped to the body managed property?

    Please reply with details.

    Ananth.R


    Ananth

    Saturday, December 8, 2012 8:36 AM
  • Hi Andre,

    I am also facing the same issue. I have some custom pipeline stages which process the body of the document, but while processing I am getting an exception like this:

    Customer-supplied command failed: Active process limit exceeded
    \\ Process terminated abnormally: Unknown error (0xe0434f4d)
    \\  \\ Unhandled Exception: System.ArgumentException: '', hexadecimal value 0x13, is an invalid character.
    \\    at System.Xml.XmlUtf8RawTextWriter.InvalidXmlChar(Int32 ch, Byte* pDst, Boolean entitize)

    Could you please help? Do you have any custom pipeline stages by any chance, or is it just OOB crawling?

    Thanks.


    Regards Sagar Pattnayak

    Thursday, March 21, 2013 2:34 PM
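
The `System.ArgumentException` quoted in the post above is what the .NET XML writer raises when the text it is asked to serialize contains a character that XML 1.0 does not allow, here the control character 0x13. The thread does not say how the issue was eventually fixed. One common approach, sketched here under assumptions and not presented as the poster's actual fix, is to strip such characters from the body text before a custom stage writes it out as XML. The sketch is in Python for brevity; the function name `strip_invalid_xml_chars` and the choice of a space as the replacement are hypothetical, chosen only for illustration.

```python
import re

# Matches any character outside the XML 1.0 "Char" production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_INVALID_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid_xml_chars(text: str, replacement: str = " ") -> str:
    """Return `text` with every XML 1.0-invalid character replaced.

    Hypothetical helper: the name and the default replacement (a single
    space) are assumptions, not part of any FAST or .NET API.
    """
    return _INVALID_XML_CHARS.sub(replacement, text)


if __name__ == "__main__":
    # 0x13 is the control character named in the exception message.
    body = "quarterly\x13report"
    print(strip_invalid_xml_chars(body))
```

Replacing with a space rather than deleting keeps the words on either side of the control character from being merged into one token, which matters if the cleaned text is later indexed.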
  • Hi Andre,

    I was able to fix the issue I mentioned earlier, but the binary data from my SQL Server table is still not getting indexed. Is there any other setting that needs to be configured?

    Thanks.


    Regards Sagar Pattnayak

    Friday, March 22, 2013 11:45 AM