body field and Fast JDBC connector

  • Question

  • Hi all, I'm using the Fast JDBC connector to feed content, and I'm having trouble with the whole-words extractors, which you turn on in optionalprocessing.xml. Those extractors expect their input to be in the body field in the pipeline, and I can't figure out how to get data into that field. Even if it's in the "body" column in the SQL result, and even if I map crawled property "body" to managed property "body", it's still not in the right place for the extractor. I added a spy stage in pipelineconfig.xml, and I can see from spy.txt that the data is in "4CC9F20A-C782-4C48-8961-C5356F8DFF89:body:31", not "body".
    Wednesday, July 13, 2011 9:03 PM

All replies

  • Hi Matt,


    Yes it looks like you're having the same problem as I am:


    I can see in pipelineconfig.xml that CompanyExtractor1 and CompanyExtractor2 expect their input from the eetitle and body fields. If I alias my database column as body, it doesn't make a difference. Also, if I map crawled property body to managed property body, the extractor still returns nothing.


    Matt, where did you put your spy stage? Perhaps I can replicate what you're seeing, and come up with some ideas?

    Thursday, July 14, 2011 6:20 AM
  • I can see that my body content is in:

    #### ATTRIBUTE body

    and in:

    #### ATTRIBUTE 4CC9F20A-C782-4C48-8961-C5356F8DFF89:body:31  <-- same as yours?


    Yet I don't have any Companies property extraction occurring.





    Thursday, July 14, 2011 10:59 AM
  • Hi Matt,


    Let's say the column containing the data on which you want to perform entity extraction is called 'fulltext'. Select it twice in the connector's SQL query, once aliased as data and once as body:

    select
    title as title,
    fulltext as data,
    fulltext as body
    from <schema>.<table>


    That should do the trick!!!!


    I now have Companies and Location property extraction from db content. This *should* apply to your wholewords issue too ;-)
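
    To see what the aliasing does to the result set, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for the real database. The table name docs and the sample row are made up for illustration, and the JDBC connector itself is not involved; the point is only that one source column can be returned under two names.

```python
import sqlite3

def fetch_aliased_rows(conn):
    """Run the aliased query and return each row as a dict keyed by column alias."""
    cur = conn.execute(
        "SELECT title AS title, fulltext AS data, fulltext AS body FROM docs"
    )
    columns = [d[0] for d in cur.description]  # alias names as the caller sees them
    return [dict(zip(columns, row)) for row in cur.fetchall()]

# Hypothetical in-memory table standing in for <schema>.<table>.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (title TEXT, fulltext TEXT)")
conn.execute(
    "INSERT INTO docs VALUES (?, ?)",
    ("Annual report", "Contoso opened an office in Oslo."),
)

rows = fetch_aliased_rows(conn)
print(list(rows[0].keys()))
print(rows[0]["data"] == rows[0]["body"])
```

    Each row now carries the same text under both data and body, which is the shape the query above hands to the connector.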





    • Proposed as answer by Mark J Billett Thursday, July 14, 2011 11:44 AM
    Thursday, July 14, 2011 11:43 AM
  • Bueno, Mark, the "AS data" did the trick for me too.  Thanks for your help!
    Thursday, July 14, 2011 3:49 PM
  • Thought I'd try to shed some light on this.

    Using data as the field is the key here, as this is the field that both the format detector and the converters work on. The converters pull content from the data field and output to the html field, which is picked up by the FastHTMLParser stage, which in turn outputs title, body, etc.
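
    The flow described above can be pictured with a toy model. This is only a sketch: the stage functions below are hypothetical stand-ins written for illustration, not FAST's actual stages or API, and the company list is invented. Only the attribute names (data, html, body) follow the ones mentioned in this thread.

```python
import re
from html import escape, unescape

KNOWN_COMPANIES = ["Contoso", "Fabrikam"]  # invented dictionary for the toy extractor

def converter(doc):
    """Hypothetical converter: reads 'data' and writes 'html'; skips docs without 'data'."""
    if "data" in doc:
        doc["html"] = "<html><body>{}</body></html>".format(escape(doc["data"]))
    return doc

def html_parser(doc):
    """Hypothetical parser: reads 'html' and writes 'body'; skips docs without 'html'."""
    match = re.search(r"<body>(.*?)</body>", doc.get("html", ""), re.S)
    if match:
        doc["body"] = unescape(match.group(1))
    return doc

def company_extractor(doc):
    """Hypothetical extractor: looks only at the 'body' attribute."""
    body = doc.get("body", "")
    doc["companies"] = [name for name in KNOWN_COMPANIES if name in body]
    return doc

def run_pipeline(doc):
    for stage in (converter, html_parser, company_extractor):
        doc = stage(doc)
    return doc

text = "Contoso opened an office in Oslo."
# Content fed only as a crawled property named "body": no 'data', so no 'body' is produced.
without_data = run_pipeline({"4CC9F20A-C782-4C48-8961-C5356F8DFF89:body:31": text})
# Same content also fed as 'data': the converter and parser produce 'body'.
with_data = run_pipeline({"data": text})

print(without_data["companies"])
print(with_data["companies"])
```

    In this model the extractor only finds something when the content entered the pipeline as data, which matches the behaviour reported earlier in the thread.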


    Mikael Svenson 

    Search Enthusiast - SharePoint MVP/WCF4/ASP.Net4
    Friday, July 15, 2011 6:53 PM