Pipeline extension: how can I tell what web a crawled object belongs to?

  • Question

  • I'm trying to create a pipeline extension that provides metadata enrichment. For any given crawled item, I'd like to know which web it belongs to. I've tried to pull the crawled property ows_WebId, but am finding that it isn't always populated. So I'm wondering...

    • Should this property always have a value (for SharePoint crawled content, of course)? Perhaps my configuration is not correct.
    • If I can't guarantee that ows_WebId has a value, is there another property or means for determining the web of the crawled item?

    Thursday, February 21, 2013 2:38 PM

All replies

  • Hi,

    I've seen this as well, and the only solution I've found so far is to use the URL and match it against a pre-made site list (lookups against SharePoint are quirky to do in a pipeline stage, and would also be slow). I'm not sure why it's empty or whether it should always have an entry.

    Thanks,
    Mikael Svenson


    Search Enthusiast - SharePoint MVP/MCT/MCPD - If you find an answer useful, please up-vote it.
    http://techmikael.blogspot.com/
    Author of Working with FAST Search Server 2010 for SharePoint

    Thursday, February 21, 2013 3:04 PM
  • Thanks for the input Mikael.

    I'm creating a cache of metadata values collected via the SP client object model. The cache is on the FAST index server and runs as a WCF service accessed via net.pipe, so it's pretty fast. I only pay the price for looking up the metadata once per site collection.

    Which crawled property do I use for the URL? Or are you talking about something that is exposed in the pipeline API?


    • Edited by ArbutusJoe Thursday, February 21, 2013 3:21 PM
    Thursday, February 21, 2013 3:20 PM
  • Hi,

    Use the following crawled property for the URL. It belongs to the FAST internal property set, which contains url, body and data (documented at http://msdn.microsoft.com/en-us/library/ff795815(v=office.14).aspx).

    <CrawledProperty propertySet="11280615-f653-448f-8ed8-2915008789f2" varType="31" propertyName="url"/>

    If you sort your web URLs in descending order, a subweb sorts ahead of its parent web, so the first web URL that the value of this crawled property starts with (a StartsWith match) will be the correct web for that item.
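
As a rough illustration of the matching described above, here is a minimal Python sketch (a real pipeline stage would more likely be a .NET executable, so treat it as pseudocode for the logic). The function names `read_url` and `find_web`, the sample URLs, and the assumed input layout (a `<Document>` root with `<CrawledProperty>` children whose text is the value) are assumptions for illustration, not something confirmed in this thread. It also differs from the reply in two ways: it sorts by length rather than plain descending order, and it requires a "/" boundary after the web URL so that a web like /sites/hr does not claim items under /sites/hr2.

```python
import xml.etree.ElementTree as ET

# Property set and name copied from the CrawledProperty element quoted above.
URL_PROPERTY_SET = "11280615-f653-448f-8ed8-2915008789f2"
URL_PROPERTY_NAME = "url"


def read_url(input_xml):
    """Return the url crawled property from a pipeline input document, or None.

    Assumption: the input is a <Document> root with <CrawledProperty>
    children whose element text is the property value.
    """
    root = ET.fromstring(input_xml)
    for prop in root.iter("CrawledProperty"):
        if (prop.get("propertySet", "").lower() == URL_PROPERTY_SET
                and prop.get("propertyName") == URL_PROPERTY_NAME):
            return (prop.text or "").strip() or None
    return None


def find_web(item_url, web_urls):
    """Return the most specific web URL that item_url falls under, or None."""
    item = item_url.lower()
    # Longest first, so a subweb is tried before its parent web.
    for web in sorted(web_urls, key=len, reverse=True):
        prefix = web.lower().rstrip("/")
        # Require a path boundary so /sites/hr does not match /sites/hr2/...
        if item == prefix or item.startswith(prefix + "/"):
            return web
    return None


if __name__ == "__main__":
    # Hypothetical pre-made web list; in practice this would come from SharePoint.
    webs = ["http://srv", "http://srv/sites/hr", "http://srv/sites/hr/benefits"]
    print(find_web("http://srv/sites/hr/benefits/Docs/plan.docx", webs))
```

Sorting the web list once up front and reusing it for every item keeps the per-item cost to a handful of string comparisons, which matters when the stage runs for every crawled document.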

    Thanks,
    Mikael Svenson



    • Marked as answer by ArbutusJoe Friday, February 22, 2013 1:11 PM
    Thursday, February 21, 2013 6:59 PM