FAST Search Web Crawler Indexing Secured SharePoint Blog Sites

  • Question

    Hi - We have 10 SharePoint 2007 Blog sites that are internal to my business (i.e. not Internet-facing). 2 of those Blog sites are secured (using standard SharePoint security groups) so that only certain users can access them. All 10 Blog sites contain links to external websites.

    I have used the FAST Search Web Crawler in FS4SP 2010 to index these 10 blog sites because, in the XML config file, I can specify that it should index all Blog content but only the first page of any external URL (via DEPTH:0). This works well, and I can search across all 10 Blog sites and the external content, but the security of the 2 secured Blog sites is not retained. For example, UserA does not have access to read Blog1 but can run a search and get results from Blog1, which is not acceptable.

    If I use the SharePoint Web Crawler to index the Blogs instead (choosing either the 'SharePoint Sites' or 'Web Sites' option when creating a new Content Source), item-level security is applied: UserA sees results from several Blogs but not Blog1.

    My first question: is it possible to have item-level security applied to SharePoint Blog content indexed by the FAST Search Web Crawler, or is this only possible via SharePoint's web crawler?

    If, as I suspect, it is not possible, my second question: can I have the SharePoint web crawler index the Blog content, and have the FAST Search Web Crawler crawl the Blogs but index only the linked content on external sites? I have tried this, but I can't stop the FAST Search Web Crawler from indexing the Blog content and making it available in the result list.

    Thanks,

    Francis

     



    Tuesday, August 2, 2011 8:30 AM

Answers

  • Hello Francis,

    1. No, the FAST Enterprise Crawler treats content as public once indexed. It provides no mechanism for storing security information, because web sites could potentially use any form of security mechanism, and covering them all would be impossible. If you want to retain per-item security, your only option is to use the SharePoint Crawler (not the SharePoint web crawler).

    2. I'm not sure you can exclude the blog posts themselves during indexing, but you could exclude them at search time by using a scope on your search page. If you store the SharePoint crawled content in the default content collection "sp" and the enterprise crawled content in a content collection "web", you can create a scope with FQL like this:

    or(meta.collection:"sp", andnot(meta.collection:"web", path:starts-with("http://site/blog")))


    This will match any items indexed by the SharePoint connectors, plus all items indexed by the Enterprise Web Crawler except those whose URL starts with the SharePoint blog URL.
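To make the scope logic concrete, here is a rough Python sketch of what the expression is meant to do. It is only an illustration under stated assumptions, not FS4SP code: `IndexedItem`, `build_scope_fql` and `in_scope` are made-up names, the collection names "sp" and "web" are the ones suggested above, and the builder writes the prefix test in the `starts-with("...")` operator form, which is my reading of the FQL syntax. The simulation models only the scope; security trimming of the "sp" items is still done by SharePoint itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexedItem:
    """Hypothetical stand-in for one indexed document."""
    collection: str  # content collection name, e.g. "sp" or "web"
    path: str        # URL of the indexed item


def build_scope_fql(blog_prefix: str, sp: str = "sp", web: str = "web") -> str:
    """Build the scope expression for one blog URL prefix (illustrative helper)."""
    return (
        f'or(meta.collection:"{sp}", '
        f'andnot(meta.collection:"{web}", path:starts-with("{blog_prefix}")))'
    )


def in_scope(item: IndexedItem, blog_prefix: str,
             sp: str = "sp", web: str = "web") -> bool:
    """Mirror the scope logic: keep everything in the SharePoint collection,
    and keep web-crawled items unless their URL starts with the blog prefix."""
    if item.collection == sp:
        return True
    return item.collection == web and not item.path.startswith(blog_prefix)


if __name__ == "__main__":
    prefix = "http://site/blog"
    print(build_scope_fql(prefix))
    samples = [
        IndexedItem("sp", "http://site/blog/post1"),    # SharePoint-crawled blog post
        IndexedItem("web", "http://site/blog/post1"),   # same post, web-crawled copy
        IndexedItem("web", "http://example.com/page"),  # external linked page
    ]
    for item in samples:
        print(item.collection, item.path, in_scope(item, prefix))
```

The point of the design is that the unsecured, web-crawled copy of each blog post is filtered out at query time, so the only copy of a blog post a user can find is the one from the SharePoint crawl.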

    Regards,
    Mikael Svenson 


    Search Enthusiast - SharePoint MVP/WCF4/ASP.Net4
    http://techmikael.blogspot.com/
    • Proposed as answer by Rob Va - MSFT Wednesday, August 3, 2011 2:51 PM
    • Marked as answer by Francisjw Wednesday, August 10, 2011 2:18 PM
    Tuesday, August 2, 2011 6:42 PM

All replies

  • Thanks Mikael - I'll give the search scope option a try.

    Francis

    Wednesday, August 10, 2011 2:23 PM