How to use the Managed Property Extraction features?

  • Question

  • Hello All,
    I have done a lot of reading and testing but am struggling with how to use the Managed Property Extraction features in FAST for SharePoint.
    We would like to:
    • Do phonetic and nickname searches
    • Extract and create metadata latent in documents to improve search results
    • Do Exact, keyword, phonetic, and fuzzy searches
    In our implementation:
    • Search data is added using the Content API directly into FAST (Content is about companies and people)
    • Search is performed using the FAST Search API directly
    We need to know:
    • How to use the three OOTB property extraction stages (persons, companies, locations)
    • Can Managed Property Extraction help us?
    • How to implement?
    • How to tie our data to the feature
    • How to search using this data and feature
    Thanks
    Bob Bachand
    bbachand@deloitte.com
    Thursday, April 7, 2011 1:58 PM

Answers

All replies

  • First a question: What do you mean by 'FAST Search API directly'?

    For the built-in extractors, you can add terms to the inclusion/exclusion lists for the OOB dictionaries, ref. http://technet.microsoft.com/library/ff191250(en-us).aspx and http://msdn.microsoft.com/en-us/library/ff795826.aspx#optional-item-proc-property

    You can define additional property extractors, ref. http://msdn.microsoft.com/en-us/library/ff795797.aspx. These are dictionary-driven extractors that match words or strings. They map extracted terms to predefined crawled properties, which you in turn can map to managed properties of your choice.
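
    To make the idea of a dictionary-driven extractor concrete, here is a minimal Python sketch of whole-word matching with inclusion and exclusion lists. It is only an illustration of the concept, not how FAST implements it: the function name extract_terms, its parameters, and the sample company names are all hypothetical.

```python
import re


def extract_terms(text, dictionary, include=(), exclude=()):
    """Return dictionary terms found as whole words in text, in order of first match."""
    # Canonical spelling keyed by lowercase form, so matching is case-insensitive.
    terms = {t.lower(): t for t in list(dictionary) + list(include)}
    # Exclusion entries win over both the dictionary and the inclusion list.
    for t in exclude:
        terms.pop(t.lower(), None)
    if not terms:
        return []
    # Try longer terms first so a multi-word entry is preferred over its prefix.
    alternatives = "|".join(re.escape(t) for t in sorted(terms, key=len, reverse=True))
    # The lookarounds reject matches that are embedded inside a longer word.
    pattern = re.compile(r"(?<!\w)(?:" + alternatives + r")(?!\w)", re.IGNORECASE)
    found = []
    for match in pattern.finditer(text):
        canonical = terms.get(match.group(0).lower())
        if canonical is not None and canonical not in found:
            found.append(canonical)
    return found


# Hypothetical sample data: "Fabrikam" is excluded, "Northwind" is added by inclusion.
text = "Contoso and Fabrikam signed with Northwind; contoso paid in Contosoville."
companies = extract_terms(
    text, ["Contoso", "Fabrikam"], include=["Northwind"], exclude=["Fabrikam"]
)
print(companies)  # ['Contoso', 'Northwind']
```

    The returned list plays the role of the values written to a crawled property; "Contosoville" is ignored because only whole words match.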

    In most cases these extracted properties are used for query refinement. See http://msdn.microsoft.com/en-us/library/ff795797.aspx#custom-prop-refiner for more details. For the troubleshooting example, an updated PowerShell script can be found at http://blogs.msdn.com/b/knutbran/archive/2011/04/01/some-hints-on-testing-custom-managed-properties-and-queries.aspx. If you do not use item-level security, it may be easier to go directly to the QRServer internal query interface, ref. http://technet.microsoft.com/nb-no/library/ff381272(en-us).aspx#BKMK_TestFASTSearchServer2010ForSharePoint

    If you want to apply more complex extractions (beyond exact matching against a dictionary), you will need to create a pipeline extensibility stage, ref. http://msdn.microsoft.com/en-us/library/ff795801.aspx

    Thursday, April 7, 2011 8:32 PM
  • Thanks for the response Knut,
    We have created a standardized search web service and are querying the Search API through that service. We understand that there are two approaches to this: the Query Web Service and the Query Object Model. We chose the Query Object Model because it is the lowest-level API and has more features.
    I understand the inclusion/exclusion lists and how to define custom property extractors. What I am missing is this: if I have an XML file containing company data and I push it into the index, do I only need to map that crawled property to the managed property used for the companies property extraction? Or do I add the XML company data to the companies crawled property?
    Thanks again
    Friday, April 8, 2011 6:17 PM
  • Hi Bob.

    As for phonetic search, this is not part of FAST for SharePoint, and you would have to implement your own solution with Soundex, Levenshtein distance, or similar. Nicknames you could implement with the wholeword extractor. That said, SharePoint People Search has both phonetic capabilities and nicknames built in. People Search is not implemented in FAST for SharePoint, but uses the OOB search.
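
    For readers unfamiliar with the two techniques named above, here is a minimal Python sketch of American Soundex (phonetic keys) and Levenshtein edit distance (fuzzy matching). It is a standalone illustration under the assumption that you would compute such values yourself, for example in a custom processing stage or before feeding content; the function names are hypothetical and nothing here is part of FAST for SharePoint.

```python
def soundex(name: str) -> str:
    """American Soundex: first letter plus three digits, e.g. Robert -> R163."""
    codes = {}
    for group, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
        for ch in group:
            codes[ch] = digit
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    prev = codes.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters that share a code
        code = codes.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code is kept after it
    return (result + "000")[:4]


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character inserts, deletes, and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the working rows as short as possible
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete from a
                           cur[j - 1] + 1,             # insert into a
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


# Names that sound alike share a Soundex key; near-misses have a small distance.
print(soundex("Jon"), soundex("John"))  # J500 J500
print(levenshtein("Jon", "John"))       # 1
```

    One possible use is to store the Soundex key as an extra property on each person or company item and query on the key, while edit distance is used to rank or filter near-miss spellings.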

    As for the OOTB extractors, they support white and black lists, so you could easily feed them with known names. Your content would then be scanned for these terms, and matches would be added to the predefined crawled properties for these extractors. These are also already mapped to equivalent managed properties.

    You find information about this at http://technet.microsoft.com/en-us/library/ff191240.aspx for PowerShell or http://office.microsoft.com/en-us/fast-search-server-help/manage-property-extraction-HA010381911.aspx for using the Admin UI.

    The companies and location extractors are enabled by default, and are available as refiners (if you choose to add them). The Persons extractor can be enabled in optional processing (http://msdn.microsoft.com/en-us/library/ff795826.aspx#optional-item-proc-customize). Wholeword extractors are also enabled in this file.

    Hope this gets you started. If you want to create custom stages, I have two blog posts about this on my blog: one for a C# module and one using PowerShell.

    Regards,
    Mikael Svenson

    Search Enthusiast - MCTS SharePoint/WCF4/ASP.Net4
    http://techmikael.blogspot.com/ - http://www.comperiosearch.com/
    Saturday, April 9, 2011 7:41 PM