FAST XML format

  • Question

  • Hey,
    Does anyone know how to push a document in FAST XML format into FAST? Right now, if I push it in, it just gets treated like a regular document. I have enabled XMLMapper in the optionalprocessing.xml file and created my own xmlmapper.xml file, but no luck. Any help would be appreciated.

    <?xml version="1.0" encoding="utf-8"?>
    <documents>
     <document>
      <element name="myprop1">
       <value>myvalue</value>
      </element>
      <element name="myprop2">
       <value>myvalue</value>
      </element>
     </document>
     <document>
      <element name="myprop1">
       <value>myvalue</value>
      </element>
      <element name="myprop2">
       <value>myvalue</value>
      </element>
     </document>
    </documents>

    I just noticed that if I crawl content with jdbcconnector.bat, I can have it export in this XML format (by updating the XMLExport group in my config file). I am assuming that if I can export this format, I can also import it?

    • Edited by cmcauliffe Wednesday, July 14, 2010 4:27 PM new information found
    • Moved by hsalvi (Microsoft employee, Moderator) Thursday, August 12, 2010 2:17 PM Moving to Fast Search Forum (From: SharePoint 2010 - General Questions and Answers)
    Wednesday, July 14, 2010 12:59 PM

Answers

  • Hi All,

    FS4SP has a limitation when indexing XML files that contain more than one item (document): it treats the whole file as a single item.

    Solutions

    The best option is to use the xmlwrapper.xml file (http://technet.microsoft.com/en-us/library/ff395931.aspx).

    1. Store every item in a separate XML file. This does not look very practical, especially if you need to export your data source into XML files.

    2. Another workaround: if the data source is available through a web interface, FS4SP can crawl the web content. However, you will lose the metadata unless it is clearly defined in the HTML meta tags.

    3. Another possible solution is to export all the XML files into a database and use the JDBC connector to index the content.

    Best of luck guys,

    Syed

    Friday, August 20, 2010 7:30 AM
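
    Option 1 in the answer above (one item per XML file) can be scripted rather than done by hand. The following is only a minimal sketch in Python, assuming the <documents>/<document> layout shown in the question. The function name split_documents and the output file naming are hypothetical, and nothing here is part of FS4SP itself.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def split_documents(xml_text, out_dir):
    """Write each <document> element to its own XML file; return the file paths."""
    root = ET.fromstring(xml_text)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    paths = []
    for index, doc in enumerate(root.findall("document"), start=1):
        # Keep the original <documents> wrapper so every output file
        # has the same shape as the input, just with a single item.
        wrapper = ET.Element("documents")
        wrapper.append(doc)

        path = out / f"document_{index:05d}.xml"  # hypothetical naming scheme
        ET.ElementTree(wrapper).write(path, encoding="utf-8", xml_declaration=True)
        paths.append(path)
    return paths
```

    Calling split_documents(open("export.xml", "rb").read(), "split") would write one file per item into a "split" folder, which could then be placed on a file share for crawling. Both names are placeholders.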

All replies

  • I asked around, and my colleagues aren't 100% sure, but you could check the file etc\PipelineConfig.xml to see whether it references fastxml. I don't have a FAST Search for SharePoint Server in front of me at the moment, but I would be interested to hear what you find.
    John Ross
    SharePoint Server MVP | Sr. Consultant SharePoint911: SharePoint Consulting
    Blog: http://www.sharepoint911.com/blogs/john
    Twitter: @JohnRossJr
    Friday, July 16, 2010 8:49 PM
  • I found an entry for processors.FastXMLReader in "C:\FASTSearch\etc\processors\processorserver.xml". I added a reference to it in pipelineconfig.xml and in optionalprocessing.xml, but nothing ever seemed to get called. I ended up changing my approach: I created a database table with the same structure and imported everything using the JDBC connector. Thanks for your help, John. It would be nice to get an official word from MS on this.
    Monday, July 19, 2010 11:47 PM
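
    The database route described above (one table row per item, which the JDBC connector then crawls) could be prepared with a small loader script. This is a rough sketch only. It uses SQLite from the Python standard library purely as a stand-in for whatever database the JDBC connector would actually point at, and the function name load_documents and the table name fast_documents are made up for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET


def load_documents(xml_text, conn, table="fast_documents"):
    """Insert one row per <document>, with one column per <element name="...">."""
    root = ET.fromstring(xml_text)

    # Flatten each <document> into a {property name: value} dictionary.
    rows = []
    for doc in root.findall("document"):
        row = {}
        for element in doc.findall("element"):
            name = element.get("name")
            if name is not None:
                row[name] = element.findtext("value", default="")
        rows.append(row)

    # Columns are the property names, in order of first appearance.
    columns = []
    for row in rows:
        for name in row:
            if name not in columns:
                columns.append(name)
    if not columns:
        return 0

    # Quote identifiers, since property names come from the XML file.
    quoted = ['"' + c.replace('"', '""') + '"' for c in columns]
    column_defs = ", ".join(q + " TEXT" for q in quoted)
    conn.execute(
        f'CREATE TABLE IF NOT EXISTS "{table}" (id INTEGER PRIMARY KEY, {column_defs})'
    )
    placeholders = ", ".join("?" for _ in columns)
    conn.executemany(
        f'INSERT INTO "{table}" ({", ".join(quoted)}) VALUES ({placeholders})',
        [[row.get(c) for c in columns] for row in rows],
    )
    conn.commit()
    return len(rows)
```

    With the sample from the question, this would give a table with columns myprop1 and myprop2 and one row per document. The generated id column is an assumption added here so each row has a key; the original posts do not say how the table was keyed.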
  • Thanks for the response. I went with option 3 and it worked out really well.

    Monday, August 23, 2010 12:44 PM
  • If our replies have helped you resolve the issue, please mark the question as answered.
    Thursday, August 26, 2010 7:38 AM