FSIS Indexing Content Stored on Unix

  • Question

  • We purchased FSIS and have some content loaded on a Vignette server and some content loaded on a Unix server. FAST will be running on Windows Server 2008. I know FAST has the Vignette Connector out of the box; which connector would we need to crawl and index the HTML files (about 2 million) that reside on the Unix box?

     

    I'm not sure the File Traverser is the right answer because of security restrictions.

     

    Thanks for your help!

    Wednesday, August 10, 2011 1:31 PM

Answers

  • Looks like you want the Crawler for this, unless there are HTML documents (or areas of documents) that have access restrictions.

    Note that, if you use the Crawler, it is assumed that you have some way of reaching all the documents from one or more root documents. If the documents contain links to other documents, you might be able to use just a single root document. OTOH, if there are no (or few) inter-document links, you may need to create an HTML document that contains links to every single document that you want to index on the web server.

    Thursday, August 11, 2011 5:45 AM
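
    The "HTML document that contains links to every document" idea can be sketched as a small script run on the Unix box. This is only an illustration, not something from FAST itself: it assumes the HTML files are already served by a web server, and the function names, the `seed-NNNNN.html` file naming, and the 10,000-links-per-page split are all hypothetical choices. Splitting into several root pages keeps any single page from holding all 2 million links.

```python
import html
import os
from urllib.parse import quote


def iter_html_urls(doc_root, base_url):
    """Yield one URL per .html/.htm file found under doc_root.

    base_url is assumed to be the web address that maps to doc_root.
    """
    for dirpath, dirnames, filenames in os.walk(doc_root):
        dirnames.sort()  # deterministic traversal order
        for name in sorted(filenames):
            if name.lower().endswith((".html", ".htm")):
                rel = os.path.relpath(os.path.join(dirpath, name), doc_root)
                # URL-encode the relative path, keeping "/" separators
                yield base_url.rstrip("/") + "/" + quote(rel.replace(os.sep, "/"))


def write_seed_pages(urls, out_dir, links_per_page=10000):
    """Write seed pages holding at most links_per_page links each.

    Returns the list of seed page paths, which would serve as crawler roots.
    """
    os.makedirs(out_dir, exist_ok=True)
    pages, batch = [], []

    def flush():
        path = os.path.join(out_dir, "seed-%05d.html" % len(pages))
        with open(path, "w", encoding="utf-8") as fh:
            fh.write("<html><body>\n")
            for url in batch:
                fh.write('<a href="%s">%s</a><br>\n'
                         % (html.escape(url, quote=True), html.escape(url)))
            fh.write("</body></html>\n")
        pages.append(path)
        batch.clear()

    for url in urls:
        batch.append(url)
        if len(batch) == links_per_page:
            flush()
    if batch:  # write the final, partially filled page
        flush()
    return pages
```

    Each generated seed page would then be published on the web server and listed as a start URL for the Crawler.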

All replies

  • It became a little more complex because we have to deal with SSH. The solution we are going with is to write a program that parses the HTML and sends it to the CDE. A CTS flow will intercept it, do some processing, and send it to the index.

     

    More complex than I'd like, but it should work.

    Friday, August 12, 2011 6:33 PM
  • Can the information be accessed through FTP? If so, you can use the Enterprise Crawler to connect to the FTP server and extract the content.

    Felix

    Tuesday, August 16, 2011 5:58 PM
  • Nope. What we are going to do is create a Java app to parse the content and send it to the CDE. The CTS flow will then listen for it and send it to the index.

     

    Thanks though.

    Monday, August 22, 2011 4:57 PM
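
    The parsing half of that plan might look roughly like the sketch below. It is written in Python for brevity rather than Java, and it is not the poster's actual app. The record fields (`id`, `title`, `body`) and the function name are assumptions for illustration. The thread does not describe how content is submitted to the CDE, so that step is deliberately left out.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the <title> text and visible text, skipping script/style."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.title_parts = []
        self.body_parts = []
        self._in_title = False
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth:
            return  # ignore script and style content
        if self._in_title:
            self.title_parts.append(data)
        else:
            self.body_parts.append(data)


def html_to_record(doc_id, markup):
    """Turn one HTML document into a flat record (hypothetical field names)."""
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()
    # Collapse runs of whitespace so the fields are clean single-line strings
    title = " ".join("".join(parser.title_parts).split())
    body = " ".join(" ".join(parser.body_parts).split())
    return {"id": doc_id, "title": title, "body": body}
```

    A record like this is what the app would hand to whatever submission mechanism the CDE exposes, after which the CTS flow takes over.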