Enterprise Crawler Cookie Authentication and AJAX

  • Question

  • Hi,

    I would like to know whether the Enterprise Crawler (FAST v5.3) can handle the following login requirement:

    1) Cookie authentication must be in place before any pages are crawled

    2) The login form code snippet:

     

    <form>
    
    <input title="Username." type="text" value="" id="txtUser" name="txtUser" >
    
    <input title="Password" type="password" value=""  id="txtPwd" name="txtPwd">
    
    <input type="submit" class="btn" value="Login" onclick="login();" >
    
    </form>

    Here login() is a JavaScript/jQuery call to an authentication servlet, which returns the session information.

    3) Once authentication is successful (returning relevant information), the page constructs a session cookie and proceeds.
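
    For readers unfamiliar with this pattern, step 3 can be sketched roughly as follows. This is only an illustration under stated assumptions: the post does not show the servlet's response format or the cookie name, so the JSON payload, the `sessionId` field, and the `SESSION` cookie name are all hypothetical.

```python
import json


def build_session_cookie(servlet_response_body: str, cookie_name: str = "SESSION") -> str:
    """Build a Cookie header value from the servlet's session information.

    Assumptions (not from the original post): the servlet answers with JSON,
    the session token lives in a field called "sessionId", and the cookie is
    named "SESSION".
    """
    session_info = json.loads(servlet_response_body)
    session_id = session_info["sessionId"]  # hypothetical field name
    return f"{cookie_name}={session_id}"


if __name__ == "__main__":
    # Example payload, invented for illustration only.
    print(build_session_cookie('{"sessionId": "abc123"}'))
```

    The point of the sketch is that the cookie is assembled by script in the page rather than delivered by the server, which is why the login form alone gives the crawler nothing to submit.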

     

    How can I configure the Enterprise Crawler to log in in this case, given that the form has no method or action (e.g. POST)?

    Thanks in advance!

    Sunday, October 23, 2011 4:48 PM

All replies

  • I believe you would want to use forms-based authentication for this, and my understanding is that you would still need to provide a GET or POST.  Are you using Windows-based (NTLM, etc.) authentication?

     

     If it is an option, you could access the pages in a browser and capture the HTTP headers with a utility such as Fiddler, so that we can review the output. Feel free to post that output here.

     

    Thanks!

    Rob Vazzana | Sr Support Escalation Engineer | US Customer Service & Support

    Customer Service & Support                          Microsoft | Services

    Monday, October 24, 2011 8:18 PM
    Moderator
  • Hi Rob,

    Thanks for the reply.

    The page uses forms-based authentication, and the authentication servlet is a custom authentication mechanism. My current approach is to POST directly to the authentication servlet (with the respective parameters); however, how can I get the crawler to return to the original login page, where the client cookie gets created, before proceeding with the crawl?

    Just to be clear about the login page: the JavaScript/jQuery code is responsible for the POST to the authentication servlet, rather than the form itself as in the Enterprise Crawler Guide documentation example captured below. Since there is no 'method' or 'action' in my login page, how can I configure the Enterprise Crawler accordingly?

     

    <form method="POST" name="login" action="/path/to/form.cgi">
    
    <input type="text" name="username" size="20">
    
    <input type="password" name="password" size="20">
    
    <input type="hidden" name="redirURI" value="/">
    
    <input type="submit" value="Login" name="show">
    
    <input type="reset" value="Reset" name="B2">
    
    </form>
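
    The direct POST to the authentication servlet mentioned above could look roughly like this when expressed outside the crawler. It is a minimal sketch, not crawler configuration: the servlet URL is hypothetical, and it assumes the servlet accepts the same `txtUser` and `txtPwd` parameter names as the form fields, which the login() script may or may not preserve.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_login_request(servlet_url: str, username: str, password: str) -> Request:
    """Build (but do not send) a form-encoded POST to the authentication servlet.

    Assumption: the servlet expects the parameter names used by the form
    inputs (txtUser, txtPwd). The real names depend on what login() sends.
    """
    body = urlencode({"txtUser": username, "txtPwd": password}).encode("ascii")
    return Request(
        servlet_url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


if __name__ == "__main__":
    # Hypothetical URL and credentials, for illustration only.
    request = build_login_request("http://intranet.example/auth", "crawler", "s3cret")
    print(request.get_method(), request.full_url, request.data)
```

    Comparing this request with what a browser actually sends (for example in a Fiddler trace) would confirm the real parameter names and target URL.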

     

    Thanks in advance!

    Tuesday, October 25, 2011 2:34 AM
  • Glad to help.  In this case, I would suggest specifying a preload URI in the crawler’s Login section, which forces the crawler to fetch that page BEFORE the login POST is made, so that it picks up any necessary cookies.  If you also enter this same URI as a Start URI, the crawler will return to it AFTER the login POST and crawl normally (with whatever cookies are already set).  Let us know if that helps.
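
    The order of requests described above (preload URI, then login POST, then Start URI) can be sketched as a small script that shares one cookie jar across all three calls. This only mirrors the sequence; it is not the crawler's implementation or its configuration syntax, and the function names and URLs are hypothetical.

```python
from http.cookiejar import CookieJar
from urllib.parse import urlencode
from urllib.request import HTTPCookieProcessor, build_opener


def make_cookie_fetcher():
    """Return fetch(url, form=None) whose calls all share one cookie jar."""
    opener = build_opener(HTTPCookieProcessor(CookieJar()))

    def fetch(url, form=None):
        # Supplying data makes urllib issue a POST; otherwise it is a GET.
        data = urlencode(form).encode("ascii") if form else None
        with opener.open(url, data=data, timeout=30) as response:
            return response.status

    return fetch


def crawl_login_sequence(fetch, preload_uri, login_uri, credentials, start_uri):
    """Run the three steps in order and return them as (method, url) pairs."""
    steps = []
    fetch(preload_uri)             # 1. preload: pick up any initial cookies
    steps.append(("GET", preload_uri))
    fetch(login_uri, credentials)  # 2. login POST with the cookies from step 1
    steps.append(("POST", login_uri))
    fetch(start_uri)               # 3. start URI: crawl with the session in place
    steps.append(("GET", start_uri))
    return steps
```

    A real run would pass `make_cookie_fetcher()` as `fetch`, with the login page as both the preload and start URI. Passing a stub instead lets the ordering be checked without touching the network.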

     

    Thanks!

    Rob Vazzana | Sr Support Escalation Engineer | US Customer Service & Support

    Customer Service & Support                          Microsoft | Services

    Thursday, October 27, 2011 9:18 PM
    Moderator