Sharepoint 2013 NOT Indexing Liferay secured content, help!

  • Question

  • Hi all,

    I need to crawl and index content from the secured area of a Liferay portal. I am using the SharePoint 2013 RTM version.

    I have the following situation:

    URL of a single secured item:

    https://www.example.com/en/aaa/-/page-title-here (in a browser, credentials are requested and I am then redirected to the page itself)

    And this is what I found in the log:

    01/14/2013 14:52:17.36 	mssdmn.exe (0x1B1C)                     	0x2DA4	SharePoint Server Search      	Connectors:HTTP               	du4s	Medium  	CHttpAccessorHelper::InitRequestInternal - Request/Response succeeded CorrelationID  URL https://www.example.com/c/portal/login?redirect=%2Fen%2Faaa%2F-%2Fpage-title-here&p_l_id=157223 SearchRequestTime 1579, SPRequestDuration 0, SPIISLatency 0, dwStatusCode 302, hr 0x40dca  [httpacchelper.cxx:586]  search\native\gather\protocols\http\httpacchelper.cxx	 
    01/14/2013 14:52:18.69 	mssdmn.exe (0x1B1C)                     	0x2DA4	SharePoint Server Search      	Connectors:HTTP               	du4s	Medium  	CHttpAccessorHelper::InitRequestInternal - Request/Response succeeded CorrelationID  URL https://www.example.com/c/portal/login?redirect=%2Fen%2Faaa%2F-%2Fpage-title-here&p_l_id=157223 SearchRequestTime 1313, SPRequestDuration 0, SPIISLatency 0, dwStatusCode 302, hr 0x40dca  [httpacchelper.cxx:586]  search\native\gather\protocols\http\httpacchelper.cxx	 
    01/14/2013 14:52:20.14 	mssdmn.exe (0x1B1C)                     	0x2DA4	SharePoint Server Search      	Connectors:HTTP               	du4s	Medium  	CHttpAccessorHelper::InitRequestInternal - Request/Response succeeded CorrelationID  URL https://www.example.com/c/portal/login?redirect=%2Fen%2Faaa%2F-%2Fpage-title-here&p_l_id=157223 SearchRequestTime 1406, SPRequestDuration 0, SPIISLatency 0, dwStatusCode 302, hr 0x40dca  [httpacchelper.cxx:586]  search\native\gather\protocols\http\httpacchelper.cxx	 
    01/14/2013 14:52:21.59 	mssdmn.exe (0x1B1C)                     	0x2DA4	SharePoint Server Search      	Connectors:HTTP               	du4s	Medium  	CHttpAccessorHelper::InitRequestInternal - Request/Response succeeded CorrelationID  URL https://www.example.com/c/portal/login?redirect=%2Fen%2Faaa%2F-%2Fpage-title-here&p_l_id=157223 SearchRequestTime 1422, SPRequestDuration 0, SPIISLatency 0, dwStatusCode 302, hr 0x40dca  [httpacchelper.cxx:586]  search\native\gather\protocols\http\httpacchelper.cxx	 
    01/14/2013 14:52:22.93 	mssdmn.exe (0x1B1C)                     	0x2DA4	SharePoint Server Search      	Connectors:HTTP               	du4s	Medium  	CHttpAccessorHelper::InitRequestInternal - Request/Response succeeded CorrelationID  URL https://www.example.com/c/portal/login?redirect=%2Fen%2Faaa%2F-%2Fpage-title-here&p_l_id=157223 SearchRequestTime 1312, SPRequestDuration 0, SPIISLatency 0, dwStatusCode 302, hr 0x40dca  [httpacchelper.cxx:586]  search\native\gather\protocols\http\httpacchelper.cxx	 
    01/14/2013 14:52:22.95 	mssdmn.exe (0x1B1C)                     	0x2DA4	SharePoint Server Search      	Connectors:HTTP               	aj4ci	High    	CHttpAccessorHelper::InitRequestInternal - tenth redirect or redirect to the same url failing the document https://www.example.com/en/aaa/-/page-title-here  [httpacchelper.cxx:661]  search\native\gather\protocols\http\httpacchelper.cxx	 
    01/14/2013 14:54:05.93 	mssdmn.exe (0x1B1C)                     	0x293C	SharePoint Server Search      	Connectors:HTTP               	du4f	High    	CHttpAccessorHelper::InitRequestInternal - Access Denied PRTH_E_ACCESS_DENIED on request for 'https://www.example.com/c/portal/login?redirect=%2Fen%2Faaa%2F-%2Fpage-title-here&p_l_id=157223' hr =0x80041205  [httpacchelper.cxx:262]  search\native\gather\protocols\http\httpacchelper.cxx	 

    The result is that the secured content never gets indexed.

    I have the following crawl rule:

    https://www.example.com/en/aaa/*

    - include

    - Authentication account; with that account I am able to browse to the content (without the domain\ prefix)

    So, any ideas on what might help me get this crawl working? Thank you all in advance,

    br,

    -hk



    • Edited by hkossu Monday, January 14, 2013 2:03 PM
    Monday, January 14, 2013 1:15 PM

All replies

  • Here are some more log messages:

    01/14/2013 15:56:10.96 	w3wp.exe (0x20C4)                       	0x096C	SharePoint Server Search      	Admin Audit                   	1396	Information	A full crawl was started on 'https://www.example.com/fi/intra/aaa' by aaaAD\sp2013_setup.	56b8f49b-b73b-70a1-cdac-81e546b759b7
    01/14/2013 15:56:30.01 	mssdmn.exe (0x266C)                     	0x276C	SharePoint Server Search      	Connectors:HTTP               	du4s	Verbose 	CHttpAccessorHelper::InitRequestInternal - Request/Response succeeded CorrelationID  URL https://www.example.com/fi/intra/aaa SearchRequestTime 47, SPRequestDuration 0, SPIISLatency 0, dwStatusCode 302, hr 0x40dca  [httpacchelper.cxx:586]  search\native\gather\protocols\http\httpacchelper.cxx	 
    01/14/2013 15:56:44.65 	mssdmn.exe (0x266C)                     	0x1B38	SharePoint Server Search      	Connectors:HTTP               	du52	Verbose 	CHttpAccessorHelper::InitRequestInternal HTTP_STATUS_MOVED - URL https://www.example.com/c/portal/login?redirect=%2Ffi%2Fintra%2Faaa&p_l_id=157223 Redirected to https://www.example.com/fi/intra/aaa  [httpacchelper.cxx:653]  search\native\gather\protocols\http\httpacchelper.cxx	 
    01/14/2013 15:56:44.66 	mssdmn.exe (0x266C)                     	0x1B38	SharePoint Server Search      	Connectors:HTTP               	du4s	Verbose 	CHttpAccessorHelper::InitRequestInternal - Request/Response succeeded CorrelationID  URL https://www.example.com/fi/intra/aaa SearchRequestTime 16, SPRequestDuration 0, SPIISLatency 0, dwStatusCode 302, hr 0x40dca  [httpacchelper.cxx:586]  search\native\gather\protocols\http\httpacchelper.cxx	 
    01/14/2013 15:56:44.66 	mssdmn.exe (0x266C)                     	0x1B38	SharePoint Server Search      	Connectors:HTTP               	du52	Verbose 	CHttpAccessorHelper::InitRequestInternal HTTP_STATUS_MOVED - URL https://www.example.com/fi/intra/aaa Redirected to https://www.example.com/c/portal/login?redirect=%2Ffi%2Fintra%2Faaa&p_l_id=157223  [httpacchelper.cxx:653]  search\native\gather\protocols\http\httpacchelper.cxx	 
    01/14/2013 15:56:44.66 	mssdmn.exe (0x266C)                     	0x1B38	SharePoint Server Search      	Connectors:HTTP               	aj4ci	High    	CHttpAccessorHelper::InitRequestInternal - tenth redirect or redirect to the same url failing the document https://www.example.com/fi/intra/aaa  [httpacchelper.cxx:661]  search\native\gather\protocols\http\httpacchelper.cxx	 
    01/14/2013 15:56:44.66 	mssearch.exe (0x0A88)                   	0x23A8	SharePoint Server Search      	Crawler:Gatherer Service      	ds87	Verbose 	SetTransactionStatus to TS_RETRY, URL = https://www.example.com/fi/intra/aaa, FilterError = 0x80041204  [robotthrd.cxx:489]  search\native\gather\gthrsvc\robotthrd.cxx	 
    01/14/2013 15:56:44.66 	mssearch.exe (0x0A88)                   	0x23A8	SharePoint Server Search      	Crawler:Gatherer Plugin       	cd11	Warning 	The start address https://www.example.com/fi/intra/aaa cannot be crawled.  Context: Application 'Search_Service_Application', Catalog 'Portal_Content'  Details:  An unrecognized HTTP response was received when attempting to crawl this item. Verify whether the item can be accessed using your browser.   (0x80041204)	 
    


    Monday, January 14, 2013 2:02 PM
  • Were you able to figure this out? I am also getting the same error.

    sachin

    Wednesday, May 14, 2014 4:32 PM
  • From the logs, it seems that browsing the page goes as follows:
    1. The browser navigates to https://www.example.com/fi/intra/aaa.
    1.1 The user is redirected to the login page.
    1.2 The user enters the login credentials.
    1.3 The user is then redirected back to the original URL.

    For the crawler, this results in: FilterError = 0x80041204 (An unrecognized HTTP status was received. Check that the address can be accessed using Internet Explorer.)
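
    If it helps to see the loop without reading the ULS log by hand, here is a rough Python sketch that pulls the "URL ... Redirected to ..." pairs out of the HTTP_STATUS_MOVED lines and reports the first URL that gets visited twice. The regular expression is inferred only from the log lines pasted in this thread, and the function names (extract_redirects, find_loop) are my own, so treat it as a starting point rather than a proper ULS parser.

```python
import re

# Matches "HTTP_STATUS_MOVED - URL <from> Redirected to <to>" inside a ULS line.
# The pattern is an assumption based on the excerpts in this thread.
MOVED_RE = re.compile(r"HTTP_STATUS_MOVED - URL (\S+) Redirected to (\S+)")


def extract_redirects(log_lines):
    """Return (source, target) URL pairs in the order they appear in the log."""
    pairs = []
    for line in log_lines:
        match = MOVED_RE.search(line)
        if match:
            pairs.append((match.group(1), match.group(2)))
    return pairs


def find_loop(pairs):
    """Return the first redirect target that already issued a redirect, else None."""
    seen = set()
    for source, target in pairs:
        seen.add(source)
        if target in seen:
            return target
    return None


if __name__ == "__main__":
    # Shortened copies of the two HTTP_STATUS_MOVED lines from the second log excerpt.
    sample = [
        "HTTP_STATUS_MOVED - URL https://www.example.com/c/portal/login"
        "?redirect=%2Ffi%2Fintra%2Faaa&p_l_id=157223"
        " Redirected to https://www.example.com/fi/intra/aaa  [httpacchelper.cxx:653]",
        "HTTP_STATUS_MOVED - URL https://www.example.com/fi/intra/aaa"
        " Redirected to https://www.example.com/c/portal/login"
        "?redirect=%2Ffi%2Fintra%2Faaa&p_l_id=157223  [httpacchelper.cxx:653]",
    ]
    print("Loop back to:", find_loop(extract_redirects(sample)))
```

    On the lines above, the content page and the Liferay login page keep pointing at each other, which matches the "tenth redirect or redirect to the same url" message.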

    When creating the crawl rule, are you using "Specify a different content access account", or does the authentication require a crawl rule that uses a cookie to log in?
    I would recommend using Fiddler to get a clear picture of the HTTP traffic and where it breaks.
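
    If Fiddler is not at hand, a small script can give a similar picture of the redirect chain. The sketch below follows redirects by hand, without sending cookies or credentials, which is roughly what a request that never completes the Liferay login would look like. Everything in it is an assumption for illustration: the helper names (http_fetch, trace_redirects) are made up, and the limit of 10 hops is only borrowed from the "tenth redirect" wording in the log, not from any documented crawler setting.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = {301, 302, 303, 307, 308}


def http_fetch(url):
    """Send one GET without following redirects; return (status, Location or None)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def trace_redirects(start_url, fetch=http_fetch, max_hops=10):
    """Follow redirects manually; return (hops, verdict).

    hops is a list of (url, status) tuples; verdict is "final", "loop",
    or "too many redirects".
    """
    hops = []
    seen = set()
    url = start_url
    for _ in range(max_hops):
        status, location = fetch(url)
        hops.append((url, status))
        if status not in REDIRECT_CODES or not location:
            return hops, "final"
        seen.add(url)
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
        if url in seen:
            return hops, "loop"
    return hops, "too many redirects"
```

    Calling trace_redirects("https://www.example.com/fi/intra/aaa") from a machine that can reach the portal would list each hop with its status code. A "loop" verdict with only 302 responses would suggest the portal expects a form or cookie login rather than credentials the crawler can supply on its own.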

    Wednesday, May 14, 2014 6:31 PM