TMG 2010 - Sporadic 10060 Timeout

  • Question

  • I have been having some issues with one SharePoint site published through TMG that keeps timing out randomly.

    A couple of times per day my users get a 408 error message from the web browser saying that the connection has timed out. At the same time, TMG logs timeout errors as well:

    "10060 A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond. "

    The network topology looks something like this.

    [Image: Network Topology]

    Both internal and external users access the site through TMG and are experiencing the same problems, so the firewall isn't the one causing the problems.

    Lately we have been testing with two users who access the site directly (not going through TMG) and they haven't had any timeout problems, which makes me believe that the problem isn't with the SharePoint server itself.

    I ran a Wireshark capture on the SharePoint server during one of the timeout sessions.

    In the first image we see that the TMG (10.150.8.9) sends a GET / request to the SharePoint server at 15:46:40 from port 61293. This is interesting, since no data had been sent from port 61293 in the previous 9 minutes, as you can see in the second image.

    The same thing happens at 15:46:48 from port 61297.

    Hasn't that port/session timed out by then? Wouldn't a new three-way handshake be required?
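
The check described above (a request sent on a source port that has been silent for a long time) can be automated against exported capture data. The following is only a minimal sketch: the function name `find_stale_reuse`, the `(timestamp, source_port)` tuple format, and the 120-second idle limit are assumptions chosen for illustration, and the earlier timestamp in the sample is inferred from the "9 minutes" mentioned above rather than read from the capture.

```python
def find_stale_reuse(packets, idle_limit_s):
    """Return (port, time, idle_gap) for packets sent after a long idle gap.

    packets: iterable of (timestamp_seconds, source_port) tuples, for
    example exported from a capture (hypothetical format).
    idle_limit_s: idle time after which the server side is assumed to have
    given up on the connection.
    """
    last_seen = {}
    stale = []
    for ts, port in sorted(packets):
        prev = last_seen.get(port)
        # Flag traffic that resumes on a port after exceeding the idle limit.
        if prev is not None and ts - prev > idle_limit_s:
            stale.append((port, ts, ts - prev))
        last_seen[port] = ts
    return stale


def hms(h, m, s):
    """Convert a wall-clock time to seconds since midnight."""
    return h * 3600 + m * 60 + s


# Illustrative data only: the 15:37:40 packet is an assumed earlier packet.
sample = [
    (hms(15, 37, 40), 61293),
    (hms(15, 46, 40), 61293),
]
stale_hits = find_stale_reuse(sample, idle_limit_s=120)
```

Any entry in `stale_hits` marks a request that reused an old TCP connection instead of opening a new one with a fresh three-way handshake.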


    I'm starting to believe that there is something fishy with the TCP communication between the TMG and the SharePoint server.

    I should mention that we have other websites published through TMG that work flawlessly.

    Any help will be greatly appreciated! :)

    Best Regards



    Sunday, June 24, 2012 1:21 PM

Answers

All replies

  • Hi,

    Thank you for the post.

    Error code 10060 is a timeout error, which usually means that TMG sent a SYN packet and did not get any reply. Please check the default gateway of the SharePoint server and ensure it points to the TMG internal NIC. Besides that, I think you should double-check the SharePoint publishing rule against this guide: http://msdn.microsoft.com/en-us/library/gg430121(v=office.12).aspx

    Regards,


    Nick Gu - MSFT

    Monday, June 25, 2012 7:53 AM
    Moderator
  • Hi,

    I have the same problem with normal sites (not SharePoint). The default gateway on the IIS server is also set to the IP of the TMG internal NIC.

    Any other ideas?

    Best Regards

    Saturday, September 1, 2012 8:37 AM
  • I am seeing the exact same issue... Did you ever figure this one out?

    The issue seems to revolve around port re-use / keep-alives, with the TMG trying to use a TCP connection on which the IIS server is no longer responding. The weird thing is that the IIS server doesn't send an RST (which would imply the port was closed); instead it simply doesn't respond, and eventually the TMG force-closes the connection.

    Any suggestions?

    Friday, January 11, 2013 7:30 AM
  • Following up to my own comment...

    We tried setting the IIS website's Connection Time-Out Limit from 120 seconds to 20 seconds, and the problem appears to be resolved.

    Most importantly, I now see RST packets from the IIS server within 20-40 seconds of the latest request. It is a little weird that the timeout is set to 20 seconds (t) but the reset doesn't actually occur until sometime between t and 2t. The reset causes the TMG to open a new connection for the next request (as expected), and the site/application is performing as it should.
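
One possible explanation for a reset landing between t and 2t is that idle connections are only checked on a periodic timer rather than tracked individually. The sketch below models that idea. It is an assumption for illustration, not confirmed IIS behavior, and `close_time` and `sweep_period` are hypothetical names.

```python
import math


def close_time(last_activity, timeout, sweep_period=None):
    """Time at which an idle connection is closed under a periodic-sweep model.

    Assumption (not confirmed for IIS): the server looks for idle
    connections only at multiples of sweep_period and closes any connection
    that has been idle for at least `timeout` at that moment.
    """
    if sweep_period is None:
        sweep_period = timeout  # assume the sweep runs once per timeout
    earliest = last_activity + timeout
    # First sweep tick at or after the earliest allowed close time.
    return math.ceil(earliest / sweep_period) * sweep_period


# With t = 20 s, the delay between the last request and the close
# depends on where the last request fell relative to the sweep ticks.
delays = [close_time(a, 20) - a for a in range(0, 60)]
```

Under this model every value in `delays` falls in the range from t up to (but not including) 2t, which would match the 20-40 second window observed above.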

    I'll post again if it turns out that this change did NOT fix the issue...
    Friday, January 11, 2013 5:59 PM
  • We tried setting the IIS website's Connection Time-Out Limit from 120 seconds to 20 seconds, and the problem appears to be resolved.

    Starting with TMG 2010 SP2 I've been having performance problems with several web applications which use AJAX postbacks (so only POST requests, not GET), which take 40-60 seconds or time out with a 408. Uninstalling SP2 fixes the problem. In SharePoint this is most notable when clicking [+] to expand a grouping in a list or document library.

    When I changed the connection timeout to 20 seconds as you suggested I noticed a big improvement, but the problem was not resolved. I also noticed that disabling Keep Alive would render the site inaccessible, because NTLM apparently requires Keep Alive. Then I changed the Authentication Delegation to Kerberos Constrained Delegation, and this completely resolved the problem, even with the default connection timeout of 120 seconds.

    So, this may be an NTLM-specific bug that was introduced in TMG 2010 SP2.

    To see which firewall rules use NTLM delegation, use this filter: CredentialsDelegationType:fpcDelegationNTLM


    Gerard


    Sunday, August 18, 2013 2:34 PM